Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators
The completion of the human genome project in the last decade has generated a strong demand for computational analysis techniques to fully exploit the acquired human genome database. The human genome project generated a perplexing mass of genetic data, which necessitates automatic genome anno...
Main Authors: | Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier Ltd., 2013 |
Subjects: | Q Science (General) |
Online Access: | http://irep.iium.edu.my/34337/ http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf |
id |
iium-34337 |
---|---|
recordtype |
eprints |
spelling |
iium-34337 2015-06-01T03:29:28Z http://irep.iium.edu.my/34337/ Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei Q Science (General) The completion of the human genome project in the last decade has generated a strong demand for computational analysis techniques to fully exploit the acquired human genome database. The human genome project generated a perplexing mass of genetic data, which necessitates automatic genome annotation. There is a growing interest in the process of gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Therefore, recognizing promoters is one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes. Aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine-learning-based approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. We carried out experiments on a dataset extracted from the biological literature as a proof of concept. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend the framework to recognize promoter sequences in various species of higher eukaryotes. Elsevier Ltd.
2013-11-09 Article PeerReviewed application/pdf en http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf Htike@Muhammad Yusof, Zaw Zaw and Win, Shoon Lei (2013) Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators. Procedia Computer Science, 23. pp. 60-67. ISSN 1877-0509 http://www.sciencedirect.com/science/article/pii/S1877050913011447 10.1016/j.procs.2013.10.009 |
repository_type |
Digital Repository |
institution_category |
Local University |
institution |
International Islamic University Malaysia |
building |
IIUM Repository |
collection |
Online Access |
language |
English |
topic |
Q Science (General) |
spellingShingle |
Q Science (General) Htike@Muhammad Yusof, Zaw Zaw Win, Shoon Lei Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators |
description |
The completion of the human genome project in the last decade has generated a strong demand for computational analysis techniques to fully exploit the acquired human genome database. The human genome project generated a perplexing mass of genetic data, which necessitates automatic genome annotation. There is a growing interest in the process of gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Therefore, recognizing promoters is one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes. Aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine-learning-based approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. We carried out experiments on a dataset extracted from the biological literature as a proof of concept. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend the framework to recognize promoter sequences in various species of higher eukaryotes. |
format |
Article |
author |
Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei |
author_facet |
Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei |
author_sort |
Htike@Muhammad Yusof, Zaw Zaw |
title |
Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators |
title_sort |
recognition of promoters in dna sequences using weightily averaged one-dependence estimators |
publisher |
Elsevier Ltd. |
publishDate |
2013 |
url |
http://irep.iium.edu.my/34337/ http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf |
first_indexed |
2023-09-18T20:49:30Z |
last_indexed |
2023-09-18T20:49:30Z |
_version_ |
1777409899338137600 |
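
The description above mentions an entropy-based approach for selecting the nucleotides most relevant to promoter recognition, but this record does not state the exact criterion the authors used. As a rough illustration only, the sketch below assumes information gain (the reduction in class-label entropy from knowing the nucleotide at one position) as the ranking score. The function names, the toy sequences, and the labels are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(sequences, labels, position):
    """Drop in label entropy once the nucleotide at `position` is known."""
    # Group the class labels by the nucleotide observed at this position.
    groups = {}
    for seq, label in zip(sequences, labels):
        groups.setdefault(seq[position], []).append(label)
    # Weighted average entropy of the groups (the conditional entropy).
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_positions(sequences, labels):
    """Return positions ordered from most to least informative.

    Assumes all sequences have the same length; ties keep index order.
    """
    length = len(sequences[0])
    gains = [(information_gain(sequences, labels, i), i) for i in range(length)]
    return [i for _, i in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Made-up toy data, for illustration only (not from the paper's dataset).
sequences = ["TATA", "TACA", "GATC", "GACC"]
labels = ["promoter", "promoter", "non-promoter", "non-promoter"]

ranking = rank_positions(sequences, labels)
```

In this toy set, positions 0 and 3 separate the two classes perfectly, so they rank ahead of positions 1 and 2, which carry no information about the label. A classifier would then be trained on the top-ranked positions only. Whether the paper ranks by information gain, gain ratio, or another entropy-derived score is not stated in this record.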