Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators

The completion of the Human Genome Project in the last decade has generated strong demand for computational analysis techniques that can fully exploit the acquired human genome database. The Human Genome Project generated a perplexing mass of genetic data, which necessitates automatic genome anno...


Bibliographic Details
Main Authors: Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei
Format: Article
Language: English
Published: Elsevier Ltd. 2013
Subjects: Q Science (General)
Online Access: http://irep.iium.edu.my/34337/
http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf
id iium-34337
recordtype eprints
spelling iium-34337 2015-06-01T03:29:28Z
http://irep.iium.edu.my/34337/
Q Science (General)
Elsevier Ltd.
2013-11-09 Article PeerReviewed application/pdf en
Htike@Muhammad Yusof, Zaw Zaw and Win, Shoon Lei (2013) Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators. Procedia Computer Science, 23. pp. 60-67. ISSN 1877-0509
http://www.sciencedirect.com/science/article/pii/S1877050913011447
DOI: 10.1016/j.procs.2013.10.009
repository_type Digital Repository
institution_category Local University
institution International Islamic University Malaysia
building IIUM Repository
collection Online Access
language English
topic Q Science (General)
description The completion of the Human Genome Project in the last decade has generated strong demand for computational analysis techniques that can fully exploit the acquired human genome database. The Human Genome Project generated a perplexing mass of genetic data, which necessitates automatic genome annotation. There is growing interest in gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Recognizing promoters is therefore one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes, and aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine-learning approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. We carried out experiments on a dataset extracted from the biological literature as a proof of concept. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend it to recognize promoter sequences in various species of higher eukaryotes.
format Article
author Htike@Muhammad Yusof, Zaw Zaw
Win, Shoon Lei
title Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators
publisher Elsevier Ltd.
publishDate 2013
url http://irep.iium.edu.my/34337/
http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf
first_indexed 2023-09-18T20:49:30Z
last_indexed 2023-09-18T20:49:30Z
_version_ 1777409899338137600