Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model
SNP-SNP interactions are recognized as fundamentally important for understanding the genetic causes of complex disease traits. Logic regression is an effective method for identifying SNP-SNP interactions associated with the risk of complex disease. However, identifying SNP-SNP interactions is computa...
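Main Authors: | Unitsa Sangket; Surakameth Mahasirimongkol; Pichaya Tandayya; Surasak Sangkhathat; Wasun Chantratita; Qi, Liu; Yasui, Yutaka |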
---|---|
Format: | Article |
Language: | English |
Published: | Penerbit Universiti Kebangsaan Malaysia, 2017 |
Online Access: | http://journalarticle.ukm.my/11370/ http://journalarticle.ukm.my/11370/1/13%20Unitsa.pdf |
id |
ukm-11370 |
---|---|
recordtype |
eprints |
spelling |
ukm-11370 2018-02-15T05:11:06Z http://journalarticle.ukm.my/11370/ Unitsa Sangket, and Surakameth Mahasirimongkol, and Pichaya Tandayya, and Surasak Sangkhathat, and Wasun Chantratita, and Qi, Liu and Yasui, Yutaka (2017) Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model. Sains Malaysiana, 46 (9). pp. 1449-1455. ISSN 0126-6039. Penerbit Universiti Kebangsaan Malaysia, 2017-09. Article; PeerReviewed; application/pdf; en. http://journalarticle.ukm.my/11370/1/13%20Unitsa.pdf http://www.ukm.my/jsm/english_journals/vol46num9_2017/contentsVol46num9_2017.html |
repository_type |
Digital Repository |
institution_category |
Local University |
institution |
Universiti Kebangsaan Malaysia |
building |
UKM Institutional Repository |
collection |
Online Access |
language |
English |
description |
SNP-SNP interactions are recognized as fundamentally important for understanding the genetic causes of complex disease traits. Logic regression is an effective method for identifying SNP-SNP interactions associated with the risk of complex disease. However, identifying SNP-SNP interactions is computationally challenging and may take hours, weeks or even months to complete. Although parallel computing is a powerful way to reduce computing time, it is arduous for users to apply to logic regression analyses of SNP-SNP interactions, because it requires advanced programming skills to correctly partition and distribute data, control and monitor tasks across multi-core CPUs or several computers, and merge output files. In this paper, we present a novel R library called SNPInt that automatically speeds up analyses of SNP-SNP interactions in genome-wide association (GWA) studies using parallel computing, without requiring advanced programming skills. The Crohn’s disease GWA study dataset from the Wellcome Trust Case Control Consortium (WTCCC), which includes genotypes of 500,000 SNPs for 4,680 individuals, was analyzed using logic regression on a computer cluster to evaluate SNPInt performance. The results from SNPInt with any number of CPUs were the same as the results from the non-parallel approach, and the SNPInt library substantially accelerated the logic regression analysis. For instance, with two hundred genes and twenty permutation rounds, the computing time decreased steadily from 7.3 days to only 0.9 days when SNPInt used eight CPUs. Executing analyses of SNP-SNP interactions using the SNPInt library is an effective way to boost performance and simplify the parallelization of such analyses. |
format |
Article |
author |
Unitsa Sangket; Surakameth Mahasirimongkol; Pichaya Tandayya; Surasak Sangkhathat; Wasun Chantratita; Qi, Liu; Yasui, Yutaka |
author_sort |
Unitsa Sangket |
title |
Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model |
publisher |
Penerbit Universiti Kebangsaan Malaysia |
publishDate |
2017 |
url |
http://journalarticle.ukm.my/11370/ http://journalarticle.ukm.my/11370/1/13%20Unitsa.pdf |
first_indexed |
2023-09-18T20:00:06Z |
last_indexed |
2023-09-18T20:00:06Z |
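
The description above names three steps that parallelization requires: partitioning the data, distributing tasks across CPUs, and merging the outputs. The following is a minimal Python sketch of that pattern only. It is not the SNPInt API (SNPInt is an R library whose interface is not given in this record), and every name in it (`split_into_chunks`, `score_gene`, `run_analysis`, `speedup_and_efficiency`) is hypothetical. `score_gene` is a random placeholder, not logic regression. Seeding each gene's random generator from the gene index alone is one assumed way to obtain results that do not depend on the number of workers, which is the property the description reports for SNPInt.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Partition items into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def score_gene(gene_id, n_permutations, base_seed=2017):
    """Hypothetical placeholder for one per-gene analysis with permutation rounds.

    The seed depends only on the gene, never on the worker or chunk,
    so the result is the same however the work is partitioned.
    """
    rng = random.Random(base_seed * 1_000_003 + gene_id)
    observed = rng.random()
    null_scores = [rng.random() for _ in range(n_permutations)]
    exceed = sum(s >= observed for s in null_scores)
    # Permutation-style p-value with the usual +1 correction.
    return gene_id, (exceed + 1) / (n_permutations + 1)


def score_chunk(task):
    """Worker entry point: analyze every gene in one chunk."""
    gene_ids, n_permutations = task
    return [score_gene(g, n_permutations) for g in gene_ids]


def run_analysis(gene_ids, n_permutations, n_workers,
                 executor_cls=ProcessPoolExecutor):
    """Partition genes, distribute chunks to workers, merge the outputs.

    Pass executor_cls=ThreadPoolExecutor for quick tests without processes.
    """
    chunks = split_into_chunks(list(gene_ids), n_workers)
    tasks = [(chunk, n_permutations) for chunk in chunks]
    if n_workers == 1:
        parts = [score_chunk(t) for t in tasks]        # non-parallel baseline
    else:
        with executor_cls(max_workers=n_workers) as pool:
            parts = list(pool.map(score_chunk, tasks))  # map keeps task order
    return {gene: p for part in parts for gene, p in part}


def speedup_and_efficiency(t_serial, t_parallel, n_workers):
    """Return (speedup, parallel efficiency) from two wall-clock times."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_workers
```

With the default process-based executor, `run_analysis` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Applied to the timings quoted in the description, `speedup_and_efficiency(7.3, 0.9, 8)` gives a speedup of about 8.1 on eight CPUs, which, given that the reported times are rounded, is consistent with near-linear scaling.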