Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model

SNP-SNP interactions have been recognized as fundamentally important for understanding the genetic causes of complex disease traits. Logic regression is an effective method for identifying SNP-SNP interactions associated with the risk of complex disease. However, identifying SNP-SNP interactions is computa...

Bibliographic Details
Main Authors: Unitsa Sangket, Surakameth Mahasirimongkol, Pichaya Tandayya, Surasak Sangkhathat, Wasun Chantratita, Qi, Liu, Yasui, Yutaka
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2017
Online Access: http://journalarticle.ukm.my/11370/
http://journalarticle.ukm.my/11370/1/13%20Unitsa.pdf
id ukm-11370
recordtype eprints
spelling ukm-11370 2018-02-15T05:11:06Z http://journalarticle.ukm.my/11370/ Penerbit Universiti Kebangsaan Malaysia 2017-09 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/11370/1/13%20Unitsa.pdf Unitsa Sangket, and Surakameth Mahasirimongkol, and Pichaya Tandayya, and Surasak Sangkhathat, and Wasun Chantratita, and Qi, Liu and Yasui, Yutaka (2017) Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model. Sains Malaysiana, 46 (9). pp. 1449-1455. ISSN 0126-6039 http://www.ukm.my/jsm/english_journals/vol46num9_2017/contentsVol46num9_2017.html
repository_type Digital Repository
institution_category Local University
institution Universiti Kebangsaan Malaysia
building UKM Institutional Repository
collection Online Access
language English
description SNP-SNP interactions have been recognized as fundamentally important for understanding the genetic causes of complex disease traits. Logic regression is an effective method for identifying SNP-SNP interactions associated with the risk of complex disease. However, identifying SNP-SNP interactions is computationally challenging and may take hours, weeks or even months to complete. Although parallel computing is a powerful way to reduce computing time, it is arduous for users to apply to logic regression analyses of SNP-SNP interactions, because it requires advanced programming skills to correctly partition and distribute data, control and monitor tasks across multi-core CPUs or several computers, and merge output files. In this paper, we present a novel R library called SNPInt that automatically speeds up analyses of SNP-SNP interactions in genome-wide association (GWA) studies using parallel computing, without requiring advanced programming skills. The Crohn’s disease GWA study dataset from the Wellcome Trust Case Control Consortium (WTCCC), which includes genotypes of 500,000 SNPs for 4,680 individuals, was analyzed using logic regression on a computer cluster to evaluate SNPInt performance. The results from SNPInt with any number of CPUs are the same as the results from the non-parallel approach, and the SNPInt library substantially accelerated the logic regression analysis. For instance, with two hundred genes and twenty permutation rounds, the computing time decreased from 7.3 days to only 0.9 days when SNPInt used eight CPUs. Executing analyses of SNP-SNP interactions using the SNPInt library is an effective way to boost performance and to simplify the parallelization of such analyses.
format Article
author Unitsa Sangket,
Surakameth Mahasirimongkol,
Pichaya Tandayya,
Surasak Sangkhathat,
Wasun Chantratita,
Qi, Liu
Yasui, Yutaka
author_sort Unitsa Sangket,
title Parallelization of logic regression analysis on SNP-SNP interactions of a Crohn’s disease dataset model
publisher Penerbit Universiti Kebangsaan Malaysia
publishDate 2017
first_indexed 2023-09-18T20:00:06Z
last_indexed 2023-09-18T20:00:06Z
_version_ 1777406791411302400
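
The description in this record names three steps that parallel analysis requires (partitioning the work, distributing it across CPUs, and merging the outputs) and reports a drop from 7.3 days to 0.9 days on eight CPUs. The following is a minimal Python sketch of that general pattern, not the SNPInt implementation, which is an R library whose internals this record does not describe. The functions partition, analyze_gene, analyze_chunk and run_parallel are hypothetical names, and analyze_gene is a deterministic placeholder rather than logic regression. A thread pool is used only to keep the sketch self-contained; CPU-bound work in Python would normally use a process pool or a cluster scheduler instead.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def analyze_gene(gene_id):
    """Hypothetical placeholder for one per-gene analysis (NOT logic regression)."""
    return gene_id, (gene_id * 37) % 101


def analyze_chunk(chunk):
    """Run the placeholder analysis on every gene in one chunk."""
    return [analyze_gene(g) for g in chunk]


def run_parallel(genes, n_workers):
    """Partition the genes, distribute chunks to workers, and merge the outputs."""
    chunks = partition(genes, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        partial_results = list(pool.map(analyze_chunk, chunks))
    return [row for part in partial_results for row in part]


genes = list(range(200))          # two hundred genes, as in the reported example
serial = analyze_chunk(genes)     # non-parallel reference run
parallel = run_parallel(genes, 8)
print("parallel output matches serial output:", parallel == serial)

# Speedup and parallel efficiency from the timings reported in the description.
serial_days, parallel_days, n_cpus = 7.3, 0.9, 8
speedup = serial_days / parallel_days
efficiency = speedup / n_cpus
print(f"speedup = {speedup:.2f}x, efficiency = {efficiency:.2f}")
```

Because the reported timings are rounded to one decimal place, the computed speedup of roughly 8.1 on eight CPUs is best read as approximately linear scaling rather than as evidence of a superlinear gain.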