Benchmarking in silico tools for the functional assessment of DNA variants using a set of strictly pharmacogenetic variants

Bibliographic Details
Main Authors: Chua, Eng Wee; Goh, Chian Siang
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2019
Online Access: http://journalarticle.ukm.my/14378/
http://journalarticle.ukm.my/14378/1/10%20Eng%20Wee%20Chua.pdf
Description
Summary: Predictive algorithms are important tools for translating genomic data into meaningful functional annotations. In this work, we benchmarked the performance of eight prediction methods using a set of strictly pharmacogenetic variants. We first compiled, from two online databases, a set of damaging or neutral variants that affected pharmacogenes. We then cross-checked their functional impacts against the predictions given by the chosen tools. Of the eight methods, SIFT (Sorting Intolerant From Tolerant), Mutation Assessor, and CADD (Combined Annotation Dependent Depletion) were the top performers in predicting the functional relevance of a variant. SIFT outperformed CADD despite using a much simpler algorithm, correctly identifying 66.91% of the damaging variants and 84.38% of the neutral variants. SIFT assumes that important DNA bases within a gene are conserved and not amenable to substitution. Overall, none of the prediction methods struck a balance between sensitivity and specificity. For instance, CADD was very sensitive in detecting the damaging variants (89.21%); however, it also mispredicted a large fraction of the neutral variants (43.75%). We then trialled a consensus approach whereby the functional significance of a variant is defined by agreement between at least three prediction methods. The approach performed better than any of the tools deployed alone, detecting 84.17% of the deleterious variants and 70.97% of the neutral variants. In the future, a prediction method that integrates an assortment of algorithms, each assigned an empirically optimised weighting, may be established for the functional assessment of pharmacogenetic variants.
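
The consensus approach and the two accuracy figures reported in the summary can be illustrated with a minimal Python sketch. It rests on assumptions that the summary does not confirm: a variant is called damaging when at least three tools call it damaging and neutral otherwise, each tool's output has already been reduced to a damaging/neutral call, and the tool names and toy variants below are hypothetical placeholders rather than data from the study.

```python
from typing import Dict, List, Tuple


def consensus_call(predictions: Dict[str, bool], min_agree: int = 3) -> bool:
    """Return True (damaging) when at least `min_agree` tools predict damaging.

    Assumption: anything short of that threshold is treated as neutral.
    """
    return sum(predictions.values()) >= min_agree


def sensitivity_specificity(truth: List[bool], calls: List[bool]) -> Tuple[float, float]:
    """Fraction of damaging variants detected, and of neutral variants detected."""
    true_pos = sum(t and c for t, c in zip(truth, calls))
    true_neg = sum((not t) and (not c) for t, c in zip(truth, calls))
    n_damaging = sum(truth)
    n_neutral = len(truth) - n_damaging
    return true_pos / n_damaging, true_neg / n_neutral


# Hypothetical toy data: (known label, per-tool calls); True means damaging.
toy_variants = [
    (True,  {"tool_a": True,  "tool_b": True,  "tool_c": True,  "tool_d": False}),
    (True,  {"tool_a": False, "tool_b": True,  "tool_c": False, "tool_d": False}),
    (False, {"tool_a": False, "tool_b": True,  "tool_c": False, "tool_d": False}),
    (False, {"tool_a": True,  "tool_b": True,  "tool_c": True,  "tool_d": False}),
]

truth = [label for label, _ in toy_variants]
calls = [consensus_call(preds) for _, preds in toy_variants]
sens, spec = sensitivity_specificity(truth, calls)
```

In these terms, the percentages quoted for damaging variants correspond to sensitivity and those for neutral variants to specificity. Raising `min_agree` would be expected to trade sensitivity for specificity, which is the balance the summary describes.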