Shrinkage estimation of covariance matrix in Hotelling’s T² for differentially expressed gene sets / Suryaefiza Karjanto

Bibliographic Details
Main Author: Karjanto, Suryaefiza
Format: Book Section
Language: English
Published: Institute of Graduate Studies, UiTM 2017
Online Access: http://ir.uitm.edu.my/id/eprint/18970/
http://ir.uitm.edu.my/id/eprint/18970/1/ABS_SURYAEFIZA%20KARJANTO%20TDRA%20VOL%2012%20IGS%2017.pdf
Description
Summary: Microarray technology analyses thousands of genes simultaneously, in a massively parallel manner, in a single experiment, and so provides valuable knowledge on gene interaction and function. Efforts to understand microarray data have led to the development of new statistical methods, such as methods for detecting differentially expressed genes. Microarray analysis was first applied to individual genes, but more recently it has been applied to gene sets, that is, groups of genes. The relationship between genes in a gene set is analysed using Hotelling’s T² as a multivariate test statistic. However, the test can be applied only when the number of samples is larger than the number of variables, which is uncommon in microarray data. Because a microarray dataset typically consists of tens of thousands of genes measured on just dozens of samples, owing to various constraints, the sample covariance matrix is singular rather than positive definite, and so it cannot be inverted. In this study, we therefore proposed shrinkage approaches to estimating the covariance matrix in Hotelling’s T², specifically to address the high-dimensionality problem in microarray data. The Hotelling’s T² statistic was combined with the shrinkage approach as an alternative way of estimating the covariance matrix when detecting significant gene sets. The proposed shrinkage estimator is a weighted average of the sample covariance matrix and a structured matrix of the same dimensions, called the shrinkage target; the shrinkage intensity is the weight that the shrinkage target receives. Three shrinkage covariance methods were proposed in this study, referred to as ShrinkA, ShrinkB and ShrinkC.
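
The weighted-average idea described in the summary can be illustrated with a minimal Python sketch. It is not the ShrinkA, ShrinkB or ShrinkC method, whose targets and intensities are not specified in this record. The diagonal shrinkage target, the fixed intensity of 0.5, the simulated data and all function names below are assumptions made purely for illustration.

```python
import numpy as np

def pooled_covariance(X, Y):
    """Pooled sample covariance of two groups (rows = samples, columns = genes)."""
    n1, n2 = X.shape[0], Y.shape[0]
    S1 = np.cov(X, rowvar=False)
    S2 = np.cov(Y, rowvar=False)
    return ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

def shrink_covariance(S, intensity):
    """Weighted average of S and a target; the intensity is the target's weight.

    Assumption: the target is the diagonal of S (variances kept, covariances
    set to zero). The targets used in the study are not given in this record.
    """
    target = np.diag(np.diag(S))
    return intensity * target + (1.0 - intensity) * S

def hotelling_t2_shrunk(X, Y, intensity):
    """Two-sample Hotelling's T^2 with the shrunk covariance in place of S."""
    n1, n2 = X.shape[0], Y.shape[0]
    diff = X.mean(axis=0) - Y.mean(axis=0)
    S_star = shrink_covariance(pooled_covariance(X, Y), intensity)
    # Solve S_star z = diff instead of forming the inverse explicitly.
    return (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(S_star, diff))

# Simulated gene set: 20 genes, 6 samples per group (more genes than samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))
Y = rng.normal(size=(6, 20))

S = pooled_covariance(X, Y)
S_star = shrink_covariance(S, 0.5)   # 0.5 is an arbitrary illustrative intensity
t2 = hotelling_t2_shrunk(X, Y, 0.5)

print("rank of pooled S:", np.linalg.matrix_rank(S), "of", S.shape[0])
print("smallest eigenvalue of shrunk S:", np.linalg.eigvalsh(S_star).min())
print("shrunk Hotelling's T^2:", t2)
```

With more genes than samples, the pooled covariance is rank-deficient and cannot be inverted, whereas the shrunk matrix is positive definite whenever the intensity is greater than zero and every gene has non-zero variance. In practice the intensity would be estimated from the data rather than fixed by hand.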