A new robust estimator to detect outliers for multivariate data

Bibliographic Details
Main Authors: Sharifah Sakinah Syed Abd Mutalib; Siti Zanariah Satari; Wan Nur Syahidah Wan Yusoff
Format: Conference or Workshop Item
Language: English
Published: IOP Publishing 2019
Online Access: http://umpir.ump.edu.my/id/eprint/27847/
http://umpir.ump.edu.my/id/eprint/27847/1/A%20new%20robust%20estimator%20to%20detect%20outliers%20for%20multivariate%20data.pdf
Description
Summary: Mahalanobis distance (MD) is a classical method for detecting outliers in multivariate data. However, the classical mean and covariance matrix used in MD suffer from masking and swamping effects when the data contain outliers. To address this problem, many studies replace the classical estimators of the mean and covariance matrix with robust estimators. In this study, a new robust estimator, Test on Covariance (TOC), is proposed for detecting outliers in multivariate data. The performance of TOC is compared with that of four existing robust estimators: Fast Minimum Covariance Determinant (FMCD), Minimum Vector Variance (MVV), Covariance Matrix Equality (CME) and Index Set Equality (ISE). For each estimator, the probability that all planted outliers are successfully detected (pout), the probability of masking (pmask) and the probability of swamping (pswamp) are computed in a simulation study. The results indicate that TOC is applicable and is a promising approach for detecting outliers in multivariate data.
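
As an illustration of the classical approach that the summary starts from, the following is a minimal Python sketch of Mahalanobis-distance outlier flagging. It is not the TOC estimator or any of the robust estimators named above. The function names, the restriction to bivariate data, the 0.975 chi-square cutoff and the example data are all assumptions made for this sketch; the record itself does not specify them.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared classical Mahalanobis distance of each 2-D point.

    Uses the ordinary sample mean and sample covariance (divisor n - 1),
    i.e. the non-robust estimators that are sensitive to outliers.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d' S^{-1} d, written out with the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_cutoff_2d(level=0.975):
    """Chi-square quantile with 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - level)


def flag_outliers(points, level=0.975):
    """Indices whose squared distance exceeds the chi-square cutoff."""
    cutoff = chi2_cutoff_2d(level)
    return [i for i, d in enumerate(mahalanobis_sq_2d(points)) if d > cutoff]


# Hypothetical data: a 5 x 4 grid of clean points plus one planted outlier.
data = [(float(i % 5), float(i // 5)) for i in range(20)] + [(100.0, -100.0)]
print(flag_outliers(data))
```

Because the mean and covariance here are computed from all observations, planted outliers can inflate the covariance and hide one another (masking) or cause clean points to be flagged (swamping). A robust estimator would replace the mean and covariance step while leaving the distance and cutoff logic unchanged.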