Testing High-Dimensional Nonparametric Behrens-Fisher Problem

MENG Zhen, LI Na, YUAN Ao

Journal of Systems Science and Complexity, 2022, Vol. 35, Issue 3: 1098-1115. DOI: 10.1007/s11424-021-0257-3

Abstract

For the high-dimensional nonparametric Behrens-Fisher problem, in which the data dimension is larger than the sample size, the authors propose two test statistics: a U-statistic Rank-based Test (URT) and a Cauchy Combination Test (CCT). CCT is analogous to a maximum-type test, while URT is based on the sum of squared differences of the ranked samples across dimensions, which makes it free of distributional shape and robust to outliers. The asymptotic distribution of URT is derived, and a closed form for calculating the statistical significance of CCT is given. Extensive simulation studies evaluate the finite-sample power of the two statistics in comparison with the existing method. The simulation results show that URT is a robust and powerful method, and its practicability and effectiveness are illustrated by an application to gene expression data.
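
The closed-form significance calculation mentioned for CCT can be illustrated with the generic Cauchy combination rule for p-values. The sketch below is not the authors' statistic: it assumes that one p-value per dimension is already available (for example, from a marginal rank test), and the function name `cauchy_combination_pvalue` and the equal default weights are illustrative choices. It only shows how a weighted sum of Cauchy-transformed p-values maps back to a single p-value analytically.

```python
import math


def cauchy_combination_pvalue(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Hypothetical helper for illustration only; the per-dimension
    p-values are assumed to be computed elsewhere.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        # Assumption: equal weights that sum to one.
        weights = [1.0 / k] * k
    if len(weights) != k or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match pvalues and sum to one")

    # Keep p-values strictly inside (0, 1) so the tangent stays finite.
    eps = 1e-15
    clipped = [min(max(p, eps), 1.0 - eps) for p in pvalues]

    # Each p-value is mapped to a standard Cauchy variate under the null.
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, clipped))

    # Upper tail probability of a standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    print(cauchy_combination_pvalue([0.3, 0.3, 0.3]))
    print(cauchy_combination_pvalue([1e-6, 0.5, 0.5]))
```

With identical inputs the combined p-value equals the common value, while a single very small p-value among unremarkable ones dominates the result, which is the sense in which this kind of combination behaves like a maximum-type test.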

Key words

Cauchy combination test / nonparametric Behrens-Fisher problem / rank-based test / U-statistic

Cite this article

MENG Zhen, LI Na, YUAN Ao. Testing High-Dimensional Nonparametric Behrens-Fisher Problem. Journal of Systems Science and Complexity, 2022, 35(3): 1098-1115. https://doi.org/10.1007/s11424-021-0257-3

Funding

This paper was supported by the Beijing Natural Science Foundation under Grant No. Z180006 and the National Natural Science Foundation of China under Grant No. 11722113.