On the distribution of the largest eigenvalue in principal components analysis

Cited by: 1280

Authors:

Johnstone, IM [1]

Affiliations:

[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA

Source:

ANNALS OF STATISTICS | 2001 / Vol. 29 / No. 02

Keywords:

Karhunen-Loeve transform; empirical orthogonal functions; largest eigenvalue; largest singular value; Laguerre ensemble; Laguerre polynomial; Wishart distribution; Plancherel-Rotach asymptotics; Painleve equation; Tracy-Widom distribution; random matrix theory; Fredholm determinant; Liouville-Green method;

DOI:

10.1214/aos/1009210544

Chinese Library Classification:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Let $x_{(1)}$ denote the square of the largest singular value of an $n \times p$ matrix $X$, all of whose entries are independent standard Gaussian variates. Equivalently, $x_{(1)}$ is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a $p$-variate Wishart distribution on $n$ degrees of freedom with identity covariance. Consider the limit of large $p$ and $n$ with $n/p = \gamma \ge 1$. When centered by $\mu_p = (\sqrt{n-1} + \sqrt{p})^2$ and scaled by $\sigma_p = (\sqrt{n-1} + \sqrt{p})\,(1/\sqrt{n-1} + 1/\sqrt{p})^{1/3}$, the distribution of $x_{(1)}$ approaches the Tracy-Widom law of order 1, which is defined in terms of the Painleve II differential equation and can be numerically evaluated and tabulated in software. Simulations show the approximation to be informative for $n$ and $p$ as small as 5. The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large $p$ multivariate distribution theory may be easier to apply in practice than their fixed $p$ counterparts.
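
The centering and scaling constants stated in the abstract can be evaluated directly. The following is a minimal sketch, not code from the paper: the function names `johnstone_center_scale` and `standardize_largest_eigenvalue` are hypothetical, chosen here for illustration, and only the two formulas for the centering and scaling constants are taken from the abstract.

```python
import math
from typing import Tuple


def johnstone_center_scale(n: int, p: int) -> Tuple[float, float]:
    """Return (mu, sigma) as given in the abstract.

    mu    = (sqrt(n - 1) + sqrt(p))^2
    sigma = (sqrt(n - 1) + sqrt(p)) * (1/sqrt(n - 1) + 1/sqrt(p))^(1/3)

    The function name and the argument checks are illustrative assumptions.
    """
    if n < 2 or p < 1:
        raise ValueError("need n >= 2 and p >= 1 so both square roots are positive")
    a = math.sqrt(n - 1)
    b = math.sqrt(p)
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def standardize_largest_eigenvalue(x1: float, n: int, p: int) -> float:
    """Center and scale an observed largest eigenvalue x1 of X'X.

    Per the abstract, the result is approximately Tracy-Widom (order 1)
    distributed when X is n x p with independent standard Gaussian entries.
    """
    mu, sigma = johnstone_center_scale(n, p)
    return (x1 - mu) / sigma


if __name__ == "__main__":
    # n = 5, p = 4: sqrt(n - 1) = sqrt(p) = 2, so mu = 16 and sigma = 4.
    print(johnstone_center_scale(5, 4))
    print(standardize_largest_eigenvalue(20.0, 5, 4))
```

For n = 5 and p = 4 both square roots equal 2, which gives a centering constant of 16 and a scaling constant of 4, so an observed largest eigenvalue of 20 standardizes to 1. Comparing the standardized value against Tracy-Widom quantiles requires a separate table or numerical routine, which this sketch does not provide.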


Pages: 295-327

Number of pages: 33
