Stationary process approximation for the analysis of large spatial datasets

Cited by: 652
Authors
Banerjee, Sudipto [1]
Gelfand, Alan E. [2]
Finley, Andrew O. [3]
Sang, Huiyan [2]
Affiliations
[1] Univ Minnesota, Sch Publ Hlth, Div Biostat, Minneapolis, MN 55455 USA
[2] Duke Univ, Durham, NC 27706 USA
[3] Michigan State Univ, E Lansing, MI 48824 USA
Keywords
co-regionalization; Gaussian processes; hierarchical modelling; kriging; Markov chain Monte Carlo methods; multivariate spatial processes; space-time processes
DOI
10.1111/j.1467-9868.2008.00663.x
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
With scientific data available at geocoded locations, investigators are increasingly turning to spatial process models for carrying out statistical inference. Over the last decade, hierarchical models implemented through Markov chain Monte Carlo methods have become especially popular for spatial modelling, given their flexibility and power to fit models that would be infeasible with classical methods as well as their avoidance of possibly inappropriate asymptotics. However, fitting hierarchical spatial models often involves expensive matrix decompositions whose computational complexity increases in cubic order with the number of spatial locations, rendering such models infeasible for large spatial data sets. This computational burden is exacerbated in multivariate settings with several spatially dependent response variables. It is also aggravated when data are collected at frequent time points and spatiotemporal process models are used. With regard to this challenge, our contribution is to work with what we call predictive process models for spatial and spatiotemporal data. Every spatial (or spatiotemporal) process induces a predictive process model (in fact, arbitrarily many of them). The latter models project process realizations of the former to a lower dimensional subspace, thereby reducing the computational burden. Hence, we achieve the flexibility to accommodate non-stationary, non-Gaussian, possibly multivariate, possibly spatiotemporal processes in the context of large data sets. We discuss attractive theoretical properties of these predictive processes. We also provide a computational template encompassing these diverse settings. Finally, we illustrate the approach with simulated and real data sets.
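The projection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a zero-mean Gaussian parent process with an exponential covariance function, and the names `exp_cov`, `predictive_process_draw`, `sigma2`, `phi`, and the knot layout are all hypothetical choices made for the example. A realization of the parent process at m knots is interpolated to n locations through the m x m knot covariance, so only an m x m system is solved rather than an n x n one.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) between two point sets.
    The covariance family and its parameters are assumptions for illustration."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def predictive_process_draw(locs, knots, rng, sigma2=1.0, phi=3.0):
    """Draw the parent process at the knots and project it onto `locs`.

    Returns (w_tilde, w_star): the low-rank realization at the n locations
    and the parent realization at the m knots.
    """
    m = knots.shape[0]
    # m x m covariance among knots; small jitter added for numerical stability.
    C_star = exp_cov(knots, knots, sigma2, phi) + 1e-10 * np.eye(m)
    # n x m cross-covariance between the locations and the knots.
    c = exp_cov(locs, knots, sigma2, phi)
    # Parent process at the knots: the only Cholesky factorization is m x m.
    w_star = np.linalg.cholesky(C_star) @ rng.standard_normal(m)
    # Projection: w_tilde(s) = c(s)^T C_star^{-1} w_star.
    w_tilde = c @ np.linalg.solve(C_star, w_star)
    return w_tilde, w_star

rng = np.random.default_rng(0)
locs = rng.uniform(size=(2000, 2))   # n = 2000 observation locations
knots = rng.uniform(size=(50, 2))    # m = 50 knots, chosen at random here
w_tilde, w_star = predictive_process_draw(locs, knots, rng)
```

With n = 2000 and m = 50, the expensive linear algebra involves only the 50 x 50 knot covariance. Evaluating the projection at the knots themselves returns the parent realization there, up to the jitter term.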
Pages: 825-848
Number of pages: 24