Beginner's Guide to Spatial, Temporal and Spatial-Temporal Ecological Data Analysis with R-INLA (2017)
Zuur, Ieno, Saveliev
This book consists of two volumes.
In Volume I, we explain how to apply linear regression models, generalised linear models (GLM), and generalised linear mixed-effects models (GLMM) to spatial, temporal, and spatial-temporal data. The models that will be employed use the Gaussian and gamma distributions for continuous data, the Poisson and negative binomial distributions for count data, the Bernoulli distribution for absence–presence data, and the binomial distribution for proportional data.
In Volume II we apply zero-inflated models and generalised additive (mixed-effects) models to spatial and spatial-temporal data. Volume II is entitled GAM and Zero-Inflated Models.
Volume I: Table of Contents
Volume I: Pdf of Chapter 1: Overview of this book
Volume II: Preface and Table of Contents
Volume II: Pdf of Chapter 17: Introduction to Volume II
Outline of Volume I
In Chapter 2 we discuss an important topic: dependency. Ignoring dependency results in pseudoreplication. We present a series of examples and discuss how dependency can manifest itself.
We briefly discuss frequentist tools that are available for the analysis of temporal and spatial data in Chapters 3 and 4, and we will conclude that their application is rather limited, especially if non-Gaussian distributions are required. We will therefore consider alternative models, but these require Bayesian techniques.
In Chapter 5 we discuss linear mixed-effects models to analyse hierarchical (i.e. clustered or nested) data, and in Chapter 6 we outline how we add spatial and spatial-temporal dependency to regression models via spatial (and/or temporal) correlated random effects.
In Chapter 7 we introduce Bayesian analysis, Markov chain Monte Carlo techniques (MCMC), and Integrated Nested Laplace Approximation (INLA). INLA allows us to apply models to spatial, temporal, or spatial-temporal data.
In Chapters 8 through 16 we present a series of INLA examples. We start by applying linear regression and mixed-effects models in INLA (Chapters 8 and 9), followed by GLM examples in Chapter 10. In Chapters 11 through 13 we show how to apply GLMs to spatial data. In Chapter 14 we discuss time-series techniques and how to implement them in INLA. Finally, in Chapters 15 and 16 we apply spatial-temporal models in INLA.
Outline of Volume II
In Chapter 18 we explain how to deal with zero-inflated data. We introduce zero-inflated Poisson (ZIP) models, zero-inflated negative binomial (ZINB) models, zero-altered Poisson (ZAP) models and zero-altered negative binomial (ZANB) models.
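As a rough sketch of how the zero-inflated and zero-altered families differ, the Poisson versions can be written in their common textbook form. The symbols are assumptions for illustration and are not taken from the book: \pi denotes the probability attached to the zero component and \mu the Poisson mean. The book's own parameterisation may differ.

```latex
% ZIP: zeros come from two sources (the extra zero component and the Poisson itself)
P(Y = 0) = \pi + (1 - \pi)\, e^{-\mu}, \qquad
P(Y = y) = (1 - \pi)\, \frac{\mu^{y} e^{-\mu}}{y!}, \quad y = 1, 2, \ldots

% ZAP (hurdle): all zeros come from the binary part;
% positive counts follow a zero-truncated Poisson
P(Y = 0) = \pi, \qquad
P(Y = y) = (1 - \pi)\, \frac{\mu^{y} e^{-\mu}}{y!\,\bigl(1 - e^{-\mu}\bigr)}, \quad y = 1, 2, \ldots
```

In both cases the probabilities sum to one. The negative binomial versions are obtained by replacing the Poisson probabilities with negative binomial ones.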
In Chapter 19 we extend the ZIP, ZINB, ZAP and ZANB models with spatial correlation. Both chapters use a skate data set from South America. In the appendix accompanying Chapter 19 we also explain how to manipulate maps and create spatial polygons (e.g. for coastlines).
In Chapter 20 we revisit a data set with which we have been battling since 2006. It concerns the begging behaviour of owl nestlings. In Zuur (2009a) we applied linear mixed-effects models to it, and in Zuur et al. (2012a) we analysed it with a zero-inflated GLMM. Thanks to R-INLA we have finally cracked this data set by applying a zero-inflated GAMM.
In Chapter 21 we analyse sandeel count data. This work came out of a consultancy project that we carried out for Wageningen Marine Research (The Netherlands) in 2017. Although the setup of the experiment is simple (approximately 400 sites sampled once per year, for 4 years), analysing these data and writing this chapter took about 30 days. This should give you an idea about the complexity of the statistical tools (zero-inflated GAMMs + spatial-temporal correlation) that we discuss in this book.
Chapter 22 is about zero-inflated bird densities sampled in the Labrador Sea, located between the Labrador Peninsula (Eastern Canada) and Greenland. This chapter is about the analysis of zero-inflated continuous data with spatial correlation. A zero-altered gamma model with spatial correlation is used.
In Chapter 23 we analyse coral reef data sampled around an island. A lot of misery comes together in this chapter: smoothers, zero-inflation and spatial dependency that should not cross land, as benthic species that live in a coral reef do not walk over land! We will discuss barrier models (Bakka et al. 2018), which ensure that spatial correlation seeps around a barrier (in this case an island).
Up to Chapter 23 all data sets analysed were geostatistical data and not areal or lattice data. The reason for this is that most ecological data is geostatistical. In Chapter 24 we analyse aggregated tornado data in 102 counties in Illinois. This is areal data. We will use various CAR models (e.g. iCAR, BYM, BYM2) for zero-inflated spatial and spatial-temporal correlated data.
Data and R code VOLUME I
All data is freely available. All the R code is provided as well, but a password is needed to open the zip files. The password is given in the Preface of Volume I (see page vi). Some chapters source our support file HighstatLibV10.R.
- Chapter 1: Overview of This Book
- No data is used.
- No R code is used.
- Sample chapter: Pages 1-4
- Chapter 2: Recognising statistical dependency
- Data used: IrishPh.zip (zipped to avoid file format conversion) and Snow.csv (if you have trouble opening the csv file, use Snow.xls and convert it to csv yourself).
- R code used: Chapter2.R.zip
- Sample pages: Pages 5-6
- Chapter 3: Time series and GLS
- Data used: Ospreys.csv and Phenology_Data_Antarcticbirds_AFZ1.csv
- R code used: Chapter3.R.zip
- Chapter 4: Spatial data and GLS
- Data used: IrishPh.txt (see Chapter 2)
- R code used: Chapter4.R.zip
- Chapter 5: Linear mixed-effects models and dependency
- Data used: White_Stork_Growth_20112012_V2.csv
- R code used: Chapter5.R.zip
- Sample pages: Pages 61-62
- Chapter 6: Modelling space explicitly
- Data used: No data files
- R code used: Chapter6.R.zip
- Chapter 7: Introduction to Bayesian statistics
- Data used: IrishPh.txt (see Chapter 2)
- R code used: Chapter7.R.zip and MCMCSupportHighstatV4.R
- Sample pages: Pages 83-84
- Chapter 8: Multiple linear regression in R-INLA
- Data used: Chimps.txt
- R code used: Chapter8.R.zip
- Sample pages: Pages 115-116
- Chapter 9: Mixed effects modelling in R-INLA to analyse otolith data
- Data used: OTODATA.csv
- R code used: Chapter9.R.zip
- Sample pages: Pages 139-140
- Chapter 10: Poisson, negative binomial, binomial and gamma GLMs in R-INLA
- Data used: The files Turcoparasitos.txt, Crocodiles.txt, DrugsMites.txt and Procambarus.txt are in the file DataChapter10.zip.
- R code used: Chapter10.R.zip
- Sample pages: Pages 165-166 and page 182
- Chapter 11: Matérn correlation and SPDE
- Data used: none
- R code used: Chapter11.R.zip
- Chapter 12: Linear regression model with spatial dependency for the Irish pH data
- Data used: See Chapter 2
- R code used: Chapter12.R.zip
- Sample pages: Pages 205-206, pages 209-219 and pages 230-231
- Chapter 13: Spatial Poisson models applied to plant diversity
- Data used: LaPalma.txt
- R code used: Chapter13.R.zip and lapalmashapefile.zip
- Sample pages: Pages 239-240
- Chapter 14: Time-series analysis in R-INLA
- Data used: sockeye.csv, PolarBearsV2.txt (zipped), Spermwhales.txt (zipped), HawaiiBirdsV2.txt.zip and IceCoresV2.csv
- R code used: Chapter14.R.zip
- Sample pages: Pages 267-268
- Chapter 15: Spatial-temporal models for orange crowned warblers count data
- Data used: OrangedCrownedWarblers.txt (zipped)
- R code used: Chapter15.R.zip and spde-tutorial-functions.R (this file was taken from http://www.r-inla.org/)
- Sample pages: Pages 307-308, page 317 and page 326
- Chapter 16: Spatial-temporal Bernoulli models for coral disease data
- Data used: CoralDisease.csv
- R code used: Chapter16.R.zip
- Sample pages: Pages 335-336 and page 346 (Dr. Christoph Kopp gave me some R code to convert the 76 spatial random fields into an mp4 video file)
Data and R code VOLUME II
All data is freely available. All the R code is provided as well, but a password is needed to open the zip files. The password is given in the Preface of Volume II. Some chapters source our support file HighstatLibV11.R.
- Chapter 17: Introduction to Volume II
- Sample pages: Pages 363-364
- Chapter 18: Zero-inflated models for count data in R-INLA
- Data used: Skate2.txt.zip. This file is zipped to avoid ftp problems. Just unzip it.
- R code used: Chapter18.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Page 365
- Chapter 19: Spatial correlated and zero-inflated skate data
- Data used: Skate2.txt.zip. This file is zipped to avoid ftp problems. Just unzip it.
- R code used: Chapter19.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Pages 403-404
- Chapter 20: GAM with correlation and zero-inflation in R-INLA using owl data
- Data used: Owls2.zip. This file is zipped to avoid ftp problems. Just unzip it.
- R code used: Chapter20.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Page 439
- Chapter 21: GAM for zero-inflated and spatial-temporal correlated sandeel data
- Data used: Sandeel.csv
- R code used: Chapter21.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Pages 499-501
- Chapter 22: Zero-inflated continuous seabird data
- Data used: LabradorSeaBirdsV2.csv
- R code used: Chapter22.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Pages 551-552
- Chapter 23: Coral reef data and barrier models
- Data used: CoralReef_Tutuila_V2.csv
- R code used: Chapter23.R.zip and Chapter23_Section26_6.R.zip. The password is given in the Preface of Volume II.
- Sample pages: Page 595
- Chapter 24: Analysis of areal tornado data
- Data used: Illinois_popNtor2.csv and Shapefiles.zip
- R code used: Chapter24.R.zip and CountyStatisticsSupport.R. The password is given in the Preface of Volume II.
- Sample pages: Pages 639, 659-660 and 685
- Appendix A: Creating spatial polygons
- Data used: See Chapter 19.
- R code used: See Chapter 19.
- Appendix B: Other models that were considered for the skate data
- Data used: See Chapter 19.
- R code used: See Chapter 19.