This is Volume 3 of our book series, 'The World of Zero-Inflated Models'. The central theme of this book is multivariate extensions of generalised linear models (GLMs) and generalised linear mixed-effects models (GLMMs).
Although this book is published under the umbrella of 'The World of Zero-Inflated Models', it also provides a good introduction to ordinary multivariate GLMMs and generalised linear latent variable models (GLLVMs).
In Volumes 1 and 2 (Zuur and Ieno 2021, 2024), we analysed a single response variable as a function of multiple covariates. In most chapters, the original datasets consisted of multiple response variables, but we typically converted them into a diversity index, such as species richness or total abundance, or focused on one specific species or variable.
For example, in Chapter 5 of Zuur and Ieno (2021), we analysed the abundance of the parasite species C. australe in fish, though this species is just one of many. In Chapter 6, we applied zero-inflated models to the abundance of mistletoe (Loranthus europaeus Jacq.), a known tree pest. The original paper by Matula et al. (2015) also analysed the diameter of the mistletoe plants and the absence or presence of mistletoe. In Chapter 8, we applied Tweedie GLMs to lobster biomass data, although the original study by Long (2017) considered abundance data as well. Chapter 11 of Zuur and Ieno (2024) focused on analysing a specific morphometric measurement of turtle hatchlings, namely the carapace width, whereas Bodensteiner et al. (2019) examined four additional response variables from the same hatchlings: carapace length, plastron length, mass, and incubation duration. In Chapter 10, we analysed the proportion of time caribou spend feeding during a 30-minute focal period. However, the study by Lesmerises et al. (2017) also recorded time spent lying, walking, searching for food, running, trotting, being vigilant, standing, grooming, engaging in social interactions, and other miscellaneous activities. The authors presented results for feeding and vigilance.
In most of these examples, the response variables are correlated. Beyond direct cause-effect relationships or mutually exclusive activities, there are several reasons why response variables might be correlated:
- Shared underlying factors: Different response variables might be influenced by the same underlying environmental or biological factors. For instance, both feeding time and vigilance in caribou could be influenced by the availability of food and the presence of predators.
- Temporal or spatial proximity: Variables might be correlated because they are measured at the same time or in the same location. For example, if certain behaviours tend to happen during specific times of the day, variables measured during those times might be correlated.
- Biological constraints: Organisms often face biological limitations that cause correlations between different traits or behaviours. For instance, physiological needs might limit how much time an animal can spend on certain activities, creating correlations between them.
- Behavioural syndromes: Animals might exhibit consistent behaviour patterns across different contexts, known as behavioural syndromes. For example, an animal that is generally more active might spend more time both walking and feeding, leading to a positive correlation between these activities.
- Environmental conditions: Correlation can arise due to shared responses to environmental conditions. For example, during harsh weather, an animal might reduce overall activity, affecting multiple behaviours similarly.
As another example, consider a field study investigating the presence of ISA (Infectious Salmon Anemia) and sea lice infestation in salmon around fish farms. While there is no evidence of a direct causal link between these two diseases in salmon, factors such as stress, environmental conditions, and management practices may contribute to the presence of both diseases. If the correlation is strong, then applying a separate univariate analysis to each disease may duplicate the findings.
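The salmon example can be illustrated with a small simulation. The sketch below is not taken from any of the cited studies; it is a minimal illustration in which a hypothetical shared factor (called `stress` here) raises the probability of both diseases, while neither disease affects the other. The function names, the intercepts, and the size of the shared effect are all assumptions chosen for illustration.

```python
import math
import random


def simulate_two_diseases(n_sites=2000, shared_effect=2.0, seed=1):
    """Simulate presence/absence of two diseases driven by one shared factor.

    All values are hypothetical. There is no direct link between the two
    diseases; any correlation comes from the shared factor alone.
    """
    rng = random.Random(seed)
    isa, lice = [], []
    for _ in range(n_sites):
        # Hypothetical shared latent factor (e.g. stress), standard normal.
        stress = rng.gauss(0.0, 1.0)
        # Logistic link: each disease has its own (assumed) intercept.
        p_isa = 1.0 / (1.0 + math.exp(-(-0.5 + shared_effect * stress)))
        p_lice = 1.0 / (1.0 + math.exp(-(0.3 + shared_effect * stress)))
        # Independent Bernoulli draws, conditional on the shared factor.
        isa.append(1 if rng.random() < p_isa else 0)
        lice.append(1 if rng.random() < p_lice else 0)
    return isa, lice


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    isa, lice = simulate_two_diseases(shared_effect=2.0)
    print("Correlation with a shared factor:", round(pearson(isa, lice), 2))

    isa0, lice0 = simulate_two_diseases(shared_effect=0.0)
    print("Correlation without a shared factor:", round(pearson(isa0, lice0), 2))
```

With a non-zero `shared_effect`, the two simulated presence/absence variables are clearly positively correlated; with `shared_effect=0.0`, the correlation is close to zero. Two separate univariate analyses of such data would largely recover the same shared signal twice.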
Instead of applying multiple univariate GLMMs, this book discusses multivariate GLMMs for datasets with a relatively small number of response variables and generalised linear latent variable models (GLLVMs) for datasets with a relatively large number of response variables.
References
Bodensteiner, B. L., Warner, D. A., Iverson, J. B., Milne-Zelman, C. L., Mitchell, T. S., Refsnider, J. M., and Janzen, F. J. (2019). Geographic variation in thermal sensitivity of early life traits in a widespread reptile. Ecology and Evolution, 9(5):2791.
Lesmerises, F., Johnson, C. J., and St-Laurent, M.-H. (2017). Refuge or predation risk? Alternate ways to perceive hiker disturbance based on maternal state of female caribou. Ecology and Evolution, 7(3):845–854.
Long, S. (2017). Short-term impacts and value of a periodic no take zone (NTZ) in a community-managed small-scale lobster fishery, Madagascar. PLOS ONE, 12(5):e0177858.
Matula, R., Svátek, M., Pálková, M., Volařík, D., and Vrška, T. (2015). Mistletoe infection in an oak forest is influenced by competition and host size. PLOS ONE, 10(5):e0127055.