A best-practice guide to predicting plant traits from leaf-level hyperspectral data using partial least squares regression

Abstract

Partial least squares regression (PLSR) modelling is a statistical technique for correlating datasets, and involves the fitting of a linear regression between two matrices. One application of PLSR enables leaf traits to be estimated from hyperspectral optical reflectance data, facilitating rapid, high-throughput, non-destructive plant phenotyping. This technique is of interest and importance in a wide range of contexts including crop breeding and ecosystem monitoring. The lack of a consensus in the literature on how to perform PLSR means that interpreting model results can be challenging, applying existing models to novel datasets can be impossible, and unknown or undisclosed assumptions can lead to incorrect or spurious predictions. We address this lack of consensus by proposing best practices for using PLSR to predict plant traits from leaf-level hyperspectral data, including a discussion of when PLSR is applicable, and recommendations for data collection. We provide a tutorial to demonstrate how to develop a PLSR model, in the form of an R script accompanying this manuscript. This practical guide will assist all those interpreting and using PLSR models to predict leaf traits from spectral data, and advocates for a unified approach to using PLSR for predicting traits from spectra in the plant sciences.

Journal Article
Year of Publication: 2021
Author:
Journal: Journal of Experimental Botany
Volume: 72
Issue: 18
Pages: 6175-6189
Date Published: Mar-06-2022
DOI: 10.1093/jxb/erab295