Soil available phosphorus is an important nutrient indicator for crop growth and development. Spectral analysis is rapid and non-destructive, which makes it promising for the quantitative prediction of available phosphorus. Hyperspectral data have narrow bandwidths and high spectral resolution, but they also suffer from data redundancy and collinearity, which readily cause regression models to overfit and generalize poorly. This study examined 145 samples of Shajiang black soil from northern Anhui and proposes using partial least squares regression (PLS-R) for dimensionality reduction and feature extraction on visible and near-infrared hyperspectral soil data (400-1000 nm). Latent variables were obtained by cross-validation and characteristic wavelengths by variable importance in projection (VIP); each set was then fed separately into a back-propagation neural network (BPNN) for training, yielding regression models for the quantitative prediction of available phosphorus. The results show that, compared with the model built on all wavelengths (relative percent deviation, MRPD, of 10.27 for the calibration set and 2.09 for the validation set), the model built on 9 characteristic wavelengths had a calibration-set MRPD of 2.66, a marked drop in accuracy, whereas its validation-set MRPD of 2.05 was close to that of the full-wavelength model. The model built on 5 latent variables had MRPDs of 3.10 and 2.29 for the calibration and validation sets, respectively, and its validation-set accuracy was 9.60% higher than that of the full-wavelength model. Regression modeling based on the PLS-BPNN algorithm can therefore effectively reduce the influence of redundancy and collinearity in hyperspectral data and improve the generalization ability of the model, and modeling with latent variables can improve prediction accuracy.
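The feature-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-response PLS fitted with the NIPALS algorithm, the standard VIP formula, and the common chemometrics definition RPD = SD of reference values / RMSE of prediction (the abstract does not give the exact MRPD definition). The function names `pls1_vip` and `rpd`, the synthetic data, and the VIP > 1 cutoff are illustrative assumptions, and the BPNN stage is omitted.

```python
import math
import random


def pls1_vip(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centered data and return one VIP score per column of X."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                   # centered y
    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores: the value of this latent variable for each sample.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next component models what is left.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # y variance captured by this component
    total = sum(explained)
    return [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


def rpd(y_true, y_pred):
    """Assumed definition: sample SD of the reference values divided by RMSE."""
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y_true) / (n - 1))
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)
    return sd / rmse


# Synthetic stand-in for spectra: 40 samples, 4 "wavelengths"; only column 0 drives y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(40)]
y = [2.0 * row[0] + 0.1 * rng.gauss(0, 1) for row in X]

vip = pls1_vip(X, y, n_components=2)
# Hypothetical cutoff: VIP > 1 is a widely used convention, not a value from the paper.
selected = [j for j, v in enumerate(vip) if v > 1.0]
print("VIP scores:", [round(v, 3) for v in vip])
print("Selected columns:", selected)
```

In a full pipeline, either the selected columns or the latent-variable scores would be passed to the neural network, and `rpd` would be evaluated separately on the calibration and validation sets.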