Deep learning (DP) is a novel branch of machine learning that models high-level abstractions in data by using multiple processing layers to learn representations at multiple levels of abstraction [1]. When paired with spectroscopic techniques, DP models are well suited to high-throughput analysis of complex samples, whatever the structure of the data at hand [2]. However, despite the nonlinear estimation capability of DP models, uninformative variables can greatly reduce their reliability, so variable selection plays a significant role in multivariate calibration for spectral analysis. Most conventional variable selection methods rely on linear space mapping, and they usually focus on the wavelength space alone while ignoring the influence of the sample space, which leads to unreliable feature selection results. To address this, a novel nonlinear methodology based on the ensemble mean impact value (EMIV) is proposed; it balances the effects of the wavelength space and the sample space even when nonlinearity is present in the calibration space. EMIV combines the mean impact value (MIV) with Monte Carlo (MC) resampling, which reflect the effects of the variable space and the sample space, respectively. Moreover, MIV effectively avoids the interference of nonlinearity in calibration. With EMIV, a large number of DP models are constructed randomly, and each variable is then evaluated by its MIV in these models. Variables with high values are treated as important, and the final DP model is built from these selected variables. The results indicate that the EMIV method can effectively select informative variables even when nonlinearity is present in the calibration, and that the performance of the DP models is greatly improved. The EMIV strategy is therefore expected to produce more robust and parsimonious regression models.
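
The procedure described above (many randomly built models, an MIV score per variable in each model, and selection of the high-scoring variables) can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: a small random-feature network with a ridge-fitted output layer stands in for a DP model, the MIV is computed by perturbing each variable by ±10%, and the ensemble score is taken as the mean absolute MIV over the MC runs. All function names and parameter values are hypothetical.

```python
import numpy as np

def fit_random_feature_net(X, y, n_hidden=30, ridge=1e-2, rng=None):
    """Stand-in for a DP model (assumption): a fixed random tanh hidden
    layer whose output weights are fitted by ridge regression."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    W = rng.normal(scale=1.0 / np.sqrt(n_features), size=(n_features, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return lambda Z: np.tanh(Z @ W + b) @ beta

def mean_impact_values(predict, X, delta=0.10):
    """MIV of each variable: mean change in the prediction when that
    variable is scaled up and down by delta (10% is an assumption)."""
    miv = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_up, X_down = X.copy(), X.copy()
        X_up[:, j] *= 1.0 + delta
        X_down[:, j] *= 1.0 - delta
        miv[j] = np.mean(predict(X_up) - predict(X_down))
    return miv

def emiv(X, y, n_models=50, sample_frac=0.8, seed=0):
    """Ensemble MIV: average |MIV| over models fitted on MC subsets of
    the samples (the aggregation rule is an assumption)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    subset_size = int(sample_frac * n_samples)
    scores = np.zeros(X.shape[1])
    for _ in range(n_models):
        # MC resampling: draw a random subset of samples without replacement.
        idx = rng.choice(n_samples, size=subset_size, replace=False)
        predict = fit_random_feature_net(X[idx], y[idx], rng=rng)
        scores += np.abs(mean_impact_values(predict, X[idx]))
    return scores / n_models

# Toy data (hypothetical): 10 variables, only the first two drive y,
# and the second one enters nonlinearly.
data_rng = np.random.default_rng(42)
X = data_rng.uniform(0.5, 1.5, size=(300, 10))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + data_rng.normal(scale=0.05, size=300)

scores = emiv(X, y)
selected = sorted(int(j) for j in np.argsort(scores)[-2:])  # two highest scores
```

In this sketch the number of variables to keep is fixed by hand; in practice a threshold on the score or a validation-based cutoff would be needed, and the abstract does not say which rule the authors use.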