Predicting Starch Concentration with NIR Spectroscopy in Relation to Reference Method

Article Preview

Abstract:

The objective of this study was to evaluate the ability of near-infrared spectroscopy (NIRS) to predict starch concentration relative to the hydrochloric acid polarimetry reference method. The study comprised 131 flour samples collected in the Zhengzhou market and scanned with a FOSS NIRS analyzer in reflectance mode (570-1100 nm). Calibration models were developed with WinISI II version 1.5 software on the whole and structured sample sets (n=131 and n=102, respectively). Model quality was assessed by SEC, R2, SECV and 1-VR. To remove irrelevant information from the spectra, different scatter corrections and mathematical treatments were applied before the models were built by modified partial least squares. Comparing model performance, the best pretreatment for the whole sample set was weighted MSC with mathematical treatment (4, 4, 3, 2), and the most suitable pretreatment for the structured sample set was weighted MSC with mathematical treatment (0, 15, 13, 3). To remove non-relevant spectral regions, models were also developed on separate regions (570-658 nm, 660-746 nm, 748-834 nm, 836-922 nm, 924-1010 nm and 1012-1098 nm); the results showed that starch information was distributed across the whole spectral region. Models developed for the whole sample set with the different modeling methods had SEC from 0.9671 to 1.2819, SECV from 0.1877 to 1.1945, R2 from 0.8422 to 0.8859, 1-VR from 0.8238 to 0.9951 and RPD from 2.3880 to 15.9126. Models developed for the structured sample set with the different modeling methods had SEC from 0.8365 to 1.0452, SECV from 0.2755 to 1.0438, R2 from 0.8638 to 0.9168, 1-VR from 0.8680 to 0.9852 and RPD from 2.7639 to 10.2802.
Applying the validation set to the models built on the structured sample set showed that starch content can be predicted by NIRS in agreement with the starch content determined by polarimetry. NIRS offers great advantages for on-line application; however, its predictive ability is limited by the reference method, and this should be taken into account.

Info:

Periodical:

Advanced Materials Research (Volumes 524-527)

Pages:

2199-2210

Online since:

May 2012

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved

2.1 Samples

It is important to select samples with a uniform distribution of the constituent or constituents to be determined, because the prediction accuracy and robustness of the calibration model depend on the range of the flour quality parameters [M. Baslar and M. F. Ertugay: TÜBİTAK Vol. 35 (2011), p. 139].


2.2 Chemical analysis

Samples were analyzed in triplicate and the results averaged. Starch was determined by hydrochloric acid polarimetry, a method widely used in China, with an automatic recording polarimeter (WZZ-1S, SPOIF, Shanghai, China), a sample mass of about 2.5±0.01 g, and solutions of zinc sulfate and potassium ferrocyanide [4].


2.3 NIRS analysis

Each sample was loaded into two parallel product cups. Flour samples were scanned over the wavelength range 570 to 1100 nm with a spectrometer (Infratec TM 1241, FOSS TECATOR) equipped with an autocap module. Spectra were collected and managed with WinISI II software, version 1.50, which handles spectral acquisition as well as data treatment and the development of quantitative models. Each product cup was scanned 60 times, and its representative spectrum was obtained by averaging all of these scans. The final spectrum of each sample, used for model development, was obtained by averaging the representative spectra of its two parallel product cups.
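The two-stage averaging described above (scans within a cup, then the two cups) can be illustrated with a minimal sketch. It is not the WinISI implementation; the function names and the toy absorbance values are hypothetical, and spectra are assumed to be plain lists of equal length.

```python
from statistics import fmean


def mean_spectrum(spectra):
    """Average equally long spectra point by point."""
    if not spectra:
        raise ValueError("need at least one spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must have the same number of points")
    return [fmean(column) for column in zip(*spectra)]


def sample_spectrum(cup_scans):
    """cup_scans holds one list of repeated scans per product cup.

    Scans are averaged within each cup first, then the cup means are averaged.
    """
    cup_means = [mean_spectrum(scans) for scans in cup_scans]
    return mean_spectrum(cup_means)


# Toy data (hypothetical): 2 cups, 2 scans each, 3 wavelengths.
cup_a = [[0.10, 0.20, 0.30], [0.12, 0.22, 0.32]]
cup_b = [[0.20, 0.30, 0.40], [0.22, 0.32, 0.42]]
final = sample_spectrum([cup_a, cup_b])
```

With 60 scans per cup, as in the text, each inner list would simply hold 60 spectra instead of 2.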


2.3.1 Outlier elimination

Because the flour samples were collected randomly, the set contained samples whose spectra differed markedly from those of most other samples; these were termed outliers. If such outliers are used to develop calibration models, the adaptability, accuracy and predictive ability of the models are negatively affected [V. M. Fernandez-Cabanás, A. G. Varo, J. G. Olmo and E. D. Pedro: Chemometrics and Intelligent Laboratory Systems Vol. 87 (2007), p. 104].


2.3.2 Pretreatment

NIR instruments usually produce a large amount of spectral data that yields useful analytical information. However, the data acquired from an NIR spectrometer also contain background information, noise and physical information about the sample in addition to chemical composition [N. Shetty and R. Gislum: Field Crops Research Vol. 120 (2010), p. 31].
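As an illustration of the kind of scatter correction mentioned in the abstract, the following is a minimal sketch of standard multiplicative scatter correction (MSC). It is not the weighted MSC variant of WinISI, whose details are not given in the text. Each spectrum is regressed on a reference spectrum (by default the mean spectrum, an assumption) and the fitted offset and slope are removed.

```python
from statistics import fmean


def msc(spectra, reference=None):
    """Standard MSC: fit spectrum = intercept + slope * reference, then invert."""
    if reference is None:
        # Assumption: use the mean spectrum of the set as the reference.
        reference = [fmean(col) for col in zip(*spectra)]
    ref_mean = fmean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    if sxx == 0:
        raise ValueError("reference spectrum must not be constant")
    corrected = []
    for s in spectra:
        s_mean = fmean(s)
        sxy = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s))
        slope = sxy / sxx
        if slope == 0:
            raise ValueError("spectrum is uncorrelated with the reference")
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Toy check (hypothetical values): scaled and shifted copies of one shape
# collapse back onto that shape after correction.
base = [1.0, 2.0, 4.0, 3.0]
raw = [[0.5 + 2.0 * x for x in base], [-0.2 + 0.5 * x for x in base]]
fixed = msc(raw, reference=base)
```

The additive offset models baseline shift and the slope models multiplicative scatter; both are physical effects of the kind the paragraph above says must be removed before modeling.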


2.3.3 Spectral range

Different components of flour contain molecular groups of different types and quantities. Because a spectrum is built up from the absorption bands of these molecular groups, the content information of each flour component is concentrated in a different spectral region [V. T. Edward: Analytical Chemistry Vol. 15 (1994), p. 795A; X. L. Chu, H. F. Yuan and W. Z. Lu: Progress in Chemistry Vol. 16 (2004), p. 528]. To find the region in which starch information is concentrated, we divided the 570-1100 nm range into six partitions at intervals of 88 nm and developed a model for each region. Lower SEC and SECV values and higher Rc2 and 1-VR values indicate better prediction ability, and the corresponding region is taken as the region in which starch information is concentrated.
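The partitioning can be sketched as follows, using the six region boundaries listed in the abstract. The 2 nm sampling step is an assumption inferred from those boundaries (adjacent regions are 2 nm apart), and the helper name is hypothetical.

```python
# Region boundaries in nm, taken from the abstract (inclusive on both ends).
REGIONS_NM = [
    (570, 658), (660, 746), (748, 834),
    (836, 922), (924, 1010), (1012, 1098),
]


def split_by_region(wavelengths, spectrum, regions=REGIONS_NM):
    """Return one sub-spectrum per region, in the order given by `regions`."""
    if len(wavelengths) != len(spectrum):
        raise ValueError("wavelengths and spectrum must have the same length")
    return [
        [a for w, a in zip(wavelengths, spectrum) if lo <= w <= hi]
        for lo, hi in regions
    ]


# Assumed wavelength axis: 570 to 1098 nm in 2 nm steps.
wavelengths = list(range(570, 1100, 2))
spectrum = [0.0] * len(wavelengths)  # placeholder absorbances
parts = split_by_region(wavelengths, spectrum)
sizes = [len(p) for p in parts]
```

A separate calibration would then be fitted on each element of `parts` and the resulting SEC, SECV, Rc2 and 1-VR values compared across regions.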


2.3.4 Modeling method

Commonly used modeling methods include PCA (principal component analysis), PLS (partial least squares), MPLS (modified partial least squares), ANN (artificial neural network) and LC (local calibration). PCA and PLS are the most commonly used multivariate calibration methods; each forms a model that specifies the relationship between a response variable (Y) and a set of predictor variables (X). However, PCA suffers from some significant limitations, the most important being overfitting of the data when there are large numbers of highly correlated variables (significantly more than the number of samples), as is often the case with hyperspectral reflectance measurements. PLS can overcome this limitation and is slightly better than principal component regression (PCR) because it does not include latent variables that are less important for describing the variance of the quality parameters. Neither PCA nor PLS is always the best option when a nonlinear model is required. MPLS is often more stable and accurate than the standard PLS algorithm. In MPLS, the NIR residuals at each wavelength, obtained after each factor is calculated, are standardized (divided by the standard deviation of the residuals at that wavelength) before the next factor is calculated. When developing MPLS equations, cross-validation is recommended in order to select the optimal number of factors and avoid overfitting [D. C. Pérez-Marín, A. Garrido-Varo, J. E. Guerrero-Ginel and A. Gómez-Cabrera: Animal Feed Science and Technology Vol. 116 (2004), p. 333].
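The step that distinguishes MPLS from standard PLS in the description above is the standardization of the spectral residuals before the next factor is extracted. The sketch below shows only that scaling step, not a full MPLS implementation. The function name is hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption.

```python
from statistics import stdev


def standardize_residuals(residuals):
    """Divide each wavelength column of a residual matrix by its own SD.

    residuals: list of rows, one row per sample, one column per wavelength.
    """
    columns = list(zip(*residuals))
    sds = [stdev(col) for col in columns]  # assumption: sample SD (n - 1)
    if any(sd == 0 for sd in sds):
        raise ValueError("a wavelength has zero residual variance")
    return [[r / sd for r, sd in zip(row, sds)] for row in residuals]


# Toy residuals (hypothetical): the second wavelength varies 10x more.
residuals = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize_residuals(residuals)
```

After scaling, every wavelength has unit residual variance, so no single high-variance wavelength dominates the next factor.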


2.3.5 Validation procedure

External validation was carried out by using the corresponding models to predict the starch content of the validation sample set. In assessing calibration performance, the main considerations were the standard error of prediction (SEP), bias, slope and the coefficient of determination in validation (Rv2). The predictions of the calibration model for direct NIR measurement were compared both with the reference values and with the predicted values of the parameters described, using the paired-samples t-test. During prediction, the bias adjustment function of the WinISI II software can be used to change the bias of the calibration model, which improves the prediction ability of the model to some degree.

Results and discussion

The descriptive statistics for all the sample sets are shown in Table 1; a wide range of starch content was observed. The standard deviations and means indicate that the sets had even constituent distributions, suggesting that the calibration sets weight the calibration model equally across the entire concentration range, with minimal residuals at the extremes and relatively equal weighting at the centre. The NIR spectra of the whole sample set are shown in Fig. 1. All the spectra follow a similar trend.

Table 1 Descriptive statistics for the calibration and validation sets with regard to starch content

Sample set               N     Range (%)        Mean     SD
Whole sample set         131   58.315-73.145    68.151   2.849
Structured sample set    101   58.315-73.145    68.136   2.887
Validation sample set    29    61.280-72.745    68.206   2.764

N: number of samples; SD: standard deviation

Fig. 1 NIR spectra of all the flour samples
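The external validation statistics named in the validation procedure above (SEP, bias, slope and Rv2) can be sketched as follows. The exact formulas used by WinISI are not given in the text, so the conventions here are assumptions: bias is the mean of predicted minus reference, SEP is the root mean square difference, SEP(C) is its bias-corrected form, slope is that of the regression of reference on predicted values, and Rv2 is the squared Pearson correlation.

```python
from math import sqrt
from statistics import fmean


def validation_stats(reference, predicted):
    """Return bias, SEP, bias-corrected SEP, slope and Rv2 for a validation set."""
    n = len(reference)
    if n != len(predicted) or n < 3:
        raise ValueError("need two equally long series with at least 3 values")
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = fmean(diffs)
    sep = sqrt(sum(d * d for d in diffs) / n)
    sep_c = sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    mp, mr = fmean(predicted), fmean(reference)
    spp = sum((p - mp) ** 2 for p in predicted)
    srr = sum((r - mr) ** 2 for r in reference)
    spr = sum((p - mp) * (r - mr) for p, r in zip(predicted, reference))
    if spp == 0 or srr == 0:
        raise ValueError("reference and predicted values must both vary")
    return {
        "bias": bias,
        "sep": sep,
        "sep_c": sep_c,
        "slope": spr / spp,           # reference regressed on predicted
        "r2_v": spr ** 2 / (spp * srr),
    }


# Toy data (hypothetical starch contents in %): a constant +1 offset.
stats = validation_stats([60.0, 64.0, 68.0, 72.0], [61.0, 65.0, 69.0, 73.0])
```

In the toy case the whole error is bias, which is the situation a bias adjustment is meant to correct: SEP is nonzero while the bias-corrected SEP vanishes.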


3.1 Elimination of outliers

Different loading types and elimination methods remove different outliers, and the models developed from the remaining samples have different characteristics, as shown in Table 2. We conclude that the predictive ability of the models developed with the combination of PCA and H-statistic, expressed in terms of SEC, R2, SECV and 1-VR, is higher than that of the other models; this combination was therefore chosen as the best condition for eliminating outliers.

Table 2 Internal validation of quantitative NIRS models built under different outlier elimination methods

Loading type   Calculating method   N     SEC      R2       SECV     1-VR     RPD
PCA            R-statistic          115   1.0676   0.7699   1.2103   0.7124   1.8388
PCA            H-statistic          123   1.0524   0.8616   1.2395   0.8086   2.2825
PL1            R-statistic          115   1.0676   0.7699   1.2103   0.7124   1.8388
PL1            H-statistic          126   1.2495   0.8047   1.3849   0.7597   2.0416

Because different outliers have a latent influence on the final model, we then tried to determine the number of outliers under the best conditions. The outlier number was set from 1 to 20, which resulted in 20 calibration models with different calibration characteristics, shown in Fig. 2. The calibration statistics are best with 4 outliers. The prediction ability of the models fluctuates like a damped oscillation and shows a downward trend as the number of outliers increases, which agrees with the theory described in Section 2.3.1.

Fig. 2 Calibration statistics of the optimal NIRS models on the basis of the structured sample set and 0-20 outlier elimination passes
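The RPD column of Table 2 is not defined in the text. Assuming the usual definition, the ratio of the standard deviation of the reference values to the prediction error (here SECV), it can be sketched as below. The function names are hypothetical. Under the same assumption, the standard deviation implied by a table row is simply RPD multiplied by SECV.

```python
from statistics import stdev


def rpd(reference_values, error):
    """Ratio of the SD of the reference values to a prediction error.

    Assumption: `error` is SECV (or SEP) in the same units as the references.
    """
    if error <= 0:
        raise ValueError("error must be positive")
    return stdev(reference_values) / error


def implied_sd(rpd_value, error):
    """SD of the reference values implied by a reported RPD and error."""
    return rpd_value * error


# Toy reference values (hypothetical starch contents in %).
toy_rpd = rpd([66.0, 68.0, 70.0], 1.0)

# PCA / H-statistic row of Table 2: RPD = 2.2825, SECV = 1.2395.
sd_pca_h = implied_sd(2.2825, 1.2395)
```

The implied standard deviation for the PCA and H-statistic row is close to, but not identical with, the whole-set value in Table 1, which would be expected if the statistic is computed on the samples remaining after outlier removal.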
