TY - JOUR
T1 - Rapid classification of peanut varieties for their processing into peanut butters based on near‐infrared spectroscopy combined with machine learning
AU - Yu, Hongwei
AU - Erasmus, Sara W.
AU - Wang, Qiang
AU - Liu, Hongzhi
AU - van Ruth, Saskia M.
N1 - Publisher Copyright: © 2023
PY - 2023/7
Y1 - 2023/7
N2 - Peanut classification based on processing purposes is becoming mainstream. In order to speed up the classification procedure, near-infrared (NIR) spectroscopy for classifying peanut varieties for their processing into peanut butters was assessed for the first time. Peanut varieties were primarily classified by principal component analysis (PCA) combined with cluster analysis based on the structural characteristics (texture and rheology) and roast characteristics (colour and volatile compounds) of the resulting peanut butters. After the completion of spectral collection and subsequent spectral pre-treatments, the performances of classification models built by partial least squares discriminant analysis, support vector machine, and random forest were compared. PCA, variable importance, and random forest selection by filter were investigated as feature extraction methods. The sensitivity, specificity, and accuracy of the filtered cross validation and external validation models were all over 90%, while the kernel density estimation presented the acceptable distribution results of categories probabilities in the selected models. These results showed that NIR spectroscopy combined with machine learning methods is a promising approach to provide a reliable evaluation of peanuts for efficient processing.
AB - Peanut classification based on processing purposes is becoming mainstream. In order to speed up the classification procedure, near-infrared (NIR) spectroscopy for classifying peanut varieties for their processing into peanut butters was assessed for the first time. Peanut varieties were primarily classified by principal component analysis (PCA) combined with cluster analysis based on the structural characteristics (texture and rheology) and roast characteristics (colour and volatile compounds) of the resulting peanut butters. After the completion of spectral collection and subsequent spectral pre-treatments, the performances of classification models built by partial least squares discriminant analysis, support vector machine, and random forest were compared. PCA, variable importance, and random forest selection by filter were investigated as feature extraction methods. The sensitivity, specificity, and accuracy of the filtered cross validation and external validation models were all over 90%, while the kernel density estimation presented the acceptable distribution results of categories probabilities in the selected models. These results showed that NIR spectroscopy combined with machine learning methods is a promising approach to provide a reliable evaluation of peanuts for efficient processing.
KW - Cluster analysis
KW - Efficient processing
KW - Near-infrared spectroscopy
KW - Peanut butters
KW - Random forest
KW - Support vector machine
UR - https://www.scopus.com/pages/publications/85153052792
U2 - 10.1016/j.jfca.2023.105348
DO - 10.1016/j.jfca.2023.105348
M3 - Article
AN - SCOPUS:85153052792
SN - 0889-1575
VL - 120
JO - Journal of Food Composition and Analysis
JF - Journal of Food Composition and Analysis
M1 - 105348
ER -