Abstract
Background: Raman spectroscopy is a non-invasive technique capable of characterising tissue constituents and detecting conditions such as cancer with high accuracy. Machine learning techniques can automate this task and discover relevant data patterns. However, the high-dimensional, multicollinear nature of Raman data makes these models difficult to deploy and to explain. A model’s transparency and ability to explain decision pathways have become crucial for medical integration. Consequently, an effective method for reducing the number of features while minimising information loss is sought. Methods: Two new feature selection methods for Raman spectroscopy are introduced. These methods are based on explainable deep learning approaches, namely Convolutional Neural Networks and Transformers. Their features are extracted using GradCam and attention scores, respectively. The performance of the extracted features is compared to established feature selection approaches across four classifiers and three datasets. Results: We compared the proposed methods against established feature selection approaches over three real-world datasets and different compression levels. Comparable accuracy levels were obtained using only 10% of features. Model-based approaches are the most accurate. Feature importance assigned by Convolutional Neural Networks and Random Forests performs best when retaining 5–20% of features, while LinearSVC with L1 penalisation leads to higher accuracy when selecting only 1% of them. The proposed Convolutional Neural Network-based GradCam approach has the highest average accuracy. Conclusions: No approach is found to perform best in all scenarios, suggesting that multiple alternatives should be assessed in each application.
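
The compression levels mentioned in the abstract (retaining 1%, 5%, 10% or 20% of features) all rest on the same step: rank features by an importance score and keep the top fraction. The following is a minimal sketch of that step only, not the authors' implementation. It assumes a per-feature importance vector is already available (for example from Random Forest importances or an averaged GradCam map); the function names, the rounding rule, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np


def select_top_fraction(importance, fraction):
    """Return the sorted indices of the highest-scoring features.

    `importance` holds one score per spectral feature (hypothetical input:
    e.g. a Random Forest importance vector or a GradCam map averaged over
    the training spectra). Higher means more important.
    """
    importance = np.asarray(importance, dtype=float)
    if importance.ndim != 1:
        raise ValueError("importance must be a 1-D array")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one feature; this rounding rule is an assumption.
    n_keep = max(1, int(round(fraction * importance.size)))
    # A stable sort on the negated scores ranks features from most to
    # least important, with ties resolved by original position.
    ranked = np.argsort(-importance, kind="stable")
    # Return indices in ascending order so the spectral axis stays ordered.
    return np.sort(ranked[:n_keep])


def compress_spectra(spectra, importance, fraction):
    """Keep only the selected columns (wavenumber bins) of a spectra matrix."""
    spectra = np.asarray(spectra, dtype=float)
    if spectra.ndim != 2 or spectra.shape[1] != np.size(importance):
        raise ValueError("spectra must be 2-D with one column per score")
    keep = select_top_fraction(importance, fraction)
    return spectra[:, keep], keep


# Synthetic stand-in data: 8 spectra with 1000 wavenumber bins each.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(8, 1000))
importance = rng.random(1000)

for fraction in (0.01, 0.05, 0.10, 0.20):
    reduced, keep = compress_spectra(spectra, importance, fraction)
    print(f"{fraction:.0%} of features -> {reduced.shape[1]} columns")
```

In practice the reduced matrix would be passed to each downstream classifier, and the selection would be fitted on training data only so that test spectra do not influence which features are kept.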
| Original language | English |
|---|---|
| Article number | 2063 |
| Journal | Diagnostics |
| Volume | 15 |
| Issue number | 16 |
| DOIs | |
| Publication status | Published - Aug 2025 |
Keywords
- Biophotonics
- Explainable AI
- Feature selection
- Machine learning
- Raman spectroscopy
- Tissue classification