Abstract
The k-Nearest Neighbours (kNN) algorithm is a fundamental machine learning method known for its simplicity and effectiveness. However, it is notably sensitive to noisy data, which can significantly degrade its performance. We introduce k-Most-Influential Neighbours (kMIN), a novel variant designed to enhance noise resistance while adhering to the lazy learning paradigm of kNN. kMIN combines reliability and similarity measures into a single quantity we call influence. kMIN uses influence both to select neighbours and to weight their contributions when making predictions, mitigating the impact of unreliable training examples. We evaluate kMIN against traditional kNN, Weighted-kNN, and noise-filtering dataset-editing algorithms across multiple classification and regression datasets with varying noise levels. The results indicate that kMIN shows promise on noisy datasets, outperforming existing methods in several cases, though further testing is required to solidify these findings.
| Original language | English |
|---|---|
| Pages (from-to) | 421-433 |
| Number of pages | 13 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3910 |
| Publication status | Published - 2024 |
| Event | 32nd Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2024), Dublin, Ireland, 9–10 Dec 2024 |
Keywords
- edited nearest neighbours
- instance selection
- k-nearest neighbours
- noisy data