TY - GEN
T1 - Query-Focused Submodular Demonstration Selection for In-Context Learning in Large Language Models
AU - Trust, Paul
AU - Minghim, Rosane
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The increase in dataset and parameter size of large language models has given rise to an emergent ability known as In-context Learning (ICL). This approach allows models to perform tasks based on human instructions and a few demonstration examples in a prompt. ICL differs from traditional fine-tuning methods by enabling the adaptation of pretrained models to new tasks without modifying their core parameters or requiring gradient updates. Despite its potential, the intricacies of ICL, particularly the methods for choosing effective demonstration examples to enhance predictive performance, are not fully understood, with prior research often relying on random selection. Our research addresses this gap in two ways. Firstly, we advocate the use of query-focused submodular mutual information functions for selecting demonstration examples in ICL. These functions help identify examples that are both diverse and representative, thereby improving few-shot performance in comparison to random and zero-shot baselines. Our experiments validate this approach. Secondly, we introduce an interactive tool to explore the impact of hyperparameters on model performance. These parameters include the quantity and generation methods of demonstration examples, and their influence on data manifolds and clusters. Our results show that carefully chosen examples can lead to performance improvements of up to 20%. For instance, in sentiment classification, we observed an F1-score of 88.35% compared to 51.95%, and in topic classification, 90.56% versus 31.38%.
KW - Data Selection
KW - In-context Learning
KW - Language Models
KW - Submodular Optimization
KW - Visualization
UR - https://www.scopus.com/pages/publications/85189931716
U2 - 10.1109/AICS60730.2023.10470628
DO - 10.1109/AICS60730.2023.10470628
M3 - Conference proceeding
AN - SCOPUS:85189931716
T3 - 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
BT - 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
Y2 - 7 December 2023 through 8 December 2023
ER -