TY - GEN
T1 - Fine-Tuning Generative Pre-Trained Transformers for Clinical Dialogue Summarization
AU - Ronan, Isabel
AU - Tabirca, Sabin
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Automated clinical dialogue summarization can help make health professional workflows more efficient. With the advent of large language models, machine learning can be used to provide accurate and efficient summarization tools. Generative Pre-Trained Transformers (GPT) have shown huge promise in this area. While larger GPT models, such as GPT-4, have been used, these models pose their own problems in terms of precision and expense. Fine-tuning smaller models can lead to more accurate results with less computational expense. In this paper, we fine-tune a GPT-3.5 model to summarize clinical dialogue. We fine-tune with both default and manually set hyperparameters for comparison purposes. We also compare our default model to past work using ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore. We find our model outperforms GPT-4 across all measures. As our fine-tuning process is based on the smaller GPT-3.5 model, we show that fine-tuning leads to more accurate and less expensive results. Informal human observation also reveals our notes to be of acceptable quality.
AB - Automated clinical dialogue summarization can help make health professional workflows more efficient. With the advent of large language models, machine learning can be used to provide accurate and efficient summarization tools. Generative Pre-Trained Transformers (GPT) have shown huge promise in this area. While larger GPT models, such as GPT-4, have been used, these models pose their own problems in terms of precision and expense. Fine-tuning smaller models can lead to more accurate results with less computational expense. In this paper, we fine-tune a GPT-3.5 model to summarize clinical dialogue. We fine-tune with both default and manually set hyperparameters for comparison purposes. We also compare our default model to past work using ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore. We find our model outperforms GPT-4 across all measures. As our fine-tuning process is based on the smaller GPT-3.5 model, we show that fine-tuning leads to more accurate and less expensive results. Informal human observation also reveals our notes to be of acceptable quality.
KW - Data augmentation
KW - Machine translation
KW - Parameter tuning
KW - Synthetic data generation
KW - Transformers
UR - https://www.scopus.com/pages/publications/85217408841
U2 - 10.1109/FIT63703.2024.10838420
DO - 10.1109/FIT63703.2024.10838420
M3 - Conference proceeding
AN - SCOPUS:85217408841
T3 - 2024 International Conference on Frontiers of Information Technology, FIT 2024
BT - 2024 International Conference on Frontiers of Information Technology, FIT 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Conference on Frontiers of Information Technology, FIT 2024
Y2 - 9 December 2024 through 10 December 2024
ER -