Leveraging LLMs for Clinical Note Annotation and Uncertainty Estimation

Title	Leveraging LLMs for Clinical Note Annotation and Uncertainty Estimation
Summary	The student will investigate the potential of LLMs to simplify clinical note annotation along with uncertainty estimation, contributing to improved healthcare data management.
Keywords	Machine Learning, Large Language Models, Uncertainty Estimation, Electronic Health Records
TimeFrame	2023-2024
References	Yang, Zhichao, et al. "Multi-label few-shot ICD coding as autoregressive generation with prompt." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 4. 2023. Liu, Leibo, et al. "Automated ICD coding using extreme multi-label long text transformer-based models." Artificial Intelligence in Medicine (2023): 102662. Hu, Edward J., et al. "LoRA: Low-rank adaptation of large language models." arXiv preprint arXiv:2106.09685 (2021). Sensoy, Murat, Lance Kaplan, and Melih Kandemir. "Evidential deep learning to quantify classification uncertainty." Advances in Neural Information Processing Systems 31 (2018).
Prerequisites	Statistics; Neural Networks; Programming (Python)
Author
Supervisor	Awais Ashfaq, Prayag Tiwari
Level	Master
Status	Open

This Master's thesis project aims to use Large Language Models (LLMs) to automate clinical note annotation, with a specific focus on generating validated, clinically meaningful diagnostic and procedure codes (ICD and KVÅ). Starting with the MIMIC-III dataset and extending to real Swedish clinical data, the project will explore the following technical and scientific directions:

1. Model Training: Investigate current techniques for training LLMs, including fine-tuning strategies, domain adaptation, and transfer learning, to optimize their performance on clinical note annotation.

2. Uncertainty Estimation Methods: Develop and implement uncertainty estimation methods such as evidential deep learning to provide confidence scores for the model's annotations.

3. Real-World Clinical Utility: Evaluate the clinical utility of the generated diagnostic and procedure codes by collaborating with healthcare professionals and analyzing the impact of these codes on patient care, data management, and reimbursement processes.

4. Multi-Language Adaptation: Explore methods for adapting the LLMs to the Swedish language, ensuring their effectiveness in a non-English clinical setting.

5. Ethical Considerations: Address ethical and privacy concerns related to patient data, ensuring compliance with healthcare regulations and data protection laws.
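
As an illustration of direction 2, the sketch below shows how evidential deep learning (Sensoy et al., 2018) can turn raw model outputs into a confidence score. It is a minimal example under stated assumptions, not the method the project prescribes: it assumes each code is treated as a separate two-class decision (assigned / not assigned), that evidence is obtained by applying softplus to the logits, and the function names and example logits are hypothetical.

```python
import math


def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)), always non-negative.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def evidential_summary(logits):
    """Return (expected class probabilities, uncertainty) for one decision.

    Follows the Dirichlet parameterisation of Sensoy et al. (2018):
    evidence e_k >= 0, alpha_k = e_k + 1, S = sum(alpha),
    expected probability p_k = alpha_k / S, uncertainty u = K / S.
    Using softplus to obtain evidence is an assumption of this sketch.
    """
    evidence = [softplus(z) for z in logits]
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)          # Dirichlet strength S
    num_classes = len(alpha)       # K
    probs = [a / strength for a in alpha]
    uncertainty = num_classes / strength  # in (0, 1]; 1 means no evidence
    return probs, uncertainty


# Hypothetical logits for two candidate codes: [assigned, not assigned].
example_logits = {
    "code_A": [10.0, -10.0],   # strong evidence that the code applies
    "code_B": [-5.0, -5.0],    # almost no evidence either way
}

for code, logits in example_logits.items():
    probs, u = evidential_summary(logits)
    print(f"{code}: p(assigned)={probs[0]:.3f}, uncertainty={u:.3f}")
```

In this formulation the uncertainty approaches 1 when the model produces almost no evidence for either class, which is the kind of case that could be flagged for review by a human coder.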

The core research question, "How can LLMs be effectively trained and deployed to produce clinically validated codes?", will guide these technical and scientific directions. The student is also encouraged to propose and explore their own research questions.

Contact: Awais Ashfaq (awais.ashfaq@hh.se)
