Conditional GAN for better embedding and generation of medical codes

From ISLAB/CAISR
Title Conditional GAN for better embedding and generation of medical codes
Summary Synthetic data generation of Electronic Health Records with a focus on medical codes
Keywords GANs, Electronic Health Records, Representation Learning
TimeFrame
References Data: https://mimic.mit.edu/docs/about/

Papers:
https://dspace.mit.edu/handle/1721.1/128349
https://proceedings.neurips.cc/paper/2019/file/254ed7d2de3b23ab10936522dd547b78-Paper.pdf
https://www.sciencedirect.com/science/article/pii/S0957417421000233

Prerequisites
Author
Supervisor Stefan Byttner, Amira Soliman, Kobra Etminani, Atiye Sadat Hashemi
Level Master
Status Open


Electronic Health Records (EHR) are increasingly used in both research and clinical practice to improve patient care without adding economic burden. Patient data is temporal: it consists of measurements recorded over time. Due to privacy constraints, EHR data cannot be shared publicly for research in machine learning and artificial intelligence. Synthetic data generation offers a solution by producing artificial patient data. This thesis aims to investigate the use of generative adversarial networks (GANs) to improve the representation and generation of patient encounters with medical diagnoses. Example research questions: How should medical codes be represented as categorical variables? How can the representation of categorical variables be enriched?
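As a starting point for the first research question, one common baseline is to encode each patient encounter as a multi-hot vector over the vocabulary of diagnosis codes. The sketch below is only illustrative: the function names `build_vocab` and `multi_hot` are hypothetical, and the three ICD-10 codes are made-up example input, not data taken from MIMIC. A learned embedding or a conditional GAN would build on top of a representation like this rather than replace the need for one.

```python
from typing import Dict, List


def build_vocab(encounters: List[List[str]]) -> Dict[str, int]:
    """Map each distinct medical code to a column index (sorted for stability)."""
    codes = sorted({code for encounter in encounters for code in encounter})
    return {code: index for index, code in enumerate(codes)}


def multi_hot(encounter: List[str], vocab: Dict[str, int]) -> List[int]:
    """Encode one encounter as a 0/1 vector with a 1 for every code present."""
    vector = [0] * len(vocab)
    for code in encounter:
        vector[vocab[code]] = 1
    return vector


# Illustrative encounters (assumed example data, not real patient records).
encounters = [
    ["I10", "E11.9"],      # encounter with two diagnosis codes
    ["I10"],               # encounter with one diagnosis code
    ["J45.909", "E11.9"],  # encounter with two diagnosis codes
]

vocab = build_vocab(encounters)
vectors = [multi_hot(encounter, vocab) for encounter in encounters]

for encounter, vector in zip(encounters, vectors):
    print(encounter, "->", vector)
```

A multi-hot vector treats every code as independent and grows with the vocabulary size, which is one motivation for the second research question on enriching the representation of categorical variables.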