The healthcare data mining with advance AI technology

From ISLAB/CAISR
Revision as of 20:32, 29 September 2024 by Cclab

Title The healthcare data mining with advance AI technology
Summary The CVD health care project
Keywords Time series prediction and imputation
TimeFrame Fall 2024
References [1] Review of multimodal machine learning approaches in healthcare, Information Fusion, Vol. 114, February 2025.

[2] Predicting Chronic Kidney Disease using a multimodal Machine Learning approach, Aakruti Mishra and Navaneeth Puthiyandi, Stockholm University.

Prerequisites
Author
Supervisor Guojun Liang, Prayag
Level Master
Status Ongoing


Our project aims to analyze complex healthcare data using advanced AI technologies such as diffusion models, graph neural networks (GNNs), and multimodal machine learning frameworks. With these techniques, we seek to uncover latent structures within time series data and thereby address challenges such as causal discovery, missing data imputation, and accurate future prediction, with the ultimate goal of improving healthcare decision-making and patient outcomes.

Healthcare data is inherently diverse, comprising clinical time series data (e.g., vital signs, lab results), medical imaging data (e.g., chest X-rays), and sensor-based monitoring data. Traditional approaches often focus solely on temporal patterns, overlooking the valuable relationships between different variables, such as heart rate, blood pressure, and respiratory rate, which are crucial for understanding patient health. Moreover, these relationships are not limited to physical proximity but may also reflect functional dependencies, such as how fluctuations in one vital sign impact others due to underlying medical conditions.
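One simple way to make such functional dependencies explicit is to connect variables whose signals are strongly correlated; the resulting graph could then serve as the adjacency structure for a GNN. The following is only a minimal sketch under stated assumptions: the variable names, the toy values, the `correlation_graph` helper, and the 0.5 threshold are all hypothetical and are not taken from the project or from MIMIC. Correlation is also only a rough proxy for dependency and says nothing about causal direction.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def correlation_graph(signals, threshold=0.5):
    """Map each variable to the set of variables it is strongly correlated with.

    The threshold is an illustrative assumption, not a recommended value.
    """
    names = list(signals)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(signals[a], signals[b])) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


# Toy values invented for illustration only (not patient data).
vitals = {
    "heart_rate": [72, 75, 80, 90, 95, 99],
    "resp_rate": [14, 15, 16, 19, 20, 22],
    "systolic_bp": [120, 118, 121, 119, 121, 118],
}
graph = correlation_graph(vitals)
```

In this toy example only heart rate and respiratory rate end up connected, because they rise together while blood pressure fluctuates independently.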

To better harness these complex relationships, our project proposes the development of a multimodal machine learning model using the MIMIC (Medical Information Mart for Intensive Care) dataset, which includes both clinical time series and medical imaging data. The objective is to design a unified model that integrates these diverse data modalities to predict critical clinical outcomes, such as patient mortality, disease progression, or treatment response (e.g., the effectiveness of statin therapy in cardiovascular patients). The model will employ convolutional neural networks (CNNs) for feature extraction from imaging data, while recurrent neural networks (RNNs), transformers, or advanced AI techniques like diffusion models and GNNs will process the time series data.
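The fusion idea can be illustrated with a late-fusion sketch: each modality is first reduced to a feature vector, the vectors are concatenated, and a single logistic layer maps the result to a risk score. This is a minimal sketch, not the project's architecture. The feature vectors below are placeholders standing in for the outputs of a CNN and a time series encoder, and the weights, the bias, and the function name `late_fusion_risk` are hypothetical; in practice the weights would be learned from data.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def late_fusion_risk(image_features, series_features, weights, bias=0.0):
    """Concatenate per-modality features and apply one logistic layer."""
    fused = list(image_features) + list(series_features)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    score = sum(w * f for w, f in zip(weights, fused)) + bias
    return sigmoid(score)


# Placeholder embeddings (assumed, for illustration only).
image_features = [0.2, -0.1, 0.7]    # stand-in for a CNN image embedding
series_features = [0.5, 0.3]         # stand-in for a time series embedding
weights = [0.4, 0.1, -0.3, 0.8, 0.6]  # would be learned in practice
risk = late_fusion_risk(image_features, series_features, weights, bias=-0.2)
```

Late fusion keeps the modality encoders independent, which makes it a convenient baseline; early or intermediate fusion would instead mix the modalities before or inside the encoders.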

A key focus of the research will be on developing effective fusion strategies to combine temporal and image-based information, aiming to enhance predictive accuracy and clinical interpretability. The research covers three primary areas of exploration:

Time Series Prediction for Patient Health Monitoring: Accurate time series prediction is essential for forecasting disease progression and early detection of health deterioration. This topic involves applying diffusion models, GNNs, and other AI techniques to predict future values of medical variables based on historical patient data, enabling proactive interventions. For example, predicting heart rate variability or blood glucose levels can support the management of chronic conditions and help prevent adverse events.

Time Series Imputation for Healthcare Data: Missing data is a frequent challenge in healthcare due to sensor malfunctions, patient non-compliance, or incomplete records. This topic focuses on using advanced AI technologies like GNNs and diffusion models to fill in these missing values, reconstructing the full dataset to support more reliable diagnostic models and patient monitoring systems. This helps ensure that critical health decisions are based on more complete information.

Causal Discovery in Healthcare Time Series: Understanding causal relationships between medical variables is crucial for accurate diagnosis and treatment planning. This topic involves utilizing GNNs and diffusion models to uncover causal dependencies within patient data, such as how variations in medication dosages impact patient vitals or how different physiological parameters interact during disease progression. Discovering these causal links provides clinicians with deeper insights, facilitating personalized treatment and improved patient outcomes.
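The imputation and prediction topics above are usually evaluated against simple baselines. The sketch below shows two such baselines, linear interpolation for missing values and simple exponential smoothing for a one-step-ahead forecast. These are swapped-in classical techniques, not the diffusion or GNN models the project proposes, and the function names, the smoothing factor `alpha`, and the toy heart rate values are assumptions made for illustration.

```python
def interpolate_missing(values):
    """Fill None gaps by linear interpolation between observed neighbours.

    Gaps at either end of the series copy the nearest observed value.
    """
    observed = [i for i, v in enumerate(values) if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observations")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


def smoothed_forecast(values, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing."""
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


# Toy heart rate readings with gaps (invented, not patient data).
heart_rate = [70, None, 74, None, None, 80, None]
completed = interpolate_missing(heart_rate)
next_value = smoothed_forecast(completed)
```

A learned model is only worth its added complexity if it beats baselines of this kind on held-out data, for example by masking known values and comparing the imputation error.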

By integrating multiple data modalities and leveraging advanced AI techniques, this project aims to provide a comprehensive framework for multimodal analysis, supporting more informed clinical decisions and ultimately improving patient health outcomes.