We are broadly interested in research on machine learning, natural language processing, time series analysis, integrative genomics, and computational phenotyping, with a focus on medical and clinical applications. Some of our recent work applies multi-modal machine learning (including deep learning) models to better understand complex diseases, inform targeted therapies, improve patient outcomes, and reduce bias and disparity in health care. The common theme of our work is building AI/ML models that improve both prediction accuracy and interpretability by exploring relational information in each data modality.

We have delved into different modalities of healthcare data (e.g., unstructured clinical notes, structured EHR data, imaging data, and genetic data) and built methods that enable these modalities to be mined individually and/or jointly to derive actionable intelligence. We have also been actively developing flagship datasets to power high-impact research.

Choose one of the selected projects below to learn more: