About Me
I am a faculty member in the Division of Mental Health Data Science in the Department of Psychiatry at Columbia University. My research interest lies in bridging machine learning and deep learning with biostatistical methodology research, and in applying these methods to psychiatry research. I have been working on the following directions:
A. (Deep) Reinforcement Learning for Medical Decision Making.
B. Interpretable Machine Learning: enabling FDR-controlled variable selection for deep learning and other black-box predictive models.
C. New machine learning algorithms for deep phenotyping of psychiatric patients using multiple data modalities.
Bringing machine learning to medical research, and to psychiatry research in particular, is not a matter of directly borrowing methods from computer vision or natural language processing. Noisy data, the difficulty of collecting large samples, ethical constraints on randomized studies, and requirements for reproducibility and interpretability all call for new algorithms and methods tailored to each specific problem. On the other hand, traditional methods and applications in biostatistics, such as causal inference, mediation analysis, and latent variable modeling, provide a good foundation for developing new machine learning methods. I graduated from the Mailman School of Public Health in 2016. My Ph.D. dissertation research focused on merging statistical modeling with medical domain knowledge and machine learning algorithms to help make personalized medical decisions.
Medical big data has become an active area of technological innovation that promises to bring more efficient and affordable patient-centered health care to everyone. Some topics from my current and ongoing projects are listed below.
Dissertation Contents:
A. Personalized Optimal Screening/Diagnostic Strategy.
B. Learning Method to Find Optimal DTR in Sequential Multiple Assignment Randomized Trial (SMART).
C. Design of Sequential Multiple Assignment Randomized Trial with Enrichment.
Ongoing Projects and Topics of Interest:
A. O-learning for Functional Data and Multiple Data Sources.
Motivated by the goal of finding a biosignature that distinguishes MDD patients who would respond to placebo from patients who respond only to citalopram in the EMARC study.
B. Feature Selection for O-learning.
C. Estimating Personalized DTR from EHR Data.