Yanshan Wang, PhD, FAMIA

  • Assistant Professor, Department of Health Information Management
  • Vice Chair of Research, Department of Health Information Management
  • Assistant Professor of Intelligent Systems, Intelligent Systems Program (ISP), University of Pittsburgh
  • Assistant Professor of Biomedical Informatics, Department of Artificial Intelligence & Informatics, Mayo Clinic

Yanshan Wang, PhD, FAMIA, is Assistant Professor and Vice Chair of Research in the Department of Health Information Management, School of Health and Rehabilitation Sciences, with secondary appointments in the Intelligent Systems Program, School of Computing and Information, and the Department of Biomedical Informatics, School of Medicine, University of Pittsburgh. His research focuses on artificial intelligence (AI), natural language processing (NLP), machine learning, and deep learning methodologies and their applications in healthcare. His research goal is to leverage different dimensions of data and data-driven computational approaches to meet the needs of clinicians, researchers, and patients. Prior to joining Pitt, Dr. Wang was in the Department of AI & Informatics at Mayo Clinic, where he still holds an adjunct Assistant Professor position.

Representative Publications

  1. Yanshan Wang, Yiqing Zhao, Terry M. Therneau, Elizabeth J. Atkinson, Ahmad P. Tafti, Nan Zhang, Shreyasee Amin, Andrew H. Limper, Sundeep Khosla, and Hongfang Liu. Unsupervised Machine Learning for the Discovery of Latent Disease Clusters and Patient Subgroups Using Electronic Health Records. Journal of Biomedical Informatics. 2020.
  2. Yanshan Wang, Liwei Wang, Majid Rastegar-Mojarad, Sungrim Moon, Feichen Shen, Naveed Afzal, Sijia Liu, Yuqun Zeng, Saeed Mehrabi, Sunghwan Sohn, Hongfang Liu. Clinical Information Extraction Applications: A Literature Review. Journal of Biomedical Informatics 77. 2017.
  3. Yanshan Wang, Sijia Liu, Naveed Afzal, Majid Rastegar-Mojarad, Liwei Wang, Feichen Shen, Hongfang Liu. A Comparison of Word Embeddings for the Biomedical Natural Language Processing. Journal of Biomedical Informatics. 2018.
  4. Yanshan Wang, Sunghwan Sohn, Sijia Liu, Feichen Shen, Liwei Wang, Elizabeth J Atkinson, Shreyasee Amin, Hongfang Liu. A Clinical Text Classification Paradigm based on Deep Representation and Weak Supervision. BMC Medical Informatics and Decision Making. 2018.
  5. Nicolas Nunez, Joanna M. Biernacka, Manuel Gardea-Resendez, Bhavani Singh Agnikula Kshatriya, Euijung Ryu, Sunyang Fu, Balwinder Singh, Brandon Coombes, Mark Frye, and Yanshan Wang. Natural Language Processing for Automatic Identification of Major Depressive Disorders in Free-Text Electronic Health Records. Biological Psychiatry 89. 2021.

Research Interests

  • Clinical natural language processing
  • Clinical research informatics
  • Deep learning
  • Health informatics

Research Grants

NIH/Clinical and Translational Science Institute (CTSI), University of Pittsburgh 04/01/2022 – 03/31/2023
Role: Principal Investigator
Title: A3ST: AI-based Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation
Goal: To leverage advanced artificial intelligence (AI) techniques to automate fidelity assessment, which has the potential to propel rehabilitation intervention practice and research forward in previously untapped directions.
Pitt Momentum Fund, University of Pittsburgh 05/01/2022 – 04/31/2023
Role: Principal Investigator
Title: Improving Health Equity by Analyzing Social Determinants of Health from the Electronic Health Records
Year of Data and Society Funding, University of Pittsburgh 03/31/2023
Role: Principal Investigator
Title: Understanding Bias in Big Data and Artificial Intelligence for Health Care Through an Educational Health Informatics Hackathon
AWS Diagnostic Development Initiative (DDI) Award, Amazon 10/01/2020 – 12/31/2022
Role: Principal Investigator
Title: Multimodal Machine Learning for Rapid Diagnosis
Goal: This study uses CT images and de-identified electronic health records to develop machine learning or deep learning models for rapid disease diagnosis and identification of related risk factors.
CHECE, Center for Health Equity and Community Engagement Research Award, Mayo Clinic 02/01/2020 – 01/31/2021
Role: Principal Investigator
Title: Developing Artificial Intelligence Models to Automatically Identify Social Determinants of Health Among Minority Populations from the Electronic Health Records and to Provide Implications for Health Equity
Goal: This study has three goals: 1) to automatically infer the SDOH status of minority populations from their EHRs; 2) to develop and evaluate artificial intelligence (AI) models for inferring a patient’s respective SBDH from EHR data; and 3) to provide implications for health equity.
NIH/NIMH-R01MH121924 09/05/2019 – 05/31/2021
Role: Co-Investigator
Title: Leveraging EHR-linked biobanks for deep phenotyping, polygenic risk score modeling, and outcomes analysis in psychiatric disorders
Goal: This proposal will apply big data techniques for development of polygenic risk scores and their association to clinical outcomes and social determinants using large-scale integrated phenotype-genotype data.
NIH/NIDDK-R03DK128127 04/01/2021 – 05/31/2021
Role: Co-Investigator
Title: Digital Phenotyping of Nonalcoholic Fatty Liver Disease
Goal: The objective of this study is to leverage data and analytics to improve healthcare outcomes by early detection and risk stratification of NAFLD, before onset of liver-related complications.
NIH/NCATS-UL1TR02377 07/18/2017 – 06/30/2022
Role: Co-Investigator
Title: Mayo Clinic Center for Clinical and Translational Science (CCaTS)
Goal: To facilitate the development of new therapies and their implementation in clinical practice; to train the next generation of clinical and translational physicians and scientists; to engage communities to fully participate in the translational research process; and to partner with regional and national networks to ultimately improve patient care and human health.

NIH/NLM-R01LM011934 09/01/2014 – 07/31/2020
Title: Semi-structured Information Retrieval in Clinical Text for Cohort Identification
Goal: To make full use of clinical text in retrieving patients from the EMR. A layered language model for searching clinical text is introduced, addressing the need for both fine-grained information and big-picture contextual information.

Consulting Services
NIH/NIH-UL1TR02377 08/01/2018 – 07/31/2019
Role: Co-Investigator
Title: Supplement Investigation of Chronic Pain Management Based on Electronic Health Records
Goal: The long-term goal is to leverage the REP data and advanced informatics and analytics approaches to derive data-driven insights on chronic disease management and build decision support tools for precision healthcare delivery.
NIH/NIH-R01NS102233 06/01/2017 – 05/31/2021
Role: Co-Investigator
Title: Enabling Comparative Effectiveness Research in Silent Brain Infarction Through Natural Language Processing and Big Data
Goal: The goal of this project is to improve the evidence base for the prevention of stroke in patients with silent brain infarction, i.e., a stroke visible on neuroimaging (head CT or MRI) but with no clinical evidence of a stroke, using natural language processing that can accurately identify cases of silent brain infarction among a large population of adults.