BMIR Research in Progress: Jason Fries, PhD, “Program Your Training Data! Using Medical Domain Knowledge to Learn From Unlabeled Data”

November 29, 2018 @ 12:00 pm – 1:00 pm
MSOB, Conference Room X275
1265 Welch Rd
Stanford, CA 94305
Marta Vitale
(650) 208-8547

Jason Fries, PhD,
Research Scientist,
BMIR, Stanford University

In biomedicine, obtaining expert-labeled training data is a key bottleneck in applying machine learning methods. However, recent efforts such as Stanford’s Snorkel system are creating new ways to use expert heuristics to train large-scale machine learning models. This approach offers practical benefits ranging from improved classification performance, achieved by modeling the unobserved accuracies of label sources, to software artifacts that can be shared, modified, and applied to new datasets. We outline two successful applications of Snorkel in biomedicine: (1) analyzing patient clinical notes to extract implant-related complications following total hip replacement; and (2) identifying patients with rare cardiac malformations using MRI video data from the UK Biobank.
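
To make the idea of "programming" training data concrete, here is a minimal sketch in plain Python of how expert heuristics might be written as small functions that vote on a label for an unlabeled clinical note. Everything in it is an illustrative assumption rather than material from the talk: the function names, keyword rules, and label values are hypothetical, and it does not use the Snorkel library. It also combines votes with a simple majority vote, which is a stand-in for, not an implementation of, the learned model of source accuracies described above.

```python
from collections import Counter

# Hypothetical label values for an "implant complication" classifier.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_mentions_loosening(note):
    """Heuristic (assumed): the word 'loosening' suggests a complication."""
    return POSITIVE if "loosening" in note.lower() else ABSTAIN


def lf_mentions_revision(note):
    """Heuristic (assumed): a mention of 'revision' suggests a complication."""
    return POSITIVE if "revision" in note.lower() else ABSTAIN


def lf_negation_phrase(note):
    """Heuristic (assumed): common negation phrases suggest no complication."""
    text = note.lower()
    if "no evidence of" in text or "without complication" in text:
        return NEGATIVE
    return ABSTAIN


LABELING_FUNCTIONS = [
    lf_mentions_loosening,
    lf_mentions_revision,
    lf_negation_phrase,
]


def majority_vote(note):
    """Combine heuristic votes; abstain when no rule fires or votes tie."""
    votes = [lf(note) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]


if __name__ == "__main__":
    # Made-up example notes, not real patient data.
    notes = [
        "Aseptic loosening of the femoral stem; revision planned.",
        "No evidence of loosening.",
        "Routine follow-up, healing without complication.",
    ]
    for note in notes:
        print(majority_vote(note), note)
```

In this sketch every heuristic counts equally, so two conflicting rules simply cancel out. The approach described in the abstract instead estimates how accurate each label source is without ground-truth labels and weights its votes accordingly.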

Jason Fries, PhD:
Jason Fries is a research computer scientist at the Stanford Center for Biomedical Informatics Research, working with Prof. Nigam Shah. He recently completed a postdoctoral fellowship with Profs. Chris Ré and Scott Delp as part of Stanford’s Mobilize Center. His research interests include methods for training machine learning models with limited hand-labeled data, such as weak supervision and few-shot learning, with a focus on extracting information from unstructured biomedical data.