May
3
Thu
BMIR Colloquium: Craig E. Stanley Jr., PhD; Paul L. Snyder, PhD; William F. Dowling, PhD; Mevan S. Samarasinghe: “Towards an integrated healthcare knowledge graph: Transforming and connecting dynamic healthcare data” @ MSOB, Conference Room X-275
May 3 @ 12:00 pm – 1:00 pm

Craig E. Stanley Jr., PhD;
Paul L. Snyder, PhD;
William F. Dowling, PhD;
Mevan S. Samarasinghe, VP Search & Discovery
Elsevier

ABSTRACT:
As the scale and scope of healthcare information grow, practices evolve from traditional disease-centric medicine to precision medicine. To address these challenges, we have created a knowledge graph that facilitates advanced clinical decision support approaches by connecting and extracting knowledge from across the corpus of medical literature, unifying heterogeneous sources of healthcare information. Our healthcare knowledge graph, H-Graph, is built from a comprehensive medical ontology comprising over 500,000 medical concepts and relationships between them. The core of H-Graph is realized on an RDF-based graph database platform, and it uses Linked Data principles to connect systems containing authoritative healthcare knowledge and data sets: diseases, drugs, anatomy, best practices, order sets, care plans, guidelines, clinical pathways, medical imaging data, and relevant literature, such as journals and books.

To represent increasingly complex healthcare data and provide a foundation on which to support clinical decisions, the H-Graph data model supports deeper granularity and expressive semantic relationships. It provides multi-language support (English, French, and Spanish), geographic specificity, and qualitative mappings to industry-standard vocabularies used in medical literature and electronic health records, including SNOMED, LOINC, ICD-10, RxNorm, and MeSH. In addition to ontological relationships, H-Graph supports contextualization of semantic relations based on patient demographics and medical history. H-Graph is able to represent both scalar reference ranges and qualitative observations for laboratory tests, categorized by patient characteristics.

We have also employed natural language processing and convolutional neural network methods to extend H-Graph with a comprehensive set of symptom-to-disease relations extracted from the ScienceDirect and ClinicalKey publication platforms (including 7.7M full-text articles and book chapters published in the medical domains). Natural language extraction, in concert with machine learning and automated validation pipelines, allows much higher coverage of symptoms and (uncommon or even rare) diseases than can be achieved with manually constructed knowledge bases.

Mar
7
Thu
BMIR Research Colloquium: Parag Mallick, PhD “Quantifying the Reproducibility of Proteogenomic Analyses Using a Semantically Aware Discovery Engine” @ MSOB Conference Room X-275
Mar 7 @ 12:00 pm – 1:00 pm


Parag Mallick, PhD,
Associate Professor, Department of Radiology
Canary Center at Stanford, Stanford Medicine

Thursday, March 7th, 2019, 12:00 pm to 1:00 pm
MSOB Conference Room X-275

ABSTRACT:
Initiatives like the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have been launched in the past decade to examine the multi-omic relationships that drive cancer behavior. The analyses of multi-omics data have highlighted interesting clinical subtypes. They have also revealed the incredibly complex relationships that exist between scales. Unfortunately, the analyses of multi-omics data are exponentially more challenging than single-ome analyses. Seemingly subtle changes in workflow can have dramatic impacts on findings. Building on top of an intelligent semantic workflow system, we captured the analytic methods of key proteogenomic papers as workflows and executed them systematically against diverse large multi-omics datasets. These studies revealed the fragility of multi-omic analyses. At the lowest levels (peptides identified), even trivially small changes had massive implications for which peptides or proteins were identified. Interestingly, higher-order findings such as patient strata were invariant to many perturbations. Ultimately, these studies suggest that even computational analyses, which we think of as highly systematic and reproducible, may be subject to many of the same issues as experimental studies.

Bio:
Dr. Parag Mallick is an Associate Professor at Stanford University. Originally trained as an engineer and biochemist, his research spans computational and experimental systems biology, cancer biology, and nanotechnology. Dr. Mallick received his undergraduate degree in Computer Science from Washington University in St. Louis. He then obtained his Ph.D. from UCLA in Chemistry & Biochemistry, where he worked with Dr. David Eisenberg. He completed postdoctoral studies at The Institute for Systems Biology in Seattle, WA, with Dr. Ruedi Aebersold. Beyond studying fundamental disease mechanisms, his group has been pioneering novel approaches for enabling personalized and predictive medicine.