Thu, Feb 2
BMIR Research in Progress: Marcos Martinez-Romero “CEDAR’s Predictive Data Entry: Easier and Faster Creation of High-quality Metadata” @ MSOB, Conference Room X-275
Feb 2 @ 12:00 pm – 1:00 pm

 

Marcos Martínez-Romero, PhD
Research Software Developer
BMIR, Stanford University

Abstract:
The ability to find and access biomedical data stored in online repositories depends on the quality of the associated metadata. Despite the growing number of community-developed standards for describing biomedical experiments, the practical difficulties of creating accurate, complete, and consistent metadata are still considerable.

The Center for Expanded Data Annotation and Retrieval (CEDAR) is developing novel methods and tools to simplify the process by which investigators annotate their experimental data with metadata. The CEDAR Workbench is a suite of Web-based tools that together form a pipeline for authoring metadata. As a step towards decreasing authoring time and effort while increasing metadata quality, we have enhanced the CEDAR Workbench with predictive data entry capabilities. Our system identifies common patterns in the CEDAR metadata repository, and generates real-time suggestions for filling out metadata acquisition forms. These suggestions are context-sensitive, meaning that the values predicted for a particular field are generated and ranked based on previously entered values.
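To make the idea of context-sensitive suggestions concrete, here is a minimal Python sketch. It is not CEDAR's actual algorithm or API. It assumes a simple co-occurrence count over previously stored records, and the function name `suggest_values`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def suggest_values(records, target_field, context, top_k=3):
    """Rank candidate values for target_field by how often they appear in
    existing records that match the values already entered (context).

    Illustrative assumption only: a plain frequency count, not CEDAR's method.
    """
    counts = Counter()
    for record in records:
        # Keep only records that agree with every field already filled in.
        if all(record.get(field) == value for field, value in context.items()):
            candidate = record.get(target_field)
            if candidate is not None:
                counts[candidate] += 1
    total = sum(counts.values())
    # Return (value, relative frequency) pairs, most frequent first.
    return [(value, n / total) for value, n in counts.most_common(top_k)]


# Hypothetical metadata records standing in for a metadata repository.
records = [
    {"organism": "human", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "human", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "human", "tissue": "liver", "disease": "cirrhosis"},
    {"organism": "mouse", "tissue": "liver", "disease": "fibrosis"},
    {"organism": "human", "tissue": "brain", "disease": "glioma"},
]

# The user has already entered organism and tissue; suggest a disease.
suggestions = suggest_values(
    records, "disease", {"organism": "human", "tissue": "liver"}
)
```

In this sketch, changing the values already entered changes which records match, and therefore changes both the candidate values and their ranking.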

In this talk, I will discuss some of the challenges that have arisen while implementing our approach, and our strategies for making this capability useful to the end users of CEDAR. I will demonstrate CEDAR’s intelligent authoring capabilities, and show how the technology that we are developing leverages existing metadata to make the authoring of high-quality metadata a manageable task.

Thu, Mar 23
BMIR Research in Progress: Alison Callahan “Painfully Deep Phenotyping – Extracting Patient Reported Pain from Clinical Notes” @ MSOB, Conference Room X-275
Mar 23 @ 12:00 pm – 1:00 pm

Alison Callahan
Research Scientist
Shah Lab, Stanford University

Abstract:
Osteoarthritis is the most common cause of adult disability in the United States, and more than 1 million joint replacements are carried out each year to manage this disease. A significant proportion of patients who undergo joint replacement surgery do not experience an improvement in pain, and some go on to have significant complications requiring joint implant revision surgery. To better understand outcomes following joint replacement, we aim to quantify patient-reported pain before and after surgery, and to combine this information with structured data from electronic health records. We accomplish this by extracting information from clinical notes, which describe patient experience and clinician practice in ways not captured by billing codes, lab reports and medication orders. Natural variation in clinical language and reporting styles, and the cost of creating labeled datasets for supervised machine learning approaches, pose unique challenges for extracting information from unstructured clinical text. I will present in-progress work to overcome these challenges using data programming and text mining to extract mentions of patient-reported pain and implant details from the clinical notes of joint replacement patients.