M.D., Ph.D., Director of Product Management, Flatiron Health
Vineeta is a physician-scientist who is passionate about the intersection of data science and health care. As a director of product management at Flatiron Health, she works with a cross-disciplinary team to build national-scale, research-grade databases that integrate electronic health records and genomic data for translational/outcomes research and drug development in oncology. Previously, Vineeta conducted genomics research at the Broad Institute of Harvard & MIT. She has also been a data scientist at the Boston-based health tech startup Kyruus, and a management consultant for biotech, pharmaceutical, and medical device clients at McKinsey & Company in New York. Vineeta has a B.S. in biophysics from Stanford University. She earned M.D. and Ph.D. degrees from Harvard Medical School and MIT.
Real-World, EHR-Linked Clinico-Genomic Datasets to Accelerate Cancer Research
As cancer patients receive care in the real world, electronic health records (EHRs) create an ever-growing, rich digital footprint of symptoms, treatments, and outcomes. Concurrently, tumor molecular profiling is being conducted to identify optimal therapies as part of routine care. How can these “real-world” data be harnessed to enable faster learning in oncology?
We describe a new paradigm for creating a continuously refreshing, real-world clinico-genomic database (CGDB) that overcomes traditional data flow barriers. We developed HIPAA-compliant processes to link patient-level clinical data from EHRs across the U.S. (in the Flatiron Health network) with patient-level tumor sequencing data from Foundation Medicine. These data were strictly de-identified for research.
The CGDB includes >20,000 patients (>3,000 with lung cancer, >2,000 with colon cancer, >2,000 with breast cancer) and grows quarterly. Genomic alterations (SNVs, CNVs, rearrangements) across >300 genes, tumor mutation burden (TMB), and microsatellite instability (MSI) status are included, alongside treatment history and response data. The dataset recapitulates known survival trends for biomarker-defined sub-populations receiving targeted therapies (EGFR+, ALK+), and supports the recent observation that high TMB predicts response to checkpoint-inhibitor immunotherapies. The CGDB enables rapid queries for translational science, outcomes research, and trial design, and could ultimately extend even to the point of care.
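To make the idea of linking and querying concrete, here is a minimal sketch in Python. It is not the actual CGDB pipeline or schema: the keyed-hash token, the field names (`token`, `cancer`, `altered_genes`, `tmb`), the helper functions, the TMB cutoff, and all records are hypothetical and synthetic, chosen only to illustrate joining clinical and genomic rows on a shared de-identified key and then selecting a biomarker-defined cohort.

```python
import hashlib
import hmac


def link_token(patient_key: str, secret: bytes) -> str:
    """Keyed hash standing in for a de-identified linkage token (assumption)."""
    return hmac.new(secret, patient_key.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(clinical, genomic):
    """Inner-join clinical and genomic rows that share the same token."""
    genomic_by_token = {row["token"]: row for row in genomic}
    return [
        {**c, **genomic_by_token[c["token"]]}
        for c in clinical
        if c["token"] in genomic_by_token
    ]


def select_cohort(linked, cancer, gene=None, min_tmb=None):
    """Filter linked rows by cancer type, an altered gene, and/or a TMB floor."""
    cohort = []
    for row in linked:
        if row["cancer"] != cancer:
            continue
        if gene is not None and gene not in row["altered_genes"]:
            continue
        if min_tmb is not None and row["tmb"] < min_tmb:
            continue
        cohort.append(row)
    return cohort


# Synthetic demo data; the secret, keys, and values are illustrative only.
SECRET = b"demo-only-secret"
clinical = [
    {"token": link_token("patient-A", SECRET), "cancer": "lung", "therapy": "targeted"},
    {"token": link_token("patient-B", SECRET), "cancer": "lung", "therapy": "checkpoint inhibitor"},
    {"token": link_token("patient-C", SECRET), "cancer": "colon", "therapy": "chemotherapy"},
]
genomic = [
    {"token": link_token("patient-A", SECRET), "altered_genes": {"EGFR"}, "tmb": 3.0},
    {"token": link_token("patient-B", SECRET), "altered_genes": {"KRAS"}, "tmb": 22.0},
]

linked = link_records(clinical, genomic)  # patient-C has no genomic row and drops out
egfr_lung = select_cohort(linked, "lung", gene="EGFR")
high_tmb_lung = select_cohort(linked, "lung", min_tmb=20.0)  # cutoff is arbitrary
```

In this sketch only patients present in both sources survive the join, which mirrors the general notion of a linked cohort; a production system would differ in how tokens are generated, governed, and audited.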