Interview with Dr. Frank Lee, Global Healthcare & Life Sciences Industry Leader at IBM; Chief Architect, IBM Reference Architecture for Genomics – Speaker at PMWC 2018 Silicon Valley

Q: As a pioneer in high-performance architectures for genomic research, what are the biggest challenges you see clients dealing with as they invest in precision medicine initiatives?

A: As precision medicine rapidly transforms the healthcare industry into one that is completely data-driven and evidence-based, a giant data wave is forming, and clients are struggling to deploy the infrastructure and technologies needed to keep pace.

A: As part of my doctoral work in molecular genetics at Washington University, I participated in the first international genome project, which took nearly 10 years and billions of dollars to sequence the first human genome. With the arrival of next-generation sequencing technologies, we can now complete DNA sequencing for a single genome for roughly one thousand dollars and in less than a week. Those cost dynamics will eventually support the expansion of genomics analytics beyond academic research centers and pharmaceutical R&D into clinical settings, but not until organizations are equipped to handle the large amounts of new data that will be produced on a regular basis.

Q: But precision medicine involves much more than just genomics data. What are the other data sources that we will need to address to support precision medicine?

A: That’s right: multiple studies indicate that an individual’s health is only approximately 30% attributable to his or her genomic profile. The largest source of data impacting our health is believed to be associated with factors outside the health system, what we refer to as exogenous data, including individual environmental, socio-economic and lifestyle characteristics. When these are combined with the other data sources essential to comprehensive precision health, including clinical records, labs and imaging, omics/Dx data, and remote monitoring and wearable data, it is easy to understand the magnitude of the challenge.

Q: As you developed a target reference architecture, what other challenges did you have to consider?

A: The client’s computing platform must be fast, easy to use, affordable, and collaborative. It must accommodate growing volumes of data that are often dispersed across geographic and organizational boundaries, in order to support collaborative research and care management. It must also support an open framework and the hundreds of applications being developed in genomics, imaging, clinical analytics, and artificial intelligence and deep learning. Many of these applications are isolated in functional and operational silos, further inhibiting integration of relevant data sets.

Q: With that in mind, what advice do you give to clients who are making strategic investments to modernize their existing capabilities?

A: I look at three key ingredients for building a flexible and sustainable foundation that will support the demands of precision medicine. First, the solution must be defined by software. Even though the infrastructure is built from hardware, with chips and processors, and data resides in attached storage, the ability to operate, orchestrate and organize data lies in software: middleware sitting between the hardware and the applications. This is what we call software-defined infrastructure (SDI).

Second, any solution needs to follow a defined reference architecture. Our experience with clients has taught us that software capabilities can easily be dictated by the underlying hardware building blocks (CPU vs. GPU, on-premises vs. cloud, x86 vs. POWER8) and even more so by the hundreds, if not thousands, of applications the solution needs to support. Without a consistent framework and roadmap in the form of a reference architecture, things will fall apart (or branch off) very quickly. It might take more effort initially, but the value and benefits are long-lasting and wide-reaching.

Finally, the solution, software and architecture must be part of an ecosystem. Dozens of major healthcare and life science organizations worldwide, including international initiatives, top cancer centers, genome centers, and large pharma and biotech companies, have adopted our solutions and reference architecture. One benefit they have started to realize is collaboration and sharing based on a common architecture. For example, a research hospital can develop a cancer genomics pipeline and share it with another institution quickly, either by sending the XML-based pipeline script or by publishing it in a cloud-based portal that works like an application store. We have also started to see early examples of data sharing using metadata and RESTful APIs. Based on this approach, parallel communities and consortia are being formed for digital medical imaging, translational research and big data analytics, making parallel discovery possible.
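To make that sharing pattern concrete, here is a minimal sketch of what metadata-based data sharing over a RESTful API could look like. This is an illustration only, not an IBM interface: the catalog URL, query parameters and record fields are hypothetical stand-ins for whatever metadata service two collaborating institutions agree on.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical metadata catalog endpoint; the interview names no specific
# service, so this URL and the field names below are illustrative only.
CATALOG_URL = "https://catalog.example.org/api/v1/datasets"


def find_shared_datasets(tag, assay_type):
    """Query a RESTful metadata catalog for datasets shared by partner sites.

    Only descriptive metadata travels over the API; the raw genomic files
    stay at the owning institution and are referenced by URI in each record.
    """
    query = urllib.parse.urlencode({"tag": tag, "assay": assay_type})
    with urllib.request.urlopen(f"{CATALOG_URL}?{query}") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # List whole-genome sequencing datasets tagged for a cancer genomics study.
    for record in find_shared_datasets("cancer-genomics", "WGS"):
        print(record["institution"], record["pipeline"], record["data_uri"])
```

The design point this sketch is meant to echo is that collaborating institutions agree on a metadata schema and an API contract rather than on each other's internal storage layout, which is what allows the common reference architecture to travel across organizational boundaries.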

Q: It can all sound a bit overwhelming. What do you say to the skeptics who argue that the hurdles in our current healthcare system are too big for precision medicine to become a reality anytime soon?

A: We’re on a long journey. We’ve been focused on understanding the structure and biology of the human genome, and in the grand scheme of things we’ve largely accomplished that in a remarkably short period of time.

We are now focused on understanding the biology of disease in order to ultimately advance the science of medicine and improve the effectiveness of healthcare. While the hurdles are significant, the pace is accelerating and there is no turning back on this journey. Healthcare researchers, providers, technologists and data scientists must begin building a strong foundation now for turning all this data into practical insights, or they risk being left behind.