The FDA has a history of using real-world evidence (RWE) as an integral component of the drug approval process. Moreover, RWE can mitigate the need for placebos in some clinical trials. The clinical records that make RWE useful, however, often reside in unstructured formats, such as doctors’ notes, and must be “abstracted” into a structured clinical format. Cloud technologies and AI can help accelerate this process, making it significantly faster and more scalable.
Leading drug researchers are starting to augment their clinical trials with real-world data in their FDA study submissions because doing so saves time and is more cost-effective. Once a patient’s care concludes, the vast amount of historical unstructured medical data contributes to growing storage needs. Unstructured data is critical to clinical decision support systems, but in its original format a human has to review it to extract insights. With no discrete data points from which insights can be quickly drawn, unstructured medical data can result in increased care gaps and care variances. Unassisted human abstraction alone is neither fast nor accurate enough to abstract all of this patient data. Applied natural language processing (NLP) using serverless software components on Google Cloud provides an efficient way of identifying and guiding clinical abstractors towards a prioritized list of patient medical documents.
Using Google Cloud’s Vertex AI Workbench Jupyter notebooks, you can create a data pipeline that takes raw clinical text documents, processes them through Google Cloud’s Healthcare Natural Language API, and lands the structured JSON output in BigQuery. From there, you can build a dashboard that shows clinical text characteristics, e.g., the number of labels and relationships. With that foundation, you’ll be able to build a trainable language model that can extract text and be further improved over time by human labeling.
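As a rough illustration of the middle step of that pipeline, here is a minimal Python sketch that turns one Healthcare Natural Language API response into flat rows suitable for loading into BigQuery as newline-delimited JSON. It makes no network calls. The endpoint path and the response field names (`entityMentions`, `mentionId`, `linkedEntities`, and so on) reflect our reading of the public REST reference and should be verified against the current documentation; the helper names and the row schema are hypothetical and are not the code used in this demo.

```python
import json


def analyze_entities_url(project_id: str, location: str) -> str:
    """Build the analyzeEntities REST URL (assumed path; verify in the API docs)."""
    return (
        "https://healthcare.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/services/nlp:analyzeEntities"
    )


def build_request_body(document_text: str) -> dict:
    """The request body carries the raw clinical text in 'documentContent'."""
    return {"documentContent": document_text}


def flatten_response(document_id: str, response: dict) -> list:
    """Turn one API response into one flat row per entity mention.

    The row schema is a hypothetical choice made for this sketch.
    """
    rows = []
    for mention in response.get("entityMentions", []):
        text = mention.get("text", {})
        rows.append(
            {
                "document_id": document_id,
                "mention_id": mention.get("mentionId"),
                "type": mention.get("type"),
                "content": text.get("content"),
                "begin_offset": text.get("beginOffset", 0),
                "confidence": mention.get("confidence"),
                "linked_entity_ids": [
                    linked.get("entityId")
                    for linked in mention.get("linkedEntities", [])
                ],
            }
        )
    return rows


def to_ndjson(rows: list) -> str:
    """Serialize rows as newline-delimited JSON, a format BigQuery can load."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)
```

In a notebook, the string returned by `to_ndjson` could be written to a file or Cloud Storage object and loaded into a BigQuery table with a matching schema.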
To better understand how the solution addresses these challenges, let’s review the medical text entity extraction workflow:
The Healthcare Natural Language API lets you efficiently run medical text entity resolution at scale by focusing on the following optimizations:
The following diagram shows the architecture of the solution.
A list of documents, by density score, helps human abstractors know which documents need a lot of work versus only a light review.
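This post does not spell out how the density score is calculated, so the following Python sketch uses an assumed definition purely for illustration: the number of extracted entity mentions per 100 whitespace-separated words. The function names and the formula are hypothetical; any monotonic measure of "how much coded content a document contains" could be swapped in.

```python
def density_score(document_text: str, mention_count: int) -> float:
    """Assumed metric: entity mentions per 100 words (not the demo's formula)."""
    word_count = len(document_text.split())
    if word_count == 0:
        return 0.0
    return 100.0 * mention_count / word_count


def rank_documents(documents: list) -> list:
    """Sort documents from densest to sparsest.

    Each document is a dict with hypothetical keys 'id', 'text',
    and 'mention_count'. Returns (id, score) pairs.
    """
    scored = [
        (doc["id"], density_score(doc["text"], doc["mention_count"]))
        for doc in documents
    ]
    # Highest density first, so abstractors see the heaviest documents at the top.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Sorting in descending order puts the documents that likely need the most abstraction work at the top of the review queue; reversing the order would surface the light-review documents first.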
This Look (view) shows the coded medical text that was mapped to the UMLS clinical ontology by the Google Healthcare Natural Language API.
This Look (view) shows the entity mentions, including the subject of each mention and its confidence score, allowing for loading into a biomedical knowledge graph for further downstream analysis.
This Look (view) shows the entity mentions found in the raw document text.
This demo loaded the entity and document metadata into BigQuery and Looker but didn’t load the rich relationships that come out-of-the-box from the Healthcare Natural Language API. Using those relationships, it is possible to create a biomedical knowledge graph and explore the pathways between disease, treatment, and cohorts, and to help generate new hypotheses linking these facts.
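To make the knowledge graph idea a little more concrete, here is a minimal Python sketch that builds an undirected adjacency map from the `relationships` array of a single API response. It assumes each relationship carries `subjectId`, `objectId`, and `confidence` fields that refer back to mention IDs, which matches our understanding of the response format but should be checked against the API reference. The function name and the confidence threshold are hypothetical, and a production graph would more likely live in a dedicated graph store.

```python
from collections import defaultdict


def build_graph(response: dict, min_confidence: float = 0.5) -> dict:
    """Map each mention's text to the texts of the mentions it is related to.

    Relationships below min_confidence (a hypothetical cutoff) or pointing
    at unknown mention IDs are skipped.
    """
    mention_text = {
        mention["mentionId"]: mention["text"]["content"]
        for mention in response.get("entityMentions", [])
    }
    graph = defaultdict(set)
    for relationship in response.get("relationships", []):
        if relationship.get("confidence", 0.0) < min_confidence:
            continue
        subject = mention_text.get(relationship.get("subjectId"))
        obj = mention_text.get(relationship.get("objectId"))
        if subject is None or obj is None:
            continue
        # Store the edge in both directions so pathways can be walked either way.
        graph[subject].add(obj)
        graph[obj].add(subject)
    return {node: sorted(neighbors) for node, neighbors in graph.items()}
```

Merging the maps produced for many documents would give a simple starting point for exploring links between diseases, treatments, and cohorts.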
We created a barebones dashboard with Looker. However, Looker has rich functionality, such as the ability to push to channels like chat when a document is available for review, to visualize the patient as a medical knowledge graph of related entities, or to embed ML predictions right in the LookML itself. This dashboard should be considered just a starting point for Looker-powered clinical informatics.
To learn more about the Healthcare Natural Language API, please visit our product page. To try it yourself for free, please visit this demo link.
For help with loading this example medical text into a Vertex AI dataset for labeling, please contact the Google Cloud Biotech Team.
Data Privacy
No real patient data was used for any part of this blog post. Google Cloud’s customers retain control over their data. In healthcare settings, access to and use of patient data are protected through the implementation of Google Cloud’s reliable infrastructure and secure data storage that support HIPAA compliance, along with each customer’s security, privacy controls, and processes. To learn more about data privacy on Google Cloud, check out this link.