
Building a natural language patient finder for healthcare analytics

Estimates suggest the average patient generates 80MB of imaging and electronic medical records (EMR) data in a single year. Across health and life sciences, there’s interest in harnessing that data to improve the quality of medical care, as well as the delivery of modern, digital-first care experiences, among other goals.

But data exchange is difficult in healthcare. Health data is strictly governed by patient privacy and consent regulations, and health and life sciences organizations often have fragmented IT environments that make data sharing difficult. As a result, the industry can’t always make use of its growing trove of data in solutions that could drive clinical or operational improvements.

To help, IBM Watson Health sponsors Project Alvearie, a collection of open source projects designed to encourage collaboration around common and pervasive challenges to health data ingestion and flow. Alvearie presents a vision for building an integrated health data pipeline, in which data can be ingested, processed, and analyzed in service of powerful health and life sciences use cases, all while adhering to privacy and consent regulations.

The project is open to collaborators across healthcare and beyond, and IBM developers are active participants in the communities that make up Alvearie. In fact, several groups of IBM developers gathered to participate in the inaugural Alvearie hackathon this year, a companywide event that allowed us to broaden our knowledge and experience of Alvearie assets (future public hackathons are on the roadmap, so stay tuned for more).

The three-day event offered teams of IBMers a chance to design and implement an innovative idea that showed creativity and originality by solving an interesting problem of their own choosing. I participated in one of those teams – the DREAm Team – and our hackathon experience helped show what’s possible when developers with diverse experience apply their knowledge to thorny problems in healthcare.

Using clinical NLP to build a patient finder

Our small team had broad experience in healthcare and software development. Currently, we work to build out reference implementations as part of Alvearie’s health-patterns project. In previous roles, we dabbled in machine learning and natural language processing, as well as medical logic and data analysis. We were also already familiar with some Alvearie components, such as the IBM FHIR Server for patient data persistence and the Cohorting Service for executing queries against the FHIR database in order to find groups of matching patients.

After some brainstorming, it quickly became clear that our idea should leverage these strengths. We proposed building a “patient finder” – a simple tool that would return a list of patients who meet a set of provided criteria. This type of solution could accelerate healthcare analytics, helping data scientists quickly identify groups of patients to support tasks like clinical trial recruitment, medical research, or population health management.

Our prototype system would allow a user to input a natural language text fragment and would in turn produce the Clinical Quality Language (CQL) equivalent that could then be run against the FHIR server using the available cohorting service. To be more specific, it would transform a natural language representation of a simple statement, such as “female patients > 18 years of age,” into a corresponding structured query that could be understood by the FHIR cohorting engine. This automated approach would be useful as a starting point to building complex queries that could identify groups of patients meeting a stated set of criteria.

Architecture

By the end of our brainstorming session, an architecture had emerged that encapsulated a data flow starting with a fragment of input text representing a criterion and ending with a list of IDs for patients who were stored in the FHIR server and met that criterion. Along the way, a series of steps would be implemented, some existing and some new, that would transform the text into a structured CQL statement that could then be passed on to the cohort processing phase of the pipeline.

The diagram below shows the stages. To transform the initial text, we would need the ability to analyze, classify, and extract key elements of the text fragment. Those results could then be used to fill in a generic template as part of the automated CQL generation.

Stages

The blue boxes represent those steps that were new services or repurposed from previous work (these will be described in more detail below). The green box is a clinical natural language processing component known as IBM Watson Annotator for Clinical Data that was chosen to process text and extract various medical concepts, codes, and other entities that would be useful in the transformation. Although we chose IBM Watson Annotator for Clinical Data due to familiarity and ease of use, any similar medical NLP package could be used in its place. The purple boxes represent Alvearie assets that would be used as-is to complete the pipeline.

Technology

We decided to implement the stages of our pipeline as a collection of microservices in which each stage exposed an API, allowing us to quickly test the stages individually as well as chain them together into a useful sequence.

To implement these services, we chose Quarkus to accelerate our ability to write Java-based solutions. Quarkus is a “full-stack, Kubernetes-native Java framework” designed to be easy to use right from the start with little or no configuration. Java code written in a Quarkus environment can quickly be built either to run in a local mode for testing the API endpoints, or to be pushed as a container to a repository, where an automatically generated Kubernetes configuration file can deploy the new container in a cloud-based environment. This allowed for extremely quick prototyping, debugging, and deployment of the pipeline components.

To glue all these services together into a cohesive pipeline, we relied on Project Jupyter – in particular, simple Jupyter notebooks. A notebook is a series of cells, in our case containing snippets of Python code, that could be executed standalone or in sequence to process the stages of the pipeline. These code cells were intermixed with cells containing standard markdown for documenting the pipeline. The notebook not only provided a good development environment but also allowed for easy demonstration of partial or final products.

Python was chosen for the pipeline because of its interactive, prototype-friendly development style, as well as its easy support for writing and testing REST-based API requests. The combination of Quarkus for service development and Python for pipeline development provided a technology ecosystem that served us well in the context of the hackathon.
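The notebook-driven flow described above can be sketched as a simple chain of stage functions. The function names and stub bodies below are hypothetical stand-ins for the REST calls the notebook actually makes to each microservice; only the overall shape of the pipeline reflects the design.

```python
# Sketch of the pipeline glue. Each function stands in for a REST call
# to the corresponding microservice (classification, extraction, CQL
# generation); the stub logic here is illustrative only.

def classify_fragment(text: str) -> str:
    """Stand-in for the fragment-classification service."""
    # A real implementation would POST `text` to the classifier endpoint.
    return "agegender" if "age" in text or ">" in text else "condition"

def extract_concepts(text: str, category: str) -> dict:
    """Stand-in for the Key Concept Extractor service."""
    return {"type": category, "concepts": []}

def generate_cql(extraction: dict) -> str:
    """Stand-in for the CQL generation service."""
    return f'library "PatientsBy{extraction["type"].title()}" ...'

def run_pipeline(text: str) -> str:
    """Run one fragment through classify -> extract -> generate."""
    category = classify_fragment(text)
    extraction = extract_concepts(text, category)
    return generate_cql(extraction)
```

In the actual notebook, each stage was a cell issuing an HTTP request, so intermediate results could be inspected before moving on.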

Building the pipeline

Many of the pipeline components were built from scratch or were based on pieces of original work we had used in other projects and could repurpose here. The first step of the pipeline handles natural language text classification. This stage, known as fragment classification or learned intent, was based on some previous work our team had done building out a machine learning model to classify snippets of text into predefined medically related categories. These categories were intended to represent the general intent of the fragment, ignoring peripheral information. The goal of the classifier was to provide a simple meaning or core topic for the text fragment. For example, the fragment “female patients > 18 years old” was classified as belonging to the age-gender category. As another example, a patient who is able to return to Johns Hopkins for follow-up appointments falls into the category able-to-return.

The importance of this classification process is that, no matter how the fragment is written, the category serves as a kind of normalization. For our purposes, this normalized category maps directly to a CQL template created specifically for that category (details below). Once the template was known, the CQL could be generated by filling in the appropriate details from the actual text.
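That category-to-template mapping can be sketched as a simple lookup. The template names below are illustrative placeholders, not the actual filenames used in the project:

```python
# Hypothetical mapping from fragment category to CQL template.
# Template names are illustrative; the real project stores template
# bodies containing placeholders to be filled in later.
CATEGORY_TO_TEMPLATE = {
    "agegender": "PatientsByAgeGender.cql.template",
    "condition": "PatientsByCondition.cql.template",
    "able-to-return": "PatientsAbleToReturn.cql.template",
}

def template_for(category: str) -> str:
    """Look up the CQL template for a classified fragment category."""
    try:
        return CATEGORY_TO_TEMPLATE[category]
    except KeyError:
        raise ValueError(f"No CQL template registered for {category!r}")
```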

To extract those details, the next stage in the pipeline, known as the Key Concept Extractor, was built. This service took the original text as well as the category from the previous stage and returned a collection of medical concepts and codes, along with useful information such as negation recognition and relationship extraction. The service relied on a natural language processing component, as well as another service capable of expanding codes (taking a SNOMED code and expanding it to encompass other related codes).

For example, using our age-gender example from above, the Key Concept Extractor returned two codes (one UMLS and one SNOMED) representing various important aspects of the text fragment. In this case, the UMLS concept represents the age that was mentioned along with relationship information stating that greater than and 18 are important. The SNOMED code (446141000124107) that was returned identifies the female gender.

{
    "concepts": [
        {
            "code": "C0001779",
            "codeSystem": "UMLS",
            "origin": "ConceptValue",
            "text": "age",
            "trigger": "greater than",
            "units": "years",
            "value": "18"
        },
        {
            "code": "446141000124107",
            "codeSystem": "SNOMED",
            "origin": "Regex"
        }
    ],
    "type": "agegender"
}

Putting all these parts together, we know that the meaning of the fragment is related to age and gender and specifically that the gender mentioned is female and the age concept relates to the value 18 with the relationship of greater than. That provides all the individual data required to build a detailed CQL query representing this text.

It is important to note that the Key Concept Extractor also performed a filtering task as part of its processing. Each category provided logic that identified which concepts were of interest; using the fragment classification, the Extractor kept only those concepts that were important for that category, discarding the rest as extraneous.
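A minimal sketch of that filtering step, assuming each category declares which concept origins it cares about (the rule set below is illustrative, not the project's actual logic):

```python
# Hypothetical per-category filter: each category lists the concept
# origins it considers interesting; everything else is extraneous.
INTERESTING_ORIGINS = {
    "agegender": {"ConceptValue", "Regex"},
    "condition": {"SymptomDisease"},
}

def filter_concepts(extraction: dict) -> list:
    """Keep only the concepts relevant to the fragment's category."""
    wanted = INTERESTING_ORIGINS.get(extraction["type"], set())
    return [c for c in extraction["concepts"] if c.get("origin") in wanted]

# Using the agegender extraction shown earlier, plus one extra concept:
extraction = {
    "type": "agegender",
    "concepts": [
        {"code": "C0001779", "codeSystem": "UMLS", "origin": "ConceptValue"},
        {"code": "446141000124107", "codeSystem": "SNOMED", "origin": "Regex"},
        {"code": "C0000000", "codeSystem": "UMLS", "origin": "Unrelated"},
    ],
}
relevant = filter_concepts(extraction)  # drops the "Unrelated" concept
```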

A second service, Concept Expansion, was also used during this phase. Concept Expansion provides additional concept codes so that as complete a set as possible could be used in the CQL query. For example, the fragment “Patient has diabetes mellitus” is classified as a condition with SNOMED code 73211009 (shown below). However, there are a number of other similar codes in the SNOMED vocabulary that can also mean the same thing, and to build a complete CQL we wanted to be sure to include them all. Concept Expansion took the code for diabetes and returned a collection of related codes.

{
    "concepts": [
        {
            "code": "73211009",
            "codeSystem": "SNOMED",
            "origin": "SymptomDisease",
            "text": "diabetes mellitus"
        }
    ],
    "type": "condition"
}

The expansion gives a number of additional codes (not all shown here for brevity).

SNOMED expansion:
{
    "parameter": [
        {
            "name": "version",
            "valueString": "http://snomed.info/sct"
        },
        {
            "name": "display",
            "valueString": "Diabetes mellitus, NOS"
        },
        {
            "name": "designation",
            "part": [
                {
                    "name": "language",
                    "valueCode": "en"
                },
                {
                    "name": "use",
                    "valueCoding": {
                        "code": "900000000000003001",
                        "display": "Fully specified name",
                        "system": "http://snomed.info/sct"
                    }
                },
                {
                    "name": "value",
                    "valueString": "Diabetes mellitus (disorder)"
                }
            ]
        },
        {
            "name": "designation",
            "part": [
                {
                    "name": "language",
                    "valueCode": "en"
                },
                {
                    "name": "use",
                    "valueCoding": {
                        "code": "900000000000013009",
                        "display": "Synonym",
                        "system": "http://snomed.info/sct"
                    }
                },
                {
                    "name": "value",
                    "valueString": "Diabetes mellitus"
                }
            ]
        },
        {
            "name": "designation",
            "part": [
                {
                    "name": "language",
                    "valueCode": "en"
                },
                {
                    "name": "use",
                    "valueCoding": {
                        "code": "900000000000013009",
                        "display": "Synonym",
                        "system": "http://snomed.info/sct"
                    }
                },
                {
                    "name": "value",
                    "valueString": "DM - Diabetes mellitus"
                }
            ]
        }
    ]
}
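The expansion response is shaped like a FHIR Parameters resource, so pulling out the designation strings (or, analogously, any expanded codes) is a matter of walking the `parameter` list. A sketch, using a trimmed copy of the response above:

```python
# Walk a FHIR Parameters-shaped expansion response and collect the
# designation value strings for the expanded concept.
def collect_designations(expansion: dict) -> list:
    values = []
    for param in expansion.get("parameter", []):
        if param.get("name") != "designation":
            continue
        for part in param.get("part", []):
            if part.get("name") == "value":
                values.append(part["valueString"])
    return values

# Trimmed version of the SNOMED expansion response shown above.
expansion = {
    "parameter": [
        {"name": "display", "valueString": "Diabetes mellitus, NOS"},
        {"name": "designation", "part": [
            {"name": "language", "valueCode": "en"},
            {"name": "value", "valueString": "Diabetes mellitus (disorder)"},
        ]},
        {"name": "designation", "part": [
            {"name": "value", "valueString": "DM - Diabetes mellitus"},
        ]},
    ],
}
names = collect_designations(expansion)
```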

The next step of the pipeline, CQL Generation, takes the results of the previous stages and produces a fully instantiated CQL query for that type of fragment. The basic process is to use the fragment classification as a mapping to a generic CQL template with static and dynamic components. For example, the template below describes the components of an agegender CQL by specifying that the gender of interest could be male or female or both. Furthermore, the age aspect of the query is based on the current age (AgeInYears()) and the boundary value that was extracted from the age concept earlier. The trigger is also needed as it defines a relationship between the current age and the boundary. Other parts of the template are static (such as the header, FHIR version, and helper statements) and are always needed.

library "PatientsByAgeGender" version '1.0.0'

// gender based patients by age restriction

using FHIR version '4.0.1'

include "FHIRHelpers" version '4.0.1' called FHIRHelpers

context Patient

define "Patient is Male":
   Patient.gender.value = 'male'

define "Patient is Female":
   Patient.gender.value = 'female'

define "Initial Population":
   GENDER

define "Denominator":
   "Initial Population"

define "Numerator":
   AgeInYears() TRIGGER AGEVALUE

Once the details are known, the template can be rendered into a concise and complete CQL query. The final result produced by the CQL generator for this case is shown below. Note that it has been simplified, with whitespace removed and extra definitions cleaned up, so that it includes only the items actually needed for a working CQL query.

library "PatientsByAgeGender" version '1.0.0'
// gender based patients by age restriction
using FHIR version '4.0.1'
include "FHIRHelpers" version '4.0.1' called FHIRHelpers
context Patient
define "Denominator":
Patient.gender.value = 'female'
define "Numerator":
AgeInYears() > 18
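The rendering step can be sketched as placeholder substitution over the template. The placeholder names (TRIGGER, AGEVALUE) come from the template itself; the trigger-to-operator mapping here is an assumed simplification of the generator's actual logic:

```python
# Illustrative rendering of the agegender template's Numerator line.
TEMPLATE_NUMERATOR = "AgeInYears() TRIGGER AGEVALUE"

# Assumed mapping from extracted trigger phrases to CQL operators.
TRIGGER_TO_OPERATOR = {"greater than": ">", "less than": "<"}

def render_numerator(trigger: str, value: str) -> str:
    """Fill the TRIGGER and AGEVALUE placeholders from the age concept."""
    op = TRIGGER_TO_OPERATOR[trigger]
    return TEMPLATE_NUMERATOR.replace("TRIGGER", op).replace("AGEVALUE", value)

# From the Key Concept Extractor output: trigger="greater than", value="18"
numerator = render_numerator("greater than", "18")  # "AgeInYears() > 18"
```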

Once we have a CQL query, the rest of the pipeline utilizes Alvearie components to complete the flow. Our FHIR server contains the patient data that we want to run the query against; for this case, we assume it has already been populated. To run the query, we utilized the Health-Patterns Cohort Service, a wrapper around the Cohort Engine exposed by the Quality Measure and Cohort Service. The Cohort Service allows users to upload a CQL query to a library and then run an item from that library against a FHIR server.

For the age-gender example, the Jupyter notebook step outputs the following activity log.

name: PatientsByAgeGender
version: 1.0.0
endpoint extension: PatientsByAgeGender-1.0.0
Posting CQL to Cohort Service: <Response [201]>
Running CQL on Cohort Service: <Response [200]>

2 patients returned
a2d93ac8-d601-4c54-bb45-c8d3f2d91846
67e2c1bf-18f9-4645-956e-2053fe6751e9

Once the CQL was available, this step extracted the name to be used, posted the CQL to the cohort service library, and then executed that named CQL. Note that the end result was a list of two patients (IDs are shown) from the current FHIR server who met the criteria of being female and older than 18.
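The "endpoint extension" in the log above is simply the library name and version joined together. A sketch of that small helper, with the cohort-service URL and HTTP calls themselves left out as deployment details:

```python
# Build the "<name>-<version>" identifier used when posting the CQL to
# the cohort service library and when running the named query.
def library_endpoint_extension(name: str, version: str) -> str:
    return f"{name}-{version}"

ext = library_endpoint_extension("PatientsByAgeGender", "1.0.0")
# ext matches the "endpoint extension" line in the activity log
```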

Artifacts

The code for these services, as well as the Jupyter notebook used to facilitate the pipeline, can be found in this repository. This is a demonstration project coming directly from the hackathon: it is a prototype designed to show what can be done, and no attempt has been made to make the solution robust or complete. The README discusses the details, but realize that these artifacts are very preliminary, and your mileage may vary as you use them.

Takeaways and next steps

The hackathon was a fantastic experience. Our tools allowed us to quickly code, build, and deploy our solution, as a whole or in parts, as many times as needed. Being able to run locally as well as in the cloud with little or no configuration change provided us with an efficient development cycle. This allowed us to maximize our productivity over the short development time. We learned new skills that we could use immediately as we get back to our regular squad work. And yes, we won!

Most importantly, we were able to leverage a great collection of Alvearie assets to efficiently build out an application in an extremely short amount of time. We created a deliverable that could have value beyond the hackathon: Data scientists across health and life sciences need tools to accelerate their analytics work in support of new health care delivery and operational models. We hope our work provides a foundation for other developers to build on in this effort.

To learn more, explore Project Alvearie and get involved with a community of developers helping to enable healthcare innovation. You can also try IBM Watson Annotator for Clinical Data on premises or in the cloud.