There’s been a lot written about Watson being applied to diverse areas of professional expertise, especially in healthcare and more recently as a tax expert on behalf of H&R Block.  It’s even been profiled at ESPN, helping fantasy football team owners make better decisions.  But how does this happen?  Does IBM simply flip a switch, feed some research articles into the system, and voila, Watson understands a new industry?

It’s not that simple.  There are various ways Watson – which today is a diverse set of capabilities rather than a single intelligent system – can be taught to think like an expert.  We’re going to highlight one of those methods below – creating a “type system” in an application called Watson Knowledge Studio (WKS) – along with some best practices for teaching Watson to think like an expert.  While this method does require human guidance – this is called supervised machine learning – the benefits are immense.  Following this process teaches Watson to read subject-specific literature – textbooks, research articles, emails, blogs, whatever – just as an expert who uses those sources would.  WKS is cloud based, and you can learn more about it here.

If one professional can read a hundred documents in a week, or a team could read a thousand, Watson can read millions – extract the knowledge contained in that literature – and bring that knowledge to the fingertips of professionals who would no longer be at the mercy of keyword search.  They’re no longer required to read everything returned from those searches and draw their own conclusions.  Instead, Watson can read on their behalf, extract the knowledge contained, summarize it, and present it to the user of the system with evidence.  The professionals use that as a supplement to their own expertise, and can thus make better, faster, evidence-based decisions.  Nothing happens in a black box.

Here’s how to get started.

TL;DR
WKS (Watson Knowledge Studio) is pretty awesome. When you use it, best practices to form a type system include:

  • Keep the type system simple!  Especially with relationships.
  • When you get your type system established, start with a small number of documents (five-ish) and two human annotators.  You’ll quickly discover if there are wrong or missing types.
  • Don’t set an arbitrary limit on number of types – use as many as are demanded by your use case.
  • Test out the type system with 1-2 people, before opening up to the full training process with your larger team.
  • If you don’t already have dictionaries built for your entity types, it’s worth the manual effort to develop them yourself, before training for mentions.

The secret sauce behind developing a type system for WKS

The scenario: you are using Watson Knowledge Studio (WKS) to parse unstructured written text, to extract entities and relationships, and need to figure out where to begin.  There are several reasons you would use WKS for this purpose, with two popular use cases being:

  1. Insurance claim negotiation.  Extracting pertinent information from road traffic accident reports, to determine severity of an accident (i.e. you’re designing a system that can automate the reading of these reports).  The knowledge you’d be extracting from these reports would include severity of an accident, high speed, low speed, impact on claimant, need for child care, etc.
  2. Domain adaptation, using medical trial survey data to understand what doctors think about results, and whether they’re ready to adopt new medication(s).  The knowledge you’d be extracting from these reports would include perceived benefits, perceived drawbacks, new drugs, stage of treatment, etc.

Historically, this kind of natural language processing (NLP) was done with programmatic rules.  The challenge with programmatic NLP is that it requires linguistic and programming skills, along with subject matter expertise.  Linguists are needed to properly understand the grammar, syntax, and vocabulary of a given domain – which often requires SMEs to fully map out – and then computer scientists are needed to translate that understanding into programmatic “rules” (little pieces of code) that parse the unstructured data (emails, reports, articles, etc.).  The output of this work can be very powerful – those rules can work great! – but reaching that outcome requires several distinct skillsets, and people who possess all of them (SME, computer science, linguistics) are in very short supply.
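To make the rule-based approach concrete, here is a minimal sketch of what one such programmatic “rule” might look like – a hand-written regular expression extracting a speed mention from an accident report. The rule and the report text are illustrative, not from a real system; note how every new phrasing in the source text would demand a new rule, which is exactly why this approach needs both linguistic and programming skill.

```python
import re

# A hand-written "rule": a regex that extracts vehicle speed mentions
# from an accident report. Each new way of phrasing a speed would need
# another rule like this one.
SPEED_RULE = re.compile(
    r"(?:travelling|traveling|driving) at (?:approximately |about )?(\d+)\s*mph",
    re.IGNORECASE,
)

report = "The claimant was travelling at approximately 45 mph when the collision occurred."

match = SPEED_RULE.search(report)
if match:
    print("extracted speed:", match.group(1), "mph")
```

A rule like this works well on text it anticipates, and silently misses everything else (“doing about forty-five”, “45mph”, etc.) – the brittleness that supervised machine learning is meant to address.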

The Watson Advantage – WKS

Watson Knowledge Studio is IBM proprietary tooling that represents the latest in natural language processing.  It utilizes inferential statistics to produce “models” trained using supervised machine learning.  These models infer entities within textual data, and then infer relationships (correlations) between those entities.  A simple example is creating a system that recognizes “drug” entities and “condition” entities, with a relationship type of “treats_condition”.  After training, your WKS models would be able to read medical journals and extract knowledge contained within them, such as the entity “aspirin” having a “treats_condition” relationship with the condition “headache”.  That information could then be surfaced to users of the system if they instructed Watson to “tell me all the drugs known to treat headaches”, by partnering the models with Watson Discovery Service (as one approach).  It goes layers deeper than keyword search, because the system actually understands the context of what you’re asking about.
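As an illustration of the drug/condition example above, here is the type system and a resulting extraction represented as plain Python data structures. This is not the WKS export format – in WKS you define the type system through the GUI – it’s just a sketch of the information involved:

```python
# Illustrative sketch of the drug/condition type system described above.
# In WKS you would define this through the GUI; this plain-Python form
# just shows the shape of the information.
type_system = {
    "entity_types": ["Drug", "Condition"],
    "relation_types": [
        {"name": "treats_condition", "head": "Drug", "tail": "Condition"},
    ],
}

# After training, a model reading "aspirin ... headache" in a journal
# article would emit a structured triple along these lines:
extracted = {
    "head": ("Drug", "aspirin"),
    "relation": "treats_condition",
    "tail": ("Condition", "headache"),
}

print(extracted["head"][1], extracted["relation"], extracted["tail"][1])
```

A query like “tell me all the drugs known to treat headaches” then becomes a simple filter over triples like these, rather than a keyword match over raw text.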

The example cited above is obviously pretty simple – aspirin is well known to treat headaches – but imagine the power behind systems that can read millions of articles and extract the latest-and-greatest knowledge being published on a daily basis.  Or find that obscure correlation you didn’t think to explicitly ask about.  Or almost instantly read the thousands of accident reports that flow into an insurance provider every day.  Such systems would be (and are) capable of keeping otherwise busy professionals – doctors, investigators, scientists, lawyers, analysts, etc. – apprised of the latest discoveries and most cutting-edge research, without putting the onus on humans to read all these thousands of documents themselves.  This technology can be applied to virtually any industry, in any context – the only limit on use cases is your imagination.

One of the great breakthroughs with WKS is the relative ease with which users can train the system.  Instead of writing complex code, anybody can sit down and “train” the system in a web browser, using a simple GUI (you’re basically highlighting text).  But before you can begin the training process, you first need to understand what is important to you and your use case, within the context of the data being parsed by Watson (emails, reports, articles, etc.).  The process of identifying that information is called developing a type system, and it’s driven by your business use case.  One of the major, easily avoidable mistakes we’ve seen people make with WKS is designing the type system to be unnecessarily complex.  They come up with too many entities, without recognizing the effort involved to train them – but the biggest problem is the number of relationships between entity types.  As an example, suppose you’re creating a type system to recognize people (the entity) and the relationships between those people.  Ask yourself:

  • How much specificity do you need in those relationships?
  • Do you need highly specific understanding of familial relationships (“brother of”, “father of”, “mother of”, “sister of”, etc.), or would your business use case be satisfied by simply understanding there exists a “relative of” relationship between these entities?

As much as possible, gravitate toward simplicity!  Remember that each relationship will need >100 specific training examples, so you’re on the hook for finding at least one hundred examples of “father of” relationships between two people across your training documents, should you decide you need to get that granular.
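To see how quickly granularity inflates the annotation workload, here is the rough arithmetic, using the 100-examples-per-relationship rule of thumb from above (the relationship names are the illustrative ones from the family example):

```python
# Rough annotation-budget arithmetic using the ~100 examples per
# relationship rule of thumb mentioned above.
EXAMPLES_PER_RELATION = 100

# Granular family relationships vs. a single coarse relationship.
granular = ["father_of", "mother_of", "brother_of", "sister_of"]
simple = ["relative_of"]

print("granular:", len(granular) * EXAMPLES_PER_RELATION, "training examples")
print("simple:  ", len(simple) * EXAMPLES_PER_RELATION, "training examples")
```

Four granular relationship types mean finding and hand-annotating at least 400 examples across your training documents, versus 100 for the single “relative of” relationship – a fourfold difference in human effort before the model learns anything.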

It’s always helpful to have a second person testing out the type system, and it’s ideal if both people have WKS experience.  Once you’ve established a more stable type system – with the expectation and understanding that it will change early on – you can expand that group to four or five people.  Remember that after you’ve begun the training process, changes to a type system are a destructive act, and having too many voices early on will make that process worse.

If you’re working in a domain with an existing data structure, such as a product catalog or a list of domain-specific terms, these can be quickly exploited as dictionaries for your new type system before you begin the training process.  And if you don’t have pre-existing dictionaries, it’s worth the (sometimes considerable) effort to develop them yourself.  The dictionary doesn’t need to be comprehensive – if you’re training a system to detect companies, for example, you don’t need to compile a list with *every* known company – but creating a list of examples will give your model a better starting point, and reduce the manual effort when you move into the formal training phase.  Even though compiling such a list does require manual effort, our experience has been that these early efforts will ultimately save you time.
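As a sketch, a seed dictionary can be as simple as a CSV of canonical names plus known surface-form variants. The exact file format WKS expects is described in its documentation; the structure and the example companies below are illustrative only:

```python
import csv
import io

# A small seed dictionary for a "Company" entity type: one canonical
# name per row, plus surface-form variants the model should treat as
# the same company. It doesn't need to be exhaustive -- even a short
# list gives the model a head start.
companies = [
    ("International Business Machines", ["IBM", "I.B.M."]),
    ("H&R Block", ["HR Block"]),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["lemma", "surface_forms"])
for lemma, variants in companies:
    writer.writerow([lemma, ";".join(variants)])

print(buf.getvalue())
```

Starting from an existing product catalog or glossary, generating a file like this is often a one-off scripting task rather than a manual annotation exercise.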

With your own use case in mind, you can get started with Watson Knowledge Studio here.


About the authors

Daniel Hunt is a data science professional with expertise in storytelling and Watson’s cognitive services.  He serves as the technical team lead for the Watson Experience Center in New York, and as an SME for the Watson Platform.  His primary responsibility is to drive new partnerships using IBM’s Watson services by demystifying artificial intelligence and making it relatable to the C-suite.

Edd Biddle is a Senior Data Scientist with IBM Watson, with a deep background in natural language processing and text analytics.  Based in the UK, he is responsible for helping drive Watson implementations and delivery for IBM partners.

2 comments on"Teaching Watson to think like a subject matter expert"

  1. Rajesh Gudikoti January 15, 2018

    Just as each relationship needs >100 specific training examples, is there a similar rule (a minimum number of samples) for entities?

  2. The figure of 100 is a general rule of thumb, and it can be applied to each entity type and each relationship between entity types. The actual number will depend heavily on the amount of variation in how the entity type or relationship type is reported.

    The point of this rule is to make people think about the amount of human annotation work that will be required, especially if they create a complex type system with lots of entity and relationship types. At the end of the day, WKS is applying a machine learning algorithm, which requires statistically significant patterns to learn from.
