Visualize unstructured data using Watson Natural Language Understanding

This pattern is part of the Get started with natural language processing learning path.

Level | Topic | Type
100 | An introduction to Watson natural language processing | Article
101 | Look deeper into the Syntax API feature within Watson Natural Language Understanding | Article
201 | Visualize unstructured data using Watson Natural Language Understanding | Code pattern
301 | Discover hidden Facebook usage insights | Code pattern

Summary

In this code pattern, we will create a web app for visualizing unstructured data using Watson™ Natural Language Understanding, Apache Tika, and D3.js. After a user uploads a local file, the application uses Apache Tika to extract text from it. The text is then passed to Watson Natural Language Understanding, which extracts enrichments such as entities and concepts. Finally, the application uses the D3.js library to display the results to the user.

Description

The main benefit of using the Watson Natural Language Understanding service is its analytics engine, which provides cognitive enrichments and insights into the data. The key enrichments that are extracted include:

  • Entities – People, companies, organizations, cities, and more
  • Keywords – Important topics typically used to index or search the data
  • Concepts – Identified general concepts that aren’t necessarily referenced in the data
  • Sentiment – The overall positive or negative sentiment of the data
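As a rough illustration of how these four enrichments might be requested together, the sketch below builds a JSON request body with one entry per enrichment under a `features` object. It assumes the body shape of the service's analyze endpoint; the function name `build_analyze_request` and the default `limit` are hypothetical, and no network call is made.

```python
import json


def build_analyze_request(text, limit=10):
    """Build a request body asking for the four enrichments listed above.

    Assumption: the service accepts a "features" object keyed by
    enrichment name. The helper name and default limit are illustrative.
    """
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    return {
        "text": text,
        "features": {
            "entities": {"limit": limit},   # people, companies, cities, ...
            "keywords": {"limit": limit},   # important topics in the text
            "concepts": {"limit": limit},   # general concepts, possibly implicit
            "sentiment": {},                # overall document sentiment
        },
    }


if __name__ == "__main__":
    body = build_analyze_request("IBM opened a new research lab in Zurich.", limit=5)
    print(json.dumps(body, indent=2))
```

The app itself may request a different set of options; see the README for the configuration it actually uses.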

The enrichments will be displayed using D3.js, a JavaScript library that provides powerful visualization techniques that help bring data to life. In this app, we will use it to display each of the enrichments in an interactive bubble cloud, with each element’s size and location determined by its relative significance.
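One way to tie a bubble's size to significance is to map a relevance score onto a radius with a square-root scale, so that large scores do not dominate the chart visually. The sketch below is an assumption about how such sizing could work, not the app's actual code; the function name `bubble_radius` and the radius bounds are hypothetical, and the score is assumed to lie between 0 and 1.

```python
import math


def bubble_radius(relevance, min_radius=10.0, max_radius=60.0):
    """Map a relevance score in [0, 1] to a bubble radius in pixels.

    Uses a square-root scale between min_radius and max_radius.
    Scores outside [0, 1] are clamped to that range.
    """
    clamped = max(0.0, min(1.0, relevance))
    return min_radius + (max_radius - min_radius) * math.sqrt(clamped)


if __name__ == "__main__":
    for score in (0.0, 0.25, 1.0):
        print(score, bubble_radius(score))
```

In the browser, D3.js offers scale helpers that serve the same purpose; this Python version only illustrates the arithmetic.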

When you have completed this code pattern, you will understand how to:

  • Create and use an instance of Watson Natural Language Understanding
  • Leverage Apache Tika to extract text from unstructured files
  • Use D3.js for displaying the visuals

Flow


  1. User configures credentials for the Watson Natural Language Understanding service and starts the app.
  2. User selects data file to process and load.
  3. Apache Tika extracts text from the data file.
  4. Extracted text is passed to Watson NLU for enrichment.
  5. Enriched data is visualized in the UI using the D3.js library.
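Steps 3 to 5 can be sketched as a small pipeline. This is a minimal outline under stated assumptions, not the app's implementation: the text extractor and the analyzer are passed in as plain callables (stand-ins for Apache Tika and Watson NLU), the response is assumed to contain lists of items with `text` and `relevance` fields, and the names `run_pipeline` and `to_bubble_nodes` are hypothetical.

```python
def to_bubble_nodes(nlu_response):
    """Flatten an enrichment response into a list of nodes for a bubble chart.

    Assumption: each of "entities", "keywords", and "concepts" is a list of
    dicts carrying "text" and "relevance". Missing groups are skipped.
    """
    nodes = []
    for group in ("entities", "keywords", "concepts"):
        for item in nlu_response.get(group, []):
            nodes.append({
                "group": group,
                "label": item.get("text", ""),
                "relevance": float(item.get("relevance", 0.0)),
            })
    # Most significant items first, so the UI can draw them most prominently.
    nodes.sort(key=lambda node: node["relevance"], reverse=True)
    return nodes


def run_pipeline(file_bytes, extract_text, analyze):
    """Extract text, enrich it, and shape the result for visualization.

    extract_text and analyze are injected callables standing in for the
    text-extraction and enrichment services.
    """
    text = extract_text(file_bytes)
    if not text.strip():
        return []  # nothing to enrich
    return to_bubble_nodes(analyze(text))


if __name__ == "__main__":
    # Stub services, for illustration only.
    def fake_extract(data):
        return data.decode("utf-8")

    def fake_analyze(text):
        return {
            "entities": [{"text": "IBM", "relevance": 0.9}],
            "concepts": [{"text": "Artificial intelligence", "relevance": 0.95}],
        }

    for node in run_pipeline(b"IBM builds AI systems.", fake_extract, fake_analyze):
        print(node)
```

Injecting the two services as callables keeps the data shaping testable without credentials or a running extraction server.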

Instructions

Ready to get started? Please see the README for detailed instructions.

Conclusion

This pattern showed how to create a web app for visualizing unstructured data using the Watson Natural Language Understanding service, Apache Tika, and D3.js. The pattern is part of the Get started with natural language processing learning path. To continue with the learning path, take a look at the next step, Discover hidden Facebook usage insights.