Valuable insights are missed when unstructured data goes unanalyzed. The volume of unstructured data generated in the digital world has grown exponentially over the past decade. To gain a competitive advantage in decision-making, it is essential to extract the important insights from that data.
Here is a sample of the problems that IT specialists and data scientists encounter in their day-to-day work:
- A long-running software project generates many document artifacts: requirements, defects, test cases, and tasks. Can the unstructured text in these artifacts be analyzed to generate a mapping between requirements, defects, and test cases? The resulting insights can be used to optimize test case execution or to generate new test cases in areas where there are more defects.
- A car has multiple systems, and every system has a text manual. Is it possible to extract important information from the manuals to answer common queries? The answers have to be augmented with appropriate responses from online portals, which requires correlating the query text with the text in the manuals and the online portals.
- A prototype needs to be built to show a prospective customer a complete end-to-end analytics solution that includes text analytics and interactive visualization.
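As a rough illustration of the first two problems above, correlating one piece of text with a set of others can be sketched as a bag-of-words cosine similarity. This is a minimal sketch in plain Python, not the pattern's actual implementation. The sample artifacts, their identifiers, and the function names `tokenize`, `cosine_similarity`, and `best_match` are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, candidates):
    """Return the (id, score) of the candidate most similar to the query."""
    query_tokens = tokenize(query)
    scored = [
        (cid, cosine_similarity(query_tokens, tokenize(text)))
        for cid, text in candidates.items()
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical sample artifacts, invented for illustration only.
requirements = {
    "REQ-1": "The user shall be able to reset the account password by email",
    "REQ-2": "The system shall export monthly sales reports as PDF files",
}
defect = "Password reset email is never sent to the user"

match_id, score = best_match(defect, requirements)
print(match_id, round(score, 2))
```

Here the defect text is matched to the requirement it shares the most vocabulary with. A real solution would likely add steps such as stop-word removal and stemming, which a library like Python NLTK can provide, before comparing texts.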
The composite pattern titled “Mine insights from software development” demonstrates a methodology for building a complete end-to-end text analytics solution to such problems. The methodology can be applied to problems in any domain, and its reusable patterns can shorten the time needed to build a solution.
At the end of this pattern, you will have learned how to build an interactive text analytics solution with customization using IBM Data Science Experience, Python NLTK, IBM Cloud services, Watson services, D3.js, and OrientDB.
View the “Mine insights from software development” pattern for demos, code, and more.