Get recommendations by linking structured and unstructured data

Summary

Extracting data from unstructured sources in different formats is challenging, and so is using the results to make informed decisions. Checking each data source manually to draw inferences is time-consuming, and that’s where this pattern can help. We will highlight how to relate data from two different sources to support decisions about process optimization and risk mitigation. Our example is an HR recruitment process: we compare the candidate’s resume with the job description and the candidate database to identify the best-suited candidate for a given job profile, which can help HR develop an efficient recruitment plan. In this example, we’ll use Watson Studio and Watson Natural Language Understanding (NLU).

Description

In this code pattern, we’ll demonstrate a methodology for integrating structured and unstructured data to generate insights. Unstructured data arrives in many formats, which makes it hard to extract and hard to use as a basis for informed decisions, and checking different data sources manually for inference is time-consuming. We will showcase a configurable and scalable process that merges different data sources and speeds up decision making. Our example is an HR recruitment process in which we compare the candidate’s resume with the job description and the candidate database to identify the best-suited candidate for a given job profile. This can help HR develop an efficient recruitment plan by mitigating risk, improving ROI, and increasing credibility in the recruitment process. We’ll use Watson Studio and Watson NLU to solve this use case.

After you complete this code pattern, you’ll understand how to:

  • Establish a relation between unstructured data and structured data for generating insights and recommendations.
  • Extract and format unstructured data and structured data using configurable Python functions.
  • Use a configuration file to specify the job requirements to initiate data processing for a job profile.
  • Use the IBM Watson Natural Language Understanding API to extract metadata from documents in Jupyter notebooks.
  • Create and run a Jupyter notebook in the Watson Studio platform.
  • Use Object Storage to access data and configuration files.
  • Display the processed output along with recommendations in Watson Studio.
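
To make the configuration-file idea concrete, here is a minimal sketch of how job requirements might be declared and loaded in Python. The JSON layout and every key name (job_title, required_skills, min_years_experience) are assumptions made for illustration, as is the load_job_config helper; the pattern's actual configuration file may be organized differently.

```python
import json

# Hypothetical configuration content; all key names are illustrative
# assumptions, not the format used by the pattern's own config file.
CONFIG_TEXT = """
{
  "job_title": "Data Scientist",
  "required_skills": ["Python", "Machine Learning", "SQL"],
  "min_years_experience": 3
}
"""


def load_job_config(text):
    """Parse the job configuration and normalize skill names to lowercase."""
    config = json.loads(text)
    config["required_skills"] = [
        skill.strip().lower() for skill in config["required_skills"]
    ]
    return config


job_config = load_job_config(CONFIG_TEXT)
print(job_config["job_title"], job_config["required_skills"])
```

Keeping the requirements in a file like this means a new job profile can be processed by editing data rather than notebook code.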

Flow

  1. Post the job description as a query to identify suitable candidates.
  2. Upload the candidate database, configuration file and candidate resumes into object storage.
  3. Watson Studio processes the query, using Watson NLU to analyze the structured and unstructured data and generate recommendations.
  4. The recommended candidates for the query are displayed as output, which the recruiter uses to make an informed decision.
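
The flow above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the notebook's implementation: the candidate records stand in for the structured candidate database, the resume_results dictionaries mimic the general shape of a keyword result (a "keywords" list of items with a "text" field) without calling any service, and the recommend function, its scoring rule, and all sample values are hypothetical.

```python
def extract_keywords(nlu_result):
    """Collect lowercase keyword texts from an NLU-style result dictionary."""
    return {kw["text"].lower() for kw in nlu_result.get("keywords", [])}


def recommend(required_skills, min_years, candidates, resume_results):
    """Rank candidates by the fraction of required skills found in their resume.

    Candidates below the minimum experience (structured data) are filtered out
    before the resume keywords (unstructured data) are compared.
    """
    required = {skill.lower() for skill in required_skills}
    ranked = []
    for cand in candidates:
        if cand["years_experience"] < min_years:
            continue
        keywords = extract_keywords(resume_results.get(cand["id"], {}))
        matched = sorted(required & keywords)
        score = len(matched) / len(required) if required else 0.0
        ranked.append({"id": cand["id"], "name": cand["name"],
                       "score": score, "matched": matched})
    # Highest score first; ties keep their original order (stable sort).
    ranked.sort(key=lambda row: row["score"], reverse=True)
    return ranked


# Made-up sample data for illustration only.
candidates = [
    {"id": "c1", "name": "Candidate A", "years_experience": 5},
    {"id": "c2", "name": "Candidate B", "years_experience": 2},
    {"id": "c3", "name": "Candidate C", "years_experience": 4},
]
resume_results = {
    "c1": {"keywords": [{"text": "Python"}, {"text": "SQL"}]},
    "c2": {"keywords": [{"text": "Python"}, {"text": "Machine Learning"},
                        {"text": "SQL"}]},
    "c3": {"keywords": [{"text": "Machine Learning"}, {"text": "Python"},
                        {"text": "SQL"}]},
}

recommendations = recommend(
    ["python", "machine learning", "sql"], 3, candidates, resume_results
)
for row in recommendations:
    print(row["name"], round(row["score"], 2), row["matched"])
```

Exact keyword matching is a deliberate simplification here; a real pipeline would likely need to handle synonyms and partial matches.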

Instructions

Get detailed instructions in the README. These steps show you how to:

  1. Create an account with IBM Cloud.
  2. Create a new Watson Studio project.
  3. Create IBM Cloud services.
  4. Create the notebook.
  5. Add the data and configuration file.
  6. Update the notebook with service credentials.
  7. Run the notebook.
  8. Analyze the results.