Detect email phishing with Watson Natural Language Classifier

Summary

Email is the primary way that people and businesses communicate, which creates a huge opportunity for scammers to “phish” others. Because detecting fraudulent email can be treated as a classification problem, this code pattern explains how to build an app that classifies each email as “Phishing,” “Spam,” or, if it does not appear suspicious, “Ham.”

Description

With email as a main form of communication comes the problem of phishing. Phishing scams prey on individuals and businesses and contribute to the loss of billions of dollars worldwide, so effective detection techniques are needed to help stop them.

In this code pattern, we build an app that labels each email as “Phishing,” “Spam,” or “Ham” (not suspicious). The code pattern uses IBM Watson Natural Language Classifier (NLC) to train a model on email examples from the EDRM Enron email data set. The custom NLC model can be built quickly in the web UI and deployed into a Node.js app, written with the Watson Developer Cloud Node.js SDK, that you use from a browser.
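As a rough illustration of what the training examples might look like, the sketch below turns a few labeled emails into CSV rows with the email text in the first column and the class label in the second. It assumes the service accepts training data in that two-column CSV form and that each text value is limited to 1024 characters; the sample emails and the helper names (`csvField`, `normalizeText`, `toTrainingCsv`) are hypothetical and are not taken from the code pattern's repository.

```javascript
// Hypothetical labeled examples; real training rows would come from the email data set.
const examples = [
  { text: 'Your account is suspended, verify your password here', label: 'Phishing' },
  { text: 'Cheap watches, "best" prices, buy now', label: 'Spam' },
  { text: 'Meeting moved to 3pm,\nsee you there', label: 'Ham' },
];

// Assumed per-example limit on the text column.
const MAX_TEXT_LENGTH = 1024;

// Quote a CSV field: wrap it in double quotes and double any embedded quotes.
function csvField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Collapse runs of whitespace (including newlines) into single spaces and
// truncate, so that each email fits on one CSV row.
function normalizeText(text) {
  return text.replace(/\s+/g, ' ').trim().slice(0, MAX_TEXT_LENGTH);
}

// Build one "text,label" row per example.
function toTrainingCsv(rows) {
  return rows
    .map((row) => csvField(normalizeText(row.text)) + ',' + csvField(row.label))
    .join('\n');
}

console.log(toTrainingCsv(examples));
```

Quoting every field keeps commas and quotation marks inside an email body from being read as column separators, which matters because email text is rarely clean.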

When you have completed this code pattern, you should understand how to:

  • Build a Watson Natural Language Classifier model using the web UI
  • Create a Node.js app that uses the NLC model to classify emails as phishing, spam, or ham
  • Use the Watson Developer Cloud SDK for Node.js

Flow

  1. The user interacts with the Natural Language Classifier GUI to train the model.
  2. EDRM data is loaded into the Natural Language Classifier service to provide sample emails for training.
  3. The user sends the email text to the application for classification.
  4. The app uses Watson Natural Language Classifier to determine if the text is phishing, spam, or ham.
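The last step of this flow can be sketched as a small function that turns a classification result into a label for the user. This is only a sketch under stated assumptions: it assumes the service returns JSON with a `classes` array of `class_name` and `confidence` pairs, and the sample response, the `MIN_CONFIDENCE` cutoff, and the `summarizeClassification` helper are hypothetical. No service call is made here, so the example runs without credentials.

```javascript
// Hypothetical response, shaped like the JSON the classify call is assumed to return.
const sampleResponse = {
  text: 'Please confirm your bank password at the link below',
  top_class: 'Phishing',
  classes: [
    { class_name: 'Phishing', confidence: 0.91 },
    { class_name: 'Spam', confidence: 0.07 },
    { class_name: 'Ham', confidence: 0.02 },
  ],
};

// Hypothetical cutoff below which the app would flag the result as uncertain.
const MIN_CONFIDENCE = 0.6;

// Pick the highest-confidence class and report whether it clears the cutoff.
function summarizeClassification(response, minConfidence = MIN_CONFIDENCE) {
  const classes = Array.isArray(response.classes) ? response.classes : [];
  if (classes.length === 0) {
    return { label: 'Unknown', confidence: 0, certain: false };
  }
  const best = classes.reduce((a, b) => (b.confidence > a.confidence ? b : a));
  return {
    label: best.class_name,
    confidence: best.confidence,
    certain: best.confidence >= minConfidence,
  };
}

const summary = summarizeClassification(sampleResponse);
console.log(summary.label + ' (' + (summary.confidence * 100).toFixed(0) + '% confidence)');
```

Reporting the confidence alongside the label lets the app treat a borderline result differently from a clear one, for example by asking the user to review the email rather than marking it as ham.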

Instructions

Find the detailed instructions in the README. These steps explain how to:

  1. Clone the GitHub repo.
  2. Create the Watson Natural Language Classifier service with IBM Cloud.
  3. Train the Natural Language Classifier model.
  4. Configure the credentials.
  5. Run the application.