Analyze Tweets with Jupyter Notebooks  

Analyze and create data visualizations with Jupyter Notebooks


Built for the application developer who may not have data science experience or a fully dedicated data science team, this journey is the fast track to leveraging pre-enriched Twitter Insights data from Bluemix® within Jupyter Notebooks.

By Mark Sturdevant, Rich Hagarty, David Taieb

Overview

As part of our ongoing effort to democratize data science, this journey is aimed at application developers who are interested in data science but do not necessarily specialize in it. We show you how to quickly build powerful data visualizations by using IBM and open source technologies, without first staffing a data science team or spending time in data science classes. The goal is to shorten your time to value by giving you data insight skills that usually take much longer to build.

In this scenario, you’ll learn how to create a dashDB warehouse that contains Twitter data, including advanced enrichments such as sentiment, gender, and location. After you create an Insights for Twitter service through Bluemix, you’ll load tweets into dashDB and analyze them in a Jupyter Notebook by using SparkContext and pandas (the Python data analysis library). With Jupyter, you can easily share results with others. We’ll also demonstrate how to create visualizations with Matplotlib and Google GeoChart.
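
To give a feel for the pandas analysis described above, here is a minimal sketch, not the journey's actual notebook code. The sample rows, the column names (sentiment, gender, country), and the helper functions count_by, geochart_rows, and plot_sentiment are all hypothetical. In the real scenario the rows would be read from dashDB rather than typed in.

```python
import pandas as pd

# Hypothetical sample rows standing in for enriched tweets loaded from dashDB.
# The column names and values are assumptions made for this illustration.
tweets = pd.DataFrame(
    {
        "sentiment": ["POSITIVE", "NEGATIVE", "POSITIVE", "NEUTRAL", "POSITIVE"],
        "gender": ["male", "female", "female", "unknown", "male"],
        "country": ["US", "US", "GB", "DE", "GB"],
    }
)


def count_by(frame, column):
    """Return the number of tweets per distinct value of `column`, largest first."""
    return frame.groupby(column).size().sort_values(ascending=False)


def geochart_rows(frame):
    """Build a header row plus [country, count] rows.

    This list-of-lists shape is what Google Charts' arrayToDataTable
    expects as input for a GeoChart.
    """
    counts = count_by(frame, "country")
    return [["Country", "Tweets"]] + [[c, int(n)] for c, n in counts.items()]


def plot_sentiment(frame, path="sentiment.png"):
    """Save a bar chart of tweet counts per sentiment label to `path`."""
    # Imported here so the aggregation helpers work without Matplotlib installed.
    import matplotlib

    matplotlib.use("Agg")  # render to a file; no display is needed
    import matplotlib.pyplot as plt

    counts = count_by(frame, "sentiment")
    fig, ax = plt.subplots()
    ax.bar(list(counts.index), list(counts.values))
    ax.set_xlabel("Sentiment")
    ax.set_ylabel("Tweets")
    fig.savefig(path)
    plt.close(fig)
    return path


print(count_by(tweets, "sentiment"))
print(geochart_rows(tweets))
```

In a notebook you would typically call plot_sentiment(tweets) in its own cell, or skip the file and display the figure inline. The list returned by geochart_rows can be serialized to JSON and handed to the GeoChart JavaScript on the page.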

Flow

  1. The developer adds the Bluemix services needed for this application: dashDB for Analytics and Insights for Twitter.
  2. The developer creates a notebook within Bluemix by using the DSX Spark Service.
  3. SparkContext enables the developer to run tasks on the Spark cluster.
  4. dashDB analyzes the specified tweets that were loaded from Twitter.

Components

IBM Data Science Experience

Analyze data in a configured and collaborative environment.

IBM Analytics for Apache Spark

An open source cluster computing framework optimized for extremely fast, large-scale data processing.

IBM Insights for Twitter

Provides sentiment and other enrichments for multiple languages, based on deep natural language processing algorithms from IBM Social Media Analytics.

IBM dashDB for Analytics

A fully managed SQL cloud database service, optimized for data warehouse and analytics workloads.

Jupyter Notebook

An open source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.

Technologies

Analytics

Finding patterns in data to derive information.

Databases

Repository for storing and managing collections of data.

Related Blogs

Analyze traffic data from the city of San Francisco

In this developer journey, we will use PixieDust running on IBM Data Science Experience to analyze traffic data from the city of San Francisco. Data is claimed to be the most valuable commodity in the world. At IBM, we want you to take advantage of your data – manipulate it, visualize it, and understand it...


Leverage the Data Science Experience to analyze StarCraft II replays

It comes as no surprise that studies show more than 70 percent of American households play video games. What might surprise you is that there are 1.8 billion gamers worldwide! While it’s hard to explain the appeal to some, it is speculated that video games fill a human void in a way that our world...


Related Links

Cloudant NoSQL Database

A fully managed data layer designed for modern web and mobile applications that leverages a flexible JSON schema.

Slack Python Client

A basic client for Slack.com, which can optionally connect to the Slack Real Time Messaging (RTM) API.