IBM Developer Advocacy

Winter is Coming: Importing Game of Thrones Data into Cloudant with the Simple Search Service



Glynn Bird
4/19/16

Winter is coming. The sixth season of Game of Thrones is nearly upon us, and because the writers have run out of George R. R. Martin’s books, anything could happen.

To prepare for the new season, I decided to make a searchable database of Game of Thrones characters using the Simple Search Service, which we announced in January. The Simple Search Service is a Node.js app that lets you import structured data into a Cloudant NoSQL database, where it is indexed and presented as a faceted search API.

The first thing we need is information.

Information is the key. You need to learn your enemy’s strength and strategies. You need to learn which of your friends are not your friends.

Lord Varys – Game of Thrones

An API of Ice and Fire

An API of Ice and Fire is a service that lets you query characters, books, and houses through a simple RESTful HTTP interface. By writing a data-processing script, I was able to convert the data into a TSV file that looks like this:

_id name    gender  culture born    died    titles  aliases father  mother  spouse  allegiances books   povBooks    tvSeries    playedBy
characters:583  Jon Snow    Male    Northmen    In 283 AC       Lord Commander of the Night's Watch Lord Snow,Ned Stark's Bastard,The Snow of Winterfell,The Crow-Come-Over,The 998th Lord Commander of the Night's Watch,The Bastard of Winterfell,The Black Bastard of the Wall,Lord Crow         House Stark of Winterfell   A Feast for Crows   A Game of Thrones,A Clash of Kings,A Storm of Swords,A Dance with Dragons   Season 1,Season 2,Season 3,Season 4,Season 5    Kit Harington
.
.
.
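The conversion script itself is not shown in this post. As a rough sketch of the idea, the Python below flattens character records into tab-delimited lines. It assumes each record is already a dictionary keyed by the column names in the header above, with any references already resolved to plain names; the function names `to_tsv_row` and `to_tsv` are hypothetical and are not part of the Simple Search Service.

```python
# Column order taken from the header line of the TSV sample above.
FIELDS = ["name", "gender", "culture", "born", "died", "titles", "aliases",
          "father", "mother", "spouse", "allegiances", "books", "povBooks",
          "tvSeries", "playedBy"]


def to_tsv_row(doc_id, record):
    """Flatten one character record (a dict) into a tab-delimited line."""
    cells = [doc_id]
    for field in FIELDS:
        value = record.get(field, "")
        if isinstance(value, list):
            # Multi-valued fields become a single comma-separated cell.
            value = ",".join(value)
        # A tab or newline inside a value would break the row, so flatten it.
        cells.append(str(value).replace("\t", " ").replace("\n", " "))
    return "\t".join(cells)


def to_tsv(records):
    """Build the whole file: a header line, then one line per (id, record)."""
    header = "\t".join(["_id"] + FIELDS)
    rows = [to_tsv_row(doc_id, record) for doc_id, record in records]
    return "\n".join([header] + rows) + "\n"
```

Calling `to_tsv([("characters:583", record)])` would return the header line followed by one row for that record.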

The file uses tab characters to delimit the columns. Its first line contains the field names, and each subsequent line is a row of data. Notice that the data isn’t completely flat: some of the fields are themselves comma-separated, indicating that they hold an array of values. When imported, we want the data to look like this:

{
	"_id": "characters:583",
	"_rev": "1-668a8b166b6826ca0576e9b21924f814",
	"name": "Jon Snow",
	"gender": "Male",
	"culture": "Northmen",
	"born": "In 283 AC",
	"died": "",
	"titles": ["Lord Commander of the Night's Watch"],
	"aliases": ["Lord Snow", "Ned Stark's Bastard", "The Snow of Winterfell", "The Crow-Come-Over", "The 998th Lord Commander of the Night's Watch", "The Bastard of Winterfell", "The Black Bastard of the Wall", "Lord Crow"],
	"father": "",
	"mother": "",
	"spouse": "",
	"allegiances": ["House Stark of Winterfell"],
	"books": ["A Feast for Crows"],
	"povBooks": ["A Game of Thrones", "A Clash of Kings", "A Storm of Swords", "A Dance with Dragons"],
	"tvSeries": ["Season 1", "Season 2", "Season 3", "Season 4", "Season 5"],
	"playedBy": ["Kit Harington"]
}
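To make the mapping from TSV row to document concrete, here is a minimal Python sketch of the transformation. It is only an illustration, not the service's own import code: `parse_tsv` and `ARRAY_FIELDS` are hypothetical names, and the set of array fields is simply read off the document above. The `_rev` value is not produced here because the database assigns it when the document is written.

```python
import csv
import io

# Columns to split on commas, matching the arrays in the document above.
ARRAY_FIELDS = {"titles", "aliases", "allegiances", "books", "povBooks",
                "tvSeries", "playedBy"}


def parse_tsv(text):
    """Turn tab-delimited text (header line first) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    docs = []
    for row in reader:
        doc = {}
        for field, value in row.items():
            if field is None:
                continue  # ignore cells beyond the last header column
            value = value or ""  # short rows yield None for missing cells
            if field in ARRAY_FIELDS:
                # Split on commas and drop empty entries.
                doc[field] = [v.strip() for v in value.split(",") if v.strip()]
            else:
                doc[field] = value
        docs.append(doc)
    return docs
```

With this approach an empty cell in an array column becomes an empty list, while an empty cell in a plain column stays an empty string.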

The data file we are going to use to build our database is here.

My mind is my weapon.

George R. R. Martin – Game of Thrones

Importing the data

First, download the data file.

Next, deploy the Simple Search Service on Bluemix. Then launch the app and choose Upload to select the data file:

[Screenshot: file upload]

Then you can choose:

  • the data type of each field
  • which fields should have facet counts calculated for them, typically fields whose values repeat throughout the data set, such as culture or gender.

[Screenshot: choose types]

You need to mark the comma-separated fields as “arrays of strings” to ensure that they are imported correctly.
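If facet counts are unfamiliar, the following Python sketch shows what they amount to: for each faceted field, a tally of how many documents carry each value. It only illustrates the concept and says nothing about how the service or Cloudant computes the counts; `facet_counts` is a hypothetical helper. It also shows why the array type matters: each element of an array field is counted on its own.

```python
from collections import Counter


def facet_counts(docs, facet_fields):
    """Count how often each value occurs in each of the given fields."""
    counts = {field: Counter() for field in facet_fields}
    for doc in docs:
        for field in facet_fields:
            value = doc.get(field)
            # Treat a plain string as a one-element list.
            values = value if isinstance(value, list) else [value]
            for v in values:
                if v:  # skip empty strings and missing values
                    counts[field][v] += 1
    return counts
```

For example, `facet_counts(docs, ["gender", "culture"])["culture"].most_common(5)` would list the five most frequent cultures in `docs`.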

Click the Import button to write the data to the database. Within a few seconds, your data is uploaded and searchable.

[Screenshot: search]

You can perform searches in the search box and see the results rendered as a table alongside a list of facet counts. Some sample queries:

In addition to the web front-end, you can also use the Simple Search Service as an API to power your own front-end. Just visit /search?q=, where the value of q is the query you wish to perform.
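As a sketch of calling the endpoint from a script, the Python below builds the /search?q= URL and fetches it. The base URL is a placeholder for your own deployment, the helper names are hypothetical, and the code assumes the response body is JSON, which this post does not spell out.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url, query):
    """Return base_url + /search?q=<URL-encoded query>."""
    params = urllib.parse.urlencode({"q": query})
    return base_url.rstrip("/") + "/search?" + params


def search(base_url, query):
    """Call the endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_search_url(base_url, query)) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, `search("https://my-search-app.example.com", "Jon Snow")` would request `/search?q=Jon+Snow` on that (made-up) host.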

Searching is not finding.

George R. R. Martin – Game of Thrones

Locking down the Simple Search Service

Once your data is uploaded, you can prevent further uploads by setting a Bluemix environment variable called LOCKDOWN to the value true. This instructs the app to act only as a read-only API.

[Screenshot: lockdown]

Once the app restarts, it no longer displays its friendly user interface and only lets you access its /search endpoint. Because CORS is enabled, you can call your Simple Search Service URL from any client-side script without triggering a security warning from your browser.

Winter is coming.

George R. R. Martin – Game of Thrones

Building a frontend

The next post in this series takes a Simple Search Service instance containing Game of Thrones data and shows how easy it is to build a web frontend.

Dark wings, dark words.

Ned Stark – Game of Thrones
