You can access IBM Cloudant’s NoSQL database from any programming language because of its RESTful HTTP API and JSON data format. There are some circumstances, however, where you need to present your data in a format other than JSON. Perhaps you need to crunch some data in a spreadsheet, import data into your calendar, or aggregate data in an RSS reader.

You could write your own converter code on the the client side, but why not configure Cloudant to output data directly to other formats using a List Function. A list function works alongside Cloudant MapReduce to let you output a Cloudant database in the format of your choice.

These are the basic steps, which I’ll expand upon in this article:

  1. Devise a Javascript map function that defines an index into your Cloudant database. MapReduce is an efficient and highly scalable method of working with large datasets in a distributed database. This step is equivalent to indexing a field in an relational database.
  2. Devise a list function that describes how the data is to be transformed. The list function runs through each document in the result set and calls functions to generate HTTP headers and to output text.
  3. Upload the map and list functions as a Cloudant Design Document. Design documents are special records in a Cloudant database that contain code that defines MapReduce indexes, Search indexes, and list functions.
  4. Query the newly-created view with a list function. Also, use this function to transform your data format. (Once an index has built, querying the view is very efficient and has excellent performance on large data sets.)

The data set

Before we can index our data, we need to take a look at the data itself. Let’s say we have a database of appointments for a hair styling business. A typical document looks like this:

{
  "_id": "89a66bae620c33d653a0c02f2add5b4a",
  "_rev": "1-bba9e83dbb192ea0c05e13f0d9296322",
  "product": "Cut",
  "customer": "Lady Catherine de Bourgh",
  "date": "2015-09-29 16:00:00",
  "duration": 1,
  "stylist": "Mandy",
  "branch": "Stockport",
  "cost": 25,
  "paid": false
}

A document is a JSON object with simple key/value pairs indicating the attributes of the appointment. Cloudant allows large complex JSON documents, but in this example we have a simple shallow object.

You can examine the full sample data set here:

https://examples.cloudant.com/appointments/_all_docs?include_docs=true

Ordering the data by date

The appointment data is ordered by the _id field, but we would like to order our data by date, so we can extract a single day’s appointments from a larger data set. To do so, we’ll create a bydate view of this data, using a JavaScript map function. When a new view is created (by adding a design document to the database), the view’s map function is called with each document in the database in turn. The map function chooses to “emit” a key and a value which forms an index into the data set. In this case, we need to index the date field, so doc.date is emitted as the key. We don’t need to map the value of the date field so our function will emit null in the value slot:

function(doc) {
  emit(doc.date, null);
}

You can paste this map function in one of two places:

  • into the Cloudant Dashboard’s New View page (Open your database, click the + button beside All Design Docs and choose New View)

    new view menu

  • or into to a Design Document using Cloudant’s API

When you query this view by the following URL, you get all the appointments in chronological order:

https://examples.cloudant.com/appointments/_design/find/_view/bydate

If you want to see the full document bodies, simply add ?include_docs=true to the URL:

https://examples.cloudant.com/appointments/_design/find/_view/bydate?include_docs=true

This database covers only one day, but in real life, there could be millions of records in your database. To include only specific dates, we can add a startkey and endkey to the URL to define the date range.

For example, this query shows the afternoon appointments on September 29th:

https://examples.cloudant.com/appointments/_design/find/_view/bydate?include_docs=true&startkey=”2015-09-29+12:00:00″&endkey=”2015-09-29+23:59:00″

List functions

Now that we have a date-ordered view into our data set and can filter data between two dates, we can start adding list functions to transform the output data into something other than JSON. A list function contains a row iterator to run through the data set and you can optionally call a start function once to output HTTP headers. You can call a send function multiple times to output a block of data:

function(head, req) {
  var row;
  start({
    headers: {  'Content-Type': 'text/plain'  },
  });
  while(row = getRow()) {
    send('some data');
    send('some more data');
  }
};

Define your list function in the same Design Document as your map function. Many map functions and list functions can co-exist in the same design document. The structure of our design document is:

{
  "_id": "_design/find",
  "_rev": "1-91f1c6803a631e375f3e8b35a63414c5",
  "views": {
    "firstview": {
      "map": "function (doc) {n  emit(doc.date, null);n}"
    },
    "secondview": {
      "map": "function (doc) {n  ... n}"
    }
  },
  "lists": {
    "firstformat": "function (head, req) {n .. n}",
    "anotherformat": "function (head, req) {n ... n}",
    "yetanotherformat": "function (head, req) {n ... n}"
  }
}

Tip: At query time, your design document can pair views and list functions in any combination.

CSV

First let’s look at the format of the CSV file we want to create. We want:

  • each Cloudant document to be a row (apart from the first row which will show the column headings)
  • commas to separate columns
  • column values that contain commas to be encased in double quotes

So the raw .csv data will look like this:

_id,_rev,product,customer,date,duration,stylist,branch,cost,paid
1b8c3636057fabe11a020a0fe6e5f290,2-ad3273ff2b7e1def81589046314dba0a,Cut,Elinor Dashwood,2015-09-29 09:00:00,2,Justin,Stockport,60,false
89a66bae620c33d653a0c02f2a653b4d,1-d8682b8e1b7acc8d1b981a0ec362558b,Shampoo and Blow Dry,Charlotte Lucas,2015-09-29 10:00:00,2,Mandy,Stockport,50,false
c84242804e809c9ab0b8d330dab53753,1-a7dd475f1a5ea97bb1fe0ca82ef73595,Cut and colour,Lucy Steele,2015-09-29 11:00:00,1.5,Justin,Stockport,55,false
c84242804e809c9ab0b8d330da1398f2,1-5c66b1f0348c0ca14a87040288bfa320,Blow Dry,Mary Bennet,2015-09-29 12:15:00,1,Mandy,Stockport,30,false
c84242804e809c9ab0b8d330dab6f3ae,1-3cf6f0992211634d37a23e59916c4532,"Cut, colour and blow dry",Lady Middleton,2015-09-29 13:00:00,2.5,Justin,Stockport,80,false
1b8c3636057fabe11a020a0fe66808e9,1-762efa0a088b49c961b7e5c5d2b08ba8,Perm,Charlotte Lucas,2015-09-29 14:00:00,2.5,Mandy,Stockport,50,false
89a66bae620c33d653a0c02f2ae29a50,1-762efa0a088b49c961b7e5c5d2b08ba8,Perm,Charlotte Lucas,2015-09-29 14:00:00,2.5,Mandy,Stockport,50,false
c84242804e809c9ab0b8d330dabd5716,1-27a5590e6ea32c0890919eb7650aa425,Styling,Mrs Ferrars,2015-09-29 15:30:00,1,Justin,Stockport,25,false
89a66bae620c33d653a0c02f2add5b4a,1-bba9e83dbb192ea0c05e13f0d9296322,Cut,Lady Catherine de Bourgh,2015-09-29 16:00:00,1,Mandy,Stockport,25,false

The list function we need to perform this transformation is:

function(head, req) {
  var row,
    first = true;

  // output HTTP headers
  start({
    headers: {  'Content-Type': 'text/csv'  },
  });


  // iterate through the result set
  while(row = getRow()) {

    // get the doc (include_docs=true)
    var doc = row.doc;

    // if this is the first row
    if (first) {

      // output column headers
      send(Object.keys(doc).join(',') + 'n');
      first = false;
    }

    // build up a line of output
    var line = '';

    // iterate through each row
    for(var i in doc) {

      // comma separator
      if (line.length > 0) {
        line += ',';
      }

      // output the value, ensuring values that themselves
      // contain commas are enclosed in double quotes
      var val = doc[i];
      if (typeof val == 'string' && val.indexOf(',') >  -1) {
        line += '"' + val.replace(/"/g,'""') + '"';
      } else {
        line += val;
      }
    }
    line += 'n';

    // send  the line
    send(line);
  }
};

This list function outputs the “text/csv” mime type and then iterates through the result set, producing a row of column headings then creates one row per document.

We can access the same views we did before, but now transformed into CSV:

https://examples.cloudant.com/appointments/_design/find/_list/csv/bydate?include_docs=true

https://examples.cloudant.com/appointments/_design/find/_list/csv/bydate?include_docs=true&startkey=%222015-09-29+12:00:00%22&endkey=%222015-09-29+23:59:00%22

You can import this output into Excel, Google Docs, or any other spreadsheet application.
Google Sheets screenshot

To open a file in Excel, download it, and when you save, add the suffix .csv

RSS

RSS is the XML-based news aggregation format that lists time-ordered events, such as a blog, in newest-first order.

We can query our index in newest-first order by simply supplying the descending=true parameter and return a fixed number of results with limit=10 e.g.

https://examples.cloudant.com/appointments/_all_docs?include_docs=true&descending=true&limit=10

To transform this feed into RSS, we need an additional list function to produce output that looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>Appointments</title>
<description>A list of appointments</description>
<item>
<title>Lady Catherine de Bourgh</title>
<description>Cut</description>
<link>http://mysite.com/89a66bae620c33d653a0c02f2add5b4a</link>
<guid>89a66bae620c33d653a0c02f2add5b4a</guid>
<pubDate>2015-09-29 16:00:00</pubDate>
</item>
<item>
<title>Mrs Ferrars</title>
<description>Styling</description>
<link>http://mysite.com/c84242804e809c9ab0b8d330dabd5716</link>
<guid>c84242804e809c9ab0b8d330dabd5716</guid>
<pubDate>2015-09-29 15:30:00</pubDate>
</item>
</channel>
</rss>

Our list function again iterates through the documents, this time outputing XML tags, ensuring that characters are suitably “xml encoded” where necessary. This is the list function required:

function(head, req) {

  // from https://github.com/miketheprogrammer/xml-escape/blob/master/index.js
  var xmlescapemap = {
      '>': '&gt;'
    , '<': '&lt;'
    , "'": '&apos;'
    , '"': '&quot;'
    , '&': '&amp;'
  };

  function xmlescape(string) {
    return string.replace(/([&"<>'])/g, function(str, item) {
              return xmlescapemap[item];
            })
  };

  var row;

  // output HTTP headers
  start({
    headers: {  'Content-Type': 'application/rss+xml'  },
  });

  // send the XML header
  send('<?xml version="1.0" encoding="UTF-8" ?>n');
  send('<rss version="2.0">n');
  send('<channel>n');
  send('<title>Appointments</title>n');
  send('<description>A list of appointments</description>n');

  // iterate through the result set
  while(row = getRow()) {

    // get the doc (include_docs=true)
    var doc = row.doc;
    send('<item>n');
    send('<title>' + xmlescape(doc.customer) + '</title>n');
    send('<description>' + xmlescape(doc.product) + '</description>n');
    send('<link>http://mysite.com/' + doc._id + '</link>n');
    send('<guid>http://mysite.com/' + doc._id + '</guid>n');
    send('<pubDate>' + doc.date + '</pubDate>n');
    send('</item>n');
  }

  // send the XML footer
  send('</channel>n');
  send('</rss>n');
}

An RSS reader app app, such as Feedly, can now consume the data by by accessing this URL:

https://examples.cloudant.com/appointments/_design/find/_list/rss/bydate?include_docs=true&descending=true&limit=10

iCalendar

In order to represent our data on a calendar app such as iCal, Outlook, or Google Calendar, we first create a second view which will sort the data by stylist and date. This lets us extract a single stylist’s agenda for the day.

Our new map function looks like this:

function (doc) {
  emit([doc.stylist, doc.date], null);
}

We combine the stylist and the date into an array which forms the key of the index. View the index itself here.

We can parameterise this query to extract each stylist’s data separately:

Mandy: https://examples.cloudant.com/appointments/_design/find/_view/bystylistanddate?startkey=[%22Mandy%22,%222015-09-29+00:00:00%22]&endkey=[%22Mandy%22,%222015-09-29+23:59:00%22]&include_docs=true

Justin: https://examples.cloudant.com/appointments/_design/find/_view/bystylistanddate?startkey=[%22Justin%22,%222015-09-29+00:00:00%22]&endkey=[%22Justin%22,%222015-09-29+23:59:00%22]&include_docs=true

The iCalendar format is a text-based format with VEVENT blocks that describe calendar events inside a VCALENDAR block:

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Cloudant/JSON to iCal//EN
BEGIN:VEVENT
UID:1b8c3636057fabe11a020a0fe6e5f290
DTSTART:20150929T090000Z
DTEND:20150929T110000Z
SUMMARY:Elinor Dashwood - Cut
SEQUENCE:1
END:VEVENT
BEGIN:VEVENT
UID:c84242804e809c9ab0b8d330dab53753
DTSTART:20150929T110000Z
DTEND:20150929T123000Z
SUMMARY:Lucy Steele - Cut and colour
SEQUENCE:0
END:VEVENT
END:VCALENDAR

Outputing a line of text from a list function is simple; the biggest challenge we have here is dates. Our dates are stored in this format: 2015-09-29 10:00:00 and the iCal format expects 20150929T100000Z, where T delimits the date from the time and Z indicates the GMT timezone. Converting this string is easy, but we also have to do some date artithmetic to add the appointment’s duration to the start time to get an end time stamp in the correct format. This is the code to do it:

function(head, req) {

  var row;

  // output HTTP headers
  start({
    headers: {  'Content-Type': 'text/calendar'  },
  });

  // send the iCal header
  send('BEGIN:VCALENDARn');
  send('VERSION:2.0n');
  send('PRODID:-//Cloudant/JSON to iCal//ENn')

  // iterate through the result set
  while(row = getRow()) {

    // get the doc (include_docs=true)
    var doc = row.doc;

    // new event
    send('BEGIN:VEVENTn');

    // unique identifier
    send('UID:' + doc._id + 'n')

    // date format conversion and arithmetic to calculate end date
    var s = new Date(doc.date.replace(" ","T"));
    var ts = s.getTime();
    ts += doc.duration * 60 * 60 * 1000;
    var end = new Date();
    end.setTime(ts);
    var endstr = end.toISOString().replace(/-/g,"").replace(/:/g,"");
    endstr = endstr.replace(/.000/,"");
    var startstr = doc.date.replace(" ","T") + "Z";
    startstr = startstr.replace(/-/g,"").replace(/:/g,"");
    send('DTSTART:' + startstr + 'n');
    send('DTEND:' + endstr + 'n');

    // description
    send('SUMMARY:' + doc.customer + ' - ' + doc.product + 'n');

    // sequence number
    var seq = parseInt(doc._rev.split("-")[0])-1;
    send('SEQUENCE:' + seq + 'n');

    // end of event
    send('END:VEVENTn');
  }

  // send the XML footer
  send('END:VCALENDARn');
}

We can then pass the date from our bystylistanddate view through our ical list function with the following URLs:

Mandy: https://examples.cloudant.com/appointments/_design/find/_list/ical/bystylistanddate?startkey=[“Mandy”,”2015-09-29+00:00:00″]&endkey=[“Mandy”,”2015-09-29+23:59:00″]&include_docs=true

Justin: https://examples.cloudant.com/appointments/_design/find/_list/ical/bystylistanddate?startkey=[“Justin”,”2015-09-29+00:00:00″]&endkey=[“Justin”,”2015-09-29+23:59:00″]&include_docs=true

You can easily import the iCal text that Cloudant produces into a Calendar application:

Google Calendar after iCal feed import

Conclusion

Now you know that you can take Cloudant’s NoSQL database JSON documents and use JavaScript map functions to order your data in a materialized view. With that view, you can extract the data in that order, or a subset of it. Then use list functions to transform the output into a range of formats other than the JSON that’s standard to Cloudant. The same design document can support multiple views of the data (bydate, bystylistanddate) and multiple list functions (CSV, RSS, and iCal), which you can combine together using Cloudant’s RESTful API.

Hopefully this has given you a flavour of what you can acheive with Cloudant. Sign up for a Cloudant account today and try it for yourself!

Join The Discussion

Your email address will not be published. Required fields are marked *