This post is part of a series of posts created by the two newest members of our Developer Advocate team here at IBM Cloud Data Services. In honour of the book Seven Databases in Seven Weeks by Eric Redmond and Jim R. Wilson, we challenged Lorna and Matt to take a new database from our portfolio every day, get it set up and working, and write a blog post about their experiences. Each post reflects the story of their day with a new database. We’ll update our seven-days GitHub repo with example code as the series progresses. —The Editors

  • Database type: schemaless JSON-like storage with search and data aggregation
  • Best tool for: creating highly scalable apps that need to query large datasets fast
MongoDB logo
MongoDB. It’s kind of a thing.

Overview

MongoDB is a NoSQL database that allows you to store your data in JSON-like documents rather than in the rows and tables of a traditional RDBMS. With a focus on scalability (sharding and replication are available out of the box) and flexibility (data stores are schemaless and easily searchable via secondary indexes, even geospatial ones), MongoDB intends to provide a database that maps to your application and keeps up through iterations.

There are also a number of other features, such as a powerful data aggregation pipeline and MapReduce; for more in-depth analysis you can connect MongoDB directly to Hadoop or Spark.

MongoDB is open source, so you can get up and running on any platform, although we used MongoDB on Bluemix for our examples. We will cover how to get started with MongoDB and put together a simple example showing how you can utilise this database to store blog posts with threaded comments.

Getting Set Up

Log in to Bluemix (register for a free trial account if you’re not already a Bluemix user) and we’ll add the MongoDB service to our account. You’ll find this by clicking on Catalog and looking in the “Data and Analytics” section for “Compose for MongoDB”. Spin up a new MongoDB and when it starts, ask it for some access credentials by going to the Service Credentials tab and clicking the New Credentials button.

Create new credentials to connect to MongoDB

There are two things you’ll need from the JSON data that describes your credentials:

  • The uri field has the mongodb:// URL we’ll need to connect to the database.
  • The ca_certificate_base64 is a base64-encoded version of the certificate, so we’ll need to copy this, decode it and write it to a file – in the example, the file is simply called cert.
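
The decoding step can also be scripted. Here is a minimal sketch, assuming the credentials JSON has been copied into a string and that uri and ca_certificate_base64 are top-level fields in it; the save_credentials function is a hypothetical helper for illustration, not part of any library:

```php
<?php
// Hypothetical helper: writes the decoded CA certificate to $certPath and
// returns the connection URI. Assumes both fields sit at the top level of
// the credentials JSON.
function save_credentials(string $credentialsJson, string $certPath): string
{
    $credentials = json_decode($credentialsJson, true);
    if (!is_array($credentials)
        || !isset($credentials["uri"], $credentials["ca_certificate_base64"])) {
        throw new InvalidArgumentException("Credentials JSON is missing uri or ca_certificate_base64");
    }

    // Strict mode makes base64_decode() return false on invalid input.
    $certificate = base64_decode($credentials["ca_certificate_base64"], true);
    if ($certificate === false) {
        throw new InvalidArgumentException("ca_certificate_base64 is not valid base64");
    }

    if (file_put_contents($certPath, $certificate) === false) {
        throw new RuntimeException("Could not write certificate to " . $certPath);
    }

    return $credentials["uri"];
}
```

Calling save_credentials($json, "./cert") would leave the certificate in a file called cert and hand back the mongodb:// URL for the connection step.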

We are going to use PHP to create our examples, and you’ll need to install the MongoDB PHP extension. Since I’m on Linux, I’ll use pecl:

pecl install mongodb

According to the PHP documentation, Mac users should use brew:

brew install php55-mongodb

New to PHP on a Mac? Without installing MAMP, here’s how to get PHP running on your local Apache web server. And here’s how to access php.ini if you need to append extension=mongodb.so.

Connecting from PHP

MongoDB is well supported, with libraries available for all of your favourite languages, including PHP. These libraries make it very easy to gain access to all of MongoDB’s features and let you focus on building your app.

In addition to the MongoDB extension, we’ll use the PHP library that MongoDB provide to give a nice, easy wrapper for accessing MongoDB. This can be installed via Composer:

composer require "mongodb/mongodb=^1.0.0"

This adds the requirement to your composer.json file (creating it if it didn’t exist already) and downloads the library into the vendor directory. If you later check the project out somewhere else, run the composer install command to bring the files in again.

Now use the connection string we collected when creating credentials to connect to your MongoDB deployment like so (remember to replace these details with your own connection string):

connect.php

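<?php

// Composer's autoloader, needed so that PHP can find MongoDB\Client
require("vendor/autoload.php");

$client = new MongoDB\Client("mongodb://admin:password@bluemix-sandbox-dal-9-portal.4.dblayer.com:20792/admin?ssl=true", [], ["cafile" => "./cert"]);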

$posts = $client->selectDatabase("posts")->selectCollection("posts");

Notice that as well as passing in the connection string, we are also providing the path to the SSL certificate that we saved earlier. We can then select the posts collection from the posts database; MongoDB creates both the first time we write to them. This file is saved as connect.php and our other scripts also use it to connect.

What is a collection in MongoDB? From the MongoDB reference manual: “A [collection is a] grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection have a similar or related purpose.”

Inserting Data

The PHP library is a fairly lightweight wrapper whose method names closely follow those of the mongo shell, which really helps to make this database feel like a consistent interface across platforms (the other language drivers follow the same pattern). In this case, we’re using the insertOne method to add a new blog post to our posts collection.

This example shows a very basic PHP script which will display an HTML form, allowing the user to enter some data, which then gets saved in the database. Here’s the form itself, followed by the code:

Adding a post

If you’re not seeing “Post saved” upon form submission, remember to enable debugging!

add_post.php

<?php

require("connect.php");

if($_POST) {
    $data = ["title" => filter_input(INPUT_POST, "title", FILTER_SANITIZE_STRING),
        "description" => filter_input(INPUT_POST, "post", FILTER_SANITIZE_STRING),
    ];
    $posts->insertOne($data);

    echo "Post saved";

} else {

    // show the form
?>

<html>
<head>
<title>MongoDB in action</title>
<link rel="stylesheet" href="http://yui.yahooapis.com/pure/0.6.0/pure-min.css">
</head>
<body>
<h1>Add A Post</h1>

<form action="add_post.php" method="post" class="pure-form pure-form-stacked">
<label for="title">Title
<input type="text" id="title" name="title" size="60"/>
</label>

<label for="post">Post
<textarea id="post" name="post" rows="6" cols="80"></textarea>
</label>

<input type="submit" value="Save post" class="pure-button pure-button-primary"/>
</form>

<?php
}
?>

You can see that if there’s no data supplied, a simple form is shown here (with a little http://purecss.io to make it nicer to look at) so we can quickly start adding data. If data does arrive as a POST request, then we build up an array with the data we want, and then save it to MongoDB.

Remember that MongoDB does not have a schema: you can build up whatever data structure you like before inserting, and the shape of the data can be different each time, which makes it ideal for sparse properties, for example.

MongoDB will give our record a unique ID when it saves it. You may also want to supply this yourself, which you can do by including an _id key and the desired value when creating the data to insert. Either way, this is useful when we come to fetch a list of records and want to be able to identify just one of them.
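
To illustrate supplying your own ID, here is a small sketch. The build_post function and the slug-style ID are assumptions made for this example; only the _id key itself comes from MongoDB:

```php
<?php
// Hypothetical helper: builds the array we would pass to insertOne().
function build_post(string $title, string $description, $id = null): array
{
    $post = ["title" => $title, "description" => $description];
    if ($id !== null) {
        // We supply _id ourselves; when it is absent, an ObjectID is
        // generated for the document on insert.
        $post = ["_id" => $id] + $post;
    }
    return $post;
}

// With the $posts collection from connect.php, this could be used as:
// $posts->insertOne(build_post("Databases are excellent", "We could talk about them for hours", "databases-are-excellent"));
```

If you do choose string IDs like this, remember that any lookup has to use the same type: a query that wraps the ID in an ObjectID will not match a document whose _id is a plain string.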

Fetching Data

MongoDB has some great query functionality, and its “aggregation framework” is excellent for gaining insights into potentially large and nested data sets. We just want a list of posts, however, and for that we simply use the find() method, then output each of our posts along with a count of comments (more on comments in the next section):

index.php

<?php

require("connect.php");

$all_posts = $posts->find([]);

?>

<html>
<head>
<title>MongoDB in action</title>
<link rel="stylesheet" href="http://yui.yahooapis.com/pure/0.6.0/pure-min.css">
</head>
<body>
<h1>Blog Posts</h1>

<ul>
<?php
foreach($all_posts as $p):
?>
<li><a href="post.php?id=<?=$p->_id ?>"><?=$p->title ?></a> (<?=isset($p->comments) ? count($p->comments) : 0 ?> comments)</li>

<?php
endforeach;
?>
</ul>

MongoDB returns each document as an object, with properties set for each of the fields that were stored. This makes it very easy to access using object notation, e.g. the $p->title in the example above. In the list, we’re also adding hyperlinks and using the ID so that we can fetch individual records on another page.
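
As an aside, the comment counts could also be computed on the server with the aggregation framework instead of in PHP. The following is only a sketch: comment_count_pipeline and the comment_count field name are invented for illustration, while $project, $size and $ifNull are standard aggregation operators.

```php
<?php
// Hypothetical helper returning an aggregation pipeline as a plain PHP array.
function comment_count_pipeline(): array
{
    return [
        ['$project' => [
            "title" => 1,
            // $ifNull substitutes an empty array for posts with no comments
            // yet, because $size raises an error if its argument is not an array.
            "comment_count" => ['$size' => ['$ifNull' => ['$comments', []]]],
        ]],
    ];
}

// With the $posts collection from connect.php, this could be used as:
// foreach ($posts->aggregate(comment_count_pipeline()) as $row) {
//     echo $row->title, ": ", $row->comment_count, "\n";
// }
```

For a handful of posts, counting in PHP is perfectly fine; a pipeline like this starts to pay off when the comment arrays get large and you would rather not transfer them just to count them.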

Adding Nested Data

MongoDB doesn’t really do joins, so for the most part, database design involves storing data together that will be used together. So if you’re storing content, you’ll probably have a bunch of content elements and anything they rely on, all inside one document. In this example, we’re storing blog posts and we’ll add the comments as part of the post record.

Here’s the individual post page, which displays the post, allows a user to add a comment, and lists the comments that have already been added:

One post, and the comments

post.php

<?php

require("connect.php");

if($_POST) {
    $id = filter_input(INPUT_POST, "post_id", FILTER_SANITIZE_STRING);
    $data = ["username" => filter_input(INPUT_POST, "name", FILTER_SANITIZE_STRING),
        "comment" => filter_input(INPUT_POST, "comment", FILTER_SANITIZE_STRING),
    ];
    $result = $posts->updateOne(["_id" => new MongoDB\BSON\ObjectID($id)], ['$push' => ["comments" => $data]]);

    header("Location: /post.php?id=" . $id);
    exit;

} else {
    $id = filter_input(INPUT_GET, "id", FILTER_SANITIZE_STRING);
}

if($id):
    $post = $posts->findOne(["_id" => new MongoDB\BSON\ObjectID($id)]);

?>

<html>
<head>
<title>MongoDB in action</title>
<link rel="stylesheet" href="http://yui.yahooapis.com/pure/0.6.0/pure-min.css">
</head>
<body>
<h1><?=$post->title ?></h1>

<p><?=$post->description ?></p>

<h2>Add Comments</h2>

<form action="post.php" method="post" class="pure-form pure-form-stacked">
<input type="hidden" name="post_id" value="<?=$id ?>" />

<label for="name">User Name
<input type="text" id="name" name="name" size="20"/>
</label>

<label for="comment">Comment
<textarea id="comment" name="comment" rows="4" cols="50"></textarea>
</label>

<input type="submit" value="Post comment" class="pure-button pure-button-primary"/>
</form>

<?php
foreach((isset($post->comments) ? $post->comments : []) as $comment): // a new post has no comments field yet
?>

<hr />
<?=$comment->comment ?><em> - by <?=$comment->username?></em>

<?php
endforeach; // comments
endif; // if the post actually existed
?>

The interesting bit here is where we save the comments: the call to $posts->updateOne. We use the same filter criteria as we do when we fetch the post, but then we go on to push the $data array onto the end of the comments array. If this array doesn’t exist yet, MongoDB will simply create it.

Look out when using MongoDB operators such as $push: in PHP we need to wrap them in single quotes so that PHP doesn’t try to interpret the $ as the start of a variable name!
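
A quick way to convince yourself of the quoting rule is to compare the two spellings that do work. This sketch uses sample data only and does not touch the database:

```php
<?php
$data = ["username" => "lorna", "comment" => "I think so too"];

// Single quotes: no interpolation, so the key is the literal operator name.
$update = ['$push' => ["comments" => $data]];

// Double quotes also work, but only if the dollar sign is escaped.
// Without the backslash, PHP would look for a variable called $push.
$escaped = ["\$push" => ["comments" => $data]];

var_dump($update === $escaped); // bool(true)
```

Single quotes are the simpler habit, since there is nothing to remember to escape.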

Now our comments are inside our existing MongoDB document:

{
    "_id" : ObjectId("575038cc1661d711090e9911"),
    "title" : "Databases are excellent",
    "description" : "We could talk about them for hours",
    "comments" : [
        {
            "username" : "lorna",
            "comment" : "I think so too"
        },
        {
            "username" : "lorna",
            "comment" : "I think so too"
        },
        {
            "username" : "fred",
            "comment" : "Thanks for this post, it helped me!"
        },
        {
            "username" : "george",
            "comment" : "I totally disagree, they are a hazard"
        }
    ]
}

With this in place, we can add some comments to our database and then revisit the index page to see how things are looking:

List the posts

Conclusion

MongoDB is quite a key player in the NoSQL arena, and this shows through with the amount of developer support that is available on their website in the shape of libraries and docs; however, there were some instances where we were looking for examples that didn’t seem to exist! On the plus side, MongoDB does have a solid user base and there is a rich ecosystem of content from forums and other people’s blog posts that will help you — beware that the PHP libraries changed relatively recently though so you may find some content is outdated.

One feature that can set it apart from some of its rivals is that you don’t need to write the whole document back again when updating: you can simply push updates to the fields that you require. This can help avoid conflicts in a write-heavy application. The big selling point, however, is the schemaless and scalable nature of the database, meaning that you really can build apps with the future in mind without worrying about how your infrastructure will adapt. The inclusion of secondary indexes allows quick searching on huge amounts of data, and that can only be a positive.

With MongoDB being open source, you can get started on any platform and deploy to more or less anywhere; if you want to avoid running it yourself entirely, there are a number of cloud-based providers available.

2 comments on "Seven Databases in Seven Days – Day 2: MongoDB"

  1. Previously in my Mongo db table schema I have used a mobile_no as unique field. Now i had removed it from my schema and uploaded the project via cloud foundry cli.
    But it still giving me error for unique mobile no.
    Please suggest how can I update my Mongo db schema on Bluemix.

  2. Nice post. Is it possible to store C/C++ multi-line code in Json and save/fetech via queries.
