One of the best features of Apache© CouchDB™ is its robust replication protocol. With a single request, users can sync databases that are only periodically available and keep them in sync indefinitely. Replication is a cornerstone feature that lets customers build resilient data stores spanning geographical locations for their applications, and build Progressive Web Applications with Offline Sync capability.

Our users at IBM Cloudant push its CouchDB core to its limits, and we’ve found that in certain circumstances the replicator can struggle with performance, stability, and operability. Last summer, a team of engineers at IBM got together to fix those problems. Today we’re proud to announce the release of our new replicator for IBM Cloudant, which has been open sourced into the master version of Apache CouchDB.

Users of IBM Cloudant replication should read this post in full, both to understand the new capabilities and to take advantage of the new endpoints. In addition, IBM Cloudant will be removing a portion of the replication API in the near future. Customers who use the replication API programmatically to check replication status may need to update their applications to avoid future issues. Please contact Cloudant Support by opening a ticket via the Cloudant Dashboard if you have any questions or concerns after reading this post.

Connection Pooling

An important performance improvement we made is with how the replicator allocates TCP connections. Even for idle replications, the old replicator would simply open four TCP connections for each replication. This behavior wasted cluster and load balancer resources, so we’ve added functionality to share TCP connections between replications. From internal tests, we’ve seen this reduce the number of open TCP connections by up to 90% and allow for a much larger number of concurrent replication jobs to be scheduled.
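To make the idea of sharing connections concrete, here is a toy Python model. It is not the replicator's implementation (CouchDB is written in Erlang); the class name `SharedConnectionPool`, its methods, and the tuple used as a stand-in for a socket are all hypothetical. It only illustrates the principle: idle connections are kept per endpoint and handed to whichever replication asks next, instead of each replication opening its own.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class SharedConnectionPool:
    """Toy pool (hypothetical): idle connections are kept per endpoint and reused."""

    def __init__(self):
        self._idle = defaultdict(list)  # endpoint key -> idle connections
        self.opened = 0                 # how many "connections" were ever opened

    @staticmethod
    def _key(url):
        parts = urlsplit(url)
        return (parts.scheme, parts.hostname, parts.port)

    def acquire(self, url):
        key = self._key(url)
        if self._idle[key]:
            return self._idle[key].pop()   # reuse an idle connection
        self.opened += 1
        return (key, self.opened)          # stand-in for a real socket

    def release(self, conn):
        key, _ = conn
        self._idle[key].append(conn)       # make it available to other replications


# Ten replications talking to the same host, one request at a time,
# end up sharing a single connection in this model.
pool = SharedConnectionPool()
for _ in range(10):
    conn = pool.acquire("http://example.com/source_db/")
    pool.release(conn)
print(pool.opened)
```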

The Replication Scheduler

At the core of the new replicator is a replication scheduler. Prior to the replication scheduler, CouchDB would run all replications concurrently, even when that was beyond the capability of the cluster, resulting in poor performance across all replications. The new replicator runs replications in batches, giving each batch a fixed amount of time before rotating to the next one in round-robin order. This allows a single database node to manage millions of replications.
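The round-robin idea can be sketched with a small simulation. This is a simplified model, not the scheduler's actual algorithm: the function name `run_scheduler` and the parameters `max_jobs` and `cycles` are assumptions made for illustration, and the real scheduler takes more into account (job health, back-off, and so on).

```python
from collections import deque


def run_scheduler(job_ids, max_jobs, cycles):
    """Return, for each scheduling cycle, the batch of jobs that would run.

    Toy model: at the end of every cycle the running batch goes to the back
    of the queue and the next `max_jobs` jobs are started.
    """
    pending = deque(job_ids)
    running = []
    timeline = []
    for _ in range(cycles):
        pending.extend(running)  # stop the current batch, requeue it
        running = [pending.popleft() for _ in range(min(max_jobs, len(pending)))]
        timeline.append(list(running))
    return timeline


# Five jobs, at most two running at once:
for batch in run_scheduler(["a", "b", "c", "d", "e"], max_jobs=2, cycles=4):
    print(batch)
# ['a', 'b']
# ['c', 'd']
# ['e', 'a']
# ['b', 'c']
```

In this model, when there are fewer jobs than `max_jobs`, every job is in every batch, which mirrors the behavior described later in this post: a replication is only suspended when there is more work than slots.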

Also, replications now have well-defined state transitions.

  1. initializing: the replication has been added to the scheduler, but has not yet been initialized or scheduled
  2. error: the replication could not be turned into a job (e.g. because it is a filtered replication and filter code could not be fetched from the source database)
  3. pending: the replication has been scheduled to run
  4. running: the replication is currently running
  5. crashing: a temporary error has occurred with the job. The job will be retried later
  6. completed: the job has completed (only non-continuous replications reach this state)
  7. failed: the job has permanently failed and will not be retried (e.g. the source or target URLs are invalid)

[Figure: Replication Scheduler Diagram]
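
Client code that monitors replications may find it convenient to model the states listed above as an enumeration. The sketch below is illustrative only: the names `ReplicationState`, `TERMINAL_STATES`, and `is_terminal` are hypothetical, and treating completed and failed as the terminal states is our reading of the descriptions above (neither is retried).

```python
from enum import Enum


class ReplicationState(Enum):
    INITIALIZING = "initializing"
    ERROR = "error"
    PENDING = "pending"
    RUNNING = "running"
    CRASHING = "crashing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: these two are the states from which a job is never retried.
TERMINAL_STATES = frozenset({ReplicationState.COMPLETED, ReplicationState.FAILED})


def is_terminal(state_name):
    """True if a state string reported by the scheduler is terminal."""
    return ReplicationState(state_name) in TERMINAL_STATES


print(is_terminal("crashing"))   # False
print(is_terminal("failed"))     # True
```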

New API Endpoints

With the previous iteration of the replication API, users of CouchDB and Cloudant had problems understanding the status of their replications. We’ve now added two new API endpoints, /_scheduler/docs and /_scheduler/jobs, to make it easy to see which of the several new states a replication job can be in.

The jobs endpoint provides information about every running replication. Let’s dive into an example:

> curl -u 'user:pass' 'http://<account>.cloudant.com/_scheduler/jobs'
{
 "total": 1,
 "offset": 0,
 "jobs": [
   {
     "database": "_replicator",
     "id": "88b3c32b6d911f1cc0f6d8de4c66e999+continuous+create_target",
     "pid": null,
     "source": "http://user:*****@<account>.cloudant.com/source_db/",
     "target": "http://user:*****@<account>.cloudant.com/target_db/",
     "user": null,
     "doc_id": "myrep",
     "history": [
       {
         "timestamp": "2016-11-10T06-51-19Z",
         "type": "crashed",
         "reason": "db_not_found: could not open http://user:*****@<account>.cloudant.com/source_db/"
       },
       {
         "timestamp": "2016-11-10T06-51-19Z",
         "type": "started"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "crashed",
         "reason": "db_not_found: could not open http://user:*****@<account>.cloudant.com/source_db/"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "started"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "added"
       }
     ],
     "node": "node1@127.0.0.1",
     "start_time": "2016-11-10T06-50-34Z"
   }
 ]
}

You can see that there is one replication job, and that it crashed immediately each time it started. It tried twice to open the source database, which doesn’t exist yet. Let’s create it!
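
If you want to read a job's history programmatically rather than by eye, a small helper along these lines could do it. This is a sketch that assumes the response shape shown above (a `history` list ordered newest first, with `type` and optional `reason` fields); the function name `summarize_job` and the trimmed sample are ours.

```python
def summarize_job(job):
    """Summarize one entry of the "jobs" array from /_scheduler/jobs."""
    history = job.get("history", [])  # assumed newest-first, as in the example
    crashes = [event for event in history if event["type"] == "crashed"]
    return {
        "doc_id": job.get("doc_id"),
        "latest_event": history[0]["type"] if history else None,
        "crash_count": len(crashes),
        "last_crash_reason": crashes[0].get("reason") if crashes else None,
    }


# Trimmed version of the job shown above.
sample_job = {
    "doc_id": "myrep",
    "history": [
        {"timestamp": "2016-11-10T06-51-19Z", "type": "crashed",
         "reason": "db_not_found: could not open source_db"},
        {"timestamp": "2016-11-10T06-51-19Z", "type": "started"},
        {"timestamp": "2016-11-10T06-50-35Z", "type": "crashed",
         "reason": "db_not_found: could not open source_db"},
        {"timestamp": "2016-11-10T06-50-35Z", "type": "started"},
        {"timestamp": "2016-11-10T06-50-35Z", "type": "added"},
    ],
}

summary = summarize_job(sample_job)
print(summary["latest_event"], summary["crash_count"])  # crashed 2
```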

> curl -u 'user:pass' -XPUT 'http://<account>.cloudant.com/source_db'
{"ok": true}

Now let’s wait for a minute and see if it can successfully start.

> curl -u 'user:pass' 'http://<account>.cloudant.com/_scheduler/jobs'
{
 "total": 1,
 "offset": 0,
 "jobs": [
   {
     "database": "_replicator",
     "id": "88b3c32b6d911f1cc0f6d8de4c66e999+continuous+create_target",
     "pid": "",
     "source": "http://user:*****@<account>.cloudant.com/source_db/",
     "target": "http://user:*****@<account>.cloudant.com/target_db/",
     "user": null,
     "doc_id": "myrep",
     "history": [
       {
         "timestamp": "2016-11-10T06-52-19Z",
         "type": "started"
       },
       {
         "timestamp": "2016-11-10T06-51-19Z",
         "type": "crashed",
         "reason": "db_not_found: could not open http://user:*****@<account>.cloudant.com/source_db/"
       },
       {
         "timestamp": "2016-11-10T06-51-19Z",
         "type": "started"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "crashed",
         "reason": "db_not_found: could not open http://user:*****@<account>.cloudant.com/source_db/"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "started"
       },
       {
         "timestamp": "2016-11-10T06-50-35Z",
         "type": "added"
       }
     ],
     "node": "node1@127.0.0.1",
     "start_time": "2016-11-10T06-50-34Z"
   }
 ]
}

Excellent! Since I have fewer than the maximum number of replications running, my replication will never be suspended.

The docs endpoint can be used to track the status of replication documents. In fact, after writing a new replication document, it is no longer necessary to read or monitor that document directly, as its state is available through the docs endpoint. The difference between the two endpoints is that jobs focuses on replication tasks and docs on replication documents. Most often a replication document will create a new replication job, but not always. For example, a document might have a malformed source or target URL, or it could specify a replication that was already created by another document. Also, a replication could be created via the _replicate endpoint and not be backed by a document in a _replicator database.

Both the jobs and docs endpoints let you look up a single job or document by its ID. Continuing with the previous example:

> curl -u 'user:pass' 'http://<account>.cloudant.com/_scheduler/docs/_replicator/myrep'
{
 "database": "_replicator",
 "doc_id": "myrep",
 "id": "88b3c32b6d911f1cc0f6d8de4c66e999+continuous+create_target",
 "node": "node1@127.0.0.1",
 "source": "http://user:*****@<account>.cloudant.com/source_db/",
 "target": "http://user:*****@<account>.cloudant.com/target_db/",
 "state": "running",
 "info": null,
 "error_count": 2,
 "last_updated": "2016-11-10T06-52-19Z",
 "start_time": "2016-11-10T06-50-34Z"
}

This way you can query the status of a document-triggered replication directly. Both endpoints accept query string parameters for filtering and pagination; see the documentation for details.
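
When building these URLs in application code, it helps to percent-encode the database name and document ID. The helper below is a hypothetical sketch: `scheduler_docs_url` is our own name, the path layout follows the curl example above, and the `limit` and `skip` parameters in the usage lines are assumed pagination parameters that should be checked against the documentation.

```python
from urllib.parse import quote, urlencode


def scheduler_docs_url(base_url, db=None, doc_id=None, **params):
    """Build a /_scheduler/docs URL, optionally scoped to a database and document."""
    path = "/_scheduler/docs"
    if db is not None:
        path += "/" + quote(db, safe="")
        if doc_id is not None:
            path += "/" + quote(doc_id, safe="")
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    return url


base = "http://ACCOUNT.cloudant.com"  # placeholder account host
print(scheduler_docs_url(base, "_replicator", "myrep"))
# http://ACCOUNT.cloudant.com/_scheduler/docs/_replicator/myrep
print(scheduler_docs_url(base, limit=10, skip=20))  # limit/skip are assumptions
# http://ACCOUNT.cloudant.com/_scheduler/docs?limit=10&skip=20
```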

Replicator Database Bloat

CouchDB keeps track of persistent replications in special replicator databases. If there’s an error, the replicator database will be updated so users can see if there’s an issue with their replication. However, too many updates to the replicator database (particularly conflicting updates) can outpace the compactor, causing unbounded growth of the replicator database.

We’ve done two things to fix the issue. Firstly, we implemented exponential back-off for crashing replications. This means that crashing replications will update the replicator database much less frequently. Secondly, updating the replicator database when a replication crashes is now optional. If disabled, replicator documents will only be updated when the replication is started and when a terminal state is reached.
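
As a rough illustration of exponential back-off, the sketch below doubles the wait after each consecutive crash, up to a cap. The function name `backoff_delay`, the 30-second base, and the 8-hour cap are hypothetical values chosen for the example, not the replicator's actual settings.

```python
def backoff_delay(consecutive_crashes, base_seconds=30, max_seconds=8 * 3600):
    """Seconds to wait before retrying a job that has crashed N times in a row."""
    if consecutive_crashes <= 0:
        return 0
    return min(base_seconds * 2 ** (consecutive_crashes - 1), max_seconds)


for n in range(1, 6):
    print(n, backoff_delay(n))
# 1 30
# 2 60
# 3 120
# 4 240
# 5 480
```

Because each retry (and therefore each potential document update) is spaced further apart than the last, a persistently crashing replication generates far fewer writes over time than one retried at a fixed interval.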

API Changes, Backwards Compatibility, and Timing

The new replicator will not update replication docs with transient states. Therefore, if you are reading or manipulating replicator document states, you will need to change your application code to use the _scheduler/docs API. IBM Cloudant will run in a compatibility mode for at least the next 90 days to allow customers to update their applications. During that period, the replicator will continue updating documents with transient replication states.

After the compatibility mode window of the new replicator has passed, the following API changes will also occur:

  • New replication document state: failed. Replications end up in this state when the replication document is malformed (for example, it has an invalid source or target URL) or when it specifies a duplicate replication, meaning another replication document already has the same replication parameters.
  • The scheduler may stop a replication task and start it again later. While a task is stopped, its state will not appear in the _active_tasks endpoint. Use the _scheduler/docs endpoint instead to query its status. If the replication was started from the _replicate endpoint, use the _scheduler/jobs endpoint.
  • For compatibility, replication documents will continue to be updated with triggered and error states. However, the states shown in the _scheduler/docs and _scheduler/jobs endpoints will be one of: initializing, failed, error, running, pending, crashing and completed. These states are more detailed and provide more information than before. The mapping between the previous states and the new ones is roughly like this:

    • triggered -> initializing, running or pending
    • error -> failed, error or crashing
    • completed -> completed
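
Applications that still reason in terms of the old document states can translate the new scheduler states back using the rough mapping above. The following is a minimal sketch; `LEGACY_STATE_FOR` and `legacy_state` are hypothetical names, and the table simply inverts the mapping listed above.

```python
# New scheduler state -> previous replication document state (rough mapping).
LEGACY_STATE_FOR = {
    "initializing": "triggered",
    "running": "triggered",
    "pending": "triggered",
    "failed": "error",
    "error": "error",
    "crashing": "error",
    "completed": "completed",
}


def legacy_state(scheduler_state):
    """Translate a state from _scheduler/docs or _scheduler/jobs to the old vocabulary."""
    return LEGACY_STATE_FOR[scheduler_state]


print(legacy_state("pending"))    # triggered
print(legacy_state("crashing"))   # error
```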

Again, IBM Cloudant will keep the old functionality of the replication API for backward compatibility while our customers transition to the new jobs and docs API endpoints. However, beginning 90 days after the publication of this post, Cloudant will update its environments to stop writing errors to replicator documents, and customers will have to use the new endpoints to get the current status of replication jobs.

In Closing

Let us know what you think! We hope you’re as excited about these improvements as we are. Reach out to Cloudant Support via the Cloudant Dashboard if you have any questions or concerns.

© “Apache”, “CouchDB”, “Lucene”, “Apache CouchDB”, “Apache Lucene”, and the CouchDB and Lucene logos are trademarks or registered trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

2 comments on "A new replicator for Apache CouchDB & IBM Cloudant"

  1. This is great. Especially the connection pooling because anything improving the overall performance is a win for everyone using CouchDB/Cloudant.

  2. Noushin Kananian July 12, 2017

    Many thanks for sharing your experience and invaluable knowledge!
