At first glance, the pricing plans and provisioned throughput capacity for Cloudant on IBM Cloud can be a bit confusing. This post will give you an overview of what you need to know to choose the right pricing plan and handle throughput capacity limits in your app.
How does Cloudant’s pricing work?
Cloudant has three main types of pricing plans:
- Lite (Free)
- Standard*
- Dedicated Hardware
*It’s important to note that within the Standard plan, you can adjust your Provisioned Throughput Capacity, with costs varying accordingly.
The Lite plan comes with a fixed amount of throughput capacity and 1GB of disk space, while the Standard plan is priced based on configurable provisioned throughput capacity and the metered amount of disk usage. The Dedicated plan is not based on throughput capacity, and is therefore outside the scope of this post, but you can find more information about it here.
The table below (taken from the Cloudant Pricing page) outlines the parameters of the Lite plan and shows four example capacity configurations of the Standard plan:
You are not limited to the four levels shown in the table above. As you can see in the screen capture below, Cloudant now gives you more control over the level of throughput capacity you want for your account. You can simply adjust the slider in your Cloudant Dashboard to adjust the capacity of your account. The lowest level you can choose in the Standard Plan is 100 Lookups/sec, 50 Writes/sec, and 5 Queries/sec, and you can select any multiple of that base plan up to 5,000 Lookups/sec, 2,500 Writes/sec, and 250 Queries/sec.
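To make the multiples concrete, here is a small sketch of the arithmetic. It assumes capacity scales linearly in whole blocks of the base level (100/50/5), which is how the slider is described above; the `capacity_for` helper and its `blocks` parameter are made up for illustration and are not part of any Cloudant API.

```python
# Base Standard-plan capacity block, using the figures quoted above.
BASE = {"lookups": 100, "writes": 50, "queries": 5}
MAX_BLOCKS = 50  # 50 x base = 5,000 / 2,500 / 250, the stated maximum


def capacity_for(blocks: int) -> dict:
    """Return per-second limits for a number of base blocks (hypothetical helper)."""
    if not 1 <= blocks <= MAX_BLOCKS:
        raise ValueError(f"blocks must be between 1 and {MAX_BLOCKS}")
    return {event: rate * blocks for event, rate in BASE.items()}


print(capacity_for(1))   # the lowest Standard setting
print(capacity_for(50))  # the highest setting on the slider
```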
The Disk Space part of the pricing is pretty straightforward, so I won’t cover it here. The Provisioned Throughput Capacity requires a little more explanation. You can check out this documentation for a more detailed description of how it works, but here’s the basic rundown:
Each plan gives you a different number of Lookups/sec, Writes/sec, and Queries/sec. So, let’s start by defining each of those:
- Lookup: reading a specific document based on its “_id”
- Write: creating/changing/deleting a document
- Query: reading against an index, i.e. a view or search (including the primary index: _all_docs)
For each of the three event types, each plan sets a “Provisioned Throughput Capacity”, which is basically a rate limit. So, for example, if your Cloudant instance is on the lowest capacity setting of the Standard plan and you make more than 100 lookups in a second, Cloudant will reject every further lookup request within that second with a 429 Too Many Requests response. The event types are limited independently of each other; for example, an overload of lookups will never affect your write requests.
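One way to picture this is a toy model of a single second of traffic. The sketch below is only an illustration: it assumes each event type has its own counter that resets every second, which is a simplification and not a description of how Cloudant enforces the limits internally. The function name is hypothetical, and 200 stands in for any success status.

```python
from collections import Counter

# Limits for the lowest Standard capacity setting, as quoted above.
LIMITS = {"lookup": 100, "write": 50, "query": 5}


def statuses_for_one_second(requests, limits=LIMITS):
    """Return a status code for each request made within one second, in order.

    Assumption: one independent counter per event type, reset every second.
    """
    used = Counter()
    statuses = []
    for event in requests:
        if used[event] < limits[event]:
            used[event] += 1
            statuses.append(200)  # accepted (placeholder for any success status)
        else:
            statuses.append(429)  # over the limit for this event type
    return statuses


# 103 lookups followed by 10 writes, all in the same second.
traffic = ["lookup"] * 103 + ["write"] * 10
result = statuses_for_one_second(traffic)
print(result.count(429))  # only the lookups beyond the first 100 are rejected
print(result[-10:])       # the writes are unaffected by the lookup overload
```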
What happens if I exceed my Provisioned Throughput Capacity?
As previously mentioned, any time you exceed your plan’s capacity, your request will get a 429 response, which means that none of your data will be returned and for writes, no data will be written to Cloudant. If you don’t handle these appropriately, they can cause serious problems for your application. Luckily, there are a few easy ways to handle them.
How do I handle 429 Responses?
There are five main things you can do to prevent 429 responses from causing problems in your application:
- Application Retry: Have your app retry any request to Cloudant that initially gets a 429 response. Cloudant’s official client libraries for Java, Node.js, and Python all have built-in capabilities you can set up to retry requests that return a 429 after some interval of time. This solution is the easiest to implement, so it’s worth including as a backup even if you don’t expect to need it. However, keep in mind that if you are exceeding the rates by a large margin or if you are consistently exceeding the plan rates, retrying won’t completely fix the problem. For example, if your limit is 5 queries/sec, and you try to make 15 queries in a second, 10 will return 429 responses; if all 10 of them retry in the next second, you’ll still have 5 of those returning 429 responses. Or, imagine a situation in which you are consistently making 6 query calls per second (with a 5 query/sec limit); in the first second, one call will return 429 and try again in the next second, which will bring the total queries in the next second to 7; two of those calls will return 429 and retry in the third second, for a total of 8 queries in the third second. In such a case, the 429 responses would keep compounding and you would still end up with some of the calls failing despite the retries. Another thing to consider is that any request that has to be retried will take longer than it would if it succeeded on the first try, so if your application is consistently getting 429 responses and retrying, you might start to notice the slower responses.
- Use Bulk Operations: Use bulk operations to combine multiple calls into one. Cloudant has a bulk document API that can be used to read, create, and update multiple documents in a single request, and the Java, Node.js, and Python libraries all support this functionality. If there is anywhere in your code where multiple write requests or multiple lookups happen at the same time, combining those writes or lookups into a single bulk operation would lower the rate of requests and reduce the likelihood of receiving a 429. One thing to note for bulk reads is that calling the “_all_docs” API endpoint calls the primary index and is therefore counted as a query, while calling the “_bulk_get” API endpoint counts as a lookup. Using bulk operations won’t be applicable in all use cases, but where it is relevant, it can significantly improve efficiency.
- Application Rate-Limiting: Implement rate limiting in your app to prevent it from making more requests per second than your plan allows. This blog post on queueing API requests to use Cloudant more efficiently explains an easy way to do this in node applications by using the qrate library. The basic idea is that you can create a queue for each event type (query, write, lookup), and each queue would have a rate limit corresponding to your plan’s provisioned throughput capacity for that event type; every time the application makes a call to Cloudant, the call is pushed into the appropriate queue. The queue would then regulate the rate of calls actually going through to Cloudant and keep the rate from exceeding the throughput capacity limit. This solution, while a little more difficult to implement, can be very effective in preventing 429 responses. Keep in mind, however, that at times of high traffic, the rate limiting could cause slower response times for your end user.
- Caching: Add caching to your app. Caching the data from your Cloudant databases can reduce the number of calls that are made to Cloudant. Cached data could also be used as a fallback if a call to Cloudant returns a 429 or some other error. Services like Compose for Redis on IBM Cloud can be used for caching and charge only for the amount of storage used, not for the rate of requests made (and as a side note: Using Redis with Cloudant also allows you to do other useful things like build custom indexes). Caching won’t necessarily make sense for every use case, and even if it does, you’ll have to decide what sort of caching patterns will work best. That said, if implemented well, this solution can both reduce the frequency of 429 responses and increase the speed of your application.
- Increase Throughput Capacity: Finally, you can always upgrade from the free Lite to the paid Standard plan, or increase your Provisioned Throughput Capacity in the Standard plan. If you’re considering moving to a higher capacity, you might want to check out the Monitoring tab in your Cloudant dashboard (https://<username>.cloudant.com/dashboard.html#ccm-monitoring/usage), which shows throughput for each event type for the past fifteen minutes. You might also look into your application’s logs to determine how frequently 429 responses are being returned.
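As a rough illustration of the first option, here is a minimal retry sketch. It is not the retry mechanism built into the official client libraries; it is a generic wrapper that assumes your HTTP layer raises some exception on a 429 (the `TooManyRequests` class below is a hypothetical stand-in). The exponential backoff schedule is also a choice made for this example; the libraries may use a different interval.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for whatever your HTTP client raises on a 429."""


def call_with_retry(request, max_attempts=4, base_delay=0.25, sleep=time.sleep):
    """Call request(); on a 429, wait and try again with a doubling delay."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # 0.25s, 0.5s, 1s, ...


# Usage: a fake request that is rejected twice and then succeeds.
outcomes = iter([TooManyRequests(), TooManyRequests(), {"ok": True}])


def fake_request():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


delays = []  # record the waits instead of actually sleeping
result = call_with_retry(fake_request, sleep=delays.append)
print(result, delays)
```

Capping the number of attempts matters for the reason given in the list above: if the application is consistently over its limit, unbounded retries only add to the backlog, so it is better to surface the failure and fall back to a cache, a queue, or a higher capacity setting.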
Keep the tradeoff in perspective, too: it’s possible to spend weeks of time and effort building some grand caching and queuing strategy that costs $50k to implement in order to save $80 a month in cloud costs, which is not the best tradeoff. Your decisions affect both the cloud costs and the maintenance costs associated with your application.
How do I change the plan and/or capacity that my Cloudant instance is on?
You can switch between the three main types of pricing plans in the IBM Cloud Dashboard. Click on your instance of Cloudant in your IBM Cloud Dashboard, then click “Plan” in the left sidebar. You should see the table below showing the three plans; you can select and save the appropriate plan.
If your Cloudant instance is on the Standard plan, you can change the provisioned throughput capacity for your instance in the Cloudant Dashboard. First, launch the Cloudant Dashboard from the Manage page of your Cloudant service on the IBM Cloud. Then go to the Account tab (click the person icon), and you should see the screen shown below, which allows you to adjust Throughput Capacity using a slider and the Update button.
The slider for adjustable Throughput Capacity is a new feature being rolled out to Cloudant accounts over the next few weeks, so it’s possible that you might instead see a table with three tier options in your Cloudant Account tab, like the screenshot below. If that is the case, you can choose between the three Throughput Capacity levels shown or check back in a few weeks once the slider feature has been added to your account.
**Note: the images of pricing tables are all valid as of the time this article was posted (June 2018), but could change in the future.