This post is part of a series of posts created by the two newest members of our Developer Advocate team here at IBM Cloud Data Services. In honour of the book Seven Databases in Seven Weeks by Eric Redmond and Jim R. Wilson, we challenged Lorna and Matt to take a new database from our portfolio every day, get it set up and working, and write a blog post about their experiences. Each post reflects the story of their day with a new database. —The Editors
- Database type: in-memory key value store
- Best tool for: storing computed but non-vital values, such as caching heavy page fragments or keeping running totals
Redis (a sort-of-acronym for REmote DIctionary Server) is an in-memory key value store. First the good news: storing data in memory means that fetching it is blisteringly fast. Any data you can safely store in Redis can be fetched many times faster than it could from any traditional database. With the good news comes some bad news: storing data in memory means you are constrained by how much memory is available for Redis to use. If Redis runs out of memory, you’ll find that writes fail at best, or that Redis dies at worst.
These restrictions make Redis brilliant, but only for certain use cases. These cases are usually where losing some data doesn’t matter hugely, or the data isn’t large, or it’s very important that the data be available in a blisteringly fast manner. It’s rare to see an application that uses Redis as its only storage solution. Redis is usually deployed as an auxiliary datastore alongside a database like PostgreSQL or Cloudant (Apache CouchDB™). Redis makes a brilliant cache and, particularly in modern web applications, can really help to take the load off your primary datastore.
Getting Started with Redis
It’s your choice whether you want to install Redis locally or, for a quick start, you can also spin up a Redis instance on Bluemix (free trial when you sign up). This tutorial will work either way.
You’ll at least want the redis-cli tool installed for this tutorial, regardless of where Redis is actually running. The easiest way to get redis-cli is to install Redis locally, whether you’re using Bluemix or not.
From your Bluemix dashboard, click on Catalog and choose the Data and Analytics option under Services. There you will find the Compose for Redis option that we’ll be using.
Choose this option, then if you’d like to customize the service name, do so and click the Create button. (A default name is always assigned.) When the service finishes provisioning, you should be able to refresh the page and see the dashboard for your Redis installation. Alternatively, you can use the hamburger menu to return to your dashboard and locate your new deployment there.
For now, we’re interested in getting our connection details so we can start using Redis. These can be found on the Service Credentials tab, where there should be a set of credentials created already. When you click View Credentials, you will see a JSON object with various connection details in it. Copy the value of the uri_cli field (a command starting with redis-cli) into your clipboard and then we’re all set!
Redis from the Command Line
Before we get into accessing Redis from specific programming languages, we’ll have a little chat with it using its built-in command line tool, redis-cli. You can simply paste the uri_cli value you copied earlier at the command line and it will open a Redis prompt for you:
$ redis-cli -h bluemix-sandbox-dal-9-portal.1.dblayer.com -p 18491 -a QIJJVMCPGSUZIJBV
bluemix-sandbox-dal-9-portal.1.dblayer.com:18491>
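If you later want the same connection details in a script, the pasted command can be picked apart. Here is a minimal sketch, assuming the uri_cli value has the same shape as the example above (redis-cli followed by -h, -p and -a flags). The function name parse_uri_cli is ours for illustration and is not part of any Redis tooling.

```python
import shlex


def parse_uri_cli(uri_cli):
    """Split a `redis-cli -h HOST -p PORT -a PASSWORD` command into its parts."""
    tokens = shlex.split(uri_cli)
    if not tokens or tokens[0] != "redis-cli":
        raise ValueError("expected a command starting with redis-cli")

    flag_names = {"-h": "host", "-p": "port", "-a": "password"}
    details = {}
    # Walk the flag/value pairs that follow the program name.
    for flag, value in zip(tokens[1::2], tokens[2::2]):
        if flag in flag_names:
            details[flag_names[flag]] = value

    if "port" in details:
        details["port"] = int(details["port"])
    return details


if __name__ == "__main__":
    example = (
        "redis-cli -h bluemix-sandbox-dal-9-portal.1.dblayer.com "
        "-p 18491 -a QIJJVMCPGSUZIJBV"
    )
    print(parse_uri_cli(example))
```

The resulting host, port and password are the values most client libraries ask for when opening a connection.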
Try out your connection by asking Redis how it’s feeling today: use the info command and the response will show you that Redis is working, as well as telling you a raft of other statistics about its health. Redis is a key value store, so try these commands for storing, retrieving, and expiring values:
- SET [key] [value] e.g. SET comments 5 (reuse this command to update a key’s value)
- GET [key] e.g. GET comments (fetches the key’s current value)
- EXPIRE [key] [time] e.g. EXPIRE comments 10 (expires the comments key in 10 seconds’ time)
- SETEX [key] [time] [value] e.g. SETEX comments 5 10 (sets and expires the key in a single command)
So far, hopefully everything seems easy and you’re enjoying your new key value store. Redis can handle more than just keys and values though, so let’s look at some more interesting data types.
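To connect these commands to the caching use case from the introduction, here is a minimal cache-aside sketch in Python. It assumes a client object exposing get and setex methods with the same argument order as the Redis commands (the redis-py library's redis.Redis client does, ideally created with decode_responses=True so values come back as strings). The helper name get_or_compute is hypothetical.

```python
def get_or_compute(client, key, ttl_seconds, compute):
    """Return the cached value for key, computing and caching it on a miss.

    `client` is assumed to offer get(key) and setex(key, seconds, value),
    as redis-py's redis.Redis does.
    """
    cached = client.get(key)  # GET key
    if cached is not None:
        return cached

    # Cache miss: do the expensive work once and store the result as a
    # string, so hits and misses return the same type.
    value = str(compute())
    client.setex(key, ttl_seconds, value)  # SETEX key seconds value
    return value
```

A call such as get_or_compute(client, "comments", 10, count_comments) would then run the hypothetical count_comments function at most once every ten seconds, however often the page is requested.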
Working with Hashes in Redis
In Redis, a hash stores a potentially very large number of fields and values together under a single key, similar to a Python dictionary, which makes hashes ideal for representing objects with a series of properties. The commands for working with hashes are all prefixed with H, so we use HSET to set a hash value and HMSET to set multiple values in a hash. Examples to follow in just a moment, but first let’s talk about keys.
A key in Redis can be any string, and can be set to expire. By convention we namespace keys using the colon (:) character so that similarly-named keys can be searched for. You’ll notice this in the examples too.
HSET product:hat color black
(integer) 1
HGET product:hat color
"black"
The first example is pretty simple, declaring a key product:hat with a field color and a value black, with the command HSET. This command allows us to set one field, and its sister command HGET lets us fetch one field.
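If we want to set multiple fields in a single call (which seems reasonable in any non-trivial programming example), then the HMSET command will help us. Since Redis 4.0, HSET also accepts multiple field-value pairs and HMSET is considered deprecated, but both work: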
HMSET product:hat size L material wool cost 25
OK
HGETALL product:hat
1) "color"
2) "black"
3) "size"
4) "L"
5) "material"
6) "wool"
7) "cost"
8) "25"
To retrieve all the values that we set, the HGETALL command returns all the fields and values to us. If you wanted only the keys or only the values, then the commands HKEYS and HVALS are your friends. Getting the keys and values separately makes more sense in many ways, so you may find these more useful.
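From application code, a hash maps naturally onto a dictionary. The sketch below assumes a client with hset and hgetall methods like those in redis-py (its hset accepts a mapping argument from version 3.5 onwards). The helper names product_key, save_product and load_product are made up for this example.

```python
def product_key(slug):
    """Build a namespaced key such as product:hat."""
    return f"product:{slug}"


def save_product(client, slug, fields):
    """Write every field of the dictionary into the product's hash."""
    # Equivalent to: HSET product:<slug> field1 value1 field2 value2 ...
    return client.hset(product_key(slug), mapping=fields)


def load_product(client, slug):
    """Fetch all fields and values of the product's hash as a dictionary."""
    # Equivalent to: HGETALL product:<slug>
    return client.hgetall(product_key(slug))
```

Calling save_product(client, "hat", {"color": "black", "size": "L"}) would write the same kind of record as the commands above. Bear in mind that Redis stores every value as a string, so a cost of 25 comes back as "25".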
Finding Your Keys
Once you’ve put data into Redis, how can you inspect what’s stored there? Don’t be tempted by the KEYS command. It does what you expect and lists all the keys matching a specified pattern, but it can badly hurt performance.
bluemix-sandbox-dal-9-portal.1.dblayer.com:18491> help KEYS

  KEYS pattern
  summary: Find all keys matching the given pattern
  since: 1.0.0
  group: generic
This command is very expensive to run: it walks the entire keyspace in one go, and other clients have to wait while it does so. So feel free to use it on your toy platform today, but stay well away from running it in production. Instead, the SCAN command is a better approach because it returns results in chunks and can also filter for specific keys. Let’s see an example of the SCAN command in action, on the products example from before:
SCAN 0
1) "0"
2) 1) "appname"
   2) "product:shoes"
   3) "product:hat"
SCAN takes at least one argument: the cursor to start from. When we have long lists of keys, the first call to SCAN will return some keys, plus a cursor to use as the argument for the next call that will return yet more keys. In the output of SCAN, the first number is the cursor, which will be zero if all results have now been returned in this call or sequence of calls. The second value is the list of matching keys, which is why it’s important to be consistent and formulaic about naming your Redis keys. With this approach, I can easily search for all product keys in my database:
SCAN 0 match product:*
1) "0"
2) 1) "product:shoes"
   2) "product:hat"
We can also use the related command HSCAN to look inside one of our product hashes that we created earlier:
HSCAN product:hat 0
1) "0"
2) 1) "color"
   2) "black"
   3) "size"
   4) "L"
   5) "material"
   6) "wool"
   7) "cost"
   8) "25"
This example returned the cursor (0, since this is the entire record) and the same results as HGETALL did earlier. Remember that with HSCAN we can also supply patterns to match and, therefore, finding predictably-named keys in a potentially complex hash becomes much easier.
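The cursor dance described for SCAN is easy to get wrong by hand, so here is a rough sketch of the loop in Python. It assumes a client whose scan method takes cursor, match and count arguments and returns a (cursor, keys) pair, as redis-py's does. The function name scan_all is our own.

```python
def scan_all(client, pattern=None, count=None):
    """Collect every key matching pattern by following the SCAN cursor."""
    cursor = 0
    keys = []
    while True:
        # Each call returns the next cursor plus one chunk of keys.
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        keys.extend(batch)
        # A cursor of zero means the iteration is complete.
        if int(cursor) == 0:
            break
    return keys
```

For example, scan_all(client, "product:*") would gather the product keys chunk by chunk. SCAN may hand back the same key more than once during a full iteration, so wrap the result in set() if duplicates matter. If you are using redis-py, its scan_iter method wraps the same loop for you.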
Sorted Sets in Redis
Redis’s sorted sets feature offers a big performance boost. Sorted sets are collections of values that also have a score associated with them, and their commands are all prefixed with Z. (Redis also has “normal” sets, which are just a collection of values.) The data is stored already-sorted, so there’s no sorting required if you want to get the results in order of score or find out where in a set an item is. This feature is ideal for counting things like clicks, views, scores, votes (assuming the data isn’t critical), and so on. For example, if we wanted to show the most-viewed products on our site, here’s one possible solution:
- use a sorted set to hold a view count for each product
- each time a product is viewed, increment the score on that particular value in the set (use the ZINCRBY command)
- retrieve the most-viewed products by fetching the products with the highest score (use the ZREVRANGE command) and potentially then grab the product info if it is also stored in Redis
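The steps above could look something like this in Python. This is a sketch rather than a finished implementation: it assumes a client with zincrby(name, amount, value) and zrevrange(name, start, end, withscores=...) methods, which is how redis-py 3.0 and later spell them, and the sorted set name views:products is a placeholder we picked for the example.

```python
# Hypothetical name for the sorted set that holds the view counts.
VIEWS_KEY = "views:products"


def record_view(client, product):
    """Add one to the product's score, creating the member if needed."""
    # Equivalent to: ZINCRBY views:products 1 <product>
    return client.zincrby(VIEWS_KEY, 1, product)


def most_viewed(client, limit=3):
    """Return up to `limit` (product, score) pairs, highest score first."""
    # Equivalent to: ZREVRANGE views:products 0 <limit - 1> WITHSCORES
    return client.zrevrange(VIEWS_KEY, 0, limit - 1, withscores=True)
```

Storing the product's key (for example product:hat) as the member means the results of most_viewed can be fed straight into HGETALL to fetch the product details.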
Redis and Persistence
Redis might be an in-memory datastore, but it does have durability features. By default, it flushes to disk periodically, but in my experience not often enough that you’d ever want to rely on this process running at a useful time before your server experienced a problem!
How often you want to store data to disk depends on your use case. Redis is usually used for ephemeral data whose loss doesn’t matter much. If you were keeping information about the most-viewed products in Redis, and your Redis server restarts, then your application should fail gracefully and perhaps fall back to showing random products until the cache warms back up again.
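One way to picture that graceful fallback is the sketch below. The function and parameter names are invented for illustration: fetch_most_viewed stands for whatever call reads the popular products from Redis, and all_products for a list your primary datastore can supply.

```python
import random


def products_to_show(fetch_most_viewed, all_products, count=3):
    """Prefer the most-viewed products, falling back to a random selection."""
    try:
        popular = fetch_most_viewed(count)
    except Exception:
        # Deliberately broad for the sketch: any failure talking to Redis
        # is treated the same as an empty cache.
        popular = []

    if popular:
        return popular

    # Cache is cold or unreachable: show random products instead.
    return random.sample(all_products, min(count, len(all_products)))
```

In a real application you would probably catch only your client library's connection errors and log them, so that genuine bugs are not silently swallowed.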
You can tell Redis to write to disk by using the SAVE command, but beware that this operation will block. (It is not recommended for production use.) There’s an alternative background save mechanism called BGSAVE, which is more appropriate. It’s possible to configure how often you want Redis to write to disk by specifying a number of writes in a particular time frame that should trigger a save.
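Redis expresses those triggers as “save points” in its configuration (the save directive), each one pairing a number of seconds with a number of changes. The snippet below is only a toy model of that rule, written to show how several save points combine. It is not Redis’s own code, and the numbers used are illustrative rather than recommended settings.

```python
def should_snapshot(save_points, seconds_since_last_save, writes_since_last_save):
    """Return True if any (seconds, changes) save point has been reached.

    A save point is reached when at least `seconds` have passed since the
    last save AND at least `changes` writes have happened in that time.
    """
    return any(
        seconds_since_last_save >= seconds and writes_since_last_save >= changes
        for seconds, changes in save_points
    )


# Illustrative save points: after 15 minutes with at least 1 write,
# or after 1 minute with at least 10,000 writes.
EXAMPLE_SAVE_POINTS = [(900, 1), (60, 10000)]
```

With these example values, a busy server is snapshotted about once a minute, while a quiet one can go a quarter of an hour between saves, which is worth bearing in mind when deciding how much data you can afford to lose.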
Redis also has a log (the append-only file, or AOF) that makes it eventually durable, but again the flush-to-disk frequency of this feature is configurable and should be approached with caution. Redis is fast because it is in memory, rather than on disk. So what if you configure Redis to write to disk on every transaction so that data is never lost? You’ll find it runs at the same speed as writing to a file on disk, because that’s exactly what is happening.
Redis and Your Applications
Redis is a brilliant tool. For many teams, it’s the gateway drug to a true polyglot persistence layer in their stack, where a variety of datastores are deployed to serve a number of different needs. Redis has robust replication out-of-the-box, and clustering (currently needs external tools) is also solid and easy to set up. Redis supports various data types really well and has lots of extra features: transactions are built-in, and it also has a good publish/subscribe feature.
For a way of speeding up repeated, painful queries, or to simply store session data in a way that’s quick to access, Redis is a great addition to your stack and works alongside existing storage solutions to enhance their work, rather than replacing them.