Whether your organization sells primarily to consumers or to businesses, it’s the strength of your customer relationships that makes all the difference. Increasingly, companies are realizing that building and maintaining those relationships requires more than a one-size-fits-all approach to customer service.
Instead of thinking in terms of business-to-business (B2B) or business-to-consumer (B2C), the new paradigm is business-to-individual (B2I): you must know your customer, and tailor your services to their personal needs and preferences. But how do enterprises enable this paradigm shift in their approach to customer engagement? The answer is a game-changing piece of technology: the recommendation engine.
Giving Customers Their Own Personal Shopper
Recommendation engines are used in a wide range of industries to improve the customer experience by trying to intuit what the user wants to do next. Because they were originally popularized by online retailers such as Amazon, a retail example is probably the most familiar way to understand how recommendation engines help to personalize service delivery.
Traditional recommendation engines work offline: a batch process passes each customer’s purchase history through a set of complex algorithms, and generates personalized recommendations once a day or once a week. When first introduced, these daily or weekly recommendations provided a huge competitive advantage for the retailers who adopted them – but as they became more widespread, that advantage dulled. Today, it is no longer enough to have a basic recommendation engine; you need one that is sophisticated enough to set you apart from the competition.
To understand the shortcomings of offline, batch-processed recommendations, imagine that a customer visits your online store, sees a recommendation for a shirt, and adds it to their basket. What does your offline recommendation engine typically suggest they buy next? The answer, in most cases, is more shirts.
Your engine knows what the customer bought yesterday, or last week; but it doesn’t know what they’ve just added to their basket, and it can’t adjust to accommodate this new knowledge. As a result, its subsequent recommendations will be less well tailored and less compelling, and the customer is likely to ignore them.
A better approach would be to react to the customer’s actions while they are still browsing your site, and recalculate their recommendations in real time – for example, to suggest accessories, pants or jackets to match that new shirt. This would give the online shopping experience the same personalized feel as talking to a knowledgeable sales assistant in-store, and is likely to result in larger basket sizes and higher revenues through successful up- and cross-selling.
Why Aren’t We Doing This Already?
If real-time customer behavior analytics can make such an important contribution to sales, why don’t more companies do it? If they already have a recommendation engine, why don’t they just run it for every transaction, instead of scheduling daily, weekly or monthly batch jobs?
The problem lies in the increasing complexity of the recommendation models, and the massively increased volumes of data that these models need to process. As the volume and complexity grow, the underlying architecture of the systems struggles to cope, and the process can take minutes or even hours to complete – much too long for customers to wait.
Let’s take a look at why this happens. The basic principle behind most recommendation algorithms is the same: they depend on finding objects that have similar properties – for example, finding customers who live in the same area, who are of similar age, or who share interests. The first step is therefore to score each pair of customers according to how similar they are to one another.
The second step is to find out what actions each customer has taken – in this case, what products they have purchased. Imagine Alice has purchased two books: The Grapes of Wrath and Of Mice and Men. If Bob subsequently also purchases The Grapes of Wrath, and our scoring model rates him as being highly similar to Alice, we will recommend that he should add Of Mice and Men to his basket too.
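The two steps above can be sketched in a few lines of Python. This is only an illustration, not the scoring model of any particular engine: the customer attributes, the third customer Carol, the Jaccard similarity measure and the 0.5 threshold are all assumptions chosen to keep the example small.

```python
from typing import Dict, List, Set

# Hypothetical sample data: attributes used for similarity, plus purchase history.
profiles: Dict[str, Set[str]] = {
    "Alice": {"boston", "age_30s", "classic_fiction"},
    "Bob": {"boston", "age_30s", "classic_fiction", "cycling"},
    "Carol": {"seattle", "age_50s", "gardening"},
}
purchases: Dict[str, Set[str]] = {
    "Alice": {"The Grapes of Wrath", "Of Mice and Men"},
    "Bob": {"The Grapes of Wrath"},
    "Carol": {"Gardening Handbook"},
}


def similarity(a: str, b: str) -> float:
    """Step 1: score two customers by shared attributes (Jaccard similarity, an assumption)."""
    pa, pb = profiles[a], profiles[b]
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0


def recommend(customer: str, threshold: float = 0.5) -> List[str]:
    """Step 2: suggest products bought by sufficiently similar customers."""
    owned = purchases[customer]
    scores: Dict[str, float] = {}
    for other in purchases:
        if other == customer:
            continue
        score = similarity(customer, other)
        if score < threshold:
            continue  # not similar enough to trust their purchases
        for product in purchases[other] - owned:
            scores[product] = scores.get(product, 0.0) + score
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))


print(recommend("Bob"))
```

With this sample data, Bob is similar to Alice but not to Carol, so only Alice's purchases feed his recommendations. Note that both functions do nothing but follow relationships: customer to customer, then customer to product.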
The key point to notice here is that both stages of the process depend on relationships: first, the relationship between the two customers, and second, the relationships between the customers and their purchases. The speed with which you can query and traverse these relationships is fundamental to your ability to deliver recommendations in real time.
However, when most of today’s recommendation systems were originally built, there was no well-established technology on the market designed for analyzing relationships between data quickly and efficiently.
Relational databases can model relationships between records; but to traverse those relationships, you need to write SQL queries that join tables together. The joining process is computationally expensive, and becomes slower as the number of joins increases – which makes real-time analysis impractical at scale.
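To make the cost of joins concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, and a production schema would look different, but the shape of the query is typical: even a simple "customers who bought this also bought" recommendation needs four joins plus a subquery.

```python
import sqlite3

# Hypothetical schema: two entity tables plus a linking table for purchases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE purchases (customer_id INTEGER, product_id INTEGER);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO products  VALUES (1, 'The Grapes of Wrath'), (2, 'Of Mice and Men');
INSERT INTO purchases VALUES (1, 1), (1, 2), (2, 1);
""")

# Each hop along a relationship becomes another JOIN.
RECOMMEND_SQL = """
SELECT DISTINCT p.title
FROM customers AS me
JOIN purchases AS mine   ON mine.customer_id = me.id
JOIN purchases AS theirs ON theirs.product_id = mine.product_id
                        AND theirs.customer_id <> me.id
JOIN purchases AS other  ON other.customer_id = theirs.customer_id
JOIN products  AS p      ON p.id = other.product_id
WHERE me.name = ?
  AND other.product_id NOT IN (
      SELECT product_id FROM purchases WHERE customer_id = me.id)
ORDER BY p.title
"""


def recommend_sql(name: str) -> list:
    """Return titles bought by co-purchasers that `name` does not own yet."""
    return [row[0] for row in conn.execute(RECOMMEND_SQL, (name,))]


print(recommend_sql("Bob"))
```

On three rows this is instant. The concern described above is what happens when the purchases table holds a very large number of rows and each additional hop adds another join over it.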
NoSQL databases – for example, JSON document stores like MongoDB – struggle with the same problem for a different reason: they offer little or no built-in support for traversing connections between records. Most of the logic governing the traversal of relationships must be provided by the application layer, which is a complex task for developers, especially when individual records may not follow the same format or handle relationships in a standardized way.
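The following sketch shows what that application-layer burden can look like. The documents are plain Python dictionaries standing in for records in a document store; the field names and the two different record shapes are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical customer documents. Note that the two records store
# their purchases in different shapes, as schemaless records often do.
customers: List[Dict] = [
    {"_id": "alice", "purchases": ["p1", "p2"]},
    {"_id": "bob", "orders": [{"product_id": "p1"}]},
]


def purchased_ids(doc: Dict) -> Set[str]:
    """Normalize both record shapes into one set of product ids."""
    ids = set(doc.get("purchases", []))
    ids.update(order["product_id"] for order in doc.get("orders", []))
    return ids


def co_purchasers(customer_id: str) -> List[str]:
    """Find other customers sharing at least one product, entirely in application code."""
    by_id = {doc["_id"]: doc for doc in customers}
    mine = purchased_ids(by_id[customer_id])
    return sorted(
        doc["_id"]
        for doc in customers
        if doc["_id"] != customer_id and purchased_ids(doc) & mine
    )


print(co_purchasers("bob"))
```

Every new record shape means another branch in `purchased_ids`, and every new kind of relationship means another hand-written traversal like `co_purchasers`.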
The Answer for Your Recommendation Engine: Graph Databases
In recent years, a new type of database has emerged that can help to solve this problem. Graph databases such as IBM Graph are now more than just an interesting topic for computer science research: they are mature, enterprise-ready technologies that have been proven at massive scale, and that many companies are already using in production.
The essence of a graph database is that both records and relationships are treated as first-class citizens. The easiest way to visualize how a graph database works is by contrasting it with a relational database.
Imagine you have a relational database with two tables: Customers and Products. In the schema, you specify a relationship between those two tables: a Customer can purchase many Products, and a Product can be owned by many Customers. However, each time you want to find out which products a particular customer owns, you must first join the Customers and Products tables together (usually by way of an intermediate linking table), and then filter the combined result to find the records that fit your criteria.
In a graph database, data is not segregated into separate tables, and there is no need for joins. The database contains not only Customer and Product elements (known as “vertices” or “nodes”), but also relationship elements (or “edges”) that specify how the vertices are linked to each other. If a customer purchases a product, a new edge element will be added to the database, which explicitly specifies which two objects are linked, and what their relationship is.
Because the relationships are made explicit by the edge elements, traversing the graph from one vertex to another is both conceptually simple and computationally inexpensive. As a result, you can perform sophisticated relationship-based queries in real time – making it possible to analyze patterns in customer behavior at the point of transaction.
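A toy in-memory model helps to show why traversal is cheap. This sketch is not how IBM Graph or any other product stores data internally; the `Graph` class and its method names are hypothetical. The point is that each edge is stored explicitly, so following a relationship is a direct lookup rather than a join.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class Graph:
    """Minimal property graph: vertices plus labeled, directed edges."""

    def __init__(self) -> None:
        self.vertices: Dict[str, dict] = {}
        # (vertex id, edge label) -> ids of neighboring vertices
        self.out_edges: Dict[Tuple[str, str], List[str]] = defaultdict(list)
        self.in_edges: Dict[Tuple[str, str], List[str]] = defaultdict(list)

    def add_vertex(self, vid: str, **props) -> None:
        self.vertices[vid] = props

    def add_edge(self, source: str, label: str, target: str) -> None:
        # The edge records exactly which two vertices are linked, and how.
        self.out_edges[(source, label)].append(target)
        self.in_edges[(target, label)].append(source)

    def out(self, vid: str, label: str) -> List[str]:
        """Follow outgoing edges: one dictionary lookup, no table scan."""
        return list(self.out_edges.get((vid, label), []))

    def in_(self, vid: str, label: str) -> List[str]:
        """Follow incoming edges in the reverse direction."""
        return list(self.in_edges.get((vid, label), []))


g = Graph()
g.add_vertex("A", label="customer", city="Boston")
g.add_vertex("B", label="customer", city="Boston")
g.add_vertex("shirt", label="product")
g.add_edge("A", "Buys", "shirt")  # a purchase adds one new edge
g.add_edge("B", "Buys", "shirt")

print(g.out("A", "Buys"))
print(g.in_("shirt", "Buys"))
```

The work done by `out` and `in_` depends only on the number of edges attached to the vertex in question, not on the total size of the graph.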
This diagram shows a graph that maps the relationships between customers who live in Boston, and the products that they purchase.
g.V().has('customer','name','A').as('custA')
  .out('Buys').aggregate('self')
  .in('Buys').where(neq('custA'))
  .out('Buys').where(without('self'))
  .groupCount().order(local).by(values, decr)
This Gremlin query demonstrates how simple it is to run recommendation-related queries in IBM Graph. In just a few lines of code, the query finds all the products that Customer A purchased, then all the other customers who bought the same products, and finally, all the products that those other customers bought that Customer A has not purchased yet. It then counts how often each of those products appears and lists the most frequent first.
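For readers who are new to Gremlin, the same traversal can be spelled out step by step in plain Python. The purchase data below is invented, and the nested loops are only a rough stand-in for what a graph database does with its edge index, but the comments map each line back to the corresponding Gremlin step.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical 'Buys' edges: customer -> products purchased.
buys: Dict[str, List[str]] = {
    "A": ["shirt", "jeans"],
    "B": ["shirt", "jacket", "belt"],
    "C": ["jeans", "jacket"],
    "D": ["socks"],
}


def recommend(customer: str) -> List[Tuple[str, int]]:
    own = set(buys[customer])                  # out('Buys').aggregate('self')
    counts: Counter = Counter()
    for product in buys[customer]:             # each product the customer bought
        for other, items in buys.items():      # in('Buys'): who else bought it?
            if other == customer:              # where(neq('custA'))
                continue
            if product not in items:
                continue
            for item in items:                 # out('Buys')
                if item not in own:            # where(without('self'))
                    counts[item] += 1          # groupCount()
    # order(local).by(values, decr); the alphabetical tie-break is an addition
    # made here so that the output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(recommend("A"))
```

In this sample, the jacket is reached by two different paths (through customer B via the shirt, and through customer C via the jeans), so it ranks above the belt, which is reached only once.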
Why Aren’t Graph Databases More Widely Used?
The concept of a graph database is relatively easy to understand. In fact, the way most people start designing a data model – by drawing a set of boxes or circles and linking them with lines or arrows – is remarkably close to the way graph databases work in practice.
However, the underlying mechanics of storing and traversing a graph efficiently are very complex, and gaining a deep understanding of graph theory is beyond most non-mathematicians. As a result, many IT professionals mistakenly believe that if the inner workings of a graph database are difficult to understand, it must also be difficult to use.
As graph databases have become more mature, this attitude no longer reflects reality. For example, IBM Graph provides an API that makes it easy to insert or extract data via simple HTTP requests, and an intuitive query language, Gremlin. Unlike SQL queries, where it is vital to understand the underlying structure of the tables that contain your data, Gremlin queries simply specify the result you want to obtain, and the types of edges and vertices you wish to traverse, so there is much less need to learn the full schema in detail before you can get started.
Moreover, as a differentiating factor from most other graph databases, IBM Graph is a fully managed service, delivered via the cloud and supported 24/7 by IBM experts. IBM’s flexible, scalable cloud infrastructure makes it easy to spin up new database instances, and eliminates any concerns about hardware sizing, software versions or patching.
IBM also offers an intuitive web-based user interface that gives users a high level of control to monitor and manage their Graph instances, while abstracting away all the underlying technical complexity. It also enables users to load data, run queries and graphically visualize the results in a few mouse clicks.
While IBM Graph offers powerful capabilities for graph experts, it is also user-friendly for beginners who want to learn more about the technology. The web interface provides sample data and sample queries to help you take your first steps, and hosts a set of easy-to-follow tutorials about graph-related technologies.
A Solution that Comes Highly Recommended
Recommendation engines inherently depend on the ability to connect records together and explore their relationships with each other. Graph databases are built from the ground up to serve this exact function, with none of the drawbacks of relational databases or NoSQL document stores.
By adding a graph database to your toolkit, you can exploit opportunities for real-time, transaction-level analysis of customer behavior – opportunities that may be impossible to grasp with your current systems. If you want to suggest the perfect set of additional purchases to a customer as they move through your online checkout process, the most effective answer is a database that puts relationships first.
Learn more about IBM Graph
- No More Joins: An Overview of Graph Database Query Languages
- Detecting complex fraud in real time with Graph databases
- Getting started with IBM Graph