If you use Slack, you know it can be hard to find information in the torrent of messages that flow through your account. Our developer advocacy team is part of an enormous Slack account. When I have a question, it’s hard to identify the right channels or members to ask.

Fun fact: We currently have about 3,000 public channels, discussing a variety of topics, including cats!

The problem

Let’s say I went to our team’s Slack account looking for information about the Cloudant schema discovery process, which is used to build and populate a dashDB data warehouse from IBM’s Cloudant NoSQL database. I could:

  • Find channels that contain one or more relevant keywords in the name or purpose. Browsing and joining the channels that match the keywords I seek, like Cloudant (20+ hits), schema (2), and discovery (3), is time-consuming and may not help.

    [Figure: question in Slack]

  • Use the built-in search features (global search, in-channel search, by-user search) to find messages that include Cloudant schema discovery. Results usually vary between no hits and a gazillion, depending on the quality of your exact search term(s) and the number of Slack messages in the system. This never seems to work well for me.

  • Ask people where or whom to ask. Not very efficient.

The solution

Use a graph database to explore relationships

With hundreds of people exchanging thousands of messages daily, chances are good that the information (or contacts) you need can be automatically derived from the messages that were exchanged between users.

[Figure: sa_model_view]

A graph database is the perfect place to load and analyze this data. A graph consists of vertices (nodes) and edges (relationships). In our scenario, Slack users, channels, and keywords are vertices. Relationships between vertices, like user-to-channel, user-to-user, and user-to-keyword, are edges.
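To make the vocabulary concrete, here is a minimal in-memory sketch of such a model in JavaScript. It is only an illustration: the `GraphModel` class, the `kind` and `name` properties, the edge labels, and the `weight` counter are all assumptions made for this example, not the data model the prototype actually uses.

```javascript
// Minimal in-memory graph model: vertices keyed by id, edges in a flat list.
// Class name, property names and edge labels are illustrative assumptions.
class GraphModel {
  constructor() {
    this.vertices = new Map(); // id -> { id, properties }
    this.edges = [];           // { from, to, label, properties }
  }

  // Add a vertex once; repeated calls with the same id return the existing one.
  addVertex(id, properties) {
    if (!this.vertices.has(id)) {
      this.vertices.set(id, { id, properties });
    }
    return this.vertices.get(id);
  }

  // Add an edge between two known vertices, or bump its weight if it exists.
  addEdge(from, to, label) {
    if (!this.vertices.has(from) || !this.vertices.has(to)) {
      throw new Error(`unknown vertex in edge ${from} -> ${to}`);
    }
    const existing = this.edges.find(
      (e) => e.from === from && e.to === to && e.label === label
    );
    if (existing) {
      existing.properties.weight += 1;
      return existing;
    }
    const edge = { from, to, label, properties: { weight: 1 } };
    this.edges.push(edge);
    return edge;
  }
}

const model = new GraphModel();
model.addVertex("user:betty", { kind: "user", name: "betty" });
model.addVertex("channel:cloudant-help", { kind: "channel", name: "cloudant-help" });
model.addVertex("keyword:cloudant", { kind: "keyword", name: "cloudant" });
model.addEdge("user:betty", "channel:cloudant-help", "is_in_channel");
model.addEdge("user:betty", "keyword:cloudant", "mentions_keyword");
model.addEdge("user:betty", "keyword:cloudant", "mentions_keyword"); // weight becomes 2
```

Keeping a weight on each edge is one simple way to record how strong a relationship is, for example how often a user mentioned a keyword.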

I built a graph database prototype solution that analyzes these relationships to find answers to common questions. The solution uses a custom slash command as the “public” interface in Slack, a service to process the request, and IBM Graph as the back-end database.

How it works

If you want to find info in Slack using my solution, you first enter the custom slash command /about followed by the search term. So to find info on Cloudant, you’d enter: /about cloudant.

[Figure: Solution overview]

The service queries the graph database and returns the results to Slack for display. Immediately you see the people and channels containing that term.

Retrieve information about channels or users by entering /about #nosql and /about @claudia, respectively.
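The three forms of the command (`/about cloudant`, `/about #nosql`, `/about @claudia`) suggest a small classification step on the text that follows the command. A possible sketch is shown below; the function name `parseAboutQuery` and the shape of its return value are hypothetical, and it assumes the text arrives as the user typed it.

```javascript
// Classify the text that follows "/about" as a channel, user, or keyword query.
// Function name and return shape are assumptions made for this sketch.
function parseAboutQuery(text) {
  const term = (text || "").trim().toLowerCase();
  const prefix = term.charAt(0);
  const type = prefix === "#" ? "channel" : prefix === "@" ? "user" : "keyword";
  // Strip the leading "#" or "@"; keywords are used as-is.
  const value = type === "keyword" ? term : term.slice(1);
  // Treat blank input (or a lone "#" / "@") as an empty query.
  return value === "" ? { type: "empty", value: "" } : { type, value };
}

console.log(parseAboutQuery("cloudant"));
console.log(parseAboutQuery("#nosql"));
console.log(parseAboutQuery("@claudia"));
```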

Building a Slack team graph

To create a graph for a team that represents users, channels, and keywords, we:

  1. Generate social and keyword statistics from the Slack messages. Batch scripts collect the data, operating on exported team message archives. We use Watson’s AlchemyAPI to extract keywords, and we collect user and channel references (like @betty and #cloudant-sdp) to build social stats.

    We’ve really just scratched the surface …
    Additional information could be used to improve result quality. For example, channels frequented primarily by bots (like #cloudant-devops) might be ranked lower than channels with heavy user activity (#cloudant-help).

  2. Build a graph model based on these statistics. The model is a logical representation of the Slack team graph, representing users, channels, keywords, and their relationships. The sample messages shown at the beginning of this post might be represented in the model as follows:

    [Figure: sa_slack_sample_model_view]

    Once all relevant information has been added to the graph model, we can load it into IBM Graph.

    A graph model can be translated on the fly to Gremlin or loaded via bulk input APIs, so we can create many vertices and edges in the database with a relatively small number of requests.

  3. Load the graph model into IBM Graph. We translate the graph model to Gremlin scripts and run those to create the vertices and edges. Once all objects are created, we can use the IBM Graph web console in Bluemix to explore the Slack team graph by running traversal steps.

    For example, to count users, channels, or keywords in the Slack team graph, open the Query tab and enter Gremlin queries like:

    def g=graph.traversal(); g.V().has("isUser", true).count();
    def g=graph.traversal(); g.V().has("isChannel", true).count();
    def g=graph.traversal(); g.V().has("iskeyword", true).count();

    [Figure: Sample Slack graph traversals]
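Going back to step 1, the social part of the statistics could be gathered along the following lines. This is a sketch under stated assumptions, not the batch scripts themselves: it assumes each message is an object with `user` and `text` fields and that references appear as plain `@name` and `#name` tokens, as in the examples above (real export files may encode them differently). The names `extractReferences` and `collectStats` are made up, and keyword extraction is left out.

```javascript
// Patterns for plain-text references. Assumption: "@name" / "#name" tokens
// preceded by whitespace or the start of the message.
const USER_REF = /(?:^|\s)@([a-z0-9][a-z0-9_-]*)/gi;
const CHANNEL_REF = /(?:^|\s)#([a-z0-9][a-z0-9_-]*)/gi;

// Return the user and channel names referenced in one message text.
function extractReferences(text) {
  const users = [...text.matchAll(USER_REF)].map((m) => m[1].toLowerCase());
  const channels = [...text.matchAll(CHANNEL_REF)].map((m) => m[1].toLowerCase());
  return { users, channels };
}

// Count, per message author, how often they reference each user and channel.
function collectStats(messages) {
  const stats = new Map(); // author -> { users: Map, channels: Map }
  for (const { user, text } of messages) {
    if (!stats.has(user)) {
      stats.set(user, { users: new Map(), channels: new Map() });
    }
    const entry = stats.get(user);
    const refs = extractReferences(text);
    for (const u of refs.users) {
      entry.users.set(u, (entry.users.get(u) || 0) + 1);
    }
    for (const c of refs.channels) {
      entry.channels.set(c, (entry.channels.get(c) || 0) + 1);
    }
  }
  return stats;
}

// Hypothetical sample messages, for illustration only.
const stats = collectStats([
  { user: "claudia", text: "Ask @betty about schema discovery, or try #cloudant-sdp" },
  { user: "claudia", text: "@betty thanks!" },
  { user: "betty", text: "see #cloudant-help" },
]);
```

Counts like these map naturally onto user-to-user and user-to-channel edges in the graph model.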

Here’s the big picture of how we create the graph:

[Figure: Loading Slack metadata into IBM Graph]
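The last stage of that pipeline, translating the model into Gremlin, might look roughly like the sketch below. The post does not show the generated scripts, so everything here is an assumption: the `toGremlinScript` helper, the `name` property, and the `is_in_channel` edge label are invented for illustration. The generated script relies only on the standard TinkerPop 3 calls `graph.addVertex(...)` and `vertex.addEdge(...)`; any schema setup or transaction handling the service may need is not covered.

```javascript
// Quote a value as a Groovy literal. Single-quoted strings are used so that
// "$" in a value is not treated as string interpolation.
function groovyLiteral(value) {
  if (typeof value === "boolean" || typeof value === "number") {
    return String(value);
  }
  const escaped = String(value).replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return `'${escaped}'`;
}

// Turn a list of vertices and edges into a single Gremlin (Groovy) script,
// so many objects can be created with one request.
function toGremlinScript(vertices, edges) {
  const varNames = new Map(); // vertex id -> script variable name
  const lines = [];
  vertices.forEach((vertex, index) => {
    const varName = `v${index}`;
    varNames.set(vertex.id, varName);
    const args = Object.entries(vertex.properties)
      .flatMap(([key, value]) => [groovyLiteral(key), groovyLiteral(value)])
      .join(", ");
    lines.push(`def ${varName} = graph.addVertex(${args});`);
  });
  for (const edge of edges) {
    const from = varNames.get(edge.from);
    const to = varNames.get(edge.to);
    if (!from || !to) {
      throw new Error(`edge references unknown vertex: ${edge.from} -> ${edge.to}`);
    }
    lines.push(`${from}.addEdge(${groovyLiteral(edge.label)}, ${to});`);
  }
  return lines.join("\n");
}

const script = toGremlinScript(
  [
    { id: "u1", properties: { name: "betty", isUser: true } },
    { id: "c1", properties: { name: "cloudant-help", isChannel: true } },
  ],
  [{ from: "u1", to: "c1", label: "is_in_channel" }]
);
console.log(script);
```

Batching vertices and edges into one script is what keeps the number of requests small; how many objects fit into one script in practice would need to be tested against the service.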

How Slack users access the graph

To give users easy access to the graph from within Slack, we’ve created a simple service called about, implemented in Node.js. This service extracts the query details (channel name, user name, or keyword) from the Slack request, connects to IBM Graph, and runs predefined graph traversals using the IBM Graph client library (hat tip to Mike Elsmore). The results are visible only to the user who invoked the slash command.
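Two pieces of that flow can be sketched without touching the database: picking a predefined traversal for a query, and wrapping results in a reply that only the requesting user sees. The code below is an assumption-laden illustration rather than the service's actual code. The traversal templates, the `name` property, and the edge labels are hypothetical; only the `isUser` and `isChannel` properties appear in this post. The call to IBM Graph itself is omitted because the client library's API is not shown here. The `response_type: "ephemeral"` field is the documented way for a Slack slash command response to stay private to the invoking user.

```javascript
// Predefined traversal templates per query type. Edge labels and the "name"
// property are hypothetical; isUser / isChannel are taken from the post.
const TRAVERSALS = new Map([
  ["channel", (v) => `g.V().has("isChannel", true).has("name", "${v}").in("is_in_channel").values("name")`],
  ["user", (v) => `g.V().has("isUser", true).has("name", "${v}").out("is_in_channel").values("name")`],
  ["keyword", (v) => `g.V().has("name", "${v}").in("mentions_keyword").values("name")`],
]);

// Only allow simple names, so user input cannot alter the Gremlin script.
const SAFE_VALUE = /^[a-z0-9][a-z0-9._-]*$/;

// Build the Gremlin script for a query of the form { type, value }.
function buildTraversal(query) {
  const template = TRAVERSALS.get(query.type);
  if (!template || !SAFE_VALUE.test(query.value)) {
    throw new Error("unsupported query");
  }
  return `def g=graph.traversal(); ${template(query.value)};`;
}

// Wrap result names in a Slack reply visible only to the invoking user.
function formatSlackResponse(query, names) {
  const text =
    names.length === 0
      ? `Nothing found for "${query.value}".`
      : `Results for "${query.value}": ${names.join(", ")}`;
  return { response_type: "ephemeral", text };
}

const query = { type: "channel", value: "nosql" };
console.log(buildTraversal(query));
console.log(formatSlackResponse(query, ["claudia", "betty"]));
```

Restricting values to a conservative character set is a deliberate simplification; multi-word search terms would need different handling.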

[Figure: About service details]

Sound interesting? Ready to explore your Slack Graph? Start here.
