Attending North Bay Python last December was a fantastic experience for me, and I really enjoyed getting to represent IBM on stage for a few minutes as part of our sponsorship of the conference. My short presentation focused on writing Python functions that run in OpenWhisk on IBM’s Cloud Functions service. With this setup you can automagically handle management of your GitHub projects, as you will soon see!

Background and Details on OpenWhisk

If you aren’t familiar, OpenWhisk is an Apache Foundation open-source project to build a serverless / function-as-a-service environment. It uses Docker containers as the foundation, spinning up either pre-defined or custom-named containers, running to completion, then exiting. OpenWhisk was developed before Kubernetes, so it has its own Docker orchestration built in.

In addition to the runtime itself, OpenWhisk has pretty solid logging and interactive editing through the web UI. This capability becomes critical when you do anything more than trivial with cloud functions, because the execution environment looks very different from your laptop.

What are Cloud Functions good for?

Cloud Functions are really good when you have code that you want to run after some event has occurred, and you don’t want to maintain a daemon sitting around polling or waiting for that event. A good concrete example of this use case is GitHub webhooks.

If you have a repository where you’d like to automate activities for a new issue or PR (pull request), handling it with Cloud Functions means you don’t need to maintain a full system just to run a small bit of code on these service events.

They can also be used kind of like a web cron, so that you don’t need a full VM running if there is just something you want to fire off once in a while to do 30 seconds of work.

GitHub Helpers using Cloud Functions on OpenWhisk

I wrote a few example uses of this for my open source work. Because my default mode for writing source code is open source, I have quite a few open source repositories on GitHub. They are all under very low levels of maintenance, which I’m aware of but others aren’t. So instead of having pull requests just sit in the void for a month, I thought it would be nice to auto-respond to folks (especially new folks) on the state of the world.

#
#
# main() will be invoked when you Run This Action
#
# @param Cloud Functions actions accept a single parameter, which must be a JSON object.
#
# @return The output of this action, which must be a JSON object.
#
#

import github
from openwhisk import openwhisk as ow


def thank_you(params):
    p = ow.params_from_pkg(params["github_creds"])
    g = github.Github(p["accessToken"], per_page=100)

    repo = g.get_repo(params["repository"]["full_name"])
    name = params["sender"]["login"]
    user_issues = repo.get_issues(creator=name)
    num_issues = len(list(user_issues))

    issue = repo.get_issue(params["issue"]["number"])

    if num_issues < 3:
        comment = """
I really appreciate finding out how people are using this software in
the wide world, and people taking the time to report issues when they
find them.
I only get a chance to work on this project on the weekends, so please
be patient as it takes time to get around to looking into the issues
in depth.
"""
    else:
        comment = """
Thanks very much for reporting an issue. Always excited to see
returning contributors with %d issues created. This is a spare time
project so I only tend to get around to things on the weekends. Please
be patient for me getting a chance to look into this.
""" % num_issues

    issue.create_comment(comment)


def main(params):
    action = params["action"]
    if action == "opened":
        thank_you(params)
        return { 'message': 'Success' }
    return { 'message': 'Skipped invocation for %s' % action }

 

This code is pretty basic: it responds within a second or two of someone opening an issue, telling them what’s up. While you can do a lightweight version of this with GitHub’s native templates, using a cloud functions platform lets you be more specific and customize your response to individuals based on their previous contribution rates. You can also see how you might extend it to do different things based on the content of the PR itself.
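As one possible direction for that kind of extension, here is a minimal sketch that picks a reply based on the text of the issue. The function name pick_reply, the keyword lists, and the reply strings are all made up for illustration; only the "title" and "body" fields under params["issue"] come from the GitHub webhook payload.

```python
# Hypothetical keyword-to-reply table; the keywords and wording are
# illustrative assumptions, not taken from the original project.
REPLIES = [
    (("traceback", "exception", "crash"),
     "Thanks for the bug report. A full traceback helps a lot."),
    (("docs", "documentation", "typo"),
     "Thanks for helping with the documentation."),
]
DEFAULT_REPLY = "Thanks for getting in touch."


def pick_reply(params):
    """Choose a canned reply from the issue's title and body."""
    issue = params.get("issue", {})
    # GitHub sends a null body when the description is left empty.
    text = " ".join([issue.get("title") or "", issue.get("body") or ""]).lower()
    for keywords, reply in REPLIES:
        if any(word in text for word in keywords):
            return reply
    return DEFAULT_REPLY
```

The returned string could then be passed to issue.create_comment() in place of the fixed text used in thank_you().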

Using a Custom Docker Image

IBM’s Cloud Functions provides a set of Docker images for different programming languages (JavaScript, Java, Go, Python2, Python3). In my case I needed more content than was available in the Python3 base image.

The entire system runs on Docker images, so extending those is straightforward. Here is the Dockerfile I used to do that:

# Dockerfile for example whisk docker action
FROM openwhisk/python3action

# add package build dependencies
RUN apk add --no-cache git

RUN pip install pygithub

# GitHub no longer serves the unauthenticated git:// protocol, so install over https
RUN pip install git+https://github.com/sdague/python-openwhisk.git

 

This builds upon the base and installs two additional Python libraries: PyGithub, to make GitHub API access (especially paging) easier, and a utility library I put up on GitHub to keep from repeating code to interact with the OpenWhisk environment.

When you create your actions in Cloud Functions, you just have to specify the Docker image instead of a language environment.

Weekly Emails and Reminders on Cloud Functions

My spare time open source work mostly ends up falling between the hours of 6 – 8am on Saturdays and Sundays, when I’m awake before the rest of the family. One of the biggest problems is figuring out what I should work on, because if I spend an hour figuring that out, it doesn’t leave much time to do some coding. So as a reminder, I set up 2 weekly emails to myself using Cloud Functions.

The first email looks at all the projects I own and provides a list of all the open issues & PRs for them. These are issues coming in from other folks that I should probably respond to, or at least make some progress on. Even just tackling one a week would get me to a zero issue space by mid-spring. That’s one of my 2018 goals.
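A digest like that first email could be assembled roughly as follows. This is a sketch rather than the code from the repository: collect_open_items and format_digest are hypothetical names, the PyGithub import is deferred so the formatting half can be tried without the library or a token, and actually sending the mail is left out.

```python
def collect_open_items(token):
    """Return (repo, number, title, url, is_pr) tuples for repos the token's user owns."""
    import github  # PyGithub; imported lazily so format_digest() works without it

    g = github.Github(token, per_page=100)
    me = g.get_user()
    login = me.login
    items = []
    for repo in me.get_repos():
        if repo.owner.login != login:
            continue  # skip repositories owned by someone else
        # The issues API returns pull requests too; pull_request is set for those.
        for issue in repo.get_issues(state="open"):
            items.append((repo.full_name, issue.number, issue.title,
                          issue.html_url, issue.pull_request is not None))
    return items


def format_digest(items):
    """Group items by repository and render a plain-text email body."""
    if not items:
        return "No open issues or PRs."
    by_repo = {}
    for repo, number, title, url, is_pr in items:
        by_repo.setdefault(repo, []).append((number, title, url, is_pr))
    lines = []
    for repo in sorted(by_repo):
        lines.append("%s (%d open)" % (repo, len(by_repo[repo])))
        # Oldest (lowest-numbered) items first within each repository.
        for number, title, url, is_pr in sorted(by_repo[repo]):
            kind = "PR" if is_pr else "Issue"
            lines.append("  %s #%d: %s" % (kind, number, title))
            lines.append("    %s" % url)
        lines.append("")
    return "\n".join(lines).rstrip()
```

Keeping the formatting separate from the API calls means the email body can be checked locally with a handful of hand-written tuples before anything is deployed.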

The second does a keyword search on Home Assistant’s issue tracker for components I wrote, or that I run in my house and am pretty familiar with. Those are issues that I can probably meaningfully contribute to. Home Assistant is a big enough project now that, as a part-time contributor, finding a narrower slice is important to getting anything done.
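A keyword search like this can lean on GitHub's issue search syntax, which PyGithub exposes as Github.search_issues. In this sketch the helper names build_issue_query and search_keywords are hypothetical, and the repository slug and keywords passed in are placeholders to fill in. It runs one query per keyword and merges the results, which avoids relying on any boolean search operators.

```python
def build_issue_query(repo, keyword, state="open"):
    """Build a GitHub search string for issues in one repo mentioning a keyword."""
    # Quote multi-word keywords so they are matched as a phrase.
    term = '"%s"' % keyword if " " in keyword else keyword
    return "%s repo:%s is:issue state:%s" % (term, repo, state)


def search_keywords(token, repo, keywords):
    """Return matching issues across all keywords, de-duplicated by issue number."""
    import github  # PyGithub; imported lazily so the query builder works without it

    g = github.Github(token, per_page=100)
    found = {}
    for keyword in keywords:
        for issue in g.search_issues(build_issue_query(repo, keyword)):
            found[issue.number] = issue  # the same issue may match several keywords
    return [found[number] for number in sorted(found)]
```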

Those show up in my inbox at 5am on Saturday, so they are at the top of my email when I wake up, and a good reminder to have a look.

The Unknown Unknowns

This was my first deep dive down the functions-as-a-service rabbit hole, and it was a very educational one. The biggest challenge I had was getting into a workflow of iterative development. The execution environment here is pretty specialized, including a bunch of environmental setup.
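One way to shorten that loop is to exercise an action on a laptop before uploading it. The sketch below is an assumption about how such a harness could look, not part of OpenWhisk: invoke_locally is a hypothetical helper that pushes the parameters and the result through JSON, mimicking the rule that an action takes and returns a JSON object, and demo_main is a stand-in for a real main(). It does not reproduce the rest of the platform environment.

```python
import json


def invoke_locally(action, params):
    """Call an action the way the platform would: JSON object in, JSON object out."""
    # Round-trip through JSON so non-serializable values fail here, not in the cloud.
    result = action(json.loads(json.dumps(params)))
    if not isinstance(result, dict):
        raise TypeError("action must return a JSON object (dict), got %r" % type(result))
    return json.loads(json.dumps(result))


def demo_main(params):
    # Stand-in for a real main(); it only looks at the webhook action.
    action = params["action"]
    if action == "opened":
        return {"message": "Success"}
    return {"message": "Skipped invocation for %s" % action}


if __name__ == "__main__":
    # A trimmed-down, hand-written payload; real webhook payloads carry far more fields.
    print(invoke_locally(demo_main, {"action": "closed", "issue": {"number": 1}}))
```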

I didn’t realize how truly valuable a robust Web IDE and detailed log server are in these environments. As someone who would typically just run a VM and put some code under cron, or run a daemon, I’m used to keeping all my normal tools. But the trade-off of getting rid of a server that you need to keep patched is worth it sometimes. I think that as we see a lot of new entrants into the function-as-a-service space, that is going to be what makes or breaks them: how good their tooling is for interactive debugging and iterative development.

Replicating and Extending the GitHub Helpers Project

I’ve got a pretty detailed write-up in the README for how all this works, and how you would replicate this yourself. Pull requests are welcome, as are discussions of related things you might be doing.

This is code that I’ll continue to run to make my GitHub experience better. The pricing on IBM’s Cloud Functions means that this kind of basic usage works fine at the free tier. I hope this post covers most of your questions on GitHub Helpers using Cloud Functions! Ping me on GitHub if you have any more.


Catch up with Sean Dague on his blog


I’m a software engineer, who loves the great outdoors, and seems to pick up far too many hobbies along the way. I’m also a huge believer in Open Source Software, and have been an avid Linux user since 1999. I do open source at work, but I also do plenty of it on my off hours. There really is only one Sean Dague on the internet right now (I feel bad if someone else pops up with the same name), so if you find my name in a software credits file, it’s me.  Over the years I’ve written reasonable projects in Perl, C, Java, C#, Python, and Ruby, with plenty of dabbling in PHP, DHTML and Javascript (both in browser, and as XUL extension).
