During the last couple of years, cloud computing technologies have been developing at lightning speed. New infrastructures, services, ways to provide high availability, and approaches to building apps have made developers’ lives a lot easier, not to mention that products can now reach end-users a lot faster.
Today, enterprises develop cloud solutions using microservice-based architectures, decomposing complex applications into small, independent modules that can be easily replaced. OpenWhisk, the technology we would like to discuss in this post, helps to build such microservice-based architectures by simplifying the development process.
On its official website, IBM defines OpenWhisk as “a new event-driven platform that lets developers quickly and easily build feature-rich apps that automatically trigger responses to events.”
OpenWhisk completely eliminates any infrastructure- and operations-related challenges from the process of software development and this is what makes it so powerful. Fault tolerance, load balancing, auto-scaling, and other complex aspects are abstracted away from engineers, so that they can focus entirely on building the logic of their microservice-based software.
In this blog post, we will pay special attention to the application of OpenWhisk in the IoT sphere. That said, OpenWhisk is a general-purpose technology, and IoT is just one of the areas where it can bring its advantages.
The modern cloud ecosystem
A short while ago, it seemed that tools such as OpenStack and Cloud Foundry, which manage virtual machines, runtimes, and container environments, had covered the entire cloud computing market. However, OpenWhisk and similar products have created an entirely new type of cloud environment: platforms for event-driven applications, e.g., IoT. Having occupied this fresh niche, they now have solid potential to change the ecosystem of cloud technologies. The following diagram shows the three types of cloud-native environments that exist today, along with specific technology examples.
What OpenWhisk does
The high-level workflow
The three pillars that OpenWhisk is based on are:
- Triggers. A trigger is an external event that requires some kind of processing. Triggers usually carry payload data.
- Actions. An action is a piece of code that can be executed in response to a trigger.
- Rules. A rule binds one or more actions to a trigger; it defines which actions will be executed in response to a trigger event.
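To make these pillars more tangible, here is a minimal sketch of an OpenWhisk action in Node.js. In OpenWhisk, an action is simply a function named `main` that receives a JSON object of parameters (including the trigger payload) and returns a JSON result; the `name` parameter below is invented for illustration.

```javascript
// A minimal OpenWhisk action: a function named `main` that takes a JSON
// object of parameters (the trigger payload) and returns a JSON result.
function main(params) {
  // `name` is an illustrative parameter, not a standard OpenWhisk field.
  var name = params.name || 'stranger';
  return { greeting: 'Hello, ' + name + '!' };
}
```

Once such an action is deployed, a trigger and a rule can bind an event to it via the `wsk` CLI, roughly as `wsk action create hello hello.js`, `wsk trigger create myTrigger`, and `wsk rule create myRule myTrigger hello` (the entity names here are made up).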
Below are two diagrams that show how it works. The first one is an overview of OpenWhisk’s basic workflow on the highest level of abstraction.
Next is a high-level overview of what happens under the hood of OpenWhisk.
The following table provides more details on what each of the components you saw on the previous diagram does.
|Component|Description|
|---|---|
|NGINX|The endpoint for API calls that works as a proxy|
|Controller|Manages API calls|
|Load balancer|The load balancer’s working principle is very simple: it sends a message to all live invokers in a queue. There is also a separate feed that the load balancer uses to check the availability of invokers. The statuses of all invokers are stored in Consul.|
|Activator|Activators process events produced by triggers. An activator can call all actions bound by a rule to a particular trigger.|
|Invoker|Invokers perform actions and maintain a pool of Docker containers. Two container templates are available: Swift and Node.js; it is also possible to use any custom Docker container with binary code in it. When an application is launched for the first time, its container is added to the pool. From there, containers can start a lot faster, since they do not have to be created from scratch.|
|Kafka + ZooKeeper|Kafka and ZooKeeper work as a message bus that enables interactions between the load balancer, activators, and invokers.|
|Consul|Consul provides a DNS for OpenWhisk’s components. It serves as key-value storage where the health status of slaves (activators and invokers) is saved.|
|Registrator|Registrator automatically registers and deregisters services for any Docker container by inspecting containers as they come online. Registrator supports pluggable service registries, which currently include Consul, etcd, and SkyDNS 2.|
|Cloudant|Cloudant is where authorization data, the history of trigger and action launches, and the results of their work are kept. It also stores the code for actions and information about triggers and packages (the latter can be used as aliases).|
And here is another outline of OpenWhisk’s components that shows their locations and how they interact with each other. It is followed by another workflow diagram.
A typical workflow example
Now that we are familiar with OpenWhisk’s components, let’s take a closer look at the typical workflow that happens inside a system powered by OpenWhisk:
- Let’s assume that a third-party app has received data about some changes. For example, a sensor has detected that the oil level in a vehicle’s gearbox is low.
- The app needs to inform the system by making a request to the NGINX server. The request fires a trigger (‘low_oil_level’) and passes identification parameters, such as the car model and number plate.
Note: OpenWhisk also supports requests for actions, but this feature is most useful for debugging purposes.
- The NGINX server works as an entry point and passes the request to the controller.
- The controller saves this request to the Cloudant database and passes the record ID to the load balancer.
- The load balancer passes the message to an activator, which processes the message rules. The activator runs one or more requests for action(s) and sends them to the load balancer. In our example, there can be several requests, e.g.:
- “Schedule an emergency service appointment.” and
- “Limit the speed of the vehicle to prevent damaging the gearbox.”
- All the messages are stored in Cloudant.
- During the next step, the load balancer distributes action messages among several action invokers.
- An invoker creates (or gets from its own container pool) an appropriate container and runs the action code on this container. The result is stored in Cloudant.
The Docker container is logically bound to a user and an action name. Additionally, it is bound to the version of the action, so one container can execute code for one action and for a specific user only.
After the code has been executed, the Docker container is suspended and moved to the container pool. The size of the container pool is currently limited to ten items. When it becomes full, the oldest containers are removed (this feature can be turned off in the configuration).
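To make the workflow above more concrete, here is what the speed-limiting action from this example might look like as a Node.js action; all parameter names and thresholds are invented for illustration.

```javascript
// Hypothetical action that a rule could bind to the `low_oil_level` trigger.
// It receives the trigger payload and decides which speed limit to apply.
function main(params) {
  var model = params.model || 'unknown model';
  var plate = params.plate || 'unknown plate';
  // Illustrative logic: the lower the oil level (in percent),
  // the stricter the speed limit.
  var maxSpeed = (params.oilLevel < 10) ? 40 : 60; // km/h
  return {
    car: model + ' (' + plate + ')',
    action: 'limit_speed',
    maxSpeed: maxSpeed
  };
}
```

The invoker would run this code in a container from its pool and store the returned JSON result in Cloudant.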
That’s it. The speed has been decreased and maintenance has been scheduled. You can read more about the OpenWhisk architecture in this overview.
Using OpenWhisk as an IBM Bluemix service makes it possible to pay only for the compute resources actually used, even if they are allocated for a few seconds or a fraction of a second. This differs from the traditional architectural approach, where apps are hosted inside a VM or a container and continuously wait for a signal to perform some action. If there are high availability requirements, we need to run multiple instances and pay for them, too.
Another advantage is nearly instant scaling. In a traditional system, an abrupt workload spike can cause queries to be queued for seconds or minutes while the necessary infrastructure is being provisioned; during this time, users will experience significant latencies in system responses. Another downside is that you may be billed on a per-hour basis for VMs and containers that exist for only several minutes. OpenWhisk reacts much faster, so the system can handle requests in real time or close to it, and you are charged just for the time the scaled-out infrastructure needs to handle the workload.
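A quick back-of-envelope calculation illustrates the billing difference; all rates below are made-up assumptions, not actual Bluemix or IaaS prices.

```javascript
// Back-of-envelope cost comparison (all rates are invented assumptions).
// Event-driven billing: pay only for actual execution time.
var invocationsPerDay = 10000;
var secondsPerInvocation = 0.2;
var pricePerSecond = 0.00003;   // assumed serverless rate, $/second
var serverlessDaily = invocationsPerDay * secondsPerInvocation * pricePerSecond;

// Always-on VM: billed for every hour it exists, busy or idle.
var vmPricePerHour = 0.10;      // assumed VM rate, $/hour
var vmDaily = 24 * vmPricePerHour;

console.log('Serverless: $' + serverlessDaily.toFixed(2) + '/day, ' +
            'always-on VM: $' + vmDaily.toFixed(2) + '/day');
```

Under these assumed numbers, the event-driven model costs a small fraction of an always-on VM, and the gap widens further when high availability requires several VM instances.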
Another case to consider is using cloud resources for a low-performance device. Imagine a courageous Raspberry Pi crunching data on the network edge or your iPhone resizing a set of images. OpenWhisk can extend their computing power into the cloud, helping to do the job in seconds instead of minutes or hours.
Here is some more good news: OpenWhisk does not have to run only on Bluemix. The open-source nature of the technology allows users to build their own OpenWhisk clusters on other public or private IaaSes, e.g., OpenStack. So, technically, OpenWhisk jobs can run anywhere.
However, in that case, you are responsible for the utilization of the cluster, and you will use different formulas to manage the economics of operating your event-driven apps. The system will be efficient if the number of OpenWhisk actions running on the infrastructure is high enough to guarantee high utilization of the cluster. In the majority of use cases, it will still be more cost-effective than having all the daemons permanently up and waiting for a start signal; this approach resembles how Hadoop works. A private installation in a local environment is not feasible if you are going to run only 3–4 jobs a day.
OpenWhisk is a promising new approach to software development. From what we have seen so far, it can be suitable for different purposes, e.g.:
- Web app backends. OpenWhisk is great if you need to develop backends for typical web apps. We can imagine scenarios where it will not fit, but it seems OpenWhisk should be a good option for most backend use cases.
- MBaaS. Another good example of a potential OpenWhisk use case is Mobile-Backend-as-a-Service for a mobile app.
- Streams processing. Apps running on OpenWhisk can react to triggers; get information from different channels, e.g., social media; process, combine, and transform this data, etc.
- IoT. As we have mentioned above (see “A typical workflow example”), OpenWhisk looks like a great solution for building IoT apps.
- Data transformation and processing. We can easily imagine OpenWhisk powering multiple small apps, which can be part of a common pipeline for processing and transforming data.
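As an illustration of the last item, here is a sketch of a small transformation action that could form one step of such a pipeline; the record format and field names are invented for illustration.

```javascript
// A hypothetical pipeline step: normalize incoming sensor records.
// Each action does one small job and passes its result down the pipeline.
function main(params) {
  var records = params.records || [];
  var transformed = records.map(function (r) {
    return {
      sensor: (r.sensor || '').trim(),                             // clean up the sensor ID
      celsius: Math.round((r.fahrenheit - 32) * 5 / 9 * 10) / 10   // Fahrenheit -> Celsius
    };
  });
  return { records: transformed };
}
```

Chaining several such single-purpose actions, each fired by the previous step’s output, is the “nanoservice” style of pipeline that OpenWhisk encourages.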
OpenWhisk’s limitations derive from its advantages. For example, porting a big monolithic app—developed traditionally by a team of 50–70 engineers—to OpenWhisk will inevitably create significant architectural complexity, because there will be a considerable number of relatively isolated components. This can be addressed by customizing existing service registry tools or developing new ones.
Another aspect to think about is communication. The horde of nanoservices will need to exchange data with each other and with the databases. To ensure smooth communication, you will need to implement asynchronous communication patterns, deal with circuit breakers, and find ways to guarantee message delivery.
Finally, OpenWhisk does not provide the flexibility of a server or a virtual machine in terms of app technology stack or memory and CPU allocation.
Based on the above, we can expect to see sizable applications built as a combination of “PaaS-like” microservices, which are becoming traditional, and event-driven “nanoservices” powered by OpenWhisk. As often happens in the early days of a technology, designing the architecture of such apps will require knowledge, skill, and talent.
We at Altoros see OpenWhisk as a potentially revolutionary tool that has already expanded the existing cloud ecosystem, adding a new type of environment and a new level of abstraction. This will provide a chance for developers to become even more productive and for the users to enjoy better performing cloud-based and mobile apps.