It’s only been a couple of weeks since the Elyra open source community published version 2.1, which introduced experimental support for Apache Airflow.
Now, version 2.2 delivers more enhancements that our growing user base had on their wish list. In this article, I’ll summarize the highlights:
- New R script editor
- Improved pipeline editor with support for R scripts
- Deployment on Kubeflow Notebook servers
- Extended command-line interface with support for pipeline execution
As always, you can find the complete list of features and bug fixes that made it into the release in the changelog.
Edit and run R scripts
The Elyra script editor now supports the R language, so you can create, edit, and run R scripts in JupyterLab in addition to Python scripts.
By installing the optional Language Server Protocol support for R, you can take advantage of productivity features that you are likely familiar with from other IDEs, such as code linting and auto-completion.
Note: We realize that this editor can’t compete with RStudio, but it’s a start!
If you’ve used Jupyter Notebooks in an enterprise deployment, you are probably familiar with the Jupyter Enterprise Gateway (JEG). In fact, even if you aren’t familiar with it, you might have used it. In a nutshell, it gives you the ability to run notebooks in remote kernels, allowing for better resource allocation and usage.
Note: Watson Studio is one example of a managed enterprise service that uses the JEG.
Because it extends JupyterLab, Elyra can take advantage of the Enterprise Gateway as well. However, one little-known feature of Elyra is that it also supports remote execution without JEG, through its support for Kubeflow Pipelines and Apache Airflow.
If you have access to a Kubeflow Pipelines or Apache Airflow deployment, you can run R scripts (just like Python scripts and Jupyter Notebooks) in those deployments directly from the editor. This is especially useful for scripts that require resources that are not available (or not sufficiently available) in your local environment.
Run R scripts in pipelines
In the Visual Pipeline Editor, you can now assemble pipelines from multiple R scripts, or mix R scripts with Jupyter Notebooks and Python scripts, as necessary.
You can run these pipelines locally in JupyterLab or remotely on Kubeflow Pipelines or Apache Airflow.
If you are new to Elyra pipelines, take a look at the tutorials. They guide you through the process of creating and running a pipeline in various environments.
Use Elyra to run Kubeflow-hosted notebooks
Elyra can be deployed locally or in remote environments.
A local deployment typically serves only a single user and is created by installing Elyra from PyPI, conda, or source code, or by pulling a ready-to-use container image.
Remote deployments, such as in a data center or the cloud, are typically used when support for many users is required.
A common approach is to deploy JupyterHub on Kubernetes and configure it for Elyra, as is done in Open Data Hub on the Red Hat OpenShift Container Platform.
If you already have Kubeflow deployed and don’t want to provision a dedicated instance of JupyterHub to serve notebooks, we’ve got great news for you. We’ve recently started to publish custom Elyra container images on Docker Hub and quay.io that you can use to run JupyterLab with Elyra on Kubeflow Notebook Servers. All you need to do is specify the Elyra container image name and (version) tag when you configure a new notebook server and you are good to go.
Extended command-line interface
As an extension to JupyterLab, Elyra is primarily GUI driven. However, certain tasks can also be completed using the elyra-metadata command-line interface.
The Elyra command-line interface was extended in version 2.2 to support running pipelines in local and remote environments. Initially, this capability is only exposed through the elyra-pipeline command-line interface, but work is under way to provide a unified interface.
Run pipelines locally
Use the run command to run the pipeline locally, passing the pipeline file name as a parameter, like so:
$ elyra-pipeline run /path/to/hello-world.pipeline
Note: This feature is still under active development.
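If you want to trigger a local run from a script rather than a terminal, the command above can be wrapped in a few lines of Python. This is only a minimal sketch: the helper names build_run_command and run_pipeline_locally are hypothetical, and the check for a .pipeline file extension is an assumption based on the example file name, not something the CLI is known to require.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_run_command(pipeline_file: str) -> list[str]:
    """Return the argument list for `elyra-pipeline run <pipeline_file>`."""
    # Assumption: pipeline files carry the .pipeline extension, as in the
    # hello-world example above.
    if Path(pipeline_file).suffix != ".pipeline":
        raise ValueError(f"expected a .pipeline file, got: {pipeline_file}")
    return ["elyra-pipeline", "run", pipeline_file]


def run_pipeline_locally(pipeline_file: str) -> int:
    """Invoke the CLI and return its exit code (hypothetical helper)."""
    if shutil.which("elyra-pipeline") is None:
        raise RuntimeError("elyra-pipeline was not found on PATH")
    # check=False: leave it to the caller to decide how to handle failures.
    completed = subprocess.run(build_run_command(pipeline_file), check=False)
    return completed.returncode
```

Calling run_pipeline_locally("/path/to/hello-world.pipeline") would then behave like the terminal command, with the exit code available for further scripting.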
Run pipelines remotely
Use the submit command to run the pipeline on Kubeflow Pipelines or Apache Airflow, passing the pipeline file name and the runtime configuration name as parameters, like so:
$ elyra-metadata list runtimes
Schema   Instance       Resource
------   --------       --------
kfp      kfp_test_env   /.../runtimes/kfp_test_env.json
$ elyra-pipeline submit --runtime-config kfp_test_env /path/to/hello-world.pipeline
...
If the pipeline was successfully submitted for execution, the command returns a GUI link that you can use to monitor the progress and a link to the cloud storage where the pipeline run artifacts are stored.
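The two steps above (look up a runtime configuration, then submit) can also be scripted. The following sketch parses a runtime listing and assembles the submit command. It assumes the listing is a whitespace-separated table with the Schema, Instance, and Resource columns shown above; the real output format may differ between versions. The function names and the SAMPLE_LISTING constant are hypothetical.

```python
from __future__ import annotations

# Sample text mirroring the `elyra-metadata list runtimes` output shown above.
SAMPLE_LISTING = """\
Schema   Instance       Resource
------   --------       --------
kfp      kfp_test_env   /.../runtimes/kfp_test_env.json
"""


def parse_runtime_listing(text: str) -> list[tuple[str, str]]:
    """Return (schema, instance) pairs from the listing's data rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or short lines, the header row, and the dashed separator.
        if len(fields) < 3 or fields[0] == "Schema" or set(fields[0]) == {"-"}:
            continue
        rows.append((fields[0], fields[1]))
    return rows


def build_submit_command(pipeline_file: str, runtime_config: str) -> list[str]:
    """Return the argument list for `elyra-pipeline submit`."""
    return [
        "elyra-pipeline", "submit",
        "--runtime-config", runtime_config,
        pipeline_file,
    ]


if __name__ == "__main__":
    for schema, instance in parse_runtime_listing(SAMPLE_LISTING):
        print(schema, build_submit_command("/path/to/hello-world.pipeline", instance))
```

The resulting list can be passed to subprocess.run to perform the actual submission. Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.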
If you’ve used previous releases of Elyra, you should notice quite a few usability improvements that we’ve made. There’s no denying that the Elyra project has matured a lot since it was started approximately a year ago.
Coming up next
We’ve just started work on our next releases. There’s plenty of stuff brewing in our lab. If you’d like to get the inside scoop, check out our discussion forum, chat with us, or join the weekly community meeting.