Published December 23, 2018
Jupyter Enterprise Gateway is a lightweight, multi-tenant, scalable and secure gateway. With Jupyter Enterprise Gateway, you can enable Jupyter Notebooks to share resources across an Apache Spark cluster and extend Jupyter Kernel Gateway with enterprise-level capabilities, such as optimized cluster resource utilization and multi-user support.
Jupyter Enterprise Gateway is based on three primary themes:
Optimized Resource Allocation
Multi-user support with user impersonation within Kerberos-enabled clusters
Currently, all notebook-based offerings launch their kernels local to the server providing the service. In large Apache Spark installations, this means many resource-intensive applications run on the same server (YARN client mode), which becomes a bottleneck for teams of data scientists.
Jupyter Enterprise Gateway introduces the ability to launch kernels as managed resources within Spark clusters — that is, YARN cluster mode — which was previously not possible for Jupyter kernels. This enables the number of kernels to increase linearly based on the available cluster resources, as demonstrated in the graph below:
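Independent of the graph, the scaling argument can be made concrete with a back-of-the-envelope sketch in Python. This is not project code and not the graph's data: the function names and memory figures are made-up assumptions, and the model counts only the memory each kernel's Spark driver needs. In client mode every driver sits on the gateway server, so that one machine sets the ceiling; in cluster mode drivers are placed on cluster nodes, so capacity grows with each node added.

```python
def max_kernels_client_mode(gateway_mem_gb, kernel_mem_gb):
    """All kernels (and their Spark drivers) share the gateway server's memory."""
    return gateway_mem_gb // kernel_mem_gb


def max_kernels_cluster_mode(node_mem_gb, kernel_mem_gb):
    """Kernels are spread across the cluster, so each node adds capacity."""
    return sum(mem // kernel_mem_gb for mem in node_mem_gb)


if __name__ == "__main__":
    kernel_mem_gb = 4  # assumed memory per kernel, illustrative only
    print("client mode:", max_kernels_client_mode(64, kernel_mem_gb))
    for nodes in (1, 2, 4, 8):
        cluster = [64] * nodes  # assumed identical 64 GB nodes
        print(nodes, "nodes:", max_kernels_cluster_mode(cluster, kernel_mem_gb))
```

Under these assumptions the client-mode limit stays fixed no matter how large the cluster is, while the cluster-mode limit doubles each time the node count doubles. A real deployment would also be bounded by CPU, executor memory, and YARN queue settings, which this sketch ignores.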
To accomplish these distributed capabilities, we wrap the target kernel’s invocation with what we call “kernel launchers.” This enables us to implement additional capabilities without any modification to the underlying kernel implementations (such as auto-creation of Spark contexts for kernels that don’t provide that functionality). In addition, the way to launch a given kernel is conveyed within the kernelspec file, which we’ve also extended within Jupyter Enterprise Gateway. As a result, we include kernel launchers and kernelspec files for the following kernels (all of which include automatic and delayed Spark context initialization):
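As a rough illustration of how a kernelspec can convey the way a kernel is launched, the following Python sketch models a kernelspec as a dictionary and fills in its argv placeholders to produce the launcher command. It is a sketch under stated assumptions, not the project's actual files: the launcher script name launch_kernel.py, the --kernel-id and --spark-context-init options, and the build_launch_command helper are all hypothetical. Only the argv list and the {connection_file} placeholder follow the standard Jupyter kernelspec layout.

```python
# Hypothetical kernelspec, written as the dict a kernel.json file would hold.
KERNELSPEC = {
    "display_name": "Spark - Python (cluster mode, illustrative)",
    "language": "python",
    "argv": [
        "python", "launch_kernel.py",          # hypothetical launcher script
        "--kernel-id", "{kernel_id}",          # hypothetical option
        "--spark-context-init", "{spark_context_init}",  # hypothetical option
        "{connection_file}",                   # standard Jupyter placeholder
    ],
}


def build_launch_command(kernelspec, **values):
    """Fill the {placeholders} in argv to get the command that starts the launcher.

    The launcher, not the gateway, then starts the real kernel, which is why
    the kernel implementation itself needs no changes.
    """
    return [arg.format(**values) for arg in kernelspec["argv"]]


if __name__ == "__main__":
    command = build_launch_command(
        KERNELSPEC,
        kernel_id="abc123",
        spark_context_init="lazy",  # i.e. delay creating the Spark context
        connection_file="/tmp/kernel-abc123.json",
    )
    print(" ".join(command))
```

Keeping the launch details in the kernelspec means a new kernel type can be supported by adding a launcher and a kernelspec file, without touching the kernel's own code.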
We’ll be looking to update the following topic areas in the future:
Kernel configuration profile
We’re pleased with the progress we’ve made with Jupyter Enterprise Gateway, but we’re not satisfied. There’s a lot more to accomplish, and we need your help. If you’re interested in Jupyter Enterprise Gateway and what it has to offer, we’d love for you to join our community and make the project even better. By sharing your insight and experience, you’ll help solidify Jupyter Enterprise Gateway’s presence within the larger data science ecosystem, and that’s something we can all benefit from and be proud of.
Jupyter Enterprise Gateway has identified a gap in data analytics tooling: how to fully leverage cluster resources within an enterprise while providing data scientists autonomy over their notebooks. Through Jupyter Enterprise Gateway, corporate enterprises and cloud providers alike can maximize the amount of resource-intensive work they accomplish, increasing their productivity and improving user experiences.
This work isn’t easy, and there are a lot of problems to overcome that haven’t previously been encountered. That’s the challenge — how can we best optimize resource utilization given the basic requirements and constraints of the Jupyter ecosystem?
Visit the Jupyter Enterprise Gateway website! You can download the latest build and find out what Jupyter Enterprise Gateway is all about and where we’re headed.