Learning objective
By completing this introductory tutorial, you learn how to deploy a deep learning microservice from the Model Asset Exchange on Red Hat OpenShift.
Upon completion of the tutorial, you know how to use the OpenShift web console or OpenShift Container Platform command-line interface (CLI) to:
- Create a new project
- Deploy a model-serving microservice from a public container image on Docker Hub
- Create a route that exposes the microservice to the public
Prerequisites
To follow this tutorial, you must have:
- An IBM Cloud account or a Red Hat account
- An installation of the Red Hat OpenShift Container Platform CLI (optional, only required if you want to complete the second part of this tutorial)
The tutorial was created using Red Hat OpenShift v3.11 on IBM Cloud. You will notice a few GUI differences if you complete the tutorial using Red Hat OpenShift Online, which at the time of writing was version 3.
Estimated time
It should take you approximately 30 minutes to complete this tutorial.
Tutorial setup
To complete this tutorial you must have access to a small 1-node OpenShift cluster.
Provision a Red Hat OpenShift cluster
If you don’t have access to an existing cluster, provision one on IBM Cloud or on openshift.com.
You can complete this tutorial using the free starter tier on openshift.com. However, note that RAM is limited to 2 GiB, which might not always be sufficient for resource-intensive models.
Throughout this tutorial, the instructions and screenshots refer to a cluster named `os-max-test-cluster`. Whenever necessary, substitute this name with the name of your cluster.
You are ready to start the tutorial.
Deploy a deep learning model-serving microservice on Red Hat OpenShift
In this tutorial, you will deploy a model-serving microservice from the Model Asset Exchange on Red Hat OpenShift using the OpenShift web console or the OpenShift Container Platform CLI. You can complete both parts or only one part.
Deploy using the OpenShift web console
In this first part of the tutorial, you deploy the MAX-Object-Detector model-serving microservice, which can be used to identify objects in pictures, as shown in this demo.
Open the OpenShift web console.
- In a web browser, open the Kubernetes cluster page on IBM Cloud.
- Locate your OpenShift cluster entry `os-max-test-cluster`. From the … drop-down menu, select OpenShift web console.
The catalog browser is displayed.
From the drop-down menu, select the Application console. A list of your projects is displayed. A project is used in OpenShift to organize related resources.
- Create a new project and assign it a name (such as `max-deployments`), a display name, and an optional description.
- Open the project.
Deploy the microservice Docker image
You can deploy Docker images that are hosted in public or private registries. The MAX-Object-Detector Docker image `codait/max-object-detector` is hosted on Docker Hub.
On the project overview page, choose Deploy Image.
If your Overview page looks different (for example, because you are using an existing project that already contains an application), use the Add to Project drop-down menu in the menu bar and select Deploy Image.
Select the model-serving Docker image that you want to deploy.
- Choose the Image Name radio button to select a public or private repository as the source.
- Enter `codait/max-object-detector` as Image name or pull spec (or enter the name of another model-serving Docker image from our repository on Docker Hub).
- Click the magnifying glass next to the image name (or press Enter) to load the Docker image’s metadata.
Review the deployment configuration for the selected Docker image.
The Name field is pre-populated using the name of the Docker image. OpenShift uses this name to identify the resources being created when the application is deployed.
If you append, modify, or delete a few characters, you’ll notice how the change impacts the generated image stream name, the deployment configuration name, the service name, and the host name. (However, leave the name as is for this tutorial!)
You can customize the behavior of most model-serving microservices by setting environment variables. For example, you can enable Cross-Origin Resource Sharing (CORS) support by setting the `CORS_ENABLE` variable in the deployment configuration to `true`. Refer to the model documentation for model-specific environment settings.

Deploy the Docker image.
Close the window and open the Overview tab if it is not selected by default. The deployment configuration for the Object Detector model-serving microservice is displayed.
By default, only one instance of the microservice is started. You can manually scale out the deployment (“run more instances of the microservice to handle more requests concurrently”) by increasing the target number of running pods. You can also enable autoscaling (“run up to X instances if needed”) by clicking the deployment configuration name, and selecting Configuration and Add Autoscaler.
After the microservice is deployed, it is only visible internally (at port 5000) to the cluster. To expose it to the public, you must create a route.
Create a route
When you create a route in OpenShift, you have the option to expose an unsecured or a secured route. Because the model-serving microservice communicates over HTTP, you can configure the router to expose an unsecured HTTP connection (which is what you’ll do in this tutorial) or expose a secured HTTP connection, which the OpenShift router automatically terminates.
- Select Create Route under the NETWORKING section.
Review the configuration settings. The defaults expose an unsecured HTTP connection at a generated URL if you leave the hostname empty.
To configure a secured HTTP route, you’d select Secure route, set TLS Termination to Edge, and set Insecure Traffic to None. You can specify your own TLS certificate or use the router’s default certificate.
Create the route. Upon completion of the operation, the public URL for the deployed model-serving microservice is displayed.
If an error is displayed indicating that the router rejected the request, shorten the route name. The generated host name (derived by concatenating the route name, the project name, the cluster name, and the router’s domain) likely exceeds the 63-character maximum.
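If you want to check in advance whether a route name will produce a valid host name, a few lines of Python suffice. This is a sketch: the `<route>-<project>` pattern for the first DNS label is an assumption based on OpenShift’s default generated host names.

```python
# Sketch: check whether the first DNS label of a generated route host name
# fits the 63-character DNS label limit. The "<route>-<project>" pattern is
# an assumption based on OpenShift's default generated host names.
def route_label_ok(route_name: str, project_name: str) -> bool:
    label = f"{route_name}-{project_name}"  # first label of the host name
    return len(label) <= 63  # RFC 1035 limit for a single DNS label

print(route_label_ok("max-object-detector", "max-deployments"))  # True
```

If the function returns `False`, pick a shorter route name before creating the route.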
Open the displayed URL to verify that the microservice’s OpenAPI specification endpoint can be accessed.
If you have been following this tutorial using the Object Detector, open the embedded sample application by appending `/app` to the URL displayed in your browser, and test the deployed service by submitting an image.
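The same test can be scripted. Below is a minimal sketch that builds (but, to keep it self-contained, does not send) the multipart POST that the `/model/predict` endpoint expects; the route URL is a placeholder, and the `image` form-field name follows the convention used by MAX model-serving microservices.

```python
import urllib.request
import uuid

def build_predict_request(base_url: str, image_bytes: bytes,
                          filename: str = "photo.jpg") -> urllib.request.Request:
    """Build a multipart/form-data POST for the /model/predict endpoint.

    The form field is named "image", following the MAX model-serving API.
    """
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: image/jpeg\r\n\r\n"
    ).encode() + image_bytes + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{base_url}/model/predict",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

# Substitute your route's URL, then send with urllib.request.urlopen(req)
# and json.loads() the response body.
req = build_predict_request("http://<your-route-url>", b"...image bytes...")
print(req.get_method(), req.full_url)
```

Sending the prepared request with `urllib.request.urlopen` returns the model’s JSON prediction response.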
Skip to the tutorial summary or review the instructions in the next section to learn how to deploy a model-serving microservice using the command-line interface.
Deploy using the CLI
You can deploy and manage applications using the OpenShift Container Platform CLI (`oc`). Before you can use the CLI to deploy the microservice in your cluster, you must log in.
Open the OpenShift web console and copy the login command.
- In a web browser, open the Kubernetes cluster page on IBM Cloud.
- Locate your OpenShift cluster entry `os-max-test-cluster`. From the … drop-down menu, select OpenShift web console.
The catalog browser is displayed.
From the avatar drop down, select Copy Login Command.
Do not share the login command. It contains a token that grants access to the resources in your account.
Log in to your cluster using the OpenShift CLI.
- Open a terminal window.
Paste the copied login command.
oc login https://... --token=...
Tip: To learn more about a command, run `oc <command> --help`, for example, `oc login --help`.

Create a new project using a name such as `max-deployments-cli`. OpenShift uses a project to organize related resources.

oc new-project max-deployments-cli
You can list existing projects using `oc projects` or switch between projects using `oc project <project-name>`.
Deploy the Docker image
You can deploy Docker images that are hosted in public or private registries. In this part of the tutorial, you deploy the MAX-Image-Caption-Generator model-serving microservice, which can be used to describe the content of a picture in a sentence, as shown in this demo.
Verify that you can access the `codait/max-image-caption-generator` Docker image on Docker Hub.

oc new-app --search codait/max-image-caption-generator
The output should look as follows:
Docker images (oc new-app --docker-image=<docker-image> [--code=<source>])
-----
codait/max-image-caption-generator
  Registry: Docker Hub
  Tags:     latest
Deploy the Docker image.
oc new-app codait/max-image-caption-generator
Review the output and note that the image name `max-image-caption-generator` is used by default to name the generated resources, such as the image stream, the deployment configuration, the service, and the host.

--> Found Docker image ... (... days old) from Docker Hub for "codait/max-image-caption-generator"

    * An image stream tag will be created as "max-image-caption-generator:latest" that will track this image
    * This image will be deployed in deployment config "max-image-caption-generator"
    * Port 5000/tcp will be load balanced by service "max-image-caption-generator"
    * Other containers can access this service through the hostname "max-image-caption-generator"
    * WARNING: Image "codait/max-image-caption-generator" runs as the 'root' user which may not be permitted by your cluster administrator

--> Creating resources ...
    imagestream.image.openshift.io "max-image-caption-generator" created
    deploymentconfig.apps.openshift.io "max-image-caption-generator" created
    service "max-image-caption-generator" created
--> Success
    ...
    Run 'oc status' to view your app.

You can change the default by supplying the `--name <my-custom-name>` parameter when you deploy the image.

You can customize the behavior of most model-serving microservices by setting environment variables. For example, you can enable Cross-Origin Resource Sharing (CORS) support by setting the `CORS_ENABLE` variable in the deployment configuration to `true`:

oc set env deploymentconfig max-image-caption-generator CORS_ENABLE=true
Refer to the model documentation for model-specific environment settings.
Query the deployment status.
oc status
Note that after deployment one microservice instance is running and it is only visible internally (at port 5000) to the cluster:

In project max-deployments-cli on server https://...

svc/max-image-caption-generator - ...:5000
  dc/max-image-caption-generator deploys istag/max-image-caption-generator:latest
    deployment #1 deployed 15 minutes ago - 1 pod

You can manually scale out the deployment by increasing the target number of running pods (`oc scale --replicas=2 deploymentconfig max-image-caption-generator`) or configure autoscaling.

To expose the service to the public, you must create a route.
Create a route
When you create a route in OpenShift, you have the option to expose an unsecured or a secured route. Because the model-serving microservice communicates over HTTP, you can configure the router to expose an unsecured HTTP connection (which is what you’ll do in this tutorial) or expose a secured HTTP connection, which the OpenShift router automatically terminates.
Create a route for the service.
oc expose service max-image-caption-generator
Retrieve the microservice’s public URL.
oc get route max-image-caption-generator
Under the `HOST/PORT` column, the generated host name is displayed.

NAME                          HOST/PORT                                                           ...
max-image-caption-generator   max-image-caption-generator-max-deployments-cli...appdomain.cloud   ...
If the displayed `HOST/PORT` reads `InvalidHost`, the generated host name is invalid. This commonly happens when its length exceeds the 63-character maximum. To resolve the issue, shorten the route name. For example, to shorten the route name for the `max-image-caption-generator` service, run `oc expose service max-image-caption-generator --name <shorter-route-name>` followed by `oc get route <shorter-route-name>` to retrieve the public URL.

Open the displayed URL in your web browser to verify that the microservice is accessible.

Validate that the `/model/predict` endpoint returns the expected result for a picture of your choice.

Note: For some compute-intensive models, calls to the `/model/predict` endpoint might result in HTTP error 504 (Gateway timeout) if you use the router’s default configuration. To resolve the issue, increase the route’s timeout value by running `oc annotate route <route-name> --overwrite haproxy.router.openshift.io/timeout=<number-of-seconds>s`.
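If you prefer to validate the endpoint programmatically rather than in the browser, the JSON response can be checked with a few lines of Python. The response below is illustrative only: MAX model-serving microservices generally return a `status` field and a `predictions` list, but the field names inside each prediction vary by model, so consult the model’s OpenAPI specification.

```python
import json

# Illustrative response shape -- actual captions and probabilities will differ.
sample_response = """
{
  "status": "ok",
  "predictions": [
    {"index": "0", "caption": "a man riding a wave on top of a surfboard", "probability": 0.038}
  ]
}
"""

def best_prediction(response_text: str) -> dict:
    """Return the highest-probability prediction, or raise if the call failed."""
    payload = json.loads(response_text)
    if payload.get("status") != "ok":
        raise RuntimeError(f"prediction failed: {payload}")
    return max(payload["predictions"], key=lambda p: p["probability"])

best = best_prediction(sample_response)
print(best["caption"])  # -> a man riding a wave on top of a surfboard
```

In practice you would pass the body returned by the `/model/predict` call instead of `sample_response`.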
Summary
In this tutorial, you learned how to use the OpenShift web console or OpenShift Container Platform CLI to:
- Create a new project
- Deploy a model-serving microservice from a public container image on Docker Hub
- Create a route that exposes the microservice to the public
To learn more about the Model Asset Exchange or how to consume the deployed model-serving microservice in Node-RED or JavaScript, take a look at Learning Path: An introduction to the Model Asset Exchange and these pens.
You might find the following resources useful if you’d like to learn more about Red Hat OpenShift: