This blog describes how to avoid the HTTP 404 errors that can occur at CICS region startup when Liberty accepts HTTP requests before an application is ready. The solution starts the Liberty server with its HTTP endpoints disabled, and enables them only once the web application is ready, using a policy system rule.
Distributing HTTP requests across Liberty JVM servers
In a high availability CICS configuration, incoming HTTP requests are typically processed by a cluster of cloned CICS regions. This enables the balancing of requests across the cluster, avoids single points of failure, and provides for continuous availability during outages. This configuration usually involves the following TCP/IP components:
- A distributed DVIPA providing a single externally advertised virtual IP address (VIPA) for the cluster
- Sysplex distributor to route new TCP/IP connections across the different TCP/IP stacks running on the LPARs in the sysplex
- TCP/IP port sharing to enable CICS regions on the same LPAR and TCP/IP stack to listen on a shared TCP/IP port
When the target application is a Java web application running in a Liberty JVM server, the Liberty server controls the HTTP endpoint and provides access to the application deployed in each of the CICS regions, as shown in Figure 1.
Starting CICS regions
When you want to add a new CICS region to the cluster you will need to consider how this will impact access to the Java web application. When a CICS region starts, the Liberty servers and web applications will be started as their associated JVMSERVER and BUNDLE resource definitions are installed from the relevant CSD groups. In a busy system, this may mean the Liberty server is started and begins accepting HTTP requests before the web applications are ready.
For example, the log messages below from a sample startup scenario show that the TCP/IP ports are bound to the server at 14:55:47 for HTTP and 14:55:54 for HTTPS, but the web application only became available at 14:55:55.
[5/27/19 14:55:47:452 GMT] 00000056 com.ibm.ws.tcpchannel.internal.TCPChannel I CWWKO0219I: TCP Channel defaultHttpEndpoint has been started and is now listening for requests on host * (IPv6) port 8080.
[5/27/19 14:55:54:647 GMT] 00000052 com.ibm.ws.tcpchannel.internal.TCPChannel I CWWKO0219I: TCP Channel defaultHttpEndpoint-ssl has been started and is now listening for requests on host * (IPv6) port 8443.
[5/27/19 14:55:54:773 GMT] 0000004d com.ibm.ws.app.manager.AppMessageHelper I CWWKZ0018I: Starting application com.ibm.cicsdev.restapp.
[5/27/19 14:55:55:892 GMT] 0000004d com.ibm.ws.http.internal.VirtualHostImpl A CWWKT0016I: Web application available (default_host): http://zt00.pssc.mop.fr.ibm.com:8080/com.ibm.cicsdev.restapp/
The time gap is about 8 seconds for HTTP, and about 1 second for HTTPS. During this gap Liberty responds with an HTTP 404 Not Found status code because the web application context root has not been registered and so cannot be found. The context root of a web application is the first part of the path in the URL, and tells the server which web application is to process the HTTP request.
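The size of this window can be measured directly from the log. The following is a minimal sketch, assuming the timestamp layout shown in the sample messages above (`M/d/yy H:mm:ss:SSS` followed by a time zone name). The class and method names are hypothetical and are not part of Liberty or the sample.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper: measures the gap between two Liberty log lines. */
final class StartupGap {

    // Timestamp layout assumed from the sample log, e.g. "5/27/19 14:55:47:452"
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("M/d/yy H:mm:ss:SSS");

    // Captures the date and time inside the leading "[... GMT]" bracket
    private static final Pattern LINE = Pattern.compile("^\\[(\\S+ \\S+) \\w+\\].*");

    /** Extracts the timestamp from the start of a log line. */
    static LocalDateTime timestampOf(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.matches()) {
            throw new IllegalArgumentException("No timestamp found in: " + logLine);
        }
        return LocalDateTime.parse(m.group(1), TS);
    }

    /** Milliseconds between the "listening" line and the "available" line. */
    static long gapMillis(String listeningLine, String availableLine) {
        return Duration.between(timestampOf(listeningLine),
                                timestampOf(availableLine)).toMillis();
    }
}
```

Applied to the sample messages, this gives 8440 ms between the HTTP port starting to listen and the application becoming available, and 1245 ms for the HTTPS port.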
This situation needs special attention in a CICS cluster environment, because new HTTP connections are likely to be routed to the newly started Liberty server, which has the fewest established connections. These requests return a Context Root Not Found error and an HTTP 404 Not Found status code until the web applications become available.
In this article we describe a solution to synchronize web application availability with the enablement of Liberty server HTTP endpoints. The cics-java-liberty-app-deployment sample provides the source code and configuration for the proposed solution with CICS TS V5.5.
Before going through the suggested automated solution, let’s explain the two main capabilities it uses:
- The ability to resume and pause Liberty server endpoints
- The creation of an event when a CICS bundle becomes enabled
Liberty ServerEndpointControlMBean
Liberty provides a set of Managed Beans (MBeans), which can be used to manage and monitor the Liberty server configuration and resources. See List of provided MBeans in the WebSphere Application Server for z/OS Liberty documentation in IBM Knowledge Center.
An enhancement was made in Liberty fix pack 184.108.40.206 to add the ServerEndpointControlMBean MBean that allows you to list, pause and resume Liberty endpoints. The endpoints can be HTTP endpoints or message-driven bean (MDB) message endpoints. When an endpoint is paused it stops listening for new requests.
There are two principal ways to access an MBean: either by running an application in the same Liberty JVM server which then uses the MBean class, or by invoking the MBean REST API. Alternatively, on z/OS a Liberty server can be paused or resumed from the command line (see Pausing and resuming a Liberty server from the command line) with the wlpenv server pause or wlpenv server resume commands.
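As an illustration of the first approach, the following minimal sketch resumes all paused endpoints from code running in the same JVM as the Liberty server, using the standard JMX API. It assumes the MBean is registered under the object name `WebSphere:feature=kernel,name=ServerEndpointControl` and exposes a no-argument `resume` operation; check both against the MBean documentation for your Liberty level. The `EndpointResumer` class name is hypothetical and this is not the code used in the sample.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Sketch: resume paused Liberty endpoints from inside the Liberty JVM. */
final class EndpointResumer {

    // Assumed object name of the ServerEndpointControl MBean; verify for your Liberty level.
    static final String MBEAN_NAME =
            "WebSphere:feature=kernel,name=ServerEndpointControl";

    /**
     * Invokes the no-argument "resume" operation on the MBean.
     * Returns false if the MBean is not registered in the given MBean server.
     */
    static boolean resumeAll(MBeanServer mbs) throws Exception {
        ObjectName name = new ObjectName(MBEAN_NAME);
        if (!mbs.isRegistered(name)) {
            return false;
        }
        // Empty parameter and signature arrays select the no-argument operation.
        mbs.invoke(name, "resume", new Object[0], new String[0]);
        return true;
    }

    public static void main(String[] args) throws Exception {
        boolean resumed = resumeAll(ManagementFactory.getPlatformMBeanServer());
        System.out.println(resumed
                ? "Resume requested for all endpoints"
                : "ServerEndpointControl MBean not found");
    }
}
```

Invoking the operation by name through `MBeanServer.invoke` avoids a compile-time dependency on the Liberty API classes, at the cost of losing type checking on the operation name.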
CICS policy system rule
A CICS policy rule allows you to perform a defined action when all the conditions specified in the policy rule are met (see CICS policies). There are two types of policy rules: system rules and user task rules. System rules monitor the state of CICS system resources and automatically emit a message or an event when the monitored changes occur. User task rules monitor the resource utilization of individual user tasks and automatically respond when a task resource usage exceeds a pre-defined threshold.
An enhancement was made in CICS TS V5.5 so that the CICS bundle status now reflects the Liberty application status. This means that a CICS bundle that contains Java EE applications will only reach the ENABLED status when all the Java EE applications in the bundle are successfully installed in their Liberty JVM server. Once the bundle is enabled, the Java EE application context root is guaranteed to be available.
Consequently, a policy system rule that monitors the state of a CICS bundle can trigger an event when the Liberty application is completely installed and ready. Moreover, this event can be associated with your own actions as described in the Defining your own actions for policy system rules blog.
The solution suggested in this blog is based on preventing the Liberty JVM server from receiving HTTP requests until its application context root is available. To do so, the Liberty server needs to be configured with the httpEndpoint configuration elements set to enabled="false". This defines the HTTP endpoints but does not bind the Liberty server to the TCP/IP ports. The HTTP endpoints that need to be disabled are the ones receiving HTTP requests for the Java EE application; in a CICS cluster this is usually the HTTP endpoint listening on the DVIPA and shared port.
Once an application context root is ready, the Liberty server HTTP endpoint listeners can be resumed. How you monitor the availability of the context root and resume the HTTP endpoints depends on the CICS TS version. Figure 2 illustrates this step-by-step.
- The new CICS region has been started and its resources (for example, Liberty JVM servers) are enabled. The Liberty JVM servers are not configured to bind to their TCP/IP ports when launched.
- In the starting Liberty JVM server, a web application is in the process of being installed.
- Once the web application is installed, the server will write a CWWKT0016I message in the log that can be redirected to the MVS console. Additionally in CICS TS V5.5, the related CICS bundle enable status is switched to ENABLED.
- Monitoring one of these two events with an automation tool allows the Liberty HTTP endpoint to be resumed.
- When the Liberty HTTP endpoint is resumed, the associated TCP/IP ports are bound to the server.
- The Liberty JVM server is now ready to process HTTP requests targeting the web application.
In the next sections we describe how this process can be implemented in CICS TS V5.5, and then for earlier releases.
CICS TS V5.5
Because the CICS bundle enable status is more accurate in this version of CICS TS, the idea is to use a CICS policy system rule to monitor it. When the status changes to ENABLED, the application context root defined by the application in the CICS bundle is available. The action associated with this policy rule is to emit an event that is processed by an event processing adapter (EP adapter), which allows us to define our own action.
In this case the action is a transaction start with data. This transaction runs a COBOL program that parses the input and then links to a Java program running in the Liberty JVM server. If the link to the Java program fails, the COBOL program retries the link request a configurable number of times to allow for any delays. Once invoked, the Java program resumes the Liberty HTTP endpoints by using the ServerEndpointControl MBean. This automated process has one caveat: the CICS bundle that defines the policy system rule needs to be installed and enabled before the target CICS bundle containing the Liberty Java applications, so that the policy rule is activated before the target bundle is installed.
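The retry behaviour described above is a bounded retry loop with a delay between attempts. In the sample this logic lives in the COBOL program; the sketch below only illustrates the same pattern in Java. The `LinkRetry` class, the `Attempt` interface and the parameter names are hypothetical.

```java
/** Illustrative bounded retry: try an action a fixed number of times with a pause in between. */
final class LinkRetry {

    /** An action that may fail, for example a link to a program that is not yet available. */
    interface Attempt {
        void run() throws Exception;
    }

    /**
     * Runs the attempt until it succeeds or maxAttempts is reached.
     * Returns the number of attempts used; rethrows the last failure if all attempts fail.
     */
    static int retry(Attempt attempt, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt.run();
                return i;
            } catch (Exception e) {
                last = e;
                if (i < maxAttempts) {
                    Thread.sleep(delayMillis); // wait before the next attempt
                }
            }
        }
        throw last;
    }
}
```

Keeping the attempt count and delay as parameters mirrors the configurable retry described above, and bounding the loop ensures a permanently unavailable target ends in an error rather than an endless wait.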
Note: When monitoring a bundle that contains Java bundle parts, defining the policy in a Resource Definition Online (RDO) group before the group that contains the target bundle in the group list ensures that the policy is installed and enabled before the target bundle. However, if monitoring a bundle that does not contain Java parts, it is possible that the target bundle could become enabled before the policy is activated regardless of the order of the RDO groups in the group list. In this case the policy will not catch any bundle state changes that happen earlier in the install processing.
For more details on how this automated process can be implemented, have a look at the sample on GitHub.
CICS TS earlier releases
For previous CICS TS versions, instead of monitoring the CICS bundle enable status, you can monitor the Liberty server log messages. These log messages can be redirected to the MVS console for easier access from system automation tools, with the following configuration:
<zosLogging hardCopyMessage="CWWKT0016I,CWWKF0011I" disableHardcopyMessages="false"/>
Once an application is fully installed, a message is written to the Liberty log, for example:
CWWKT0016I: Web application available (default_host): http://zt00.pssc.mop.fr.ibm.com:8080/com.ibm.cicsdev.restapp/
Then when all applications are started and the server has started successfully the following message is written to the log.
CWWKF0011I: The server is ready to run a smarter planet
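An automation rule essentially has to recognize these two messages. The sketch below shows one way to do that matching in Java, assuming the message text has the form shown in the examples above. The `ReadinessMessages` class and its method names are hypothetical; a real automation product would use its own message-trapping rules.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative matcher for the two Liberty readiness messages. */
final class ReadinessMessages {

    // Captures the virtual host name and the application URL from CWWKT0016I
    private static final Pattern APP_AVAILABLE = Pattern.compile(
            "CWWKT0016I: Web application available \\(([^)]+)\\): (\\S+)");

    /** Returns the context root path if the line reports an available web application. */
    static Optional<String> contextRootOf(String line) {
        Matcher m = APP_AVAILABLE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.ofNullable(URI.create(m.group(2)).getPath());
    }

    /** True if the line carries the server-started message ID. */
    static boolean isServerReady(String line) {
        return line.contains("CWWKF0011I");
    }
}
```

Matching on the context root rather than on the message ID alone lets the automation wait for a specific application when several are deployed in the same server.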
Note: The smarter planet message CWWKF0011I is only issued after all applications have started if the applications start within the default Liberty application start timeout of 30s. To extend this timeout you can set the startTimeout attribute on the applicationManager element to a larger value, such as 10 minutes, as follows:
<applicationManager startTimeout="10m"/>
The Liberty server log messages can be monitored using an automation tool such as Tivoli System Automation or NetView. When the tool detects that the application availability or server startup message has been written to the MVS console, it can run a batch job to resume the HTTP endpoint listener (refer to the Liberty ServerEndpointControlMBean section). For example, the batch job can call the BPXBATCH program, which allows you to run a shell script or shell command:
//STEP1 EXEC PGM=BPXBATCH,PARM='SH <WLP_USER_DIR>/wlpenv server resume'
where <WLP_USER_DIR> is the value for WLP_USER_DIR defined in the JVM profile. This will run the script wlpenv and resume the Liberty server endpoints.
This article shows you how to start Liberty JVM servers and ensure web applications are ready before enabling HTTP endpoints that allow requests to be received. The provided sample illustrates how this can be achieved for applications in a single CICS bundle, and it could be enhanced to support additional CICS bundles if required. When Liberty JVM servers are stopped, the quiesce stage ensures no new HTTP requests are received and existing requests can complete.
Together these enable CICS regions to be started and stopped, and capacity added or removed without impacting the availability of the installed web applications. When used in combination with sysplex distributor and port sharing, this can form the basis of a highly available and scalable solution.
12/Dec/2019 – Updated with details of how to extend Liberty application start timeout