Java batch is a new standard that makes it easier to process large amounts of data and manage how that data is processed. Here, we introduce Java batch and how it is used in practice. We walk through a simple batch scenario and talk about some of the unique operational batch features in Liberty that you can use to build a complete batch environment.
Don’t call it a comeback, batch has been here for years!
Batch processing has existed since the early days of computing and remains a critical part of businesses today. Batch is used for payroll, payment processing, claims processing, inventory management, credit card processing, credit risk analysis, and report generation, etc; typically record-oriented processing with an expectation for results to finish within some determined, ‘non-immediate’ time frame.
Traditionally, batch has been associated with mainframe systems, but this is not necessarily the case today. Batch workloads are now varied across multiple technologies and platforms.
Why Java batch?
As a response to the disparate batch technologies that emerged over the years, the Java community consolidated the experience of batch experts and developed an industry-wide standard known as JSR-352. This Java specification establishes a standard batch programming model for the Java EE 7 specification which allows batch developers to write applications that are portable across multiple platforms.
The JSR-352 programming model also allows application developers to focus on their core business logic instead of building infrastructure to support qualities of service like job resiliency or parallel processing. All applications written to the JSR-352 programming model are portable across all implementations of the JSR-352 standard.
Along with a programming model, JSR-352 also defines a Job Specification Language (JSL), as a standard to order the execution of batch artifacts within a job. JSL can be implemented as a simple XML file and provides a way to clearly separate the business logic into granular units of reusable work from the wiring that combines all the artifacts together. Each JSL definition is used to perform some useful business process known as a job. It also allows configuration or operational changes to be made to the job without any changes to the business logic.
Java batch in practice
WebSphere Application Server Liberty profile includes a fully compliant JSR-352 batch container. In order to bring JSR-352 into the world of enterprise batch some key operational capabilities such as security, resiliency, and high availability are required. Liberty batch provides a complete enterprise batch solution that includes a host of key operational features around the JSR-352 standard to enhance the operational integration of your batch environment.
Along with all of the specification defined capabilities in JSR-352, Liberty batch makes the complete set of Java EE 7 technologies available to your batch application, and also provides additional qualities of service along with operational and management tools without any need for further integration, all in one lightweight environment.
Benefits of Java batch on Liberty
With Liberty batch you can integrate your online and batch executions into a common runtime to reuse the hardware, management, security, monitoring, tools, and support staff that are already in place. Liberty batch is also designed to easily integrate into your existing business process work flows which we’ll discuss in more detail later in this article. Also, runtime services and Java assets can easily be shared between online and batch applications resulting in a true service oriented architecture.
Another key concern, especially among z/OS batch users is the shrinking pool of skilled COBOL developers that can maintain legacy batch workloads. On the other hand, Java skills are plentiful in most organizations, all while Java runtimes become more efficient and Java tooling support continues to grow and make Java developers even more productive.
Migrating existing, or creating new, batch workloads using Java batch also has a direct cost benefit on z/OS since Java shifts general processor workload to specialty engines. This means lower overall costs and increased system productivity as general purpose processors are available for other mainframe workloads. Now let’s take look at an example that shows a typical batch scenario.
The JSL below shows a simple job that has three steps. If you are interested in further understanding JSL and individual JSL elements see the JSR-352 specification, but you won’t need that knowledge to follow this use case.
<job id="MonthlyStatementJob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0"> <step id="setupStep" next="ProcessingStep" allow-start-if-complete="true"> <batchlet ref="com.ibm.example.batch.ResourceSetupBatchlet" /> </step> <step id="ProcessingStep" next="PrintingStep"> <chunk item-count="15"> <reader ref="com.ibm.example.batch.CustomerDataReader" /> <processor ref="com.ibm.example.batch.CustomerDataProcessor"/> <writer ref="com.ibm.example.batch.CustomerDataWriter" /> </chunk> </step> <step id="PrintingStep" > <batchlet ref="com.ibm.example.batch.PrintingBatchlet" /> </step> </job>
In Example 1 we have a job that processes customers’ monthly data and then sends a printed statement to each customer. The first step is a simple setup step that prepares any resources that are required for the job.
In the second step,
ProcessingStep, we use the built-in checkpointing capability of a chunk so that if the step fails in the middle of processing we can restart the step where we left off. In the above example
item-count is a directive to checkpoint every 15 records. This way, if we are processing thousands or millions of records we will only reprocess 15 records at most if there is a failure.
As our customer base grows in our monthly statement scenario (above), we may find that using only one printer is taking too long. We need to have a way to print multiple statements at the same time, so we decide to add another printer to cut the printing time in half.
<job id="MonthlyStatementJob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0"> <step id="setupStep" next="ProcessingStep" allow-start-if-complete="true"> <batchlet ref="com.ibm.example.batch.ResourceSetupBatchlet" /> </step> <step id="ProcessingStep" next="PrintingStep"> <chunk > <reader ref="com.ibm.example.batch.CustomerDataReader" /> <processor ref="com.ibm.example.batch.CustomerDataProcessor"/> <writer ref="com.ibm.example.batch.CustomerDataWriter" /> </chunk> </step> <step id="PrintingStep" > <batchlet ref="com.ibm.example.batch.PrintingBatchlet" /> <partition> <plan partitions="2"> <properties partition="0"> <property name="printer" value="printer0" /> </properties> <properties partition="1"> <property name="printer" value="printer1" /> </properties> </plan> </partition> </step> </job>
In Example 2, you can see that we added a partition to the printing step. The plan tells the batch container to split the printing step into two partitions and to send all work in partition 0 to printer 0, and all work in partition 1 to printer 1. Without making any changes to the code we notify the batch container that we want to print our statements in parallel.
The batch container manages the parallel threads in each of the partitions. This means that a batch operations team can easily make changes to the batch work flow without waiting for the batch developer to make any changes, further reducing operational overhead. All of the above features are part of the JSR-352 standard and are common to all JSR-352 implementations.
Liberty batch features
As we previously discussed, Liberty batch builds on the JSR-352 standard to provide a complete batch solution. Here are some of the key features that are unique to Liberty batch that make it an enterprise class batch runtime by providing an ecosystem around security, integration, scaling, auditing, and tooling:
You can secure the Liberty batch environment using role-based security. Users can be a part of one or more possible batch roles:
batchMonitor. By configuring a user registry and authorization roles, an administrator can restrict access to batch operations and job instance data through the batch REST API or JobOperator interface. Security is included as part of batch right out of the box and it is easy to configure through using Liberty’s basic registry. As your batch users grow you can easily switch over to a LDAP user registry or SAF registry on z/OS.
Batch REST API
The batch REST API provides a complete set of operations to securely manage your batch environment. The REST interface gives you the flexibility to use your favorite REST client or scripting language to remotely manage your jobs. It provides start/submit, stop, restart functions along with the ability to view job execution status and other detailed execution data as well as job logs.
Use job logging to audit your batch jobs and debug potential problems in your batch applications. As your applications progress from development to production you can fine-tune the level of logging to suit your auditing needs.
Adding the batch management feature enables multi-server batch functionality. This provides a way to build a robust, highly available, and highly scalable batch topology. The Liberty batch multi-server support can be used with standalone Liberty servers or in conjunction with Liberty collectives for more centralized administration.
Batch manager command line interface
A command line utility (‘batchManager’) is a convenient mechanism for using the remote management API. The batch manager is a natural integration point with your existing business process work flows and external schedulers since it can be used to wait for job completion and to propagate return codes back to your existing workload automation.
WDT batch tools
WebSphere Developer Tools includes batch tools to help you get started with developing a batch application. They enable you to create batch applications using the JSR-352 programming model interfaces and compose jobs using JSL job definitions. You can then use the tools to easily deploy and test your application on a local or remote Liberty server. For more information on WDT and Liberty batch we’ll have a more detailed post on “How to write a Java Batch application using the Developer Tools” coming soon.
The Liberty batch container provides a standards-based approach to developing batch applications that can be securely managed and scaled into a highly available topology. Whether you are generating internal reports or processing loan applications, Liberty Java batch can be a powerful tool.