Java developers have recognized the need for batch applications a really long time ago, but have had to get by with non-standard approaches – until now. JBatch (JSR-352) introduces an exciting Java specification for building, deploying, and running batch applications, thus standardising the development of such applications. Currently JSR-352 is part of the Java EE 7 specification and has full access to all other features of the platform, including transaction management, persistence, messaging, and more, offering the developer very robust data processing opportunities. In this article, we will create from scratch, a small web application which demonstrates the use of JSR-352 on Apache TomEE server.
JBatch – JSR-352
JSR-352 is defined for both Java EE 7 and Java SE 6 platforms. While nearly all of JSR-352 is common between Java EE and Java SE environments; this article will focus exclusively on JSR-352 as part of Java EE 7.
JSR-352 addresses three critical concerns*: a batch programming model, a job specification language, and a batch runtime. This constitutes a separation of concerns.
- Application developers have clear, reusable interfaces for constructing batch style applications;
- Job writers have a powerful expression language for how to execute the steps of a batch execution;
- Solution integrators have a runtime API for initiating and controlling batch execution.
JSR-352 defines numerous batch programming APIs that help to implement batch business logic. Commonly used among these are ItemReader
, ItemProcessor
, and ItemWriter
, which define the basic runtime contracts for reading, processing, and writing data items for a batch step. Other APIs are available for building task-oriented steps, interposing on lifecycle events, and controlling partitioned steps.
Batch steps are organized into jobs, which define the run sequence. JobOperator
runtime interface is used to run and interact with jobs. Jobs may have flows; steps that can be executed in multiple threads. Job executions are stored in a repository, enabling query of current and historical job status. Batch jobs are described in xml files that should be placed in /META-INF/batch-jobs folder. Each xml has as root element. Each job may contain multiple steps. Consult the JSR-352 Home Page for additional details.
(Credits: https://jaxenter.com/java-ee-7-introduction-to-batch-jsr-352-106192.html)
TomEE and JBatch
The Apache TomEE server is currently certified and fully compatible with Java EE 6 specification. Work is in progress to make Apache TomEE fulfill the Java EE 7 specification. You can find TomEE Batch Support in the latest milestone Apache TomEE 7.0.0-M3 Plus profile.
Currently the reference implementation of JSR-352 is provided by IBM. Apache BatchEE is based on this implementation and the one shipped with TomEE. It includes a set of enhancements that allow BatchEE to provide custom implementations to readers and writers (JdbcReader
, FlatFileItemReader
, StaxItemWriter
, to name a few). It also allows BatchEE to provide integrations with other platforms like Hazelcast, Camel or Groovy. Finally, it also ships with it’s own web console to monitor and manage Batch Jobs.
TomEE and JBatch Sample
Let’s have a look at a full scale example from the Java EE 7 Tutorial written by Arun Gupta and Roberto Cortez, and adopted for the article.
Imagine that it is required to process a csv file with Person data, import it in a webapp and then list it. The persons will be the characters from the adorable Big Bang Theory TV series.
The webapp should consist of two pages. On the first one there is a button to trigger import and on the second we visualize them. We will use JSF for visualization.
So let’s start!
Our domain object Person
would be quite simple:
public class Person {
private String name;
private String hiredate;
public Person(String name, String hiredate) {
this.name = name;
this.hiredate = hiredate;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getHiredate() {
return hiredate;
}
public void setHiredate(String hiredate) {
this.hiredate = hiredate;
}
@Override
public String toString() {
return name + hiredate;
}
}
Users require service to handle the Persons in our app, we’ll keep it simple:
@ApplicationScoped
@Named
public class PersonService {
private List personList;
@PostConstruct
void init(){
personList = new ArrayList<>();
}
public List getPersonList() {
return personList;
}
}
Now the fun part – JSR-352! As mentioned previously, JSR-352 defines multiple standardized approaches. There may be simple small pieces of work implemented in batchlets or there may be full scale readers, processors and writers. That’s exactly the most suitable way for the current spec.
JBatch Chunk
As the data is imported from a csv file:
- it first has to be read;
- each line should be parsed (processed) into the domain object;
- finally it should be stored using service.
A JSR-352 job has to be defined. The job is described in an xml file. In our case lets call it bigbangjob.xml
:
<job id="bigbangjob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/jobXML_1_0.xsd" version="1.0">
<step id="bigBangStep" >
<chunk item-count="3">
<reader ref="bigBangReader"/>
<processor ref="bigBangProcessor"/>
<writer ref="bigBangWriter"/>
</chunk>
</step>
</job>
There is only one job and only one step. The data should be processed in chunks of three applying corresponding reader, processor and writer, thus transforming the data. The reader is responsible for opening the data and reading it in chunks. In the case of app, a csv file should be opened then it should be read line by line. On each iteration a string containing line should be returned:
JBatch ItemReader
@Named
public class BigBangReader extends AbstractItemReader {
private BufferedReader reader;
@Override
public void open(Serializable checkpoint) throws Exception {
reader = new BufferedReader(
new InputStreamReader(
Thread.currentThread().getContextClassLoader().getResourceAsStream("/mydata.csv")));
}
@Override
public String readItem() {
try {
return reader.readLine();
} catch (IOException ex) {
Logger.getLogger(getClass().getName()).log(Level.SEVERE, null, ex);
}
return null;
}
}
The reader should extend AbstractItemReader
. Method open is used to get csv file; it’s called only once. Then with the method readItem, a piece of data is extracted, in our case one line from the file.
JBatch ItemProcessor
Now the data should be processed. As defined in the job, a line from the csv file is passed to a processor where it is parsed into the domain object Person:
@Named
public class BigBangProcessor implements ItemProcessor {
private SimpleDateFormat format = new SimpleDateFormat("M/dd/yy");
@Override
public Person processItem(Object t) {
System.out.println("processItem: " + t);
StringTokenizer tokens = new StringTokenizer((String) t, ",");
String name = tokens.nextToken();
String date;
try {
date = tokens.nextToken();
format.setLenient(false);
format.parse(date);
} catch (ParseException e) {
return null;
}
return new Person(name, date);
}
}
As seen from the code the processor should extend ItemProcessor
. In the processItem method the line string passed from the reader is parsed into Person and is being returned.
JBatch ItemWriter
In the last step, a chunk of processed Persons has to be stored. For the case a PersonService is used:
@Named
public class BigBangWriter extends AbstractItemWriter {
@Inject
PersonService personService;
@Override
public void writeItems(List<?> list) {
System.out.println("Writing items: " + list);
personService.getPersonList().addAll(list);
}
}
The writer extends AbstractItemWriter
. In the method writeItems the parsed from the string Persons are stored using the PersonService.
JBatch Web Interface
Now the processed data has to be visualized. RequestScope-d backing bean will be just OK. Java EE 7 specification allows the container to manage the JSF beans so it can be annotated as Named. PersonService is injected as it provides access to the Persons.
@RequestScoped
@Named
public class BatchBean {
@Inject
private PersonService personService;
public String runBatch() {
JobOperator jobOperator = BatchRuntime.getJobOperator();
jobOperator.start("bigbangjob", new Properties());
return "success";
}
public List getPersonList(){
return personService.getPersonList();
}
}
The JSR-352 job in the runBatch method is started like this:
JobOperator jobOperator = BatchRuntime.getJobOperator();
jobOperator.start("bigbangjob", new Properties());
Finally, some JSF pages are required to show the data from the backing bean. The initial batch.xhtml
:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:f="http://java.sun.com/jsf/core"
xmlns:h="http://java.sun.com/jsf/html">
<h:body bgcolor="white">
<f:view>
<h:form>
<h:commandButton action="#{batchBean.runBatch}" value="Run Batch"/>
</h:form>
</f:view>
</h:body>
</html>
The result.xhtml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:f="http://java.sun.com/jsf/core"
xmlns:h="http://java.sun.com/jsf/html">
<h:body>
<f:view>
<h:form id="mainForm">
<h2>Result of batch job</h2>
<h:dataTable value="${batchBean.personList}" var="person">
<h:column>
<h:outputText value="#{person.name}"/>
</h:column>
<h:column>
<h:outputText value="#{person.hiredate}"/>
</h:column>
</h:dataTable>
<h:commandLink action="back">
<h:outputText value="Home"/>
</h:commandLink>
</h:form>
</f:view>
</h:body>
</html>
Basically that’s it! Now just run the app!
Click on the “Run batch” and as a result:
If we take a look at the console we will see that BatchEE runs properly.
____ _ _ ______ ______
| _ \ | | | | | ____| ____|
| |_) | __ _| |_ ___| |__ | |__ | |__
| _ < / _` | __/ __| '_ \| __| | __|
| |_) | (_| | || (__| | | | |____| |____
|____/ \__,_|\__\___|_| |_|______|______|0.2-incubating
Status STARTING
processItem: Penny, 12/1/12
processItem: Leonard Hofstadter, 12/6/08
processItem: Howard Wolowitz, 8/27/7
Writing items: [Penny 12/1/12, Leonard Hofstadter 12/6/08, Howard Wolowitz 8/27/7]
processItem: Bernadette Rostenkowski-Wolowitz, 8/14/13
processItem: Sheldon Cooper, 3/18/9
processItem: Amy Farah Fowler, 11/22/10
Writing items: [Bernadette Rostenkowski-Wolowitz 8/14/13, Sheldon Cooper 3/18/9, Amy Farah Fowler 11/22/10]
processItem: Raj Koothrappali, 3/2/11
Writing items: [Raj Koothrappali 3/2/11]
Notice that the data is processed in a chunk of three.
Conclusion
That’s basically it! JSR-352 integrates with Apache TomEE really seamlessly! Especially on the latest versions and BatchEE is a great implementation for that. The example above demonstrates just a small part of JSR-352 possibilities. More examples can be found at https://github.com/javaee-samples/javaee7-samples/tree/master/batch.
JBatch (JSR-352) is a really robust framework which provides a standardized way of data processing. Bulk data processing is an essential part of most enterprise applications. JSR-352 defines a powerful programming model and runtime to easily build, deploy, and run mission-critical batch applications. Now that JSR-352 is part of Java EE 7, users can benefit from its full integration in the specification stack.
The example project demonstrated in the article can be found here: https://github.com/dalexandrov/jsf-batch-example.
Very useful info,How to use transaction manager?