Spring Boot Batch: Reader, Processor, Writer example

Spring Batch is a powerful framework designed to facilitate robust and scalable batch processing in Java applications. It follows a structured approach where data processing occurs in three main stages: reading, processing, and writing. These stages are handled by three essential components: the Item Reader, the Item Processor, and the Item Writer.

Configuring Spring Boot applications for Batch Execution

First, to use the Spring Batch API, we need to include the Spring Batch dependencies in the project:


Download the sample Spring Boot project or add the dependencies manually:

<dependency>
      <groupId>org.springframework.batch</groupId>
      <artifactId>spring-batch-core</artifactId>
</dependency>
<dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

Then, let’s add the main configuration class for our Spring Batch application. The following class, JobConfig, defines and configures the main components required to execute a Spring Batch job. It is responsible for setting up the Job, Step, Item Reader, Item Processor, and Item Writer, which are the core building blocks of a Spring Batch job.

@Configuration
public class JobConfig {
    public static final String jobName = "JobExample";
    private static final Logger logger = LoggerFactory.getLogger(JobConfig.class);

    @Bean
    public Job createJob(JobRepository jobRepository,
                    JobListener listener,
                    Step step1){
        return new JobBuilder(jobName, jobRepository)
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .start(step1)
                .build();
    }

    @Bean
    public Step createStep(JobRepository jobRepository,
                      PlatformTransactionManager transactionManager){
        return new StepBuilder("JobExample-step1", jobRepository)
                .<String, String>chunk(1, transactionManager)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .build();
    }

    @Bean
    public CustomItemReader reader(){
        return new CustomItemReader();
    }

    @Bean
    public CustomProcessor processor(){
        return new CustomProcessor();
    }

    @Bean
    public CustomItemWriter writer(){
        return new CustomItemWriter();
    }

}

More in detail:

  1. Job Definition: The createJob method defines the main job, setting up its name, incrementer, and attaching a listener. It’s created using a JobBuilder instance, specifying details such as the job name, incrementer for generating unique job instances, and a listener for monitoring job execution.
  2. Step Definition: The createStep method defines a step within the job. It uses a StepBuilder instance to configure the step, defining the chunk size, transaction management, reader, processor, and writer. This step represents a unit of work in the job flow.
  3. Item Reader, Processor, Writer: The reader, processor, and writer methods create instances of custom implementations for Item Reader, Item Processor, and Item Writer, respectively. These components handle reading input data, processing it, and writing the processed data to an output.
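Conceptually, a chunk-oriented step runs a loop like the following simplified sketch. This is only an illustration of the read/process/write semantics, not Spring Batch's actual internals; the class and method names are made up for the example:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Simplified illustration of the chunk-oriented loop: read items until the
// reader returns null, process each one, and write one chunk at a time.
public class ChunkLoopSketch {

    static <I, O> void runStep(Supplier<I> reader, Function<I, O> processor,
                               Consumer<List<O>> writer, int chunkSize) {
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) {   // read
            chunk.add(processor.apply(item));     // process
            if (chunk.size() == chunkSize) {      // write a full chunk
                writer.accept(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {                   // flush the last partial chunk
            writer.accept(chunk);
        }
    }

    public static void main(String[] args) {
        Iterator<String> source = List.of("Alice", "Bob").iterator();
        runStep(() -> source.hasNext() ? source.next() : null,
                name -> "{\"name\":\"" + name + "\"}",
                chunk -> System.out.println("write " + chunk),
                1);
    }
}
```

With a chunk size of 1, as in our configuration, every item is written in its own transaction, which is exactly the behavior we will observe in the Console output later on.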

Adding the Reader, Processor and Writer Classes

Now that we have defined the configuration of our Spring Batch application, let’s add the classes that perform the chunk reading, processing, and writing.


Let’s begin with the Item Reader.

Coding the Item Reader

An Item Reader is responsible for ingesting data into a Spring Batch application from various sources. It reads input data in chunks or individually, providing a stream of items for processing. It can interface with databases, files (CSV, XML, JSON), REST APIs, and more. The data then flows to the Item Processor for further transformation.

public class CustomItemReader implements ItemReader<String> {

    private final static Logger logger = LoggerFactory.getLogger(CustomItemReader.class);

    private static Integer count = 1;

 
    @Override
    public String read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        logger.info("Custom Item Reader {}", count);

        Thread.sleep(1000);

        if(count > 2)
            return null;

        count++;

        Faker faker = new Faker();
        return faker.name().fullName();
       
    }
}

In our example, the Reader class simulates reading data from a source. In practice, it uses the JavaFaker library to create a random name. The name is read twice (according to the count variable) and then handed to the Processor.
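Since the Reader relies on JavaFaker, make sure its dependency is on the classpath as well. These are the usual Maven coordinates; double-check the latest version on Maven Central:

```xml
<!-- JavaFaker, used by the Reader to generate random names -->
<dependency>
      <groupId>com.github.javafaker</groupId>
      <artifactId>javafaker</artifactId>
      <version>1.0.2</version>
</dependency>
```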

Coding the Processor

The Item Processor receives the data from the Item Reader, applies business logic or transformations, and returns processed data. It’s an optional component, allowing modification, validation, or enrichment of the input data. The processing can range from simple tasks like data cleansing to complex computations or aggregations.

Here is our CustomProcessor:

public class CustomProcessor implements ItemProcessor<String, String> {    

    @Override
    public String process(String text) throws Exception {
        // Pair the incoming name with a random age between 1 and 100

        Person person = new Person(text, (int) (Math.random() * 100) + 1);

        ObjectMapper objectMapper = new ObjectMapper();

        // Convert Person record to JSON string
        String json = objectMapper.writeValueAsString(person);
        return json;

    }
}

As you can see, the CustomProcessor class transforms the incoming String into a JSON String, using the ObjectMapper API from the Jackson library.
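The Person type used above is not shown in the article; a minimal sketch, assuming it is a plain Java record (Java 16+), could look like this:

```java
// Assumed definition of the Person record used by CustomProcessor.
// A record auto-generates the constructor and the name()/age() accessors
// that Jackson uses when serializing the instance to JSON.
public record Person(String name, int age) { }
```

Jackson supports records natively since version 2.12, so writeValueAsString produces output such as {"name":"Jesse Harris","age":45}.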

Coding the Writer Class

Once the data is processed by the Item Processor, it’s handed over to the Item Writer. The Item Writer is responsible for writing the processed data to specific destinations. It could involve persisting data into databases, writing to files, sending output to message queues, or invoking external APIs. The Item Writer ensures the data is persisted or dispatched correctly based on the defined logic.

In our case, the Writer will merely print on the Console the incoming JSON String:

public class CustomItemWriter implements ItemWriter<String> {

    private final static Logger logger = LoggerFactory.getLogger(CustomItemWriter.class);

    @Override
    public void write(Chunk<? extends String> chunk) throws Exception {
        // Log every item in the chunk (a single one, given our chunk size of 1)
        for (String item : chunk.getItems()) {
            logger.info("Custom Item Writer {}", item);
        }
        Thread.sleep(1000);
    }
}

Adding the JobExecutionListener

Within our Job configuration, we are also including a JobExecutionListener. The purpose of this class is to perform actions before the Job starts and after it finishes. In our case, we simply print a line to report the execution status:

@Component
public class JobListener implements JobExecutionListener {

    private static final Logger logger = LoggerFactory.getLogger(JobListener.class);

    @Override
    public void beforeJob(JobExecution jobExecution){
        logger.info("JobListener: beforeJob {}", jobExecution.getStatus());
    }

    @Override
    public void afterJob(JobExecution jobExecution){
 
        logger.info("JobListener: afterJob  {}", jobExecution.getStatus());
    }
}

Triggering the Batch Job

Finally, to kick-start our Batch Job, we will add a Controller with a GET endpoint that triggers the Batch execution:

@RestController
@RequestMapping("/batch")
public class BatchJobController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;  

    private final static Logger logger = LoggerFactory.getLogger(BatchJobController.class);

    @GetMapping("/startjob")
    public BatchStatus startJob() throws Exception {
        JobParameters jobParameters = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();

        JobExecution jobExecution = jobLauncher.run(job, jobParameters);

        logger.info("Job {} done...", job.getName());

        return jobExecution.getStatus();
    }
}
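One thing to keep in mind: out of the box, Spring Boot also launches the configured Job once at application startup. If you want the Job to run only when the endpoint is invoked, disable the automatic execution in application.properties:

```properties
# Do not run configured jobs automatically at startup;
# the job will only start when triggered via the REST endpoint
spring.batch.job.enabled=false
```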

You can then invoke the Controller and verify from the Console that the Custom Reader, Processor and Writer completed their execution twice:

2024-01-10T19:33:20.413+01:00  INFO 180692 --- [nio-8080-exec-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [SimpleJob: [name=JobExample]] launched with the following parameters: [{'time':'{value=1704911600377, type=class java.lang.Long, identifying=true}'}]
2024-01-10T19:33:20.426+01:00  INFO 180692 --- [nio-8080-exec-1] c.springbatchexample.config.JobListener  : JobListener: beforeJob STARTED
2024-01-10T19:33:20.431+01:00  INFO 180692 --- [nio-8080-exec-1] o.s.batch.core.job.SimpleStepHandler     : Executing step: [JobExample-step1]
2024-01-10T19:33:20.442+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.job.CustomItemReader                 : Custom Item Reader 1
2024-01-10T19:33:21.571+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.job.CustomItemWriter                 : Custom Item Writer {"name":"Jesse Harris","age":45}
2024-01-10T19:33:22.579+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.job.CustomItemReader                 : Custom Item Reader 2
2024-01-10T19:33:23.644+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.job.CustomItemWriter                 : Custom Item Writer {"name":"Franchesca O'Kon","age":28}
2024-01-10T19:33:24.651+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.job.CustomItemReader                 : Custom Item Reader 3
2024-01-10T19:33:25.657+01:00  INFO 180692 --- [nio-8080-exec-1] o.s.batch.core.step.AbstractStep         : Step: [JobExample-step1] executed in 5s225ms
2024-01-10T19:33:25.660+01:00  INFO 180692 --- [nio-8080-exec-1] c.springbatchexample.config.JobListener  : JobListener: afterJob  COMPLETED
2024-01-10T19:33:25.662+01:00  INFO 180692 --- [nio-8080-exec-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [SimpleJob: [name=JobExample]] completed with the following parameters: [{'time':'{value=1704911600377, type=class java.lang.Long, identifying=true}'}] and the following status: [COMPLETED] in 5s239ms
2024-01-10T19:33:25.662+01:00  INFO 180692 --- [nio-8080-exec-1] c.s.controller.BatchJobController        : Job JobExample done...

Conclusion:

Spring Batch offers a structured and scalable approach to handle batch processing tasks. By leveraging these three core components—Item Reader, Item Processor, and Item Writer—you can efficiently handle data ingestion, transformation, and output, making it a robust solution for batch processing requirements in enterprise applications.

Source code: https://github.com/fmarchioni/masterspringboot/tree/master/batch/chunk-step
