Spring Batch

TODO

  1. https://docs.spring.io/spring-batch/docs/4.1.x/reference/html/job.html#javaConfig

  2. https://spring.io/guides/gs/batch-processing/

  3. https://docs.spring.io/spring-batch/docs/current/reference/html/spring-batch-intro.html#:~:text=You%20can%20use%20Spring%20Batch,it%2C%20and%20so%20on).

  4. https://stackoverflow.com/questions/26929308/advantages-of-spring-batch

  5. https://dzone.com/articles/spring-batch-typical-use-case

  6. https://stackoverflow.com/questions/5241416/is-spring-batch-an-overkill

  7. Write a sample application (probably the same use-case as that of Pru?)

  8. Implement using an H2 database. Without database integration, the sample will not be very helpful.

  9. Set up unit tests for the individual steps.

  10. Set up an end-to-end test for the entire batch using a mocked data set.

    1. Verify everything that needs to be done in each of the steps using “verify” and “assert”.

Reasons not to use

  1. If the data model passed from one step to another is complex, we can run into serialization issues, because the execution context is serialized (with Jackson in this setup) when it is persisted.
    1. I had a scenario where I tried to pass a list of objects from one Step to another.
    2. Because the objects had fields of type LocalDateTime, it kept throwing this error:
      com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Java 8 date/time type `java.time.LocalDate` not supported by default: add Module "com.fasterxml.jackson.datatype:jackson-datatype-jsr310" to enable handling (through reference chain)
      
    3. Customizing the configuration to make the ObjectMapper use JavaTimeModule turned out to be incredibly difficult.
      objectMapper.registerModule(new JavaTimeModule());
      
    4. https://stackoverflow.com/questions/63214823/unable-to-deserialize-the-execution-context-exception-when-running-spring-batch
    5. https://stackoverflow.com/questions/56233013/overriding-bean-issue-in-spring-batch
    6. https://stackoverflow.com/questions/53116676/spring-batch-not-deserialising-dates
    7. The following custom configurer turned out to be helpful for getting the serializer to work with LocalDateTime.
      import javax.sql.DataSource;
      
      import com.fasterxml.jackson.databind.ObjectMapper;
      import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
      
      import org.springframework.batch.core.repository.JobRepository;
      import org.springframework.batch.core.repository.dao.Jackson2ExecutionContextStringSerializer;
      import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
      
      import org.springframework.boot.autoconfigure.batch.BasicBatchConfigurer;
      import org.springframework.boot.autoconfigure.batch.BatchProperties;
      import org.springframework.boot.autoconfigure.transaction.TransactionManagerCustomizers;
      import org.springframework.boot.context.properties.PropertyMapper;
      
      public class MyCustomBatchConfigurer extends BasicBatchConfigurer {
      
          private final BatchProperties properties;
          private final DataSource datasource;
      
          protected MyCustomBatchConfigurer(BatchProperties properties, DataSource datasource, TransactionManagerCustomizers transactionManagerCustomizers) {
              super(properties, datasource, transactionManagerCustomizers);
              this.properties = properties;
              this.datasource = datasource;
          }
      
          @Override
          protected JobRepository createJobRepository() throws Exception {
              // Build the JobRepository ourselves so that a custom serializer can be plugged in.
              JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
      
              PropertyMapper map = PropertyMapper.get();
              map.from(this.datasource).to(factory::setDataSource);
              map.from(this::determineIsolationLevel).whenNonNull().to(factory::setIsolationLevelForCreate);
              map.from(this.properties.getJdbc()::getTablePrefix).whenHasText().to(factory::setTablePrefix);
              map.from(this::getTransactionManager).to(factory::setTransactionManager);
      
              ObjectMapper objectMapper = new ObjectMapper();
              objectMapper.registerModule(new JavaTimeModule());
      
              Jackson2ExecutionContextStringSerializer serializer = new Jackson2ExecutionContextStringSerializer();
              serializer.setObjectMapper(objectMapper);
      
              factory.setSerializer(serializer);
              factory.afterPropertiesSet();
              return factory.getObject();
          }
      }
      
      Then, in the batch configuration class:
      @Bean
      public BasicBatchConfigurer basicBatchConfigurer(BatchProperties properties, DataSource datasource, TransactionManagerCustomizers transactionManagerCustomizers) {
          return new MyCustomBatchConfigurer(properties, datasource, transactionManagerCustomizers);
      }
      
  2. If the different steps in the batch job are expected to work with different entities (views, tables, etc.), it can be very difficult to get them to work with Spring Batch.
    1. I had a scenario where I read data from a view in Step 1 and had to update a (different) table in Step 2.
    2. The batch job kept trying to update the view in Step 2 as well (even though the Repository class used there is for the table).
  3. Setting up tests was hard.
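
For the serialization problem in item 1, one possible workaround that avoids customizing the ObjectMapper is to keep only simple values in the execution context, for example by storing each LocalDateTime as an ISO-8601 string and parsing it back in the next step. This is a minimal sketch under assumptions, not something tried in the scenario above: a plain Map stands in for Spring Batch's ExecutionContext, and the class name, method names, and key are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

public class IsoDateContextSketch {

    // Hypothetical key under which the timestamp is shared between steps.
    static final String KEY = "order.createdAt";

    // "Step 1": store the timestamp as a plain String instead of a LocalDateTime.
    static void writeStep(Map<String, Object> context, LocalDateTime createdAt) {
        context.put(KEY, createdAt.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME));
    }

    // "Step 2": parse the String back into a LocalDateTime.
    static LocalDateTime readStep(Map<String, Object> context) {
        String stored = (String) context.get(KEY);
        return LocalDateTime.parse(stored, DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        // Assumption: this Map is only a stand-in for the real ExecutionContext.
        Map<String, Object> context = new HashMap<>();
        LocalDateTime original = LocalDateTime.of(2021, 3, 14, 9, 26, 53);

        writeStep(context, original);

        System.out.println(context.get(KEY));                   // the stored String form
        System.out.println(readStep(context).equals(original)); // round trip keeps the value
    }
}
```

The trade-off is that every step has to convert explicitly, so this probably only pays off when a few date fields are shared; for a whole list of objects, the custom serializer shown in item 1 is likely the better fit.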

Alternatives

  1. https://github.com/j-easy/easy-batch/wiki
  2. https://www.jeasy.org/
