4. Batch

This section goes into more detail about Spring Cloud Task’s integration with Spring Batch. Tracking the association between a job execution and the task in which it was executed as well as remote partitioning through Spring Cloud Deployer are covered in this section.spring-doc.cn

4.1. Associating a Job Execution to the Task in which It Was Executed

Spring Boot provides facilities for the execution of batch jobs within an über-jar. Spring Boot’s support of this functionality lets a developer execute multiple batch jobs within that execution. Spring Cloud Task provides the ability to associate the execution of a job (a job execution) with a task’s execution so that one can be traced back to the other.spring-doc.cn

Spring Cloud Task achieves this functionality by using the TaskBatchExecutionListener. By default, this listener is auto configured in any context that has both a Spring Batch Job configured (by having a bean of type Job defined in the context) and the spring-cloud-task-batch jar on the classpath. The listener is injected into all jobs that meet those conditions.spring-doc.cn

4.1.1. Overriding the TaskBatchExecutionListener

To prevent the listener from being injected into any batch jobs within the current context, you can disable the autoconfiguration by using standard Spring Boot mechanisms.spring-doc.cn

To only have the listener injected into particular jobs within the context, override the batchTaskExecutionListenerBeanPostProcessor and provide a list of job bean IDs, as shown in the following example:spring-doc.cn

public TaskBatchExecutionListenerBeanPostProcessor batchTaskExecutionListenerBeanPostProcessor() {
    TaskBatchExecutionListenerBeanPostProcessor postProcessor =
        new TaskBatchExecutionListenerBeanPostProcessor();

    postProcessor.setJobNames(Arrays.asList(new String[] {"job1", "job2"}));

    return postProcessor;
}
You can find a sample batch application in the samples module of the Spring Cloud Task Project, here.

4.2. Remote Partitioning

Spring Cloud Deployer provides facilities for launching Spring Boot-based applications on most cloud infrastructures. The DeployerPartitionHandler and DeployerStepExecutionHandler delegate the launching of worker step executions to Spring Cloud Deployer.spring-doc.cn

To configure the DeployerStepExecutionHandler, you must provide a Resource representing the Spring Boot über-jar to be executed, a TaskLauncher, and a JobExplorer. You can configure any environment properties as well as the max number of workers to be executing at once, the interval to poll for the results (defaults to 10 seconds), and a timeout (defaults to -1 or no timeout). The following example shows how configuring this PartitionHandler might look:spring-doc.cn

@Bean
public PartitionHandler partitionHandler(TaskLauncher taskLauncher,
        JobExplorer jobExplorer) throws Exception {

    MavenProperties mavenProperties = new MavenProperties();
    mavenProperties.setRemoteRepositories(new HashMap<>(Collections.singletonMap("springRepo",
        new MavenProperties.RemoteRepository(repository))));

    Resource resource =
        MavenResource.parse(String.format("%s:%s:%s",
                "io.spring.cloud",
                "partitioned-batch-job",
                "1.1.0.RELEASE"), mavenProperties);

    DeployerPartitionHandler partitionHandler =
        new DeployerPartitionHandler(taskLauncher, jobExplorer, resource, "workerStep");

    List<String> commandLineArgs = new ArrayList<>(3);
    commandLineArgs.add("--spring.profiles.active=worker");
    commandLineArgs.add("--spring.cloud.task.initialize.enable=false");
    commandLineArgs.add("--spring.batch.initializer.enabled=false");

    partitionHandler.setCommandLineArgsProvider(
        new PassThroughCommandLineArgsProvider(commandLineArgs));
    partitionHandler.setEnvironmentVariablesProvider(new NoOpEnvironmentVariablesProvider());
    partitionHandler.setMaxWorkers(2);
    partitionHandler.setApplicationName("PartitionedBatchJobTask");

    return partitionHandler;
}
When passing environment variables to partitions, each partition may be on a different machine with different environment settings. Consequently, you should pass only those environment variables that are required.

Notice in the example above that we have set the maximum number of workers to 2. Setting the maximum of workers establishes the maximum number of partitions that should be running at one time.spring-doc.cn

The Resource to be executed is expected to be a Spring Boot über-jar with a DeployerStepExecutionHandler configured as a CommandLineRunner in the current context. The repository enumerated in the preceding example should be the remote repository in which the über-jar is located. Both the manager and worker are expected to have visibility into the same data store being used as the job repository and task repository. Once the underlying infrastructure has bootstrapped the Spring Boot jar and Spring Boot has launched the DeployerStepExecutionHandler, the step handler executes the requested Step. The following example shows how to configure the DeployerStepExecutionHandler:spring-doc.cn

@Bean
public DeployerStepExecutionHandler stepExecutionHandler(JobExplorer jobExplorer) {
    DeployerStepExecutionHandler handler =
        new DeployerStepExecutionHandler(this.context, jobExplorer, this.jobRepository);

    return handler;
}
You can find a sample remote partition application in the samples module of the Spring Cloud Task project, here.

4.2.1. Notes on Developing a Batch-partitioned application for the Kubernetes Platform

  • When deploying partitioned apps on the Kubernetes platform, you must use the following dependency for the Spring Cloud Kubernetes Deployer:spring-doc.cn

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-deployer-kubernetes</artifactId>
    </dependency>
  • The application name for the task application and its partitions need to follow the following regex pattern: [a-z0-9]([-a-z0-9]*[a-z0-9]). Otherwise, an exception is thrown.spring-doc.cn

4.2.2. Notes on Developing a Batch-partitioned Application for the Cloud Foundry Platform

  • When deploying partitioned apps on the Cloud Foundry platform, you must use the following dependencies for the Spring Cloud Foundry Deployer:spring-doc.cn

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-deployer-cloudfoundry</artifactId>
    </dependency>
    <dependency>
        <groupId>io.projectreactor</groupId>
        <artifactId>reactor-core</artifactId>
        <version>3.1.5.RELEASE</version>
    </dependency>
    <dependency>
        <groupId>io.projectreactor.ipc</groupId>
        <artifactId>reactor-netty</artifactId>
        <version>0.7.5.RELEASE</version>
    </dependency>
  • When configuring the partition handler, Cloud Foundry Deployment environment variables need to be established so that the partition handler can start the partitions. The following list shows the required environment variables:spring-doc.cn

An example set of deployment environment variables for a partitioned task that uses a mysql database service might resemble the following:spring-doc.cn

spring_cloud_deployer_cloudfoundry_url=https://api.local.pcfdev.io
spring_cloud_deployer_cloudfoundry_org=pcfdev-org
spring_cloud_deployer_cloudfoundry_space=pcfdev-space
spring_cloud_deployer_cloudfoundry_domain=local.pcfdev.io
spring_cloud_deployer_cloudfoundry_username=admin
spring_cloud_deployer_cloudfoundry_password=admin
spring_cloud_deployer_cloudfoundry_services=mysql
spring_cloud_deployer_cloudfoundry_taskTimeout=300
When using PCF-Dev, the following environment variable is also required: spring_cloud_deployer_cloudfoundry_skipSslValidation=true

4.3. Batch Informational Messages

Spring Cloud Task provides the ability for batch jobs to emit informational messages. The “Spring Batch Events” section covers this feature in detail.spring-doc.cn

4.4. Batch Job Exit Codes

As discussed earlier, Spring Cloud Task applications support the ability to record the exit code of a task execution. However, in cases where you run a Spring Batch Job within a task, regardless of how the Batch Job Execution completes, the result of the task is always zero when using the default Batch/Boot behavior. Keep in mind that a task is a boot application and that the exit code returned from the task is the same as a boot application. To override this behavior and allow the task to return an exit code other than zero when a batch job returns an BatchStatus of FAILED, set spring.cloud.task.batch.fail-on-job-failure to true. Then the exit code can be 1 (the default) or be based on the specified ExitCodeGenerator)spring-doc.cn

This functionality uses a new CommandLineRunner that replaces the one provided by Spring Boot. By default, it is configured with the same order. However, if you want to customize the order in which the CommandLineRunner is run, you can set its order by setting the spring.cloud.task.batch.commandLineRunnerOrder property. To have your task return the exit code based on the result of the batch job execution, you need to write your own CommandLineRunner.spring-doc.cn