Batch
This section goes into more detail about Spring Cloud Task’s integration with Spring Batch. Tracking the association between a job execution and the task in which it was executed as well as remote partitioning through Spring Cloud Deployer are covered in this section.
Associating a Job Execution to the Task in which It Was Executed
Spring Boot provides facilities for the execution of batch jobs within a Spring Boot Uber-jar. Spring Boot’s support of this functionality lets a developer execute multiple batch jobs within that execution. Spring Cloud Task provides the ability to associate the execution of a job (a job execution) with a task’s execution so that one can be traced back to the other.
Spring Cloud Task achieves this functionality by using the TaskBatchExecutionListener
.
By default,
this listener is auto configured in any context that has both a Spring Batch Job
configured (by having a bean of type Job
defined in the context) and the
spring-cloud-task-batch
jar on the classpath. The listener is injected into all jobs
that meet those conditions.
Overriding the TaskBatchExecutionListener
To prevent the listener from being injected into any batch jobs within the current context, you can disable the autoconfiguration by using standard Spring Boot mechanisms.
To only have the listener injected into particular jobs within the context, override the
batchTaskExecutionListenerBeanPostProcessor
and provide a list of job bean IDs, as shown
in the following example:
public static TaskBatchExecutionListenerBeanPostProcessor batchTaskExecutionListenerBeanPostProcessor() {
TaskBatchExecutionListenerBeanPostProcessor postProcessor =
new TaskBatchExecutionListenerBeanPostProcessor();
postProcessor.setJobNames(Arrays.asList(new String[] {"job1", "job2"}));
return postProcessor;
}
You can find a sample batch application in the samples module of the Spring Cloud Task Project, here. |
Remote Partitioning
Spring Cloud Deployer provides facilities for launching Spring Boot-based applications on
most cloud infrastructures. The DeployerPartitionHandler
and
DeployerStepExecutionHandler
delegate the launching of worker step executions to Spring
Cloud Deployer.
To configure the DeployerStepExecutionHandler
, you must provide a Resource
representing the Spring Boot Uber-jar to be executed, a TaskLauncherHandler
, and a
JobExplorer
. You can configure any environment properties as well as the max number of
workers to be executing at once, the interval to poll for the results (defaults to 10
seconds), and a timeout (defaults to -1 or no timeout). The following example shows how
configuring this PartitionHandler
might look:
@Bean
public PartitionHandler partitionHandler(TaskLauncher taskLauncher,
JobExplorer jobExplorer) throws Exception {
MavenProperties mavenProperties = new MavenProperties();
mavenProperties.setRemoteRepositories(new HashMap<>(Collections.singletonMap("springRepo",
new MavenProperties.RemoteRepository(repository))));
Resource resource =
MavenResource.parse(String.format("%s:%s:%s",
"io.spring.cloud",
"partitioned-batch-job",
"1.1.0.RELEASE"), mavenProperties);
DeployerPartitionHandler partitionHandler =
new DeployerPartitionHandler(taskLauncher, jobExplorer, resource, "workerStep");
List<String> commandLineArgs = new ArrayList<>(3);
commandLineArgs.add("--spring.profiles.active=worker");
commandLineArgs.add("--spring.cloud.task.initialize.enable=false");
commandLineArgs.add("--spring.batch.initializer.enabled=false");
partitionHandler.setCommandLineArgsProvider(
new PassThroughCommandLineArgsProvider(commandLineArgs));
partitionHandler.setEnvironmentVariablesProvider(new NoOpEnvironmentVariablesProvider());
partitionHandler.setMaxWorkers(2);
partitionHandler.setApplicationName("PartitionedBatchJobTask");
return partitionHandler;
}
When passing environment variables to partitions, each partition may be on a different machine with different environment settings. Consequently, you should pass only those environment variables that are required. |
Notice in the example above that we have set the maximum number of workers to 2. Setting the maximum of workers establishes the maximum number of partitions that should be running at one time.
The Resource
to be executed is expected to be a Spring Boot Uber-jar with a
DeployerStepExecutionHandler
configured as a CommandLineRunner
in the current context.
The repository enumerated in the preceding example should be the remote repository in
which the Spring Boot Uber-jar is located. Both the manager and worker are expected to have visibility
into the same data store being used as the job repository and task repository. Once the
underlying infrastructure has bootstrapped the Spring Boot jar and Spring Boot has
launched the DeployerStepExecutionHandler
, the step handler executes the requested
Step
. The following example shows how to configure the DeployerStepExecutionHandler
:
@Bean
public DeployerStepExecutionHandler stepExecutionHandler(JobExplorer jobExplorer) {
DeployerStepExecutionHandler handler =
new DeployerStepExecutionHandler(this.context, jobExplorer, this.jobRepository);
return handler;
}
You can find a sample remote partition application in the samples module of the Spring Cloud Task project, here. |
Asynchronously launch remote batch partitions
By default batch partitions are launched sequentially. However, in some cases this may affect performance as each launch will block until the resource (For example: provisioning a pod in Kubernetes) is provisioned.
In these cases you can provide a ThreadPoolTaskExecutor
to the DeployerPartitionHandler
. This will launch the remote batch partitions based on the configuration of the ThreadPoolTaskExecutor
.
For example:
@Bean
public ThreadPoolTaskExecutor threadPoolTaskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(4);
executor.setThreadNamePrefix("default_task_executor_thread");
executor.setWaitForTasksToCompleteOnShutdown(true);
executor.initialize();
return executor;
}
@Bean
public PartitionHandler partitionHandler(TaskLauncher taskLauncher, JobExplorer jobExplorer,
TaskRepository taskRepository, ThreadPoolTaskExecutor executor) throws Exception {
Resource resource = this.resourceLoader
.getResource("maven://io.spring.cloud:partitioned-batch-job:2.2.0.BUILD-SNAPSHOT");
DeployerPartitionHandler partitionHandler =
new DeployerPartitionHandler(taskLauncher, jobExplorer, resource,
"workerStep", taskRepository, executor);
...
}
We need to close the context since the use of ThreadPoolTaskExecutor leaves a thread active thus the app will not terminate. To close the application appropriately, we will need to set spring.cloud.task.closecontextEnabled property to true .
|
Notes on Developing a Batch-partitioned application for the Kubernetes Platform
-
When deploying partitioned apps on the Kubernetes platform, you must use the following dependency for the Spring Cloud Kubernetes Deployer:
<dependency> <groupId>org.springframework.cloud</groupId> <artifactId>spring-cloud-starter-deployer-kubernetes</artifactId> </dependency>
-
The application name for the task application and its partitions need to follow the following regex pattern:
[a-z0-9]([-a-z0-9]*[a-z0-9])
. Otherwise, an exception is thrown.
Batch Informational Messages
Spring Cloud Task provides the ability for batch jobs to emit informational messages. The “Spring Batch Events” section covers this feature in detail.
Batch Job Exit Codes
As discussed earlier, Spring Cloud Task
applications support the ability to record the exit code of a task execution. However, in
cases where you run a Spring Batch Job within a task, regardless of how the Batch Job
Execution completes, the result of the task is always zero when using the default
Batch/Boot behavior. Keep in mind that a task is a boot application and that the exit code
returned from the task is the same as a boot application.
To override this behavior and allow the task to return an exit code other than zero when a
batch job returns an
BatchStatus
of FAILED
, set spring.cloud.task.batch.fail-on-job-failure
to true
. Then the exit code
can be 1 (the default) or be based on the
specified
ExitCodeGenerator
)
This functionality uses a new ApplicationRunner
that replaces the one provided by Spring
Boot. By default, it is configured with the same order. However, if you want to customize
the order in which the ApplicationRunner
is run, you can set its order by setting the
spring.cloud.task.batch.applicationRunnerOrder
property. To have your task return the
exit code based on the result of the batch job execution, you need to write your own
CommandLineRunner
.