Appendices

Having trouble with Spring Cloud Data Flow? We’d like to help!

Appendix A: Data Flow Template

As described in the API Guide chapter, Spring Cloud Data Flow’s functionality is completely exposed through REST endpoints. While you can use those endpoints directly, Spring Cloud Data Flow also provides a Java-based API, which makes using those REST endpoints even easier.

The central entry point is the DataFlowTemplate class in the org.springframework.cloud.dataflow.rest.client package.

This class implements the DataFlowOperations interface and delegates to the following sub-templates that provide the specific functionality for each feature set:

Interface                    Description

StreamOperations             REST client for stream operations
CounterOperations            REST client for counter operations
FieldValueCounterOperations  REST client for field value counter operations
AggregateCounterOperations   REST client for aggregate counter operations
TaskOperations               REST client for task operations
JobOperations                REST client for job operations
AppRegistryOperations        REST client for app registry operations
CompletionOperations         REST client for completion operations
RuntimeOperations            REST client for runtime operations

When the DataFlowTemplate is being initialized, the sub-templates can be discovered through the REST relations, which are provided by HATEOAS (Hypermedia as the Engine of Application State).

If a resource cannot be resolved, the respective sub-template is null. A common cause is that Spring Cloud Data Flow allows specific sets of features to be enabled or disabled when launching. For more information, see one of the local, Cloud Foundry, or Kubernetes configuration chapters, depending on where you deploy your application.

A.1. Using the Data Flow Template

When you use the Data Flow Template, the only needed Data Flow dependency is the Spring Cloud Data Flow Rest Client, as shown in the following Maven snippet:

<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-dataflow-rest-client</artifactId>
  <version>2.10.4-SNAPSHOT</version>
</dependency>

With that dependency, you get the DataFlowTemplate class as well as all the dependencies needed to make calls to a Spring Cloud Data Flow server.

When instantiating the DataFlowTemplate, you also pass in a RestTemplate. Note that the needed RestTemplate requires some additional configuration to be valid in the context of the DataFlowTemplate. When declaring a RestTemplate as a bean, the following configuration suffices:

  @Bean
  public static RestTemplate restTemplate() {
    RestTemplate restTemplate = new RestTemplate();
    restTemplate.setErrorHandler(new VndErrorResponseErrorHandler(restTemplate.getMessageConverters()));
    for(HttpMessageConverter<?> converter : restTemplate.getMessageConverters()) {
      if (converter instanceof MappingJackson2HttpMessageConverter) {
        final MappingJackson2HttpMessageConverter jacksonConverter =
            (MappingJackson2HttpMessageConverter) converter;
        jacksonConverter.getObjectMapper()
            .registerModule(new Jackson2HalModule())
            .addMixIn(JobExecution.class, JobExecutionJacksonMixIn.class)
            .addMixIn(JobParameters.class, JobParametersJacksonMixIn.class)
            .addMixIn(JobParameter.class, JobParameterJacksonMixIn.class)
            .addMixIn(JobInstance.class, JobInstanceJacksonMixIn.class)
            .addMixIn(ExitStatus.class, ExitStatusJacksonMixIn.class)
            .addMixIn(StepExecution.class, StepExecutionJacksonMixIn.class)
            .addMixIn(ExecutionContext.class, ExecutionContextJacksonMixIn.class)
            .addMixIn(StepExecutionHistory.class, StepExecutionHistoryJacksonMixIn.class);
      }
    }
    return restTemplate;
  }
You can also get a pre-configured RestTemplate by using DataFlowTemplate.getDefaultDataflowRestTemplate();

Now you can instantiate the DataFlowTemplate with the following code:

DataFlowTemplate dataFlowTemplate = new DataFlowTemplate(
    new URI("http://localhost:9393/"), restTemplate);         (1)
1 The URI points to the ROOT of your Spring Cloud Data Flow Server.

Depending on your requirements, you can now make calls to the server. For instance, if you want to get a list of the currently available applications, you can run the following code:

PagedResources<AppRegistrationResource> apps = dataFlowTemplate.appRegistryOperations().list();

System.out.println(String.format("Retrieved %s application(s)",
    apps.getContent().size()));

for (AppRegistrationResource app : apps.getContent()) {
  System.out.println(String.format("App Name: %s, App Type: %s, App URI: %s",
    app.getName(),
    app.getType(),
    app.getUri()));
}

A.2. Data Flow Template and Security

When using the DataFlowTemplate, you can also provide all the security-related options as if you were using the Data Flow Shell. In fact, the Data Flow Shell uses the DataFlowTemplate for all its operations.

To help you get started, we provide an HttpClientConfigurer that uses the builder pattern to set the various security-related options:

	HttpClientConfigurer
		.create(targetUri)                                             (1)
		.basicAuthCredentials(username, password)                      (2)
		.skipTlsCertificateVerification()                              (3)
		.withProxyCredentials(proxyUri, proxyUsername, proxyPassword)  (4)
		.addInterceptor(interceptor)                                   (5)
		.buildClientHttpRequestFactory()                               (6)
1 Creates an HttpClientConfigurer with the provided target URI.
2 Sets the credentials for basic authentication (using OAuth2 Password Grant).
3 Skips SSL certificate verification (use for DEVELOPMENT ONLY!).
4 Configures any proxy settings.
5 Adds a custom interceptor, for example to set the OAuth2 Authorization header. This allows you to pass an OAuth2 access token instead of username/password credentials.
6 Builds the ClientHttpRequestFactory that can be set on the RestTemplate.

Once the HttpClientConfigurer is configured, you can use its buildClientHttpRequestFactory method to build the ClientHttpRequestFactory and then set the corresponding property on the RestTemplate. You can then instantiate the actual DataFlowTemplate by using that RestTemplate.

To configure Basic Authentication, the following setup is required:

	RestTemplate restTemplate = DataFlowTemplate.getDefaultDataflowRestTemplate();
	HttpClientConfigurer httpClientConfigurer = HttpClientConfigurer.create("http://localhost:9393");

	httpClientConfigurer.basicAuthCredentials("my_username", "my_password");
	restTemplate.setRequestFactory(httpClientConfigurer.buildClientHttpRequestFactory());

	DataFlowTemplate dataFlowTemplate = new DataFlowTemplate("http://localhost:9393", restTemplate);

Appendix B: “How-to” guides

This section provides answers to some common “how do I do that…” questions that often arise when people use Spring Cloud Data Flow.

If you have a specific problem that we do not cover here, you might want to check out stackoverflow.com to see if someone has already provided an answer. That is also a great place to ask new questions (use the spring-cloud-dataflow tag).

We are also more than happy to extend this section. If you want to add a “how-to”, you can send us a pull request.

B.1. Configure Maven Properties

You can set Maven properties, such as the local Maven repository location, remote Maven repositories, authentication credentials, and proxy server properties, through command-line properties when you start the Data Flow server. Alternatively, you can set them through the SPRING_APPLICATION_JSON environment property for the Data Flow server.

The remote Maven repositories need to be configured explicitly if the applications are resolved by using the Maven repository. The one exception to this rule is the local Data Flow server installation, which already has Maven Central and the Spring Artifactory remote repositories pre-configured. The other (non-local) server installations have no default value for remote repositories.

If you configure your own remote repositories, be sure to add Maven Central (repo.maven.apache.org/maven2), as it is not automatically added for you.

To pass the properties as command-line options, run the server with a command similar to the following:

$ java -jar <dataflow-server>.jar --maven.localRepository=mylocal \
--maven.remote-repositories.repo1.url=https://repo1 \
--maven.remote-repositories.repo1.auth.username=repo1user \
--maven.remote-repositories.repo1.auth.password=repo1pass \
--maven.remote-repositories.repo2.url=https://repo2 --maven.proxy.host=proxyhost \
--maven.proxy.port=9018 --maven.proxy.auth.username=proxyuser \
--maven.proxy.auth.password=proxypass

You can also use the SPRING_APPLICATION_JSON environment property:

export SPRING_APPLICATION_JSON='{ "maven": { "local-repository": "local","remote-repositories": { "repo1": { "url": "https://repo1", "auth": { "username": "repo1user", "password": "repo1pass" } },
"repo2": { "url": "https://repo2" } }, "proxy": { "host": "proxyhost", "port": 9018, "auth": { "username": "proxyuser", "password": "proxypass" } } } }'

Here is the same content in nicely formatted JSON:

SPRING_APPLICATION_JSON='{
  "maven": {
    "local-repository": "local",
    "remote-repositories": {
      "repo1": {
        "url": "https://repo1",
        "auth": {
          "username": "repo1user",
          "password": "repo1pass"
        }
      },
      "repo2": {
        "url": "https://repo2"
      }
    },
    "proxy": {
      "host": "proxyhost",
      "port": 9018,
      "auth": {
        "username": "proxyuser",
        "password": "proxypass"
      }
    }
  }
}'
Depending on the Spring Cloud Data Flow server implementation, you may have to pass the environment properties by using the platform-specific environment-setting capabilities. For instance, in Cloud Foundry, you would pass them with cf set-env SPRING_APPLICATION_JSON.

B.2. Troubleshooting

This section covers how to troubleshoot Spring Cloud Data Flow on your platform of choice. See the Troubleshooting sections of the microsite for Stream and Batch processing.

B.3. Extending application classpath

You may need to add dependencies to the existing Stream applications, or add specific database drivers to Data Flow, Skipper, or any of the other containers provided by the project.

The Spring Cloud Dataflow repository contains scripts to help with this task. The examples below assume you have cloned the spring-cloud-dataflow repository and are executing the scripts from src/templates/add-deps.

B.3.1. Containers

To add dependencies to an existing container, take the following approach:

  • Create a folder with the extra dependencies.

  • Create a new container image while copying the files to the libraries folder.

  • Push the image to a private registry.

Environment variables
  • DEPS_FOLDER: a full filename or path expression for the files to copy into the container.

  • CONTAINER_REPO: the source Docker image name.

  • CONTAINER_TAG: the tag of the source image.

  • PRIVATE_REGISTRY: the host name of the private registry.

Examples
export CONTAINER_REPO="springcloud/spring-cloud-dataflow-server"
export CONTAINER_TAG="2.9.5-jdk17"
export PRIVATE_REGISTRY="our.private.registry"
export DEPS_FOLDER="./extra-libs/"
docker build -f Dockerfile -t "$PRIVATE_REGISTRY/$CONTAINER_REPO:$CONTAINER_TAG" .
docker push "$PRIVATE_REGISTRY/$CONTAINER_REPO:$CONTAINER_TAG"
As pointed out above, the Dockerfile lives in the spring-cloud-dataflow repository under src/templates/add-deps.

B.3.2. JAR File

When using Cloud Foundry or a local deployment, you need to update the jar before publishing it to a private registry or Maven Local.

Example

This example adds the dependencies and then installs the jar into Maven Local.

./gradlew -i publishToMavenLocal \
    -P appFolder="." \
    -P appGroup="org.springframework.cloud" \
    -P appName="spring-cloud-dataflow-server" \
    -P appVersion="2.9.5" \
    -P depFolder="./extra-libs"
Use the publishMavenPublicationToMavenRepository task to publish to a remote repository. Update gradle.properties with the remote repository details. Alternatively, move repoUser and repoPassword to ~/.gradle/gradle.properties.

B.4. Create containers for architectures not supported yet.

On macOS with Apple Silicon (M1), the performance of amd64/x86_64 containers is unacceptable. We provide a set of scripts that can be used to download specific versions of published artifacts, as well as a script that creates a container for the host platform by using the downloaded artifact. In the various projects, you can find them in src/local or local folders.

Project Scripts Notes

Data Flow
  Scripts: src/local/download-apps.sh, src/local/create-containers.sh
  Notes: Download or create containers for spring-cloud-dataflow-server, spring-cloud-dataflow-composed-task-runner, spring-cloud-dataflow-single-step-batch-job, spring-cloud-dataflow-tasklauncher-sink-kafka, and spring-cloud-dataflow-tasklauncher-sink-rabbit.

Skipper
  Scripts: local/download-app.sh, local/create-container.sh
  Notes: Download or create a container for spring-cloud-skipper-server.

Stream Applications
  Scripts: local/download-apps.sh, local/create-containers.sh, local/pack-containers.sh
  Notes: create-containers.sh uses jib; pack-containers.sh uses pack.

B.4.1. Scripts in spring-cloud-dataflow

src/local/download-apps.sh

Downloads all applications needed by create-containers.sh from the Maven repository.

If the timestamp of a snapshot matches, the download is skipped.

Usage: download-apps.sh [version]

  • version is the dataflow-server version, such as 2.9.6. The default is 2.10.4-SNAPSHOT.

src/local/create-containers.sh

Creates all containers and pushes them to the local Docker registry.

This script requires jib-cli.

Usage: create-containers.sh [version] [jre-version]

  • version is the dataflow-server version, such as 2.9.6. The default is 2.10.4-SNAPSHOT.

  • jre-version should be one of 11 or 17. The default is 11.

B.4.2. Scripts in spring-cloud-skipper

local/download-app.sh

Downloads all applications needed by create-container.sh from the Maven repository.

If the timestamp of a snapshot matches, the download is skipped.

Usage: download-app.sh [version]

  • version is the Skipper version, such as 2.8.6. The default is 2.9.4-SNAPSHOT.

local/create-container.sh

Creates all containers and pushes them to the local Docker registry. This script requires jib-cli.

Usage: create-container.sh [version] [jre-version]

  • version is the Skipper version, such as 2.8.6. The default is 2.9.4-SNAPSHOT.

  • jre-version should be one of 11 or 17.

B.4.3. Scripts in stream-applications

local/download-apps.sh

Downloads all applications needed by create-containers.sh from the Maven repository.

If the timestamp of a snapshot matches, the download is skipped.

Usage: download-apps.sh [version] [broker] [filter]

  • version is the stream applications version, such as 3.2.1. The default is 3.2.2-SNAPSHOT.

  • broker is one of rabbitmq, rabbit, or kafka.

  • filter is the name of an application, or a partial name that will be matched.

local/create-containers.sh

Creates all containers and pushes them to the local Docker registry.

This script requires jib-cli.

Usage: create-containers.sh [version] [broker] [jre-version] [filter]

  • version is the stream-applications version, such as 3.2.1. The default is 3.2.2-SNAPSHOT.

  • broker is one of rabbitmq, rabbit, or kafka.

  • jre-version should be one of 11 or 17.

  • filter is the name of an application, or a partial name that will be matched.

If a file required to create a container is not present, the script skips that one.

local/pack-containers.sh

Creates all containers and pushes them to the local Docker registry.

This script requires the Paketo pack CLI.

Usage: pack-containers.sh [version] [broker] [jre-version] [filter]

  • version is the stream-applications version, such as 3.2.1. The default is 3.2.2-SNAPSHOT.

  • broker is one of rabbitmq, rabbit, or kafka.

  • jre-version should be one of 11 or 17.

  • filter is the name of an application, or a partial name that will be matched.

If a file required to create a container is not present, the script skips that one.

If you provide any parameter, all parameters to its left are required.

B.5. Configure Kubernetes for local development or testing

B.5.1. Prerequisites

You will need to install kubectl and then kind or minikube for a local cluster.

All the examples assume you have cloned the spring-cloud-dataflow repository and are executing the scripts from src/local/k8s.

On macOS, you may need to install realpath from MacPorts or with brew install realpath.

The scripts require a shell like bash or zsh and should work on Linux, WSL 2 or macOS.

B.5.2. Steps

Kubernetes Provider

How do I choose between minikube and kind? kind generally provides quicker setup and teardown than Minikube. There is little difference in performance between the two, except that Minikube lets you configure limits on CPUs and memory when deploying. So, if you have memory constraints or need to enforce memory limitations, Minikube is the better option.

Kubectl

You need to install kubectl in order to configure the Kubernetes cluster.

Kind

Kind is Kubernetes in Docker and ideal for local development.

The LoadBalancer is installed by the configure-k8s.sh script but requires an update to a YAML file to provide the address range available to the LoadBalancer.

Minikube

Minikube uses one of a selection of drivers to provide a virtualization environment.

Delete any existing Minikube installation first: minikube delete

B.5.3. Building and loading containers.

For local development, you need control of the containers used in the local environment.

To manage specific versions of the Data Flow and Skipper containers, set the SKIPPER_VERSION and DATAFLOW_VERSION environment variables and then invoke ./pull-dataflow.sh and ./pull-skipper.sh. If you want to use a locally built application, invoke ./build-skipper-image.sh and ./build-dataflow.sh instead.

B.5.4. Configure k8s environment

You can invoke one of the following scripts to choose the type of installation you are targeting:

use-kind.sh [<namespace>] [<database>] [<broker>]
use-mk-docker.sh [<namespace>] [<database>] [<broker>]
use-mk-kvm2.sh [<namespace>] [<database>] [<broker>]
use-mk.sh <driver> [<namespace>] [<database>] [<broker>] (1)
use-tmc.sh <cluster-name> [<namespace>] [<database>] [<broker>]
use-gke.sh <cluster-name> [<namespace>] [<database>] [<broker>]
1 <driver> must be one of kvm2, docker, vmware, virtualbox, vmwarefusion, or hyperkit. docker is the recommended option for local development.
<namespace> is default if not provided. The default <database> is postgresql and the default <broker> is kafka.

Since these scripts export environment variables, they need to be sourced, as in the following example:

source ./use-mk-docker.sh test-ns postgresql rabbitmq
TMC or GKE Cluster in Cloud

The cluster must exist before use, and you should use the relevant CLI to log in before executing source ./use-gke.sh.

Create Local Cluster.

The following script creates the local cluster:

./configure-k8s.sh
  • For kind, follow the instructions to update src/local/k8s/yaml/metallb-configmap.yaml and then apply it by using kubectl apply -f src/local/k8s/yaml/metallb-configmap.yaml.

  • For minikube, launch a new shell and execute minikube tunnel.

Deploy Spring Cloud Data Flow.

Configure the broker:

export BROKER=<broker> (1)
1 <broker> is one of kafka or rabbitmq.

Configure the database:

export DATABASE=<database> (1)
1 <database> is one of mariadb or postgresql.
This is still optional, and PostgreSQL support isn’t available yet but will follow soon.

Install Spring Cloud Data Flow and export the server address:

./install-scdf.sh
source ./export-dataflow-ip.sh

Delete the deployment from the cluster:

./delete-scdf.sh
Delete the cluster

This script also deletes the TMC cluster if you have configured one.

./destroy-k8s.sh

B.5.5. Utilities

The following list of utilities may prove useful.

Name  Description

k9s   A text-based monitor to explore the Kubernetes cluster.

kail  Extract and tail the logs of various pods, based on various naming criteria.

Using kail to log activity related to a specific stream, or a whole namespace:

kail --label=spring-group-id=<stream-name>
kail --ns=<namespace>

B.5.6. Scripts

Some of the scripts apply to local containers as well and can be found in src/local; the Kubernetes-specific scripts are in src/local/k8s.

Script                         Description

build-app-images.sh            Build all images of the Restaurant Sample Stream Apps.

pull-app-images.sh             Pull all images of the Restaurant Sample Stream Apps from Docker Hub.

pull-dataflow.sh               Pull Data Flow from Docker Hub based on DATAFLOW_VERSION.

pull-scdf-pro.sh               Pull Data Flow Pro from Tanzu Network based on SCDF_PRO_VERSION.

pull-skipper.sh                Pull Skipper from Docker Hub based on SKIPPER_VERSION.

build-dataflow-image.sh        Build a Docker image from the local Data Flow repository.

build-scdf-pro-image.sh        Build a Docker image from the local Data Flow Pro repository. Set USE_PRO=true in the environment to use Data Flow Pro.

build-skipper-image.sh         Build a Docker image from the local Skipper repository.

configure-k8s.sh               Configure the Kubernetes environment based on your configuration of K8S_DRIVER.

delete-scdf.sh                 Delete all Kubernetes resources created by the deployment.

destroy-k8s.sh                 Delete the cluster (kind or minikube).

export-dataflow-ip.sh          Export the URL of the Data Flow server to DATAFLOW_IP.

export-http-url.sh             Export the URL of an http source of a specific flow, by name, to HTTP_APP_URL.

install-scdf.sh                Configure and deploy all the containers for Spring Cloud Data Flow.

load-images.sh                 Load all container images required by tests into kind or minikube to ensure you have control over what is used.

load-image.sh                  Load a specific container image into local kind or minikube.

local-k8s-acceptance-tests.sh  Execute acceptance tests against the cluster where DATAFLOW_IP is pointing.

register-apps.sh               Register the Task and Stream apps used by the unit tests.

Please report any errors with the scripts, along with detailed information about the relevant environment.

B.6. Frequently Asked Questions

In this section, we review the frequently asked questions for Spring Cloud Data Flow. See the Frequently Asked Questions section of the microsite for more information.