Getting Started
3. Getting Started - Local
See the Local Machine section of the microsite for more information on setting up docker compose and manual installation.
4. Getting Started - Cloud Foundry
This section covers how to get started with Spring Cloud Data Flow on Cloud Foundry. See the Cloud Foundry section of the microsite for more information on installing Spring Cloud Data Flow on Cloud Foundry.
Once you have the Data Flow server installed on Cloud Foundry, you probably want to get started with orchestrating the deployment of readily available pre-built applications into coherent streaming or batch data pipelines. We have guides to help you get started with both Stream and Batch processing.
5. Getting Started - Kubernetes
Spring Cloud Data Flow is a toolkit for building data integration and real-time data-processing pipelines.
Pipelines consist of Spring Boot applications built with the Spring Cloud Stream or Spring Cloud Task microservice frameworks. This makes Spring Cloud Data Flow suitable for a range of data-processing use cases, from import-export to event streaming and predictive analytics.
This project provides support for using Spring Cloud Data Flow with Kubernetes as the runtime for these pipelines, with applications packaged as Docker images.
See the Kubernetes section of the microsite for more information on installing Spring Cloud Data Flow on Kubernetes.
Once you have the Data Flow server installed on Kubernetes, you probably want to get started with orchestrating the deployment of readily available pre-built applications into a coherent streaming or batch data pipelines. We have guides to help you get started with both Stream and Batch processing.
5.1. Application and Server Properties
This section covers how you can customize the deployment of your applications. You can use a number of properties to influence settings for the applications that are deployed. Properties can be applied on a per-application basis or in the appropriate server configuration for all deployed applications.
Properties set on a per-application basis always take precedence over properties set as the server configuration. This arrangement lets you override global server level properties on a per-application basis. |
Properties to be applied for all deployed Tasks are defined in the src/kubernetes/server/server-config-[binder].yaml
file and for Streams in src/kubernetes/skipper/skipper-config-[binder].yaml
. Replace [binder]
with the messaging middleware you are using — for example, rabbit
or kafka
.
5.1.1. Memory and CPU Settings
Applications are deployed with default memory and CPU settings. If you need to, you can adjust these values. The following example shows how to set Limits
to 1000m
for CPU
and 1024Mi
for memory and Requests
to 800m
for CPU and 640Mi
for memory:
deployer.<application>.kubernetes.limits.cpu=1000m
deployer.<application>.kubernetes.limits.memory=1024Mi
deployer.<application>.kubernetes.requests.cpu=800m
deployer.<application>.kubernetes.requests.memory=640Mi
Those values results in the following container settings being used:
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 800m
memory: 640Mi
You can also control the default values to which to set the cpu
and memory
globally.
The following example shows how to set the CPU and memory for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
limits:
memory: 640mi
cpu: 500m
The following example shows how to set the CPU and memory for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
limits:
memory: 640mi
cpu: 500m
The settings we have used so far affect only the settings for the container. They do not affect the memory setting for the JVM process in the container. If you would like to set JVM memory settings, you can set an environment variable to do so. See the next section for details.
5.1.2. Environment Variables
To influence the environment settings for a given application, you can use the spring.cloud.deployer.kubernetes.environmentVariables
deployer property.
For example, a common requirement in production settings is to influence the JVM memory arguments.
You can do so by using the JAVA_TOOL_OPTIONS
environment variable, as the following example shows:
deployer.<application>.kubernetes.environmentVariables=JAVA_TOOL_OPTIONS=-Xmx1024m
The environmentVariables property accepts a comma-delimited string. If an environment variable contains a value
that is also a comma-delimited string, it must be enclosed in single quotation marks — for example,
spring.cloud.deployer.kubernetes.environmentVariables=spring.cloud.stream.kafka.binder.brokers='somehost:9092,
anotherhost:9093'
|
This overrides the JVM memory setting for the desired <application>
(replace <application>
with the name of your application).
5.1.3. Liveness and Readiness Probes
The liveness
and readiness
probes use paths called /health
and /info
, respectively. They use a delay
of 10
for both and a period
of 60
and 10
respectively. You can change these defaults when you deploy the stream by using deployer properties. The liveness and readiness probes are applied only to streams.
The following example changes the liveness
probe (replace <application>
with the name of your application) by setting deployer properties:
deployer.<application>.kubernetes.livenessProbePath=/health
deployer.<application>.kubernetes.livenessProbeDelay=120
deployer.<application>.kubernetes.livenessProbePeriod=20
You can declare the same as part of the server global configuration for streams, as the following example shows:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
livenessProbePath: /health
livenessProbeDelay: 120
livenessProbePeriod: 20
Similarly, you can swap liveness
for readiness
to override the default readiness
settings.
By default, port 8080 is used as the probe port. You can change the defaults for both liveness
and readiness
probe ports by using deployer properties, as the following example shows:
deployer.<application>.kubernetes.readinessProbePort=7000
deployer.<application>.kubernetes.livenessProbePort=7000
You can declare the same as part of the global configuration for streams, as the following example shows:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
readinessProbePort: 7000
livenessProbePort: 7000
By default, the
To automatically set both
|
You can access secured probe endpoints by using credentials stored in a Kubernetes secret. You can use an existing secret, provided the credentials are contained under the credentials
key name of the secret’s data
block. You can configure probe authentication on a per-application basis. When enabled, it is applied to both the liveness
and readiness
probe endpoints by using the same credentials and authentication type. Currently, only Basic
authentication is supported.
To create a new secret:
-
Generate the base64 string with the credentials used to access the secured probe endpoints.
Basic authentication encodes a username and a password as a base64 string in the format of
username:password
.The following example (which includes output and in which you should replace
user
andpass
with your values) shows how to generate a base64 string:$ echo -n "user:pass" | base64 dXNlcjpwYXNz
-
With the encoded credentials, create a file (for example,
myprobesecret.yml
) with the following contents:apiVersion: v1 kind: Secret metadata: name: myprobesecret type: Opaque data: credentials: GENERATED_BASE64_STRING
-
Replace
GENERATED_BASE64_STRING
with the base64-encoded value generated earlier. -
Create the secret by using
kubectl
, as the following example shows:$ kubectl create -f ./myprobesecret.yml secret "myprobesecret" created
-
Set the following deployer properties to use authentication when accessing probe endpoints, as the following example shows:
deployer.<application>.kubernetes.probeCredentialsSecret=myprobesecret
Replace
<application>
with the name of the application to which to apply authentication.
5.1.4. Using SPRING_APPLICATION_JSON
You can use a SPRING_APPLICATION_JSON
environment variable to set Data Flow server properties (including the configuration of Maven repository settings) that are common across all of the Data Flow server implementations. These settings go at the server level in the container env
section of a deployment YAML. The following example shows how to do so:
env:
- name: SPRING_APPLICATION_JSON
value: "{ \"maven\": { \"local-repository\": null, \"remote-repositories\": { \"repo1\": { \"url\": \"https://repo.spring.io/libs-snapshot\"} } } }"
5.1.5. Private Docker Registry
You can pull Docker images from a private registry on a per-application basis. First, you must create a secret in the cluster. Follow the Pull an Image from a Private Registry guide to create the secret.
Once you have created the secret, you can use the imagePullSecret
property to set the secret to use, as the following example shows:
deployer.<application>.kubernetes.imagePullSecret=mysecret
Replace <application>
with the name of your application and mysecret
with the name of the secret you created earlier.
You can also configure the image pull secret at the global server level.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
imagePullSecret: mysecret
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
imagePullSecret: mysecret
Replace mysecret
with the name of the secret you created earlier.
5.1.6. Annotations
You can add annotations to Kubernetes objects on a per-application basis. The supported object types are pod Deployment
, Service
, and Job
. Annotations are defined in a key:value
format, allowing for multiple annotations separated by a comma. For more information and use cases on annotations, see Annotations.
The following example shows how you can configure applications to use annotations:
deployer.<application>.kubernetes.podAnnotations=annotationName:annotationValue
deployer.<application>.kubernetes.serviceAnnotations=annotationName:annotationValue,annotationName2:annotationValue2
deployer.<application>.kubernetes.jobAnnotations=annotationName:annotationValue
Replace <application>
with the name of your application and the value of your annotations.
5.1.7. Entry Point Style
An entry point style affects how application properties are passed to the container to be deployed. Currently, three styles are supported:
-
exec
(default): Passes all application properties and command line arguments in the deployment request as container arguments. Application properties are transformed into the format of--key=value
. -
shell
: Passes all application properties and command line arguments as environment variables. Each of the applicationor command-line argument properties is transformed into an uppercase string and.
characters are replaced with_
. -
boot
: Creates an environment variable calledSPRING_APPLICATION_JSON
that contains a JSON representation of all application properties. Command line arguments from the deployment request are set as container args.
In all cases, environment variables defined at the server-level configuration and on a per-application basis are sent on to the container as is. |
You can configure an application as follows:
deployer.<application>.kubernetes.entryPointStyle=<Entry Point Style>
Replace <application>
with the name of your application and <Entry Point Style>
with your desired entry point style.
You can also configure the entry point style at the global server level.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
entryPointStyle: entryPointStyle
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
entryPointStyle: entryPointStyle
Replace entryPointStyle
with the desired entry point style.
You should choose an Entry Point Style of either exec
or shell
, to correspond to how the ENTRYPOINT
syntax is defined in the container’s Dockerfile
. For more information and uses cases on exec
versus shell
, see the ENTRYPOINT section of the Docker documentation.
Using the boot
entry point style corresponds to using the exec
style ENTRYPOINT
. Command line arguments from the deployment request are passed to the container, with the addition of application properties being mapped into the SPRING_APPLICATION_JSON
environment variable rather than command line arguments.
When you use the boot Entry Point Style, the deployer.<application>.kubernetes.environmentVariables property must not contain SPRING_APPLICATION_JSON .
|
5.1.8. Deployment Service Account
You can configure a custom service account for application deployments through properties. You can use an existing service account or create a new one. One way to create a service account is by using kubectl
, as the following example shows:
$ kubectl create serviceaccount myserviceaccountname
serviceaccount "myserviceaccountname" created
Then you can configure individual applications as follows:
deployer.<application>.kubernetes.deploymentServiceAccountName=myserviceaccountname
Replace <application>
with the name of your application and myserviceaccountname
with your service account name.
You can also configure the service account name at the global server level.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
deploymentServiceAccountName: myserviceaccountname
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
deploymentServiceAccountName: myserviceaccountname
Replace myserviceaccountname
with the service account name to be applied to all deployments.
5.1.9. Image Pull Policy
An image pull policy defines when a Docker image should be pulled to the local registry. Currently, three policies are supported:
-
IfNotPresent
(default): Do not pull an image if it already exists. -
Always
: Always pull the image regardless of whether it already exists. -
Never
: Never pull an image. Use only an image that already exists.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.imagePullPolicy=Always
Replace <application>
with the name of your application and Always
with your desired image pull policy.
You can configure an image pull policy at the global server level.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
imagePullPolicy: Always
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
imagePullPolicy: Always
Replace Always
with your desired image pull policy.
5.1.10. Deployment Labels
You can set custom labels on objects related to Deployment. See Labels for more information on labels. Labels are specified in key:value
format.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.deploymentLabels=myLabelName:myLabelValue
Replace <application>
with the name of your application, myLabelName
with your label name, and myLabelValue
with the value of your label.
Additionally, you can apply multiple labels, as the following example shows:
deployer.<application>.kubernetes.deploymentLabels=myLabelName:myLabelValue,myLabelName2:myLabelValue2
5.1.11. Tolerations
Tolerations work with taints to ensure pods are not scheduled onto particular nodes. Tolerations are set into the pod configuration while taints are set onto nodes. See the Taints and Tolerations section of the Kubernetes reference for more information.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.tolerations=[{key: 'mykey' operator: 'Equal', value: 'myvalue', effect: 'NoSchedule'}]
Replace <application>
with the name of your application and the key-value pairs according to your desired toleration configuration.
You can configure tolerations at the global server level as well.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
tolerations:
- key: mykey
operator: Equal
value: myvalue
effect: NoSchedule
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
tolerations:
- key: mykey
operator: Equal
value: myvalue
effect: NoSchedule
Replace the tolerations
key-value pairs according to your desired toleration configuration.
5.1.12. Secret References
Secrets can be referenced and their entire data contents can be decoded and inserted into the pod environment as individual variables. See the Configure all key-value pairs in a Secret as container environment variables section of the Kubernetes reference for more information.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.secretRefs=testsecret
You can also specify multiple secrets, as follows:
deployer.<application>.kubernetes.secretRefs=[testsecret,anothersecret]
Replace <application>
with the name of your application and the secretRefs
attribute with the appropriate values for your application environment and secret.
You can configure secret references at the global server level as well.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
secretRefs:
- testsecret
- anothersecret
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
secretRefs:
- testsecret
- anothersecret
Replace the items of secretRefs
with one or more secret names.
5.1.13. Secret Key References
Secrets can be referenced and their decoded value can be inserted into the pod environment. See the Using Secrets as Environment Variables section of the Kubernetes reference for more information.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.secretKeyRefs=[{envVarName: 'MY_SECRET', secretName: 'testsecret', dataKey: 'password'}]
Replace <application>
with the name of your application and the envVarName
, secretName
, and dataKey
attributes with the appropriate values for your application environment and secret.
You can configure secret key references at the global server level as well.
The following example shows how to do so for streams:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
secretKeyRefs:
- envVarName: MY_SECRET
secretName: testsecret
dataKey: password
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
secretKeyRefs:
- envVarName: MY_SECRET
secretName: testsecret
dataKey: password
Replace the envVarName
, secretName
, and dataKey
attributes with the appropriate values for your secret.
5.1.14. ConfigMap References
A ConfigMap can be referenced and its entire data contents can be decoded and inserted into the pod environment as individual variables. See the Configure all key-value pairs in a ConfigMap as container environment variables section of the Kubernetes reference for more information.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.configMapRefs=testcm
You can also specify multiple ConfigMap instances, as follows:
deployer.<application>.kubernetes.configMapRefs=[testcm,anothercm]
Replace <application>
with the name of your application and the configMapRefs
attribute with the appropriate values for your application environment and ConfigMap.
You can configure ConfigMap references at the global server level as well.
The following example shows how to do so for streams. Edit the appropriate skipper-config-(binder).yaml
, replacing (binder)
with the corresponding binder in use:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
configMapRefs:
- testcm
- anothercm
The following example shows how to do so for tasks by editing the server-config.yaml
file:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
configMapRefs:
- testcm
- anothercm
Replace the items of configMapRefs
with one or more secret names.
5.1.15. ConfigMap Key References
A ConfigMap can be referenced and its associated key value inserted into the pod environment. See the Define container environment variables using ConfigMap data section of the Kubernetes reference for more information.
The following example shows how you can individually configure applications:
deployer.<application>.kubernetes.configMapKeyRefs=[{envVarName: 'MY_CM', configMapName: 'testcm', dataKey: 'platform'}]
Replace <application>
with the name of your application and the envVarName
, configMapName
, and dataKey
attributes with the appropriate values for your application environment and ConfigMap.
You can configure ConfigMap references at the global server level as well.
The following example shows how to do so for streams. Edit the appropriate skipper-config-(binder).yaml
, replacing (binder)
with the corresponding binder in use:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
configMapKeyRefs:
- envVarName: MY_CM
configMapName: testcm
dataKey: platform
The following example shows how to do so for tasks by editing the server-config.yaml
file:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
configMapKeyRefs:
- envVarName: MY_CM
configMapName: testcm
dataKey: platform
Replace the envVarName
, configMapName
, and dataKey
attributes with the appropriate values for your ConfigMap.
5.1.16. Pod Security Context
You can confiure the pod security context to run processes under the specified UID (user ID) or GID (group ID).
This is useful when you want to not run processes under the default root
UID and GID.
You can define either the runAsUser
(UID) or fsGroup
(GID), and you can configure them to work together.
See the Security Context section of the Kubernetes reference for more information.
The following example shows how you can individually configure application pods:
deployer.<application>.kubernetes.podSecurityContext={runAsUser: 65534, fsGroup: 65534}
Replace <application>
with the name of your application and the runAsUser
and/or fsGroup
attributes with the appropriate values for your container environment.
You can configure the pod security context at the global server level as well.
The following example shows how to do so for streams. Edit the appropriate skipper-config-(binder).yaml
, replacing (binder)
with the corresponding binder in use:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
podSecurityContext:
runAsUser: 65534
fsGroup: 65534
The following example shows how to do so for tasks by editing the server-config.yaml
file:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
podSecurityContext:
runAsUser: 65534
fsGroup: 65534
Replace the runAsUser
and/or fsGroup
attributes with the appropriate values for your container environment.
5.1.17. Service Ports
When you deploy applications, a kubernetes Service object is created with a default port of 8080
. If the server.port
property is set, it overrides the default port value. You can add additional ports to the Service object on a per-application basis. You can add multiple ports with a comma delimiter.
The following example shows how you can configure additional ports on a Service object for an application:
deployer.<application>.kubernetes.servicePorts=5000
deployer.<application>.kubernetes.servicePorts=5000,9000
Replace <application>
with the name of your application and the value of your ports.
5.1.18. StatefulSet Init Container
When deploying an application by using a StatefulSet, an Init Container is used to set the instance index in the pod.
By default, the image used is busybox
, which you can be customize.
The following example shows how you can individually configure application pods:
deployer.<application>.kubernetes.statefulSetInitContainerImageName=myimage:mylabel
Replace <application>
with the name of your application and the statefulSetInitContainerImageName
attribute with the appropriate value for your environment.
You can configure the StatefulSet Init Container at the global server level as well.
The following example shows how to do so for streams. Edit the appropriate skipper-config-(binder).yaml
, replacing (binder)
with the corresponding binder in use:
data:
application.yaml: |-
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
statefulSetInitContainerImageName: myimage:mylabel
The following example shows how to do so for tasks by editing the server-config.yaml
file:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
statefulSetInitContainerImageName: myimage:mylabel
Replace the statefulSetInitContainerImageName
attribute with the appropriate value for your environment.
5.1.19. Init Containers
When you deploy applications, you can set a custom Init Container on a per-application basis. Refer to the Init Containers section of the Kubernetes reference for more information.
The following example shows how you can configure an Init Container for an application:
deployer.<application>.kubernetes.initContainer={containerName: 'test', imageName: 'busybox:latest', commands: ['sh', '-c', 'echo hello']}
Replace <application>
with the name of your application and set the values of the initContainer
attributes appropriate for your Init Container.
5.1.20. Lifecycle Support
When you deploy applications, you may attach postStart
and preStop
Lifecycle handlers to execute commands.
The Kubernetes API supports other types of handlers besides exec
. This feature may be extended to support additional actions in a future release.
To configure the Lifecycle handlers as shown in the linked page above,specify each command as a comma-delimited list, using the following property keys:
deployer.<application>.kubernetes.lifecycle.postStart.exec.command=/bin/sh,-c,'echo Hello from the postStart handler > /usr/share/message'
deployer.<application>.kubernetes.lifecycle.preStop.exec.command=/bin/sh,-c,'nginx -s quit; while killall -0 nginx; do sleep 1; done'
5.1.21. Additional Containers
When you deploy applications, you may need one or more containers to be deployed along with the main container. This would allow you to adapt some deployment patterns such as sidecar, adapter in case of multi container pod setup.
The following example shows how you can configure additional containers for an application:
deployer.<application>.kubernetes.additionalContainers=[{name: 'c1', image: 'busybox:latest', command: ['sh', '-c', 'echo hello1'], volumeMounts: [{name: 'test-volume', mountPath: '/tmp', readOnly: true}]},{name: 'c2', image: 'busybox:1.26.1', command: ['sh', '-c', 'echo hello2']}]