Configuration of DataCater
The container image datacater/datacater implements DataCater's control plane and exposes a RESTful API. It is configured through environment variables, so you can supply its settings from Kubernetes ConfigMaps and Secrets.
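As a minimal sketch of how environment variables can be supplied from a ConfigMap and a Secret, the following Python snippet builds a Kubernetes container spec that uses the standard `envFrom` field. The resource names `datacater-config` and `datacater-secrets` and the helper `control_plane_container` are hypothetical and chosen only for illustration; they are not part of DataCater.

```python
import json


def control_plane_container(config_map_name, secret_name):
    """Build a container spec whose environment variables are loaded
    from a ConfigMap and a Secret (both names are supplied by the caller)."""
    return {
        "name": "datacater",
        "image": "datacater/datacater",
        "envFrom": [
            # Every key of the ConfigMap becomes an environment variable.
            {"configMapRef": {"name": config_map_name}},
            # Every key of the Secret becomes an environment variable.
            {"secretRef": {"name": secret_name}},
        ],
    }


if __name__ == "__main__":
    # Hypothetical resource names, used for illustration only.
    spec = control_plane_container("datacater-config", "datacater-secrets")
    print(json.dumps(spec, indent=2))
```

Keeping non-sensitive options in the ConfigMap and credentials in the Secret lets both be managed separately from the container image.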
The following sections provide the available configuration options, including a description, their default values if applicable, and information on whether they are mandatory.
The following configuration options are related to Deployments, which are internally powered by Kubernetes Deployments.
Container image name and version to use (default: datacater/pipeline:2023.1).
Kubernetes Namespace in which to create the deployment (default: default).
Number of replicas of the underlying Kubernetes Deployment (default: 1). This corresponds to the number of deployment instances and enables you to parallelize the processing of your data.
Path of the health endpoint of deployments (default: /q/health).
Path of the metrics endpoint of deployments (default: /q/metrics).
Timeout in milliseconds for requests to deployment statistics (default: 10000).
Memory request of the underlying Kubernetes Deployment (default: 300Mi).
Memory limit of the underlying Kubernetes Deployment (default: 800Mi).
CPU request of the underlying Kubernetes Deployment (default: 0.1).
CPU limit of the underlying Kubernetes Deployment (default: not set).
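To illustrate how the memory and CPU defaults above could translate into the resources section of a Kubernetes container, here is a small Python sketch. The function `deployment_resources` and its parameter names are hypothetical; only the default values come from the list above, and how DataCater builds this section internally is an assumption.

```python
def deployment_resources(memory_request="300Mi", memory_limit="800Mi",
                         cpu_request="0.1", cpu_limit=None):
    """Return a Kubernetes-style resources dict using the documented defaults.

    The CPU limit is omitted entirely when it is not set, which matches
    the documented default of "not set".
    """
    resources = {
        "requests": {"memory": memory_request, "cpu": cpu_request},
        "limits": {"memory": memory_limit},
    }
    if cpu_limit is not None:
        resources["limits"]["cpu"] = cpu_limit
    return resources


if __name__ == "__main__":
    print(deployment_resources())               # defaults, no CPU limit
    print(deployment_resources(cpu_limit="1"))  # explicit CPU limit
```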
The following configuration options are related to DataCater's Python Runner, which is used for previewing and evaluating pipelines.
Name of the Kubernetes Service that makes the Python Runner pool accessible (default: pythonrunner).
Timeout in milliseconds for single requests to the Python Runner pool when previewing pipelines (default: 10000).
Name of the container image (default: datacater/python-runner).
Version/Tag of the container image (default: 2023.1).
The following configuration options are related to Streams, which are backed by Apache Kafka topics.
DataCater uses this default value for the number of partitions when creating an Apache Kafka topic for a Stream that does not specify a number of partitions (default: 3).
DataCater uses this default value for the replication factor when creating an Apache Kafka topic for a Stream that does not specify a replication factor (default: 1).
Timeout in milliseconds for all requests to Apache Kafka brokers (default: 5000).

Path to the local directory holding all filters. By default, filters are located at /datacater/filters in the container image.
Path to the local directory holding all transforms. By default, transforms are located at /datacater/transforms in the container image.
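The fallback behaviour described for Streams (use the configured default when a Stream does not specify its own partition count or replication factor) can be sketched as follows. The dictionary keys `partitions` and `replication_factor` and the function `resolve_topic_settings` are hypothetical; the actual shape of a Stream specification is not given here.

```python
DEFAULT_NUM_PARTITIONS = 3       # documented default
DEFAULT_REPLICATION_FACTOR = 1   # documented default


def resolve_topic_settings(stream_spec):
    """Return the topic settings for a Stream, falling back to the
    defaults for any value the Stream does not specify."""
    return {
        "partitions": stream_spec.get("partitions", DEFAULT_NUM_PARTITIONS),
        "replication_factor": stream_spec.get(
            "replication_factor", DEFAULT_REPLICATION_FACTOR
        ),
    }


if __name__ == "__main__":
    print(resolve_topic_settings({}))                  # both defaults apply
    print(resolve_topic_settings({"partitions": 12}))  # only replication factor falls back
```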