To run the Desktop connector automatically on a regular basis, one option is to use the Docker version of the Desktop connector and schedule a job in an orchestrator such as Kubernetes.
This page provides some useful information for setting up this configuration.
How Kubernetes itself works is outside the scope of this page. For any question related to your orchestrator, please refer to the documentation and the support team of your vendor (AWS EKS, Azure AKS, Red Hat OpenShift, etc.). In this example, we use the minikube distribution of Kubernetes.
As a prerequisite, you should have read the documentation about the Docker version of the connector. You will need the Docker image of the connector and the plugin corresponding to your data source.
The plugin can be built into the Docker image or mounted at run time. In this example, we use a mounted volume to provide the plugin to the container.
Check that the image and the configuration work well
Before scheduling the job on your orchestrator, it is wise to check that everything works properly by launching the connector with the docker run command. Using the example from the documentation on the Docker version of the connector (API mode), check that the following command runs as expected:
docker run -v /path/to/connector:/workdir -v /path/to/plugins:/extra-libs \
datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3 \
import-api \
--config /workdir/config/azuresql.properties \
--token /workdir/token/datagalaxy-token.properties \
--project-name Demo \
--source-name "DB AzureSQL" \
--password secret
After executing this command, the objects of the Azure SQL database should have been created in the DataGalaxy dictionary under the database "DB AzureSQL". Adapt this command to your own context.
Building the Kubernetes job description .yaml file
You're now ready to create the .yaml file that will be used to create the scheduled job in your orchestrator.
Here is an example of a file that can be used as a starting point for your configuration. Depending on your context and on your organization's Kubernetes usage policies, you may need to add further configuration elements to this file, but the statements shown here are the ones you will need in any case. This configuration has been tested in a minikube environment.
apiVersion: batch/v1
kind: Job
metadata:
  name: azuresql-connector
spec:
  template:
    spec:
      containers:
      - name: azuresql-connector
        image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
        imagePullPolicy: Never
        args:
        - import-api
        - --password
        - secret
        - --config
        - /workdir/config/azuresql.properties
        - --token
        - /workdir/token/datagalaxy-token.properties
        - --project-name
        - Demo
        - --source-name
        - DB AzureSQL
        - --create-source
        volumeMounts:
        - name: workdir
          mountPath: /workdir
        - name: plugin
          mountPath: /extra-libs/datagalaxy-plugin-azuresql-3.0.0.jar
      restartPolicy: Never
      volumes:
      - name: workdir
        hostPath:
          path: /minikube-host/path/to/connector
          type: Directory
      - name: plugin
        hostPath:
          path: /minikube-host/path/to/plugins/datagalaxy-plugin-azuresql-3.0.0.jar
          type: File
  backoffLimit: 0
The configuration with the volumes mounted locally (hostPath) is appropriate in a minikube testing environment, but is not a recommended solution in production.
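As a minimal sketch of what a setup without hostPath could look like, the volumes can reference PersistentVolumeClaims instead. This assumes that two claims, hypothetically named connector-workdir and connector-plugins, already exist in the namespace and contain the same files as the local directories. The claim names and the choice of storage are assumptions to adapt to your cluster, not a tested configuration.

```yaml
# Excerpt of spec.template.spec in the Job; all other fields stay unchanged.
containers:
- name: azuresql-connector
  image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
  volumeMounts:
  - name: workdir
    mountPath: /workdir
  - name: plugins
    mountPath: /extra-libs          # whole plugin directory, as in the docker run example
    readOnly: true
volumes:
- name: workdir
  persistentVolumeClaim:
    claimName: connector-workdir    # hypothetical claim name
- name: plugins
  persistentVolumeClaim:
    claimName: connector-plugins    # hypothetical claim name
```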
The password, which is an argument of the job, is given here in plain text in the configuration file. Storing it in a Kubernetes secret is more secure. The secret can then be exposed to the container as an environment variable, which in turn can be used as an argument of the job, as explained in the Kubernetes documentation.
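The following sketch illustrates this approach. The secret name azuresql-credentials, the key password, and the variable name DB_PASSWORD are hypothetical and chosen only for illustration. Kubernetes expands $(DB_PASSWORD) in args from the container's environment variables.

```yaml
# Secret holding the database password (names are hypothetical).
apiVersion: v1
kind: Secret
metadata:
  name: azuresql-credentials
type: Opaque
stringData:
  password: secret                  # placeholder value, replace with the real password
---
# Excerpt of spec.template.spec in the Job; all other fields stay unchanged.
containers:
- name: azuresql-connector
  image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
  env:
  - name: DB_PASSWORD               # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: azuresql-credentials
        key: password
  args:
  - import-api
  - --password
  - $(DB_PASSWORD)                  # expanded by Kubernetes before the container starts
  - --config
  - /workdir/config/azuresql.properties
  - --token
  - /workdir/token/datagalaxy-token.properties
  - --project-name
  - Demo
  - --source-name
  - DB AzureSQL
  - --create-source
```

A Secret manifest that contains the real password has the same exposure as the original file if it is committed to version control, so it is usually created separately from the job description.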
Running and supervising the job on Kubernetes with kubectl
To create the job on the orchestrator, apply the .yaml file, usually with the kubectl apply command.
kubectl apply -f azuresql-connector.yaml
To follow the logs of the execution, first find the pod corresponding to the job:
kubectl get pods --selector=job-name=azuresql-connector
Then follow the logs of the pod, replacing the example pod name with the one returned by the previous command:
kubectl logs azuresql-connector-abc12
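The Job described on this page runs once. To run it on a regular basis, one possibility is to wrap the same pod template in a CronJob. The sketch below is not part of the tested minikube configuration. The schedule (every day at 02:00) and the concurrencyPolicy value are assumptions to adapt, and the password is left in plain text only to mirror the Job example.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: azuresql-connector
spec:
  schedule: "0 2 * * *"             # assumed schedule: every day at 02:00
  concurrencyPolicy: Forbid         # assumption: skip a run if the previous one is still active
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: azuresql-connector
            image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
            imagePullPolicy: Never  # as in the minikube Job example
            args:
            - import-api
            - --password
            - secret                # placeholder, as in the Job example
            - --config
            - /workdir/config/azuresql.properties
            - --token
            - /workdir/token/datagalaxy-token.properties
            - --project-name
            - Demo
            - --source-name
            - DB AzureSQL
            - --create-source
            volumeMounts:
            - name: workdir
              mountPath: /workdir
            - name: plugin
              mountPath: /extra-libs/datagalaxy-plugin-azuresql-3.0.0.jar
          volumes:
          - name: workdir
            hostPath:
              path: /minikube-host/path/to/connector
              type: Directory
          - name: plugin
            hostPath:
              path: /minikube-host/path/to/plugins/datagalaxy-plugin-azuresql-3.0.0.jar
              type: File
```

Each scheduled run creates a new Job whose name is derived from the CronJob name, so the pods of a given run can be found by listing the jobs first and then selecting on the corresponding job-name label.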