Running the Desktop connector in an orchestrator like Kubernetes (k8s)

Modified on: Thu, 4 May, 2023 at 9:31 AM

To run the Desktop connector automatically on a regular basis, one option is to use the Docker version of the connector and schedule a job in an orchestrator such as Kubernetes.

This page provides useful information for setting up this configuration.

How Kubernetes itself works is outside the scope of this page. For any question about your orchestrator, refer to the documentation and support team of your vendor (AWS EKS, Azure AKS, Red Hat OpenShift, etc.). This example uses the minikube distribution of Kubernetes.

As a prerequisite, you should have read the documentation about the Docker version of the connector. You will need the Docker image of the connector and the plugin corresponding to your data source.

The plugin can be built into the Docker image or mounted at run time. This example uses a mounted volume to provide the plugin to the container.

Check that the image and the configuration work well

Before scheduling the job on your orchestrator, verify that everything works by launching the connector with the docker run command. Using the example from the documentation about the Docker version of the connector (API mode), check that the following command runs as expected:

docker run -v /path/to/connector:/workdir -v /path/to/plugins:/extra-libs \
      datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3 \
      import-api \
      --config /workdir/config/azuresql.properties \
      --token /workdir/token/datagalaxy-token.properties \
      --project-name Demo \
      --source-name "DB AzureSQL" \
      --password secret

After executing this command, the objects of the Azure SQL database should have been created in the DataGalaxy dictionary under the database "DB AzureSQL". Adapt the command to your own context (paths, project name, source name, password).

Building the Kubernetes job description .yaml file

You're now ready to create the .yaml file that will be used to create the scheduled job in your orchestrator.

Here is an example file that can be used as a starting point for your configuration. Depending on your context and on your organization's Kubernetes usage policies, you may need to add further configuration elements, but the statements shown here are the ones you will need in every case. This configuration has been tested in a minikube environment.

apiVersion: batch/v1
kind: Job
metadata:
  name: azuresql-connector
spec:
  template:
    spec:
      containers:
      - name: azuresql-connector
        image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
        # Never pull: the image must already be present on the node (for example, loaded into minikube)
        imagePullPolicy: Never
        args:
        - import-api
        - --password
        - secret
        - --config
        - /workdir/config/azuresql.properties
        - --token
        - /workdir/token/datagalaxy-token.properties
        - --project-name
        - Demo
        - --source-name
        - DB AzureSQL
        - --create-source
        volumeMounts:
          - name: workdir
            mountPath: /workdir
          - name: plugin
            mountPath: /extra-libs/datagalaxy-plugin-azuresql-3.0.0.jar
      restartPolicy: Never
      volumes:
        - name: workdir
          hostPath:
            path: /minikube-host/path/to/connector
            type: Directory
        - name: plugin
          hostPath:
            path: /minikube-host/path/to/plugins/datagalaxy-plugin-azuresql-3.0.0.jar
            type: File
  backoffLimit: 0

The configuration with the volumes mounted locally (hostPath) is appropriate in a minikube testing environment, but is not a recommended solution in production.
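As an illustration of one alternative to hostPath, the sketch below mounts the same two directories from PersistentVolumeClaims. It is an untested starting point, not a recommended setup: the claim names connector-workdir and connector-plugins are hypothetical and the claims are assumed to exist already and to contain the configuration, token, and plugin files. The plugin volume is mounted as a directory on /extra-libs, as in the docker run command, and imagePullPolicy is left at its default on the assumption that the image is pulled from a registry.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: azuresql-connector
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: azuresql-connector
        image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
        args:
        - import-api
        - --password
        - secret
        - --config
        - /workdir/config/azuresql.properties
        - --token
        - /workdir/token/datagalaxy-token.properties
        - --project-name
        - Demo
        - --source-name
        - DB AzureSQL
        - --create-source
        volumeMounts:
          - name: workdir
            mountPath: /workdir
          - name: plugins
            mountPath: /extra-libs
      volumes:
        # Hypothetical claim holding the config/ and token/ directories
        - name: workdir
          persistentVolumeClaim:
            claimName: connector-workdir
        # Hypothetical claim holding the plugin .jar files
        - name: plugins
          persistentVolumeClaim:
            claimName: connector-plugins
            readOnly: true
```

Which storage class backs these claims depends on your cluster, so check with your Kubernetes administrators before relying on this layout.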

The password, passed as a job argument, appears here in plain text in the configuration file. Storing it in a Kubernetes Secret is more secure: the secret can be exposed to the container as an environment variable, which can then be referenced in the job arguments, as explained in the Kubernetes documentation.
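A minimal sketch of that approach is shown below. The Secret name azuresql-credentials, its key password, and the environment variable SOURCE_PASSWORD are hypothetical names chosen for this illustration. Kubernetes replaces $(SOURCE_PASSWORD) in args with the value of the environment variable when the container starts. The volumes are the hostPath volumes of the minikube example.

```yaml
# Hypothetical Secret holding the data source password
apiVersion: v1
kind: Secret
metadata:
  name: azuresql-credentials
type: Opaque
stringData:
  password: secret   # replace with the real password
---
apiVersion: batch/v1
kind: Job
metadata:
  name: azuresql-connector
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: azuresql-connector
        image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
        imagePullPolicy: Never
        env:
        # Expose the Secret value as an environment variable (assumed name)
        - name: SOURCE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: azuresql-credentials
              key: password
        args:
        - import-api
        - --password
        # Expanded by Kubernetes from the environment variable above
        - $(SOURCE_PASSWORD)
        - --config
        - /workdir/config/azuresql.properties
        - --token
        - /workdir/token/datagalaxy-token.properties
        - --project-name
        - Demo
        - --source-name
        - DB AzureSQL
        - --create-source
        volumeMounts:
          - name: workdir
            mountPath: /workdir
          - name: plugin
            mountPath: /extra-libs/datagalaxy-plugin-azuresql-3.0.0.jar
      volumes:
        - name: workdir
          hostPath:
            path: /minikube-host/path/to/connector
            type: Directory
        - name: plugin
          hostPath:
            path: /minikube-host/path/to/plugins/datagalaxy-plugin-azuresql-3.0.0.jar
            type: File
```

The Secret is written inline here only to keep the example in one file. In practice it would usually be created separately so that the password never appears in a stored manifest.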

Running and supervising the job on Kubernetes with kubectl

To create the job on the orchestrator, apply the .yaml file, usually with the kubectl apply command:

kubectl apply -f azuresql-connector.yaml

To follow the execution, first find the pod corresponding to the job:

kubectl get pods --selector=job-name=azuresql-connector

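Then display the logs of that pod, replacing azuresql-connector-abc12 with the pod name returned by the previous command (add the -f option to stream the logs while the job is running):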

kubectl logs azuresql-connector-abc12
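The Job described on this page runs once each time it is applied. For execution on a regular basis, the same pod template can be wrapped in a CronJob. The sketch below is an untested starting point: the schedule (every day at 02:00) and the concurrencyPolicy value are assumptions to adapt, and the batch/v1 CronJob API requires Kubernetes 1.21 or later.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: azuresql-connector
spec:
  # Assumed schedule: every day at 02:00 (standard cron syntax)
  schedule: "0 2 * * *"
  # Assumption: do not start a new run while the previous one is still running
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: azuresql-connector
            image: datagalaxy.azurecr.io/public/docker/connector-cli:3.2.3
            imagePullPolicy: Never
            args:
            - import-api
            - --password
            - secret
            - --config
            - /workdir/config/azuresql.properties
            - --token
            - /workdir/token/datagalaxy-token.properties
            - --project-name
            - Demo
            - --source-name
            - DB AzureSQL
            - --create-source
            volumeMounts:
              - name: workdir
                mountPath: /workdir
              - name: plugin
                mountPath: /extra-libs/datagalaxy-plugin-azuresql-3.0.0.jar
          volumes:
            - name: workdir
              hostPath:
                path: /minikube-host/path/to/connector
                type: Directory
            - name: plugin
              hostPath:
                path: /minikube-host/path/to/plugins/datagalaxy-plugin-azuresql-3.0.0.jar
                type: File
```

This file is applied with kubectl apply in the same way as the Job. Each scheduled run then creates its own Job and pod, whose logs can be read as described above.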
