
DataGalaxy CSV connector in command line (CLI)

This article explains how to use the Desktop command line connector to import CSV files in DataGalaxy format into the different modules.

For more information about running the connector from the command line, see the following article: Running the connector from the command line (CLI)
This functionality is available by default in the desktop connector, so there is no need to download a plugin to use it.

The DataGalaxy CSV connector allows you to import metadata CSV files in DataGalaxy format, such as those generated during an export from the platform. This connector runs only in CLI mode (no graphical interface).

DataGalaxy CSV file structure

To import objects into the different modules, the content of the source CSV file must respect certain constraints, which will depend on the type of object targeted. Importing basic objects of the Glossary, Dictionary, Processing and Uses modules expects a generic structure, with the mandatory presence of the following columns:

  • Path: the full path of the object in DataGalaxy
  • Type: the type of each object in the Path

In addition to these mandatory columns, you can add one column for each additional attribute to import. A column is taken into account as long as the technical name of the targeted attribute is correctly identified and the attribute is part of the corresponding object card (see the documentation on object cards for more information).

The header of your source CSV file must contain the exact names of the target attributes you wish to import. If a name does not match any attribute in your workspace, the corresponding column is ignored during the import.
The following attribute types are not currently supported for import: Time Series, Value List, Tag List and Hierarchy.
If the same object appears several times in a CSV import (in one or several files), only the first occurrence is processed. Any subsequent entries are ignored, even if they contain additional or different information.
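
Because only the first occurrence of an object is processed, it can be useful to look for duplicates before running an import. The Python sketch below is one possible pre-check, not part of the connector. It assumes that an object is identified by its (Path, Type) pair (the article does not state the exact matching rule), that the files use the default semicolon separator and UTF-8 encoding, and that files are scanned in alphabetical order (the connector's own processing order is not documented here). The function name find_duplicates is hypothetical.

```python
import csv
from pathlib import Path


def find_duplicates(folder, delimiter=";", encoding="utf-8"):
    """Return (path, type, file name, line number) for each repeated object.

    Assumption: an object is identified by its (Path, Type) pair.
    """
    seen = set()
    duplicates = []
    # Sort file names so that the scan order is reproducible.
    for csv_file in sorted(Path(folder).glob("*.csv")):
        with open(csv_file, newline="", encoding=encoding) as handle:
            reader = csv.DictReader(handle, delimiter=delimiter)
            for row in reader:
                if "Path" not in row or "Type" not in row:
                    break  # not a basic-object file (keys, links, ...)
                key = (row["Path"], row["Type"])
                if key in seen:
                    # line_num is the physical line where this row ends
                    duplicates.append((*key, csv_file.name, reader.line_num))
                else:
                    seen.add(key)
    return duplicates
```

An empty result means that no (Path, Type) pair is repeated across the basic-object files of the folder.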

Example of CSV file structure for the Glossary module:

"Path";"Type";"Status";"DataOwners";"Summary";"CDPEtiquetteGlossaire"
"\Age";"\Indicator";"Proposed";"dataowner@datagalaxy.com";"client's age";"Sales"
"\Marketing\Commerce\Civility";"\Universe\Concept\Business Term";"Valided";"dataowner@datagalaxy.com";"";"Adresse"
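
A file with this structure can be produced with any CSV library. As an illustration only, the Python sketch below writes the header and the first row of the example using the standard csv module, with semicolon-separated, fully quoted fields and UTF-8 encoding to match the example and the connector defaults. The function name write_datagalaxy_csv and the output file name are hypothetical.

```python
import csv

# Header and first row taken from the Glossary example above.
HEADER = ["Path", "Type", "Status", "DataOwners", "Summary", "CDPEtiquetteGlossaire"]
ROWS = [
    ["\\Age", "\\Indicator", "Proposed", "dataowner@datagalaxy.com", "client's age", "Sales"],
]


def write_datagalaxy_csv(file_path, header, rows):
    """Write rows as a semicolon-separated, fully quoted, UTF-8 CSV file."""
    with open(file_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(
            handle,
            delimiter=";",          # default separator expected by the connector
            quoting=csv.QUOTE_ALL,  # quote every field, as in the example
            lineterminator="\n",
        )
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    write_datagalaxy_csv("glossary.csv", HEADER, ROWS)
```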

In addition to the basic objects, other types of objects can also be imported with this connector. To be correctly identified and created, each file must use a precise header so that the connector can deduce the targeted object type and determine the order in which the imports are performed (e.g. foreign keys must be created after primary keys).

The import of sources and functional keys is not currently supported by the connector.

Here is the list of additional object types and their headers to be used in CSV files:

  • Primary Keys:
"TablePath";"TableType";"ColumnName";"PKTechnicalName";"PkOrder"
  • Foreign Keys:
"FKTechnicalName";"PKTechnicalName";"PkTablePath";"PkTableType";"PkTableTechnicalName";"PkColumnName";"ChildTablePath";"ChildTableType";"ChildTableName";"ColumnName";"FKDisplayName";"Summary"
  • Input elements:
"DataProcessingPath";"DataProcessingType";"CatalogInPath";"CatalogInType"
  • Output elements:
"DataProcessingPath";"DataProcessingType";"CatalogOutPath";"CatalogOutType"
  • Data Processing Items:
"DataProcessingPath";"DataProcessingType";"Technical Label";"Functional Label";"Type";"Summary";"Description";"CatalogInPath";"CatalogInType";"CatalogOutPath";"CatalogOutType"
  • Links (notion of linked objects):
"SourceEntityPath";"SourceEntityType";"EntityLinkType";"LinkedEntityPath";"LinkedEntityType"
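
Since the object type is deduced from the header, a quick check of each file's first line can catch a misspelled column before the import. The Python sketch below compares a header with the ones listed above. It is only an illustration: the connector's actual detection logic is not described in this article, and the exact, case-sensitive, order-independent match used here is an assumption. The names KNOWN_HEADERS, classify_header and read_header are hypothetical.

```python
import csv

# Headers copied from the list above.
KNOWN_HEADERS = {
    "Primary Keys": ["TablePath", "TableType", "ColumnName", "PKTechnicalName", "PkOrder"],
    "Foreign Keys": ["FKTechnicalName", "PKTechnicalName", "PkTablePath", "PkTableType",
                     "PkTableTechnicalName", "PkColumnName", "ChildTablePath",
                     "ChildTableType", "ChildTableName", "ColumnName",
                     "FKDisplayName", "Summary"],
    "Input elements": ["DataProcessingPath", "DataProcessingType",
                       "CatalogInPath", "CatalogInType"],
    "Output elements": ["DataProcessingPath", "DataProcessingType",
                        "CatalogOutPath", "CatalogOutType"],
    "Data Processing Items": ["DataProcessingPath", "DataProcessingType",
                              "Technical Label", "Functional Label", "Type", "Summary",
                              "Description", "CatalogInPath", "CatalogInType",
                              "CatalogOutPath", "CatalogOutType"],
    "Links": ["SourceEntityPath", "SourceEntityType", "EntityLinkType",
              "LinkedEntityPath", "LinkedEntityType"],
}


def classify_header(header):
    """Return the object type matching a header, or None if unrecognized."""
    columns = set(header)
    for label, expected in KNOWN_HEADERS.items():
        if columns == set(expected):  # assumption: exact set of columns
            return label
    if {"Path", "Type"} <= columns:   # generic structure plus attribute columns
        return "Basic objects"
    return None


def read_header(csv_path, delimiter=";", encoding="utf-8-sig"):
    """Read the first line of a CSV file as a list of column names."""
    with open(csv_path, newline="", encoding=encoding) as handle:
        return next(csv.reader(handle, delimiter=delimiter), [])
```

For example, classify_header(read_header("keys.csv")) returns None when a column name of a primary key file is misspelled, which signals that the file would probably not be recognized as intended.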

Running the DataGalaxy CSV connector

The DataGalaxy CSV connector runs only from the command line (CLI) and imports data via the API.

The following base command line can be used to import CSV files stored in a directory:

datagalaxy-cli-connector.bat import-api --csv-folder-path [path\folder] --token [token_file.properties] --project-name [Workspace]

When importing into the Dictionary, additional parameters may be required:

--create-source: creates the source if it does not already exist

--csv-technology-code: technology code used when a new source is created

--source-name [Dictionary Source]: name of the source (whether it already exists or is created by the connector)

--csv-source-type [Source Type]: type of the source. This parameter can take one of the following values:

  • Relational : corresponds to relational databases
  • NonRelational : corresponds to storage on a file system (filestore), used for example for DataLake
  • NoSQL : corresponds to NoSQL databases
  • TagBase : corresponds to TagBase databases, mainly used for IoT
A CSV import can feed only one data source per execution (the one given as a parameter).

Examples of commands

Basic import

This first example imports all CSV files stored in an "Import" folder in order to load all DataGalaxy modules.

To start, consider the following basic Windows command line:

datagalaxy-cli-connector.bat import-api

It must be completed as follows to indicate the context of the import to be performed:

  • The --csv-folder-path option gives the path of the folder containing the CSV files to import into DataGalaxy
  • The --token option points to the "C:\token\token.properties" file, which stores the token value
  • The target of the import is the workspace identified by the --project-name parameter (ESP-PROD)
  • The Dictionary data source that receives the result of this execution is named File-AWS-S3 and is of type NonRelational (FileStore)
  • With the --create-source parameter, the source is created if it does not exist

Once all these parameters are identified, we get the following command line:

datagalaxy-cli-connector.bat import-api --csv-folder-path "C:\Documents\Import" --token "C:\token\token.properties" --project-name ESP-PROD --create-source --source-name File-AWS-S3 --csv-source-type NonRelational

Now the command line is ready to be executed!
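
When the same import is launched from a script, the command can be assembled from its parts, which makes it easy to reject an invalid source type before the connector is called. The Python sketch below only uses options described in this article. The function name build_import_command and the constant VALID_SOURCE_TYPES are hypothetical, and the sketch prints the command rather than executing it.

```python
# Source types listed in this article for --csv-source-type.
VALID_SOURCE_TYPES = {"Relational", "NonRelational", "NoSQL", "TagBase"}


def build_import_command(csv_folder, token_file, workspace,
                         source_name=None, source_type=None, create_source=False):
    """Return the connector command as a list of arguments."""
    command = [
        "datagalaxy-cli-connector.bat", "import-api",
        "--csv-folder-path", csv_folder,
        "--token", token_file,
        "--project-name", workspace,
    ]
    if create_source:
        command.append("--create-source")
    if source_name is not None:
        command += ["--source-name", source_name]
    if source_type is not None:
        if source_type not in VALID_SOURCE_TYPES:
            raise ValueError(f"Unknown source type: {source_type}")
        command += ["--csv-source-type", source_type]
    return command


if __name__ == "__main__":
    # Same values as in the basic import example.
    print(build_import_command(
        r"C:\Documents\Import", r"C:\token\token.properties", "ESP-PROD",
        source_name="File-AWS-S3", source_type="NonRelational", create_source=True,
    ))
```

The returned list can then be passed to subprocess.run from the folder that contains the connector script.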

Import with options

This second example shows how to define the separator and encoding of the CSV files stored in an "Import" folder.

By default, encoding is UTF-8 and separator is set to semicolon (;).

The following command line works under Windows, and will import CSV files from the folder "C:\Documents\Import".

These files use a comma as separator and ANSI encoding (passed as ISO_8859_1 in the command).

The import is performed via the API, using the token contained in the "C:\token\token.properties" file.

The target of the import is the workspace named ESP-PROD and, as no CSV file targets the Dictionary module, it is not necessary to include the parameters specific to the Source object.

datagalaxy-cli-connector.bat import-api --csv-folder-path "C:\Documents\Import" --token "C:\token\token.properties" --project-name ESP-PROD --csv-separator "," --csv-encoding ISO_8859_1
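
An alternative to passing --csv-separator and --csv-encoding is to convert the files to the connector defaults (semicolon, UTF-8) beforehand. The Python sketch below illustrates this, assuming the source files are comma-separated and encoded in ISO-8859-1 as in this example. The function name convert_to_defaults and the file names are hypothetical.

```python
import csv


def convert_to_defaults(source_path, target_path,
                        source_delimiter=",", source_encoding="iso-8859-1"):
    """Rewrite a CSV file with a semicolon separator and UTF-8 encoding."""
    with open(source_path, newline="", encoding=source_encoding) as src, \
         open(target_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=source_delimiter)
        writer = csv.writer(dst, delimiter=";", quoting=csv.QUOTE_ALL,
                            lineterminator="\n")
        for row in reader:
            writer.writerow(row)  # values are unchanged, only the format differs


if __name__ == "__main__":
    convert_to_defaults("glossary_ansi.csv", "glossary_utf8.csv")
```

Converted files can then be imported with the base command, without the separator and encoding options.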

Launch multiple executions of the connector

If you need to chain executions of the connector with different configurations, you can create a script (.bat on Windows or .sh on Linux) that calls the connector script.

On Windows, be sure to use the call command in your .bat file. Otherwise, the script exits after the first connector call.

To chain three executions of the connector on Windows, you could create a script that looks like this:

call datagalaxy-cli-connector.bat import-api --config "config1.properties" ...
call datagalaxy-cli-connector.bat import-api --config "config2.properties" ...
call datagalaxy-cli-connector.bat import-api --config "config3.properties" ...
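
If a scripting language is available on the machine, the same chaining can be done with error handling, for example to stop at the first failed import. The Python sketch below is one way to do it. It assumes the connector returns a non-zero exit code on failure (not confirmed by this article), and it only passes the --config option: any further options, elided as "..." in the script above, would have to be appended to the command. The names CONFIG_FILES and run_imports are hypothetical.

```python
import subprocess

# Configuration files from the example script above.
CONFIG_FILES = ["config1.properties", "config2.properties", "config3.properties"]


def run_imports(config_files, runner=subprocess.run):
    """Run one import per configuration file and stop at the first failure.

    Returns 0 if every execution succeeded, otherwise the first non-zero
    exit code. The runner parameter allows a dry run or a test double.
    """
    for config in config_files:
        command = ["datagalaxy-cli-connector.bat", "import-api", "--config", config]
        result = runner(command)
        # Assumption: a non-zero exit code means the import failed.
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling run_imports(CONFIG_FILES) from the connector folder on Windows launches the three imports in sequence. Unlike the .bat script, no call keyword is needed because each execution is a separate process.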

Releases

Date | Plugin version | DataGalaxy release | Desktop Connector version | Description
23/07/2024 | 5.0.1 | v3.61.0 | 5.0.3 | Migrated from Java 11 to Java 17 + CVE fixes
