Submit a ticket My tickets
Welcome
Login  Sign up

Snowflake Connector

This article explains how to use the Snowflake connector for DataGalaxy.

This connector is available in the following modes:

Desktop mode ✅SaaS Online mode ✅

This connector supports the following import modes:

Standard mode ✅URN mode

Technology

Created in 2012, Snowflake is a SaaS offering a unique platform for data warehousing, data lakes, data engineering, data science, data application development, and the secure sharing and use of real-time/shared data.

Scope, attributes and mapping with DataGalaxy

Objects

Some of the attributes listed here may not be present by default in your objects' screens configuration. To make them appear in DataGalaxy screens, it may be necessary to adapt the screens of the concerned objects before running the connector. See this article to learn more about screen customization. 

As described in details below, the process of listing Snowflake objects uses the Snowflake Account Usage views. A latency of two hours has to be taken into account between the changes made in Snowflake and their visibility in these views. It impacts directly the representation of the Snowflake objects in DataGalaxy. More information here.

Account

A Snowflake Account is represented by a Dictionary source (Relational DB).

The URN follows this syntax:

urn:snowflake-1:accountName

The following attributes are retrieved from the connection configuration:

DataGalaxy attributeSource/Value
Technical nameAccount name

Database

A Database is represented by a Model.

The URN follows this syntax:

urn:snowflake-1:accountName:database

The following attributes are retrieved from the SNOWFLAKE.ACCOUNT_USAGE.DATABASES view:

DataGalaxy attributeSource/Value
Technical nameDATABASE_NAME

Schema

A Schema is represented by a Model.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema

The following attributes are retrieved from the SNOWFLAKE.ACCOUNT_USAGE.SCHEMATA view (<SHARED_DATABASE>.INFORMATION_SCHEMA.SCHEMATA for shared databases):

DataGalaxy attributeSource/Value
Technical nameSCHEMA_NAME
DescriptionCOMMENT
External IdSCHEMA_ID
Creation date of the source objectCREATED
Last modification date of the source objectLAST_ALTERED

The following attributes are calculated:

DataGalaxy attributeSource/Value
Current storage sizeSum of Tables size
Historical storage sizeSum of Tables size at the date of connector's run
Storage size unit"bytes"

Table

A Table is represented by a Table.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema:table

The following attributes are retrieved from the SNOWFLAKE.ACCOUNT_USAGE.TABLES view (<SHARED_DATABASE>.INFORMATION_SCHEMA.TABLES for shared databases):

DataGalaxy attributeSource/Value
Technical nameTABLE_NAME
DescriptionCOMMENT
Current storage sizeBYTES
Historical storage sizeBYTES at the date of connector's run
Storage size unit"bytes"
Number of recordsROW_COUNT
Historical number of recordsROW_COUNT at the date of connector's run

View

A View is represented by a View.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema:view@view

The following attributes are retrieved from the SNOWFLAKE.ACCOUNT_USAGE.TABLES and SNOWFLAKE.ACCOUNT_USAGE.VIEWS views (<SHARED_DATABASE>.INFORMATION_SCHEMA.TABLES and <SHARED_DATABASE>.INFORMATION_SCHEMA.VIEWS for shared databases):

DataGalaxy attributeSource/Value
Technical nameTABLE_NAME
DescriptionCOMMENT
QueryVIEW_DEFINITION

Dynamic table

A Dynamic Table is represented by a Table.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema:dtable@dynamictable

WARNING: outside of these 3 retrieved types (Table, View and Dynamic Table), other table types are not yet retrieved by the connector (EXTERNAL TABLE and MATERIALIZED VIEW for example).

The following attributes are retrieved from the SNOWFLAKE.ACCOUNT_USAGE.TABLES view (and <SHARED_DATABASE>.INFORMATION_SCHEMA.TABLES for shared databases):

DataGalaxy attributeSource/Value
Technical nameTABLE_NAME
DescriptionCOMMENT
QueryGET_DDL()
Current storage sizeBYTES
Historical storage sizeBYTES at the date of connector's run
Storage size unit"bytes"
Number of recordsROW_COUNT
Historical number of recordsROW_COUNT at the date of connector's run

Column

A Column is represented by a Column.

The URN follows this syntax:

urn:snowflake-1:accountName:database:table:column

The following attributes are retrieved fom the SNOWFLAKE.ACCOUNT_USAGE.COLUMNS view (and <SHARED_DATABASE>.INFORMATION_SCHEMA.COLUMNS for shared databases):

DataGalaxy attributeSource/Value
Technical nameCOLUMN_NAME
DescriptionCOMMENT
External IdCOLUMN_ID
TypeDATA_TYPE
OrderORDINAL_POSITION
MandatoryIS_NULLABLE is not "YES"

Primary key

A Primary key is represented by a Primary key.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema:table:pk@primarykey

Primary keys are retrieved using SHOW PRIMARY KEYS statement.

Foreign key

A Foreign key is represented by a Foreign key.

The URN follows this syntax:

urn:snowflake-1:accountName:database:schema:table:fk@foreignkey

Foreign keys are retrieved using SHOW IMPORTED KEYS statement.

Tags bi-directional synchronization

The bi-directional tags synchronization mechanism allows to retrieve tags from Snowflake objects in the DataGalaxy "External Tags" attribute, and to push the tags from the DataGalaxy "Tags" attribute to Snowflake objects. 

The assignment of tags to objects in Snowflake is retrieved from the SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES view.
Please note that as DataGalaxy doesn't support key|value tags yet but only labels, the tags in DataGalaxy will be created by concatenating the tag name and the tag value from Snowflake.

DataGalaxy attributeSource/Value
External tagsTAG_NAME | TAG_VALUE

In Snowflake, the tags are first created in a specific schema to be provided to the connector (the only one for which the connector needs write permissions for the CREATE TAGS statement) and then applied to each object.

Please note that as DataGalaxy doesn't support key|value tags yet but only labels, the tags in Snowflake will be created using only the tag name from DataGalaxy.

DataGalaxy attributeTarget
TagsTAG_NAME

Links

The links created by the Snowflake connector are data lineage links between structures in the Dictionary, at table or column level depending on the connector's configuration. The lineage links are retrieved from Snowflake using the SNOWFLAKE.CORE.GET_LINEAGE() function and are represented in DataGalaxy with Uses/Is Used By links.

In URN mode, when running a single import (mono-db), only information related to the imported database will we retrieved. Links towards objects coming from other databases won't be retrieved in DataGalaxy. 

Detailed scope

    Input

  • Database, schema, table and view 

    In the “Worksheets” tab of the classic console, these elements appear on the left 

  • Column

  • Database, schema, table and shared view

       Location is the same as for "classic" tables, but these databases are distinguished by a small arrow to the left of the database symbol (for example here, “SNOWFLAKE” and “SNOWFLAKE_SAMPLE_DATA”)

  • Shared column

        These columns are found in the same location as classic columns 

  • Primary Key, Foreign Key

    This information is visible using the following queries (example result for primary keys below):

SHOW PRIMARY KEYS IN DATABASE "DEMO_DB";
SHOW IMPORTED KEYS IN DATABASE "DEMO_DB";

For these elements to appear in the "Worksheets" space of Snowflake, the role used must have the necessary permissions. This role is visible at the top right of the window. It is also a good way to test if the custom role you will create for the use of the connector has the necessary permissions (exemple with the role "DATAGALAXY_CONNECTOR_ROLE" here)

    Output

  • Source, schema, table, view and column 

  • Primary key, foreign key

Connexion configuration

    On the Snowflake side

In order to extract the first level of metadata, a database role defined by the Snowflake platform called OBJECT_VIEWER is sufficient. We recommand you to associate this default role to a custom role like this:

  • Creation of a dedicated role, assigning to this role the default role OBJECT_VIEWER, assigning the necessary permissions on a warehouse to this role, creating a dedicated user and assigning the role as well as a default warehouse to this user
    To perform these actions, please execute the following commands in the “Worksheets” tab of Snowflake, replacing values between <> (example below):
// Creation of a dedicated role
create or replace role <datagalaxy_role_name>;

// Assigning default Snowflake database role to this role
grant database role SNOWFLAKE.OBJECT_VIEWER to role <datagalaxy_role_name>;

// Assigning necessary permissions on a warehouse to this role
grant usage on warehouse "<warehouse_name>" to role <datagalaxy_role_name>;

// Creation of a dedicated user and assigning the role as well as a default warehouse to this user (do not put quotes around the default role!)
create user <datagalaxy_user_name> password="<datagalaxy_user_password>" default_role = <datagalaxy_role_name> default_warehouse = "<warehouse_name>";

// Assigning the role to the user
grant role <datagalaxy_role_name> to user <datagalaxy_user_name>;

Do not forget to check "All queries" next to the "Run" button to execute all queries at once, nor to have sufficient rights to perform these actions (via the ACCOUNTADMIN role in general). Once these are performed the statement "Statement executed successfully" must appear in the result window.

  • Level 1: Database, schema, table, view, dynamic table and column

        Once the role was created there is no need for extra access in order to retrieve your level 1 metadata, whether is is a single or multi database import, as long as none of them is a shared database.

SQL
  • Option 1: Shared database, schema, table, view, dynamic table and column

    To retrieve metadata from shared databases, you must give explicit access to them to the role via the following command:

grant imported privileges on all tables IN DATABASE "<shared_database_name>" to role <datagalaxy_role_name>;





SQL
It is currently not possible on Snowflake to more finely manage rights on shared databases, for example by giving usage rights only on the necessary views for the execution of the connector contained in the "ACCOUNT_USAGE" schema of the shared database "SNOWFLAKE". However, a workaround is available here for those who wish
  • Option 2: Primary key, foreign key

        In addition to these rights, when you wish to send primary and foreign keys, it is necessary to have certain rights directly on the schemas and tables of the databases using the following commands:

grant MONITOR on all schemas in DATABASE "<database_name>" to role <datagalaxy_role_name>; 
grant SELECT on all tables in DATABASE "<database_name>" to role <datagalaxy_role_name>;
SQL
  • Option 3: Dynamic Table creation requests

       In addition to these rights, when you wish to retrieve dynamic tables creation requests it is necessary to have certain rights directly on the schemas and dynamic tables of the databases using the following commands:

grant MONITOR on all schemas in DATABASE "<database_name>" to role <datagalaxy_role_name>; 
grant MONITOR on all dynamic tables in DATABASE "<database_name>" to role <datagalaxy_role_name>;
  • Option 4: Governance features

       In addition to these rights, when you wish to retrieve Snowflake tag references in non shared databases, the connector will need the SNOWFLAKE.GOVERNANCE_VIEWER database role. This can be done with the following query: 

GRANT DATABASE ROLE SNOWFLAKE.GOVERNANCE_VIEWER TO ROLE <ACCOUNT_ROLE>;
  • Option 5: Lineage retrieval
      In addition to these rights, when you wish to retrieve the lineage for tables, views, dynamic tables and columns, the connector will need the following priviliges:

- VIEW LINEAGE on the account.

- Any privilege on the objects for which you want to retrieve the lineage for (e.g. SELECT on a table).

- USAGE on the databases and schemas that contain the objects for which you would like to retrieve the lineage for.

You can find more information about the necessary priviliges for lineage retrieval in Snowflake's documentation here.

Any trouble configuring your connexion? Please have a look at our troubleshooting page for Snowflake!

    Summary

Snowflake objectDataGalaxy objectDataGalaxy attributeLevel 1Option 1Option 2Option 4Option 5
DatabaseSource (database)


Database (shared)Source (database)


SchemaSchema (container)


Schema (shared)Schema (container)



TableTable (structure)

Table (shared)Table (structure)


ViewView (structure)

View (shared)View (structure)


Dynamic TableView (structure)

Dynamic Table (shared)View (structure)



ColumnColumn (field)

Column (shared)Column (field)



Primary KeyPrimary Key


Foreign KeyPrimary Key



Snowflake tag references
External tags



    On the DataGalaxy side

The following information is required to set up a connection:

Parameter
Mandatory
Description
Example
Account nameYesThe name of the Snowflake account as described on the following page: Where are Account Identifiers Used?
Example: account_identifier.snowflakecomputing.com
Only the "account_identifier" part must be specified in the connection form.
eh09718.west-europe.azure
WarehouseYesThe name of the Snowflake data warehouseCOMPUTE_WH
Single import / Multiple databases importYesImport one or multiple databases 
Database (only for "Single import" mode)YesLimits the scope to the specified databaseDEMO_DB
Schemas (only for "Single import" mode)NoLimits the scope to the specified schemaPUBLIC
Databases (only for "Multiple databases" mode)NoLimits the scope to the specified databases. If empty, all databases will be retrieved.BANK,PARTNER
RoleNoLimits the scope to the specified roleACCOUNTADMIN
UserYesSnowflake user
Basic / Key Pair authenticationYesAuthentication mode
Password (for the "Basic" authentication mode)Yes 

Private key path (for the "Key Pair" autentication mode)YesThe Desktop Connector allows you to perform a "Pair of keys" authentication. The following article explains how to generate the private key needed for this authentication mode: Key Pair Authentication & Key Pair Rotation
Private key passphrase (for the "Key Pair" autentication mode) Yes 
Option 1: Retrieve information from shared databasesNoOption that allows you to retrieve shared databases information, see documentation for necessary grants
Option 2: Get PKs and FKsNoOption that allows you to retrieve primary keys and foreign keys linked to columns, see documentation for necessary grants
Option 3: Get Dynamic Tables queriesNoOption that allows you to retrieve Dynamic Table creation queries, see documentation for necessary grants
Option 4: Enable Governance featuresNoOption that enables the features requiring the  SNOWFLAKE.GOVERNANCE_VIEWER database role. These features are:
- Snowflake tag references retrieval for non shared databases

Option 5:
Get lineage*
NoOption that retrieves the lineage of tables, views and dynamic tables OR columns, based on the granularity option that was picked ("Table" or "Column"). This option is only available for URN imports.

WARNING: Some links only exist at "Table" granularity while others only exist at "Column" granularity. If some of the links you expected are not imported in DataGalaxy it might be necessary to perform 2 imports, one by granularity option available.

 Advanced mode (Desktop connector option) NoOption to customize the JDBC URL used to connect
⚠ Retrieving the lineage can increase the duration of the connector and generates more compute on Snowflake size, which will therefore have impacts on the costs of the associated warehouse. 

* Only in URN mode

Push DataGalaxy tags to Snowflake

If you wish to push DataGalaxy tags to Snowflake, here are the necessary prerequisites:

  • In Snowflake:
    • Provide the "APPLY TAG" grant on the account level to the role used by the connector. This can be achieved with the following query:
grant apply tag on account to role <connector_role>;
  • Create a schema that the connector will store the DataGalaxy tags. This can be achived with the CREATE SCHEMA query.
  • Provide the "USAGE" grant on the parent database of the schema that will store the DataGalaxy tags. This can be achieved with the following query:
grant usage on database <dg_database>;
  • Provide the "CREATE TAG" grant on the schema that will store the DataGalaxy tags. This can be achieved with the following query:
grant create tag on schema <dg_database>.<dg_schema> to role <connector_role>;
  • In DataGalaxy:
    1. Add the attribute "Tags" on the following screens: "Dictionary: Model", "Dictionary: Table", "Dictionary: View" and "Dictionary: Column"  
    2. Add values to the "Tags" attribute on objects representing Snowflake entities
  • In the Snowflake connector: 
    1. Tick the "Enable Governance features" checkbox
    2. Tick the "Push DataGalaxy tags to Snowflake" checkbox
    3. Provide in the "Push DataGalaxy tags to Snowflake" text field the path of the schema where the DataGalaxy tags will be stored 

Once the prerequisites are done, all you need to do is execute the Snowflake connector that you configured.

From Standard to URN mode

Differences

  1. In Standard mode, the name of your root object will be the one you give it when you create the connection (or the root object of the Dictionary module you target). In URN mode, the name of the root object will be the Snowflake account name used when setting up the connection.
    • Standard mode
    • URN mode
  2. In Standard mode, in the case of a "Single import" mode (that retrieves only one database), your database will be considered as the root object of the hierarchy. In URN mode this database will be one level lower, der your Snowflake account name which will become the new root object. 
    • Standard mode ("Single import" mode)
    • URN mode

Migration guide

The aim of this guide is to show you how to switch your root object and all the Snowflake objects it contains from Standard mode to URN mode. Once you've completed these steps, you'll be able to perform all your future imports in URN mode and take advantage of the new features associated with this mode.

  1. Bring one level below the schematas contained in your database (only in case of a previous import in "Single import" mode):
    • Open the menu associated with your root object (”Snowflake” here) and pick the option “+ Create a container”. It will be of type "Model" and you will name it after the name of your Database as it exists in Snowflake
    • Once it is done you will have to move all your schematas (”BURST_BANK” here) by opening their associated menus and choosing the "Move" option. You will target the previously created object, "DEMO_DB" for this example
    • From now on the steps will be the same to switch to URN mode, no matter the import mode initially chosen in Standard mode
    • If you do not do this, when you do the final URN import on your root object you will end up with duplicates of all the schemata objects of your database retrieved from Snowflake
  2. If is is not already the case, associate to "Database" sources from the "Dictionary" module the "URN" attribute
  3. Associate with your root object the right URN
    • Regarding this, we advise you to follow these steps in order to avoid any error:
      • Perform a new import in URN mode, which will create a new root object for which the URN attribute will be filled
      • Copy the URN attribute
      • Delete the root object and all its children that you just imported in URN mode (since a URN must be unique, if you do not delete this root object before trying to assign its URN to another object, the platform will return an error)
      • Paste the URN in order to fill the URN attribute field from your root object that is still in Standard mode
  4. Do a final import in URN mode
    • This time all the URN attributes from the child objects under your root object should be filled

Congratulations, you switched from the Standard mode to the URN mode and can now enjoy all the new features it offers!

Features

Only in URN mode

  • Lineage retrieval

Warnings

Account name format

Snowflake supports multiple formats for account names. When you use different formats for the same Snowflake account across connections, DataGalaxy creates separate source objects for each format rather than grouping the objects together. Furthermore, to properly utilize the cross-technology lineage feature with URN imports, you must maintain consistency by using the same account format across all technologies. Inconsistent formatting will result in duplicate representations of the same Snowflake objects in DataGalaxy.

Execution of the connector

    Step 1: Installation

  • Download DataGalaxy connector from the portal (see here)
  • Extract the connector archive in the directory of your choice
  • Download the Snowflake plug-in from the portal and copy it into the /lib directory of the connector

   Step 2: Run connector

  • After starting the connector, access the connectors of the Dictionary category

  • If it was correctly installed, the Snowflake plug-in will appear

  • Fill the corresponding fields using the connection information from above

  • Click on "Test" to test the connection
  • Once the connection test passed follow the steps to finalize your import

This connector is also available in online mode, more information on this page: [How to] Online Connector operating mode

Frequently asked questions

Does the connector support Iceberg tables?

Yes, the connector has been validated with Iceberg tables managed by the Snowflake catalog.

Releases

DatePlugin
Version
DataGalaxy
release
Desktop Connector
version (minimum)
Description
24/04/20269.1.7v3.332.15.15.9Updated internal dependencies
24/03/20269.1.5v3.323.15.15.7
Fixed a bug related to verbose logs
12/03/20269.1.4v3.317.25.15.6Fixed a lineage bug related to materialized views and external tables 
11/12/20259.1.0v3.290.05.15.3Optimizations on lineage retrieval reducing duration up to 95%
03/09/20259.0.0v3.247.05.9.0Support push tags feature in URN mode
01/08/20258.7.6v3.225.05.7.1Import PK/FK in URN mode only if option activated
01/08/20258.7.5v3.204.05.7.1

Replace Snowflake logo

02/07/20248.7.2v3.192.15.7.1Added the possibility of limiting a scope to multiple schemas for the single database import mode
04/06/20248.5.6v3.177.15.6.2- Fixed a security issue
02/06/20258.5.5v3.174.35.6.1- Added the possibility to only retrieve lineage between tables
- Fixed a bug related to duplicate selected schemas
- Fixed some bugs related to Snowflake quoted object identifiers 
22/05/20258.5.1v3.171.05.5.13Activated the possibility of using URN imports for everybody
24/04/20258.3.0v3.161.15.5.7- Added lineage retrieval for URN imports
- Improved performance of the connection test
10/04/20258.1.2v3.155.35.5.5Fixing the SQL query for retrieving tables and views.
Key pair authentication mode now available in Online mode.
07/04/20258.1.1v3.154.75.5.5Optimized how data is handled in URN mode
27/02/20257.3.2v3.140.1    5.4.7- Bumped Snowflake driver to 3.22.0
- Don't require OPERATE permission on warehouse anymore
24/02/20257.2.4v3.139.25.4.4Fixed a timeout issue for the connexion test
13/12/20247.2.3v3.109.05.3.6Fixed the label of a checkbox for the desktop version of the connector
29/11/20247.2.1v3.102.15.3.4- New feature: Push DataGalaxy tags to Snowflake
- The connector now supports all naming formats for databases, schemas, tables, views and columns
- The connector now checks the formats of the databases, schemas and paths provided as parameters for the connection
06/11/20247.1.6v3.93.15.2.12Changed the link for the documentation that is given when the connexion test fails
31/10/20247.1.5v3.91.15.2.11Add by default the warehouse in JDBC chain
28/10/20247.1.4v3.89.15.2.11Added the Snowflake tag references retrieval feature for URN imports
24/10/20247.1.3v3.88.25.2.11Changed the format of Snowflake tag references in DataGalaxy
29/11/20247.2.1v3.102.15.3.3New feature: Push DataGalaxy tags to Snowflake
06/11/20247.1.6v3.93.15.2.12Changed the link for the documentation that is given when the connexion test fails
18/10/20247.1.2v3.87.35.2.11Fixed connexion test for governance features
16/10/20247.1.1v3.85.15.2.11Typo fix
16/10/20247.1.0v3.85.05.2.11New feature: Snowflake tag references retrieval (does not work for URN imports)
08/10/20247.0.0v3.84.05.2.11Bugfix to avoid creation of empty schematas, optimization of SQL queries for view definition retrieval, addition of button to filter out shared databases
06/09/20246.0.4v3.72.15.2.7Bugfix on creation query for views retrieval
04/09/20246.0.3v3.72.05.2.7Simplification of necessary rights to retrieve metadata, retrieval of dynamic table and creation query for views by default, UI update to give options for dynamic tables creation query retrieval and PK/FK retrieval
04/09/20246.0.2v3.72.05.2.7Made the table type of URNs (preview) implicit
23/08/20246.0.1v3.69.05.2.3Updated the logger to show more information when using verbose mode  
02/07/20246.0.0v3.55.05.0.1Migrated from java 11 to java 17 + CVE fixes 
11/06/20245.4.0v3.51.04.10.0User can select to get or push tags independently (preview)
06/06/20245.3.1v3.50.0
Integration with Snowflake Data Metrics Functions for data quality (preview)
30/05/20245.2.0v3.48.0
Tag synchronization (preview)
16/05/20245.0.0v3.46.0
Addition of multi databases import
30/04/20244.3.6v3.44.0
Filter to optimize requests
25/03/20244.3.5

Bugfix
18/03/20244.3.4

Changed data type from BigDecimal to Number
14/03/20244.3.2

Manifest update
13/03/20244.3.1v3.36.0
Bugfix
28/02/20244.3.0v3.33.0
Addition of the query attribute
26/02/20244.2.1

Improvement of the connexion test
15/02/20244.2.0v3.31.0
Addition of dynamic tables
24/01/20244.1.1

Update of the retrieval method for tables on the selection screen

Did you find it helpful? Yes No

Send feedback
Sorry we couldn't be helpful. Help us improve this article with your feedback.