Snowflake has many advantages:
Snowflake is a cloud database and, as such, brings continuous updates and behavioral changes. If you are interested in those changes, please follow the official Snowflake change log.
When loading data to a Snowflake transformation, beware that there are two different methods: copy and clone.
In some cases, you may need to abort the transformation execution and exit with an error message.
To abort the execution, set the
ABORT_TRANSFORMATION variable to any nonempty string value.
This example will set the
ABORT_TRANSFORMATION variable value to
'Integrity check failed' if the
contains one or more records with the
RESULT column equal to the value
The transformation engine checks the
ABORT_TRANSFORMATION after each successfully executed query and returns the value
of the variable as a user error,
Transformation aborted: Integrity check failed. in this case.
If you have a large amount of data in databases and complex queries, your transformation might run for a couple of hours. To speed it up, you can change the backend size in the configuration. Snowflake transformations suport the following sizes:
Scaling up the backend size allocates more resources to speed up your transformation.
Note: This feature is currently in public beta. Please bear with us and provide feedback at https://ideas.keboola.com. Also, dynamic backends are not available to you if you are on the Free Plan (Pay As You Go).
To create a simple Snowflake transformation, follow these steps:
source(as expected by the Snowflake script).
You can organize the script into blocks.
Snowflake is case sensitive. All unquoted table/column names are converted to upper case while quoted names keep their case.
So, if you want to create the following table,
all of these commands will work
while this one will not:
Be especially careful when setting up input and output mappings.
When writing your transformation script, quoting all table and column names is highly recommended. Snowflake converts all unquoted table/column identifiers to uppercase, which won’t match table/column identifiers created by Keboola Connection (unless they happen to be all uppercase).
Storage tables store data in character types. When you create a table used on output mapping, you can rely on implicit casting to char:
Or, you can create the table directly with character columns (and rely on implicit casting to char):
You can also explicitly cast the columns to char:
When using an unstructured data type, you always have to use the explicit cast:
The implicit cast does not work for the
VARIANT types, so the following code:
will lead to an error:
Expression type does not match column data type, expecting VARCHAR(16777216) but got OBJECT for column OBJ, SQL state 22000
By default, Snowflake uses the
YYYY-MM-DD HH24:MI:SS.FF3 format
when converting the
timestamp column to a character string.
This means that if you create a table in a transformation that uses a
the table value will come out as
2018-04-09 06:43:57.866 -0700 in Storage. If you
want to output it in a different format, you have to cast the column to a string first, for example:
Do not use
ALTER SESSION queries to modify the default timestamp format, as the loading and unloading sessions are separate
from your transformation/sandbox session and the format may change unexpectedly.
DY, DD MON YYYY HH24:MI:SS TZHTZM
Important: Snowflake works with time zones (and Daylight Savings Time), requiring you to distinguish between various conversion functions:
For more information on how a read-only input mapping works, visit the link.
Buckets in Snowflake are represented by schemas. You can find all available schemas for your account by calling
SHOW SCHEMAS IN ACCOUNT;. Each schema represents a bucket.
However, a read-only input mapping cannot access alias tables, because technically it is just a reference to an existing schema.
For a linked bucket, the schema is available in another database. That is, to access this linked bucket you have to include the database name of the project from which the bucket is linked.
For example, say your bucket
in.c-customers is linked from bucket
in.c-crm-extractor in project 123. You then need to reference the tables in the transformation like this:
When developing the transformation code, it’s easiest to create a workspace with read-only input mappings enabled and look directly in the database to find the correct database and schema names.