At this point, you already know how to quickly load data into Keboola Connection, and your Storage contains four new tables: account, opportunity, level and user. In this part of the tutorial, we will show you how to manipulate data in Storage using Transformations. Let’s create a denormalized table from the input tables and make a few minor modifications to it.
To start, go to the Keboola Connection Transformations section:
Like tables, Transformations are organized into buckets. Each transformation bucket can contain any number of individual transformations. It should represent a logical set (container) of operations you want to perform together. Before you start with transformations, create a bucket and call it Opportunity.
Then click on the New Transformation button to create an individual transformation. Enter Denormalize opportunities as its Name and select Snowflake as its Backend. A backend is the engine running the transformation script; it is either a database server (Snowflake, Redshift) or a language interpreter (R, Python, Julia).
When you create a transformation, you need to set up its Input and Output Mappings and write the transformation script.
The concept of mapping is an important safeguard when you are manipulating your data: there is no way to modify the wrong tables by accident, because the only tables modified by your transformation are those explicitly specified in its Output Mapping.
Let’s start by setting up the Input Mapping: click the New Input button.
The Source field in the input mapping refers to Storage. Select in.c-csv-import.account as the source table. You can do a full-text search in the select field; typing acc will find the table as well. In the Destination field, the table name account is filled in automatically; this is the name of the source table inside the transformation. Use the Create Input button to create the input mapping.
Add the remaining three tables: opportunity, user, and level. If you loaded the data using the Database extractor or the Google Drive extractor, feel free to use the tables created by them (e.g., in.c-keboola-ex-google-drive-548902224.level-level). In either case, make sure that the destinations are set to account, opportunity, user, and level.
You will get the following configuration:
See additional information about Input Mapping (all available options, etc.).
Continue with setting up Output Mapping by clicking on the New Output button.
Enter opportunity_denorm into the Source field in the output mapping; here, the Source field refers to the transformation. This table does not exist yet; we will create it in the transformation script.
The Destination field refers to the name of the output table in Storage. It will be auto-generated as out.c-opportunity.opportunity_denorm, which is perfectly fine: it will create the opportunity_denorm table in the opportunity bucket in the output stage of Storage. Neither the table nor the bucket exists yet, but both will be created when the transformation runs.
After you finish Output Mapping, you will see this:
The size of the opportunity_denorm table shows as N/A because the table does not exist yet.
See additional information about Output Mapping (all available options, etc.).
To produce that table from the tables account, opportunity, level, and user, write a transformation script.
To save you some time, we have already prepared the necessary SQL queries for you:
In the first query, we change the user level descriptions into something clearer.
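The actual query is part of the tutorial; purely as an illustration, a minimal sketch of such a query could look like the one below. The column names ("Name", "Level") and the single-letter level codes are assumptions, not the tutorial's real schema.

```sql
-- Illustrative sketch only: the columns "Name" and "Level" and the
-- codes 'S'/'M'/'J' are assumed, not taken from the tutorial data.
CREATE TABLE "tmp_level" AS
    SELECT
        "Name",
        CASE "Level"
            WHEN 'S' THEN 'Senior'
            WHEN 'M' THEN 'Intermediate'
            WHEN 'J' THEN 'Junior'
        END AS "Level"
    FROM "level";
```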
In the second query, we compute the quality level for each deal opportunity based on the estimated probability of closing the deal. Note that here we are excluding the system column “_timestamp”, which appears in cloned tables on the Snowflake backend.
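Again as an illustration only (the opportunity column names and probability thresholds below are assumptions), a sketch of this step might bucket the probability with a CASE expression while enumerating the wanted columns explicitly, which is what leaves “_timestamp” out of the result:

```sql
-- Illustrative sketch only: the opportunity columns listed here are
-- assumptions. Enumerating columns (rather than SELECT *) excludes
-- the system "_timestamp" column from the new table.
CREATE TABLE "tmp_opportunity" AS
    SELECT
        "Id",
        "AccountId",
        "OwnerId",
        "Amount",
        "Probability",
        CASE
            WHEN "Probability" < 50 THEN 'Poor'
            WHEN "Probability" < 70 THEN 'Good'
            ELSE 'Excellent'
        END AS "Quality"
    FROM "opportunity";
```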
In the third query, we denormalize all four tables into a single one, shaped so that it will load nicely into Tableau.
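One more hedged sketch: the join keys below ("AccountId", "OwnerId", "Id", "Name") are assumptions about the tutorial data, but the overall shape of the query, joining the four tables into the opportunity_denorm output table, would be along these lines:

```sql
-- Illustrative sketch only: the join keys are assumed, not taken from
-- the actual tutorial data. The created table name must match the
-- Source field of the output mapping ("opportunity_denorm").
CREATE TABLE "opportunity_denorm" AS
    SELECT
        o.*,
        a."Name"  AS "AccountName",
        u."Name"  AS "UserName",
        l."Level" AS "UserLevel"
    FROM "tmp_opportunity" o
    JOIN "account"   a ON o."AccountId" = a."Id"
    JOIN "user"      u ON o."OwnerId"   = u."Id"
    JOIN "tmp_level" l ON u."Name" = l."Name";
```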
Save the queries and then click on Run Transformation. This will create a background job that executes the queries and writes the results to Storage.
To see whether the transformation job was successful, go to Jobs, or click on the small “Transformations job has been scheduled” notification that pops up after a transformation starts running.
Having learned to set up a transformation, you can now continue with the next step of this tutorial.