At this point, you already know how to quickly load data into KBC, and your Storage contains four new tables: account, opportunity, level and user. In this part of the tutorial, we will show you how to manipulate data in Storage using Transformations. Let’s create a denormalized table from the input tables and do some minor modifications to it.
To start, go to the KBC Transformations section:
Like tables, Transformations are organized into buckets. Each transformation bucket can contain any number of individual transformations. It should represent a logical set (container) of operations you want to perform together. Before you start with transformations, create a bucket and call it Opportunity.
Then click on New Transformation button to create an individual transformation. Make sure to enter its Name and select Backend. A backend is the engine running the transformation script; it is either a database server (Snowflake, Redshift, MySQL) or a language interpreter (R, Python). Name your transformation Denormalize opportunities, and choose the Snowflake backend.
When you create a transformation, you need to set up
The concept of mapping is an important safeguard when you are manipulating your data. Thanks to it, there is no way to modify the wrong tables by accident. The only tables which are modified by your transformation are those explicitly specified in Output Mapping.
Let’s start with setting Input Mapping by clicking the New Input button.
The Source field in the input mapping refers to Storage. Select
in.c-csv-import.account as the source table.
You can do a full text search in the select field; typing
acc will give you the table as well.
In the Destination field, the table name
account is automatically filled for you.
This is the name of the source table inside the transformation. Use the Create Input to create
the input mapping.
Add the remaining three tables:
level. If you took loaded data using the
Database extractor or Google Drive extractor
feel free to use the tables create by them (e.g
in.c-keboola-ex-google-drive-334272278.level-level). In either case, make sure that the desinations
are set to
You will get the following configuration:
See additional information about Input Mapping (all available options, etc.).
Continue with setting up Output Mapping by clicking on the New Output button.
opportunity_denorm into the Source field in the output mapping;
the Source field refers to the transformation. This table does not exist yet.
We will create it in the transformation.
The Destination field refers to the name of the output table in Storage.
Start typing in and find our tutorial output bucket and
add the source table name to it:
It will create the
opportunity_denorm table in the
tutorial bucket in the output stage
in Storage. This table does not exist either, but it will be created once the transformation runs.
After you finish Output Mapping, you will see this:
The size of the
opportunity_denorm table shows as N/A because the table does not exist yet.
See additional information about Output Mapping (all available options, etc.).
To produce that table from tables
user, write the transformation script.
To save you some time, we have already prepared the necessary SQL queries for you:
In the first query, we change the user level descriptions into something more clear. In the second query, we compute the quality level for each deal opportunity based on the estimated probability of closing the deal. In the third query, we denormalize all four tables into a single one. We have prepared the single table so that it will load nicely into Tableau.
Save the queries and then click on Run Transformation. This will create a background job which will
To see if the transformation job was successful, go to Jobs, or click on the small Transformations job has been scheduled window that pops up after a transformation starts running.
Having learned to set up a transformation, you can now