By now you have seen how quickly data can be loaded into Keboola, and you have four new tables in your Storage: account, opportunity, level, and user.
In this segment of the tutorial, we’ll guide you through data manipulation in Storage using transformations. Our objective is to create a denormalized table from the input tables and make some minor modifications to it.
To start, navigate to the Keboola Transformations section.
Next, click the Create Transformation button and choose Snowflake SQL Transformation (or another SQL transformation, depending on your project’s backend).
Enter Denormalize opportunities as its name, and remember to provide a description. Transformations can be organized into folders; you can either add it to a folder during creation or move it to any folder later. Now, enter Opportunity as the folder name.
Keboola transformations operate on a few fundamental building blocks. It’s crucial to note that the transformation process occurs in a dedicated database schema, meaning queries are not executed directly against your Storage tables. Instead, the system clones selected tables into the dedicated transformation schema, executes queries, and finally unloads created/modified objects back to the Storage.
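Purely as a conceptual sketch (not commands you would run yourself, and not actual Keboola internals), the lifecycle looks roughly like this:

```sql
-- Conceptual sketch only; the schema reference "in.c-csv-import"."account"
-- is hypothetical and stands in for whatever the platform does internally.

-- 1. Input mapping: each selected Storage table is cloned into the
--    dedicated transformation schema under the table name you chose.
CREATE TABLE "account" AS
    SELECT * FROM "in.c-csv-import"."account";

-- 2. Your transformation queries then run against these clones only.
-- 3. Output mapping: after the queries finish, only the tables listed in
--    the output mapping are written back to Storage.
```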
The mapping concept serves as a crucial safeguard when manipulating your data. It ensures that there is no accidental modification of the wrong tables. The only tables modified by your transformation are those explicitly specified in the Output Mapping. Additionally, this concept plays a vital role in maintaining a detailed data lineage across your project.
Let’s start with setting Input Mapping by clicking the New Input button.
The Source field in the input mapping refers to Storage. Select in.c-csv-import.account as the source table. You can do a full-text search in the select field; typing acc will give you the table as well. In the Table name field, the table name account is automatically filled in for you. This is the name of the source table inside the transformation. Use the Add Input button to create the input mapping.
Please note that there are additional settings you can configure, such as the Changed in Last filter, which aids in incremental processing of large tables. However, for the purpose of our tutorial, we won’t delve into those details. See additional information about input mapping (all available options, etc.).
Add the remaining three tables: opportunity, user, and level. You can add multiple tables at once:
You should end up with the following configuration:
Continue with setting up output mapping by clicking on the New Output button.
Enter opportunity_denorm into the Table name field in the output mapping; the Table name field refers to the transformation. This table does not exist yet; we will create it in the transformation query. The Destination field refers to the name of the output table in Storage. It is auto-generated so that the opportunity_denorm table will be created in the denormalize-opportunity bucket in the output stage of Storage. Neither the table nor the bucket exists yet, but both will be created once the transformation runs.
After you finish the output mapping, you will see this:
See additional information about output mapping (all available options, etc.).
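In practice, this means your transformation script itself must create a table named exactly opportunity_denorm; only tables listed in the output mapping are then written back to Storage. A minimal sketch (placeholder logic, not the tutorial's actual query):

```sql
-- Minimal sketch: the script has to create "opportunity_denorm" itself;
-- this placeholder simply copies the opportunity table.
CREATE TABLE "opportunity_denorm" AS
    SELECT * FROM "opportunity";
```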
To produce that table from the tables account, opportunity and user, write a transformation script.
To save you some time, we have already prepared the necessary SQL queries for you:
For BigQuery, the query will look like this:
Click the New Code button. Begin by entering a query name – input Opportunity denorm. Next, paste the queries into the editor, and then click Save.
In the first query, we enhance user level descriptions for better clarity.
In the second query, we calculate the quality level for each deal opportunity based on the estimated probability of closing the deal.
In the third query, we denormalize all tables into a single one.
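The tutorial supplies the exact queries above; purely as an illustration of the pattern, and with hypothetical column names (Id, Name, Probability, Level, and so on are assumptions, not the tutorial's actual schema), the three steps could look roughly like this in Snowflake SQL:

```sql
-- Illustrative sketch only; all column names and the tmp_ table names
-- are assumptions, not the tutorial's actual schema.

-- 1. Give user levels a more descriptive text.
CREATE TABLE "tmp_level" AS
    SELECT "Name",
           CASE "Name"
               WHEN 'S' THEN 'Senior'
               WHEN 'M' THEN 'Intermediate'
               WHEN 'J' THEN 'Junior'
               ELSE 'Unknown'
           END AS "Description"
    FROM "level";

-- 2. Derive a quality level for each opportunity from its estimated
--    probability of closing the deal.
CREATE TABLE "tmp_opportunity" AS
    SELECT o.*,
           CASE
               WHEN o."Probability" >= 75 THEN 'Hot'
               WHEN o."Probability" >= 25 THEN 'Warm'
               ELSE 'Cold'
           END AS "Quality"
    FROM "opportunity" o;

-- 3. Denormalize everything into a single table.
CREATE TABLE "opportunity_denorm" AS
    SELECT o.*,
           a."Name" AS "AccountName",
           u."Name" AS "UserName",
           l."Description" AS "UserLevel"
    FROM "tmp_opportunity" o
        JOIN "account" a ON o."AccountId" = a."Id"
        JOIN "user" u ON o."UserId" = u."Id"
        JOIN "tmp_level" l ON u."Level" = l."Name";
```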
Click Run Transformation. This will create a background job that runs the queries against the tables specified in the input mapping and writes the results back to Storage according to the output mapping. To see whether the transformation job was successful, navigate to Jobs, or click the small "Snowflake SQL job has been scheduled" notification that pops up after the transformation starts running.
After the transformation executes successfully, you will see a new table created in your Storage. Note also the Recently updated by field, which shows which component configuration last updated the table.
Having learned to set up a transformation, you can now continue with the rest of the tutorial.
Feel free to reach out to our support team if there’s anything we can help with.