- Data Manipulation
Input and Output Mapping
To configure input and output mappings in the process of creating a transformation,
go to our Getting Started tutorial.
When manipulating your data, there is no need to worry about accidentally modifying the wrong tables.
Mapping separates the source data from your transformation script. It creates a secure workspace
with data copied from specified tables, and brings only specified tables and files back to Storage.
There are two types of mapping that have to be set up before running a transformation:
- Input Mapping — what Storage tables are used in your transformation;
tables not mentioned in Input Mapping cannot be used.
- Output Mapping — what tables are written into Storage after running the transformation;
tables not mentioned in Output Mapping are never modified nor permanently stored (i.e. they are temporary).
They are deleted from the transformation workspace when the execution finishes.
Table names referenced by mappings are automatically quoted by KBC.
This is especially important for Snowflake, which is case sensitive.
Input Mapping defines data you have in Storage and want to use in a transformation.
This data is made available as a table for SQL, or as a CSV file for R and Python.
Any input mapping has the following options (to see all options, click on the Show details button):
- Source — Select a table in Storage as the source table for your transformation.
- Destination — This is the Source table/file name to be used inside the transformation
(it fills in automatically, but can be changed); file names should end with
- Load Type — Available only in Snowflake transformations. This parameter can switch the input mapping to use
the faster internal
- Columns — Select specific columns if you do not want to import them all; this saves processing time for larger tables.
- Changed in last — If you are into incremental processing, this comes in handy; import only rows changed or created within the selected time period.
Supported time dimensions are
- Data filter — Download only rows that will match this single column multiple values filter.
You can combine these options freely. Input mappings for Snowflake and Redshift include more options specific to the particular backend:
Important: You can create Input Mappings for as many tables as you need for your transformation.
See an example of setting up Input Mapping in our tutorial.
The output mapping takes results (tables and files) from your transformation and stores them back in Storage.
It can create, overwrite, and append any table.
These tables are typically derived from the tables/files in the Input mapping. In SQL transformations,
you can use any
DELETE queries to create the desired result.
An output mapping has the following options (to see all options,
the Show details checkbox above the Source field must be checked):
- Source — Enter the name of a table that should be created in the transformation;
the table does not exist until the transformation runs.
(Enter either a table name in the transformation database or a file name, including
- Destination — Select or type in the name of the Storage output table that will contain the results
of your transformation (i.e. contents of the Output Mapping Source table).
- If this table does not exist yet, it will be created once the transformation runs.
- If this table already exists in Storage, it will be either overwritten or extended.
- Incremental — Checking this option makes sure that in case the Destination table already exists,
it is not overwritten, but resulting data are appended to it.
- Primary key — The primary key of the destination table; if the table already exists,
the primary key must match. Feel free to use a multi-column primary key.
- Delete rows — Delete rows matching the criteria from the destination table before importing the data. Using this option makes sense when incremental is switched on.
Important: Multiple Output Mappings can be created for your transformation.
See our tutorial for an example of setting up Output Mapping.