Using the Data Takeout feature, it is possible to export the entire contents of your project. It can come in handy when
Important: Do not confuse this feature with the project backup. Takeout data cannot be automatically imported into KBC (recovering a project must be done by KBC Support).
Things to have before you start:
We strongly recommend that you create a dedicated user with a dedicated bucket to prevent an accidental leak of credentials or your project management.
Create an S3 bucket following the Amazon documentation. Name it, for example, keboola-data-takeout, and choose a region suitable for your Amazon subscription. If unsure, choose US Standard (N. Virginia).
Create an IAM (Identity and Access Management) user following the Amazon documentation. You get an Access Key ID and a Secret Access Key with the new user. You can always regenerate them in the IAM Management Console:
In the next dialog, select Custom Policy:
Name the policy and paste the following definition in JSON format as the policy document:
Make sure to replace keboola-data-takeout in the above document with the actual name of your S3 bucket.
Once you have the name and region of your S3 bucket, along with the Access Key ID and Secret Access Key of the IAM user, you are ready to export the project.
Go to KBC and select Users & Settings. In the Settings tab, click the Data Takeout button:
Now fill in the Access Key ID, Secret Access Key, S3 Region, and S3 Bucket name:
You can also configure the path inside the S3 bucket. For instance, set the path to my-take/ and the data will be stored in
To store the export in the bucket root, leave the path empty.
Important: Existing files will be overwritten. Optionally, you can choose to export only the project structure, in which case no actual data is exported.
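Because existing files are overwritten, it can be useful to confirm that the target path is empty before starting the export. The following is a minimal sketch, not part of the Data Takeout tool: the function name `prefix_is_empty` is hypothetical, and it assumes you pass in an S3 client created with the boto3 library using the credentials of your dedicated IAM user.

```python
from typing import Any


def prefix_is_empty(s3_client: Any, bucket: str, path: str = "") -> bool:
    """Return True if the bucket holds no object whose key starts with `path`.

    `s3_client` is assumed to be a boto3 S3 client; an empty `path`
    checks the whole bucket (the bucket root).
    """
    # MaxKeys=1 is enough: a single match already means the path is in use.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=path, MaxKeys=1)
    return response.get("KeyCount", 0) == 0


# Example usage (assumes boto3 is installed; all values are placeholders):
#
#   import boto3
#   s3 = boto3.client(
#       "s3",
#       region_name="us-east-1",
#       aws_access_key_id="YOUR_ACCESS_KEY_ID",
#       aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
#   )
#   if not prefix_is_empty(s3, "keboola-data-takeout", "my-take/"):
#       print("Target path already contains files; they may be overwritten.")
```

Note that the IAM user needs permission to list the bucket for this check to work, which the policy you attached may or may not grant.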
When ready, click Run Export:
To monitor the progress of the data export, click the Export started link. The data takeout may take a considerable amount of time if your project is large.
The exported project has the following general structure:
buckets.json: all buckets in the project and their metadata
tables.json: all tables in the project and their metadata (bucket, columns, description, etc.). Table aliases are not exported.
configurations.json: all components used in the project and the main properties of their configurations
/sys/folders: all project tables in the CSV format (compressed with gzip).
Important:
sys tables contain only the configuration of older components and may not be present in your project.
If you tick the Export project structure only checkbox when exporting, no actual data will be exported. Only configurations will be exported, including the sys tables that hold the configurations of legacy components.
The Data Takeout tool will overwrite existing files in your S3 bucket (no files will be deleted though). Make sure your S3 bucket is empty, or use an appropriate S3 path.
If your configurations contain encrypted values (such as a password to a database server), these values will be exported in their encrypted form. Exported encrypted values cannot be decrypted.
Once the files are written to your S3 bucket, make sure they are stored safely and that only authorized persons can access them. Also, deactivate the AWS access key.
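To inspect an export after downloading it from S3 to a local directory, a short script can parse the JSON metadata files and peek into the gzip-compressed CSV tables. This is a minimal sketch using only the Python standard library. The helper names, the local paths in the usage comment, and the UTF-8 encoding are assumptions for illustration, and no particular layout inside the JSON files is presumed.

```python
import csv
import gzip
import json
from pathlib import Path
from typing import Any, List


def load_metadata(path: Path) -> Any:
    """Parse one of the exported JSON files, for example buckets.json."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def read_table_head(path: Path, limit: int = 5) -> List[List[str]]:
    """Return up to `limit` leading rows of a gzip-compressed CSV table."""
    rows: List[List[str]] = []
    # Text mode ("rt") decompresses and decodes on the fly; newline="" lets
    # the csv module handle line endings inside quoted fields itself.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.reader(handle):
            if len(rows) >= limit:
                break
            rows.append(row)
    return rows


# Example usage (the directory and table file name are hypothetical):
#
#   export_dir = Path("takeout")
#   buckets = load_metadata(export_dir / "buckets.json")
#   print(type(buckets).__name__, len(buckets))
#   for row in read_table_head(export_dir / "some-table.csv.gz"):
#       print(row)
```

Reading only the first few rows keeps the check fast even for large tables, since the file is decompressed as a stream rather than loaded whole.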