In the previous step, you learned how to quickly load data into Keboola Connection using manual import. In real production projects, this is seldom used as most of the data is obtained automatically using extractors. In this part of the tutorial, you will use a Google Drive extractor to load data from an external data sheet.
Google Drive is a common method for sharing small reference tables between different organizations. For our purposes, create a Google spreadsheet from the level.csv file. Let’s pretend someone shared the level table with you through Google Drive.
Go to Google Spreadsheets and Start a new Blank Spreadsheet. Then go to File - Open and Upload the level.csv file.
Go to Components – Extractors in Keboola Connection and click the Add New Extractor button:
Use the search box to find the Google Drive extractor. Once you find it, click on it.
Each Keboola Connection extractor can have multiple configurations. This concept allows you to extract data from, for example, multiple Google accounts. So far, there are no configurations of the Google Drive Extractor.
Click on New Configuration and name the new configuration User Levels; the file we want to extract contains the seniority level of each user.
Then authorize the extractor to access the spreadsheet by clicking the Authorize Account button.
There are two basic authorization options: Instant Authorization and External Authorization. The latter is useful when someone wants to share their document with you without sharing their account directly. Use Instant Authorization now.
On the following screen, click Allow.
Now you want to select the Google Drive files to import.
First, you need to select a spreadsheet.
Find and select your spreadsheet document named level.
Then select the individual sheet. Our ‘level’ document contains only one ‘level’ sheet, so select that one. It will appear on the right side of the screen as one of the Selected sheets to be added to the project.
When you Add Sheet, you should obtain a result like the one below. Then click on the Run Extraction command on the right. This will create a background job, extracting the selected sheet from the Google Drive document and loading it into Storage.
When a job is running, a small orange circle appears under Last runs, along with RunId and other info on the job. Green is for success, red for failure. Click on the indicator, or the info next to it for more details.
The extractor automatically creates an output bucket and a table; here it is
in.c-keboola-ex-google-drive-548902224.level-level
. Click on the name of the output table to check its contents.
Continue with the rest of the tutorial, or take a side step to configure a database extractor.