Google BigQuery

The BigQuery extractor loads data from BigQuery and brings it into Keboola Connection (KBC). Running the extractor creates a background job that

  • executes the queries in Google BigQuery.
  • saves the results to Google Cloud Storage.
  • exports the results from Google Cloud Storage and stores them in specified tables in Keboola Connection Storage.
  • removes the results from Google Cloud Storage.

Note: Using the Google BigQuery extractor is also described in our Getting Started Tutorial.

Initial Setup

Service Account

To access and extract data from your BigQuery dataset, you need to set up a Google Service Account. Go to Google Cloud Platform Console > IAM & admin > Service accounts and select the project you want the extractor to have access to. Click Create Service Account amd enter a Service account name (e.g. Keboola Connection BigQuery Extractor).

Screenshot - Create service account

Then add the BigQuery Data Editor, BigQuery Job User and Storage Object Admin roles.

Screenshot - Set admin roles

Finally, create a new JSON key (click + Create key) and download it to your computer (click Create).

Screenshot - Create JSON key

Bucket

The extractor uses Google Storage Bucket as a temporary storage for off-loading the data from BigQuery. Go to the Google Cloud Platform Console > Storage > Browser and click Create Bucket. Enter Name of your bucket and select Location (must be the same as of your dataset).

Screenshot - Google Cloud Platform

Configure Extraction

Create a new configuration of the BigQuery extractor. Click the Set Service Account Key button.

Screenshot - Set Service Account Key

  • Open the downloaded key in a text editor, copy & paste it in the input field, click Submit.

Screenshot - Copy & Paste Service Account Key

Click on the Save button to store the credentials. Important: The private key is stored in an encrypted form and only the non-sensitive parts are visible in the UI for your verification. The key can be deleted or replaced by a new one at any time.

Screenshot - Save Service Account Key

In the Unload Configuration section, fill Cloud Storage Bucket Name as the name of the bucket you have created earlier, and select the correct Dataset Location. Click Save.

Screenshot - Configure Bucket

Configure Queries

Start by clicking the Add Query button.

Screenshot - Add Query

Name the query and click Create.

Screenshot - Name Query

Specify the SQL code in the SQL Query field and Save the query configuration. In the example below a public dataset to test the extractor was used:

SELECT * FROM `bigquery-public-data.utility_us.country_code_iso` LIMIT 10;

Screenshot - Configure Query