Power internal tools, in-app experiences, and more
Overview
Hightouch lets you send data from your warehouse into your Google BigQuery.
Connecting Hightouch to BigQuery requires some setup in both platforms. It's recommended to set up a service account with the correct permissions in BigQuery before configuring the connection in Hightouch.
Supported syncing
Sync Type | Description | Supported Sync Modes |
---|---|---|
Any data set | Sync data from any source to a Big Query table | Insert, Upsert |
BigQuery credential setup
Setup in BigQuery has three main steps:
- Enable BigQuery for your Google Cloud project
- Create a service account
- Grant the Hightouch service account access to your project
Create a project and enable BigQuery API
- Login to the Google Developers Console.
- Configure the Cloud Platform:
- If you don't have a project already, create one.
- Once you have a project, enable the BigQuery API for it.
- Copy your Project ID for later use.
- Find the location of your BigQuery dataset or sets. You can find this by querying the
INFORMATION_SCHEMA.SCHEMATA
view or by visiting the Google Cloud web console and clicking on a BigQuery dataset in the Explorer panel. You need both Project ID and Data location when connecting Hightouch to BigQuery.
Make sure billing is enabled on your project, otherwise Hightouch can't write into the cluster.
Create a service account
To create a service account, follow the setup instructions in our Google Cloud Platform (GCP) documentation.
Grant access
Your GCP service account will need permission to read and write data to BigQuery. You can set up your service account to have full access to your project using a predefined role. Otherwise, you can create a custom role and provide limited access according to a user-specified list of permissions.
Grant full access
You can grant full access by assigning the bigquery.user
, bigquery.dataViewer
, and bigquery.dataEditor
roles to your service account.
You can do this in the Google Cloud web console or by running these snippets in the Cloud Shell.
Grant permission to read metadata and list tables:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \
--member serviceAccount:<YOUR_SERVICE_ACCOUNT> \
--role roles/bigquery.user
Grant permission to read data from tables and views:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \
--member serviceAccount:<YOUR_SERVICE_ACCOUNT> \
--role roles/bigquery.dataViewer
Grant permission to write data to tables and views:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \
--member serviceAccount:<YOUR_SERVICE_ACCOUNT> \
--role roles/bigquery.dataEditor
Grant limited access
If you don't want to grant full access to your BigQuery service account, you can opt to grant limited access instead.
You can do this by assigning the bigquery.dataViewer
and bigquery.dataEditor
roles only to the specific datasets, tables, or views you want to use in Hightouch.
Since you are assigning these roles only to specific resources, you need to assign the bigquery.user
role and grant the bigquery.tables.get
permission at the project level.
For this, you can create a custom role in the Google Cloud web console based on an existing predefined role (bigquery.user
), which you can name custom.bigquery.user
.
When setting up the custom role, click Add permissions to add the bigquery.tables.get
permission to this custom role.
Then assign this role to your service account at the project level. You can do this in the Google Cloud web console or by running this snippet in the Cloud Shell:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \
--member serviceAccount:<YOUR_SERVICE_ACCOUNT> \
--role roles/custom.bigquery.user
If this custom.bigquery.user
role still isn't limited enough, you can try assigning the bigquery.jobUser
role and granting the bigquery.dataset.get
, bigquery.tables.get
, and bigquery.tables.list
permissions at the project level.
Hightouch needs to be able to list the schemas and tables in your BigQuery project when setting up your sync configuration.
Connect to Google BigQuery
Go to the Destinations overview page and click the Add destination button. Select Google Big Query and click Continue.
Configure your service account
Select the GCP credentials you previously created or click Create new. To learn more about these credentials, see the Google Cloud Provider (GCP) documentation.
Configure your destination
Enter the Project ID for the project you enabled the BigQuery API for and the Dataset location.
Sync configuration
Once you've set up your Google BigQuery destination and have a model to pull data from, you can set up your sync configuration to begin syncing data. Go to the Syncs overview page and click the Add sync button to begin. Then, select the relevant model and the Google BigQuery destination you want to sync to.
Sync mode
Hightouch supports Upsert mode using the MERGE
statement, with the option to delete removed rows,
and Insert mode using the insertAll
endpoint.
Record matching
To match rows from your model to rows in BigQuery when upserting, Hightouch requires you to select a unique identifier in the table you are syncing to. This statement will fail if there are duplicate identifier values in your table.
Tips and troubleshooting
Common errors
If you encounter an error or question not listed below and need assistance, don't hesitate to . We're here to help.
Live debugger
Hightouch provides complete visibility into the API calls made during each of your sync runs. We recommend reading our article on debugging tips and tricks to learn more.
Sync alerts
Hightouch can alert you of sync issues via Slack, PagerDuty, SMS, or email. For details, please visit our article on alerting.