
Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Overview

Hightouch lets you pull data stored in Amazon Athena and push it to downstream destinations. Connecting Hightouch to Amazon Athena requires some setup in both platforms. In AWS, you need to make sure the credentials you use to connect Hightouch have the correct permissions.

You may need to allowlist Hightouch's IP addresses to let our systems connect to your warehouse. Reference our networking docs to determine which IP addresses you need to allowlist.

Required permissions

The AWS user that you use to connect to Athena must have the following permissions:

Read and list permissions for Athena resources and execute permissions for Athena queries:

  • ListWorkGroups
  • ListDataCatalogs
  • ListDatabases
  • ListTableMetadata
  • StartQueryExecution
  • GetQueryExecution
  • GetQueryResults
  • GetQueryResultsStream

Read and write permissions for the output location in S3, needed to write query results to S3, read them back, and perform cleanup:

  • PutObject
  • ListBucket
  • ListBucketMultipartUploads
  • ListMultipartUploadParts
  • GetBucketLocation
  • GetObject
  • AbortMultipartUpload
  • DeleteObject

See the AWS managed policy AmazonAthenaFullAccess for reference.
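
The two permission lists above can be expressed as an IAM policy document. The following is a minimal sketch, not an official Hightouch policy: the function name `build_policy`, the statement IDs, the bucket name, and the choice of `"Resource": "*"` for the Athena actions are all assumptions made for illustration. It covers only the actions listed on this page, so check it against your own account's requirements before using it.

```python
import json

# Athena actions listed in the required permissions section.
ATHENA_ACTIONS = [
    "ListWorkGroups",
    "ListDataCatalogs",
    "ListDatabases",
    "ListTableMetadata",
    "StartQueryExecution",
    "GetQueryExecution",
    "GetQueryResults",
    "GetQueryResultsStream",
]

# S3 actions listed for the query output location.
S3_ACTIONS = [
    "PutObject",
    "ListBucket",
    "ListBucketMultipartUploads",
    "ListMultipartUploadParts",
    "GetBucketLocation",
    "GetObject",
    "AbortMultipartUpload",
    "DeleteObject",
]


def build_policy(output_bucket: str) -> dict:
    """Build an IAM policy document (hypothetical helper) for the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AthenaQueryAccess",  # illustrative statement ID
                "Effect": "Allow",
                "Action": [f"athena:{name}" for name in ATHENA_ACTIONS],
                "Resource": "*",  # assumption: narrow this to your workgroup if you can
            },
            {
                "Sid": "AthenaOutputLocation",  # illustrative statement ID
                "Effect": "Allow",
                "Action": [f"s3:{name}" for name in S3_ACTIONS],
                # Bucket-level actions apply to the bucket ARN,
                # object-level actions to the objects under it.
                "Resource": [
                    f"arn:aws:s3:::{output_bucket}",
                    f"arn:aws:s3:::{output_bucket}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # "my-athena-results" is a placeholder bucket name.
    print(json.dumps(build_policy("my-athena-results"), indent=2))
```

Running the script prints JSON that you can paste into the IAM console as a starting point and then tighten.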

If you use Lake Formation to manage permissions, you need to grant database permissions to the AWS user that you use to connect to Athena. Otherwise, no databases load when you configure the source.

Connection configuration

To get started, go to the Sources overview page and click the Add source button. Select Amazon Athena and follow the steps below.

Configure your credentials

You can either select AWS credentials you've previously configured in Hightouch or choose to add New credentials directly from this page.

The credentials must be for a user who has permission to access Athena and the S3 output location. Refer to the required permissions section for permission details and to the AWS credential configuration docs for credential setup information.

Configure your Amazon Athena source

Enter the following fields into Hightouch:

  • Athena workgroup you previously set up in AWS.
  • (Optional) Query output location—you only need to enter this if your workgroup doesn't have an output location configured. This is an S3 path.
  • Data catalog name
  • Database name
  • (Advanced) VPC connection endpoint: if you wish to connect to Athena through a private VPC endpoint, contact Hightouch and we can provide the endpoint necessary for this step.

Workgroup and data catalog options populate depending on your AWS credentials. If you don't see the expected values, confirm your AWS credentials and click Refresh. Your database options populate depending on your selected data catalog.
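
Because the query output location is an S3 path, it can help to sanity-check the value before entering it. The sketch below assumes the path uses the usual `s3://bucket/prefix/` form. The helper name `parse_s3_location` is hypothetical and is not part of Hightouch or any AWS SDK.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_location(location: str) -> Tuple[str, str]:
    """Split an S3 path into (bucket, key prefix), rejecting non-S3 values."""
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected a path like s3://bucket/prefix/, got: {location!r}")
    # The URL path starts with "/", which is not part of the key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Placeholder bucket and prefix.
    bucket, prefix = parse_s3_location("s3://my-athena-results/hightouch/")
    print(bucket, prefix)
```

The bucket returned here is the one that the S3 permissions in the required permissions section need to cover.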

Choose your sync engine

For optimal performance, Hightouch tracks incremental changes in your data model—such as added, changed, or removed rows—and only syncs those records. You can choose between two different sync engines for this work.

The Basic engine requires read-only access to Amazon Athena. Hightouch executes a query in your database, reads all query results, and then determines incremental changes using Hightouch's infrastructure. This engine is easier to set up since it requires read—not write—access to Amazon Athena.

The Lightning engine requires read and write access to Amazon Athena. The engine stores previously synced data in a separate schema in Amazon Athena managed by Hightouch. In other words, the engine uses Amazon Athena to track incremental changes to your data rather than performing these calculations in Hightouch. Therefore, these computations are completed more quickly.
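
To make "tracking incremental changes" concrete, here is a minimal sketch of diffing two snapshots of a model's rows, keyed by primary key. It is an illustration of the general idea only, not Hightouch's implementation of either engine. The names `ChangeSet` and `diff_snapshots` are hypothetical.

```python
from typing import Any, Dict, List, NamedTuple


class ChangeSet(NamedTuple):
    added: List[str]    # keys present only in the current snapshot
    changed: List[str]  # keys present in both, with different row contents
    removed: List[str]  # keys present only in the previous snapshot


def diff_snapshots(
    previous: Dict[str, Dict[str, Any]],
    current: Dict[str, Dict[str, Any]],
) -> ChangeSet:
    """Compare two snapshots that map primary key -> row."""
    added = sorted(key for key in current if key not in previous)
    removed = sorted(key for key in previous if key not in current)
    changed = sorted(
        key for key in current
        if key in previous and current[key] != previous[key]
    )
    return ChangeSet(added=added, changed=changed, removed=removed)


if __name__ == "__main__":
    last_sync = {"1": {"email": "a@example.com"}, "2": {"email": "b@example.com"}}
    this_sync = {"2": {"email": "b2@example.com"}, "3": {"email": "c@example.com"}}
    print(diff_snapshots(last_sync, this_sync))
```

In these terms, the difference between the engines is where the previous snapshot lives and where the comparison runs: in Hightouch's infrastructure for Basic, or in Amazon Athena for Lightning.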

Hightouch recommends using the Lightning sync engine when syncing more than 100 thousand rows of data.

If you select the Basic engine, you can switch to the Lightning engine later. Once you've configured the Lightning engine, you can't move back to the Basic engine without recreating Amazon Athena as a source.

To learn more, including migration steps and tips, check out the Lightning sync engine docs.

Basic versus Lightning engine comparison

Criteria | Basic sync engine | Lightning sync engine
Performance | Slower | Quicker
Ideal for large data models (over 100 thousand rows) | No | Yes
Reliability | Normal | High
Resilience to sync interruptions | Normal | High
Extra features | None | Warehouse Sync Logs, Match Booster, Identity Resolution
Ease of setup | Simpler | More involved
Location of change data capture | Hightouch infrastructure | Amazon Athena schemas managed by Hightouch
Required permissions in Amazon Athena | Read-only | Read and write
Ability to switch | You can move to the Lightning engine at any time | You can't move to the Basic engine once Lightning is configured

Lightning engine setup

Hightouch uses two schemas (hightouch_planner and hightouch_audit) for storing logs of previously synced data. Hightouch must be able to read and write to these schemas, but the specific schema names might vary.

-- Required for the Lightning engine
CREATE SCHEMA IF NOT EXISTS <catalog>.hightouch_planner;
-- For sync logs, optional
CREATE SCHEMA IF NOT EXISTS <catalog>.hightouch_audit;

Test your connection

When setting up a source for the first time, Hightouch validates the following:

  • Network connectivity
  • Amazon Athena credentials
  • Permission to list schemas and tables
  • Permission to write to hightouch_planner schema
  • Permission to write to hightouch_audit schema

All configurations must pass the first three, while those with the Lightning engine must pass all of them.
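
The rule above can be written out as a small lookup. This is only a sketch of the logic as described on this page. The engine labels `"basic"` and `"lightning"` and the function names are assumptions for illustration, not Hightouch API values.

```python
from typing import Dict, List

# Validation checks in the order listed above.
CHECKS: List[str] = [
    "Network connectivity",
    "Amazon Athena credentials",
    "Permission to list schemas and tables",
    "Permission to write to hightouch_planner schema",
    "Permission to write to hightouch_audit schema",
]


def required_checks(engine: str) -> List[str]:
    """Return the checks a source must pass for the given sync engine."""
    if engine == "basic":
        return CHECKS[:3]  # the first three apply to every configuration
    if engine == "lightning":
        return list(CHECKS)  # Lightning must pass all five
    raise ValueError(f"Unknown engine: {engine!r}")


def connection_ok(engine: str, results: Dict[str, bool]) -> bool:
    """True if every required check passed; a missing result counts as a failure."""
    return all(results.get(check, False) for check in required_checks(engine))


if __name__ == "__main__":
    read_only = {check: True for check in CHECKS[:3]}
    print(connection_ok("basic", read_only))      # read-only access is enough for Basic
    print(connection_ok("lightning", read_only))  # Lightning also needs the write checks
```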

Some sources may initially fail connection tests due to timeouts. Once a connection is established, subsequent API requests should happen more quickly, so it's best to retry tests if they first fail. You can do this by clicking Test again.

If you've retried the tests and verified your credentials are correct but the tests are still failing, don't hesitate to reach out.

Next steps

Once your source configuration has passed the necessary validation, your source setup is complete. Next, you can set up models to define which data you want to pull from Athena.

The Athena source supports these modeling methods:

Data types

Hightouch parses most Athena data types into JavaScript types before sending them to your destination. Hightouch leaves the following Athena data types as strings:

  • BIGINT
  • ARRAY
  • MAP
  • BINARY
  • STRUCT

Read more about Athena data types in the Athena documentation.
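
One plausible reason for keeping BIGINT as a string (an assumption, since this page doesn't state it) is that JavaScript numbers are IEEE 754 doubles, which represent integers exactly only up to 2^53 - 1, while BIGINT is a 64-bit signed integer. The sketch below shows the precision loss and how a downstream consumer might convert the string back to an exact integer. The helper names are hypothetical.

```python
# Largest integer a double can represent without gaps
# (the same value as JavaScript's Number.MAX_SAFE_INTEGER).
MAX_SAFE_INTEGER = 2**53 - 1


def parse_bigint(value: str) -> int:
    """Convert a BIGINT delivered as a string into an exact Python int."""
    return int(value)


def fits_in_double(number: int) -> bool:
    """True if the integer survives a round trip through a 64-bit float."""
    return abs(number) <= MAX_SAFE_INTEGER


if __name__ == "__main__":
    largest_bigint = parse_bigint("9223372036854775807")  # 2**63 - 1
    print(fits_in_double(largest_bigint))
    # Above the safe range, distinct integers collapse to the same double.
    print(float(2**53 + 1) == float(2**53))
```

The other types in the list (ARRAY, MAP, BINARY, STRUCT) also arrive as strings. Their exact string format isn't described here, so this sketch doesn't attempt to parse them.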

Tips and troubleshooting

To date, our customers haven't experienced any errors while using this source. If you run into any issues, please don't hesitate to reach out. We're here to help.


Last updated: May 16, 2025
