Prepare your data for Identity Resolution

Identity Resolution is available only on Business tier plans. You can use it with or without Customer Studio.

Audience	How you’ll use this article
Data teams	Validate data structure and quality before creating an identity graph.
Platform admins	Identify and resolve data issues that could affect identity resolution accuracy.

Before creating an identity graph in Hightouch, confirm that your data is structurally ready and well suited to identity resolution.

This article focuses on data readiness, not UI configuration. You won’t configure models, identifiers, or rules here. Instead, you’ll learn what to check before starting setup so that identity resolution behaves predictably.

What Identity Resolution expects from your data

Identity Resolution groups records from one or more models into resolved identities. To do this reliably, your data should meet a few core expectations:

Tables are defined as models in Hightouch
Each model has a stable primary key
Each model includes a timestamp for incremental processing
Identifiers are present, meaningful, and reasonably consistent

If these expectations aren’t met, identity graph setup may fail, or the resulting graph may be unstable and hard to interpret.

Model requirements

Identity Resolution only works with Hightouch models, not raw warehouse tables.

Before you begin, confirm that:

All tables you want to use exist as models
Model SQL is finalized and deterministic
Primary keys are truly unique per row

Identity Resolution relies on primary keys to track records across runs. Non-unique keys can cause rows to appear unresolved or behave unpredictably.
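One way to confirm that a primary key is unique is to count rows per key value before setup. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for your warehouse, and the `users` table, the `user_id` column, and the `find_duplicate_keys` helper are hypothetical names, not part of Hightouch.

```python
import sqlite3


def find_duplicate_keys(conn, table, key_column):
    """Return (key, row_count) pairs for keys that are NULL or appear more than once."""
    # Table and column names are interpolated directly, so pass only trusted names.
    query = f"""
        SELECT {key_column}, COUNT(*) AS row_count
        FROM {table}
        GROUP BY {key_column}
        HAVING COUNT(*) > 1 OR {key_column} IS NULL
        ORDER BY row_count DESC
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data standing in for a model's query results.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("u1", "a@example.com"), ("u2", "b@example.com"), ("u2", "b2@example.com")],
)

duplicates = find_duplicate_keys(conn, "users", "user_id")
print(duplicates)  # [('u2', 2)]
```

An empty result suggests the column is safe to use as a primary key. Any returned rows point to keys that should be deduplicated in the model SQL first.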

Timestamp readiness

Every model used in identity resolution must include a timestamp column. Timestamps allow Hightouch to process new or updated records incrementally.

Use the timestamp that best represents when the record changed:

Event models → event time (for example, event_time)
Entity models → last updated time (for example, updated_at or loaded_at)

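Avoid timestamp columns that are:

Null, since rows without a timestamp can’t be ordered for incremental processing
Always CURRENT_TIMESTAMP, since a value generated at query time makes every row look newly updated on each run
Unrelated to record changes, since they don’t indicate when a row was actually created or modified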
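The timestamp problems listed above can often be spotted with a quick column profile. The following is a minimal sketch, assuming an in-memory SQLite database in place of your warehouse. The `events` table, its columns, and the `profile_timestamp` helper are hypothetical, and the "looks constant" flag is only a heuristic for a timestamp generated at query time.

```python
import sqlite3


def profile_timestamp(conn, table, ts_column):
    """Summarize a timestamp column: total rows, NULL rows, and distinct non-NULL values."""
    # Table and column names are interpolated directly, so pass only trusted names.
    query = f"""
        SELECT
            COUNT(*) AS total_rows,
            SUM(CASE WHEN {ts_column} IS NULL THEN 1 ELSE 0 END) AS null_rows,
            COUNT(DISTINCT {ts_column}) AS distinct_values
        FROM {table}
    """
    total_rows, null_rows, distinct_values = conn.execute(query).fetchone()
    return {
        "total_rows": total_rows,
        "null_rows": null_rows or 0,  # SUM returns NULL on an empty table
        "distinct_values": distinct_values,
        # A single shared value across many rows hints at a query-time timestamp.
        "looks_constant": total_rows > 1 and distinct_values == 1,
    }


# Hypothetical sample data standing in for an event model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT, event_time TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("e1", "2024-01-01T10:00:00"),
        ("e2", "2024-01-02T11:30:00"),
        ("e3", None),
    ],
)

profile = profile_timestamp(conn, "events", "event_time")
print(profile)
# {'total_rows': 3, 'null_rows': 1, 'distinct_values': 2, 'looks_constant': False}
```

Here the one NULL row would need a fallback value or a filter in the model SQL. A profile where `looks_constant` is true is worth investigating before the column is used for incremental processing.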

Identifier readiness (conceptual)

Identifiers are the signals Identity Resolution uses to decide whether records belong to the same real-world entity.

Before setup, it’s helpful to understand:

Which identifiers exist across your datasets
Which identifiers are stable vs ephemeral
Which identifiers represent:
- A person (email, user ID)
- A device or session (anonymous ID)
- An account or organization
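To build this inventory, it can help to measure how often each identifier is populated in each dataset. The sketch below assumes rows have already been sampled into Python dictionaries. The model names, column names, and the `identifier_coverage` helper are hypothetical and only illustrate the idea.

```python
def identifier_coverage(models, identifier_columns):
    """For each identifier, report the share of rows per model with a non-empty value."""
    coverage = {}
    for identifier in identifier_columns:
        coverage[identifier] = {}
        for model_name, rows in models.items():
            # Skip models that are empty or don't carry this column at all.
            if not rows or not any(identifier in row for row in rows):
                continue
            filled = sum(1 for row in rows if row.get(identifier) not in (None, ""))
            coverage[identifier][model_name] = round(filled / len(rows), 2)
    return coverage


# Hypothetical sampled rows from two models.
models = {
    "users": [
        {"user_id": "u1", "email": "a@example.com"},
        {"user_id": "u2", "email": None},
    ],
    "page_views": [
        {"anonymous_id": "anon-1", "user_id": "u1"},
        {"anonymous_id": "anon-2", "user_id": None},
        {"anonymous_id": "anon-3", "user_id": None},
        {"anonymous_id": "anon-4", "user_id": None},
    ],
}

coverage = identifier_coverage(models, ["user_id", "email", "anonymous_id"])
print(coverage)
# {'user_id': {'users': 1.0, 'page_views': 0.25}, 'email': {'users': 0.5}, 'anonymous_id': {'page_views': 1.0}}
```

In this sample, `user_id` appears in both models but is sparsely populated in `page_views`, which is typical of an event stream where most traffic is anonymous. An identifier that appears in only one model can't link records across models on its own.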

You’ll select and map identifier columns during the Add identity graph workflow, not in this article. Having this context ahead of time helps you make better decisions during setup.