Rebuilding session attribution in GA4 BigQuery without guesswork

Key Takeaway

Rebuilding session attribution in BigQuery requires understanding that event-level source/medium is not the same as session-level attribution. Use collected_traffic_source fields and session scoping logic, not just the first event in each session.

The mistake most teams make in BigQuery is assuming they can recreate GA4 session attribution with a few basic fields and a last-touch shortcut. Session attribution in export data is possible to analyse, but only if you are clear about what GA4 gives you and what you are reconstructing yourself.

  • Collected: event-level traffic fields describe what was captured on the event
  • Session last click: the BigQuery schema includes session_traffic_source_last_click where available
  • Different scopes: raw export and GA4 reports are related, but not identical

Why session attribution in BigQuery is harder than it looks

GA4 export is event-level. Session attribution analysis requires careful use of session identifiers, traffic-source fields, and timestamp logic. If you shortcut that logic, you can easily create a warehouse model that looks precise but is still wrong.

The core problem is not just session reconstruction. It is scope confusion. Teams often mix user acquisition fields, collected event fields, and session last-click fields as if they were interchangeable. They are not, and they map differently to GA4's reporting attribution models.

| Aspect | GA4 UI Attribution | BigQuery Raw Attribution |
|---|---|---|
| Session source assignment | GA4 reports use session-scoped rules and reporting logic | Use session_traffic_source_last_click where available, or document your reconstruction approach |
| Modeled conversions | Can affect some reporting layers | Raw export should be interpreted from exported data only |
| Cross-device attribution | Applied via Reporting Identity | Not applied, device-level only |
| Data freshness | Interface logic and export timing differ by surface | Intraday is provisional; daily tables are safer for parity work |
| Attribution model | Depends on scope and property settings | Depends on which export fields and business rules you use |
| Self-referral handling | Reporting logic can reflect property configuration | Your SQL model needs to account for the relevant exclusion logic explicitly |

The safer starting point: define your question first

Start by defining the exact question your analysis needs to answer:

  • Do you want session-level source attribution?
  • Do you want purchase attribution tied to a conversion event?
  • Do you want parity with a specific GA4 UI report?

These are related but different models. Trying to make one SQL query answer all three usually produces false confidence. Build separate models for each question, run a parity check against the matching GA4 surface, and document which question each model answers.

1. Define which attribution question you are answering

Decide whether you need user acquisition, session last-click attribution, or collected campaign context. Each uses different fields and different logic.

2. Prefer the schema field that already matches the question

If your export includes session_traffic_source_last_click and it fits the analysis, use it before rebuilding session logic from lower-level fields.
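
As a rough illustration of that preference order, the sketch below reads one exported event row as a Python dict and reports which field the source/medium came from. The nested names (manual_campaign.source, manual_source, and so on) follow the export schema as we understand it; the function name and the returned "origin" label are hypothetical, so check the field paths against your own dataset before relying on them.

```python
from typing import Optional


def session_source_medium(event: dict) -> Optional[tuple]:
    """Return (source, medium, origin_field) for one exported event row.

    Assumes the row was loaded as a nested dict. The fallback to
    collected_traffic_source is a reconstruction choice, not GA4 logic,
    which is why the origin field is returned alongside the values.
    """
    last_click = (event.get("session_traffic_source_last_click") or {}).get(
        "manual_campaign"
    ) or {}
    if last_click.get("source") or last_click.get("medium"):
        return (
            last_click.get("source"),
            last_click.get("medium"),
            "session_traffic_source_last_click",
        )

    collected = event.get("collected_traffic_source") or {}
    if collected.get("manual_source") or collected.get("manual_medium"):
        return (
            collected.get("manual_source"),
            collected.get("manual_medium"),
            "collected_traffic_source",
        )

    # Neither field is populated on this row.
    return None
```

Returning the origin label with every value keeps the scope visible downstream, so a dashboard cannot silently mix session last-click values with collected event context.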

3. Use collected fields carefully

Treat collected_traffic_source as captured event context, not as a drop-in replacement for every GA4 session report.
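
If you do reconstruct a session-level value from collected fields, write the rule down as code so it can be reviewed. The sketch below shows one possible rule: within each session, take the earliest event that actually carries a collected source, rather than blindly taking the first event. This is an illustrative business rule, not GA4's own session logic. It assumes rows already flattened into dicts with user_pseudo_id, ga_session_id and event_timestamp keys; the function name is hypothetical.

```python
from collections import defaultdict


def first_collected_source_per_session(events):
    """Map (user_pseudo_id, ga_session_id) -> (source, medium) or None.

    Rule (an assumption, not GA4 behaviour): the earliest event in the
    session that has a non-empty collected_traffic_source.manual_source.
    """
    sessions = defaultdict(list)
    for event in events:
        key = (event["user_pseudo_id"], event["ga_session_id"])
        sessions[key].append(event)

    result = {}
    for key, rows in sessions.items():
        rows.sort(key=lambda row: row["event_timestamp"])
        result[key] = None  # stays None if no event carries a source
        for row in rows:
            collected = row.get("collected_traffic_source") or {}
            if collected.get("manual_source"):
                result[key] = (
                    collected["manual_source"],
                    collected.get("manual_medium"),
                )
                break
    return result
```

Sessions with no collected source stay as None instead of being labelled Direct, which leaves that classification as an explicit, separate decision.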

4. Handle intraday tables separately

Do not use intraday tables for parity work unless you clearly label the output as provisional.
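
One lightweight way to enforce that labelling is to classify each export table by name before it enters a model. The sketch below assumes the standard events_YYYYMMDD and events_intraday_YYYYMMDD naming; the function names are hypothetical.

```python
import re

# Matches events_20240115 and events_intraday_20240115, nothing else.
_TABLE_RE = re.compile(r"^events_(intraday_)?(\d{8})$")


def classify_export_table(table_name: str):
    """Return ("daily" | "intraday", "YYYYMMDD"), or None for other tables."""
    match = _TABLE_RE.match(table_name)
    if match is None:
        return None
    kind = "intraday" if match.group(1) else "daily"
    return kind, match.group(2)


def is_provisional(table_name: str) -> bool:
    """True when output built from this table should be labelled provisional."""
    classified = classify_export_table(table_name)
    return classified is not None and classified[0] == "intraday"
```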

5. Validate against a small known sample

Compare one date range and one reporting question at a time. Document any remaining delta instead of forcing the warehouse model to look identical to every GA4 report.
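
A parity check of this kind can be as simple as a per-channel delta table. The sketch below compares session counts from the warehouse model against counts copied from one GA4 report for the same date range. Both inputs are plain dicts keyed by a source/medium label; the function name and the input shape are assumptions for illustration.

```python
def parity_deltas(warehouse: dict, ga4_ui: dict) -> dict:
    """Compare per-channel counts and return absolute and relative deltas.

    rel_delta is (warehouse - ui) / ui, or None when the UI count is 0
    and a relative difference is undefined.
    """
    report = {}
    for key in sorted(set(warehouse) | set(ga4_ui)):
        w = warehouse.get(key, 0)
        u = ga4_ui.get(key, 0)
        report[key] = {
            "warehouse": w,
            "ga4_ui": u,
            "abs_delta": w - u,
            "rel_delta": (w - u) / u if u else None,
        }
    return report
```

Channels that appear on only one side show up with a zero on the other, which tends to surface scope mismatches faster than comparing totals.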

6. Document self-referral and cross-domain exclusions

Referral inflation in BigQuery often comes from missing exclusion logic that GA4 applies in the interface.
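
The exclusion itself can be made explicit as a small, testable rule. The sketch below flags a referrer URL as a self-referral when its host equals one of your own domains or is a subdomain of one. The matching rule (exact host or subdomain) is an assumption and may not mirror the match types configured in your GA4 property; own_domains and the function name are hypothetical.

```python
from urllib.parse import urlparse


def is_self_referral(referrer_url: str, own_domains: set) -> bool:
    """True if the referrer host is one of own_domains or a subdomain of it.

    Referrers without a scheme (for example "example.com/page") have no
    parseable host here and are treated as not matching.
    """
    host = (urlparse(referrer_url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain) for domain in own_domains
    )
```

For example, with own_domains = {"example.com"}, a referrer of https://checkout.example.com/pay is flagged, while https://notexample.com/ is not, because the check requires a dot boundary before the domain.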

What to avoid in BigQuery session attribution

These are the most common mistakes that produce plausible-looking but incorrect results in BigQuery attribution models. Many of them surface in the same way as (not set) values or source/medium suddenly turning Direct in GA4 reports.

BigQuery attribution audit action plan

Run this audit when building or validating a BigQuery session attribution model. The validate steps confirm your model inputs are correct before you invest in building BI infrastructure on top.

Validate

  • The analysis states clearly whether it is using collected_traffic_source, traffic_source, or session_traffic_source_last_click
  • Session key uses both user_pseudo_id AND ga_session_id (ga_session_id alone is not globally unique)
  • Intraday tables are handled separately from finalized daily tables in all models
  • Self-referral exclusion logic matches the domains configured in GA4 Admin > Data Streams > Configure Tag Settings
  • Parity testing is done against one GA4 report scope at a time, not against every surface simultaneously
  • Any remaining export-versus-interface gap is documented before stakeholders use the model operationally
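
To make the session key check concrete, the sketch below builds a composite key from user_pseudo_id and the ga_session_id event parameter, assuming each row is a dict with event_params as a list of key/value records (the shape of the export's repeated field). The helper names and the "." separator are hypothetical choices.

```python
def get_int_param(event: dict, key: str):
    """Return the int_value of the named event parameter, or None."""
    for param in event.get("event_params") or []:
        if param.get("key") == key:
            return (param.get("value") or {}).get("int_value")
    return None


def session_key(event: dict):
    """Composite session key: user_pseudo_id plus ga_session_id.

    Returns None when ga_session_id is missing, so such events can be
    counted and reviewed rather than merged into a bogus session.
    """
    session_id = get_int_param(event, "ga_session_id")
    if session_id is None:
        return None
    return f'{event["user_pseudo_id"]}.{session_id}'
```

Two users who start a session in the same second can share a ga_session_id, so counting distinct ga_session_id values alone would merge them; the composite key keeps them apart.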

Fix

  • If session count is inflated: verify you are not double-counting intraday events that also appear in finalized tables
  • If Referral source is inflated: add self-referral exclusion for your own domains to match GA4 UI behaviour
  • If source attribution differs significantly from UI: confirm you are using the export field that matches the report scope you are trying to compare
  • If cross-device counts differ: note that BigQuery is device-level only. GA4 UI applies cross-device stitching via Reporting Identity
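
For the double-counting fix, one simple guard is to drop an intraday table from the input list whenever a finalized daily table exists for the same date. The sketch below assumes the standard events_YYYYMMDD and events_intraday_YYYYMMDD table names; the function name is hypothetical, and whether both tables can coexist in your dataset is something to verify rather than assume.

```python
DAILY_PREFIX = "events_"
INTRADAY_PREFIX = "events_intraday_"


def tables_without_overlap(table_names):
    """Keep every table except intraday tables whose date also has a daily table."""
    daily_dates = {
        name[len(DAILY_PREFIX):]
        for name in table_names
        if name.startswith(DAILY_PREFIX) and not name.startswith(INTRADAY_PREFIX)
    }
    kept = []
    for name in table_names:
        if name.startswith(INTRADAY_PREFIX):
            if name[len(INTRADAY_PREFIX):] in daily_dates:
                continue  # finalized data exists for this date; skip intraday
        kept.append(name)
    return kept
```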

Watch for

  • Warehouse documentation that says BigQuery is a perfect recreation of every GA4 attribution report
  • Referral traffic in BigQuery significantly higher than in GA4 UI (indicates missing self-referral exclusion)
  • BigQuery and GA4 interface totals being treated as interchangeable without a scope note

BigQuery session attribution checklist

  • The chosen export field matches the attribution question being answered
  • Session key uses user_pseudo_id AND ga_session_id together
  • Intraday tables handled separately from finalized daily tables
  • Self-referral exclusion logic implemented in SQL
  • Parity testing is documented against a specific GA4 report scope
  • Export-versus-interface differences are disclosed to stakeholders

Review attribution rebuild assumptions before you operationalize them

GA4 Audits can flag likely parity and attribution risks, but warehouse attribution models still need analyst review before they are treated as a source of truth.

Audit findings should be reviewed by a qualified analyst before they are used for major reporting, media, or implementation decisions.

GA4 Audits Team

Analytics Engineering

Specialising in GA4 architecture, consent mode implementation, and multi-layer audit frameworks.
