The mistake most teams make in BigQuery is assuming they can recreate GA4 session attribution with a few basic fields and a last-touch shortcut. Session attribution in export data is possible to analyse, but only if you are clear about what GA4 gives you and what you are reconstructing yourself.
- Event-level traffic fields describe what was captured on the event.
- The BigQuery schema includes session_traffic_source_last_click where available.
- The raw export and GA4 reports are related, but not identical.
Why session attribution in BigQuery is harder than it looks
GA4 export is event-level. Session attribution analysis requires careful use of session identifiers, traffic-source fields, and timestamp logic. If you shortcut that logic, you can easily create a warehouse model that looks precise but is still wrong.
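As a minimal sketch of the session-identifier step, the Python below groups exported rows into sessions. It assumes each row has already been loaded as a dict shaped like the export (user_pseudo_id, event_timestamp, and an event_params list holding ga_session_id as an int_value). The helper names, including make_event, are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def get_int_param(event, key):
    """Return the int_value of the named event parameter, or None if absent."""
    for param in event.get("event_params") or []:
        if param.get("key") == key:
            return (param.get("value") or {}).get("int_value")
    return None


def session_key(event):
    """Build a (user_pseudo_id, ga_session_id) key, or None if a part is missing."""
    user = event.get("user_pseudo_id")
    session_id = get_int_param(event, "ga_session_id")
    if user is None or session_id is None:
        return None  # the event cannot be assigned to a session
    return (user, session_id)


def group_by_session(events):
    """Group events by session key and order each session by event_timestamp."""
    sessions = defaultdict(list)
    for event in events:
        key = session_key(event)
        if key is not None:
            sessions[key].append(event)
    for session_events in sessions.values():
        session_events.sort(key=lambda e: e["event_timestamp"])
    return dict(sessions)


def make_event(user, session_id, timestamp, name):
    """Hypothetical helper: build a row shaped like the export, for illustration."""
    return {
        "user_pseudo_id": user,
        "event_timestamp": timestamp,
        "event_name": name,
        "event_params": [{"key": "ga_session_id", "value": {"int_value": session_id}}],
    }
```

Two users can share the same ga_session_id, so keying on the pair keeps their sessions apart. Events missing either identifier are left out rather than guessed at.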
The core problem is not just session reconstruction. It is scope confusion. Teams often mix user acquisition fields, collected event fields, and session last-click fields as if they were interchangeable. They are not, and they map differently to GA4's reporting attribution models.
The safer starting point: define your question first
Start by defining the exact question your analysis needs to answer:
- Do you want session-level source attribution?
- Do you want purchase attribution tied to a conversion event?
- Do you want parity with a specific GA4 UI report?
These are related but different models. Trying to make one SQL query answer all three usually produces false confidence. Build separate models for each question, run a parity check against the matching GA4 surface, and document which question each model answers.
Define which attribution question you are answering
Decide whether you need user acquisition, session last-click attribution, or collected campaign context. Each uses different fields and different logic.
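One way to keep the three scopes from blurring is to make the scope an explicit argument. The sketch below assumes rows loaded as nested dicts and assumes these field paths: traffic_source.source and .medium for user acquisition, collected_traffic_source.manual_source and .manual_medium for collected context, and session_traffic_source_last_click.manual_campaign.source and .medium for session last click. Check the paths against your own export schema. The scope labels and function names are hypothetical.

```python
# Assumed field paths in an export row, one (source, medium) pair per scope.
SCOPE_PATHS = {
    "user_acquisition": (
        ("traffic_source", "source"),
        ("traffic_source", "medium"),
    ),
    "collected_event": (
        ("collected_traffic_source", "manual_source"),
        ("collected_traffic_source", "manual_medium"),
    ),
    "session_last_click": (
        ("session_traffic_source_last_click", "manual_campaign", "source"),
        ("session_traffic_source_last_click", "manual_campaign", "medium"),
    ),
}


def dig(row, path):
    """Walk a nested dict along path, returning None if any level is missing."""
    value = row
    for name in path:
        if not isinstance(value, dict):
            return None
        value = value.get(name)
    return value


def source_medium(row, scope):
    """Return (source, medium) for exactly one named attribution scope."""
    if scope not in SCOPE_PATHS:
        raise ValueError(f"unknown attribution scope: {scope!r}")
    source_path, medium_path = SCOPE_PATHS[scope]
    return dig(row, source_path), dig(row, medium_path)
```

Because the caller has to name a scope, a model cannot silently fall through from one field family to another.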
Prefer the schema field that already matches the question
If your export includes session_traffic_source_last_click and it fits the analysis, use it before rebuilding session logic from lower-level fields.
Use collected fields carefully
Treat collected_traffic_source as captured event context, not as a drop-in replacement for every GA4 session report.
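The two rules above (prefer the session field, treat collected fields with care) can be sketched as a single function that records which method produced the answer. The fallback rule here, taking the earliest event in the session that carries a manual_source, is an assumed approximation and not GA4's own session logic. The function name and method labels are hypothetical.

```python
def session_source(session_events):
    """Attribute one session and record which method produced the result.

    session_events: dicts for a single session, shaped like export rows.
    """
    ordered = sorted(session_events, key=lambda e: e["event_timestamp"])

    # Preferred: the session-scoped last-click field, when the export has it.
    for event in ordered:
        last_click = event.get("session_traffic_source_last_click") or {}
        manual = last_click.get("manual_campaign") or {}
        if manual.get("source"):
            return {
                "source": manual.get("source"),
                "medium": manual.get("medium"),
                "method": "session_last_click",
            }

    # Fallback (assumption): earliest event in the session with collected context.
    for event in ordered:
        collected = event.get("collected_traffic_source") or {}
        if collected.get("manual_source"):
            return {
                "source": collected.get("manual_source"),
                "medium": collected.get("manual_medium"),
                "method": "collected_first_event",
            }

    return {"source": None, "medium": None, "method": "unattributed"}
```

Carrying the method label into downstream tables lets reviewers see how many sessions rely on the reconstructed fallback rather than the schema field.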
Handle intraday tables separately
Do not use intraday tables for parity work unless you clearly label the output as provisional.
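A small sketch of that separation, assuming the usual export table naming of events_YYYYMMDD for finalized daily tables and events_intraday_YYYYMMDD for intraday tables. The function names and the shape of the returned plan are hypothetical.

```python
import re

# Assumed naming: events_YYYYMMDD (daily) and events_intraday_YYYYMMDD (intraday).
TABLE_PATTERN = re.compile(r"^events_(intraday_)?(\d{8})$")


def classify_table(table_name):
    """Return (kind, yyyymmdd) for a recognised export table, else None."""
    match = TABLE_PATTERN.match(table_name)
    if match is None:
        return None
    kind = "intraday" if match.group(1) else "daily"
    return kind, match.group(2)


def plan_tables(table_names):
    """Pick one table per day, preferring finalized daily tables.

    Intraday tables are used only for days with no daily table yet,
    and are flagged as provisional.
    """
    daily, intraday = {}, {}
    for name in table_names:
        parsed = classify_table(name)
        if parsed is None:
            continue  # not a table this sketch recognises
        kind, day = parsed
        (daily if kind == "daily" else intraday)[day] = name

    plan = {day: {"table": name, "provisional": False} for day, name in daily.items()}
    for day, name in intraday.items():
        if day not in plan:
            plan[day] = {"table": name, "provisional": True}
    return plan
```

Choosing one table per day also guards against counting the same events twice when an intraday table and a finalized table exist for the same date.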
Validate against a small known sample
Compare one date range and one reporting question at a time. Document any remaining delta instead of forcing the warehouse model to look identical to every GA4 report.
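A parity check of this kind can be recorded as a small structured result instead of an ad hoc comparison. The sketch below is illustrative only: the 2% default tolerance is an arbitrary assumption, and the function and field names are hypothetical.

```python
def parity_delta(warehouse_count, ui_count):
    """Relative difference of the warehouse figure against the GA4 UI figure."""
    if ui_count == 0:
        return None  # no meaningful ratio when the UI reports zero
    return (warehouse_count - ui_count) / ui_count


def parity_report(scope, date_range, warehouse_count, ui_count, tolerance=0.02):
    """Describe one comparison: one report scope, one date range.

    tolerance is an assumed threshold; agree on a value with stakeholders.
    """
    delta = parity_delta(warehouse_count, ui_count)
    return {
        "scope": scope,
        "date_range": date_range,
        "warehouse_count": warehouse_count,
        "ui_count": ui_count,
        "delta": delta,
        "within_tolerance": delta is not None and abs(delta) <= tolerance,
    }
```

For example, parity_report("session_last_click", ("2024-01-01", "2024-01-07"), 1030, 1000) yields a delta of about 3%, which falls outside the assumed tolerance and so becomes a gap to document rather than one to hide.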
Document self-referral and cross-domain exclusions
Referral inflation in BigQuery often comes from missing exclusion logic that GA4 applies in the interface.
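The exclusion itself can be sketched as a filter over candidate traffic sources. The domain list is a placeholder for whatever is configured in your property. The matching rule (exact domain or any subdomain) is an assumption that may not match the conditions configured in GA4, and the function names are hypothetical.

```python
# Placeholder domains: replace with the ones configured for your property.
OWN_DOMAINS = {"example.com", "pay.example.net"}


def is_self_referral(source, medium, own_domains=OWN_DOMAINS):
    """True when a referral's source is one of our own domains or a subdomain.

    Assumption: exact-or-subdomain matching; GA4's configured match
    conditions may be broader or narrower than this.
    """
    if medium != "referral" or not source:
        return False
    host = source.lower()
    return any(host == domain or host.endswith("." + domain) for domain in own_domains)


def drop_self_referrals(touches, own_domains=OWN_DOMAINS):
    """Remove self-referral touches so they cannot win attribution.

    touches: list of dicts with 'source' and 'medium' keys.
    """
    return [
        touch
        for touch in touches
        if not is_self_referral(touch.get("source"), touch.get("medium"), own_domains)
    ]
```

Dropping the touch, rather than relabelling it, leaves the choice of what the session falls back to with the rest of the model.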
What to avoid in BigQuery session attribution
These are the most common mistakes that produce plausible-looking but incorrect results in BigQuery attribution models. Many of them surface in the same way as (not set) values or source/medium suddenly turning Direct in GA4 reports.
BigQuery attribution audit action plan
Run this audit when building or validating a BigQuery session attribution model. The validate steps confirm your model inputs are correct before you invest in building BI infrastructure on top.
Validate
- The analysis states clearly whether it is using collected_traffic_source, traffic_source, or session_traffic_source_last_click
- Session key uses both user_pseudo_id AND ga_session_id (ga_session_id alone is not globally unique)
- Intraday tables are handled separately from finalized daily tables in all models
- Self-referral exclusion logic matches the domains configured in GA4 Admin > Data Streams > Configure Tag Settings
- Parity testing is done against one GA4 report scope at a time, not against every surface simultaneously
- Any remaining export-versus-interface gap is documented before stakeholders use the model operationally
Fix
- If session count is inflated: verify you are not double-counting intraday events that also appear in finalized tables
- If Referral source is inflated: add self-referral exclusion for your own domains to match GA4 UI behaviour
- If source attribution differs significantly from UI: confirm you are using the export field that matches the report scope you are trying to compare
- If cross-device counts differ: note that BigQuery export data is device-level only, whereas the GA4 UI applies cross-device stitching via Reporting Identity
Watch for
- Warehouse documentation that says BigQuery is a perfect recreation of every GA4 attribution report
- Referral traffic in BigQuery significantly higher than in GA4 UI (indicates missing self-referral exclusion)
- BigQuery and GA4 interface totals being treated as interchangeable without a scope note
BigQuery session attribution checklist
- The chosen export field matches the attribution question being answered
- Session key uses user_pseudo_id AND ga_session_id together
- Intraday tables handled separately from finalized daily tables
- Self-referral exclusion logic implemented in SQL
- Parity testing is documented against a specific GA4 report scope
- Export-versus-interface differences are disclosed to stakeholders
Review attribution rebuild assumptions before you operationalize them
GA4Audits can flag likely parity and attribution risks, but warehouse attribution models still need analyst review before they are treated as a source of truth.