What's in the GA4 BigQuery export schema?
The GA4 BigQuery export contains 30+ top-level fields per event row, organised into seven groups: identity (user_pseudo_id, user_id), timing (event_timestamp, event_date), event metadata (event_name, event_params), traffic source (traffic_source.*, session_traffic_source_last_click.* added 2024), device and geo (device.*, geo.*), e-commerce (ecommerce.*, items array), and derived fields (privacy_info.*, is_active_user, session_traffic_source_last_click.*).
Three of those fields are nested arrays, event_params, user_properties, and items, which need UNNEST() to query. The schema is consistent across all GA4 properties; the only variation is whether session_traffic_source_last_click is present (depends on export configuration date).
This post is the reference. Bookmark it.
The complete schema (30+ top-level fields)
Identity fields
| Field | Type | What it contains |
|---|---|---|
user_pseudo_id | STRING | Anonymous user ID (the GA cookie value). Persistent across sessions for the same browser. |
user_id | STRING | Custom user ID set via the User-ID feature. NULL if not implemented. |
pseudo_user_properties | RECORD (repeated) | User-scoped properties set with set_user_properties calls. |
user_first_touch_timestamp | INT64 | Timestamp of the user's first touch (first session start). |
Timing fields
| Field | Type | What it contains |
|---|---|---|
event_timestamp | INT64 | Microseconds since Unix epoch when the event was logged client-side. |
event_date | STRING | Event date in YYYYMMDD format (the table suffix). |
event_previous_timestamp | INT64 | Microseconds since Unix epoch of the previous event from the same user. |
event_server_timestamp_offset | INT64 | Difference between client and server timestamp in microseconds. |
Event metadata
| Field | Type | What it contains |
|---|---|---|
event_name | STRING | The event name (e.g., page_view, purchase, session_start). |
event_params | RECORD (repeated) | Array of key-value parameters attached to the event. Critical field, most analysis lives here. |
event_value_in_usd | FLOAT64 | Currency-converted value if event has a value parameter. |
event_bundle_sequence_id | INT64 | The sequential ID of the bundle this event was uploaded in. |
stream_id | STRING | Identifies the data stream the event came from. |
platform | STRING | The platform that generated the event (WEB, IOS, ANDROID). |
Traffic source
| Field | Type | What it contains |
|---|---|---|
traffic_source.source | STRING | First-touch source (lifetime user attribution). |
traffic_source.medium | STRING | First-touch medium. |
traffic_source.name | STRING | First-touch campaign name. |
session_traffic_source_last_click.manual_campaign | RECORD | Last-click source for the current session (added 2024). |
collected_traffic_source | RECORD | Source data from the actual collected event (newer field). |
The session_traffic_source_last_click field, added in 2024, gives session-level last-click attribution directly. Before this field, you had to extract source/medium from the session_start event's params.
Device and geo
| Field | Type | What it contains |
|---|---|---|
device.category | STRING | desktop, mobile, or tablet. |
device.mobile_brand_name | STRING | Apple, Samsung, etc. |
device.mobile_model_name | STRING | iPhone 15, Galaxy S24, etc. |
device.operating_system | STRING | iOS, Android, Windows, macOS, Linux. |
device.operating_system_version | STRING | OS version string. |
device.web_info.browser | STRING | Chrome, Safari, Firefox, Edge. |
device.web_info.browser_version | STRING | Browser version. |
device.language | STRING | Browser language preference (en-us, fr-fr). |
geo.continent | STRING | The continent. |
geo.country | STRING | The country (e.g., "United Kingdom"). |
geo.region | STRING | Subnational region (state, province). |
geo.city | STRING | City. |
geo.metro | STRING | Metro area code (US-specific). |
E-commerce
| Field | Type | What it contains |
|---|---|---|
ecommerce.transaction_id | STRING | Transaction ID for purchase events. |
ecommerce.purchase_revenue | FLOAT64 | Revenue from the transaction (excluding tax/shipping). |
ecommerce.purchase_revenue_in_usd | FLOAT64 | Revenue in USD. |
ecommerce.refund_value | FLOAT64 | Refund amount. |
ecommerce.shipping_value | FLOAT64 | Shipping cost. |
ecommerce.tax_value | FLOAT64 | Tax amount. |
ecommerce.unique_items | INT64 | Number of unique items in the transaction. |
ecommerce.total_item_quantity | INT64 | Total quantity across all items. |
items | RECORD (repeated) | Array of items in the event (product details). |
Derived and metadata
| Field | Type | What it contains |
|---|---|---|
is_active_user | BOOLEAN | Whether the user is considered active per GA4's definition. |
privacy_info.analytics_storage | STRING | Consent state for analytics_storage. |
privacy_info.ads_storage | STRING | Consent state for ad_storage. |
privacy_info.uses_transient_token | STRING | Whether the user uses a transient token (consent-denied modelled events). |
app_info.id | STRING | App package name (mobile only). |
app_info.version | STRING | App version (mobile only). |
How to unpack nested arrays
Three fields are repeated records that need UNNEST() to query:
event_params
Want to see which hidden implementation gaps are affecting your GA4 data quality?
The value.string_value, value.int_value, value.float_value, and value.double_value fields hold the actual parameter values depending on type. Use the COALESCE pattern when the type is mixed:
items
user_properties
Standard event_params worth knowing
Every GA4 event includes a default set of parameters. The most common:
| Param key | Value type | What it contains |
|---|---|---|
ga_session_id | int_value | The session identifier (used for sessionisation) |
ga_session_number | int_value | Sequential session number for the user (1, 2, 3...) |
page_location | string_value | Full URL of the page |
page_title | string_value | The page title |
page_referrer | string_value | The referring URL |
engagement_time_msec | int_value | Engagement time in milliseconds for this event |
session_engaged | int_value | Whether the session is engaged (1 or 0) |
entrances | int_value | Whether this is the entry event of the session (1 or 0) |
source, medium, campaign | string_value | Traffic source from the URL on the event |
gclid, dclid | string_value | Click ID parameters when present |
For purchase events specifically, additional params:
| Param key | Value type | What it contains |
|---|---|---|
currency | string_value | Currency code (USD, GBP, EUR) |
value | float_value or int_value | Total transaction value |
transaction_id | string_value | Transaction ID (mirrors ecommerce.transaction_id) |
coupon | string_value | Coupon code applied |
What changed in 2024 to 25
Three meaningful schema additions in the past 18 months:
`session_traffic_source_last_click` (added 2024). Top-level field giving session-level last-click attribution without needing to extract from session_start event params. Older properties may not have it for historical data, only forward.
`is_active_user` (added 2024). Boolean indicating whether GA4 considers the user active per its internal definition. Useful for filtering bot or spam events that GA4 has flagged as inactive.
`privacy_info.uses_transient_token` (added 2025). Indicates whether the user uses a transient token, the marker for consent-denied modelled events. Useful for separating modelled from observed data in BigQuery analyses.
Older properties may have legacy fields no longer documented (e.g., event_dimensions.hostname was deprecated in 2023 in favour of extracting hostname from page_location). When in doubt, query the schema directly: SELECT column_name, data_type FROM project.analytics_PROPERTY_ID.INFORMATION_SCHEMA.COLUMNS WHERE table_name LIKE 'events_%' LIMIT 100;
How to use this in a GA4 audit
Use this topic to support a BigQuery export audit and reporting-governance review. This article is about validating exported analytics data without assuming raw tables reproduce every GA4 reporting behavior exactly. Where possible, separate API-verified findings, browser-verified findings, and findings that depend on access to linked platforms.
What to verify
- Use stable daily exports before drawing conclusions from intraday or best-effort tables.
- Compare raw event and transaction slices before comparing attribution or modeled reporting outputs.
- Confirm which fields are available only after processing delays or only in specific export modes.
- State clearly when warehouse logic is an approximation of the GA4 interface rather than a perfect recreation.
Known limitations
- BigQuery export, the Data API, and the GA4 interface can differ for legitimate processing reasons.
- Attribution, identity, thresholding, and modeled features may not map one-to-one to raw export tables.
Before acting on the result
Use the visible evidence behind the finding before changing reporting, bidding, privacy controls, or executive dashboards. GA4 Audits findings should be reviewed by a qualified analyst before major business decisions are made.