GA4 Sampling and Thresholds: When to Trust Your Reports

Key Takeaway

GA4 standard reports are unsampled but may apply data thresholds that suppress rows. Explore reports can be sampled when a query covers a high volume of events. Understanding which reports are affected prevents false conclusions from incomplete data.

GA4 has two distinct mechanisms that can make a report differ from the underlying data: sampling and thresholding. They look similar on the surface, since both produce numbers that are not precisely accurate, but they have entirely different causes, implications, and fixes. Knowing the difference is fundamental to deciding when to trust a report and when to investigate further.

What is sampling in GA4

Sampling occurs when GA4 has too much data to process for a given report query and instead analyses a subset of events to produce the result. The sample is designed to be statistically representative, but it introduces a margin of error that grows as the sampling rate decreases.

In GA4 standard reports, sampling is relatively rare because Google pre-aggregates data. Standard reports are based on aggregated tables computed in the background, not on raw event queries. Sampling is much more common in GA4 Explorations, where you are effectively running ad-hoc queries against raw data. The same query can also return different totals across surfaces; see why GA4 API, UI, and Explore numbers diverge.

When an Exploration is sampled, a yellow warning badge appears in the top right corner of the report. The badge shows the sampling rate, such as 42 percent, meaning the report is based on 42 percent of available events. For a property with 10 million events in the selected date range, the report is based on 4.2 million events. Conversion rates, revenue figures, and segment comparisons from a heavily sampled report can be meaningfully inaccurate.
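The arithmetic above, and the way the margin of error widens as the sampling rate drops, can be sketched in a few lines. This is a textbook approximation that assumes a simple random sample; GA4 does not document its sampling as exactly that, so treat the result as a rough order of magnitude. The session counts and the 2 percent conversion rate are hypothetical.

```python
import math


def sampled_events(total_events: int, sampling_rate: float) -> int:
    """Number of events a sampled report is actually based on."""
    return round(total_events * sampling_rate)


def margin_of_error(p: float, sample_size: int, population: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion (z=1.96 is roughly 95% confidence).

    Assumes a simple random sample and applies the finite population
    correction, so a 100 percent "sample" has no sampling error.
    """
    if sample_size >= population:
        return 0.0
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    fpc = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * fpc


# The example from the text: 10 million events at a 42 percent sampling rate.
events_used = sampled_events(10_000_000, 0.42)

# Hypothetical segment: 200,000 sessions with a 2 percent conversion rate.
population = 200_000
for rate in (0.42, 0.10, 0.01):
    n = round(population * rate)
    moe = margin_of_error(0.02, n, population)
    print(f"sampling rate {rate:.0%}: +/- {moe * 100:.3f} percentage points")
```

The absolute error looks small, but relative to a 2 percent conversion rate it becomes a large share of the metric once the sampling rate falls to a few percent.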

What is thresholding in GA4

Thresholding is fundamentally different from sampling. It is a privacy protection mechanism. When a dimension combination in your report represents a group of users too small to be reported without potentially identifying individuals, GA4 suppresses that data entirely rather than showing it. The mechanism is closely tied to Google Signals, which is the source of most demographic dimensions affected by it.

Thresholding is applied automatically when Google Signals is enabled and a report includes demographic dimensions (age, gender, interests, or any dimension that relies on Signals data). If a particular row in your report (for example, Female, 25-34, Safari, Berlin) has fewer users than Google's minimum threshold, that row is removed from the report.

Like sampling, thresholding is surfaced through GA4's data quality indicator when it applies, but the notice is less prominent and teams often miss it in practice. The data may simply appear incomplete: segmented rows can sum to less than the overall total because some rows have been suppressed.

| | Sampling | Thresholds |
|---|---|---|
| What causes it | Query covers too many events for GA4 to process in full | A dimension row has too few users to show safely under Google Signals |
| Where it applies | Explorations primarily; rarely in standard reports | Standard reports and Explorations when demographic dimensions are included |
| How to detect | Yellow sampling warning badge in Explorations top-right corner | GA4 shows a data quality indicator for thresholding in some report contexts, but it is less prominent than the sampling indicator and often missed |
| How to mitigate | Narrow date range, reduce dimensions, or use BigQuery | Disable Google Signals or switch Reporting Identity to Device-based |
| BigQuery impact | BigQuery export is never sampled; full raw event data | BigQuery is not subject to Signals thresholding |

How to reduce or eliminate sampling

The most effective approaches:

  • Narrow the date range: The most impactful single change. Reducing from 90 days to 30 days typically reduces the event volume substantially and may move the query below the sampling threshold.
  • Reduce dimension count: Each additional dimension multiplies the cardinality of the result set. Fewer dimensions means smaller result sets and less likelihood of hitting sampling thresholds.
  • Upgrade to GA4 360: GA4 360 (the enterprise version) provides significantly higher unsampled data thresholds and access to unsampled exports. Standard properties have lower thresholds beyond which sampling kicks in.
  • Use BigQuery: The BigQuery export contains all raw events without any sampling. Analytical queries on BigQuery return exact results regardless of data volume. For high-volume properties where sampling is frequent, BigQuery is the definitive solution. The trade-offs and reconciliation pitfalls are covered in GA4 BigQuery export parity.
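As an illustration of the first approach, a long date range can be split into shorter windows that are queried one at a time. The helper below is a hypothetical sketch, not part of any GA4 tooling, and the 30-day window size is an arbitrary assumption rather than a documented GA4 limit.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, max_days: int = 30) -> list[tuple[date, date]]:
    """Split an inclusive date range into consecutive inclusive windows.

    Each window covers at most max_days days; the last one may be shorter.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    window_start = start
    while window_start <= end:
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# A 90-day range becomes three 30-day windows.
windows = split_date_range(date(2024, 1, 1), date(2024, 3, 30), max_days=30)
for window_start, window_end in windows:
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Additive metrics such as event counts or revenue can be summed across windows. User counts cannot, because a user active in two windows would be counted twice.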

If your reporting layer is Looker Studio, the safest design is to source long-range queries from BigQuery rather than the GA4 API to avoid hitting sampling on heavy dashboards.
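A minimal sketch of what such a BigQuery-backed query might look like follows. It only builds the SQL text, using the `events_*` daily tables and the `event_date`, `event_name`, and `user_pseudo_id` columns of the standard GA4 export. The project and dataset names are placeholders, and the function name is made up for this example.

```python
import re


def build_event_count_query(project: str, dataset: str, start_suffix: str, end_suffix: str) -> str:
    """Build a daily event and user count query over the GA4 export tables.

    start_suffix and end_suffix are YYYYMMDD strings matched against
    _TABLE_SUFFIX, which also keeps intraday tables out of the scan.
    """
    for suffix in (start_suffix, end_suffix):
        if not re.fullmatch(r"\d{8}", suffix):
            raise ValueError(f"expected a YYYYMMDD suffix, got {suffix!r}")
    return (
        "SELECT\n"
        "  event_date,\n"
        "  event_name,\n"
        "  COUNT(*) AS event_count,\n"
        "  COUNT(DISTINCT user_pseudo_id) AS users\n"
        f"FROM `{project}.{dataset}.events_*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start_suffix}' AND '{end_suffix}'\n"
        "GROUP BY event_date, event_name\n"
        "ORDER BY event_date, event_count DESC"
    )


# Placeholder identifiers: replace with your own project and export dataset.
query = build_event_count_query("my-project", "analytics_123456789", "20240101", "20240331")
print(query)
```

The resulting string can be pasted into the BigQuery console or saved as a custom query for a Looker Studio data source. Because it runs on the raw export, the counts are neither sampled nor thresholded, although they may not match the GA4 UI exactly.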

How to identify thresholding

Create the same report twice: once with demographic dimensions and once without. If the totals differ, thresholding is removing some rows from the demographic version. The difference between the two totals represents the hidden data.
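The comparison can be reduced to a single number: the share of the overall total that is missing from the segmented rows. The sketch below uses hypothetical user counts and a made-up function name. It assumes the rows partition the total, which is only roughly true in GA4 because a user can appear in more than one row, so read the result as an indication rather than an exact count.

```python
def suppressed_share(overall_total: int, segmented_rows: dict[str, int]) -> float:
    """Share of the overall total that does not appear in any segmented row."""
    if overall_total <= 0:
        raise ValueError("overall_total must be positive")
    visible = sum(segmented_rows.values())
    # Clamp at zero: rows can overlap and sum to more than the total.
    hidden = max(overall_total - visible, 0)
    return hidden / overall_total


# Hypothetical export: total users without demographics vs users by age bracket.
overall_users = 12_000
users_by_age = {"18-24": 2_100, "25-34": 4_300, "35-44": 3_000, "45-54": 1_400}

share = suppressed_share(overall_users, users_by_age)
print(f"{share:.1%} of users are missing from the segmented report")
```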

Alternatively, change the Reporting Identity to Device-based in Admin > Data Settings > Reporting Identity. Device-based reporting does not use Google Signals and therefore does not apply the same thresholding. Compare the results. If Device-based shows higher totals in segmented reports, Signals thresholding was hiding rows under the Blended identity.

Standard reports vs explorations

Standard reports in GA4 are rarely sampled because they use pre-aggregated data tables. However, they can be subject to thresholding. Explorations can be sampled (depending on data volume and query complexity) and can also be subject to thresholding (if demographic dimensions are used).
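The rules in this section can be written down as a small lookup, which is handy when annotating a dashboard inventory. This is a deliberate simplification of the text above: it treats standard reports as never sampled, although the article only says sampling there is rare. The function name and the string labels are invented for the example.

```python
def possible_limitations(report_type: str, uses_demographics: bool, signals_enabled: bool) -> set[str]:
    """Return which mechanisms may affect a report, per the rules described above."""
    if report_type not in {"standard", "exploration"}:
        raise ValueError("report_type must be 'standard' or 'exploration'")
    limitations = set()
    # Explorations query raw data ad hoc, so they can be sampled.
    if report_type == "exploration":
        limitations.add("sampling")
    # Thresholding needs both Google Signals and a demographic dimension.
    if uses_demographics and signals_enabled:
        limitations.add("thresholding")
    return limitations


print(possible_limitations("standard", uses_demographics=True, signals_enabled=True))
print(possible_limitations("exploration", uses_demographics=True, signals_enabled=True))
```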

The GA4 UI does not always make the distinction clear. A report that appears to show exact numbers may be thresholded without a warning. Always validate critical metrics in multiple report configurations before treating them as definitive.

The (other) row

A third data limitation mechanism (distinct from sampling and thresholding) is the (other) row that appears in standard reports when a dimension has too many unique values. GA4 groups overflow values into (other). This is a cardinality management feature, not a sampling or privacy mechanism. High-cardinality dimensions like specific page URLs or product SKUs are most affected. The fix is to reduce cardinality in the dimension itself (through better parameter design) or use BigQuery where the (other) aggregation does not apply.
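As a sketch of what reducing cardinality through parameter design can mean for page URLs, the example below strips query strings, fragments, and trailing slashes, then compares the number of unique values before and after. The URLs are made up, and in practice the normalisation would have to happen at collection time (for example in the tag configuration), since GA4 cannot regroup values after processing.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_page_url(url: str) -> str:
    """Drop the query string and fragment, lowercase the host, trim the trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))


def cardinality(values) -> int:
    """Number of unique values in a dimension."""
    return len(set(values))


# Hypothetical page_location values for two actual pages.
raw_urls = [
    "https://example.com/pricing?utm_source=news&sessionid=1",
    "https://example.com/pricing/?sessionid=2",
    "https://Example.com/pricing#plans",
    "https://example.com/blog/post-1?ref=a",
]
normalized_urls = [normalize_page_url(u) for u in raw_urls]

print(cardinality(raw_urls), "unique raw values")
print(cardinality(normalized_urls), "unique normalized values")
```

Whether dropping the whole query string is safe depends on the site: if a parameter selects the content (for example a product ID), it needs to be kept or moved into its own event parameter.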

Sampling and thresholds: validate, fix, and watch

Validate

  • Check Exploration reports for the yellow sampling indicator badge in the top-right corner
  • Check if Google Signals is enabled under Admin > Data Collection and Modification > Google Signals
  • Compare demographic segmented totals against non-demographic totals to detect threshold suppression
  • Switch Reporting Identity to Device-based and compare segmented report totals

Fix

  • Reduce date range in Explorations to lower event volume and reduce sampling
  • Use BigQuery export for large queries that consistently sample in Explorations
  • Disable Google Signals if demographic thresholds are hiding significant data
  • Set Reporting Identity to Device-based to eliminate Signals-based thresholding

Watch for

  • Segmented report totals that don't add up to the unsegmented total, which is a sign of silent thresholding at work
  • Heavily sampled Explorations where the query should be rebuilt or validated against a more reliable source
  • Standard reports showing demographic data that doesn't match Exploration totals

Sampling and thresholding checklist

  • Exploration reports are checked for the yellow sampling warning badge
  • Date ranges in Explorations are as narrow as needed for the analysis
  • Standard reports with demographic dimensions are cross-checked against non-demographic versions to detect thresholding
  • BigQuery is used for high-precision analysis when an Exploration is based on less than 50 percent of available events
  • Stakeholders are informed when a report is based on sampled data and the approximate margin of error

Get exact data, not estimates

GA4 Audits detects sampling, thresholding, and (other) row cardinality issues across your GA4 property automatically.

Audit findings should be reviewed by a qualified analyst before they are used for major reporting, media, or implementation decisions.

GA4 Audits Team

Analytics Engineering

Specialising in GA4 architecture, consent mode implementation, and multi-layer audit frameworks.
