GA4 has two distinct mechanisms that can cause reports to show estimated or incomplete data rather than exact data: sampling and thresholding. They look similar on the surface, since both produce numbers that are not precisely accurate, but they have entirely different causes, different implications, and different fixes. Understanding the difference is fundamental to knowing when to trust a report and when to investigate further.
What is sampling in GA4
Sampling occurs when GA4 has too much data to process for a given report query and instead analyses a representative subset of events to produce the result. The sample is statistically designed to be representative, but it introduces a margin of error that grows as the sampling rate decreases.
In GA4 standard reports, sampling is relatively rare because Google pre-aggregates data. Standard reports are based on aggregated tables computed in the background, not on raw event queries. Sampling is much more common in GA4 Explorations, where you are effectively running ad-hoc queries against raw data. The same query can also return different totals across surfaces; see why GA4 API, UI, and Explore numbers diverge.
When an Exploration is sampled, a yellow warning badge appears in the top right corner of the report. The badge shows the sampling rate, such as 42 percent, meaning the report is based on 42 percent of available events. For a property with 10 million events in the selected date range, the report is based on 4.2 million events. Conversion rates, revenue figures, and segment comparisons from a heavily sampled report can be meaningfully inaccurate.
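To make the effect of a sampling rate concrete, here is a minimal sketch of the arithmetic. It assumes the sample behaves like a simple random sample and uses the textbook normal approximation for a proportion with a finite population correction. GA4 does not publish its sampling method in this form, so treat the margin as a rough guide only. The function names are illustrative and not part of any GA4 API.

```python
import math


def sampled_event_count(total_events: int, sampling_rate: float) -> int:
    """Number of events a report is based on at a given sampling rate (0 to 1)."""
    return round(total_events * sampling_rate)


def approx_margin_of_error(observed_rate: float, population: int,
                           sampling_rate: float, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion such as a conversion rate.

    Assumption: simple random sampling, normal approximation, with a
    finite population correction. z = 1.96 corresponds to roughly 95% confidence.
    """
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in (0, 1]")
    if population < 2:
        raise ValueError("population must be at least 2")
    n = population * sampling_rate
    standard_error = math.sqrt(observed_rate * (1 - observed_rate) / n)
    # The correction shrinks the error to zero as the sample approaches the full population.
    correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * correction


# The example from the text: 10 million events at a 42 percent sampling rate.
events_used = sampled_event_count(10_000_000, 0.42)

# A hypothetical 3% conversion rate over 100,000 sessions at two sampling rates.
margin_at_42 = approx_margin_of_error(0.03, 100_000, 0.42)
margin_at_10 = approx_margin_of_error(0.03, 100_000, 0.10)
```

Under these assumptions the margin at a 10 percent sampling rate is noticeably wider than at 42 percent, which is the sense in which the error grows as the sampling rate decreases.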
What is thresholding in GA4
Thresholding is fundamentally different from sampling. It is a privacy protection mechanism. When a dimension combination in your report represents a group of users too small to be reported without potentially identifying individuals, GA4 suppresses that data entirely rather than showing it. The mechanism is closely tied to Google Signals, which is the source of most demographic dimensions affected by it.
Thresholding is applied automatically when Google Signals is enabled and a report includes demographic dimensions (age, gender, interests, or any dimension that relies on Signals data). If a particular row in your report (for example, Female, 25-34, Safari, Berlin) has fewer users than Google's minimum threshold, that row is removed from the report.
Thresholding is less conspicuous than sampling. GA4's data quality indicator flags it when applicable, but teams still often miss it in practice. The data may simply appear incomplete: segmented rows can sum to less than the overall total because some rows have been suppressed.
How to reduce or eliminate sampling
The most effective approaches:
- Narrow the date range: The most impactful single change. Reducing from 90 days to 30 days typically reduces the event volume substantially and may move the query below the sampling threshold.
- Reduce dimension count: Each additional dimension multiplies the cardinality of the result set. Fewer dimensions means smaller result sets and less likelihood of hitting sampling thresholds.
- Upgrade to GA4 360: GA4 360 (the enterprise version) provides significantly higher unsampled data thresholds and access to unsampled exports. Standard properties have lower thresholds beyond which sampling kicks in.
- Use BigQuery: The BigQuery export contains all raw events without any sampling. Analytical queries on BigQuery return exact results regardless of data volume. For high-volume properties where sampling is frequent, BigQuery is the definitive solution. The trade-offs and reconciliation pitfalls are covered in GA4 BigQuery export parity.
If your reporting layer is Looker Studio, the safest design is to source long-range queries from BigQuery rather than the GA4 API to avoid hitting sampling on heavy dashboards.
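As a rough illustration of the BigQuery route, the sketch below builds an unsampled event-count query over the daily export tables. It relies on the standard GA4 export layout (date-sharded `events_*` tables with `event_name` and `user_pseudo_id` columns). The project and dataset names are placeholders, and the helper function is hypothetical; it only builds the SQL string and does not run it.

```python
from datetime import date


def build_event_count_query(table_wildcard: str, start: date, end: date) -> str:
    """Build a query counting events and distinct users per event name.

    table_wildcard is the fully qualified wildcard table,
    for example "my-project.analytics_123456789.events_*" (placeholder names).
    """
    if end < start:
        raise ValueError("end date must not be before start date")
    return (
        "SELECT event_name, COUNT(*) AS event_count, "
        "COUNT(DISTINCT user_pseudo_id) AS users\n"
        f"FROM `{table_wildcard}`\n"
        # Daily export tables are suffixed with the date as YYYYMMDD.
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'\n"
        "GROUP BY event_name\n"
        "ORDER BY event_count DESC"
    )


query = build_event_count_query(
    "my-project.analytics_123456789.events_*",  # placeholder project and dataset
    date(2024, 1, 1),
    date(2024, 1, 31),
)
print(query)
```

The resulting string can be pasted into the BigQuery console or used as a custom query in a reporting tool. Because it scans every exported event in the date range, the counts are not sampled, although they may still differ from the GA4 UI for the reasons covered in the parity guide.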
How to identify thresholding
Create the same report twice: once with demographic dimensions and once without. If the totals differ, thresholding is removing some rows from the demographic version. The difference between the two totals represents the hidden data.
Alternatively, change the Reporting Identity to Device-based in Admin > Data Settings > Reporting Identity. Device-based reporting does not use Google Signals and therefore does not apply the same thresholding. Compare the results. If Device-based shows higher totals in segmented reports, Signals thresholding was hiding rows under the Blended identity.
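The first comparison, segmented versus unsegmented totals, can be sketched as a small helper. It assumes you have copied or exported the unsegmented total and the visible rows of the demographic report by hand. The function name and the example numbers are hypothetical.

```python
def thresholding_gap(unsegmented_total: int, segmented_rows: dict[str, int]) -> dict:
    """Compare the sum of visible segmented rows against the unsegmented total.

    A positive "hidden" value suggests rows were suppressed. A negative value
    usually means the metric is not additive across the chosen dimension
    (for example, one user counted in several rows), so the check does not apply.
    """
    visible = sum(segmented_rows.values())
    hidden = unsegmented_total - visible
    hidden_share = hidden / unsegmented_total if unsegmented_total else 0.0
    return {"visible": visible, "hidden": hidden, "hidden_share": hidden_share}


# Hypothetical numbers: 1,000 users overall, but the age rows only add up to 750.
gap = thresholding_gap(1000, {"18-24": 300, "25-34": 450})
print(gap)
```

In this made-up example, a quarter of the users do not appear in any demographic row, which is the kind of gap worth investigating before drawing conclusions from the segmented report.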
Standard reports vs explorations
Standard reports in GA4 are rarely sampled because they use pre-aggregated data tables. However, they can be subject to thresholding. Explorations can be sampled (depending on data volume and query complexity) and can also be subject to thresholding (if demographic dimensions are used).
The GA4 UI does not always make the distinction clear. A report that appears to show exact numbers may be thresholded without a warning. Always validate critical metrics in multiple report configurations before treating them as definitive.
The (other) row
A third data limitation mechanism (distinct from sampling and thresholding) is the (other) row that appears in standard reports when a dimension has too many unique values. GA4 groups overflow values into (other). This is a cardinality management feature, not a sampling or privacy mechanism. High-cardinality dimensions like specific page URLs or product SKUs are most affected. The fix is to reduce cardinality in the dimension itself (through better parameter design) or use BigQuery where the (other) aggregation does not apply.
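One way to spot dimensions at risk of the (other) row is to count unique values per day in exported data. The sketch below does this for an iterable of (day, value) pairs. The limit is passed in explicitly because the exact point at which GA4 starts grouping values depends on the property and report, so the value used here is an assumption for illustration only.

```python
from collections import defaultdict
from typing import Iterable


def days_over_cardinality_limit(rows: Iterable[tuple[str, str]], limit: int) -> dict[str, int]:
    """Return {day: unique_value_count} for days where a dimension exceeds the limit.

    rows is an iterable of (day, dimension_value) pairs, e.g. from an export.
    limit is an assumed cardinality ceiling, not an official GA4 constant.
    """
    values_by_day: dict[str, set[str]] = defaultdict(set)
    for day, value in rows:
        values_by_day[day].add(value)
    return {
        day: len(values)
        for day, values in values_by_day.items()
        if len(values) > limit
    }


# Hypothetical page paths: the first day has three unique values, the second has one.
sample_rows = [
    ("2024-01-01", "/product?id=1"),
    ("2024-01-01", "/product?id=2"),
    ("2024-01-01", "/product?id=3"),
    ("2024-01-01", "/product?id=1"),
    ("2024-01-02", "/home"),
]
flagged = days_over_cardinality_limit(sample_rows, limit=2)
print(flagged)
```

Days that show up in the result are candidates for parameter redesign, for example stripping query strings from page paths before they are sent to GA4.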
Sampling and thresholds: validate, fix, and watch
Validate
- Check Exploration reports for the yellow sampling indicator badge in the top-right corner
- Check if Google Signals is enabled under Admin > Data Collection and Modification > Google Signals
- Compare demographic segmented totals against non-demographic totals to detect threshold suppression
- Switch Reporting Identity to Device-based and compare segmented report totals
Fix
- Reduce date range in Explorations to lower event volume and reduce sampling
- Use BigQuery export for large queries that consistently sample in Explorations
- Disable Google Signals if demographic thresholds are suppressing rows that hold significant data
- Set Reporting Identity to Device-based to eliminate Signals-based thresholding
Watch for
- Segmented report totals that don't add up to the unsegmented total, a sign of silent thresholding at work
- Heavily sampled Explorations where the query should be rebuilt or validated against a more reliable source
- Standard reports showing demographic data that doesn't match Exploration totals
Sampling and thresholding checklist
- Exploration reports are checked for the yellow sampling warning badge
- Date ranges in Explorations are as narrow as needed for the analysis
- Standard reports with demographic dimensions are cross-checked against non-demographic versions to detect thresholding
- BigQuery is used for high-precision analysis when sampling rates in Explorations are below 50 percent
- Stakeholders are informed when a report is based on sampled data and the approximate margin of error