Head and tail sampling — how to reduce telemetry volume without losing visibility, configure samplers via SDK or environment variables, and avoid the silent always_off gotcha.
A span is sampled when it’s processed and exported. Sampling is how you reduce telemetry volume, and the cost that goes with it, without losing the visibility you need to debug production issues.
The principle is representativeness: a smaller, well-chosen group can accurately represent a larger one. Sampling 20% of traces doesn’t mean you lose sight of 80% of your failure modes; it means you see a statistically representative sample of your application’s behavior, in which recurring problems still show up at roughly their real rate.
There are two approaches: head sampling and tail sampling. They differ in when the sampling decision is made.
In head sampling, the decision to sample is made at the start of the trace, before any spans complete.
Pros:
Efficient: no buffering required.
Scales easily across distributed systems.
Reduces traffic early in the pipeline.
Easy to configure: built into the OTel SDK.
Cons:
Can’t filter by latency, error status, or any post-hoc attribute.
May drop important traces (the slow ones, the failing ones).
Less precise control.
Head sampling is the default in the OTel SDK. The most common samplers:
ALWAYS_ON — sample every trace.
ALWAYS_OFF — sample nothing.
TraceIdRatioBased(p) — sample p fraction of traces, deterministically based on trace ID.
ParentBased(...) — defer to the parent’s sampling decision when there is one; otherwise fall through to the wrapped sampler.
ParentBased is what keeps distributed traces consistent — if the root service decides to sample a trace, downstream services see that decision in the trace context and respect it, so the whole trace lives or dies together.
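The decision logic described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the OTel SDK's code: the function names `ratio_decision` and `parent_based_decision` are hypothetical, and the threshold rule (compare the low 64 bits of the trace ID against `ratio * 2**64`) is an assumption chosen to show why the decision is deterministic for a given trace ID.

```python
from typing import Optional

MAX_64 = (1 << 64) - 1  # mask for the low 64 bits of a 128-bit trace ID


def ratio_decision(trace_id: int, ratio: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below ratio * 2**64.

    The same trace ID always yields the same answer, so the decision is
    deterministic rather than a fresh coin flip per span.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0.0, 1.0]")
    bound = round(ratio * (1 << 64))
    return (trace_id & MAX_64) < bound


def parent_based_decision(
    trace_id: int, parent_sampled: Optional[bool], ratio: float
) -> bool:
    """Follow the parent's decision when there is one; otherwise fall
    through to the wrapped ratio-based decision (root spans only)."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_decision(trace_id, ratio)
```

A root span has no parent (`parent_sampled=None`), so it gets the ratio-based decision; every downstream span passes the flag it received in the trace context and inherits that decision unchanged.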
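OTEL_TRACES_SAMPLER_ARG: a float in [0.0, 1.0], required by the ratio-based samplers. OTEL_TRACES_SAMPLER selects the sampler and takes one of the values below.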
The static samplers:
always_on — sample every trace.
always_off — sample no traces.
The ratio-based samplers:
traceidratio: sample a fraction of traces. Each service decides on its own from the trace ID and ignores the parent’s decision, which can break distributed traces when services are configured with different ratios (downstream services may drop spans the root service kept).
parentbased_traceidratio: sample a fraction at the root, then respect the parent’s decision everywhere else. Keeps or drops the entire trace consistently. Use this for distributed systems.
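To make the difference concrete, here is a small simulation of two services, a root and one downstream. It is only a sketch under stated assumptions: the ratios (0.5 at the root, 0.1 downstream) are made up, `ratio_keep` is a simplified threshold rule standing in for a real sampler, and `count_broken` is a hypothetical helper that counts traces where the two services disagree.

```python
import random

MAX_64 = (1 << 64) - 1


def ratio_keep(trace_id: int, ratio: float) -> bool:
    # Simplified stand-in for a ratio-based sampler: threshold on the trace ID.
    return (trace_id & MAX_64) < round(ratio * (1 << 64))


def count_broken(trace_ids, root_ratio, child_ratio, parent_based):
    """Count traces where the root and the downstream service disagree."""
    broken = 0
    for tid in trace_ids:
        root_kept = ratio_keep(tid, root_ratio)
        if parent_based:
            child_kept = root_kept  # downstream respects the parent's decision
        else:
            child_kept = ratio_keep(tid, child_ratio)  # downstream decides alone
        if root_kept != child_kept:
            broken += 1
    return broken


rng = random.Random(42)  # fixed seed so the run is repeatable
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]

broken_plain = count_broken(trace_ids, 0.5, 0.1, parent_based=False)
broken_parent = count_broken(trace_ids, 0.5, 0.1, parent_based=True)
print(f"independent decisions: {broken_plain} broken traces")
print(f"parent-based decisions: {broken_parent} broken traces")
```

With independent decisions and mismatched ratios, a large share of the traces the root kept are missing their downstream spans; with parent-based decisions the count is zero by construction.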
If your traces aren’t showing up, check OTEL_TRACES_SAMPLER first.
A real-world failure mode: pre-set environment variables in shared environments (Kubernetes pods, base Docker images, CI runners) sometimes contain OTEL_TRACES_SAMPLER=always_off. This silently disables tracing: no error, no warning, no spans exported. Application code looks correct. Logs look fine. Nothing reaches Arize AX.
When “no traces are showing up,” check the environment variables before anything else.
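One way to catch this early is a startup check that inspects the environment before tracing is initialized. The sketch below uses only the standard library; `sampler_warning` is a hypothetical helper, not part of any SDK, and the exact set of conditions it flags is an assumption you may want to adjust.

```python
import os
from typing import Mapping, Optional

OFF_SAMPLERS = ("always_off", "parentbased_always_off")
RATIO_SAMPLERS = ("traceidratio", "parentbased_traceidratio")


def sampler_warning(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a warning if the environment would silently drop traces,
    or None if nothing suspicious is found."""
    sampler = env.get("OTEL_TRACES_SAMPLER", "").strip().lower()
    arg = env.get("OTEL_TRACES_SAMPLER_ARG", "").strip()

    if sampler in OFF_SAMPLERS:
        return f"OTEL_TRACES_SAMPLER={sampler}: no root traces will be sampled"

    if sampler in RATIO_SAMPLERS and arg:
        try:
            ratio = float(arg)
        except ValueError:
            return f"OTEL_TRACES_SAMPLER_ARG={arg!r} is not a float"
        if not 0.0 <= ratio <= 1.0:
            return f"OTEL_TRACES_SAMPLER_ARG={arg} is outside [0.0, 1.0]"
        if ratio == 0.0:
            return "OTEL_TRACES_SAMPLER_ARG=0 samples no root traces"

    return None
```

Calling `sampler_warning()` at process start and logging any non-None result turns a silent misconfiguration into a visible one.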
Most teams should start with one of these configurations:
Environment: recommended setting
Development: ALWAYS_ON, to see every trace while you’re building.
Staging: ALWAYS_ON, for full visibility during validation.
Production, low/medium traffic: ALWAYS_ON. Arize AX handles the volume; you keep full fidelity.
Production, high traffic: ParentBased(TraceIdRatioBased(0.1)). Head sample 10% and keep distributed traces consistent. Combine with tail sampling at the Collector for must-keep traces (errors, slow requests).
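If you set samplers through environment variables, the recommendations above could be captured in a small helper like the one below. This is an illustrative sketch: `recommended_sampler_env`, its parameters, and the environment names it accepts are all hypothetical, and it only builds the variable values rather than configuring any SDK.

```python
from typing import Dict


def recommended_sampler_env(
    environment: str, high_traffic: bool = False, ratio: float = 0.1
) -> Dict[str, str]:
    """Map a deployment environment to sampler environment variables."""
    if environment not in ("development", "staging", "production"):
        raise ValueError(f"unknown environment: {environment!r}")

    if environment == "production" and high_traffic:
        if not 0.0 < ratio <= 1.0:
            raise ValueError("ratio must be in (0.0, 1.0]")
        # Head sample a fraction at the root and follow the parent elsewhere.
        return {
            "OTEL_TRACES_SAMPLER": "parentbased_traceidratio",
            "OTEL_TRACES_SAMPLER_ARG": str(ratio),
        }

    # Development, staging, and low/medium-traffic production keep everything.
    return {"OTEL_TRACES_SAMPLER": "always_on"}
```

The returned dictionary can be merged into a deployment manifest or exported before the process starts; tail sampling for must-keep traces is still configured separately at the Collector.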
The last topic in this section is the OpenTelemetry Collector, which is useful for tail sampling, centralized policy, multi-backend routing, and a few other production patterns.