# Sampling

> Control which requests get traced in high-volume applications.

## Why Sample

HoneyHive traces asynchronously and batches span exports, so overhead is minimal. At very high volumes, however, you may want to selectively trace to control cost and storage.

The goal is to capture every trace that matters while reducing noise from routine traffic.

***

## Strategies

### Always Trace Errors

Failed requests are the most valuable traces. Always capture them regardless of sampling rate.

```python theme={null}
def should_trace(request):
    if request.get("error"):
        return True
    if request.get("status_code", 200) >= 400:
        return True
    return False
```

### Always Trace High-Priority Requests

Always trace requests from premium users or flagged accounts, and requests to specific endpoints.

```python theme={null}
def should_trace(request):
    if request.get("customer_tier") == "premium":
        return True
    if request.get("endpoint") in ["/api/checkout", "/api/agent"]:
        return True
    return False
```

### Percentage Sampling

For regular traffic, sample a fixed percentage:

```python theme={null}
import random

SAMPLE_RATE = 0.01  # 1% of traffic

def should_trace(request):
    return random.random() < SAMPLE_RATE
```
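As a quick sanity check on the rate, the sketch below runs the same comparison against a seeded generator and counts how many of 100,000 simulated requests would be traced. The injectable `rng` parameter and the `sampled_fraction` helper are illustrative additions for this example, not part of any HoneyHive API.

```python
import random

SAMPLE_RATE = 0.01  # 1% of traffic

def should_trace(request, rng=random):
    # rng is injectable (an assumption of this sketch) so results are reproducible
    return rng.random() < SAMPLE_RATE

def sampled_fraction(n_requests, seed=0):
    # Simulate n_requests requests and return the fraction that would be traced
    rng = random.Random(seed)
    traced = sum(should_trace({}, rng) for _ in range(n_requests))
    return traced / n_requests

fraction = sampled_fraction(100_000)
print(f"traced {fraction:.2%} of simulated requests")
```

The observed fraction should land close to `SAMPLE_RATE`, with some run-to-run variation that shrinks as the request count grows.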

### Combined

In practice, combine these strategies:

```python theme={null}
import random

def should_trace(request):
    # Always trace errors
    if request.get("error") or request.get("status_code", 200) >= 400:
        return True
    # Always trace premium users
    if request.get("customer_tier") == "premium":
        return True
    # Sample 1% of regular traffic
    return random.random() < 0.01
```
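To estimate how much volume a combined policy keeps, you can work out the expected traced fraction from your traffic mix. The sketch below assumes that errors, premium traffic, and the random sample are independent of each other; the 2% error rate and 5% premium share are made-up numbers for illustration.

```python
def expected_trace_fraction(error_rate, premium_share, sample_rate):
    # A request goes untraced only if it is not an error, not premium,
    # and not picked by the random sample (independence is assumed).
    not_traced = (1 - error_rate) * (1 - premium_share) * (1 - sample_rate)
    return 1 - not_traced

# Hypothetical traffic mix: 2% errors, 5% premium users, 1% base sample rate
fraction = expected_trace_fraction(error_rate=0.02, premium_share=0.05, sample_rate=0.01)
print(f"expected traced fraction: {fraction:.4f}")
```

With this mix, roughly 7.8% of requests are kept even though the base rate is 1%, so the always-trace rules, not the sample rate, dominate trace volume.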

***

## Applying Sampling

Initialize the tracer once at startup, then use your sampling function to choose between the traced and untraced code path for each request:

```python theme={null}
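from flask import Flask, request  # assumes a Flask app, as the route decorator suggests
from honeyhive import HoneyHiveTracer, trace

app = Flask(__name__)

tracer = HoneyHiveTracer.init(
    project="my-app",
    source="production"
)

# Decorate once at import time instead of re-creating the wrapper per request.
# process() and should_trace() are your own functions, defined elsewhere.
@trace(tracer=tracer)
def traced_process(data):
    return process(data)

@app.route("/api/process", methods=["POST"])
def handle_request():
    # should_trace() expects a dict, so pass the parsed JSON body
    # rather than the Flask request object.
    data = request.get_json(silent=True) or {}
    if not should_trace(data):
        return process(data)
    return traced_process(data)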
```
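If several handlers need the same branching, the choice between the traced and untraced path can be factored into a small helper. This is a framework-agnostic sketch: `with_sampling` is a hypothetical name, and `traced_process` here is a plain stand-in for a function you would wrap with `@trace` in a real application.

```python
import functools

def with_sampling(should_trace, traced_fn, untraced_fn):
    """Return a callable that routes each call to the traced or untraced path."""
    @functools.wraps(untraced_fn)
    def wrapper(request_info, *args, **kwargs):
        # request_info is the dict that the sampling function inspects
        target = traced_fn if should_trace(request_info) else untraced_fn
        return target(*args, **kwargs)
    return wrapper

# Stand-ins for demonstration only; they record which path was taken.
calls = []

def process(data):
    calls.append("untraced")
    return data

def traced_process(data):
    calls.append("traced")
    return data

handle = with_sampling(
    lambda info: info.get("status_code", 200) >= 400,
    traced_process,
    process,
)
handle({"status_code": 500}, {"payload": 1})  # routed to the traced path
handle({"status_code": 200}, {"payload": 2})  # routed to the untraced path
print(calls)
```

Keeping the sampling decision in one place makes it easier to change the policy later without touching each handler.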

***

## Payload Size

Independent of sampling, avoid tracing large objects directly. Trace metadata about the data instead:

```python theme={null}
from honeyhive import trace, enrich_span

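@trace()
def process(large_data: dict):
    # Record compact metadata about the payload, not the payload itself
    enrich_span({
        # Approximate size: character count of the string representation
        "data.size_mb": len(str(large_data)) / 1024 / 1024,
        "data.keys_count": len(large_data)
    })
    # Don't do this:
    # enrich_span({"data.full_content": large_data})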
```
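One way to keep span attributes small is to build the metadata in a separate helper and pass only its result to `enrich_span`. The sketch below is a minimal example under assumptions: `summarize_payload` and the attribute names `data.size_bytes` and `data.sample_keys` are illustrative choices, not HoneyHive conventions.

```python
import json

def summarize_payload(data: dict, max_keys: int = 5) -> dict:
    """Build small, flat metadata describing a payload instead of the payload itself."""
    # Serialized size in bytes; default=str keeps non-JSON values from raising
    size_bytes = len(json.dumps(data, default=str).encode("utf-8"))
    return {
        "data.size_bytes": size_bytes,
        "data.keys_count": len(data),
        # Only the first few key names, so the attribute stays bounded
        "data.sample_keys": ",".join(sorted(map(str, data))[:max_keys]),
    }

summary = summarize_payload({"id": 7, "body": "x" * 1000, "tags": ["a", "b"]})
print(summary)
```

Note that serializing the payload to measure it still costs time and memory proportional to its size, so for very large objects a cheaper estimate may be preferable.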

***

## Related

<CardGroup cols={2}>
  <Card title="Production Deployment" icon="rocket" href="/v2/tutorials/production-deployment">
    Initialization patterns for production environments
  </Card>

  <Card title="Tracing Introduction" icon="book" href="/v2/tracing/introduction">
    Data model and architecture overview
  </Card>

  <Card title="Span Filtering" icon="ban" href="/v2/tracing/filtering">
    Drop noisy framework spans using prefix-based rules
  </Card>
</CardGroup>
