evaluate(), Lambda, and a long-running web server because session state is handled differently in each pattern.
Which Pattern?
| Runtime | Where to initialize | Session strategy | Why |
|---|---|---|---|
| Scripts / notebooks | Module-level in the entry point | One shared session is often enough | Simple single execution flow |
| AWS Lambda / Cloud Functions | Outside the handler with lazy init | create_session() per invocation | Reuse warm containers without sharing invocation state |
| FastAPI / Flask / Django | Once at app startup | create_session() or acreate_session() per request | Reuse one tracer while isolating concurrent requests |
Initialize the tracer before any instrumentor. Call HoneyHiveTracer.init(...) first, then pass tracer.provider into instrumentor.instrument(...).
What changes by runtime is not whether you initialize the tracer, but where you place that initialization and how you create session context.
For request-scoped and invocation-scoped runtimes, create_session() and acreate_session() put the active session ID in OpenTelemetry baggage. HoneyHive resolves that baggage session before falling back to the tracer instance’s startup session.
Scripts and Notebooks
Initialize once at module level. All traced operations share the same session.
Evaluation / Experiments
When running experiments with evaluate(), don’t create your own tracer. The SDK creates a new tracer per datapoint automatically, giving each datapoint its own isolated session.
Serverless
In serverless environments like Lambda and Cloud Functions, initialize the tracer outside the handler and reuse it across warm starts. Then call create_session() inside the handler so each invocation gets its own active session. The invocation-scoped baggage session takes precedence over any default session on the shared tracer.
Batched export and serverless: By default, the SDK batches spans before exporting. In serverless environments where the runtime freezes between invocations, we recommend setting disable_batch=True so spans are exported immediately rather than queued. You can also set this via the HH_DISABLE_BATCH=true environment variable. Alternatively, you can keep the default batched mode and call tracer.force_flush() before returning to drain the queue, but disable_batch=True is simpler since it removes the dependency on remembering to flush.
Linking Lambda Invocations
To link multiple invocations into the same session (e.g., multi-turn conversations), pass a session_id through your event payload and reuse get_tracer() from the lazy-init pattern above:
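A minimal sketch of that flow, under stated assumptions: StubTracer is a hypothetical stand-in rather than the SDK class, and the event key "session_id" and the response shape are illustrative choices. The keyword arguments mirror the create_session() calls listed in this guide, but the stub's behavior is simplified.

```python
import uuid
from typing import Optional


class StubTracer:
    """Hypothetical stand-in for the SDK tracer (not the real API)."""

    def __init__(self) -> None:
        self.api_calls = 0  # how many sessions were created remotely

    def create_session(
        self,
        session_name: Optional[str] = None,
        session_id: Optional[str] = None,
        skip_api_call: bool = False,
    ) -> str:
        if not skip_api_call:
            self.api_calls += 1  # stands in for the network round trip
        return session_id or str(uuid.uuid4())


_tracer: Optional[StubTracer] = None


def get_tracer() -> StubTracer:
    """Lazy-init: build the tracer once, reuse it on warm starts."""
    global _tracer
    if _tracer is None:
        _tracer = StubTracer()
    return _tracer


def handler(event: dict, context: object = None) -> dict:
    tracer = get_tracer()
    existing = event.get("session_id")
    if existing:
        # Later turn: attach to the session the caller passed in.
        session_id = tracer.create_session(session_id=existing, skip_api_call=True)
    else:
        # First turn: start a new session and hand the ID back.
        session_id = tracer.create_session(session_name="conversation")
    return {"statusCode": 200, "session_id": session_id}
```

The caller is responsible for storing the returned session_id and sending it back on the next invocation.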
Web Servers
For long-running servers (FastAPI, Flask, Django), initialize one tracer at startup and create a new session per request using create_session() or its async variant acreate_session().
How session isolation works:
create_session() and acreate_session() store the active session ID in OpenTelemetry baggage, which uses Python context propagation and ContextVar for async/task-local state. HoneyHive reads the baggage session first and only falls back to the tracer instance when no request-scoped session is present, so one shared tracer can safely serve concurrent requests.
FastAPI
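The per-request isolation does not depend on the web framework itself. The sketch below uses only asyncio and contextvars to show the mechanism; it is not the SDK's code. The names _request_session, STARTUP_SESSION, and resolve_session are hypothetical. In a FastAPI app, the step that sets the session would typically live in middleware that calls acreate_session().

```python
import asyncio
from contextvars import ContextVar
from typing import List, Optional

# Stand-in for the baggage entry that holds the request-scoped session ID.
_request_session: ContextVar[Optional[str]] = ContextVar(
    "request_session", default=None
)

STARTUP_SESSION = "startup-session"  # stand-in for the tracer's own session


def resolve_session() -> str:
    """Prefer the request-scoped session, else fall back to the startup one."""
    return _request_session.get() or STARTUP_SESSION


async def handle_request(request_id: str) -> str:
    # Each asyncio task runs in its own copy of the context, so this set()
    # is invisible to other in-flight requests.
    _request_session.set(f"session-{request_id}")
    await asyncio.sleep(0.01)  # yield so the requests interleave
    return resolve_session()


async def main() -> List[str]:
    # gather() wraps each coroutine in a task with a copied context.
    return await asyncio.gather(*(handle_request(str(i)) for i in range(3)))


results = asyncio.run(main())
```

Each simulated request sees only its own session, and code running outside any request still resolves to the startup session.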
Flask
For synchronous frameworks, use create_session() instead of acreate_session():
Multi-Turn Conversations
For multi-turn conversations, the first request creates a session and returns the ID to the client. Subsequent requests link to that session using skip_api_call=True, which sets the session context without making an API call.
| Scenario | Code | When |
|---|---|---|
| Auto-generate ID | create_session(session_name="request") | New session, let HoneyHive assign the ID |
| Custom ID | create_session(session_id="my-id") | Use your own ID scheme |
| Link to existing | create_session(session_id="existing-id", skip_api_call=True) | Session already exists in HoneyHive |
Scoped Sessions
For single-use scripts, dedicated worker runs, or batch tasks where the rest of the current execution context belongs to the same logical unit of work, with_session can be convenient. For web requests, prefer create_session() or acreate_session() in middleware.
Thread and Process Safety
The global tracer + create_session() pattern is safe for:
- Multi-threaded servers (FastAPI, Flask with threads): baggage uses ContextVar, which is inherently thread-local
- Multi-process deployments (Gunicorn workers, uWSGI): each process gets its own tracer instance; processes don’t share state
Span Export Modes
By default, the SDK exports spans asynchronously in batches using a background thread. This means span.end() returns immediately and spans are sent in the background, so export latency never blocks your application.
| Mode | Setting | How it works | Best for |
|---|---|---|---|
| Batched async (default) | disable_batch=False | Spans queue in memory and flush in a background thread (batch size: 100, flush interval: 5s) | Web servers, long-running services |
| Immediate sync | disable_batch=True | Each span is exported inline when it ends, so span.end() blocks until the HTTP request completes | AWS Lambda, Cloud Functions, short-lived scripts |
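To make the difference between the two modes concrete, here is a simplified sketch of each. Both classes are hypothetical illustrations, not the SDK's exporters: the batch-size trigger is omitted, and appending to a list stands in for the HTTP request.

```python
import queue
import threading
from typing import List


class ImmediateExporter:
    """Exports each span inline, like the immediate sync mode."""

    def __init__(self) -> None:
        self.exported: List[str] = []

    def on_end(self, span: str) -> None:
        self.exported.append(span)  # stands in for a blocking HTTP request

    def force_flush(self) -> None:
        pass  # nothing is ever queued


class BatchedExporter:
    """Queues spans and exports them from a background thread."""

    def __init__(self, flush_interval: float = 5.0) -> None:
        self.exported: List[str] = []
        self._queue: "queue.Queue[str]" = queue.Queue()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = flush_interval
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def on_end(self, span: str) -> None:
        self._queue.put(span)  # returns immediately

    def _drain(self) -> None:
        with self._lock:
            while True:
                try:
                    self.exported.append(self._queue.get_nowait())
                except queue.Empty:
                    return

    def _run(self) -> None:
        # Wake up every flush interval until shutdown is requested.
        while not self._stop.wait(self._interval):
            self._drain()

    def force_flush(self) -> None:
        self._drain()

    def shutdown(self) -> None:
        self._stop.set()
        self._worker.join()
        self._drain()


immediate = ImmediateExporter()
immediate.on_end("span-1")

batched = BatchedExporter(flush_interval=5.0)
batched.on_end("span-1")
pending_before_flush = list(batched.exported)  # still queued at this point
batched.force_flush()
batched.shutdown()
```

If the process were frozen right after on_end() in the batched case, the span would still be sitting in the queue, which is the failure mode the serverless guidance is meant to avoid.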
Batched Async Export (Default)
Immediate Sync Export
Use disable_batch=True when the runtime may freeze or terminate immediately after the handler returns, such as AWS Lambda, Google Cloud Functions, or one-off CLI scripts. In these environments, a background thread may not get a chance to flush before the process is frozen.
With disable_batch=True, each span exports synchronously when it ends, so force_flush() is effectively a no-op for spans that have already completed. If your handler spawns child threads or async tasks, make sure all work finishes (and spans end) before the handler returns; otherwise those spans may be lost when the runtime freezes.
Flushing
Both modes support explicit flushing. Call flush() / force_flush():
- At the end of a Lambda handler, before returning the response
- At the end of a Jupyter notebook cell
- Before process exit in scripts
- In atexit handlers or signal handlers for graceful shutdown
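Two of the flush points listed above can be sketched together: flushing at the end of a handler and flushing at interpreter exit. StubTracer is a hypothetical stand-in with a simplified force_flush(); with the real SDK you would register the tracer's own flush method instead.

```python
import atexit
from typing import List


class StubTracer:
    """Hypothetical stand-in for the SDK tracer (not the real API)."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.exported: List[str] = []

    def record(self, span: str) -> None:
        self.pending.append(span)  # queued, not yet exported

    def force_flush(self) -> None:
        self.exported.extend(self.pending)
        self.pending.clear()


tracer = StubTracer()

# Drain anything still queued when the interpreter exits normally.
atexit.register(tracer.force_flush)


def handler(event: dict) -> dict:
    try:
        tracer.record(f"span-for-{event['id']}")
        return {"statusCode": 200}
    finally:
        # Runs on success and on exceptions, before the runtime can freeze.
        tracer.force_flush()
```

Note that atexit callbacks do not run when the process is killed by a signal Python does not handle, so a graceful-shutdown path for SIGTERM needs its own signal handler.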
Best Practices
Pass an explicit tracer to @trace
Passing tracer=tracer makes the binding explicit and avoids relying on implicit tracer discovery.
Create sessions per logical unit of work
Even with a global tracer, create sessions to isolate traces by request, user, or job.
Match session creation to the runtime
Use the tracer placement that matches your runtime:
- Scripts and notebooks: initialize once in the module that starts the run
- Lambda and other serverless runtimes: lazy-init outside the handler, then create a session per invocation
- Web servers: initialize once at startup, then create a session per request
- evaluate(): let the SDK create and manage tracers for you
Use test_mode for local development
test_mode=True (or the HH_TEST_MODE=true environment variable) disables OTLP export and generates a local session ID instead of creating one in HoneyHive. Use it for local development and tests when you want tracer setup without exporting spans over OTLP.
Related
Production Deployment
Error handling, environment config, and deployment checklist
Multi-Instance Tracing
Run multiple tracer instances for multi-tenant or A/B testing
Distributed Tracing
Propagate trace context across service boundaries
Experiments
Run evaluations with automatic per-datapoint tracing
Span Filtering
Drop noisy framework spans using prefix-based rules
Environment Variables
Full reference for SDK configuration via environment variables

