# How to Initialize the HoneyHive Tracer

> Learn where to initialize HoneyHiveTracer for scripts, evaluate(), Lambda, and web servers, and how to create per-request session context.

Initialize `HoneyHiveTracer` once per process in the right place for your runtime so every LLM call and custom span lands in the correct session. Placement differs for scripts, `evaluate()`, Lambda, and web servers because session state is handled differently in each pattern. If you are new to tracing, complete the [tracing quickstart](/v2/introduction/tracing-quickstart) first; for cross-service setups, pair this with [distributed tracing](/v2/tracing/distributed-tracing).

## Which initialization pattern should you use?

| Runtime                          | Where to initialize                | Session strategy                                      | Why                                                    |
| -------------------------------- | ---------------------------------- | ----------------------------------------------------- | ------------------------------------------------------ |
| **Scripts / notebooks**          | Module-level in the entry point    | One shared session is often enough                    | Simple single execution flow                           |
| **AWS Lambda / Cloud Functions** | Outside the handler with lazy init | `create_session()` per invocation                     | Reuse warm containers without sharing invocation state |
| **FastAPI / Flask / Django**     | Once at app startup                | `create_session()` or `acreate_session()` per request | Reuse one tracer while isolating concurrent requests   |

<Note>
  **Initialize the tracer before any instrumentor.** Call `HoneyHiveTracer.init(...)` first, then pass `tracer.provider` into `instrumentor.instrument(...)`.

  What changes by runtime is not whether you initialize the tracer. What changes is where you place that initialization and how you create session context.

  For request-scoped and invocation-scoped runtimes, `create_session()` and `acreate_session()` put the active session ID in OpenTelemetry baggage. HoneyHive resolves that baggage session before falling back to the tracer instance's startup session.
</Note>
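The resolution order described in the note can be pictured with a small standalone model. This is a sketch built on Python's `contextvars`, not the SDK's implementation; `ToyTracer`, `resolve_session_id`, and `_request_session` are hypothetical names used only for illustration.

```python
from contextvars import ContextVar
from typing import Optional

# Stand-in for the baggage entry that holds the request-scoped session ID.
_request_session: ContextVar[Optional[str]] = ContextVar("request_session", default=None)


class ToyTracer:
    """Hypothetical model of a shared tracer; not the HoneyHive class."""

    def __init__(self, startup_session_id: str) -> None:
        self.startup_session_id = startup_session_id

    def create_session(self, session_id: str) -> str:
        # Store the ID in context-local state instead of on the instance.
        _request_session.set(session_id)
        return session_id

    def resolve_session_id(self) -> str:
        # The context-local session wins; the startup session is the fallback.
        return _request_session.get() or self.startup_session_id


tracer = ToyTracer(startup_session_id="startup-session")
before = tracer.resolve_session_id()   # no request-scoped session yet
tracer.create_session("request-42")
after = tracer.resolve_session_id()    # request-scoped session now takes precedence
print(before, after)
```

The point of the model is the lookup order: the shared instance never changes after startup, so concurrent callers cannot overwrite each other's session.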

<Tip>
  `evaluate()` is the main exception to the table above. When you're running experiments with `evaluate()`, do not initialize your own tracer. The SDK creates and manages a separate tracer for each datapoint.
</Tip>

## How do you initialize the tracer in scripts and notebooks?

Initialize once at module level. All traced operations share the same session.

```python theme={null}
from honeyhive import HoneyHiveTracer, trace
import os

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    session_name="local-dev-session"
)

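def transform(text):
    # Placeholder for your own processing logic
    return text.upper()

@trace(event_type="tool", tracer=tracer)
def process_data(input_text):
    result = transform(input_text)
    # Attach metadata to the span created by @trace
    tracer.enrich_span(metadata={"input_length": len(input_text)})
    return result

if __name__ == "__main__":
    result1 = process_data("Hello")
    result2 = process_data("World")
    tracer.flush()  # Drain queued spans before the script exits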
```

This is the simplest pattern. Use it for scripts, notebooks, and quick debugging.

***

## Should you initialize a tracer with evaluate()?

When running experiments with `evaluate()`, **don't** create your own tracer. The SDK creates a new tracer per datapoint automatically, giving each datapoint its own isolated session.

```python theme={null}
from honeyhive import trace
from honeyhive.experiments import evaluate
import os

# No HoneyHiveTracer.init() here

@trace(event_type="tool")  # No tracer parameter
def my_rag_pipeline(datapoint: dict):
    inputs = datapoint["inputs"]
    response = generate_response(inputs["query"], inputs["context"])
    return {"answer": response}

result = evaluate(
    function=my_rag_pipeline,
    dataset=my_dataset,
    api_key=os.getenv("HH_API_KEY"),
    name="rag-experiment-1"
)
```

<Warning>
  **Don't initialize a global tracer alongside `evaluate()`.**

  A global tracer can conflict with the per-datapoint tracers that `evaluate()` creates. If you see traces landing in the wrong session, remove the global `HoneyHiveTracer.init()` call.

  ```python theme={null}
  # Wrong -- global tracer conflicts with evaluate()
  tracer = HoneyHiveTracer.init(...)

  @trace(event_type="tool", tracer=tracer)  # Forces all datapoints to share one session
  def my_function(datapoint):
      pass

  # Correct -- let evaluate() manage tracers
  @trace(event_type="tool")  # evaluate() provides isolated tracer per datapoint
  def my_function(datapoint):
      pass
  ```
</Warning>

***

## How do you initialize the tracer in serverless?

In serverless environments like Lambda and Cloud Functions, initialize the tracer outside the handler and reuse it across warm starts. Then call `create_session()` inside the handler so each invocation gets its own active session. The invocation-scoped baggage session takes precedence over any default session on the shared tracer.

```python theme={null}
from honeyhive import HoneyHiveTracer, trace
import os
from typing import Optional

_tracer: Optional[HoneyHiveTracer] = None  # Survives warm starts

def get_tracer() -> HoneyHiveTracer:
    global _tracer
    if _tracer is None:
        _tracer = HoneyHiveTracer.init(
            api_key=os.getenv("HH_API_KEY"),
            source="lambda",
            disable_batch=True  # Recommended for serverless - export spans immediately
        )
    return _tracer

def lambda_handler(event, context):
    tracer = get_tracer()

    # Create a new session for this invocation
    session_id = tracer.create_session(
        session_name=f"lambda-{context.aws_request_id}",
        inputs={"event": event}
    )

    result = process_event(event)

    tracer.enrich_session(
        outputs={"result": result},
        metadata={"request_id": context.aws_request_id}
    )
    tracer.force_flush(timeout_millis=5000)  # No-op with disable_batch=True, but harmless safety net
    return result

@trace(event_type="tool")
def process_event(event):
    get_tracer().enrich_span(metadata={"event_type": event.get("type")})
    return {"status": "success"}
```

<Note>
  **Batched export and serverless:** By default, the SDK batches spans before exporting. In serverless environments where the runtime freezes between invocations, we recommend setting `disable_batch=True` so spans are exported immediately rather than queued. You can also set this via the `HH_DISABLE_BATCH=true` environment variable. Alternatively, you can keep the default batched mode and call `tracer.force_flush()` before returning to drain the queue, but `disable_batch=True` is simpler since it removes the dependency on remembering to flush.

  ```python theme={null}
  _tracer = HoneyHiveTracer.init(
      api_key=os.getenv("HH_API_KEY"),
      source="lambda",
      disable_batch=True  # Export spans immediately instead of batching
  )
  ```
</Note>

**LRU cache alternative** for lazy initialization:

```python theme={null}
from functools import lru_cache
import os

from honeyhive import HoneyHiveTracer

@lru_cache(maxsize=1)
def get_tracer():
    return HoneyHiveTracer.init(
        api_key=os.getenv("HH_API_KEY"),
        disable_batch=True,  # Recommended for serverless
    )
```
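To see why `lru_cache(maxsize=1)` behaves like the global-variable pattern, the standalone sketch below swaps the tracer for a hypothetical `FakeTracer` that counts how often it is constructed. Only `functools.lru_cache` and its `cache_clear()` method are real API here.

```python
from functools import lru_cache


class FakeTracer:
    """Hypothetical stand-in that counts how many times it is constructed."""

    instances = 0

    def __init__(self) -> None:
        FakeTracer.instances += 1


@lru_cache(maxsize=1)
def get_tracer() -> FakeTracer:
    # Runs on the first call only; later calls return the cached object.
    return FakeTracer()


first = get_tracer()
second = get_tracer()
print(first is second, FakeTracer.instances)

# In tests, clear the cache to force a fresh tracer on the next call.
get_tracer.cache_clear()
third = get_tracer()
print(third is first, FakeTracer.instances)
```

One caveat: `lru_cache` may call the wrapped function more than once if several threads make the first call at the same moment. That is unlikely to matter when a runtime processes one invocation at a time per container, but a lock-guarded global is safer where the first call can happen concurrently.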

### Linking Lambda Invocations

To link multiple invocations into the same session (e.g., multi-turn conversations), pass a `session_id` through your event payload and reuse `get_tracer()` from the lazy-init pattern above:

```python theme={null}
import uuid

def lambda_handler(event, context):
    tracer = get_tracer()
    existing_session_id = event.get("session_id")

    if existing_session_id:
        # Link to an existing session (no API call)
        tracer.create_session(session_id=existing_session_id, skip_api_call=True)
        session_id = existing_session_id
    else:
        # Create a new session with your own ID
        session_id = tracer.create_session(
            session_id=str(uuid.uuid4()),
            session_name=f"lambda-{context.function_name}",
            inputs={"event": event}
        )

    result = process_event(event)
    tracer.enrich_session(outputs={"result": result})
    return {"session_id": session_id, "result": result}
```

### Skipping Init-Time Session Creation

Set `skip_backend_session_creation=True` when you do not want `HoneyHiveTracer.init()` to create a backend session synchronously. This is useful when another service already created the session, or when you create request-scoped sessions later with `create_session(skip_api_call=True)`.

```python theme={null}
tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    skip_backend_session_creation=True
)
```

If you pass a valid `session_id`, the tracer attaches spans to that existing session without making a creation call during initialization:

```python theme={null}
tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    session_id="existing-session-uuid",
    skip_backend_session_creation=True
)
```

If you omit `session_id`, the tracer still skips the init-time creation call and does not generate a session ID during initialization. Set the request-scoped session later, for example with `create_session(session_id=..., skip_api_call=True)`. Spans emitted before a request-scoped session is set do not carry that session ID. Default behavior is unchanged when `skip_backend_session_creation` is not set.

<Note>
  This is different from `create_session(skip_api_call=True)`, which skips the API call for a *per-request* session. `skip_backend_session_creation` skips the API call during *tracer initialization* itself.
</Note>

***

## How do you initialize the tracer in web servers?

For long-running servers (FastAPI, Flask, Django), initialize **one** tracer at startup and create a **new session per request** using `create_session()` or its async variant `acreate_session()`.

<Note>
  **How session isolation works:** `create_session()` and `acreate_session()` store the active session ID in OpenTelemetry baggage, which uses Python context propagation and `ContextVar` for async/task-local state. HoneyHive reads the baggage session first and only falls back to the tracer instance when no request-scoped session is present, so one shared tracer can safely serve concurrent requests.
</Note>
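The isolation property comes from `contextvars` itself, which you can check without the SDK. In this sketch, `active_session` is a hypothetical stand-in for the baggage entry, and each coroutine plays the role of one request handled by a shared tracer.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

# Stand-in for the baggage entry that carries the active session ID.
active_session: ContextVar[Optional[str]] = ContextVar("active_session", default=None)


async def handle_request(session_id: str) -> tuple:
    active_session.set(session_id)   # what a per-request session call would do
    await asyncio.sleep(0)           # yield so the other requests interleave
    return session_id, active_session.get()


async def main() -> list:
    # Each task created by gather() runs in its own copy of the context.
    return await asyncio.gather(
        *(handle_request(f"session-{i}") for i in range(3))
    )


results = asyncio.run(main())
print(results)
```

Every request reads back the session it set, even though all three interleave on one event loop and share the same module-level variable.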

### FastAPI

```python theme={null}
from fastapi import FastAPI, Request
from honeyhive import HoneyHiveTracer, trace
import os

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    source="production"
)

app = FastAPI()

@app.middleware("http")
async def session_middleware(request: Request, call_next):
    session_id = await tracer.acreate_session(
        session_name=f"api-{request.url.path}",
        inputs={
            "method": request.method,
            "path": str(request.url),
            "user_id": request.headers.get("X-User-ID")
        }
    )

    response = await call_next(request)

    tracer.enrich_session(outputs={"status_code": response.status_code})

    if session_id:
        response.headers["X-Session-ID"] = session_id

    return response

@app.post("/api/chat")
@trace(event_type="chain", tracer=tracer)
async def chat_endpoint(message: str):
    tracer.enrich_span(metadata={"message_length": len(message)})
    response = await process_message(message)
    return {"response": response}

@trace(event_type="tool", tracer=tracer)
async def process_message(message: str):
    result = await llm_call(message)
    return result
```

### Flask

For synchronous frameworks, use `create_session()` instead of `acreate_session()`:

```python theme={null}
from flask import Flask, request
from honeyhive import HoneyHiveTracer, trace
import os

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    source="production"
)

app = Flask(__name__)

@app.before_request
def create_session_for_request():
    tracer.create_session(
        session_name=f"flask-{request.path}",
        inputs={"method": request.method}
    )

@app.after_request
def enrich_session_after_request(response):
    tracer.enrich_session(outputs={"status_code": response.status_code})
    return response

@app.route("/api/process", methods=["POST"])
@trace(event_type="tool", tracer=tracer)
def process_endpoint():
    return {"result": "ok"}
```

<Warning>
  **Don't use `session_start()` for web servers.** `session_start()` stores the session ID on the tracer instance itself, which causes race conditions when multiple requests run concurrently. Use `create_session()` or `acreate_session()` instead. They store the session ID in request-scoped baggage.
</Warning>
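The race is easy to reproduce with plain Python. The sketch below contrasts instance-level storage with context-local storage; `SharedState` is a hypothetical stand-in for a tracer that keeps a single session ID on the instance, and the barrier forces both simulated requests to be in flight at the same time.

```python
import threading
from contextvars import ContextVar

request_session: ContextVar[str] = ContextVar("request_session")


class SharedState:
    """Hypothetical stand-in for a tracer that keeps one session ID per instance."""

    session_id = None


shared = SharedState()
barrier = threading.Barrier(2)
seen_on_instance = {}
seen_in_context = {}


def handle(name: str) -> None:
    shared.session_id = name          # instance storage: last writer wins
    request_session.set(name)         # context storage: private to this thread
    barrier.wait()                    # both requests are now in flight
    seen_on_instance[name] = shared.session_id
    seen_in_context[name] = request_session.get()


threads = [threading.Thread(target=handle, args=(n,)) for n in ("req-a", "req-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen_on_instance)  # both requests report the same ID
print(seen_in_context)   # each request reports its own ID
```

With instance storage, one of the two requests always attributes its work to the other request's session. Context-local storage does not have that failure mode.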

### Multi-Turn Conversations

For multi-turn conversations, the first request creates a session and returns the ID to the client. Subsequent requests link to that session using `skip_api_call=True`, which sets the session context without making an API call.

```python theme={null}
@app.middleware("http")
async def session_middleware(request: Request, call_next):
    existing_session = request.headers.get("X-Session-ID")

    if existing_session:
        # Link to existing session (no API call)
        await tracer.acreate_session(
            session_id=existing_session,
            skip_api_call=True
        )
    else:
        # Create new session
        session_id = await tracer.acreate_session(
            session_name=f"conversation-{request.url.path}"
        )
        request.state.new_session_id = session_id

    response = await call_next(request)

    if hasattr(request.state, "new_session_id"):
        response.headers["X-Session-ID"] = request.state.new_session_id

    return response
```

| Scenario         | Code                                                           | When                                     |
| ---------------- | -------------------------------------------------------------- | ---------------------------------------- |
| Auto-generate ID | `create_session(session_name="request")`                       | New session, let HoneyHive assign the ID |
| Custom ID        | `create_session(session_id="my-id")`                           | Use your own ID scheme                   |
| Link to existing | `create_session(session_id="existing-id", skip_api_call=True)` | Session already exists in HoneyHive      |
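Because the `X-Session-ID` header comes from the client, you may want to validate it before linking to it. The helper below is a sketch under one assumption: session IDs are UUID strings, as in the `uuid.uuid4()` example earlier on this page. `normalize_session_id` is a hypothetical name, not an SDK function; adjust the check if your IDs use another format.

```python
import uuid
from typing import Optional


def normalize_session_id(raw: Optional[str]) -> Optional[str]:
    """Return a canonical UUID string, or None if the value is missing or malformed."""
    if not raw:
        return None
    try:
        # uuid.UUID raises ValueError for strings that are not valid UUIDs.
        return str(uuid.UUID(raw.strip()))
    except ValueError:
        return None


valid = normalize_session_id(" 12345678-1234-5678-1234-567812345678 ")
invalid = normalize_session_id("not-a-uuid")
missing = normalize_session_id(None)
print(valid, invalid, missing)
```

In the middleware, a `None` result would fall through to the branch that creates a new session instead of linking to an unverified ID.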

### Scoped Sessions

For single-use scripts, dedicated worker runs, or batch tasks where the rest of the current execution context belongs to the same logical unit of work, `with_session` can be convenient. For web requests, prefer `create_session()` or `acreate_session()` in middleware:

```python theme={null}
with tracer.with_session("batch-job", inputs={"batch_id": batch_id}) as session_id:
    process_batch(items)
    tracer.enrich_session(outputs={"processed": len(items)})
```

### Thread and Process Safety

The global tracer + `create_session()` pattern is safe for:

* **Multi-threaded and async servers** (FastAPI, Flask with threads) -- baggage uses `ContextVar`, which keeps a separate value per thread and per asyncio task
* **Multi-process deployments** (Gunicorn workers, uWSGI) -- each process gets its own tracer instance; processes don't share state

***

## Which span export mode should you use?

By default, the SDK exports spans asynchronously in batches using a background thread. `span.end()` returns immediately and spans are sent in the background, so export latency stays off your application's request path.

| Mode                        | Setting               | How it works                                                                                       | Best for                                         |
| --------------------------- | --------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
| **Batched async** (default) | `disable_batch=False` | Spans queue in memory and flush in a background thread (batch size: 100, flush interval: 5s)       | Web servers, long-running services               |
| **Immediate sync**          | `disable_batch=True`  | Each span is exported inline when it ends, so `span.end()` blocks until the HTTP request completes | AWS Lambda, Cloud Functions, short-lived scripts |

### Batched Async Export (Default)

```python theme={null}
tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    # disable_batch=False is the default
)
```

With batched export, spans accumulate in an internal queue and are sent in bulk every \~5 seconds (or when the queue fills up). This is the best mode for web servers and long-running services because it minimizes the performance impact of tracing on your application.

<Tip>
  Call `tracer.flush()` or `tracer.force_flush()` at the end of your process or notebook cell to drain any remaining spans from the queue before the process exits.
</Tip>
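A toy model makes it clearer why the final flush matters. The class below is a sketch, not the SDK's exporter: `ToyBatchExporter` is a hypothetical name, it models only the size trigger (batch size 100, as in the table above), and it leaves out the timer and the background thread.

```python
from typing import List


class ToyBatchExporter:
    """Toy model of size-triggered batching; not the SDK's exporter."""

    def __init__(self, max_batch_size: int = 100) -> None:
        self.max_batch_size = max_batch_size
        self.pending: List[str] = []
        self.exported: List[str] = []

    def on_span_end(self, span_name: str) -> None:
        # Ending a span only enqueues it; export happens when a batch fills.
        self.pending.append(span_name)
        if len(self.pending) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        # Send everything still queued, then clear the queue.
        self.exported.extend(self.pending)
        self.pending.clear()


exporter = ToyBatchExporter(max_batch_size=100)
for i in range(250):
    exporter.on_span_end(f"span-{i}")

stranded = len(exporter.pending)   # spans still queued if the process stopped here
exporter.flush()                   # what an explicit flush at exit accomplishes
print(stranded, len(exporter.exported))
```

After 250 spans, two full batches have gone out and 50 spans are still queued. Those are the spans an explicit flush rescues when the process exits before the next scheduled export.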

### Immediate Sync Export

```python theme={null}
tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    disable_batch=True,  # Export each span immediately
)
```

Use `disable_batch=True` when the runtime may freeze or terminate immediately after the handler returns, such as AWS Lambda, Google Cloud Functions, or one-off CLI scripts. In these environments, a background thread may not get a chance to flush before the process is frozen.

<Note>
  With `disable_batch=True`, each span exports synchronously when it ends, so `force_flush()` is effectively a no-op for spans that have already completed. If your handler spawns child threads or async tasks, make sure all work finishes (and spans end) before the handler returns; otherwise those spans may be lost when the runtime freezes.
</Note>

### Flushing

Both modes support explicit flushing:

```python theme={null}
# Drain all queued spans (batched mode) or confirm in-flight exports (immediate mode)
tracer.flush()                          # Alias for force_flush()
tracer.force_flush(timeout_millis=5000) # With explicit timeout
```

Use `flush()` / `force_flush()`:

* At the end of a Lambda handler, before returning the response
* At the end of a Jupyter notebook cell
* Before process exit in scripts
* In `atexit` handlers or signal handlers for graceful shutdown
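For the last item, one possible wiring uses only the standard library. This is a sketch: `install_flush_hooks` and `DemoTracer` are hypothetical names, and the stand-in tracer exists so the example runs without the SDK. It assumes your real tracer exposes `force_flush(timeout_millis=...)` as shown above.

```python
import atexit
import signal
import sys
import threading
from typing import Any, Callable


def install_flush_hooks(tracer: Any, timeout_millis: int = 5000) -> Callable[[], None]:
    """Flush on normal exit and on SIGTERM. Returns the flush callable."""

    def flush() -> None:
        tracer.force_flush(timeout_millis=timeout_millis)

    # Runs on normal interpreter exit, including sys.exit().
    atexit.register(flush)

    def on_sigterm(signum, frame) -> None:
        # Turn SIGTERM into a normal exit so the atexit hook above runs.
        sys.exit(0)

    # signal.signal() may only be called from the main thread.
    if threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGTERM, on_sigterm)

    return flush


class DemoTracer:
    """Stand-in so the sketch runs without the SDK; use your real tracer instead."""

    def __init__(self) -> None:
        self.flush_calls = []

    def force_flush(self, timeout_millis: int = 5000) -> None:
        self.flush_calls.append(timeout_millis)


demo_tracer = DemoTracer()
flush_now = install_flush_hooks(demo_tracer)
flush_now()  # manual flush; the registered hook flushes again at exit
```

Python does not run `atexit` handlers when the process is killed by an unhandled signal, which is why the sketch converts SIGTERM into `sys.exit()` rather than relying on `atexit` alone.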

***

## What are tracer initialization best practices?

<AccordionGroup>
  <Accordion title="Pass an explicit tracer to @trace">
    Passing `tracer=tracer` makes the binding explicit and avoids relying on implicit tracer discovery.

    ```python theme={null}
    tracer = HoneyHiveTracer.init(...)

    @trace(event_type="tool", tracer=tracer)  # Explicit
    def my_function():
        tracer.enrich_span(...)
    ```
  </Accordion>

  <Accordion title="Create sessions per logical unit of work">
    Even with a global tracer, create sessions to isolate traces by request, user, or job.

    ```python theme={null}
    # Per user request
    session_id = tracer.create_session(session_name=f"user-{user_id}")

    # Per batch job
    session_id = tracer.create_session(session_name=f"batch-{batch_id}")
    ```
  </Accordion>

  <Accordion title="Match session creation to the runtime">
    Use the tracer placement that matches your runtime:

    * Scripts and notebooks: initialize once in the module that starts the run
    * Lambda and other serverless runtimes: lazy-init outside the handler, then create a session per invocation
    * Web servers: initialize once at startup, then create a session per request
    * `evaluate()`: let the SDK create and manage tracers for you
  </Accordion>

  <Accordion title="Use test_mode for local development">
    `test_mode=True` (or the `HH_TEST_MODE=true` environment variable) disables OTLP export and generates a local session ID instead of creating one in HoneyHive. Use it for local development and tests when you want to exercise tracer setup without exporting spans or creating sessions in HoneyHive.

    ```python theme={null}
    tracer = HoneyHiveTracer.init(
        api_key=os.getenv("HH_API_KEY"),
        test_mode=True
    )
    ```
  </Accordion>
</AccordionGroup>

***

## Where should you go next?

<CardGroup cols={2}>
  <Card title="Production Deployment" icon="rocket" href="/v2/tutorials/production-deployment">
    Error handling, environment config, and deployment checklist
  </Card>

  <Card title="Multi-Instance Tracing" icon="layer-group" href="/v2/tracing/multi-instance">
    Run multiple tracer instances for multi-tenant or A/B testing
  </Card>

  <Card title="Distributed Tracing" icon="network-wired" href="/v2/tracing/distributed-tracing">
    Propagate trace context across service boundaries
  </Card>

  <Card title="Experiments" icon="flask-vial" href="/v2/evaluation/introduction">
    Run evaluations with automatic per-datapoint tracing
  </Card>

  <Card title="Span Filtering" icon="ban" href="/v2/tracing/filtering">
    Drop noisy framework spans using prefix-based rules
  </Card>

  <Card title="Environment Variables" icon="gear" href="/v2/sdk-reference/environment-variables">
    Full reference for SDK configuration via environment variables
  </Card>
</CardGroup>