HoneyHiveTracer class

The HoneyHiveTracer class initializes and manages a tracing session with the HoneyHive API and is built on OpenTelemetry. It encapsulates setting up the tracing environment, capturing telemetry, and sending updates for feedback, metrics, and metadata.

Our tracer uses OpenTelemetry as a base to auto-trace Python code. A full explanation of how this works in Python can be found here.

A general explanation of what OpenTelemetry is can be found here.

The code for the tracer is open source and can be found here.

Attributes

  • session_id (str or None): Stores the current session ID of the instance. This is None until a session is successfully initialized by the constructor.
  • api_key (str or None, static): Class-level attribute storing the API key used for authentication. Set during the first initialization (HoneyHiveTracer(...) or HoneyHiveTracer.init(...)) or via the HH_API_KEY environment variable. It is shared across all instances and tracer contexts.

Example Usage

from honeyhive import HoneyHiveTracer, enrich_session

# Initialize a session
tracer = HoneyHiveTracer.init(
    api_key="your-api-key",
    project="Project Name",
    session_name="Session Name", # Optional: Defaults to the script name
    source="source_identifier"   # Optional: Defaults to 'dev'
)

# Set feedback, metrics, and metadata during the session using the module-level function
# You can optionally pass the session_id if needed, otherwise it uses the current context's session.
enrich_session(feedback={'some_domain_expert': "Session feedback"}, session_id=tracer.session_id) # Example with explicit session_id
enrich_session(metrics={"metric_name": "metric_value"}) # Example using implicit session_id from context
enrich_session(metadata={"key": "value"})

# Set two or more of the following at once
enrich_session(
    feedback={'some_domain_expert': "Session feedback"},
    metrics={"metric_name": "metric_value"},
    metadata={"key": "value"}
)

# (optional) Flush trace data before ending the session
# You might need to run this in a separate thread if in an async context:
# await asyncio.to_thread(HoneyHiveTracer.flush)
HoneyHiveTracer.flush()

Methods

init (Static Method)

Initializes a HoneyHive tracing session by creating an instance of HoneyHiveTracer and setting up the tracing environment.

Automatic Git Information: During initialization, if the code is run inside a Git repository and the git command is available, the tracer will attempt to automatically capture information like the current commit hash, branch name, repository URL, and whether there are uncommitted changes. This information is added to the session’s initial metadata. This behavior can be disabled by setting the HONEYHIVE_TELEMETRY environment variable to false.
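As a rough illustration of the kind of information described above, the following sketch collects Git details with the standard git command-line tool. It is not the tracer's actual implementation: the function name collect_git_info and the returned dictionary keys are hypothetical, and only the HONEYHIVE_TELEMETRY variable comes from this page.

```python
import os
import subprocess


def _git(*args):
    """Run a git command and return its stripped stdout, or None on any failure."""
    try:
        out = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True, timeout=5
        )
        return out.stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return None


def collect_git_info():
    """Hypothetical helper: gather Git metadata unless telemetry is disabled."""
    if os.environ.get("HONEYHIVE_TELEMETRY", "true").strip().lower() == "false":
        return {}
    commit = _git("rev-parse", "HEAD")
    if commit is None:  # not inside a repository, or git is not installed
        return {}
    status = _git("status", "--porcelain")
    return {
        "commit_hash": commit,
        "branch": _git("rev-parse", "--abbrev-ref", "HEAD"),
        "repo_url": _git("config", "--get", "remote.origin.url"),
        "uncommitted_changes": bool(status),  # any porcelain output means a dirty tree
    }
```

Returning an empty dictionary on failure means a missing git binary or a non-repository directory never blocks initialization, which matches the "will attempt" wording above.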

Parameters:

  • api_key (str, optional): API key for authenticating with the HoneyHive service. If not provided, it checks the HH_API_KEY environment variable.
  • project (str, optional): Name of the project associated with this tracing session. If not provided, it checks the HH_PROJECT environment variable.
  • session_name (str, optional): Name for this specific session. Defaults to the name of the main Python script if possible, otherwise ‘unknown’.
  • source (str, optional): Source identifier, typically describing the environment or component that initiates the session. Defaults to the HH_SOURCE environment variable or ‘dev’.
  • server_url (str, optional): HoneyHive server URL. Defaults to the HH_API_URL environment variable or "https://api.honeyhive.ai".
  • session_id (str, optional): A specific session ID to use. If provided, the tracer attempts to resume or link to this existing session’s context (e.g., from a parent trace). If not provided, a new session ID is generated, or an existing one is potentially retrieved from the current context if automatically propagated.
  • disable_http_tracing (bool, optional): When set to True, spans for requests from common Python HTTP libraries (like requests, urllib3) will not be automatically traced (default: False). This can also be controlled via context propagation.
  • disable_batch (bool, optional): Whether to disable batching of trace data (default: False). Sending spans individually can increase network overhead.
  • verbose (bool, optional): Whether to print detailed debug information, including trace initialization details and potential errors, to the console (default: False).
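The fallback order described for api_key, project, source, and server_url can be sketched as follows. This is an illustrative stand-in rather than the SDK's code: the helper name resolve_init_settings and the use of ValueError are assumptions, while the environment variable names and defaults are the ones listed above.

```python
import os

DEFAULT_SERVER_URL = "https://api.honeyhive.ai"


def resolve_init_settings(api_key=None, project=None, source=None, server_url=None):
    """Hypothetical helper mirroring the documented fallbacks: argument, then env var, then default."""
    settings = {
        "api_key": api_key or os.environ.get("HH_API_KEY"),
        "project": project or os.environ.get("HH_PROJECT"),
        "source": source or os.environ.get("HH_SOURCE") or "dev",
        "server_url": server_url or os.environ.get("HH_API_URL") or DEFAULT_SERVER_URL,
    }
    # api_key and project have no default, so their absence is treated as an error.
    missing = [name for name in ("api_key", "project") if not settings[name]]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return settings
```

With HH_API_KEY set in the environment, calling resolve_init_settings(project="demo") would fill in the key from the environment and fall back to 'dev' and the default server URL for the rest.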

Usage Example:

# Preferred initialization using the constructor
tracer = HoneyHiveTracer(
    api_key="<YOUR_API_KEY>",  # Or set HH_API_KEY env var
    project="My Project",      # Or set HH_PROJECT env var
    session_name="Data Processing Run",
    source="production-worker-1",
    disable_batch=True,
    verbose=True
)

print(f"Initialized HoneyHive session: {tracer.session_id}")

Raises: Initialization failures such as a missing API key or project raise an Exception and print an error message. Configuration issues may raise specific SDKError types. When verbose is False, some non-critical errors may only be logged internally rather than raised, but essential failures such as missing credentials still halt execution.


enrich_session (Module-Level Function)

Adds context (metadata, feedback, metrics, etc.) to an existing session. This documentation primarily covers the module-level function enrich_session, which is the recommended way to enrich sessions.

There is also an instance method tracer_instance.enrich_session(...) available on HoneyHiveTracer objects, but the module-level function is generally preferred as it can automatically determine the relevant session from the current context.

# Assuming HoneyHiveTracer has been initialized previously
from honeyhive import enrich_session

# Enrich the session associated with the current context
enrich_session(feedback={"user_rating": 5, "comment": "Very helpful!"})

# Enrich a specific session by providing its ID
enrich_session(session_id="specific-session-uuid", metrics={"accuracy": 0.95})

Parameters: Can take any or all of the following parameters:

  • session_id (str, optional): The ID of the session to enrich. If not provided, the function attempts to retrieve the session_id from the active tracing context (established during HoneyHiveTracer initialization or context propagation).
  • config (dict, optional): Dictionary of configuration settings related to the session.
  • feedback (dict, optional): Dictionary of feedback data.
  • metrics (dict, optional): Dictionary of metrics data.
  • metadata (dict, optional): Dictionary of metadata.
  • outputs (dict, optional): Dictionary of session outputs.
  • user_properties (dict, optional): Dictionary of user properties.

Note: The inputs parameter is currently not supported for enriching sessions via this function.

Raises:

  • Exception: If the HoneyHiveTracer was not initialized successfully (needed to make API calls).
  • Exception: If session_id is not provided and cannot be found in the current context.
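The session lookup described above (explicit session_id first, then the active context, otherwise an error) can be pictured with a small stand-in built on contextvars. The names _current_session_id and resolve_session_id are hypothetical, and the real SDK manages its context internally in ways this page does not specify.

```python
import contextvars

# Stand-in for the tracing context; None means no session has been established.
_current_session_id = contextvars.ContextVar("session_id", default=None)


def resolve_session_id(session_id=None):
    """Hypothetical helper: prefer the explicit ID, else fall back to the context's ID."""
    resolved = session_id or _current_session_id.get()
    if resolved is None:
        raise Exception("session_id not provided and not found in the current context")
    return resolved


# Simulate an initialized tracer by placing a session ID in the context.
token = _current_session_id.set("ctx-session")
implicit = resolve_session_id()            # taken from the context
explicit = resolve_session_id("explicit")  # the explicit argument wins
_current_session_id.reset(token)           # back to "no session"
```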

enrich_span

Adds context to the current active span. Important: This function must be called from within a function that is decorated with @trace or @atrace. If called outside of an active span managed by these decorators, it has no effect and logs a warning.

Parameters: Can take any or all of the following:

  • config (dict): Dictionary of configuration settings related to the function.
  • feedback (dict): Dictionary of feedback to be sent to HoneyHive.
  • metrics (dict): Dictionary of metrics to be sent to HoneyHive.
  • metadata (dict): Dictionary of metadata to be sent to HoneyHive.
  • inputs (dict): Dictionary of inputs to be sent to HoneyHive.
  • outputs (dict): Dictionary of outputs to be sent to HoneyHive.
  • error (str): String describing the error that occurred.

flush

Flushes all pending trace data.

Usage Example:

HoneyHiveTracer.flush()
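In an async application, a blocking flush can be moved to a worker thread with asyncio.to_thread (Python 3.9+), as the example usage earlier on this page suggests. The sketch below uses a stand-in function, blocking_flush, in place of HoneyHiveTracer.flush so it runs without the SDK; the sleep duration is arbitrary.

```python
import asyncio
import time


def blocking_flush():
    """Stand-in for a blocking call such as HoneyHiveTracer.flush()."""
    time.sleep(0.05)  # simulate network I/O
    return "flushed"


async def main():
    # Run the blocking call in a worker thread so the event loop stays responsive.
    flush_task = asyncio.create_task(asyncio.to_thread(blocking_flush))
    ticks = 0
    while not flush_task.done():
        await asyncio.sleep(0.01)  # other coroutines can make progress here
        ticks += 1
    return await flush_task, ticks


result, ticks = asyncio.run(main())
```

The ticks counter only shows that the event loop kept running while the flush was in progress; real code would simply await asyncio.to_thread(...) directly.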

@trace decorator

The @trace decorator is a utility provided by HoneyHive to easily add custom spans to your application. It captures function inputs, outputs, and additional metadata, providing deeper insights into your application’s behavior.

Usage

To use the @trace decorator, first initialize the HoneyHiveTracer:

from honeyhive import HoneyHiveTracer

HoneyHiveTracer.init(
    api_key="your-api-key",
    project="Project Name",
    source="source_identifier",
    session_name="Session Name"
)

Then, import and apply the trace decorator to any function you want to trace:

from honeyhive import trace

@trace(
    event_type="model",
    config={"important_setting": True},
    metadata={"version": "1.0.0"},
    event_name="my-event-name"
)
def my_function(param1, param2):
    # Function code here
    result = f"Processed {param1} and {param2}"
    return result

Parameters

The trace decorator accepts the following parameters:

  • event_type (str, optional): Type of the event. Must be one of ‘tool’, ‘model’, or ‘chain’.
  • config (dict, optional): A dictionary of configuration settings related to the function.
  • metadata (dict, optional): A dictionary of additional metadata to be associated with the span.
  • event_name (str, optional): A custom name for the span created for this trace. If not provided, it defaults to the name of the decorated function.

Behavior

When applied to a function, the trace decorator:

  • Creates a new span named after the function, or after event_name if one is provided.
  • Captures all function inputs (parameters) as span attributes.
  • Adds any provided config and metadata as span attributes.
  • Executes the function.
  • Captures the function’s return value as a span attribute.
  • Ends the span.
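The steps listed above can be sketched with a plain Python decorator. This is a simplified stand-in, not the HoneyHive implementation: trace_sketch and recorded_spans are hypothetical names, and spans are stored as dictionaries in a list instead of being exported through OpenTelemetry.

```python
import functools
import inspect

recorded_spans = []  # stand-in for an exporter; real spans are sent to HoneyHive


def trace_sketch(event_type=None, config=None, metadata=None, event_name=None):
    """Hypothetical decorator that mimics the documented behavior of @trace."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            span = {
                "name": event_name or func.__name__,  # create the span
                "event_type": event_type,
                "inputs": dict(bound.arguments),       # capture inputs
                "config": dict(config or {}),          # attach config and metadata
                "metadata": dict(metadata or {}),
            }
            try:
                result = func(*args, **kwargs)         # execute the function
                span["outputs"] = {"result": result}   # capture the return value
                return result
            finally:
                recorded_spans.append(span)            # end the span, even on error
        return wrapper
    return decorator


@trace_sketch(event_type="tool", metadata={"version": "1.0.0"})
def add(a, b=2):
    return a + b


add(1)
```

Binding the arguments through inspect.signature lets the sketch record parameters by name, including defaults, which is one plausible way to capture "all function inputs".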

Async Support: @atrace

For tracing asynchronous functions (defined with async def), use the @atrace decorator instead. It functions identically to @trace but correctly handles async execution contexts.

import asyncio
from honeyhive import atrace

@atrace(
    event_type="model",
    config={"async_setting": True},
    metadata={"version": "1.1.0"},
    event_name="my-async-event"
)
async def my_async_function(param1, param2):
    # Async function code here
    await asyncio.sleep(0.1) # Example async operation
    result = f"Processed {param1} and {param2} asynchronously"
    return result

# Example of running the async function
# asyncio.run(my_async_function("data1", "data2"))