Use trace enrichment to tag requests with experiment information, then analyze results using HoneyHive’s charting tools. This approach works with any feature flag system (Statsig, LaunchDarkly, or a custom one): pass the experiment ID and variant to your traces.

How to Run Online Experiments

Prerequisites: HoneyHive tracing set up per the Quickstart.
1. Tag traces with experiment metadata and capture feedback

When a request is part of an experiment, enrich the span with the experiment ID, variant, and user feedback:
import os
from honeyhive import HoneyHiveTracer, trace

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    project=os.getenv("HH_PROJECT")
)

@trace
def generate_response(user_input: str, experiment_id: str, variant: str):
    # Tag this span with experiment info
    tracer.enrich_span(metadata={
        "experiment_id": experiment_id,
        "variant": variant  # e.g., "control" or "treatment"
    })
    
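    # Your LLM logic here (call_llm is a placeholder for your own model call)
    response = call_llm(user_input)
    
    # Later, when the user provides feedback, attach it to this span
    # (user_liked_response is a placeholder for the value your app collects)
    tracer.enrich_span(feedback={"liked": user_liked_response})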
    
    return response
The feedback object accepts any structure. Use a boolean for thumbs up/down (liked: true), a number for ratings (score: 4), or a string for comments.
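The variant normally comes from your feature flag system. If you use a custom setup instead, the following is a minimal sketch of one common approach: hashing the user ID so that the same user always lands in the same variant. The function name `assign_variant`, the `treatment_share` parameter, and the example IDs are hypothetical and not part of HoneyHive or any flag SDK.

```python
import hashlib


def assign_variant(user_id: str, experiment_id: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to "control" or "treatment".

    Hypothetical helper: hashes the experiment and user IDs together so the
    assignment is stable across requests and independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the first 32 bits of the hash into a number in [0, 1)
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

The returned string can then be passed as the `variant` argument of `generate_response`, for example `generate_response(user_input, "prompt-v2-test", assign_variant(user_id, "prompt-v2-test"))`, where `"prompt-v2-test"` is an illustrative experiment ID.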
2. Analyze results in HoneyHive

  1. Go to Discover to build charts
  2. Select All Events to analyze at the span level
  3. Filter by metadata.experiment_id to isolate experiment data
  4. Set metric to your feedback field (e.g., feedback.liked)
  5. Group by metadata.variant to compare control vs treatment
This shows you how each variant performs on your chosen metric.
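To make the comparison concrete, the chart built in the steps above amounts to a grouped average of the feedback field. The sketch below reproduces that calculation in plain Python over a list of event dictionaries. The event shape (`metadata` and `feedback` keys) mirrors the fields used in this guide, but it is an assumption for illustration, not HoneyHive's export format, and `like_rate_by_variant` is a hypothetical helper.

```python
from collections import defaultdict


def like_rate_by_variant(events, experiment_id):
    """Return the share of liked responses per variant for one experiment.

    `events` is assumed to be a list of dicts with "metadata" and "feedback"
    keys; events without feedback.liked are skipped.
    """
    totals = defaultdict(lambda: [0, 0])  # variant -> [likes, count]
    for event in events:
        metadata = event.get("metadata", {})
        liked = event.get("feedback", {}).get("liked")
        # Keep only events from this experiment that carry feedback
        if metadata.get("experiment_id") != experiment_id or liked is None:
            continue
        variant = metadata.get("variant", "unknown")
        totals[variant][0] += int(bool(liked))
        totals[variant][1] += 1
    return {variant: likes / count for variant, (likes, count) in totals.items()}
```

With only a handful of feedback events per variant, a difference in these rates can easily be noise, so collect enough feedback before drawing conclusions.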
