Add HoneyHive observability to your LiteLLM applications
LiteLLM provides a unified interface for calling 100+ LLMs using the OpenAI format. HoneyHive integrates with LiteLLM via the OpenInference instrumentor, automatically capturing all completion calls, token usage, and model metadata across providers.
Add HoneyHive tracing in just 4 lines of code. Add this to your existing LiteLLM app and all completion calls are automatically traced, regardless of the underlying provider.
To see where to initialize the tracer for your environment, including AWS Lambda and long-running servers, see Tracer Initialization.
```shell
pip install "honeyhive[openinference-litellm]>=1.0.0rc0"

# Or install separately
pip install "honeyhive>=1.0.0rc0" openinference-instrumentation-litellm litellm
```
```python
import os

import litellm
from honeyhive import HoneyHiveTracer
from openinference.instrumentation.litellm import LiteLLMInstrumentor

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    project=os.getenv("HH_PROJECT"),
)
LiteLLMInstrumentor().instrument(tracer_provider=tracer.provider)

# Your existing LiteLLM code works unchanged
```
The instrumentor automatically captures:

- **Completions** - `litellm.completion()` and `litellm.acompletion()` calls, with inputs, outputs, and token usage
- **Embeddings** - `litellm.embedding()` and `litellm.aembedding()` requests
- **Image generation** - `litellm.image_generation()` and `litellm.aimage_generation()` calls
- **Multi-provider routing** - model name and provider metadata for every call
No manual instrumentation required.
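Conceptually, this works because the instrumentor wraps each LiteLLM entry point once at startup, so every subsequent call is recorded without any change to application code. The following is a toy sketch of that pattern, not HoneyHive's actual implementation:

```python
# Toy sketch of auto-instrumentation: wrap a library function once,
# and every later call through it is captured as a "span".
import functools

captured_spans = []

def instrument(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        captured_spans.append({"name": fn.__name__, "kwargs": kwargs})
        return result
    return wrapper

def completion(**kwargs):
    """Stand-in for litellm.completion()."""
    return f"response from {kwargs['model']}"

# The instrumentor replaces the function a single time...
completion = instrument(completion)

# ...and the application code stays unchanged.
completion(model="openai/gpt-4o-mini", messages=[{"role": "user", "content": "hi"}])
print(len(captured_spans))  # -> 1: one span recorded automatically
```

The real instrumentor records OpenTelemetry spans rather than appending to a list, but the mechanism is the same: a one-time wrap, zero per-call code.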
Use module-level calls. The instrumentor patches attributes on the `litellm` module (e.g. `litellm.completion`), so a function imported directly with `from litellm import completion` is bound before the patch and bypasses tracing. Always call `litellm.completion()` rather than importing the function directly.
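This is standard Python behavior, illustrated here with a toy module rather than `litellm` itself: patching a module attribute only affects lookups that go through the module object, not names bound earlier via `from ... import ...`.

```python
# Toy demonstration of why module-level calls matter for instrumentation.
import types

toy = types.ModuleType("toy")
toy.completion = lambda: "original"

# Equivalent to `from toy import completion` -- the name is bound NOW,
# to the original function object.
completion = toy.completion

# Later, an instrumentor replaces the module attribute.
toy.completion = lambda: "instrumented"

print(completion())      # -> original      (bypasses the patch)
print(toy.completion())  # -> instrumented  (sees the patch)
```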
```python
import os

import litellm
from honeyhive import HoneyHiveTracer
from openinference.instrumentation.litellm import LiteLLMInstrumentor

tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    project=os.getenv("HH_PROJECT"),
)
LiteLLMInstrumentor().instrument(tracer_provider=tracer.provider)

# OpenAI via LiteLLM
response = litellm.completion(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    max_tokens=50,
)
print(response.choices[0].message.content)

# Anthropic via LiteLLM - same interface, automatically traced
response2 = litellm.completion(
    model="anthropic/claude-haiku-4-5-20251001",
    messages=[
        {"role": "user", "content": "Tell me a fun fact about Paris."},
    ],
    max_tokens=100,
)
print(response2.choices[0].message.content)
```
Avoid duplicate spans. If you also use provider-specific instrumentors (e.g., OpenAIInstrumentor), LiteLLM calls to that provider may produce duplicate spans. Use only the LiteLLM instrumentor when routing calls through LiteLLM.
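The duplication is easy to see with a toy illustration (not the real instrumentors): when two instrumentors wrap the same call path, each layer records its own span, so a single completion is counted twice.

```python
# Toy illustration of span duplication from stacked instrumentors.
spans = []

def make_instrumentor(label):
    """Returns a decorator that records a span, standing in for a real instrumentor."""
    def instrument(fn):
        def wrapper(*args, **kwargs):
            spans.append(label)
            return fn(*args, **kwargs)
        return wrapper
    return instrument

def completion(**kwargs):
    """Stand-in for the underlying provider call."""
    return "ok"

# Both a provider instrumentor and the LiteLLM instrumentor patch the same path.
completion = make_instrumentor("litellm")(completion)
completion = make_instrumentor("openai")(completion)

completion(model="openai/gpt-4o-mini")
print(spans)  # -> ['openai', 'litellm']: two spans for a single call
```

If a provider instrumentor was already enabled earlier in the process, the OpenInference instrumentors generally expose an `uninstrument()` method (inherited from the shared `BaseInstrumentor` interface) that undoes their patching.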