This guide demonstrates how to integrate HoneyHive tracing with LiteLLM, a unified interface for calling 100+ LLMs using the OpenAI format, to monitor and optimize your LLM operations.
Start by initializing the HoneyHive tracer at the beginning of your application:
```python
import os
import litellm
from honeyhive import HoneyHiveTracer

# Set your API keys
HONEYHIVE_API_KEY = "your honeyhive api key"
OPENAI_API_KEY = "your openai api key"

# Set OpenAI API key for LiteLLM
litellm.api_key = OPENAI_API_KEY

# Initialize HoneyHive tracer
HoneyHiveTracer.init(
    api_key=HONEYHIVE_API_KEY,
    project="your project name",
    source="dev",
    session_name="litellm_example"
)
```
Here’s a complete example of using LiteLLM with HoneyHive tracing:
```python
import os
import litellm
from honeyhive import HoneyHiveTracer, trace

# Set your API keys
HONEYHIVE_API_KEY = "your honeyhive api key"
OPENAI_API_KEY = "your openai api key"

# Set OpenAI API key for LiteLLM
litellm.api_key = OPENAI_API_KEY

# Initialize HoneyHive tracer
HoneyHiveTracer.init(
    api_key=HONEYHIVE_API_KEY,
    project="your project name",
    source="dev",
    session_name="litellm_example"
)

@trace
def initialize_litellm():
    # Implementation as shown above
    pass

@trace
def generate_completion(prompt, model="gpt-4o-mini", temperature=0.7, max_tokens=500):
    # Implementation as shown above
    pass

@trace
def generate_chat_completion(messages, model="gpt-3.5-turbo", temperature=0.7, max_tokens=500):
    # Implementation as shown above
    pass

@trace
def generate_embedding(text, model="text-embedding-ada-002"):
    # Implementation as shown above
    pass

@trace
def process_with_fallback(messages, primary_model="gpt-3.5-turbo", fallback_model="gpt-4"):
    """Process messages with a fallback model if the primary model fails."""
    try:
        # Try primary model first
        print(f"Attempting to use primary model: {primary_model}")
        return generate_chat_completion(messages, model=primary_model)
    except Exception as primary_error:
        print(f"Primary model failed: {primary_error}")
        try:
            # Fall back to secondary model
            print(f"Falling back to secondary model: {fallback_model}")
            return generate_chat_completion(messages, model=fallback_model)
        except Exception as fallback_error:
            print(f"Fallback model also failed: {fallback_error}")
            raise

@trace
def batch_process_prompts(prompts, model="gpt-3.5-turbo"):
    """Process multiple prompts in batch with tracing."""
    results = []
    for i, prompt in enumerate(prompts):
        try:
            print(f"Processing prompt {i+1}/{len(prompts)}")
            result = generate_completion(prompt, model=model)
            results.append({"prompt": prompt, "completion": result, "status": "success"})
        except Exception as e:
            print(f"Error processing prompt {i+1}: {e}")
            results.append({"prompt": prompt, "completion": None, "status": "error", "error": str(e)})
    return results

def main():
    # Initialize LiteLLM
    initialize_litellm()

    # Example 1: Simple completion
    prompt = "Explain the concept of vector databases in simple terms."
    completion = generate_completion(prompt)
    print("\n=== Simple Completion ===")
    print(completion)

    # Example 2: Chat completion
    messages = [
        {"role": "system", "content": "You are a helpful assistant that explains technical concepts clearly."},
        {"role": "user", "content": "What is HoneyHive and how does it help with AI observability?"}
    ]
    chat_completion = generate_chat_completion(messages)
    print("\n=== Chat Completion ===")
    print(chat_completion)

    # Example 3: Generate embedding
    text = "HoneyHive provides tracing and monitoring for AI applications."
    embedding = generate_embedding(text)
    print("\n=== Embedding ===")
    print(f"Generated embeddings: {embedding}")

    # Example 4: Process with fallback
    fallback_messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short poem about AI observability."}
    ]
    fallback_result = process_with_fallback(fallback_messages)
    print("\n=== Fallback Processing ===")
    print(fallback_result)

    # Example 5: Batch processing
    batch_prompts = [
        "What are vector databases?",
        "Explain the concept of RAG in AI applications.",
        "How does tracing help improve AI applications?"
    ]
    batch_results = batch_process_prompts(batch_prompts)
    print("\n=== Batch Processing Results ===")
    for i, result in enumerate(batch_results):
        print(f"Prompt {i+1} Status: {result['status']}")

if __name__ == "__main__":
    main()
```
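The helper functions above are stubbed out because their bodies appear earlier in this guide. For reference, here is a minimal sketch of how such helpers might wrap LiteLLM's completion and embedding calls; the exact parameters and response handling may differ from the versions shown earlier:

```python
import litellm
from honeyhive import trace

@trace
def generate_completion(prompt, model="gpt-4o-mini", temperature=0.7, max_tokens=500):
    # LiteLLM uses the OpenAI chat format, so wrap the plain prompt as a user message
    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

@trace
def generate_chat_completion(messages, model="gpt-3.5-turbo", temperature=0.7, max_tokens=500):
    # Pass OpenAI-style messages straight through to LiteLLM
    response = litellm.completion(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

@trace
def generate_embedding(text, model="text-embedding-ada-002"):
    # LiteLLM's embedding call also mirrors the OpenAI response shape
    response = litellm.embedding(model=model, input=[text])
    return response.data[0]["embedding"]
```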
LiteLLM supports fallback mechanisms when a primary model fails. You can trace this behavior to understand failure patterns:
```python
@trace
def process_with_fallback(messages, primary_model="gpt-3.5-turbo", fallback_model="gpt-4"):
    try:
        # Try primary model first
        print(f"Attempting to use primary model: {primary_model}")
        return generate_chat_completion(messages, model=primary_model)
    except Exception as primary_error:
        print(f"Primary model failed: {primary_error}")
        try:
            # Fall back to secondary model
            print(f"Falling back to secondary model: {fallback_model}")
            return generate_chat_completion(messages, model=fallback_model)
        except Exception as fallback_error:
            print(f"Fallback model also failed: {fallback_error}")
            raise
```
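To make failure patterns easier to filter in HoneyHive, you could also record which model ultimately served each request on the span itself. The sketch below assumes your version of the HoneyHive SDK exposes `enrich_span` for attaching metadata to the current trace span; check the SDK reference if it does not:

```python
from honeyhive import trace, enrich_span

@trace
def process_with_fallback(messages, primary_model="gpt-3.5-turbo", fallback_model="gpt-4"):
    try:
        result = generate_chat_completion(messages, model=primary_model)
        # Assumption: enrich_span adds metadata to the span created by @trace
        enrich_span(metadata={"served_by": primary_model, "fallback_used": False})
        return result
    except Exception as primary_error:
        result = generate_chat_completion(messages, model=fallback_model)
        enrich_span(metadata={
            "served_by": fallback_model,
            "fallback_used": True,
            "primary_error": str(primary_error),
        })
        return result
```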
- Experiment with different LLM providers through LiteLLM (see the multi-provider sketch below)
- Add custom metrics to your traces
- Implement A/B testing of different models
- Explore HoneyHive's evaluation capabilities for your LLM responses
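For the first item, LiteLLM's unified interface means switching providers is usually just a matter of changing the model string, and every call made inside a `@trace`-decorated function still shows up in HoneyHive. A minimal sketch, assuming the tracer has already been initialized as above; the model names and provider API keys are illustrative and may differ for your account:

```python
import os
import litellm
from honeyhive import trace

# Each provider needs its own API key; LiteLLM reads these from the environment
os.environ["OPENAI_API_KEY"] = "your openai api key"
os.environ["ANTHROPIC_API_KEY"] = "your anthropic api key"

@trace
def compare_providers(messages, models=("gpt-4o-mini", "claude-3-haiku-20240307")):
    """Send the same messages to several providers through LiteLLM's unified API."""
    results = {}
    for model in models:
        response = litellm.completion(model=model, messages=messages)
        results[model] = response.choices[0].message.content
    return results

answers = compare_providers(
    [{"role": "user", "content": "Summarize LLM observability in one sentence."}]
)
for model, answer in answers.items():
    print(f"{model}: {answer}")
```

Because each provider call runs inside the same traced function, the resulting spans can be compared side by side in HoneyHive, which is also a simple starting point for A/B testing different models.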
By integrating HoneyHive with LiteLLM, you gain valuable insights into your LLM operations and can optimize for better performance, cost-efficiency, and response quality.