Introduction

Our Python SDK allows you to log individual LLM requests as well as full pipeline traces. This allows you to monitor your LLM’s performance and log user feedback and ground truth labels associated with each request.

For an in-depth overview of how our logging data is structured, please see our Logging Overview page.

Get API key

After signing up on the app, you can find your API key in the Settings page under Account.

Install the SDK

We currently support a native Python SDK. For other languages, we encourage sending requests to our API directly with an HTTP client; a minimal sketch follows the install command below.

!pip install honeyhive -q
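If you’re working in another language, the same data can be sent as a plain HTTP POST. The sketch below uses Python’s requests library purely to illustrate the request shape; the endpoint path and bearer-token header are assumptions here, so consult our API reference for the exact URL and payload schema.

import requests

# Hypothetical sketch of logging a generation over raw HTTP.
# The endpoint path and auth scheme are assumptions -- check the
# HoneyHive API reference for the exact URL and payload schema.
resp = requests.post(
    "https://api.honeyhive.ai/generations/log",  # assumed endpoint
    headers={"Authorization": "Bearer HONEYHIVE_API_KEY"},
    json={
        "project": "Sandbox - Email Writer",
        "source": "staging",
        "model": "gpt-3.5-turbo",
        "generation": "Hi there! ...",
        "latency": 812.5,  # milliseconds
    },
)
resp.raise_for_status()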

Capture relevant details on your completion requests

This method allows you to log arbitrary LLM requests client-side without proxying them through HoneyHive’s servers. Using this method, evaluation metrics such as custom metrics and AI feedback functions will be computed automatically based on the metrics you’ve defined and enabled in the Metrics page. Learn more about defining evaluation metrics here.

Let’s start by running an OpenAI Chat Completion request and calculate a basic metric like latency.

We’re using OpenAI, Anthropic, and Hugging Face models in this guide simply to demonstrate how to log requests with HoneyHive. You can alternatively use the SDK to log completion requests from any other model provider, such as Cohere, AI21 Labs, or your own custom, self-hosted models.

import honeyhive
from honeyhive.sdk.utils import fill_template
from openai import OpenAI
import time

honeyhive.api_key = "HONEYHIVE_API_KEY"
client = OpenAI(api_key="OPENAI_API_KEY")

USER_TEMPLATE = "Write me an email on {{topic}} in a {{tone}} tone."
user_inputs = {
    "topic": "AI Services",
    "tone": "Friendly"
}
#"Write an email on AI Services in a Friendly tone."
user_message = fill_template(USER_TEMPLATE, user_inputs)

start = time.perf_counter()

openai_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    temperature=0.7,
    max_tokens=100,
    messages=[
      {"role": "system", "content": "You are a helpful assistant who writes emails."},
      {"role": "user", "content": user_message}
    ]
)

end = time.perf_counter()

request_latency = (end - start) * 1000  # latency in milliseconds
generation = openai_response.choices[0].message.content
token_usage = openai_response.usage
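Here, fill_template performs simple mustache-style substitution of the {{...}} placeholders. If you want to see what that amounts to (or aren’t using the SDK helper), a minimal equivalent, assuming only plain {{key}} placeholders, looks like this:

def fill_template_basic(template: str, inputs: dict) -> str:
    # Replace each {{key}} placeholder with its value from the inputs dict.
    # Minimal stand-in for honeyhive.sdk.utils.fill_template, assuming it
    # does plain substitution.
    for key, value in inputs.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template

print(fill_template_basic(USER_TEMPLATE, user_inputs))
# "Write me an email on AI Services in a Friendly tone."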

Log your completion request

Now that you’ve run the request, let’s try logging the request and some user metadata in HoneyHive.

Adding a session_id field in the metadata object enables session tracking across related completions.

session_id = "YOUR_SESSION_ID"  # any string that groups related completions, e.g. a UUID

response = honeyhive.generations.log(
    project="Sandbox - Email Writer",
    source="staging",
    model="gpt-3.5-turbo",
    hyperparameters={
        "temperature": 0.7,
        "max_tokens": 100,
    },
    prompt_template=USER_TEMPLATE,
    inputs=user_inputs,
    generation=generation,
    metadata={
        "session_id": session_id  # Optionally specify a session id to track related completions
    },
    usage=token_usage,
    latency=request_latency,
    user_properties={
        "user_device": "Macbook Pro",
        "user_Id": "92739527492",
        "user_country": "United States",
        "user_subscriptiontier": "Enterprise",
        "user_tenantID": "Acme Inc."
    }
)

Using this method, you will not be able to use our Prompt CI/CD capabilities within the platform, and cost and latency metrics will not be calculated automatically. To update prompts, you will need to manually update the prompt, model provider, and hyperparameter settings in your codebase when deploying new variants to production.
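One common way to keep those manual updates contained (a general pattern, not a HoneyHive feature) is to centralize the prompt, model, and hyperparameter settings in a single config object that both your completion call and your logging call read from, so deploying a new variant means editing one place:

# General pattern (not HoneyHive-specific): a single source of truth for
# the prompt-model configuration used by both the request and the log call.
PROMPT_CONFIG = {
    "model": "gpt-3.5-turbo",
    "hyperparameters": {"temperature": 0.7, "max_tokens": 100},
    "prompt_template": USER_TEMPLATE,
}

openai_response = client.chat.completions.create(
    model=PROMPT_CONFIG["model"],
    messages=[{"role": "user", "content": user_message}],
    **PROMPT_CONFIG["hyperparameters"],
)

response = honeyhive.generations.log(
    project="Sandbox - Email Writer",
    source="staging",
    model=PROMPT_CONFIG["model"],
    hyperparameters=PROMPT_CONFIG["hyperparameters"],
    prompt_template=PROMPT_CONFIG["prompt_template"],
    inputs=user_inputs,
    generation=openai_response.choices[0].message.content,
)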

Log user feedback and ground truth labels

Now that you’ve logged a request in HoneyHive, let’s try logging user feedback and ground truth labels associated with that completion.

Using the generation_id that is returned, you can send arbitrary feedback to HoneyHive using the feedback endpoint.

from honeyhive.sdk.feedback import generation_feedback
generation_feedback(
    project="Sandbox - Email Writer",
    generation_id=response.generation_id,
    ground_truth="INSERT_GROUND_TRUTH_LABEL",
    feedback_json={
        "provided": True,
        "accepted": False,
        "edited": True
    }
)

[Optional] Proxy requests via HoneyHive

Alternatively, you can call the currently deployed prompt-model configuration within a specified project without passing all the parameters yourself. Using this method, we automatically route your requests to the prompt-model configuration that is active within the platform and capture basic metrics like cost and latency.

More documentation can be found on our saved prompt generations API page.

import honeyhive

honeyhive.api_key = "HONEYHIVE_API_KEY"
honeyhive.openai_api_key = "OPENAI_API_KEY"

response = honeyhive.generations.generate(
    project="Sandbox - Email Writer",
    source="staging",
    input={
        "topic": "Model evaluation for companies using GPT-4",
        "tone": "friendly"
    },
)
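
If the response includes a generation_id (as generations.log does above), you can pass it to the feedback endpoint shown earlier to capture user feedback on proxied requests as well.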