Query your trace data programmatically using the HoneyHive SDK. This is useful for building custom analytics, exporting data for fine-tuning, or integrating with external systems.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| project | String | Yes | Project name to query |
| filters | List[EventFilter] | Yes | Filters to apply |
| limit | Integer | No | Max results per page (default: 1000, max: 7500) |
| page | Integer | No | Page number (default: 1) |
| date_range | Object | No | Date range with $gte and $lte ISO 8601 strings |
EventFilter Fields
| Field | Type | Description |
|---|---|---|
| field | String | Field to filter on (e.g., event_type, session_id, metadata.cost) |
| value | String | Value to match |
| operator | Operator | One of: is_, is_not, contains, not_contains, greater_than |
| type | Type | Data type: string, number, boolean, id |
Setup
import os
from honeyhive import HoneyHive
from honeyhive.models import EventFilter
from honeyhive.models.generated import Operator, Type
client = HoneyHive(api_key=os.environ["HH_API_KEY"])
Query Model Events
Retrieve all LLM model events from your project:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
    limit=100,
)

print(f"Total events: {result['totalEvents']}")
for event in result["events"]:
    print(f" {event.event_name}: {event.duration}ms")
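The `limit` and `page` parameters can be combined to walk through a result set larger than one page. The following is a minimal sketch, not part of the SDK: `iter_all_events` and `fake_fetch` are hypothetical names, and the stop condition (a page shorter than `limit`) is an assumption about how the last page behaves. A fake fetcher stands in for the API so the loop runs offline.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical signature: a callable taking (page, limit) and returning a dict
# shaped like the responses above: {"events": [...], "totalEvents": int}.
FetchPage = Callable[[int, int], Dict[str, Any]]


def iter_all_events(fetch_page: FetchPage, limit: int = 1000) -> Iterator[Any]:
    """Yield events page by page until a page comes back short or empty."""
    page = 1
    while True:
        result = fetch_page(page, limit)
        events: List[Any] = result["events"]
        yield from events
        if len(events) < limit:
            # Assumption: a short (or empty) page means there is nothing left.
            break
        page += 1


# Stand-in for the real API so the sketch runs offline: 250 fake events.
_FAKE_EVENTS = [{"event_name": f"call-{i}"} for i in range(250)]


def fake_fetch(page: int, limit: int) -> Dict[str, Any]:
    start = (page - 1) * limit
    return {
        "events": _FAKE_EVENTS[start:start + limit],
        "totalEvents": len(_FAKE_EVENTS),
    }


all_events = list(iter_all_events(fake_fetch, limit=100))
print(len(all_events))
```

With a real client, `fetch_page` would presumably wrap `client.events.get_events(...)` and forward `page` and `limit` alongside the `project` and `filters` arguments shown above.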
Query Events in a Session
Get all events belonging to a specific trace/session:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="session_id",
            value="<SESSION_ID>",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
)

for event in result["events"]:
    print(f"{event.event_type}: {event.event_name}")
Query Sessions
Get session-level data (root events only):
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="event_type",
            value="session",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
    limit=50,
)

for session in result["events"]:
    # Session metadata includes aggregated stats
    print(f"Session: {session.session_id}")
    print(f" Events: {session.metadata.get('num_events', 0)}")
    print(f" Cost: ${session.metadata.get('cost', 0):.4f}")
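For custom analytics, the per-session aggregates can be rolled up across all returned sessions. This is a rough sketch under stated assumptions: `summarize_sessions` is a hypothetical helper, and the `SimpleNamespace` objects with made-up values only mimic the attribute access used above so the example runs without the SDK.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def summarize_sessions(sessions: Iterable[Any]) -> Dict[str, float]:
    """Sum the aggregated num_events and cost metadata across sessions."""
    count = 0
    total_events = 0
    total_cost = 0.0
    for session in sessions:
        metadata = session.metadata or {}
        count += 1
        # Missing or null aggregates are treated as zero.
        total_events += metadata.get("num_events", 0) or 0
        total_cost += metadata.get("cost", 0) or 0
    return {
        "sessions": count,
        "events": total_events,
        "cost": total_cost,
        "avg_cost": total_cost / count if count else 0.0,
    }


# Stand-in objects with invented sample values, not real query results.
sample_sessions = [
    SimpleNamespace(session_id="s1", metadata={"num_events": 4, "cost": 0.0125}),
    SimpleNamespace(session_id="s2", metadata={"num_events": 7, "cost": 0.0300}),
    SimpleNamespace(session_id="s3", metadata={}),
]

summary = summarize_sessions(sample_sessions)
print(f"{summary['sessions']} sessions, {summary['events']} events, "
      f"${summary['cost']:.4f} total")
```

In practice the list passed to the helper would be `result["events"]` from the session query.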
Filter by Evaluator Score
Query events that have a specific evaluator score:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        ),
        EventFilter(
            field="metrics.Context Relevance",
            value="3",
            operator=Operator.greater_than,
            type=Type.number,
        ),
    ],
    limit=100,
)
Filter by User Feedback
Query events with specific user feedback:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="feedback.rating",
            value="5",
            operator=Operator.is_,
            type=Type.number,
        ),
    ],
    limit=100,
)
Filter by Metadata
Query events by custom metadata fields:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="metadata.environment",
            value="production",
            operator=Operator.is_,
            type=Type.string,
        ),
    ],
    limit=100,
)
Filter by Date Range
Query events within a specific time period:
result = client.events.get_events(
    project="My Project",
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        ),
    ],
    date_range={
        "$gte": "2024-01-01T00:00:00.000Z",
        "$lte": "2024-01-31T23:59:59.999Z",
    },
    limit=1000,
)
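Hand-written timestamps are easy to get wrong, so it can help to build the `date_range` value from `datetime` objects. The sketch below uses only the standard library; `to_iso_millis` and `make_date_range` are hypothetical helper names, and the millisecond-precision `Z` format simply mirrors the strings in the example above.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def to_iso_millis(dt: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601 with milliseconds."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def make_date_range(start: datetime, end: datetime) -> Dict[str, str]:
    """Build a dict with the $gte and $lte keys used by date_range."""
    if start > end:
        raise ValueError("start must not be after end")
    return {"$gte": to_iso_millis(start), "$lte": to_iso_millis(end)}


# All of January 2024, ending one millisecond before February starts.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 2, 1, tzinfo=timezone.utc) - timedelta(milliseconds=1)

date_range = make_date_range(start, end)
print(date_range)
```

The resulting dictionary can be passed as the `date_range` argument in place of the literal shown above.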
Available Filter Operators
| Operator | Python | Description |
|---|---|---|
| Equals | Operator.is_ | Exact match |
| Not equals | Operator.is_not | Exclude matches |
| Contains | Operator.contains | Substring match |
| Not contains | Operator.not_contains | Exclude substring |
| Greater than | Operator.greater_than | Numeric comparison |
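To make the table concrete, here is a local illustration of how each operator could be read, including the way a string `value` such as "3" is interpreted according to the filter's `type`. This is only one reading of the table and does not reproduce the server's matching logic; `OPERATORS`, `coerce`, and `matches` are hypothetical names, and the coercion rules for boolean and id are assumptions.

```python
from typing import Any, Callable, Dict

# Local stand-ins named after the operators in the table.
# These are plain strings, not the SDK's Operator enum.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "is_": lambda actual, expected: actual == expected,
    "is_not": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: str(expected) in str(actual),
    "not_contains": lambda actual, expected: str(expected) not in str(actual),
    "greater_than": lambda actual, expected: actual > expected,
}


def coerce(value: str, type_name: str) -> Any:
    """Interpret a string filter value according to its declared type."""
    if type_name == "number":
        return float(value)
    if type_name == "boolean":
        return value.strip().lower() == "true"  # assumed spelling of booleans
    return value  # "string" and "id" are kept as text (assumption)


def matches(actual: Any, operator: str, value: str, type_name: str) -> bool:
    """Apply one operator to a field value taken from an event."""
    return OPERATORS[operator](actual, coerce(value, type_name))


print(matches(4.5, "greater_than", "3", "number"))
print(matches("gpt-4o-mini", "contains", "gpt-4", "string"))
print(matches("production", "is_not", "staging", "string"))
```

A helper like this can also be handy for re-filtering events that have already been exported.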
Common Filterable Fields
| Field | Description |
|---|---|
| event_type | session, model, tool, or chain |
| event_name | Name of the event/span |
| session_id | Session/trace ID |
| metrics.<name> | Evaluator scores |
| feedback.<name> | User feedback values |
| metadata.<name> | Custom metadata |
| user_properties.<name> | User properties |
Session events include aggregated metadata like num_events, cost, total_tokens. See Session Aggregations for details.
Export Timeouts and Retries
Export operations (export(), export_async(), get_by_session_id()) use a default read timeout of 300 seconds to handle large result sets. You can override this with the HH_EXPORT_TIMEOUT_SECONDS environment variable:
export HH_EXPORT_TIMEOUT_SECONDS=600 # 10 minutes
import os
from honeyhive import HoneyHive

# Must be set before creating the client
os.environ["HH_EXPORT_TIMEOUT_SECONDS"] = "600"
client = HoneyHive(api_key=os.environ["HH_API_KEY"])
The environment variable must be set before the HoneyHive client is instantiated. The timeout value must be a positive number (in seconds). If an invalid value is provided, the SDK falls back to the default of 300 seconds.
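The fallback rule described above can be pictured with a small sketch. It illustrates the documented behavior and is not the SDK's actual implementation: `resolve_export_timeout` is a hypothetical function, and treating `nan` and `inf` as invalid is an assumption of this sketch.

```python
import math
import os
from typing import Mapping

DEFAULT_EXPORT_TIMEOUT_SECONDS = 300.0


def resolve_export_timeout(env: Mapping[str, str] = os.environ) -> float:
    """Return the export timeout in seconds, falling back to the default."""
    raw = env.get("HH_EXPORT_TIMEOUT_SECONDS")
    if raw is None:
        return DEFAULT_EXPORT_TIMEOUT_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Not a number at all, e.g. "ten minutes".
        return DEFAULT_EXPORT_TIMEOUT_SECONDS
    if not math.isfinite(value) or value <= 0:
        # Zero, negative, nan, and inf are rejected in this sketch.
        return DEFAULT_EXPORT_TIMEOUT_SECONDS
    return value


print(resolve_export_timeout({"HH_EXPORT_TIMEOUT_SECONDS": "600"}))
print(resolve_export_timeout({"HH_EXPORT_TIMEOUT_SECONDS": "-5"}))
print(resolve_export_timeout({}))
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise without touching the real environment.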
The export_async() method automatically retries on transient HTTP errors (502, 503, 504), matching the behavior of export().
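If similar resilience is wanted around other calls, the same idea can be sketched as a generic wrapper. Everything here is illustrative rather than taken from the SDK: `HTTPStatusError` is a hypothetical exception type, and the attempt count and exponential backoff schedule are assumptions, since the SDK's own retry policy is not described beyond the status codes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes named above as transient.
TRANSIENT_STATUS_CODES = (502, 503, 504)


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient HTTP errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except HTTPStatusError as exc:
            is_transient = exc.status_code in TRANSIENT_STATUS_CODES
            if not is_transient or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Demo: a call that fails twice with 503 and then succeeds.
attempts = 0


def flaky_call() -> str:
    global attempts
    attempts += 1
    if attempts < 3:
        raise HTTPStatusError(503)
    return "ok"


result = call_with_retries(flaky_call, sleep=lambda seconds: None)
print(result, attempts)
```

The `sleep` parameter is injectable so the demo and any tests can skip the real delay; non-transient errors such as a 400 are re-raised immediately.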