HoneyHive Docs

HoneyHive’s Data Schema Language (DSL) allows you to query your LLM application logs for deep insights and analytics. In this guide, we’ll walk you through the essential concepts and schemas for querying your data effectively.

Query Parameters

The following parameters can be defined to filter events:

Field	Subfield	Type	Required	Description
project		String	Yes	The unique identifier or name of the project you want to query.
filters		List of filters	No	An array of filter objects to narrow down the results based on specific criteria.
	field	String	Yes*	The name of the field you want to filter by, such as `metadata.cost`, `inputs.chat_history.content`.
	value	SDK specific	Yes*	The value that the specified field should match or satisfy based on the operator.
	operator	SDK specific	Yes*	The comparison operator for the filter. Supported operators include “is”, “is not”, “contains”, “not contains”, and “greater than”.
dateRange		Object	No	An object specifying the date range to filter the results by.
	$gte	ISO 8601 DateTime String	No	The start of the date range filter, represented as an ISO 8601 formatted date-time string (e.g., `2024-04-01T22:38:19.000Z`).
	$lte	ISO 8601 DateTime String	No	The end of the date range filter, represented as an ISO 8601 formatted date-time string (e.g., `2024-04-01T22:38:19.000Z`).
limit		Integer	No	The maximum number of results to return per page. Must be an integer between 1 and 1000 (inclusive). If not provided, the default limit is used.
page		Integer	No	The page number of the results to retrieve. Must be a positive integer. If not provided, the first page is returned.

*Required if using filters
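As a rough illustration of the constraints in the table above, the sketch below assembles the parameters into a plain dictionary and checks them locally before any request is made. `build_query` and `to_iso8601` are hypothetical helpers written for this guide, not part of the honeyhive SDK; only the parameter names and limits are taken from the table.

```python
from datetime import datetime, timezone


def to_iso8601(dt: datetime) -> str:
    """Format a timezone-aware datetime like 2024-04-01T22:38:19.000Z."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = dt.astimezone(timezone.utc)
    # strftime has no millisecond directive, so append milliseconds by hand.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_query(project, filters=None, start=None, end=None, limit=None, page=None):
    """Hypothetical helper: validate parameters and return them as a dict."""
    if not project:
        raise ValueError("project is required")
    query = {"project": project}

    if filters:
        for f in filters:
            # field, value and operator are all required once filters are used.
            missing = {"field", "value", "operator"} - f.keys()
            if missing:
                raise ValueError(f"filter is missing {sorted(missing)}")
        query["filters"] = list(filters)

    date_range = {}
    if start is not None:
        date_range["$gte"] = to_iso8601(start)
    if end is not None:
        date_range["$lte"] = to_iso8601(end)
    if date_range:
        query["dateRange"] = date_range

    if limit is not None:
        if not 1 <= limit <= 1000:
            raise ValueError("limit must be between 1 and 1000")
        query["limit"] = limit
    if page is not None:
        if page < 1:
            raise ValueError("page must be a positive integer")
        query["page"] = page
    return query
```

Validating locally like this is optional; it only surfaces out-of-range values earlier than the API would.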

Initialize the honeyhive SDK

import honeyhive
from honeyhive.models import components, operations

sdk = honeyhive.HoneyHive(
    bearer_auth='<HONEYHIVE_API_KEY>',
    server_url='<HONEYHIVE_SERVER_URL>',  # Optional; required for self-hosted or dedicated deployments
)

LLM data export

The most common export flows for LLM data logged to our platform are listed below. They are useful both for fine-tuning your models and for understanding how your users interact with your application.

Query LLM events based on evaluator score

This query assumes you have logged events with the Context Relevance evaluator running on them. To retrieve all ‘model’ events with a Context Relevance score above a specific threshold (e.g., 3), use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_type",
            value="model",
            operator=components.Operator.IS,
        ),
        components.EventFilter(
            field="metrics.Context Relevance",  # Evaluator scores can be found under metrics
            value=3,
            operator=components.Operator.GREATER_THAN,
        )
    ],
    limit=10,  # Max limit set to 1000
    page=1,
)
res = sdk.events.get_events(request=req)
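Because `limit` is capped at 1000, larger exports need several pages. The loop below is a minimal sketch of that pattern. `fetch_page` is a hypothetical callable you would write yourself to call `sdk.events.get_events` with the given `page` and `limit` and return the events as a list; the shape of the SDK response is not covered here. The sketch also assumes that a page shorter than `limit` marks the end of the results.

```python
from typing import Callable, Iterator, List


def iter_all_events(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 1000,
) -> Iterator[dict]:
    """Yield events page by page until a short or empty page is returned."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    page = 1
    while True:
        events = fetch_page(page, limit)
        yield from events
        # Assumption: fewer results than requested means this was the last page.
        if len(events) < limit:
            break
        page += 1
```

Passing the fetch logic in as a callable keeps the loop independent of the request and response types, so the same loop can be reused for any of the queries in this guide.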

Query LLM events based on end-user feedback

This query assumes you have logged events and set a feedback rating on them. To retrieve all ‘model’ events with a rating above a specific threshold (e.g., 3), use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_type",
            value="model",
            operator=components.Operator.IS,
        ),
        components.EventFilter(
            field="feedback.rating",  # All feedback is stored under the feedback field.
            value=3,
            operator=components.Operator.GREATER_THAN,
        )
    ],
    limit=10,  # Max limit set to 1000
    page=1,
)
res = sdk.events.get_events(request=req)

Query LLM events based on metadata

Suppose you are running an experiment and want to retrieve its events, with the experiment’s unique identifier passed under metadata.experiment-id. To retrieve all ‘model’ events that have a specific value for experiment-id, use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_type",
            value="model",
            operator=components.Operator.IS,
        ),
        components.EventFilter(
            field="metadata.experiment-id",  # Tags are passed in the metadata field.
            value="<EXPERIMENT_ID>",
            operator=components.Operator.IS,
        )
    ],
    limit=10,  # Max limit set to 1000
    page=1,
)
res = sdk.events.get_events(request=req)

Event based export

In multi-step or agentic applications, you will often want to export a particular event’s input-output data to set up evaluators or fine-tuning datasets for that step in your pipeline. To retrieve a specific event by its name, use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_name",
            value="<EVENT_NAME>",
            operator=components.Operator.IS,
        ),
    ],
    limit=10,  # Max limit set to 1000
    page=1,
)
res = sdk.events.get_events(request=req)

To further filter by evaluator scores, feedback, or metadata, you can use the same filters as in the LLM data export section.
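Once a step’s events are exported, a common next move is to write their input-output pairs to a JSON Lines file for evaluation or fine-tuning. The sketch below shows one way to do that, assuming each event has already been converted to a plain dictionary. The `inputs` and `event_name` keys follow field names used in this guide, while `outputs` is an assumed key; adjust the names to match your own events. `events_to_jsonl` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Iterable


def events_to_jsonl(events: Iterable[dict]) -> str:
    """Serialize each event's inputs and outputs as one JSON object per line."""
    lines = []
    for event in events:
        record = {
            "event_name": event.get("event_name"),
            "inputs": event.get("inputs", {}),
            "outputs": event.get("outputs", {}),  # assumed key name
        }
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"event_name": "retrieve", "inputs": {"query": "hi"}, "outputs": {"docs": []}},
    ]
    print(events_to_jsonl(sample))
```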

Single session data export

For agents or complex RAG pipelines, you might want to query the full session data to understand the context of the conversation or the state of the agent. To retrieve all child events belonging to a session, use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_type",  # To get all non-session events which share the same session_id
            value="session",
            operator=components.Operator.IS_NOT,
        ),
        components.EventFilter(
            field="session_id",  
            value="<SESSION_ID>",
            operator=components.Operator.IS,
        )
    ],
)
res = sdk.events.get_events(request=req)

Multiple session data export

For careful multi-step analysis, or to understand the context of a conversation across multiple sessions, you might want to query multiple sessions.

We automatically aggregate certain properties for all sessions as described here.

Query session based on session length

This query assumes you have logged sessions. To retrieve sessions that have more than a specific number of child events (e.g., 3), use the following query:

req = operations.GetEventsRequestBody(
    project='<PROJECT_NAME>',
    filters=[
        components.EventFilter(
            field="event_type",
            value="session",
            operator=components.Operator.IS,
        ),
        components.EventFilter(
            field="metadata.num_events",  # Session aggregates stored under metadata.
            value=3,
            operator=components.Operator.GREATER_THAN,
        )
    ],
    limit=10,  # Max limit set to 1000
    page=1,
)
res = sdk.events.get_events(request=req)

You can add more filters for user feedback or evaluator scores as needed, as explained in the LLM data export section. The above query returns only the root session event for each relevant session. To retrieve all the child events, use the same query as in the single session data export section.

res = sdk.session.get_session(session_id='<SESSION_ID>')
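Putting the two steps together, the sketch below walks a list of root session events and fetches the child events for each one. It assumes the session events have been converted to dictionaries carrying a `session_id` key, and that `fetch_children` is a hypothetical callable you supply, for example a wrapper around the single session query shown earlier. Neither name comes from the SDK.

```python
from typing import Callable, Dict, Iterable, List


def collect_session_children(
    session_events: Iterable[dict],
    fetch_children: Callable[[str], List[dict]],
) -> Dict[str, List[dict]]:
    """Map each session_id to the child events returned by fetch_children."""
    children_by_session: Dict[str, List[dict]] = {}
    for session in session_events:
        session_id = session["session_id"]
        # Skip duplicates so each session is fetched only once.
        if session_id not in children_by_session:
            children_by_session[session_id] = fetch_children(session_id)
    return children_by_session
```

This issues one request per session, so for a large number of sessions you may want to limit how many you export at once.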
