HoneyHive’s Data Schema Language (DSL) allows you to query your LLM application logs for deep insights and analytics. In this guide, we’ll walk you through the essential concepts and schemas for querying your data effectively.

Query Parameters

The following parameters can be set to filter events:

| Field | Subfield | Type | Required | Description |
| --- | --- | --- | --- | --- |
| project | | String | Yes | The unique identifier or name of the project you want to query. |
| filters | | List of filters | No | An array of filter objects to narrow down the results based on specific criteria. |
| | field | String | Yes* | The name of the field you want to filter by, such as metadata.cost or inputs.chat_history.content. |
| | value | SDK specific | Yes* | The value that the specified field should match or satisfy based on the operator. |
| | operator | SDK specific | Yes* | The comparison operator for the filter. Supported operators include “is”, “is not”, “contains”, “not contains”, and “greater than”. |
| dateRange | | Object | No | An object specifying the date range to filter the results by. |
| | $gte | ISO 8601 DateTime String | No | The start of the date range filter, represented as an ISO 8601 formatted date-time string (e.g., 2024-04-01T22:38:19.000Z). |
| | $lte | ISO 8601 DateTime String | No | The end of the date range filter, represented as an ISO 8601 formatted date-time string (e.g., 2024-04-01T22:38:19.000Z). |
| limit | | Integer | No | The maximum number of results to return per page. Must be an integer between 1 and 1000 (inclusive). If not provided, the default limit is used. |
| page | | Integer | No | The page number of the results to retrieve. Must be a positive integer. If not provided, the first page is returned. |

*Required if using filters
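
To make the constraints in the table concrete, here is a minimal sketch that assembles a query payload as a plain Python dictionary and checks the documented rules before anything is sent. `build_query` and `iso8601` are hypothetical helpers, not part of the HoneyHive SDK; the project name and the numeric filter value are placeholders, since value types are SDK specific.

```python
import json
from datetime import datetime, timezone

# Operators listed in the parameter table.
SUPPORTED_OPERATORS = {"is", "is not", "contains", "not contains", "greater than"}


def iso8601(dt):
    """Format an aware datetime as e.g. 2024-04-01T22:38:19.000Z."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"


def build_query(project, filters=None, date_range=None, limit=None, page=None):
    """Assemble a query payload (hypothetical helper, not an SDK function)."""
    query = {"project": project}
    if filters:
        for f in filters:
            # field, value and operator are all required once filters are used.
            missing = {"field", "value", "operator"} - set(f)
            if missing:
                raise ValueError(f"filter is missing keys: {sorted(missing)}")
            if f["operator"] not in SUPPORTED_OPERATORS:
                raise ValueError(f"unsupported operator: {f['operator']!r}")
        query["filters"] = list(filters)
    if date_range is not None:
        query["dateRange"] = date_range
    if limit is not None:
        if not 1 <= limit <= 1000:
            raise ValueError("limit must be between 1 and 1000 (inclusive)")
        query["limit"] = limit
    if page is not None:
        if page < 1:
            raise ValueError("page must be a positive integer")
        query["page"] = page
    return query


query = build_query(
    project="my-project",  # placeholder project name
    filters=[{"field": "metadata.cost", "value": 0.01, "operator": "greater than"}],
    date_range={
        "$gte": iso8601(datetime(2024, 4, 1, tzinfo=timezone.utc)),
        "$lte": iso8601(datetime(2024, 4, 2, tzinfo=timezone.utc)),
    },
    limit=100,
    page=1,
)
print(json.dumps(query, indent=2))
```

Validating locally is a design choice for this sketch only; the platform may enforce the same rules on its side.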

Initialize the honeyhive SDK

LLM data export

The most common export flows for LLM data logged to our platform are listed below.

These are useful for both fine-tuning your models and understanding how your users are interacting with your application.

Query LLM events based on evaluator score

Assume that you have logged events and that the Context Relevance evaluator has run on them. To retrieve all the ‘model’ events that have a Context Relevance score above a specific threshold (e.g., 3), use the following query:
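
A minimal sketch of the filters for this case, written as a plain Python dictionary rather than an SDK call. The project name, the `event_type` field, and the `metrics.Context Relevance` field path are assumptions for illustration; check them against the field names in your own logged events.

```python
import json

THRESHOLD = 3  # example threshold from the text

query = {
    "project": "my-project",  # placeholder project name
    "filters": [
        # Assumed field name for the event type.
        {"field": "event_type", "value": "model", "operator": "is"},
        # Assumed field path for the Context Relevance evaluator score.
        {
            "field": "metrics.Context Relevance",
            "value": THRESHOLD,
            "operator": "greater than",
        },
    ],
}

print(json.dumps(query, indent=2))
```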

Query LLM events based on end-user feedback

Assume that you have logged events and set a feedback rating on them. To retrieve all the ‘model’ events that have a rating above a specific threshold (e.g., 3), use the following query:
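
As a sketch, the same pattern can be wrapped in a small function so the threshold is a parameter. `feedback_query` is a hypothetical helper, and the `event_type` and `feedback.rating` field names are assumptions about how the event type and rating are stored.

```python
import json


def feedback_query(project, min_rating):
    """Build filters for 'model' events rated above min_rating (hypothetical helper)."""
    return {
        "project": project,
        "filters": [
            # Assumed field name for the event type.
            {"field": "event_type", "value": "model", "operator": "is"},
            # Assumed field path for the end-user rating.
            {"field": "feedback.rating", "value": min_rating, "operator": "greater than"},
        ],
    }


query = feedback_query("my-project", 3)  # placeholder project name
print(json.dumps(query, indent=2))
```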

Query LLM events based on metadata

Assume that you are running an experiment and want to retrieve its events, and that you pass the experiment’s unique identifier under metadata.experiment-id. To retrieve all the ‘model’ events that have a specific value for experiment-id, use the following query:
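
A minimal sketch of an exact-match metadata filter. The `metadata.experiment-id` path comes from the text above; the project name, the experiment identifier, and the `event_type` field name are placeholders or assumptions.

```python
import json

EXPERIMENT_ID = "exp-123"  # placeholder experiment identifier

query = {
    "project": "my-project",  # placeholder project name
    "filters": [
        # Assumed field name for the event type.
        {"field": "event_type", "value": "model", "operator": "is"},
        # Exact match on the identifier stored under metadata.
        {"field": "metadata.experiment-id", "value": EXPERIMENT_ID, "operator": "is"},
    ],
}

print(json.dumps(query, indent=2))
```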

Event based export

For multi-step or agentic applications, you will often want to export a particular event’s input-output data to set up evaluators or fine-tuning datasets for that step in your pipeline.

To retrieve a specific event using its name, you can use the following query:
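
A minimal sketch of a name-based filter. The `event_name` field and the example step name `retrieve_documents` are assumptions for illustration, as is the project name.

```python
import json

EVENT_NAME = "retrieve_documents"  # hypothetical step name in a pipeline

query = {
    "project": "my-project",  # placeholder project name
    "filters": [
        # Assumed field name for the event's name.
        {"field": "event_name", "value": EVENT_NAME, "operator": "is"},
    ],
    "limit": 100,  # within the documented 1-1000 range
}

print(json.dumps(query, indent=2))
```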

To filter further by evaluator scores, feedback, or metadata, you can use the same filters as described in the LLM data export section.

Single session data export

In the case of agents or complex RAG pipelines, you might want to query the full session data to understand the context of the conversation or the state of the agent.

To retrieve all children events belonging to a session, use the following query:
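
A minimal sketch of a session-scoped filter. It assumes that every child event carries its parent session's identifier in a `session_id` field; the identifier and project name are placeholders.

```python
import json

SESSION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder session identifier

query = {
    "project": "my-project",  # placeholder project name
    "filters": [
        # Assumed field linking each child event to its session.
        {"field": "session_id", "value": SESSION_ID, "operator": "is"},
    ],
}

print(json.dumps(query, indent=2))
```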

Multiple session data export

To carry out careful multi-step analysis, or to understand the context of a conversation across multiple sessions, you might want to query multiple sessions.

We automatically aggregate certain properties for all sessions, as described here.

Query session based on session length

Assume that you have logged sessions. To retrieve sessions that have more than a specific number of children events (e.g., 3), use the following query:
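
A minimal sketch of a session-length filter. It assumes that root session events have an `event_type` of `session` and that the aggregated child count is exposed as `metadata.num_events`; both names are assumptions to verify against your own session data.

```python
import json

MIN_CHILDREN = 3  # example threshold from the text

query = {
    "project": "my-project",  # placeholder project name
    "filters": [
        # Assumed event type for root session events.
        {"field": "event_type", "value": "session", "operator": "is"},
        # Assumed name of the aggregated child-event count.
        {"field": "metadata.num_events", "value": MIN_CHILDREN, "operator": "greater than"},
    ],
}

print(json.dumps(query, indent=2))
```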

You can add further filters for user feedback or evaluator scores as needed, as explained in the LLM data export section.

The above query returns only the root session event for each relevant session. To retrieve all the children events, you can use the same query as described in the single session data export section.
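
The two-step flow can be sketched as follows: take the root session events from the first query, then build one child-event query per session. `child_event_queries` is a hypothetical helper, the `roots` list is an illustrative stand-in rather than real output, and the sketch assumes each root event exposes a `session_id` key.

```python
def child_event_queries(project, root_sessions):
    """Build one child-event query per root session event (hypothetical helper)."""
    queries = []
    for root in root_sessions:
        queries.append(
            {
                "project": project,
                "filters": [
                    # Assumed field linking child events to their session.
                    {"field": "session_id", "value": root["session_id"], "operator": "is"},
                ],
            }
        )
    return queries


# Illustrative stand-in for the root session events returned by the first query.
roots = [{"session_id": "session-a"}, {"session_id": "session-b"}]

queries = child_event_queries("my-project", roots)  # placeholder project name
for q in queries:
    print(q["filters"][0]["value"])
```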