
# Export Data

> Programmatically query and export trace data from HoneyHive.

Query your trace data programmatically using the HoneyHive SDK. This is useful for building custom analytics, exporting data for fine-tuning, or integrating with external systems.

## Query Parameters

| Parameter     | Type               | Required | Description                                                                                                        |
| ------------- | ------------------ | -------- | ------------------------------------------------------------------------------------------------------------------ |
| `filters`     | List\[EventFilter] | No       | Filters to apply                                                                                                   |
| `project`     | String             | No       | Project name. Deprecated: the backend infers the project from filters; still accepted for backwards compatibility  |
| `limit`       | Integer            | No       | Max results per page (default: 1000, max: 7500)                                                                    |
| `page`        | Integer            | No       | Page number (default: 1)                                                                                           |
| `date_range`  | Object             | No       | Date range with `$gte` and `$lte` ISO 8601 strings                                                                 |
| `projections` | List\[String]      | No       | Fields to include in the response                                                                                  |

### EventFilter Fields

| Field      | Type     | Description                                                            |
| ---------- | -------- | ---------------------------------------------------------------------- |
| `field`    | String   | Field to filter on (e.g., `event_type`, `session_id`, `metadata.cost`) |
| `value`    | String   | Value to match                                                         |
| `operator` | Operator | One of: `is_`, `is_not`, `contains`, `not_contains`, `greater_than`    |
| `type`     | Type     | Data type: `string`, `number`, `boolean`, `datetime`                   |

## Setup

```python theme={null}
import os
from honeyhive import HoneyHive
from honeyhive.models import EventFilter
from honeyhive.models.generated import Operator, Type

client = HoneyHive(api_key=os.environ["HH_API_KEY"])
```

<Note>
  The response object (`result`) uses attribute access (e.g., `result.total_events`, `result.events`), while individual events are returned as dictionaries (e.g., `event['event_name']`).
</Note>

## Query Model Events

Retrieve all LLM model events from your project:

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
    limit=100,
)

print(f"Total events: {result.total_events}")
for event in result.events:
    print(f"  {event['event_name']}: {event['duration']}ms")
```
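
When a query matches more events than `limit` allows, the `page` parameter from the Query Parameters table can be used to walk through the results. The following is a minimal sketch of that loop, not an SDK feature: `iter_events` and `fetch_page` are hypothetical names, and the stop condition (a page shorter than `limit`) is an assumption about how the last page looks. A stand-in fetcher replaces the real API call so the example runs offline.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_events(
    fetch_page: Callable[[int, int], List[Dict[str, Any]]],
    limit: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield events page by page until a short or empty page is returned.

    `fetch_page(page, limit)` is a hypothetical callable supplied by the caller.
    """
    page = 1  # pages start at 1, matching the documented default
    while True:
        events = fetch_page(page, limit)
        yield from events
        if len(events) < limit:  # assumed signal that this was the last page
            break
        page += 1


# Stand-in for a real query so the sketch runs without network access.
_FAKE_EVENTS = [{"event_name": f"event-{i}"} for i in range(250)]


def fake_fetch(page: int, limit: int) -> List[Dict[str, Any]]:
    start = (page - 1) * limit
    return _FAKE_EVENTS[start:start + limit]


all_events = list(iter_events(fake_fetch, limit=100))
print(len(all_events))
```

With a real client, `fetch_page` could plausibly wrap `client.events.get_events(filters=..., limit=limit, page=page).events`, assuming `page` is passed as a keyword argument like `limit`.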

## Query Events in a Session

Get all events belonging to a specific trace/session:

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="session_id",
            value="<SESSION_ID>",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
)

for event in result.events:
    print(f"{event['event_type']}: {event['event_name']}")
```
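
Because individual events are returned as dictionaries, a query result can be written to a JSON Lines file for fine-tuning or for loading into other tools. This is a sketch under stated assumptions: `write_jsonl` is a hypothetical helper, the sample events are invented stand-ins for `result.events`, and `default=str` is a defensive choice in case an event contains values that `json` cannot serialize directly.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterable


def write_jsonl(events: Iterable[Dict[str, Any]], path: Path) -> int:
    """Write one JSON object per line and return the number of events written."""
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for event in events:
            # default=str converts non-JSON values (e.g. datetimes) to strings
            handle.write(json.dumps(event, ensure_ascii=False, default=str) + "\n")
            count += 1
    return count


# Invented sample data standing in for `result.events`.
sample_events = [
    {"event_type": "model", "event_name": "generate_answer", "duration": 812},
    {"event_type": "tool", "event_name": "vector_search", "duration": 95},
]

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "events.jsonl"
    written = write_jsonl(sample_events, out_path)
    lines = out_path.read_text(encoding="utf-8").splitlines()

print(f"Wrote {written} events")
```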

## Query Sessions

Get session-level data (root events only):

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="event_type",
            value="session",
            operator=Operator.is_,
            type=Type.string,
        )
    ],
    limit=50,
)

for session in result.events:
    # Session metadata includes aggregated stats
    print(f"Session: {session['session_id']}")
    print(f"  Events: {session.get('metadata', {}).get('num_events', 0)}")
    print(f"  Cost: ${session.get('metadata', {}).get('cost', 0):.4f}")
```
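
For simple custom analytics, the aggregated session metadata can be totaled on the client side. The sketch below assumes each session dictionary has the shape used in the loop above (`metadata.num_events`, `metadata.cost`); `summarize_sessions` is a hypothetical helper and the sample sessions are invented.

```python
from typing import Any, Dict, Iterable


def summarize_sessions(sessions: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Total the event counts and costs across session dictionaries."""
    total_cost = 0.0
    total_events = 0
    count = 0
    for session in sessions:
        # Treat missing or null metadata as empty rather than failing.
        metadata = session.get("metadata") or {}
        total_cost += float(metadata.get("cost") or 0)
        total_events += int(metadata.get("num_events") or 0)
        count += 1
    return {"sessions": count, "events": total_events, "cost": total_cost}


# Invented sample data standing in for `result.events`.
sample_sessions = [
    {"session_id": "a", "metadata": {"num_events": 5, "cost": 0.25}},
    {"session_id": "b", "metadata": {"num_events": 7, "cost": 0.5}},
    {"session_id": "c"},  # no metadata recorded
]

summary = summarize_sessions(sample_sessions)
print(summary)
```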

## Filter by Evaluator Score

Query events whose evaluator score exceeds a threshold (here, model events with a `Context Relevance` score greater than 3):

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        ),
        EventFilter(
            field="metrics.Context Relevance",
            value="3",
            operator=Operator.greater_than,
            type=Type.number,
        ),
    ],
    limit=100,
)
```

## Filter by User Feedback

Query events whose user feedback matches an exact value (here, a `rating` of 5):

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="feedback.rating",
            value="5",
            operator=Operator.is_,
            type=Type.number,
        ),
    ],
    limit=100,
)
```

## Filter by Metadata

Query events by custom metadata fields:

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="metadata.environment",
            value="production",
            operator=Operator.is_,
            type=Type.string,
        ),
    ],
    limit=100,
)
```

## Filter by Date Range

Query events within a specific time period:

```python theme={null}
result = client.events.get_events(
    filters=[
        EventFilter(
            field="event_type",
            value="model",
            operator=Operator.is_,
            type=Type.string,
        ),
    ],
    date_range={
        "$gte": "2024-01-01T00:00:00.000Z",
        "$lte": "2024-01-31T23:59:59.999Z",
    },
    limit=1000,
)
```
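
Rather than hard-coding timestamps, the `date_range` object can be built from `datetime` values. This is a minimal sketch using only the standard library; `to_iso_z` and `last_n_days` are hypothetical helper names, and the millisecond precision with a `Z` suffix simply mirrors the format shown in the example above.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def to_iso_z(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def last_n_days(days: int, now: datetime) -> Dict[str, str]:
    """Build a date_range dict covering the `days` days that end at `now`."""
    return {"$gte": to_iso_z(now - timedelta(days=days)), "$lte": to_iso_z(now)}


# A fixed reference time keeps the example reproducible.
now = datetime(2024, 1, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
date_range = last_n_days(7, now)
print(date_range)
# {'$gte': '2024-01-24T23:59:59.999Z', '$lte': '2024-01-31T23:59:59.999Z'}
```

The resulting dictionary can be passed as the `date_range` argument in place of the literal shown earlier. In live code, `datetime.now(timezone.utc)` would replace the fixed reference time.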

## Available Filter Operators

| Operator     | Python                  | Description        |
| ------------ | ----------------------- | ------------------ |
| Equals       | `Operator.is_`          | Exact match        |
| Not equals   | `Operator.is_not`       | Exclude matches    |
| Contains     | `Operator.contains`     | Substring match    |
| Not contains | `Operator.not_contains` | Exclude substring  |
| Greater than | `Operator.greater_than` | Numeric comparison |

## Common Filterable Fields

| Field                    | Description                            |
| ------------------------ | -------------------------------------- |
| `event_type`             | `session`, `model`, `tool`, or `chain` |
| `event_name`             | Name of the event/span                 |
| `session_id`             | Session/trace ID                       |
| `metrics.<name>`         | Evaluator scores                       |
| `feedback.<name>`        | User feedback values                   |
| `metadata.<name>`        | Custom metadata                        |
| `user_properties.<name>` | User properties                        |

<Note>
  Session events include aggregated metadata such as `num_events`, `cost`, and `total_tokens`. See [Session Aggregations](/v2/tracing/aggregation-logic) for details.
</Note>

## Export Timeouts and Retries

Export operations (`export()`, `export_async()`, `get_by_session_id()`) use a default read timeout of **300 seconds** to handle large result sets. You can override this with the `HH_EXPORT_TIMEOUT_SECONDS` environment variable:

```bash theme={null}
export HH_EXPORT_TIMEOUT_SECONDS=600  # 10 minutes
```

```python theme={null}
import os

from honeyhive import HoneyHive

# Must be set before creating the client
os.environ["HH_EXPORT_TIMEOUT_SECONDS"] = "600"

client = HoneyHive(api_key=os.environ["HH_API_KEY"])

<Note>
  The environment variable must be set **before** the `HoneyHive` client is instantiated. The timeout value must be a positive number (in seconds). If an invalid value is provided, the SDK falls back to the default of 300 seconds.
</Note>
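
The fallback rule in the note above can be illustrated with a small function. This is not the SDK's actual code: `resolve_export_timeout` is a hypothetical name, and treating non-finite values such as `nan` or `inf` as invalid is an assumption beyond what the note states.

```python
import math
import os
from typing import Mapping

DEFAULT_EXPORT_TIMEOUT_SECONDS = 300.0


def resolve_export_timeout(environ: Mapping[str, str] = os.environ) -> float:
    """Return the configured timeout, or the default when unset or invalid."""
    raw = environ.get("HH_EXPORT_TIMEOUT_SECONDS")
    if raw is None:
        return DEFAULT_EXPORT_TIMEOUT_SECONDS
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_EXPORT_TIMEOUT_SECONDS  # not a number
    if not math.isfinite(value) or value <= 0:
        return DEFAULT_EXPORT_TIMEOUT_SECONDS  # must be a positive number
    return value


print(resolve_export_timeout({"HH_EXPORT_TIMEOUT_SECONDS": "600"}))  # 600.0
print(resolve_export_timeout({"HH_EXPORT_TIMEOUT_SECONDS": "abc"}))  # 300.0
```

A check like this can also be run in your own startup code to catch a mistyped value early, since the silent fallback would otherwise hide it.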

The `export_async()` method automatically retries on transient HTTP errors (502, 503, 504), matching the behavior of `export()`.
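
To make the retry behavior concrete, here is a generic sketch of retrying on 502, 503, and 504 responses. It is an illustration of the general pattern, not the SDK's implementation: `HTTPStatusError`, `call_with_retries`, the attempt count, and the exponential backoff delays are all assumptions chosen for the example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

TRANSIENT_STATUS_CODES = (502, 503, 504)


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying transient HTTP errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except HTTPStatusError as error:
            retryable = error.status_code in TRANSIENT_STATUS_CODES
            if not retryable or attempt == max_attempts:
                raise  # permanent error, or retries exhausted
            sleep(base_delay * 2 ** (attempt - 1))
            attempt += 1


# Demo: an operation that fails twice with 503, then succeeds.
calls: List[int] = []


def flaky_export() -> str:
    calls.append(1)
    if len(calls) < 3:
        raise HTTPStatusError(503)
    return "ok"


delays: List[float] = []
result = call_with_retries(flaky_export, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Errors outside the transient set (for example, a 400) are re-raised immediately, since repeating the same request is unlikely to help.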
