
# Curate from Traces

> Build datasets from production traces in HoneyHive

Build evaluation datasets directly from your production traces. Curating from traces turns real user interactions, including the edge cases and failures your application has already encountered, into targeted test cases.

## Why Curate from Traces?

| Use Case                 | Example                                              |
| ------------------------ | ---------------------------------------------------- |
| **Regression testing**   | Capture successful interactions as golden test cases |
| **Edge case coverage**   | Find and preserve unusual inputs that caused issues  |
| **Domain-specific data** | Build datasets from real customer queries            |
| **Fine-tuning**          | Curate high-quality examples for model training      |

***

## Curate Sessions

Add complete user interactions (sessions) to your dataset.

<Steps>
  <Step title="Filter sessions">
    Go to **Traces** → **Sessions** and apply filters to find relevant sessions. Common filters:

    * Date range for recent production data
    * Evaluator scores (e.g., low relevance scores)
    * User feedback (thumbs down)
    * Metadata fields (environment, user segment)
  </Step>

  <Step title="Select sessions">
    Check the sessions you want to add to your dataset. You can select multiple sessions at once.

    <Frame>
      <img src="https://mintcdn.com/honeyhiveai/8CSzfyX-NUZzkr98/images/add-to-dataset.png?fit=max&auto=format&n=8CSzfyX-NUZzkr98&q=85&s=dc12dad196b1a70c2688769f0bd3138e" alt="Traces page showing selected sessions with checkboxes" width="1026" height="251" data-path="images/add-to-dataset.png" />
    </Frame>
  </Step>

  <Step title="Add to dataset">
    Click **Add to Dataset** and choose an existing dataset or create a new one. Each selected session's inputs and outputs are automatically mapped to datapoint fields.
  </Step>
</Steps>
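
The filters in the first step can also be reasoned about as plain predicates. The following is a minimal sketch in Python, assuming you have session records available as dictionaries; the record shape and field names (`start_time`, `metrics`, `feedback`, `metadata`) are hypothetical and are not HoneyHive's schema or SDK.

```python
from datetime import datetime, timezone

# Hypothetical session records. The field names are illustrative
# assumptions, not HoneyHive's actual schema.
SESSIONS = [
    {
        "session_id": "s1",
        "start_time": datetime(2024, 5, 2, tzinfo=timezone.utc),
        "metrics": {"relevance": 0.35},
        "feedback": {"rating": "up"},
        "metadata": {"environment": "production"},
    },
    {
        "session_id": "s2",
        "start_time": datetime(2024, 5, 3, tzinfo=timezone.utc),
        "metrics": {"relevance": 0.92},
        "feedback": {"rating": "down"},
        "metadata": {"environment": "production"},
    },
    {
        "session_id": "s3",
        "start_time": datetime(2024, 5, 3, tzinfo=timezone.utc),
        "metrics": {"relevance": 0.20},
        "feedback": {},
        "metadata": {"environment": "staging"},
    },
    {
        "session_id": "s4",
        "start_time": datetime(2024, 4, 1, tzinfo=timezone.utc),
        "metrics": {"relevance": 0.10},
        "feedback": {"rating": "down"},
        "metadata": {"environment": "production"},
    },
]


def select_sessions(sessions, since, max_relevance, environment):
    """Return IDs of recent sessions in one environment that either
    scored at or below max_relevance or received a thumbs down."""
    selected = []
    for session in sessions:
        if session["start_time"] < since:
            continue  # outside the date range
        if session["metadata"].get("environment") != environment:
            continue  # wrong environment
        low_score = session["metrics"].get("relevance", 1.0) <= max_relevance
        thumbs_down = session["feedback"].get("rating") == "down"
        if low_score or thumbs_down:
            selected.append(session["session_id"])
    return selected


candidates = select_sessions(
    SESSIONS,
    since=datetime(2024, 5, 1, tzinfo=timezone.utc),
    max_relevance=0.5,
    environment="production",
)
print(candidates)  # ['s1', 's2']
```

Combining a date range and an environment filter with "low score or thumbs down" keeps the selection focused on recent production failures, which are usually the sessions worth reviewing before adding them to a dataset.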

***

## Curate Model Events

Add specific LLM calls (model events) rather than full sessions. This is useful when your pipeline makes multiple LLM calls and you want to evaluate one of them in isolation.

<Steps>
  <Step title="Go to Completions tab">
    Navigate to **Traces** → **Completions** to see all model events across sessions.
  </Step>

  <Step title="Filter model events">
    Filter by model name, token usage, latency, or evaluator scores to find relevant completions.
  </Step>

  <Step title="Select and add">
    Select the model events you want and click **Add to Dataset**. The model's input prompt and output response are mapped to datapoint fields.
  </Step>
</Steps>
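
To make the prompt-to-datapoint mapping concrete, here is a rough Python sketch of the same idea on plain dictionaries. The event shape, the datapoint keys `inputs` and `outputs`, and the assumption that `linked_event` stores the source event's ID are all illustrative guesses, not a documented HoneyHive format.

```python
# Hypothetical model events; field names are assumptions for illustration.
EVENTS = [
    {
        "event_id": "e1",
        "model": "model-a",
        "latency_ms": 2400,
        "inputs": {"prompt": "Summarize the refund policy."},
        "outputs": {"response": "Refunds are available within 30 days."},
    },
    {
        "event_id": "e2",
        "model": "model-a",
        "latency_ms": 300,
        "inputs": {"prompt": "Say hello."},
        "outputs": {"response": "Hello!"},
    },
    {
        "event_id": "e3",
        "model": "model-b",
        "latency_ms": 3100,
        "inputs": {"prompt": "Translate 'cat' to French."},
        "outputs": {"response": "chat"},
    },
]


def filter_events(events, model, min_latency_ms):
    """Keep events from one model that took at least min_latency_ms."""
    return [
        e for e in events
        if e["model"] == model and e["latency_ms"] >= min_latency_ms
    ]


def to_datapoint(event):
    """Map an event's prompt and response to datapoint fields and keep
    a reference back to the source event (assumed to be its ID)."""
    return {
        "inputs": {"prompt": event["inputs"]["prompt"]},
        "outputs": {"response": event["outputs"]["response"]},
        "linked_event": event["event_id"],
    }


datapoints = [to_datapoint(e) for e in filter_events(EVENTS, "model-a", 2000)]
print(datapoints)
```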

***

## Curate Specific Spans

Add any span in your trace (tool calls, chain steps, etc.) to a dataset.

<Steps>
  <Step title="Open session detail">
    Click on a session to open the detail view showing the full span tree.

    <Frame>
      <img src="https://mintcdn.com/honeyhiveai/3gngxp5bnFS6hQ1x/images/trace-detail-view.png?fit=max&auto=format&n=3gngxp5bnFS6hQ1x&q=85&s=dd0c023acdb58699df21a3a356bedc5a" alt="Trace detail view showing span tree with input, output, and annotations panels" width="1600" height="1122" data-path="images/trace-detail-view.png" />
    </Frame>
  </Step>

  <Step title="Select span">
    Click on the specific span you want to curate (e.g., a retrieval step, tool call, or chain).
  </Step>

  <Step title="Add to dataset">
    Click **+ Add To** → **Add to Dataset** from the top action bar, or right-click the span for the context menu option.
  </Step>
</Steps>
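
If it helps to picture the span tree from the detail view, the sketch below walks a nested structure and collects spans of one type. It is only an illustration in Python: the tree layout, the `type` values, and the `children` key are hypothetical, not HoneyHive's trace format.

```python
# Hypothetical span tree for one session; "type" values and field names
# are assumptions for illustration.
SESSION = {
    "name": "chat-session",
    "type": "session",
    "children": [
        {"name": "retrieve_docs", "type": "tool", "children": []},
        {
            "name": "answer_chain",
            "type": "chain",
            "children": [
                {"name": "rerank", "type": "tool", "children": []},
                {"name": "generate", "type": "model", "children": []},
            ],
        },
    ],
}


def find_spans(span, span_type):
    """Depth-first walk that yields every span of the given type."""
    if span["type"] == span_type:
        yield span
    for child in span.get("children", []):
        yield from find_spans(child, span_type)


tool_spans = [s["name"] for s in find_spans(SESSION, "tool")]
print(tool_spans)  # ['retrieve_docs', 'rerank']
```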

***

<Tip>
  Each curated datapoint includes a `linked_event` field, a reference back to the original trace. Use it to investigate the original context when a test case fails.
</Tip>
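
As a rough illustration of that workflow, the Python sketch below collects the `linked_event` references for failed test cases so you know which traces to open. The result shape and the `passed` flag are hypothetical, and `linked_event` is assumed to hold the ID of the source event.

```python
# Hypothetical experiment results; the result shape is an assumption.
# `linked_event` is assumed to hold the ID of the source event.
RESULTS = [
    {
        "datapoint": {
            "inputs": {"prompt": "Reset my password"},
            "linked_event": "evt_123",
        },
        "passed": False,
    },
    {
        "datapoint": {
            "inputs": {"prompt": "Cancel my order"},
            "linked_event": "evt_456",
        },
        "passed": True,
    },
    {
        # A hand-written datapoint with no source trace.
        "datapoint": {"inputs": {"prompt": "Hand-written case"}},
        "passed": False,
    },
]


def events_to_investigate(results):
    """Collect linked_event references for failed cases that have one."""
    return [
        r["datapoint"]["linked_event"]
        for r in results
        if not r["passed"] and "linked_event" in r["datapoint"]
    ]


print(events_to_investigate(RESULTS))  # ['evt_123']
```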

***

## Best Practices

| Do                                                  | Don't                                   |
| --------------------------------------------------- | --------------------------------------- |
| Filter by evaluator scores to find quality examples | Add traces without reviewing them first |
| Include diverse edge cases, not just happy paths    | Curate only successful interactions     |
| Review curated data periodically for relevance      | Let datasets grow unbounded             |
| Use descriptive dataset names with dates            | Use generic names like "test-data"      |
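
One way to keep names descriptive and dated is to generate them. The helper below is a small Python sketch; the naming pattern (application, purpose, ISO date) is just one possible convention, not something HoneyHive requires.

```python
import re
from datetime import date


def dataset_name(app, purpose, curated_on):
    """Build a name like 'support-bot-low-relevance-2024-05-01'."""

    def slug(text):
        # Lowercase, then collapse anything non-alphanumeric into hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(app)}-{slug(purpose)}-{curated_on.isoformat()}"


print(dataset_name("Support Bot", "low relevance", date(2024, 5, 1)))
# support-bot-low-relevance-2024-05-01
```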

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Run Experiments" icon="flask" href="/v2/introduction/experiments-quickstart">
    Evaluate your application using curated datasets
  </Card>

  <Card title="Upload Datasets" icon="upload" href="/v2/datasets/import">
    Import datasets from external files
  </Card>
</CardGroup>
