Running experiments is a natural extension of HoneyHive's tracing capabilities. We recommend going through the tracing quickstart before proceeding with this guide.

Full code

Here’s a minimal example to get you started with experiments in HoneyHive:
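
The sketch below assumes the Python SDK's evaluate helper and evaluator decorator; the parameter names (hh_api_key, hh_project, name, dataset, evaluators) and the (inputs, ground_truths) call convention are assumptions drawn from the SDK's experiments API, and the model, prompt, and questions are placeholders.

```python
# Minimal experiment sketch, assuming the HoneyHive Python SDK's evaluate
# helper and evaluator decorator (pip install honeyhive). Model, prompt, and
# dataset contents are placeholders.
from honeyhive import evaluate, evaluator
from openai import OpenAI

openai_client = OpenAI()


def answer_question(inputs, ground_truths=None):
    # `inputs` is one entry from the dataset below; the return value maps to
    # the outputs field of the corresponding run.
    completion = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": inputs["question"]}],
    )
    return completion.choices[0].message.content


# Input data: each entry's fields must match what the function reads.
dataset = [
    {"inputs": {"question": "What does the HoneyHive tracer capture?"}},
    {"inputs": {"question": "How do experiments relate to traces?"}},
]


# Client-side evaluator: computed at the end of each run and logged as a metric.
@evaluator()
def response_length(outputs, inputs, ground_truths):
    return len(outputs)


if __name__ == "__main__":
    evaluate(
        function=answer_question,          # the flow being evaluated
        hh_api_key="<HONEYHIVE_API_KEY>",
        hh_project="<HONEYHIVE_PROJECT>",
        name="quickstart-experiment",
        dataset=dataset,
        evaluators=[response_length],
    )
```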

Running an experiment

Prerequisites

  • You have already created a project in HoneyHive, as explained here.
  • You have an API key for your project, as explained here.
  • You have instrumented your application with the HoneyHive SDK, as explained here.

Expected Time: 5 minutes

Steps

1. Create the flow you want to evaluate

Assuming you have gone through the tracing quickstart, you should already have a function that looks something like this:
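
A sketch of such a traced function, assuming an OpenAI-backed flow instrumented with the SDK's trace decorator from the tracing quickstart; the model and prompt are placeholders.

```python
# Hypothetical traced flow from the tracing quickstart: a single OpenAI call
# wrapped with the SDK's trace decorator.
from honeyhive import trace
from openai import OpenAI

openai_client = OpenAI()


@trace
def answer_question(inputs, ground_truths=None):
    completion = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": inputs["question"]}],
    )
    return completion.choices[0].message.content
```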

The value returned by the function maps to the outputs field of each run in the experiment.

2. Set up input data

Create a list of input data for your function parameters:
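
For example, a hypothetical question-answering dataset, with each entry's fields wrapped under an inputs key as assumed in the full-code sketch above:

```python
# Hypothetical input data: each entry's input fields map to what
# answer_question reads from `inputs`.
dataset = [
    {"inputs": {"question": "What does the HoneyHive tracer capture?"}},
    {"inputs": {"question": "How do experiments relate to traces?"}},
    {"inputs": {"question": "Which fields are logged on each run?"}},
]
```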

If your dataset lives in a file, upload it to the platform to create a dataset, then copy the dataset ID shown under its name.

The input fields in each dataset entry should map to the fields the function passed to evaluate expects.
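
If you go the hosted-dataset route, the run step below can reference the dataset by ID instead of an in-code list; the dataset_id parameter name here is an assumption, so verify it against your SDK version.

```python
# Referencing a dataset hosted on the platform instead of an in-code list.
# The dataset_id parameter name is assumed; verify it against your SDK.
from honeyhive import evaluate

evaluate(
    function=answer_question,
    hh_api_key="<HONEYHIVE_API_KEY>",
    hh_project="<HONEYHIVE_PROJECT>",
    name="quickstart-experiment",
    dataset_id="<DATASET_ID>",
)
```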

3. (Optional) Set up evaluators

Evaluators can be set up either in your code or computed on our platform.

  1. Client-side evaluators run at the end of each run. For more granular experiment metrics, you can enrich your tracer with client-side evaluators directly in your code, as in the sketch after this list.

  2. Server-side evaluators let you set up a metric on our platform.
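
A sketch of the client-side option; the evaluator decorator, its (outputs, inputs, ground_truths) argument order, and the enrich_span helper are assumptions based on the SDK's experiments and tracing APIs.

```python
# Client-side evaluator sketch: computed at the end of each run and attached
# to it as a metric. Decorator name and argument order are assumptions.
from honeyhive import enrich_span, evaluator, trace


@evaluator()
def contains_greeting(outputs, inputs, ground_truths):
    # Toy metric over the run's outputs (a string returned by the flow).
    return "hello" in outputs.lower()


# You can also attach metrics from inside a traced step via the tracer,
# assuming the enrich_span helper from the tracing guide.
@trace
def retrieve_documents(query):
    docs = ["placeholder document"]
    enrich_span(metrics={"num_documents": len(docs)})
    return docs
```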

4. Run the experiment
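
Putting the pieces together, a sketch of kicking off the run with the evaluate helper and the parameter names assumed above:

```python
# Run the experiment: executes answer_question from step 1 over every dataset
# entry from step 2, applies the evaluators from step 3, and logs each run.
from honeyhive import evaluate

if __name__ == "__main__":
    evaluate(
        function=answer_question,
        hh_api_key="<HONEYHIVE_API_KEY>",
        hh_project="<HONEYHIVE_PROJECT>",
        name="quickstart-experiment",
        dataset=dataset,
        evaluators=[contains_greeting],
    )
```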

Dashboard View

Review the results in your HoneyHive dashboard to gain insight into your model's performance across different inputs. The dashboard provides a comprehensive view of the results across all runs in the experiment.

Conclusion

By following these steps, you can set up and run experiments using HoneyHive. This allows you to systematically test your LLM-based systems across various scenarios and collect performance data for analysis.