Quickstart
Get started running experiments with HoneyHive
Running experiments is a natural extension of HoneyHive's tracing capabilities. We recommend going through the tracing quickstart before proceeding with this guide.
Full code
Here’s a minimal example to get you started with experiments in HoneyHive:
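The sketch below strings together the steps covered in this guide: a pipeline function, a code-managed dataset, a client-side evaluator, and the `evaluate` call. Parameter names such as `hh_api_key` and `hh_project` are illustrative assumptions, so confirm the exact signature against the SDK reference.

```python
# Minimal sketch of a full experiment run. `evaluate`, code-managed datasets,
# and client-side evaluators are described in the steps below; parameter names
# such as hh_api_key and hh_project are placeholders/assumptions.
from honeyhive import evaluate

# 1. The pipeline you want to evaluate. It receives one datapoint's inputs,
#    and its return value becomes the run's `outputs`.
def my_pipeline(inputs, ground_truths=None):
    # Replace this stub with your actual LLM / retrieval / agent logic.
    return {"answer": f"stub answer for: {inputs['question']}"}

# 2. A code-managed dataset: a list of JSON objects.
dataset = [
    {"inputs": {"question": "What is HoneyHive?"}},
    {"inputs": {"question": "How do experiments work?"}},
]

# 3. An optional client-side evaluator that scores each run.
def has_answer(outputs, inputs, ground_truths=None):
    return 1.0 if outputs.get("answer") else 0.0

# 4. Run the experiment.
evaluate(
    function=my_pipeline,
    hh_api_key="<YOUR_HONEYHIVE_API_KEY>",   # placeholder
    hh_project="<YOUR_PROJECT_NAME>",        # placeholder
    name="quickstart-experiment",
    dataset=dataset,            # or dataset_id="<HONEYHIVE_DATASET_ID>"
    evaluators=[has_answer],
)
```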
Running an experiment
Prerequisites
- You have already created a project in HoneyHive.
- You have an API key for your project.
- You have instrumented your application with the HoneyHive SDK (see the tracing quickstart).
Expected Time: 5 minutes
Steps
Create the flow you want to evaluate
If you have gone through the tracing quickstart, you should already have a traced function that looks something like this:
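For illustration, here is a sketch of such a function; the `trace` decorator import and the OpenAI call are assumptions borrowed from a typical tracing setup, so substitute your own instrumented pipeline.

```python
# A traced pipeline function, roughly as it looks after the tracing
# quickstart. The `trace` import path and the OpenAI call are illustrative
# assumptions -- adapt them to your own instrumented application.
from honeyhive import trace  # import path may differ by SDK version
from openai import OpenAI

client = OpenAI()

@trace
def generate_answer(inputs: dict) -> dict:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": inputs["question"]}],
    )
    # Whatever this function returns becomes the run's `outputs`.
    return {"answer": completion.choices[0].message.content}
```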
The value returned by this function maps to the `outputs` field of each run in the experiment.
Setup input data
Input datasets for experiments can be managed in two ways:
- HoneyHive Cloud: Upload and version your datasets directly in HoneyHive for team collaboration and dataset versioning. After uploading the dataset, use its `dataset_id` in your experiment configuration.
- Code-managed: Define your input data directly in your code as a list of JSON objects and pass it to the `evaluate` function. This is useful for dynamic datasets or when you want to keep everything in your codebase. Both options are sketched below.
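A sketch of both options, assuming `evaluate` accepts either a dataset list or a `dataset_id`; the `inputs` field name is illustrative.

```python
# Option A: code-managed dataset -- a plain list of JSON objects passed to
# `evaluate` (the "inputs" field name is an assumption).
dataset = [
    {"inputs": {"question": "What is HoneyHive?"}},
    {"inputs": {"question": "How do I run an experiment?"}},
]

# Option B: a dataset uploaded to HoneyHive Cloud -- reference it by its
# dataset_id instead of passing a list (the ID below is a placeholder).
dataset_id = "<HONEYHIVE_DATASET_ID>"
```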
(Optional) Setup Evaluators
Evaluators can be configured in two ways:
- Client-side Execution: Define evaluators in your code that run immediately after each experiment iteration. These evaluators have direct access to inputs and outputs and run synchronously with your experiment (see the sketch after this list).
- Server-side Execution: Configure evaluators in HoneyHive UI that run asynchronously after your traces are logged. This is useful for computation-heavy evaluators or when you want to add/modify metrics after runs are complete.
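As a sketch, a client-side evaluator can be a plain function that scores a single run. The `(outputs, inputs, ground_truths)` signature is an assumption, so confirm the exact contract in the SDK reference.

```python
# A client-side evaluator: runs immediately after each iteration and has
# direct access to that run's inputs and outputs. The signature below is
# an assumption -- check the SDK reference for the exact contract.
def contains_answer(outputs: dict, inputs: dict, ground_truths=None) -> float:
    """Return 1.0 if the pipeline produced a non-empty answer, else 0.0."""
    return 1.0 if outputs.get("answer") else 0.0
```

Pass it to `evaluate` via the evaluators list, as shown in the next step.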
Run experiment
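Putting the pieces together, a minimal call might look like the sketch below; parameter names such as `hh_api_key`, `hh_project`, `dataset`, and `evaluators` are assumptions to be checked against the SDK reference.

```python
from honeyhive import evaluate

# Run the traced pipeline over every datapoint in the dataset and score
# each run with the client-side evaluator. Parameter names are assumed.
evaluate(
    function=generate_answer,                # traced pipeline from the first step
    hh_api_key="<YOUR_HONEYHIVE_API_KEY>",   # placeholder
    hh_project="<YOUR_PROJECT_NAME>",        # placeholder
    name="quickstart-experiment",
    dataset=dataset,                         # or dataset_id="<HONEYHIVE_DATASET_ID>"
    evaluators=[contains_answer],
)
```

Each datapoint produces one run, and the function's return value is stored as that run's `outputs`.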
Dashboard View
Review the results in your HoneyHive dashboard to gain insight into your model's performance across different inputs. The dashboard provides a comprehensive view of experiment results and performance across multiple runs.
Conclusion
By following these steps, you can set up and run experiments using HoneyHive. This allows you to systematically test your LLM-based systems across various scenarios and collect performance data for analysis.