Get started with running evaluations on your LangChain and LlamaIndex pipelines with HoneyHive
As a prerequisite to this guide, read our guides on the LangChain and LlamaIndex tracers.
Executing an evaluation run in HoneyHive is as simple as setting a few fields on the HoneyHive tracer.
To instantiate an evaluation run:
Set the source field of the tracer to "evaluation".
Set the metadata field of the tracer to a dictionary containing a dataset_name field equal to the name of the dataset you want to run the evaluation over.
Let’s look at an example:
import os
import honeyhive
from honeyhive.utils.llamaindex_tracer import HoneyHiveLlamaIndexTracer
from llama_index.core import Settings, VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core.callbacks import CallbackManager
import openai

def run_tracer(source, metadata):
    tracer = HoneyHiveLlamaIndexTracer(
        project=os.environ["HH_PROJECT"],
        name="Paul Graham Q&A",
        source=source,
        api_key=os.environ["HH_API_KEY"],
        metadata=metadata,
    )
    openai.api_key = os.environ["OPENAI_API_KEY"]
    Settings.callback_manager = CallbackManager([tracer])

    documents = SimpleWebPageReader(html_to_text=True).load_data(
        ["http://paulgraham.com/worked.html"]
    )
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    response = query_engine.query("What did the author do growing up?")
    return tracer, response

tracer, response = run_tracer("evaluation", {"dataset_name": os.environ["HH_DATASET"]})
To add subsequent sessions onto an existing evaluation run:
Set the metadata field of the tracer to a dictionary containing a run_id field equal to the ID of the existing evaluation run you want to append to.
To make this easy, the tracer object stores an eval_info object containing the run ID of any evaluation run the tracer instantiated or appended to.
This is clearer if we look at an example:
import os
import honeyhive
from honeyhive.utils.llamaindex_tracer import HoneyHiveLlamaIndexTracer
from llama_index.core import Settings, VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core.callbacks import CallbackManager
import openai

def run_tracer(source, metadata):
    tracer = HoneyHiveLlamaIndexTracer(
        project=os.environ["HH_PROJECT"],
        name="Paul Graham Q&A",
        source=source,
        api_key=os.environ["HH_API_KEY"],
        metadata=metadata,
    )
    openai.api_key = os.environ["OPENAI_API_KEY"]
    Settings.callback_manager = CallbackManager([tracer])

    documents = SimpleWebPageReader(html_to_text=True).load_data(
        ["http://paulgraham.com/worked.html"]
    )
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    response = query_engine.query("What did the author do growing up?")
    return tracer, response

# Start a new evaluation run over the dataset...
tracer, response = run_tracer("evaluation", {"dataset_name": os.environ["HH_DATASET"]})

# ...then append a second session to the same run using its run ID.
_, new_response = run_tracer("evaluation", {"run_id": tracer.eval_info["run_id"]})
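To summarize, the tracer's behavior is driven entirely by the metadata dictionary: a dataset_name field starts a new evaluation run, while a run_id field appends sessions to an existing one. A minimal sketch of that choice as a standalone helper (eval_metadata is hypothetical, not part of the HoneyHive SDK):

```python
def eval_metadata(dataset_name=None, run_id=None):
    """Build the tracer `metadata` dict for an evaluation run.

    Hypothetical helper for illustration only (not part of the
    HoneyHive SDK). A run_id takes precedence, since appending
    targets a specific existing run.
    """
    if run_id is not None:
        # Append subsequent sessions to an existing evaluation run.
        return {"run_id": run_id}
    if dataset_name is not None:
        # Instantiate a new evaluation run over this dataset.
        return {"dataset_name": dataset_name}
    raise ValueError("Pass dataset_name (new run) or run_id (append).")
```

With a helper like this, the two run_tracer calls above become run_tracer("evaluation", eval_metadata(dataset_name=...)) followed by run_tracer("evaluation", eval_metadata(run_id=tracer.eval_info["run_id"])).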