This functionality is currently supported only for the HoneyHive LangChain and LlamaIndex tracers.

Running an evaluation in HoneyHive only requires setting two fields on the HoneyHive LangChain or LlamaIndex tracer. To start a new evaluation run:

  • Set the source field of the tracer to evaluation.
  • Set the metadata field of the tracer to a dictionary whose dataset_name field is the dataset you want to run the evaluation over.

Let’s look at an example:

import os
import honeyhive
from honeyhive.utils.llamaindex_tracer import HoneyHiveLlamaIndexTracer
from llama_index.core import Settings, VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core.callbacks import CallbackManager
import openai


def run_tracer(source, metadata):
    tracer = HoneyHiveLlamaIndexTracer(
        project=os.environ["HH_PROJECT"],
        name="Paul Graham Q&A",
        source=source,
        api_key=os.environ["HH_API_KEY"],
        metadata=metadata,
    )

    openai.api_key = os.environ["OPENAI_API_KEY"]

    Settings.callback_manager = CallbackManager([tracer])

    documents = SimpleWebPageReader(html_to_text=True).load_data(
        ["http://paulgraham.com/worked.html"]
    )

    index = VectorStoreIndex.from_documents(documents)

    query_engine = index.as_query_engine()
    response = query_engine.query("What did the author do growing up?")
    return tracer, response

# source="evaluation" plus a dataset_name in metadata starts a new evaluation run.
tracer, response = run_tracer("evaluation", {"dataset_name": os.environ["HH_DATASET"]})
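The example above reads four environment variables and raises a bare KeyError if any of them is unset. A minimal pre-flight check, assuming the same variable names as the example, might look like the sketch below. The names REQUIRED_VARS and missing_env_vars are hypothetical helpers for illustration and are not part of the HoneyHive SDK.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the example above; adjust if yours differ.
REQUIRED_VARS = ("HH_PROJECT", "HH_API_KEY", "HH_DATASET", "OPENAI_API_KEY")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]
```

Calling missing_env_vars() before run_tracer and stopping when the returned list is non-empty gives a clearer error message than a KeyError raised midway through the run.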

To append subsequent sessions to an existing evaluation run:

  • Set the metadata field of the tracer to a dictionary with a field run_id equal to the ID of the existing evaluation run you want to append to.

To make this easier, the tracer object stores an eval_info object that contains the run ID of any evaluation run that the tracer started or appended to.

This is clearer if we look at an example:

import os
import honeyhive
from honeyhive.utils.llamaindex_tracer import HoneyHiveLlamaIndexTracer
from llama_index.core import Settings, VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core.callbacks import CallbackManager
import openai


def run_tracer(source, metadata):
    tracer = HoneyHiveLlamaIndexTracer(
        project=os.environ["HH_PROJECT"],
        name="Paul Graham Q&A",
        source=source,
        api_key=os.environ["HH_API_KEY"],
        metadata=metadata,
    )

    openai.api_key = os.environ["OPENAI_API_KEY"]

    Settings.callback_manager = CallbackManager([tracer])

    documents = SimpleWebPageReader(html_to_text=True).load_data(
        ["http://paulgraham.com/worked.html"]
    )

    index = VectorStoreIndex.from_documents(documents)

    query_engine = index.as_query_engine()
    response = query_engine.query("What did the author do growing up?")
    return tracer, response

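# First session: starts a new evaluation run over the dataset.
tracer, response = run_tracer("evaluation", {"dataset_name": os.environ["HH_DATASET"]})
# Second session: appended to the same run using the run ID stored on the first tracer.
_, new_response = run_tracer("evaluation", {"run_id": tracer.eval_info["run_id"]})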
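The start-then-append pattern extends to any number of sessions. The sketch below is one way to wrap it, under stated assumptions: run_sessions is a hypothetical helper (not a HoneyHive API), and start_session is any callable you supply that takes a metadata dictionary and returns a tracer exposing eval_info["run_id"], as the tracer in the example above does.

```python
from typing import Any, Callable, Dict, List


def run_sessions(
    start_session: Callable[[Dict[str, Any]], Any],
    dataset_name: str,
    num_sessions: int,
) -> List[str]:
    """Start one evaluation run, then append the remaining sessions to it.

    Returns the run ID reported by each session's tracer.
    """
    if num_sessions < 1:
        raise ValueError("num_sessions must be at least 1")

    # First session: metadata carries dataset_name, which starts the run.
    first_tracer = start_session({"dataset_name": dataset_name})
    run_id = first_tracer.eval_info["run_id"]
    run_ids = [run_id]

    # Later sessions: metadata carries run_id, which appends to that run.
    for _ in range(num_sessions - 1):
        tracer = start_session({"run_id": run_id})
        run_ids.append(tracer.eval_info["run_id"])

    return run_ids
```

With the run_tracer function from the example, a call could look like run_sessions(lambda metadata: run_tracer("evaluation", metadata)[0], os.environ["HH_DATASET"], 3). Passing the session starter in as a callable keeps the helper independent of which tracer (LangChain or LlamaIndex) is used.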
