Instrumenting a RAG application with HoneyHive
The `@trace` decorator in Python and the `traceFunction` helper in TypeScript let us add custom spans for important functions in the application. They capture all function inputs and outputs, as well as durations and other relevant properties.
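As a minimal sketch in Python (the import path, `init` arguments, and project name here are assumptions; check the SDK docs for your version):

```python
from honeyhive import HoneyHiveTracer, trace  # assumed import path

# Initialize the tracer once at startup; the project name is a placeholder.
HoneyHiveTracer.init(
    api_key="YOUR_HONEYHIVE_API_KEY",
    project="rag-demo",
)

@trace
def summarize(text: str) -> str:
    # The decorator records this call's inputs, output, and duration
    # as a custom span.
    return text[:100]
```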
We’ll start by placing the first decorator on the main RAG function, as sketched below.
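Assuming retrieval and generation helpers like the ones discussed later in this post, a minimal version looks like:

```python
@trace
def rag_pipeline(query: str) -> str:
    # One top-level span covering the whole request.
    docs = get_relevant_documents(query)    # vector DB retrieval step
    return generate_response(query, docs)   # LLM generation step
```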
The resulting `rag_pipeline`/`ragPipeline` span is a lot easier to read and interpret.
We can see that the user query was “What does the document talk about?” and that the final output is a seemingly correct description provided by the model.
This high-level view will help us catch any glaring semantic issues.
However, this is still not sufficient.
We still need access to specific fields from the vector DB and LLM steps in order to break down how we arrived at this output.
Luckily, our decorator approach easily scales to include any step we please, as the sketch below shows.
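Here is one way the two sub-steps might look with their own spans. The Pinecone index name, embedding model, prompt, and response parsing are all assumptions for illustration:

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
pinecone_index = Pinecone(api_key="YOUR_PINECONE_API_KEY").Index("docs")  # assumed index

@trace
def get_relevant_documents(query: str) -> list[str]:
    # Embed the query and fetch the closest chunks from Pinecone.
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    results = pinecone_index.query(vector=embedding, top_k=3, include_metadata=True)
    return [match["metadata"]["text"] for match in results["matches"]]

@trace
def generate_response(query: str, docs: list[str]) -> str:
    # Ask the model to answer using only the retrieved context.
    context = "\n\n".join(docs)
    completion = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer using the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return completion.choices[0].message.content
```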
With spans on `get_relevant_documents`/`getRelevantDocuments`, we can now inspect the documents the retriever returned and understand whether the LLM’s answer is sensible.
Our UI makes it easy to navigate deeply nested JSON with large text fields, which makes debugging smoother.
Just by investigating these spans, we can quickly determine whether the retrieval or generation step is causing our overall application to fail.
The `trace` decorator also accepts metadata and configuration to provide more context to the traces:
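For example, a sketch of what this could look like (the `config` and `metadata` keyword names are assumptions; consult the SDK reference for the exact signature):

```python
@trace(
    config={"model": "gpt-4o", "top_k": 3},   # e.g. pipeline hyperparameters
    metadata={"pipeline_version": "v0.1"},    # e.g. deployment context
)
def generate_response(query: str, docs: list[str]) -> str:
    ...
```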
Lastly, we turn to the `main` function. Using the `enrich_session`/`enrichSession` helper functions on our base tracer class, we enrich the full session with the relevant external context as well.
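A sketch of what that might look like, assuming `init` returns a tracer instance exposing `enrich_session` (the argument names shown are assumptions):

```python
def main():
    tracer = HoneyHiveTracer.init(
        api_key="YOUR_HONEYHIVE_API_KEY",
        project="rag-demo",
    )
    answer = rag_pipeline("What does the document talk about?")
    print(answer)
    # Attach external context (e.g. user and environment) to the whole session.
    tracer.enrich_session(
        metadata={"user_id": "user-123", "environment": "staging"},
    )

if __name__ == "__main__":
    main()
```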
To recap, our instrumented functions are:

- `get_relevant_documents`/`getRelevantDocuments`: Retrieves relevant documents from Pinecone.
- `generate_response`/`generateResponse`: Generates a response using OpenAI’s GPT model.
- `rag_pipeline`/`ragPipeline`: Orchestrates the entire RAG process.

In the `main` function, we: