This guide shows how to trace pipelines that handle multi-modal data: images, audio, video, or documents with embedded media.

Vision model calls are captured by auto-instrumentation. If you’re using OpenAI Vision, Gemini Pro Vision, or similar APIs, the LLM calls are traced automatically via instrumentors. This guide covers tracing your custom processing logic around those calls.
## When to Use This Guide
Use these patterns when your pipeline includes:
Image preprocessing before vision model calls
Audio transcription or synthesis
Video frame extraction or analysis
Document parsing with embedded media
Media storage/retrieval operations
## Basic Pattern
Trace multi-modal functions the same way as any other function, using the `@trace` decorator:

```python
import os

from honeyhive import HoneyHiveTracer, trace, enrich_span

HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    project=os.getenv("HH_PROJECT"),
)

@trace
def analyze_image(image_path: str, question: str) -> dict:
    """Analyze an image and answer a question about it."""
    # Your preprocessing
    image_data = load_and_resize(image_path)

    # Vision model call (auto-traced if using an instrumentor)
    response = vision_model.analyze(image_data, question)

    return {"answer": response.text, "confidence": response.confidence}
```
Add context about the media being processed using `enrich_span`:

```python
@trace
def process_video(video_path: str) -> dict:
    """Extract and analyze frames from a video."""
    # Add media metadata for debugging and analysis
    enrich_span({
        "media_type": "video",
        "format": "mp4",
        "duration_seconds": get_duration(video_path),
        "resolution": "1920x1080",
    })

    frames = extract_keyframes(video_path)
    analyses = [analyze_frame(f) for f in frames]

    enrich_span({"frames_analyzed": len(frames)})
    return {"frame_analyses": analyses}
```
**Don’t log media bytes.** Store references (paths, URLs, IDs) instead of raw binary data. This keeps traces lightweight and queryable.
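One way to follow this rule, as a minimal sketch: a small helper (hypothetical, not part of the HoneyHive SDK) that turns a media file into span-safe metadata, logging a path, size, and a short content hash rather than the bytes themselves:

```python
import hashlib
import os

def media_reference(path: str) -> dict:
    """Build span-safe metadata for a media file: path, size, and a
    short content hash for correlation -- never the raw bytes."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "source_path": path,
        "file_size_bytes": os.path.getsize(path),
        # A hash prefix lets you spot identical media across traces
        "content_sha256": digest[:16],
    }
```

Inside a traced function you could then call `enrich_span(media_reference(image_path))` instead of logging the image itself.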
## Multi-Step Pipeline Example
For pipelines with multiple processing stages, each traced function becomes a child span:

```python
@trace
def process_document(doc_path: str) -> dict:
    """Process a document with embedded images."""
    # Each @trace function creates a child span
    text = extract_text(doc_path)      # Child span
    images = extract_images(doc_path)  # Child span

    summaries = []
    for img in images:
        summary = analyze_image(img)   # Child span per image
        summaries.append(summary)

    return {
        "text": text,
        "image_summaries": summaries,
    }

@trace
def extract_text(doc_path: str) -> str:
    enrich_span({"step": "text_extraction"})
    # ... extraction logic
    return text

@trace
def extract_images(doc_path: str) -> list:
    enrich_span({"step": "image_extraction"})
    # ... extraction logic
    return image_paths

@trace
def analyze_image(image_path: str) -> str:
    enrich_span({
        "step": "image_analysis",
        "image_path": image_path,
    })
    # ... vision model call
    return summary
```
The trace tree shows the full pipeline hierarchy:

```text
process_document
├── extract_text
├── extract_images
├── analyze_image (image_1)
├── analyze_image (image_2)
└── analyze_image (image_3)
```
Useful metadata fields for media spans:

| Field | Description | Example |
| --- | --- | --- |
| `media_type` | Type of media | `"image"`, `"audio"`, `"video"` |
| `format` | File format | `"png"`, `"wav"`, `"mp4"` |
| `duration_seconds` | Length for audio/video | `120.5` |
| `resolution` | Dimensions | `"1920x1080"` |
| `file_size_bytes` | Size for performance tracking | `1048576` |
| `source_url` | Reference to original | `"s3://bucket/file.png"` |
| `processing_steps` | Operations performed | `["resize", "normalize"]` |
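As an illustration, a tiny helper (hypothetical, not a HoneyHive API) can assemble these fields with a basic sanity check before they are passed to `enrich_span`:

```python
# Media types from the table above; anything else is likely a typo.
ALLOWED_MEDIA_TYPES = {"image", "audio", "video"}

def media_metadata(media_type: str, fmt: str, **fields) -> dict:
    """Assemble span metadata, rejecting unknown media types early."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unknown media_type: {media_type!r}")
    return {"media_type": media_type, "format": fmt, **fields}

# Example: the metadata used in the process_video span earlier
video_meta = media_metadata(
    "video", "mp4", duration_seconds=120.5, resolution="1920x1080"
)
```

Calling `enrich_span(video_meta)` inside a traced function would then attach exactly the fields listed in the table.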
- **Custom Spans**: Full guide to the `@trace` decorator
- **Enriching Traces**: Adding metadata with `enrich_span`
- **OpenAI Vision**: Auto-tracing for OpenAI vision calls
- **Gemini Vision**: Auto-tracing for Gemini vision calls