> ## Documentation Index
> Fetch the complete documentation index at: https://docs.honeyhive.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Multi-Modal Tracing

> Trace pipelines that process images, audio, video, and other media

This guide shows how to trace pipelines that handle multi-modal data - images, audio, video, or documents with embedded media.

<Note>
  **Auto-instrumentation captures vision calls automatically.** If you're using OpenAI Vision, Gemini Pro Vision, or similar APIs, the LLM calls are traced automatically via [instrumentors](/v2/integrations/openai). This guide covers tracing your custom processing logic around those calls.
</Note>

## When to Use This Guide

Use these patterns when your pipeline includes:

* Image preprocessing before vision model calls
* Audio transcription or synthesis
* Video frame extraction or analysis
* Document parsing with embedded media
* Media storage/retrieval operations

***

## Basic Pattern

Trace multi-modal functions the same way as any other function - use the `@trace` decorator:

```python theme={null}
import os
from honeyhive import HoneyHiveTracer, trace, enrich_span

HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY"),
    project=os.getenv("HH_PROJECT")
)

@trace
def analyze_image(image_path: str, question: str) -> dict:
    """Analyze an image and answer a question about it."""
    
    # Your preprocessing
    image_data = load_and_resize(image_path)
    
    # Vision model call (auto-traced if using instrumentor)
    response = vision_model.analyze(image_data, question)
    
    return {"answer": response.text, "confidence": response.confidence}
```

***

## Adding Media Metadata

Add context about the media being processed using `enrich_span`:

```python theme={null}
@trace
def process_video(video_path: str) -> dict:
    """Extract and analyze frames from video."""
    
    # Add media metadata for debugging and analysis
    enrich_span({
        "media_type": "video",
        "format": "mp4",
        "duration_seconds": get_duration(video_path),
        "resolution": "1920x1080"
    })
    
    frames = extract_keyframes(video_path)
    analyses = [analyze_frame(f) for f in frames]
    
    enrich_span({"frames_analyzed": len(frames)})
    
    return {"frame_analyses": analyses}
```

<Tip>
  **Don't log media bytes.** Store references (paths, URLs, IDs) instead of raw binary data. This keeps traces lightweight and queryable.
</Tip>

***

## Multi-Step Pipeline Example

For pipelines with multiple processing stages, each traced function becomes a child span:

```python theme={null}
@trace
def process_document(doc_path: str) -> dict:
    """Process document with embedded images."""
    
    # Each @trace function creates a child span
    text = extract_text(doc_path)           # Child span
    images = extract_images(doc_path)       # Child span
    
    summaries = []
    for img in images:
        summary = analyze_image(img)        # Child span per image
        summaries.append(summary)
    
    return {
        "text": text,
        "image_summaries": summaries
    }

@trace
def extract_text(doc_path: str) -> str:
    enrich_span({"step": "text_extraction"})
    # ... extraction logic
    return text

@trace  
def extract_images(doc_path: str) -> list:
    enrich_span({"step": "image_extraction"})
    # ... extraction logic
    return image_paths

@trace
def analyze_image(image_path: str) -> str:
    enrich_span({
        "step": "image_analysis",
        "image_path": image_path
    })
    # ... vision model call
    return summary
```

The trace tree shows the full pipeline hierarchy:

```
process_document
├── extract_text
├── extract_images
├── analyze_image (image_1)
├── analyze_image (image_2)
└── analyze_image (image_3)
```

***

## Useful Metadata Fields

| Field              | Description                   | Example                         |
| ------------------ | ----------------------------- | ------------------------------- |
| `media_type`       | Type of media                 | `"image"`, `"audio"`, `"video"` |
| `format`           | File format                   | `"png"`, `"wav"`, `"mp4"`       |
| `duration_seconds` | Length for audio/video        | `120.5`                         |
| `resolution`       | Dimensions                    | `"1920x1080"`                   |
| `file_size_bytes`  | Size for performance tracking | `1048576`                       |
| `source_url`       | Reference to original         | `"s3://bucket/file.png"`        |
| `processing_steps` | Operations performed          | `["resize", "normalize"]`       |

***

## Related

<CardGroup cols={2}>
  <Card title="Custom Spans" icon="code" href="/v2/tracing/custom-spans">
    Full guide to the @trace decorator
  </Card>

  <Card title="Enriching Traces" icon="sparkles" href="/v2/tracing/enrich-traces">
    Adding metadata with enrich\_span
  </Card>

  <Card title="OpenAI Vision" icon="eye" href="/v2/integrations/openai">
    Auto-tracing for OpenAI vision calls
  </Card>

  <Card title="Gemini Vision" icon="eye" href="/v2/integrations/gemini">
    Auto-tracing for Gemini vision calls
  </Card>
</CardGroup>
