HoneyHive’s tracing capabilities extend beyond text-based data, allowing you to capture and analyze multi-modal information in your AI applications. This guide focuses on instrumenting functions that handle multi-modal data, particularly those that return S3 URLs pointing to images, audio, or other non-text assets.

Why Multi-modal Tracing?

Multi-modal tracing is crucial for applications that process various types of data, such as:

  • Image generation or analysis
  • Audio processing
  • Video content creation or analysis
  • Document processing with embedded media

By tracing these functions, you can gain insights into how your application handles different data types and how they impact your AI pipeline’s performance and accuracy.

Using the trace Decorator for Multi-modal Data

To instrument functions that return S3 URLs for multi-modal data, you’ll use the same trace decorator as with text-based functions. Here’s how to set it up:

  1. First, ensure you’ve initialized the HoneyHiveTracer:
Python
from honeyhive import HoneyHiveTracer

HoneyHiveTracer.init(
    api_key=MY_HONEYHIVE_API_KEY,
    project=MY_HONEYHIVE_PROJECT_NAME,
    source=MY_SOURCE,  # e.g., "prod", "dev", etc.
    session_name=MY_SESSION_NAME,
)
  2. Import and use the trace decorator:
Python
from honeyhive import trace

@trace()
def process_image(image_path):
    # Image processing logic here
    # ...
    return "s3://my-bucket/processed-images/image123.jpg"

Adding Context to Multi-modal Traces

To make your traces more informative, you can add metadata about the multi-modal data:

Python
@trace(
    metadata={
        "data_type": "image",
        "format": "jpg",
        "resolution": "1024x768",
        "processing_steps": ["resize", "enhance", "annotate"]
    }
)
def process_image(image_path):
    # Image processing logic here
    # ...
    return "s3://my-bucket/processed-images/image123.jpg"

Handling Different Multi-modal Types

Here are examples of tracing different types of multi-modal data:

Audio Processing

Python
@trace(
    metadata={
        "data_type": "audio",
        "format": "wav",
        "duration_seconds": 120,
        "sample_rate": 44100
    }
)
def transcribe_audio(audio_file):
    # Audio transcription logic
    # ...
    return "s3://my-bucket/transcriptions/audio123.txt"

Video Analysis

Python
@trace(
    metadata={
        "data_type": "video",
        "format": "mp4",
        "duration_seconds": 300,
        "resolution": "1920x1080",
        "fps": 30
    }
)
def analyze_video(video_file):
    # Video analysis logic
    # ...
    return "s3://my-bucket/video-analysis/video123.json"

Best Practices for Multi-modal Tracing

  1. Include relevant metadata: Add information about the data type, format, size, and any processing steps to provide context.

  2. Use consistent naming conventions: For S3 URLs, use a consistent structure to make it easier to analyze and group related assets.

  3. Consider privacy and data protection: Ensure that your S3 URLs and metadata don’t contain sensitive information.

  4. Link related traces: If a multi-modal process involves multiple steps, use consistent identifiers in your metadata to link related traces.
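
The naming and linking practices above can be sketched with a couple of plain Python helpers. This is a minimal illustration, not part of the HoneyHive SDK: the helper names, the bucket layout, and the pipeline_id field are all hypothetical choices you would adapt to your own conventions.

```python
import uuid


def make_s3_url(bucket: str, pipeline: str, stage: str, asset_id: str, ext: str) -> str:
    # Hypothetical convention: s3://<bucket>/<pipeline>/<stage>/<asset_id>.<ext>
    # A fixed structure like this makes it easy to group related assets later.
    return f"s3://{bucket}/{pipeline}/{stage}/{asset_id}.{ext}"


def make_metadata(data_type: str, fmt: str, pipeline_id: str, **extra) -> dict:
    # Embedding a shared pipeline_id in every step's metadata lets you
    # link the traces of a multi-step process back together.
    return {"data_type": data_type, "format": fmt, "pipeline_id": pipeline_id, **extra}


# One identifier shared across every step of a single pipeline run.
pipeline_id = str(uuid.uuid4())

resize_meta = make_metadata("image", "jpg", pipeline_id, processing_steps=["resize"])
annotate_meta = make_metadata("image", "jpg", pipeline_id, processing_steps=["annotate"])

url = make_s3_url("my-bucket", "image-pipeline", "resized", "image123", "jpg")
# url == "s3://my-bucket/image-pipeline/resized/image123.jpg"
```

Dictionaries built this way can be passed straight to the trace decorator's metadata argument, so each step's span carries both its own details and the identifier that ties the run together.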