Evaluators can be run either locally (client-side) or remotely (server-side), each with its own advantages and use cases. Understanding these differences helps you select the right approach for your evaluation needs.

Client-side evaluators

Client-side evaluators run locally within your application’s environment, executing synchronously as part of your experiment workflow. They are ideal for quick checks, such as matching output against a regex pattern or performing basic format validation.
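
For instance, a format check can live directly in your test code as an ordinary function. The sketch below assumes a simple convention of returning a dictionary of metric names to values; the function name and output shape are illustrative, not tied to any particular SDK.

```python
import json
import re

def evaluate_output_format(output: str) -> dict:
    """Client-side evaluator sketch: checks that the model output is valid
    JSON and that it contains an ISO-8601 (YYYY-MM-DD) date field."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return {"valid_json": False, "has_iso_date": False}

    # Intentionally simple date check; extend as needed for your schema.
    iso_date = re.compile(r"^\d{4}-\d{2}-\d{2}$")
    has_date = bool(iso_date.match(str(parsed.get("date", ""))))
    return {"valid_json": True, "has_iso_date": has_date}

# Runs synchronously after each experiment iteration:
result = evaluate_output_format('{"date": "2024-05-01", "summary": "..."}')
assert result["valid_json"] and result["has_iso_date"]
```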

Pros

  • Synchronous Execution: Run immediately after each experiment iteration, providing instant feedback.
  • Offline Experiments: Suitable for offline experiments, such as unit tests or CI pipelines, where latency is not a concern.
  • Guardrails: Perfect for online guardrails such as format assertions and PII detection, where results are needed at execution time (see the sketch after this list).
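
To make the guardrail case concrete, here is a minimal PII-detection sketch. The patterns are deliberately simplistic and the `pii_guardrail` helper is hypothetical; production PII detection needs far broader coverage.

```python
import re

# Deliberately simple patterns for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
US_PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def pii_guardrail(output: str) -> bool:
    """Return True if the output is safe to release (no PII detected)."""
    return not (EMAIL.search(output) or US_PHONE.search(output))

# Because the check runs synchronously, the result is available
# before the response is returned to the user:
model_output = "Contact me at jane.doe@example.com for details."
if not pii_guardrail(model_output):
    model_output = "[REDACTED: response contained possible PII]"
```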

Cons

  • Resource Utilization: Depend on local infrastructure, which can become a bottleneck for resource-intensive evaluations.
  • Management Challenges: Not centrally managed, making versioning and collaboration within a team more difficult.

Server-side evaluators

Server-side evaluators run remotely on HoneyHive’s infrastructure, operating asynchronously and independently of your main application. They are ideal for resource-intensive and complex tasks, such as LLM-assisted coherence scoring or fact-checking.
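
As a sketch of what an LLM-assisted evaluator might look like, the snippet below asks a judge model to rate coherence on a 1-5 scale. It uses the OpenAI client purely for illustration; the model choice, prompt, and function name are assumptions, not HoneyHive's evaluator API.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def coherence_score(question: str, answer: str) -> int:
    """LLM-assisted evaluator sketch: asks a judge model to rate the
    coherence of an answer from 1 (incoherent) to 5 (perfectly coherent)."""
    judge_prompt = (
        "Rate the coherence of the following answer on a scale of 1-5, "
        "where 5 is perfectly coherent. Reply with a single digit.\n\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": judge_prompt}],
    )
    return int(response.choices[0].message.content.strip())
```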

Pros

  • Asynchronous Execution: Operate independently of the main application flow, enabling non-blocking operations.
  • Centralized Management: Provide a consistent, centralized approach to deployment, management, and versioning across environments.
  • Online Suitability: Well-suited for online evaluations, such as real-time performance monitoring and quality metrics.
  • Scalability: Can handle larger datasets and more complex computations without impacting application performance.
  • Human Evaluations: Support asynchronous human evaluations, which require centralized coordination.
  • Post-Ingestion Analysis: Run after the application has completed and logs are ingested, enabling evaluations that incorporate later user feedback or additional data (see the sketch below).
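
To make post-ingestion analysis concrete, a server-side evaluator might join a logged completion with user feedback that arrives later. The event shape below is a hypothetical illustration, not HoneyHive's log schema.

```python
def feedback_agreement(event: dict) -> float | None:
    """Post-ingestion evaluator sketch: compares the model's logged
    confidence with a thumbs-up/down rating the user submitted later.
    The 'event' shape here is a hypothetical example, not a real schema."""
    feedback = event.get("user_feedback")  # e.g. {"rating": "up"} or None
    if feedback is None:
        return None  # feedback has not arrived yet; skip this event

    confident = event.get("metadata", {}).get("confidence", 0.0) >= 0.5
    liked = feedback.get("rating") == "up"
    return 1.0 if confident == liked else 0.0

score = feedback_agreement({
    "metadata": {"confidence": 0.9},
    "user_feedback": {"rating": "up"},
})
assert score == 1.0
```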

Cons

  • Latency: Because they run asynchronously, results are not available immediately, which may not suit workflows that require instant feedback.
  • Less Flexible for Rapid Prototyping: Require more setup and coordination than client-side evaluators, making them less suitable for quick iteration or early-stage experiments.