HoneyHive’s federated architecture separates the Control Plane (CP) from the Data Plane (DP). In a self-hosted deployment, both planes run inside your environment, so no customer data leaves your infrastructure boundary. Data Planes are built to operate independently of the Control Plane, so telemetry ingestion and evaluations continue running with no downtime even if the CP is temporarily unavailable.
This page explains what data exists, how it moves between planes, and what controls you have over retention and deletion.
For background on how the planes work together, see Platform Architecture.
Data Classification
HoneyHive processes three categories of data. This classification determines where each category is stored and whether it crosses the CP/DP boundary.
| Category | Examples | Contains PII? | Stored in |
|---|---|---|---|
| Application data | LLM inputs and outputs, trace payloads, evaluation input/output pairs, datasets, uploaded artifacts | Potentially — depends on what your application sends | Data Plane (S3) |
| Telemetry metadata | Span durations, token counts, model names, error rates, session aggregates, evaluation scores (numeric results, pass/fail) | No | Control Plane (ClickHouse) |
| Platform configuration | User accounts, RBAC roles, project settings, evaluator definitions, SAML/SSO integration, alert rules | No (organizational metadata only) | Control Plane (PostgreSQL) |
PII handling: Application data may contain PII if your LLM application processes personal information. HoneyHive provides SDK-level PII redaction so you can strip sensitive fields before they reach the Data Plane. See Tracing Concepts — Handling Sensitive Data for configuration details, and PII Controls below for how the architecture limits PII propagation.
Control Plane and Data Plane Boundary
The CP and DP are logically independent — they run on separate databases, separate message queues, and separate compute. Even in a self-hosted deployment where both are within your infrastructure, the boundary is enforced to ensure defense-in-depth and to support future migration to dedicated or hybrid hosting models.
What crosses the boundary
| Direction | What crosses | What does NOT cross | Purpose |
|---|---|---|---|
| DP to CP | Telemetry metadata (span durations, token counts, evaluation scores, session aggregates) | Raw trace content (LLM inputs/outputs), uploaded artifacts, dataset contents | Analytics engine (ClickHouse) needs aggregated metrics for dashboards, charts, and alerting |
| CP to DP | RBAC context, data retention policies, evaluator template definitions | User credentials, billing data, SSO/SAML configuration | Data Plane needs to know which evaluators to run and how to process incoming traces |
| CP to DP | Signed JWT tokens (via JWKS endpoint) | Private keys, session cookies | Data Plane verifies SDK API keys and user tokens without sharing a database with the CP |
| DP to CP | Health metrics, Data Plane status | Application data, customer payloads | Controller-to-controller gRPC stream for lifecycle management |
In a Dedicated Cloud deployment (where the CP is managed by HoneyHive and only the DP is in your environment), the boundary above becomes a network boundary between your AWS account and HoneyHive’s. Only metadata and configuration cross the boundary — never raw trace content. Communication is one-way: the DP initiates outbound HTTPS to the CP. The CP never initiates connections into your network.
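The boundary rule above can be pictured as a projection: only an allow-listed set of metadata fields is forwarded to the Control Plane, so raw content is simply absent from the event. A minimal sketch, with hypothetical field names (this is not HoneyHive's actual event schema):

```python
# Hypothetical sketch of the DP-to-CP boundary: only allow-listed
# metadata fields are forwarded; raw content never leaves the DP.
METADATA_FIELDS = {"span_id", "duration_ms", "token_count", "model", "eval_score"}

def to_cp_metadata(trace: dict) -> dict:
    """Project a full trace onto the fields allowed across the boundary."""
    return {k: v for k, v in trace.items() if k in METADATA_FIELDS}

trace = {
    "span_id": "abc123",
    "duration_ms": 412,
    "token_count": 185,
    "model": "gpt-4o",
    "llm_input": "What is the capital of France?",  # stays in DP S3
    "llm_output": "Paris.",                          # stays in DP S3
}
cp_event = to_cp_metadata(trace)  # contains no raw inputs/outputs
```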
What stays within each plane
| Plane | Data that never leaves |
|---|---|
| Data Plane | Raw LLM inputs and outputs, full trace payloads, uploaded datasets, S3-stored artifacts, evaluation input/output pairs |
| Control Plane | User accounts and passwords, RBAC policies, SSO/SAML configuration, billing and usage data, data plane index (metadata catalog of all captured data across data planes), central schema catalog |
Data Flow
Ingestion path
- Your application SDK sends traces to the DP Ingestion Service via HTTPS. The service acknowledges receipt immediately to minimize client latency.
- The Ingestion Service first writes raw trace payloads (including LLM inputs/outputs) to DP S3 so that accepted traces are durably persisted, then publishes telemetry metadata to the CP NATS queue and evaluation events to the DP NATS queue.
- The CP Writer Service consumes metadata from CP NATS, enriches it (session linking, metadata inheritance), and batch-writes to ClickHouse for dashboards, charts, and alerting.
- Failed write batches are retried with exponential backoff; persistently failing events go to a dead letter queue on S3.
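The retry behavior in the last step can be sketched as a small loop: each failed batch write backs off exponentially, and a batch that exhausts its attempts is handed to the dead letter queue. This is a simplified illustration with hypothetical names; the real Writer Service persists its DLQ to S3:

```python
import time

def write_with_retry(batch, write, dlq, max_attempts=5, base_delay=0.5):
    """Attempt a batch write; on failure, back off exponentially.

    After max_attempts failures the batch is routed to the dead letter
    queue (a plain list here; the real services persist theirs to S3).
    """
    for attempt in range(max_attempts):
        try:
            return write(batch)
        except Exception:
            if attempt < max_attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    dlq.append(batch)  # stand-in for the S3-backed DLQ
    return None
```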
Evaluation path
- The DP Evaluation Service consumes events from the DP NATS queue and executes configured evaluators (Python-based, LLM-based, or composite).
- Evaluation scores (numeric results, pass/fail, labels) are published to the CP NATS queue as telemetry metadata.
- The Writer Service persists scores to ClickHouse alongside the original trace metadata.
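A Python-based evaluator in this flow is conceptually a function from an input/output pair to a score and label; only that result is published toward the Control Plane. A toy sketch — the evaluator logic and the queue stand-in are illustrative, not HoneyHive's evaluator API:

```python
def length_ratio_evaluator(inputs: str, outputs: str) -> dict:
    """Toy Python-based evaluator: score the output/input length ratio
    and derive a pass/fail label. Only this result crosses to the CP;
    the input/output pair itself stays in the Data Plane."""
    ratio = len(outputs) / max(len(inputs), 1)
    return {"score": round(ratio, 2), "passed": ratio > 0.1}

cp_queue = []  # stand-in for the CP NATS metadata stream
result = length_ratio_evaluator("What is 2+2?", "4")
cp_queue.append({"evaluator": "length_ratio", **result})
```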
Authentication path
The Data Plane verifies API keys and user tokens using the CP’s JWKS endpoint. No shared database or credentials exist between the planes.
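The JWKS flow means the DP only needs the CP's public keys: it reads the `kid` from an incoming token's header and looks up the matching key in the published key set. Below is a stdlib-only sketch of that key-selection step; actual RS256 signature verification (plus expiry and audience checks) would be delegated to a JWT library:

```python
import base64
import json

def _b64url_decode(part: str) -> bytes:
    """Decode a base64url segment, restoring stripped '=' padding."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def select_jwk(token: str, jwks: dict) -> dict:
    """Pick the JWKS key whose 'kid' matches the token header.

    Key selection only: a real verifier would then check the RS256
    signature against this public key with a JWT library."""
    header = json.loads(_b64url_decode(token.split(".")[0]))
    for key in jwks["keys"]:
        if key.get("kid") == header["kid"]:
            return key
    raise KeyError(f"no JWK with kid {header['kid']!r}")
```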
Data Stores
Data Plane stores
| Store | Technology | What it holds | Encryption |
|---|---|---|---|
| Object Storage | Amazon S3 | Raw trace payloads, LLM inputs/outputs, uploaded artifacts, datasets | SSE-KMS with customer-managed keys |
| Metadata DB | Amazon RDS (PostgreSQL) | Dataset definitions, datapoint records, metric configurations, provider secrets (encrypted), experiment metadata | AES-256 via KMS at rest; TLS 1.2+ in transit |
| Cache | Redis HA | Rate limiting, session caching | In-memory; encrypted at rest if persistence is enabled |
| Message Queue | NATS JetStream | Evaluation task queue (internal, no external access) | TLS for inter-node communication |
| Dead Letter Queue | Amazon S3 | Failed ingestion or evaluation events (retried automatically) | SSE-KMS |
Control Plane stores
| Store | Technology | What it holds | Encryption |
|---|---|---|---|
| Analytics Engine | ClickHouse (on EKS) | Structured event data: span durations, token counts, model identifiers, evaluation scores, session aggregates (does not include raw LLM inputs/outputs) | EBS encryption via KMS at rest |
| Metadata DB | Amazon RDS (PostgreSQL) | User accounts, RBAC, organization hierarchy, project configuration, prompt templates, alert definitions | AES-256 via KMS at rest; TLS 1.2+ in transit |
| Message Queue | NATS JetStream | Event stream for Writer Service, notification stream for alerts | TLS for external communication |
| Cache | Redis HA | Session cache, rate limiting | In-memory |
| Dead Letter Queue | Amazon S3 | Failed write batches (retried automatically) | SSE-KMS |
By default, ClickHouse runs as a stateful workload on EKS. For customers who lack internal ClickHouse expertise, HoneyHive recommends either a HoneyHive-managed Control Plane or a managed ClickHouse deployment in the customer’s environment. See the infrastructure requirements documentation for specifics on supported topologies.
PII Controls
HoneyHive provides multiple layers to control how personally identifiable information (PII) is handled:
SDK-level redaction
The SDK supports schema-based PII filters that strip or mask sensitive fields before data leaves your application. Configure redaction rules to remove names, emails, phone numbers, or any custom fields from trace payloads at the source. For background on how tracing captures data and where to apply redaction, see Tracing Concepts.
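As a concrete illustration of redaction at the source, the sketch below masks emails and phone numbers with regex rules before a payload would leave the application. The rule names and patterns are hypothetical; in practice, use the SDK's own redaction configuration rather than a standalone helper like this:

```python
import re

# Hypothetical redaction rules — illustrative only; the HoneyHive SDK
# exposes its own schema-based configuration for this.
REDACTION_RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Mask sensitive substrings before the trace leaves the app."""
    for label, pattern in REDACTION_RULES.items():
        text = pattern.sub(f"<{label}>", text)
    return text

clean = redact("Contact jane@example.com or +1 (555) 010-9999")
```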
Data Plane boundary
Raw trace payloads (which may contain PII) are stored exclusively in the Data Plane’s S3. Only telemetry metadata (durations, counts, scores) crosses the boundary to the Control Plane. This means even if PII reaches the Data Plane, it does not propagate to the analytics layer.
Encryption
All data at rest is encrypted with customer-managed KMS keys. All data in transit uses TLS 1.2+. See Security for full encryption details.
Retention Controls
Data retention is customer-configurable at multiple levels:
| Data type | Default retention | Configurable? | Mechanism |
|---|---|---|---|
| Trace payloads (S3) | No automatic deletion | Yes | S3 lifecycle policies — set expiration rules per bucket |
| ClickHouse events | No automatic deletion | Yes | ClickHouse TTL policies — configure per-table retention periods |
| ClickHouse backups | Customer-defined | Yes | Scheduled snapshots to S3 via clickhouse-backup (see Operations Guide) |
| PostgreSQL metadata | Indefinite | Yes | Application-level retention APIs |
| RDS backups | 7 days | Yes | RDS backup retention period (customer-configurable) |
| S3 access logs | Customer-defined | Yes | Dedicated logging bucket with its own lifecycle policy |
| EKS audit logs | Customer-defined | Yes | CloudWatch Logs retention settings |
For compliance-driven retention, set S3 lifecycle policies and ClickHouse TTL rules to match your organization’s data retention policy. Both mechanisms run automatically once configured.
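Assuming a 90-day policy, both mechanisms can be expressed compactly. The lifecycle document below matches the shape accepted by `aws s3api put-bucket-lifecycle-configuration`, and the TTL statement is standard ClickHouse DDL; the bucket, table, and column names are placeholders for your deployment's actual resources:

```python
RETENTION_DAYS = 90  # example value — match your organization's policy

# S3 lifecycle rule: expire trace payloads after the retention window.
# Apply with:
#   aws s3api put-bucket-lifecycle-configuration \
#     --bucket <your-trace-bucket> --lifecycle-configuration file://policy.json
lifecycle_policy = {
    "Rules": [
        {
            "ID": f"expire-traces-{RETENTION_DAYS}d",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # applies to the whole bucket
            "Expiration": {"Days": RETENTION_DAYS},
        }
    ]
}

# ClickHouse table TTL: drop event rows after the same window.
# "events" and "timestamp" are placeholder names for your deployment.
clickhouse_ttl = (
    f"ALTER TABLE events MODIFY TTL timestamp + INTERVAL {RETENTION_DAYS} DAY"
)
```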
Deletion and Purge
Project-level archival
Deleting a project archives it: data is retained and can be recovered if needed. To permanently remove the underlying data, an operator with escalated infrastructure privileges must purge the S3 store and the ClickHouse index. This two-step design protects against accidental data loss.
Record-level deletion
HoneyHive is designed as an audit trail, so individual record deletion is not exposed through the standard API or UI. Deletions require infrastructure-level escalated privileges to ensure data integrity and auditability.
Infrastructure-level purge
For complete data removal (e.g., decommissioning a deployment):
- S3: Delete all objects and the bucket itself, or apply an immediate-expiration lifecycle rule
- RDS: Delete the database instance and all automated backups
- ClickHouse: Drop the database or delete the EKS persistent volumes
- EKS: Tear down the cluster via Terraform destroy
Since you own all infrastructure in a self-hosted deployment, you have full control over data destruction; no residual data remains on HoneyHive-managed systems.
Deletion operations are irreversible. Ensure you have appropriate backups before purging data. HoneyHive cannot recover data that has been deleted from customer-owned infrastructure.
Summary
| Question | Answer |
|---|---|
| Does any data leave my environment? | No — in a self-hosted deployment, both the CP and DP run in your infrastructure. No data leaves your environment. |
| What crosses the CP/DP boundary? | Telemetry metadata (DP to CP) and configuration/tokens (CP to DP). Raw trace content stays in the DP. |
| Can I control data retention? | Yes — via S3 lifecycle policies, ClickHouse TTL rules, and RDS backup retention settings. |
| Can I delete all data? | Yes — you own all infrastructure. Projects are archived by default; permanent deletion requires escalated infrastructure privileges. Full infrastructure purge is supported for decommissioning. |
| Does the DP work if the CP is down? | Yes — Data Planes operate independently. Telemetry ingestion and evaluations continue with no downtime. |