# Self-Hosting

> Updates to HoneyHive Helm charts for self-hosted deployments.

<Update label="May 2026">
  ### v0.104.0 - May 5, 2026

  * New `dpPythonmetricService.gunicornWorkers` (default: `4`) and `dpPythonmetricService.pythonExecutionTimeout` (default: `0.1`) in `data-plane/services/values.yaml` let operators tune custom metric concurrency and per-metric runtime without patching the deployment template.
  * `dpEvaluationService.autoscaling.datadog.enabled` (default: `false`) adds optional Datadog External Metrics scaling on NATS evaluation queue depth. Tune `dpEvaluationService.autoscaling.datadog.targetBacklogPerPod` to control backlog per replica.
  * `dpEvaluationService.autoscaling.keda.enabled` (default: `false`) adds optional KEDA-based queue-depth scaling from Prometheus. When enabled, KEDA manages evaluation replicas directly instead of the CPU HPA.
  * New `<service>.affinity` (default: `{}`) for every control-plane and data-plane service and `jobs.affinity` (default: `{}`) for CronJobs and batch jobs in both control-plane and data-plane values. Supports `podAffinity`, `podAntiAffinity`, and `nodeAffinity` rules for pod scheduling.
  * New `frontendIngress.tls.redirect` (default: `false`) in `control-plane/services/values.yaml`. When `true` and TLS is enabled, the ALB listens on HTTP:80 and redirects to HTTPS.
  * New `dpIngestionService.resources` (default: `{}`) and memory-based HPA target `dpIngestionService.autoscaling.targetMemoryUtilizationPercentage` (default: `60`) let operators scale ingestion on memory pressure independently from other data-plane services.
  * **Action Required**: Set `common.dataplane.dpPublicUrl` in `data-plane/services/values.yaml` to your public Data Plane URL before upgrading.
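
  As a rough sketch, the new keys above could be combined in a single override like the one below. The values are illustrative rather than recommendations, the URL and pod label are placeholders, and the `common.dataplane` key path is copied as written in the note above, so check its casing against your chart's `values.yaml`. The `affinity` block uses the standard Kubernetes pod anti-affinity schema.

  ```yaml
  # data-plane/services/values.yaml (illustrative override, not chart defaults)
  common:
    dataplane:
      dpPublicUrl: "https://dp.example.com"   # placeholder URL

  dpPythonmetricService:
    gunicornWorkers: 8            # default: 4
    pythonExecutionTimeout: 0.5   # default: 0.1

  dpEvaluationService:
    autoscaling:
      keda:
        enabled: true             # default: false; KEDA manages replicas instead of the CPU HPA
      datadog:
        enabled: false            # default: false

  dpIngestionService:
    autoscaling:
      targetMemoryUtilizationPercentage: 70   # default: 60
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: dp-ingestion-service   # hypothetical pod label
  ```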
</Update>

<Update label="April 2026">
  ### v0.103.0 - April 24, 2026

  * New `jobs` configuration block in `control-plane/services/values.yaml` for Kubernetes CronJobs that process alert transitions
  * New `jobs` configuration block in `data-plane/services/values.yaml` for operator-triggered S3 time-window backfill batch workloads
  * New per-service keys `<service>.service.annotations` (default: `{}`), `<service>.service.labels` (default: `{}`), and `<service>.service.type` (default: `"ClusterIP"`) are now exposed on `cpWriterService`, `cpControllerService`, `cpNotificationService`, `dpControllerService`, `dpIngestionService`, `dpEvaluationService`, `dpLlmproxyService`, and `dpPythonmetricService`; these settings were previously hardcoded
  * Service port protocol hints (`name: http`, `protocol: TCP`, `appProtocol: http`/`grpc`) added to all service templates for correct L7 traffic classification by service meshes (Istio, Linkerd)
  * `appProtocol` added to ClickHouse instance Service ports (`http`, `interserver`, `metrics`: `appProtocol: http`; `native`: `appProtocol: tcp`)
  * ClickHouse instances `chi-installation` chart version bumped from `1.1.6` to `1.1.7`
  * No breaking changes — fully backward-compatible with v0.102.0 configurations
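
  For illustration, a per-service override using the new `service` keys might look like the following. The annotation and label names are hypothetical; only the key paths and the `ClusterIP` default come from the notes above.

  ```yaml
  # data-plane/services/values.yaml (illustrative)
  dpIngestionService:
    service:
      type: ClusterIP                        # chart default
      annotations:
        example.com/owner: "platform-team"   # hypothetical annotation
      labels:
        team: observability                  # hypothetical label
  ```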

  ### v0.102.0 - April 8, 2026

  * New `podAnnotations` (default: `{}`) in `control-plane/infrastructure/clickhouse/clickhouse_instances/values.yaml` for arbitrary annotations on ClickHouse pods (useful for Datadog autodiscovery, Prometheus scraping)
  * Fixed missing `DP_DATABASE_URL` env var in dp-pythonmetric-service deployment template, now reads from `common.externalSecrets.postgres.secretName` / `uriKey` like all other data-plane services
  * No breaking changes — fully backward-compatible with v0.101.0 configurations
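
  A minimal sketch of `podAnnotations` for Prometheus scraping is shown below. It assumes your Prometheus scrape configuration honors the conventional `prometheus.io/*` annotations, and it reuses port `9363`, which the December 2025 notes give for ClickHouse's built-in metrics endpoint.

  ```yaml
  # control-plane/infrastructure/clickhouse/clickhouse_instances/values.yaml (illustrative)
  podAnnotations:
    prometheus.io/scrape: "true"   # conventional annotation; only effective if your scrape config uses it
    prometheus.io/port: "9363"     # ClickHouse metrics port per the December 2025 notes
  ```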

  ### v0.101.0 - April 1, 2026

  * Disabled the ClickHouse replica check before attaching backup parts (`CLICKHOUSE_CHECK_REPLICAS_BEFORE_ATTACH` set to `"false"` in the backup container); this prevents backup restore failures in environments where replica availability cannot be confirmed
  * ClickHouse instances `chi-installation` chart version bumped from `1.1.5` to `1.1.6`
  * No breaking changes — fully backward-compatible with v0.100.0 configurations
</Update>

<Update label="March 2026">
  ### v0.100.0 - March 25, 2026

  * New `ingress.albClassName` (default: `"alb"`) and `frontendIngress.albClassName` (default: `"alb"`) in control-plane and data-plane services for configurable ALB ingress class
    * Useful for shared CP+DP cluster scenarios where each plane needs its own ALB IngressClass (e.g., `"cp-alb"`, `"dp-alb"`)
    * Existing deployments using the default `"alb"` class require no changes
  * New scheduling config for CP prometheus-nats-exporter in `control-plane/infrastructure/nats/values.yaml`: `prometheusExporter.tolerations`, `prometheusExporter.nodeSelector`, `prometheusExporter.affinity`, `prometheusExporter.additionalLabels`
  * `common.extraLabels` now propagated to all Service, ServiceAccount, HPA, and Ingress resources across both planes (previously only applied to Deployments)
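
  The shared-cluster scenario above could be configured roughly as follows. This sketch assumes `frontendIngress` belongs to the control-plane values and `ingress` to the data-plane values, which the note does not spell out; the class names are the examples given above and would presumably need to match IngressClass resources that exist in your cluster.

  ```yaml
  # control-plane/services/values.yaml (assumed location of frontendIngress)
  frontendIngress:
    albClassName: "cp-alb"   # default: "alb"
  ---
  # data-plane/services/values.yaml (assumed location of ingress)
  ingress:
    albClassName: "dp-alb"   # default: "alb"
  ```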

  ### v0.99.3 - March 20, 2026

  * **Action Required**: Add `common.controlPlane.id` in `data-plane/services/values.yaml` - set it to match `common.controlPlane.id` from your control-plane values. Without this, `dp-controller-service` cannot identify its parent control plane, causing data-plane-to-control-plane communication failures
  * New `global.labels` (default: `{}`) in `control-plane/infrastructure/nats/values.yaml` and `data-plane/infrastructure/nats/values.yaml` - applied to all NATS-generated Kubernetes resources (StatefulSet, Service, PVC, ConfigMap, PDB, etc.)
  * New `nack.additionalLabels`, `nack.tolerations`, `nack.nodeSelector`, `nack.affinity` in control-plane NATS values for JetStream Controller scheduling
  * `prometheusExporter.additionalLabels` now propagated to Prometheus NATS Exporter Deployment, Service, and ServiceMonitor labels in both planes
  * Fixed `topologySpreadConstraints` typo in control-plane NATS values (`topolicySpreadConstraints` → `topologySpreadConstraints`)
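
  As a sketch of the required change, the same identifier appears in both planes' values. The ID string and the label are hypothetical placeholders.

  ```yaml
  # control-plane/services/values.yaml
  common:
    controlPlane:
      id: "cp-prod-01"       # hypothetical identifier
  ---
  # data-plane/services/values.yaml
  common:
    controlPlane:
      id: "cp-prod-01"       # must match the control-plane value
  ---
  # data-plane/infrastructure/nats/values.yaml (optional)
  global:
    labels:
      team: platform         # hypothetical label applied to NATS-generated resources
  ```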

  ### v0.99.2 - March 9, 2026

  * ClickHouse instances `chi-installation` chart version bumped from `1.1.4` to `1.1.5` in `control-plane/infrastructure/clickhouse/clickhouse_instances/Chart.yaml`

  ### v0.99.1 - March 6, 2026

  * Fixed cp-notification-service healthcheck port - renamed env var from `EXPRESS_PORT` to `PORT` in deployment template
  * Control plane `hive-control-plane` chart version bumped from `0.2.0` to `0.2.1`

  ### v0.99.0 - March 6, 2026

  * Added readiness and liveness probes (`GET /healthcheck`) to all 9 service deployments across control-plane and data-plane
  * `cp-backend-service` and `dp-backend-service` include a startup probe (allows up to 310s for Prisma migrations before liveness checks begin)
  * These probes eliminate intermittent 502 errors during rolling updates caused by traffic routing to pods not yet ready to serve
  * No values.yaml changes required - probes use hardcoded values in deployment templates
</Update>

<Update label="February 2026">
  ### v0.98.9 - February 25, 2026

  * Unified encryption configuration for cp-controller, dp-controller, and dp-llmproxy-service
  * Supports at-rest encryption for identity management and provider secrets via KMS or environment variable mode
  * New `common.encryption.keyId` value in both control-plane and data-plane services
  * New ExternalSecret templates for encryption in both control-plane and data-plane secret-store charts
  * Replaced `dpLlmproxyService.kmsKeyId` with unified `HH_ENCRYPTION_KEY_ID` and `HH_ENCRYPTION_SECRET` env vars in dp-llmproxy-service
  * Fixed perpetual ArgoCD OutOfSync caused by Redis PDB `enabled: true` in control-plane Redis
  * **Action Required**: Remove `dpLlmproxyService.kmsKeyId` from data-plane values and add `common.encryption.keyId` in both control-plane and data-plane values
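
  The migration described above might look roughly like this in your overrides. The key ID shown is a hypothetical placeholder; the expected format (for example a KMS alias or ARN) is an assumption and depends on which encryption mode you use.

  ```yaml
  # control-plane/services/values.yaml
  common:
    encryption:
      keyId: "alias/example-encryption-key"   # hypothetical placeholder
  ---
  # data-plane/services/values.yaml
  common:
    encryption:
      keyId: "alias/example-encryption-key"   # hypothetical placeholder
  # Remove the old override if present:
  # dpLlmproxyService:
  #   kmsKeyId: "alias/hh-provider-secrets"
  ```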

  ### v0.98.1 - February 11, 2026

  * Added `common.extraLabels` for custom governance/compliance labels on all Kubernetes resources (control-plane, data-plane, shared dependencies)
  * Added `common.observability.otel.exporterProtocol` (default: `"grpc"`) for OTLP exporter protocol configuration in data-plane
  * Added `dpLlmproxyService.kmsKeyId` (default: `"alias/hh-provider-secrets"`) for AWS KMS encryption of LLM provider secrets
  * Removed duplicate `common.observability` block in data-plane services values
  * Fixed Next.js cache permission errors in cp-frontend-service with `nextjs-cache` emptyDir volume
  * Fixed OTEL service name in cp-frontend-service (was hardcoded to `cp-controller-service`)
  * Added custom CA certificate support (`SSL_CERT_FILE`, `REQUESTS_CA_BUNDLE`) for dp-llmproxy-service and dp-pythonmetric-service
  * Added `DP_DATABASE_URL` env var and KMS config to dp-llmproxy-service
  * Fixed ClickHouse logging configuration (moved to `config.d/99-logger.xml` with `replace="1"`)
  * **Action Required**: Set `common.extraLabels` if your organization requires specific labels on all resources
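
  For example, governance labels and the OTLP exporter protocol could be set as sketched below. The label keys and values are hypothetical; `exporterProtocol` is shown at its documented default and applies to the data-plane values.

  ```yaml
  # data-plane/services/values.yaml (illustrative)
  common:
    extraLabels:
      cost-center: "cc-1234"     # hypothetical label
      compliance: "soc2"         # hypothetical label
    observability:
      otel:
        exporterProtocol: "grpc" # default
  ```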
</Update>

<Update label="January 2026">
  ### v0.90.17 - January 12, 2026

  * Added kube-prometheus-stack monitoring for both control-plane and data-plane (Prometheus, Grafana, Alertmanager with 30-day retention)
  * Added Tempo for distributed tracing in both control-plane and data-plane
  * Added Loki and Promtail for centralized log aggregation in control-plane
  * Added legacy `nats-old` chart for backward compatibility
  * Added Datadog integration support for OTEL collectors (disabled by default, set `datadog.enabled: true`)
  * Added `common.tls.caCerts` dictionary for custom root CA certificates in data-plane
  * Added `common.controlPlane.apiPublicUrl` for data-plane to call control-plane API
  * Added resource limits for dp-llmproxy-service and dp-pythonmetric-service (500m/512Mi requests, 1000m/1Gi limits)
  * Added persistent storage for ClickHouse Keeper (`storage.enabled: true`, `storage.size: "10Gi"`)
  * Updated ClickHouse Keeper image to `altinity/clickhouse-keeper:25.3.6.10034.altinitystable-alpine`
  * Simplified ClickHouse Operator values from 903 lines to 29 lines
  * Updated ExternalSecret API version from `v1beta1` to `v1` (requires External Secrets Operator 0.9.0+)
  * Moved OTEL collector `nodeSelector`/`affinity`/`tolerations` under `opentelemetry-collector` key
  * **Action Required**: Set `common.controlPlane.apiPublicUrl` to your control-plane API endpoint
  * **Action Required**: Ensure ClickHouse Keeper persistent storage is enabled for production
  * **Action Required**: Update External Secrets Operator to 0.9.0+ if not already
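
  A minimal sketch of the first two required settings follows. The URL is a placeholder, and the notes do not name the values file that holds the ClickHouse Keeper `storage` keys, so that placement is an assumption to verify against your chart.

  ```yaml
  # data-plane/services/values.yaml
  common:
    controlPlane:
      apiPublicUrl: "https://cp-api.example.com"   # placeholder URL
  ---
  # ClickHouse Keeper values (file location not stated in the notes)
  storage:
    enabled: true
    size: "10Gi"
  ```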
</Update>

<Update label="December 2025">
  ### December 2025

  * Removed OpenUnison authentication infrastructure (all charts, operators, CRDs)
  * Removed Nginx ingress infrastructure
  * Added NATS infrastructure for data-plane with independent cluster deployment (3 replicas, JetStream, PDB)
  * Disabled S3 DLQ and disk spool in writer service (`cpWriterService.dlq.enabled: false`)
  * Added `frontendIngress.alb.annotations` for custom ALB annotations
  * Removed PVC functionality from cp-writer-service
  * Changed cp-frontend-service `NEXTJS_PORT` env var to `PORT`
  * Added auth config env vars (`AUTH_ISSUER_DOMAIN`, `AUTH_CLIENT_ID`, `AUTH_CLIENT_SECRET`) to cp-frontend-service
  * Added NATS connection settings for data-plane services (dp-evaluation-service, dp-ingestion-service)
  * Enabled Redis authentication in control-plane (`auth: true`, `existingSecret: redis-secrets`)
  * Removed gRPC ingress from data-plane services
  * Switched from NLB to ALB for both control-plane and data-plane ingress
  * Added NATS HA streams configuration with configurable replicas
  * Added `common.dataPlane.dpPublicUrl` and `common.controlPlane.frontendPublicUrl` for cross-plane communication
  * Added Prometheus monitoring for NATS (exporter on port 7777) and ClickHouse (built-in on port 9363)
  * Added Redis authentication for data-plane (`auth: true`, `existingSecret: redis-secrets`)
  * Fixed Redis PDB in data-plane (removed invalid `enabled` field)
  * **Action Required**: Remove any NLB-related values overrides and switch to ALB configuration
  * **Action Required**: Remove any OpenUnison or Beekeeper-related overrides from values files
  * **Action Required**: Configure auth secrets in AWS Secrets Manager with `client-secret` and `cp-jwt-private-key`
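
  Several of the values introduced above could be set as sketched here. The URLs are placeholders, and the file in which each key belongs is an assumption, since the notes do not state it. The annotation shown is a standard AWS Load Balancer Controller annotation, used only as an example.

  ```yaml
  # Illustrative overrides; confirm the target values file for each key
  common:
    dataPlane:
      dpPublicUrl: "https://dp.example.com"          # placeholder URL
    controlPlane:
      frontendPublicUrl: "https://app.example.com"   # placeholder URL

  cpWriterService:
    dlq:
      enabled: false   # S3 DLQ and disk spool disabled

  frontendIngress:
    alb:
      annotations:
        alb.ingress.kubernetes.io/scheme: internet-facing   # example annotation
  ```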
</Update>
