TL;DR: In late Aug–Sep 2025, Bitnami (Broadcom) shifted most free images off docker.io/bitnami, introduced a latest-only, dev-intended “bitnamisecure” subset, archived versioned tags to docker.io/bitnamilegacy (no updates), ran rolling brownouts of popular images, and said their OCI Helm charts on Docker Hub would stop receiving updates (except for the tiny free subset). Result: lots of teams saw pull failures and surprise drift, especially for core bits like kubectl, ExternalDNS, PostgreSQL; some Helm charts still referenced images that went missing mid-migration.
What changed (and when)
Timeline. Bitnami announced the change for 28 Aug 2025, then postponed deletion of the public catalog to 29 Sep 2025, running three 24-hour brownouts to “raise awareness.” Brownout sets explicitly included external-dns (Aug 28) and kubectl, redis, postgresql, mongodb (Sep 17). Tags were later restored, except very old distro bases.
Free tier becomes “bitnamisecure/…”. Images are available only as latest and are “intended for development” (their wording). No version matrix.
Legacy archive. Versioned tags moved to docker.io/bitnamilegacy—no updates, no support; meant only as a temporary bridge.
Charts. Source code stays on GitHub, but OCI charts on Docker Hub stop receiving updates (except the small free subset) and won’t work out-of-the-box unless you override image repos. Bitnami’s own FAQ shows helm upgrade … --set image.repository=bitnamilegacy/... as a short-term band-aid (expanded below).
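If you are stuck on a Bitnami chart mid-migration, that band-aid expands to something like this. The release name is a placeholder and the exact value paths vary per chart and version, so check helm show values for every image block (init containers and metrics sidecars included); recent chart versions may also require the global.security.allowInsecureImages flag before they accept substituted repositories:
helm upgrade my-postgres oci://registry-1.docker.io/bitnamicharts/postgresql \
  --set global.security.allowInsecureImages=true \
  --set image.repository=bitnamilegacy/postgresql \
  --set volumePermissions.image.repository=bitnamilegacy/os-shell \
  --set metrics.image.repository=bitnamilegacy/postgres-exporter
Treat this strictly as a bridge; bitnamilegacy receives no updates.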
That mix of latest-only tags, brownouts, and chart defaults still pointing at moved or blocked images is why so many clusters copped it, especially anything depending on kubectl sidecars/hooks, ExternalDNS, or PostgreSQL images.
Why “latest-only, dev-intended” breaks production hygiene
Production needs immutability and pinning. “Latest” is mutable and can introduce breaking changes or CVE regressions without your staging gates ever seeing them. Bitnami explicitly positions these bitnamisecure/* freebies as development-only; if you need versions, you’re pointed to a paid catalog. That alone makes the free images not fit for prod, regardless of hardening claims.
How clusters actually broke
Brownouts removed popular images for 24h windows. If your charts/Jobs still pulled from docker.io/bitnami, pods simply couldn’t pull. Next reconciliation loop? ErrImagePull / ImagePullBackOff.
Chart/image mismatch. OCI charts remain published but aren’t updated to point at the new repos; unless you override every image.repository (and sometimes initContainer/metrics sidecars), you deploy a chart that references unavailable images. Bitnami’s own example shows how many fields you might need to override in something like PostgreSQL.
kubectl images. Lots of ops charts use a tiny kubectl image for hooks or jobs. When bitnami/kubectl went dark during brownouts, those jobs failed. Upstream alternatives exist (see below).
Better defaults for core components (ditch the vendor lock)
Wherever possible, move back upstream for the chart and use official/community images:
Velero – Upstream chart (VMware Tanzu Helm repo on Artifact Hub) and upstream images (pin).
kubectl – Prefer upstream sources: registry.k8s.io hosts Kubernetes container images, and several maintained images provide kubectl (or use distro images such as alpine/kubectl or rancher/kubectl if they meet your standards; pin exact versions either way, see the Job sketch below).
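Whatever image you settle on, mirror it into your own registry and pin it. A minimal sketch of a hook-style Job with a pinned, mirrored kubectl image (registry path, tag, service account and command are all placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: kubectl-hook
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: kubectl-hook   # hypothetical SA with only the RBAC this hook needs
      containers:
        - name: kubectl
          # mirrored and pinned; never float on latest from a public registry
          image: registry.example.com/mirror/kubectl:1.31.2
          command: ["kubectl", "rollout", "status", "deployment/my-app", "-n", "my-namespace"]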
For stateful services:
PostgreSQL – Operators such as CloudNativePG (CNCF project). Alternatives include commercial operators; or, if you stick with straight images, use the official postgres image and manage via your own Helm/Kustomize (see the minimal CloudNativePG sketch after this list).
MongoDB – Percona Operator for MongoDB (open-source) is a strong, widely used option.
Redis – Consider the official redis image (or valkey where appropriate), plus a community operator if you need HA/cluster features; evaluate operator maturity and open issues for your SLA needs. (Context from Bitnami’s lists shows Redis/Valkey were part of the brownout sets.)
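For the PostgreSQL option above, a minimal CloudNativePG Cluster looks roughly like this (name, storage size and the image tag are illustrative; pin a version you’ve validated and ideally mirror the image into your own registry):
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-main
spec:
  instances: 3
  # CNPG's own PostgreSQL image; pin an exact tag (or digest) you've tested
  imageName: ghcr.io/cloudnative-pg/postgresql:16.4
  storage:
    size: 20Gi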
Questions Bitnami should answer publicly
Why ship a dev-only latest-only free tier for components that underpin production clusters, without a long freeze window and frictionless migration for chart defaults? (Their Docker Hub pages literally say latest-only and dev-intended.)
Why brownouts of ubiquitous infra images (external-dns, kubectl, postgresql) during the migration window, increasing blast radius for unsuspecting teams?
Why leave OCI charts published without updating them to sane defaults (or at least yanking them), so that new installs reference unavailable registries by default?
Bitnami’s own pitch: “Gain confidence, control and visibility of your software supply chain security with production-ready open source software delivered continuously in hardened images, with minimal CVEs and transparency you can trust.”
We have lost confidence in your software supply chain.
TL;DR: Pin versions, set sane resources, respect system-node taints, make Gatekeeper happy, don’t double-encode secrets, and mirror images (never pull from public registries and blindly trust them).
Works great on AKS, EKS, GKE — examples below use AKS.
The default DynaKube template that Dynatrace provides will probably not work in the real world: you have zero trust, Calico network policies, OPA Gatekeeper, and perhaps some system-pool taints.
Quick checks (healthy install):
dynatrace-operator Deployment is Ready
2x dynatrace-webhook pods
dynatrace-oneagent-csi-driver DaemonSet on every node (incl. system)
OneAgent pods per node (incl. system)
1x ActiveGate StatefulSet ready
Optional OTEL collector running if you enabled it
kubectl get dynakube
NAME               APIURL                               STATUS    AGE
xxx-prd-xxxxxxxx   https://xxx.live.dynatrace.com/api   Running   13d
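# Operator, webhook, ActiveGate and OTEL collector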
kubectl -n dynatrace get deploy,sts
# CSI & OneAgent on all nodes
kubectl -n dynatrace get ds
# Dynakube CR status
kubectl -n dynatrace get dynakube -o wide
# RBAC sanity for k8s monitoring
kubectl auth can-i list dynakubes.dynatrace.com \
--as=system:serviceaccount:dynatrace:dynatrace-kubernetes-monitoring --all-namespaces
# deploy,sts output
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/dynatrace-operator   1/1     1            1           232d
deployment.apps/dynatrace-webhook    2/2     2            2           13d

NAME                                                 READY   AGE
statefulset.apps/xxx-prd-xxxxxxxxxxx-activegate      1/1     13d
statefulset.apps/xxx-prd-xxxxxxxxxxx-otel-collector  1/1     13d

# ds output
NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
xxx-prd-xxxxxxxxxxx-oneagent    9         9         9       9            9           <none>          13d
dynatrace-oneagent-csi-driver   9         9         9       9            9           <none>          13d

# dynakube output
NAME                  APIURL                               STATUS    AGE
xxx-prd-xxxxxxxxxxx   https://xxx.live.dynatrace.com/api   Running   13d

# auth can-i output
yes
Here are field-tested tips to keep Dynatrace humming on Kubernetes without fighting OPA Gatekeeper, seccomp, or AKS quirks.
1) Start with a clean Dynakube spec (and pin your versions)
Pin your operator chart/image and treat upgrades as real change (PRs, changelog, Argo sync-waves). A lean cloudNativeFullStack baseline that plays nicely with Gatekeeper:
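A minimal sketch of that baseline, assuming a recent operator serving the v1beta3 API. The exact field layout varies by CRD version, so diff it against kubectl explain dynakube.spec before applying, and treat names and the apiUrl as placeholders:
apiVersion: dynatrace.com/v1beta3          # match the API version your operator serves
kind: DynaKube
metadata:
  name: xxx-prd-example
  namespace: dynatrace
  annotations:
    feature.dynatrace.com/init-container-seccomp-profile: "true"   # keeps PSA/Gatekeeper happy (see section 4)
spec:
  apiUrl: https://xxx.live.dynatrace.com/api
  oneAgent:
    cloudNativeFullStack:
      tolerations:
        - key: CriticalAddonsOnly          # AKS system pools; adjust to your taints
          operator: Exists
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      oneAgentResources:                   # field name per recent CRDs; verify with kubectl explain
        requests:
          cpu: 100m
          memory: 512Mi
        limits:
          cpu: 300m
          memory: 1.5Gi
  activeGate:
    capabilities:
      - kubernetes-monitoring
      - routing
      - dynatrace-api
    resources:
      requests:
        cpu: 500m
        memory: 1.5Gi
      limits:
        cpu: 1000m
        memory: 1.5Gi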
Why this works: it respects control-plane taints, adds the CriticalAddonsOnly toleration for system pools, sets reasonable resource bounds, and preps you for GitOps.
2) System node pools are sacred — add the toleration
If your CSI Driver or OneAgent skips system nodes, your visibility and injection can be patchy. Make sure you’ve got:
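Assuming standard AKS system-pool taints, the tolerations on the OneAgent spec (and, where your chart supports it, the CSI driver) look like this; swap in whatever taints kubectl describe nodes shows on your system pool:
tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule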
Your taints might be different, so check what taints you have on your system pools. This is the difference between “almost there” and “golden”.
3) Resource requests that won’t sandbag the cluster
OneAgent: requests: cpu 100m / mem 512Mi and limits: cpu 300m / mem 1.5Gi are a good starting point for mixed workloads.
ActiveGate: requests: 500m / 1.5Gi, limits: 1000m / 1.5Gi. Tune off SLOs and node shapes; don’t be shy to profile and trim.
4) Make Gatekeeper your mate (OPA policies that help, not hinder)
Enforce the seccomp hint on DynaKube CRs (so the operator sets profiles on init containers and your PSA/Gatekeeper policies stay green).
ConstraintTemplate (checks DynaKube annotations):
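A sketch of such a template. The template/constraint names and message are mine; the rego simply rejects DynaKube objects that don’t carry the seccomp annotation:
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: dynakuberequiresseccomp
spec:
  crd:
    spec:
      names:
        kind: DynaKubeRequiresSeccomp
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package dynakuberequiresseccomp

        violation[{"msg": msg}] {
          input.review.kind.kind == "DynaKube"
          ann := object.get(input.review.object.metadata, "annotations", {})
          ann["feature.dynatrace.com/init-container-seccomp-profile"] != "true"
          msg := "DynaKube must set feature.dynatrace.com/init-container-seccomp-profile: \"true\""
        }
Bind it with a Constraint of kind DynaKubeRequiresSeccomp whose match covers apiGroups ["dynatrace.com"] and kinds ["DynaKube"].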
5) Secrets: avoid the dreaded double-encode (akv2k8s tip)
Kubernetes Secret.data is base64-encoded on the wire, but tools like akv2k8s can feed you values that are already base64, which leaves the token double-encoded. If you use akv2k8s, transform the output so the raw value lands in the secret (sketch below).
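A sketch of the akv2k8s resource that syncs a Dynatrace API token into the secret the operator reads. Names, vault and keys are placeholders; the end state you need is a secret whose apiToken value is the raw token with no extra base64 layer, so either store the raw token in Key Vault or use your akv2k8s version’s transform support (check its docs for the exact syntax):
apiVersion: spv.no/v2beta1
kind: AzureKeyVaultSecret
metadata:
  name: dynatrace-api-token
  namespace: dynatrace
spec:
  vault:
    name: my-keyvault                # placeholder Key Vault name
    object:
      name: dynatrace-api-token      # placeholder secret name in Key Vault
      type: secret
  output:
    secret:
      name: xxx-prd-example          # the tokens secret your DynaKube references
      dataKey: apiToken              # the operator expects apiToken / dataIngestToken keys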
This ensures Dynatrace can read the Kubernetes Opaque secret as is, with no extra layer of base64 encoding on the secret.
6) Mirror images to your registry (and pin)
Air-gapping or just speeding up pulls? Mirror dynatrace-operator, activegate, dynatrace-otel-collector into your ACR/ECR/GCR and reference them via the Dynakube templates.*.imageRef blocks or Helm values. GitOps + private registry = fewer surprises.
We use ACR Cache.
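A sketch of a one-off mirror into ACR with az acr import (registry name, repository and tag are placeholders; an ACR cache rule is the set-and-forget alternative):
az acr import --name myregistry \
  --source docker.io/dynatrace/dynatrace-operator:v1.7.0 \
  --image dynatrace/dynatrace-operator:v1.7.0
Repeat per component and tag your DynaKube references, using the source registries listed in the Dynatrace docs, then point the imageRef blocks or Helm values at your registry.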
7) RBAC: fix the “list dynakubes permission is missing” warning
If you see that warning in the UI, verify the service account:
kubectl auth can-i list dynakubes.dynatrace.com \
  --as=system:serviceaccount:dynatrace:dynatrace-kubernetes-monitoring --all-namespaces
If “no”, ensure the chart installed/updated the ClusterRole and ClusterRoleBinding that grant list/watch/get on dynakubes.dynatrace.com. Sometimes upgrading the operator or re-syncing RBAC via Helm/Argo cleans it up.
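If you have to patch it by hand, the missing pieces look roughly like this (the chart normally creates equivalents; the names below are illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dynatrace-kubernetes-monitoring-dynakubes
rules:
  - apiGroups: ["dynatrace.com"]
    resources: ["dynakubes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dynatrace-kubernetes-monitoring-dynakubes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dynatrace-kubernetes-monitoring-dynakubes
subjects:
  - kind: ServiceAccount
    name: dynatrace-kubernetes-monitoring
    namespace: dynatrace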
When you install the Dynatrace Operator, you’ll see pods named something like dynatrace-webhook-xxxxx. They back one or more admission webhook configurations. In practice they do three big jobs:
Mutating Pods for OneAgent injection
Adds init containers / volume mounts / env vars so your app Pods load the OneAgent bits that come from the CSI driver.
Ensures the right binaries and libraries are available (e.g., via mounted volumes) and the process gets the proper preload/agent settings.
Respects opt-in/opt-out annotations/labels on namespaces and Pods (e.g. dynatrace.com/inject: "false" to skip a Pod; see the example after this list).
Can also add Dynatrace metadata enrichment env/labels so the platform sees k8s context (workload, namespace, node, etc.).
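A tiny sketch of the opt-out, using the key mentioned above (whether it goes on the Pod or the namespace, and as a label or annotation, depends on your operator version and injection mode, so verify against your operator docs):
apiVersion: v1
kind: Pod
metadata:
  name: no-injection-example
  annotations:
    dynatrace.com/inject: "false"   # skip OneAgent injection for this Pod
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0.0   # placeholder image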
Validating Dynatrace CRs (like DynaKube)
Schema and consistency checks: catches bad combinations (e.g., missing fields, wrong mode), so you don’t admit a broken config.
Helps avoid partial/failed rollouts by rejecting misconfigured specs early.
Hardening/compatibility tweaks
With certain features enabled, the mutating webhook helps ensure injected init containers comply with cluster policies (e.g., seccomp, PSA/PSS).
That’s why we recommend the annotation you’ve been using: feature.dynatrace.com/init-container-seccomp-profile: "true". It keeps Gatekeeper/PSA happy when it inspects the injected bits.
Why two dynatrace-webhook pods?
High availability for admission traffic. If one goes down, the other still serves the API server’s webhook calls.
How this ties into Gatekeeper/PSA
Gatekeeper (OPA) also uses validating admission.
The Dynatrace mutating webhook will first shape the Pod (add mounts/env/init).
Gatekeeper then validates the final Pod spec.
If you’re enforcing “must have seccomp/resources,” ensure Dynatrace’s injected init/sidecar also satisfies those rules (hence that seccomp annotation and resource limits you’ve set).
Dynatrace ActiveGate
A Dynatrace ActiveGate acts as a secure proxy between OneAgents and the Dynatrace cluster, or between OneAgents and other ActiveGates closer to the Dynatrace cluster. It establishes a Dynatrace presence in your local network and reduces your interaction with Dynatrace to a single, locally available point. Besides convenience, this optimizes traffic volume, reduces network complexity and cost, and keeps sealed networks secure.
The docs on ActiveGate version compatibility with the DynaKube CR are not yet mature. Ensure the following:
With Dynatrace Operator 1.7, the v1beta1 and v1beta2 API versions for the DynaKube custom resource were removed.
ActiveGates up to and including version 1.323 used to call the v1beta1 endpoint. Starting from ActiveGate 1.325, the DynaKube endpoint changed to v1beta3. Ensure your ActiveGate is up to date with the latest version.
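A quick way to see which DynaKube API versions your cluster currently serves (and which one is used for storage):
kubectl get crd dynakubes.dynatrace.com \
  -o jsonpath='{range .spec.versions[*]}{.name}{" served="}{.served}{" storage="}{.storage}{"\n"}{end}'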
As part of our ongoing platform reliability work, we’ve introduced explicit CPU and memory requests/limits for all Dynatrace components running on AKS.
🧩 Why it matters
Previously, the OneAgent and ActiveGate pods relied on Kubernetes’ default scheduling behaviour. This meant:
No guaranteed CPU/memory allocation → possible throttling or eviction during cluster load spikes.
Risk of noisy-neighbour effects on shared nodes.
Unpredictable autoscaling signals and Dynatrace performance fluctuations.
Setting requests and limits gives the scheduler clear boundaries:
Requests = guaranteed resources for stable operation
Limits = hard ceiling to prevent runaway usage
Helps Dynatrace collect telemetry without starving app workloads
These values were tuned from observed averages across DEV, UAT and PROD clusters. They provide a safe baseline—enough headroom for spikes while keeping node utilisation predictable.