The Silent Killer: How Enabling OIDC on AKS Can Break Your Apps (Even If You Don’t Use Workload Identity Yet)

So, you’re doing the “right thing.” You’re preparing your AKS cluster for the future by enabling the OIDC Issuer and Workload Identity. You haven’t even migrated your apps to use Federated Identity yet—you’re still rocking the classic Azure Pod Identity (or just standard Service Accounts). No harm, no foul, right?

Wrong.

As soon as you flip the switch on OIDC, Kubernetes changes the fundamental way it treats Service Account tokens. If you have long-running batch jobs (like Airflow workers, Spark jobs, or long-polling sensors), you might be walking into a 401 Unauthorized trap.


The “Gotcha”: Token Lifespan

Before OIDC enablement, your pods likely used tokens that were, in practice, long-lived (often valid for ~1 year, or with no expiry at all for legacy Secret-based tokens). They were the “set it and forget it” of the auth world.


How do you know if you are using OIDC tokens? Inspect the token mounted in your containers at:
/var/run/secrets/kubernetes.io/serviceaccount/token

If the audience (the "aud" claim) contains xyz.oic.<env>-aks.azure.com, then it’s the OIDC token, even though you have not implemented Workload Identity yet.

"aud": [
  "https://australiaeast.oic.prod-aks.azure.com/<tenantguid>/<guid>/",
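
To run this check without pasting a token into an online decoder, you can decode the payload locally. Here is a minimal Python sketch, under a few assumptions: it only base64-decodes the middle segment of the JWT and does not verify the signature, and the helper names (decode_jwt_payload, uses_oidc_issuer_audience, summarize_token) are made up for illustration. The substring check simply mirrors the audience pattern described above.

```python
import base64
import json

TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def uses_oidc_issuer_audience(claims: dict) -> bool:
    """Heuristic from the text: an audience like xyz.oic.<env>-aks.azure.com."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return any(".oic." in a and "-aks.azure.com" in a for a in aud)


def summarize_token(path: str = TOKEN_PATH) -> dict:
    """Read the mounted token and report the fields that matter here."""
    with open(path) as f:
        claims = decode_jwt_payload(f.read())
    return {
        "aud": claims.get("aud"),
        # Lifetime in seconds, if both standard claims are present.
        "lifetime_seconds": claims.get("exp", 0) - claims.get("iat", 0),
        "oidc_issuer_audience": uses_oidc_issuer_audience(claims),
    }
```

Calling print(summarize_token()) from inside a container prints the audience and the token lifetime in seconds, which is usually enough to tell which kind of token you have.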

The Moment You Enable OIDC/Workload Identity: AKS shifts to Bound Projected Tokens. These are significantly more secure but come with a strict catch: The default expiration is 1 hour (3600 seconds).

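If your app reads the token once at startup and never re-reads it, that copy will expire 60 minutes later. The kubelet does refresh the token file on disk before it expires, but a client that cached the old value never notices. For a 4-hour batch job or a persistent sensor, this means your app will work perfectly… until it suddenly doesn’t.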
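
The arithmetic is simple, but it helps to see it. This rough Python sketch compares a job's expected runtime against the exp claim of the token it cached at startup. The function name and the sample numbers are illustrative only, and it assumes the token carries the standard exp claim as a Unix timestamp.

```python
def job_outlives_token(claims: dict, started_at: float, job_seconds: float) -> bool:
    """True if a job that cached its token at `started_at` runs past `exp`."""
    return started_at + job_seconds > claims["exp"]


# Illustrative numbers: a token issued at t=1000 with the 3600-second default.
claims = {"iat": 1000, "exp": 1000 + 3600}

ten_minute_dev_test = job_outlives_token(claims, started_at=1000, job_seconds=10 * 60)
four_hour_batch_job = job_outlives_token(claims, started_at=1000, job_seconds=4 * 3600)
```

Here ten_minute_dev_test is False and four_hour_batch_job is True, which is exactly why the problem hides in short tests.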


Why It’s Sneaky

  • Azure Identity Still Works: Your connection to Key Vault or Storage via Pod Identity stays up.
  • The K8s API Fails: Only the calls within the cluster (like checking the status of another pod or a SparkApplication CRD) start throwing 401s.
  • It’s a Time Bomb: Everything looks fine in your 10-minute dev test. The failure only triggers in Production, when the job passes the 60-minute mark and the token expires mid-process.

The Quick Fix: The 24-Hour Band-Aid

If you aren’t ready to refactor your code to handle token rotation (which is the “real” fix), you can manually override the token lifespan using a Projected Volume in your Deployment or StatefulSet.

By mounting a custom token, you can extend that 1-hour window to something more batch-friendly, like 24 hours.

The Workaround YAML

You need to disable the automatic token mount and provide your own via volumes and volumeMounts.

# 1. Disable the default automount
# --- ServiceAccount ---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: your-app-sa   # must match serviceAccountName below
automountServiceAccountToken: false
---
# --- Deployment/StatefulSet (excerpt: the pod spec under spec.template) ---
spec:
  automountServiceAccountToken: false
  serviceAccountName: your-app-sa
  containers:
    - name: my-app
      volumeMounts:
        # Mount at the default path so clients find the token where they expect it
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: custom-token
          readOnly: true
  # 2. Project a token with a longer expiration
  volumes:
    - name: custom-token
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              # Match this to your cluster's OIDC issuer audience
              audience: https://australiaeast.oic.prod-aks.azure.com/YOUR-GUID/
              expirationSeconds: 86400 # 24 Hours
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace

The Long-Term Play

While the 24-hour token buys you time, it’s a temporary safety net. Microsoft and the Kubernetes community are pushing for shorter token lifespans (AKS 1.33+ will likely enforce this more strictly).

Your to-do list:

  1. Upgrade your SDKs: Modern Kubernetes clients (and Airflow providers) have built-in logic to reload tokens from the disk when they change.
  2. Avoid Persistent Clients: Instead of one long-lived client object, initialize the client inside your retry loops.
  3. Go All In: Finish the migration to Azure Workload Identity and move away from Pod Identity entirely.
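
To make items 1 and 2 concrete, here is a minimal Python sketch of the idea: re-read the token from disk instead of holding on to the copy you read at startup, and drop the cached value when a call comes back with a 401. This is not how any particular Kubernetes client implements it. The class and function names are hypothetical, and request stands in for whatever callable your code uses to hit the API and return an HTTP status code.

```python
import time
from pathlib import Path
from typing import Callable, Optional

TOKEN_PATH = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


class FileTokenSource:
    """Re-reads the projected token from disk instead of caching it forever."""

    def __init__(
        self,
        path: Path = TOKEN_PATH,
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._path = path
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = ""
        self._read_at: Optional[float] = None

    def token(self) -> str:
        """Return the token, re-reading the file once the cache is too old."""
        now = self._clock()
        if self._read_at is None or now - self._read_at >= self._max_age:
            self._token = self._path.read_text().strip()
            self._read_at = now
        return self._token

    def invalidate(self) -> None:
        """Force the next token() call to hit the disk, e.g. after a 401."""
        self._read_at = None


def call_with_fresh_token(
    source: FileTokenSource,
    request: Callable[[str], int],  # hypothetical: takes a token, returns a status
    attempts: int = 2,
) -> Optional[int]:
    """Run request(token); on a 401, drop the cached token and try again."""
    status = None
    for _ in range(attempts):
        status = request(source.token())
        if status != 401:
            break
        source.invalidate()
    return status
```

The short max_age_seconds default is a judgment call: reading a small file once a minute is cheap, and it keeps the cached copy well inside even a 1-hour token's lifetime.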

Don’t let a security “improvement” become your next P1 incident. Check your batch job durations today!

TIP: Use the TokenReview API to test your tokens once you switch a cluster to OIDC.
https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-review-v1/
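
As a rough illustration of what that looks like, this Python sketch builds a TokenReview request body and reads the result. It only constructs and inspects plain dictionaries. Actually submitting the body means POSTing it to /apis/authentication.k8s.io/v1/tokenreviews with credentials that are allowed to create TokenReviews, which is left out here. The helper names are made up for this example.

```python
from typing import List, Optional


def build_token_review(token: str, audiences: Optional[List[str]] = None) -> dict:
    """Build the JSON body for a TokenReview request."""
    spec: dict = {"token": token}
    if audiences:
        # Optional: ask the API server to check the token against these audiences.
        spec["audiences"] = list(audiences)
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": spec,
    }


def is_authenticated(review_response: dict) -> bool:
    """True only if the API server reported status.authenticated == true."""
    return bool(review_response.get("status", {}).get("authenticated", False))


def review_error(review_response: dict) -> Optional[str]:
    """Return status.error from the response, if the server set one."""
    return review_response.get("status", {}).get("error")
```

A token that has expired mid-job should come back with authenticated missing or false, which makes this a handy check to run against a token copied out of a long-running pod.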

See:
https://learn.microsoft.com/en-us/azure/aks/workload-identity-migrate-from-pod-identity (note: this article does not warn you that flipping the OIDC switch affects token behaviour)

https://kubernetes.io/docs/concepts/storage/projected-volumes/

https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/
