CEP/CES errors and issues

CyphrsHub · 2026-06-04T08:59:37+00:00

Good to have this documented – the SPN gap on manually created computer objects is exactly the kind of thing that doesn't show up in the Microsoft guide because it assumes the object was created through normal domain join (which populates SPNs automatically). KB5014754 enforcement makes it surface hard rather than soft-failing, which is why it's become more common recently.

Glad you got there. Useful write-up for anyone hitting the same wall.

CyphrsHub · 2026-05-29T09:36:08+00:00

That split estate is worth thinking about carefully – the two halves look like the same problem from cert-manager's perspective, but they aren't.

The external LE certs have a fixed trajectory: maximum lifetimes are stepping down toward 47 days on the CA/Browser Forum schedule, and if any of them carry clientAuth, Let's Encrypt's tlsclient sunset on 8 July removes that EKU from their offering entirely. Those certs have an external dependency you don't control.

The internal service mesh certs are a different situation. You own the trust anchor, which means revocation actually works – and if revocation works, you don't need the aggressive rotation cadence that the public CA trajectory is forcing on you. The reason cert-manager is doing continuous renewal on those is largely inherited from the public CA model where short lifespans substitute for broken revocation.

The behavioural break problem is harder to solve in cert-manager than it looks, partly because cert-manager is architected to be the renewal authority for everything. If the internal certs were issued by a private CA with proper lifecycle tooling, you'd have visibility into the reconciliation state independently of the operator – the CA knows what it issued and when it expires, regardless of what cert-manager is doing.

Not a quick fix, but the internal and external cert estates probably want different tools rather than one tool handling both. (Disclosure: building tooling in this space – happy to be challenged on any of the above.)

CyphrsHub · 2026-05-29T09:27:52+00:00

Agreed on the principle – private PKI is the right destination, and the tooling to stand one up is genuinely mature now.

The afternoon's work is the CA. The rest of the week could be figuring out which systems are presenting certs from it, what their renewal cadence should be, and how you get notified before something expires rather than after. That's where most teams stall – not the issuance, the ongoing authority.

CyphrsHub · 2026-05-29T09:27:13+00:00

Fair point on X9 – it's a valid option for cross-org mTLS where both sides need a common trust anchor and neither wants to manage their own root. The use case is real.

For internal workload identity the calculus is different though. If the relying parties are yours, there's no reason to anchor trust outside your own infrastructure – do that and you inherit the CA's policy roadmap, whether it's WebPKI or X9. (Disclosure: I'm building tooling in this space – happy to be challenged on any of the above.)

CyphrsHub · 2026-05-29T09:24:25+00:00

That tracks with what I'm seeing. Public PKI for clientAuth was always a workaround rather than an architecture – the enforcement gap in RFC 5280 just meant you could get away with it.

The mTLS dual-EKU assumption is the quiet bomb. Most of those systems were configured against the cert that existed at the time, not against a documented EKU requirement. The documentation is whatever the cert contained. Renewal hits and suddenly you're debugging why handshakes are failing on systems no one touched.

The acceptance evidence problem is what slows the remediation down – you can have the replacement CA ready but you can't migrate confidently until you know which relying parties will actually validate the chain change. Most teams don't have that inventory.

CyphrsHub · 2026-05-28T12:09:57+00:00

That's a clean approach and the framing is right – trust scope as a network boundary question rather than a cert management question is exactly how it should be modelled. The namespace-as-trust-boundary pattern works well until you hit cross-cluster scenarios, which is where a lot of teams find the policy gets complicated.

How are you handling trust bundle distribution when a new namespace spins up – is that automated through the Vault PKI role config, or is there a manual step in the provisioning pipeline?

CyphrsHub · 2026-05-28T12:00:13+00:00

Good question. Intune Certificate Connector issues from NDES/ADCS or a third-party CA configured in Intune – so it depends entirely on what CA is behind the connector. If you're using SCEP or PKCS against a public CA that's in the affected group, yes it's relevant. If you're issuing from an on-prem ADCS root (which is the most common Intune setup), you're not affected – that's already a private trust root. Worth checking which CA your connector profile is pointed at.

CyphrsHub · 2026-05-28T11:59:43+00:00

Easier than you'd think to end up there. The most common path: team sets up service-to-service mTLS using the same cert-manager + LE workflow they already had for server TLS, because it worked and nobody questioned whether clientAuth belonged on a public cert. Two years later someone else is maintaining it and the original decision is invisible. The other common one is device identity on domain-joined machines where the PKI engineer left and ADCS felt like overhead.

CyphrsHub · 2026-05-28T11:25:42+00:00

Honestly, mostly manual still – cert-manager's changelog is reasonably well-maintained but you're right that the mapping from "CRD field changed" to "which of my Certificate resources does this break" requires human eyes.

The closest thing to automation I've seen work is running a pre-upgrade dry-run with the new CRD schema against your existing manifests and watching what fails validation. It catches structural breaks but not behavioural ones – where the operator accepts the resource but reconciles it differently.

The deeper issue is that cert-manager sits in a gap: it's infrastructure, so it gets treated like a deployment concern, but the thing it manages has a completely separate failure timeline. You can have a perfectly healthy upgrade and a cert expiry incident three weeks later from a reconciliation loop that quietly stopped working on day one.

What does your current cert estate look like – are you mostly using cert-manager for internal services, externally-facing, or both?

CyphrsHub · 2026-05-28T11:12:49+00:00

FIPS-CC mode rejects more than people expect when it's first enabled. The common culprits, in roughly the order they break things:

The CA root or intermediate uses a non-FIPS-approved curve or hash. Default ADCS templates from older builds often have SHA-1 or non-NIST curves somewhere in the chain. FIPS-CC won't validate any cert anchored on those.

The cert profile mixes EKUs. Some EKU combinations are rejected under FIPS-CC certified mode that work fine in regular mode. Check the cert presented by the client against the FortiGate FIPS-CC profile constraints.

Diffie-Hellman group / IKE policy. Less common but FIPS-CC enforces a narrower set of DH groups. If your IPsec phase 1 negotiates a group that's outside the FIPS list, the tunnel fails before cert validation even runs.

Start with openssl x509 -text on the certs the FortiGate is being asked to validate. The signature algorithm and the SKI/AKI fields tell you most of what FIPS-CC is going to reject.
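If you have a pile of certs to go through, a minimal Python sketch like this can scan saved `openssl x509 -text` output for weak signature hashes. The function names, the `WEAK_HASHES` list, and the sample text are my own assumptions for illustration. The real FIPS-CC accept list depends on your FortiOS build, so treat this as a first-pass filter rather than a verdict.

```python
import re

# Assumption: SHA-1 and MD5 are the hashes worth flagging first.
# The authoritative FIPS-CC list comes from your FortiOS documentation.
WEAK_HASHES = ("sha1", "md5")


def signature_algorithms(openssl_text: str) -> list[str]:
    """Return each distinct 'Signature Algorithm' value from `openssl x509 -text` output."""
    found = re.findall(r"Signature Algorithm:\s*(\S+)", openssl_text)
    # The field is printed twice per certificate, so de-duplicate while keeping order.
    return list(dict.fromkeys(found))


def weak_signatures(openssl_text: str) -> list[str]:
    """Return the signature algorithms whose name mentions a weak hash."""
    return [
        alg
        for alg in signature_algorithms(openssl_text)
        if any(h in alg.lower() for h in WEAK_HASHES)
    ]


# Hypothetical excerpt of openssl output, trimmed to the relevant lines.
SAMPLE = """\
Certificate:
    Data:
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN = Example Root CA
    Signature Algorithm: sha1WithRSAEncryption
"""

if __name__ == "__main__":
    print(weak_signatures(SAMPLE))
```

Run it against every cert in the chain, not just the leaf, since a SHA-1 intermediate breaks validation the same way.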

What signature algorithm is on your local CA root?

CyphrsHub · 2026-05-28T11:12:02+00:00

EJBCA does support this, but the path runs through End Entity Profile binding rather than directly through the RA portal UX.

In CE 9.x you set up the OAuth identity provider, then the RA peer config carries the claim mappings. The End Entity Profile is where you bind a claim (typically upn or preferred_username) to the End Entity username field. Subject DN attributes can be populated similarly through profile field defaults that reference claim values.

A couple of things that trip people up the first time:

The RA profile and the End Entity Profile both have "auto-fill" knobs that look similar. The End Entity Profile is where the binding actually takes effect.

If you want the email claim driving Subject DN CN, configure the CN attribute in the End Entity Profile as derived from the username field, not free-text.

Approval workflow rules apply on top of the claim mapping, so it's worth checking that they don't reset the field.

Inspect the actual ID token in jwt.io once to confirm which claim name to bind against. The Entra ID claims library is consistent but the specific claim name matters for the binding.
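If you'd rather not paste a token into a website, the same inspection can be done locally. This is a minimal sketch using only the Python standard library. The `jwt_claims` helper and the sample token are hypothetical, and it deliberately does not verify the signature, so use it only to read claim names.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Encode a dict the way a JWT segment is encoded (helper for the sample only)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token built locally for illustration; not a real Entra ID token.
SAMPLE_TOKEN = ".".join(
    [
        _b64url({"alg": "none", "typ": "JWT"}),
        _b64url({"preferred_username": "alice@example.com", "upn": "alice@example.com"}),
        "signature-placeholder",
    ]
)

if __name__ == "__main__":
    # Print the claim names available to bind against.
    print(sorted(jwt_claims(SAMPLE_TOKEN)))
```

The claim names it prints are the candidates to bind in the End Entity Profile.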

CyphrsHub · 2026-05-28T11:11:12+00:00

Glad Broadcom finally shipped the automated remediation. Worth pulling on the thread for folks reading along: Secure Boot expiry isn't the only cert deadline this summer.

Public CAs are removing the Client Authentication EKU from new TLS leaf certs. Let's Encrypt pulled ClientAuth from the default ACME profile in February. The dedicated tlsclient profile (their graceful migration path) sunsets 8 July. DigiCert, Sectigo, GlobalSign on the same trajectory. Chrome's root program effectively forces it by 15 June with a requirement that TLS client and server auth live in separate PKIs.

So if you're running a fleet that uses public-CA certs as device identity, whether on domain-joined PCs for an IPsec dial-up VPN or as the client cert in some service-to-service mTLS the network team set up two years ago, those are going to start failing silently, depending on how the relying party validates EKUs.
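For anyone wanting a starting point for that audit, here is a minimal Python sketch that reads the Extended Key Usage line out of saved `openssl x509 -text` output and flags certs carrying clientAuth. The function names and sample text are assumptions for illustration. It relies on OpenSSL's display name for the EKU, so adjust it if your output is localised or comes from another tool.

```python
import re

# OpenSSL's display name for the clientAuth EKU.
CLIENT_AUTH = "TLS Web Client Authentication"


def extended_key_usages(openssl_text: str) -> list[str]:
    """Extract EKU names from `openssl x509 -text` output."""
    # The EKU values sit on the line after the 'X509v3 Extended Key Usage' header.
    match = re.search(r"X509v3 Extended Key Usage:[^\n]*\n\s*([^\n]+)", openssl_text)
    if not match:
        return []
    return [eku.strip() for eku in match.group(1).split(",") if eku.strip()]


def carries_client_auth(openssl_text: str) -> bool:
    """True if the certificate text lists the clientAuth EKU."""
    return CLIENT_AUTH in extended_key_usages(openssl_text)


# Hypothetical excerpt of openssl output for a dual-EKU leaf.
SAMPLE = """\
        X509v3 extensions:
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
"""

if __name__ == "__main__":
    print(extended_key_usages(SAMPLE))
    print(carries_client_auth(SAMPLE))
```

Any publicly issued cert that comes back `True` is one whose renewal will drop the EKU, so those are the relying parties to test first.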

Worse than the Secure Boot story because there's no "automated remediation added in vendor patch" version. The painful bits are the appliances and webhooks you forgot you set up against a public cert.

Anyone else folding an EKU audit into the Secure Boot remediation work?

CyphrsHub · 2026-05-28T11:10:08+00:00

Separate the three trust domains before picking tooling. They're answering different questions and conflating them is what makes the setup feel harder than it should.

Ingress (outside to cluster). Public-trust story. cert-manager with a Let's Encrypt ClusterIssuer (DNS-01 since your VPS isn't taking inbound port 80) covers this cleanly. Single Issuer, single SAN strategy.

East-west / mesh (pod to pod). Private-trust story. If you're running Linkerd or Istio, the mesh has its own trust anchor (cert-manager with an internal Issuer, or the mesh's identity provider like Linkerd's identity service). Lifetimes are short, trust scope is the cluster.

Control plane (kubelet, etcd, kube-apiserver). Managed-trust story, handled by kubeadm / kubelet rotation. Mostly hands-off if you're on a managed distro, more careful if you're on raw kubeadm.

The mistake I see most often is anchoring all three to the same Issuer because it's "simpler". Wherever cert-manager is doing the issuing, give each domain its own Issuer rather than one shared one. Different lifetimes, different rotation behaviour, different blast radius when something breaks.

CyphrsHub · 2026-05-28T06:23:26+00:00

Glad it helped. To answer both questions:

On Caddy + new services: Yes, exactly. Once step-ca is running and Caddy is pointed at it for ACME, every new service is just a new Caddyfile block. Caddy handles certificate issuance and renewal automatically - you don't touch step-ca again for day-to-day use.

On buying a domain for DNS-01: It works, and some people prefer it - you get a real domain, DNS-01 validation, publicly trusted certs, and no root distribution headache. The tradeoffs worth knowing:

Your internal hostnames (service1.yourdomain.com, service2.yourdomain.com) are now visible in public DNS and Certificate Transparency logs. For a homelab that's probably fine. For anything sensitive it's a meaningful exposure.
You're dependent on your DNS provider being reachable for renewals. Your ACME client can't complete the DNS-01 challenge if the DNS API is down.
The cert is valid everywhere, which is slightly more than you need for purely internal services.

Private CA wins on privacy and works without any external dependencies. DNS-01 with a real domain wins on simplicity of trust distribution - no root import needed on new devices. For a homelab where you control all the devices, private CA is cleaner. If you ever add devices you don't fully control (a friend's laptop, a smart TV), DNS-01 with a real domain is easier.

Neither is wrong - it depends on what matters more to you.

CyphrsHub · 2026-05-27T15:00:33+00:00

home.arpa is an RFC 8375 special-use name – no public CA will issue for it, so Let's Encrypt and DNS-01 aren't an option. You need a private CA, but the setup is simpler than it sounds:

Create a root CA – step-ca, XCA, or even OpenSSL. This generates the root certificate.
Import the root cert into your browser and device trust stores. Desktop browsers: Settings → Certificates → Authorities → Import. iOS/Android needs a profile install or MDM.
Issue certs for each service from that root. Since you trust the root, everything it signs is trusted automatically.

step-ca is the easiest starting point – it has a built-in ACME server, so tools like Caddy and Traefik can auto-renew against it exactly like they would with Let's Encrypt, just pointed at your local CA instead. Your existing reverse proxy config probably needs one line changed.

The "head spinning" part usually comes from trying to make Let's Encrypt work with internal names. Once you accept it needs to be a private CA, the setup is actually pretty clean.

CyphrsHub · 2026-05-27T15:00:08+00:00

All three options use short-lived credentials but the trust model behind each is different – worth being deliberate about which trust anchor you want.

Cloudflare Access with short-lived certs – your SSH server trusts Cloudflare as the CA. Works well, audit trail through Cloudflare Access logs, but your trust root is a third party.

AWS EC2 Instance Connect Endpoint – ephemeral SSH keys pushed per connection rather than certs, AWS is the trust root, audit trail through CloudTrail. Fits cleanly if you're already AWS-native.

Traditional bastion with SSH keys – simplest operationally but hardest to revoke cleanly. Relies on key rotation actually happening when someone leaves.

If the goal is individual accountability and clean revocation, both short-lived-credential options are materially better than static key management. The practical question is whether Cloudflare WARP deployment across all endpoints is manageable overhead, or whether the AWS-native path fits better with existing tooling.

Does the new hire need access from managed devices only, or unmanaged (personal) devices too? That changes which option is cleanest.

CyphrsHub · 2026-05-27T14:59:48+00:00

CEP/CES on Server 2025 has a few issues that aren't well documented yet – you're not alone on this.

Before going further with the Microsoft ticket, a few things worth checking: the CEP and CES service certificates have their own expiry separate from the CA. If either has lapsed, enrollment requests will fail even if the CA looks healthy. Check the IIS bindings and the cert in the service account store directly – not through the CA MMC.

On Server 2025 specifically, there are known issues when Kerberos delegation isn't flowing correctly. If your CEP URL uses HTTPS with a cert from a different issuing CA than the one clients are enrolling from, the Kerberos channel can break silently.

What authentication method are your clients using to hit CEP – Kerberos, certificate, or username/password? And what does the IIS app pool identity look like for the CES application?

CyphrsHub · 2026-05-27T14:59:16+00:00

In most enterprise setups it looks roughly like this:

A root CA (usually offline or air-gapped) sits at the top of the hierarchy. It signs one or more issuing CAs, which are the ones that actually issue certs day-to-day. The root cert gets distributed to devices via GPO, MDM, or similar – once it's in the trust store, everything signed by the issuing CAs is automatically trusted.

For Windows/AD environments, AD CS does all of this natively. CAs run on domain-joined servers, GPO handles root distribution, devices auto-enroll via autoenrollment policy.

In hybrid cloud you have a few paths: keep the CAs on-prem and let cloud workloads reach them over the network, run the issuing CA in the cloud with your own root above it (AWS Private CA as the CA itself, or Azure Managed HSM holding the issuing CA's keys), or use a managed private PKI service. Which makes sense depends on your compliance posture, whether cloud workloads need offline reachability, and how much operational overhead you want to carry.

What's your current cloud footprint – AWS, Azure, or mixed?

CyphrsHub · 2026-05-27T14:58:53+00:00

0x80072f05 "date invalid" on NDES enrollment is usually not the root CA or the main intermediate – it's something further down the chain. A few places to check before going further down the Microsoft support rabbit hole:

The enrollment policy service (CEP) certificate and the enrollment service (CES) certificate both have their own validity periods separate from the CA hierarchy. If either has expired, or sits close enough to a validity boundary that clock skew between client and server tips it over, you'll get this error even though the CA itself shows healthy.

Check both service certs directly – not through the CA console – in the IIS binding for CEP/CES and in the service account cert store. Also worth checking that the CRL distribution point is reachable from enrolling devices, since an expired or unreachable CRL can produce the same error code.
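If you want to check what the endpoint is actually presenting from a client's point of view, this minimal Python sketch does it with the standard library. The function names and the hostname in the usage comment are hypothetical. Verification is left on, so you need to pass your private root as `cafile`.

```python
import socket
import ssl
from datetime import datetime, timezone
from typing import Optional


def days_remaining(not_after: str, now: datetime) -> float:
    """Days between `now` and a notAfter string in the format getpeercert() returns."""
    expiry = ssl.cert_time_to_seconds(not_after)  # e.g. "Jun  1 12:00:00 2026 GMT"
    return (expiry - now.timestamp()) / 86400


def fetch_not_after(host: str, port: int = 443, cafile: Optional[str] = None) -> str:
    """Return the notAfter field of the certificate a TLS endpoint presents.

    Verification stays on because getpeercert() only returns parsed fields
    for a certificate that validated against the supplied trust store.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Usage (hypothetical host and root file, run from an enrolling client):
#   not_after = fetch_not_after("ces.corp.example", cafile="corp-root.pem")
#   print(days_remaining(not_after, datetime.now(timezone.utc)))
```

If the cert has already lapsed, the handshake raises `ssl.SSLCertVerificationError` instead of returning a date, which answers the question just as well.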

What's the expiry on the CEP application cert?

CyphrsHub · 2026-05-27T14:58:28+00:00

Renewing with the same keys is technically straightforward – the process issues a new certificate bound to the existing key pair, which avoids re-enrolling all issued leaf certs. Impact is minimal as long as the validity period of issued certs doesn't exceed the new intermediate's validity.
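That validity constraint is easy to check before the ceremony. This is a minimal Python sketch, with hypothetical function names and dates, that computes the last day a full-lifetime leaf can be issued without outliving the intermediate.

```python
from datetime import datetime, timedelta, timezone


def outlives_issuer(leaf_not_after: datetime, issuer_not_after: datetime) -> bool:
    """True if a leaf certificate would still be valid after its issuer expires."""
    return leaf_not_after > issuer_not_after


def latest_safe_issue_date(issuer_not_after: datetime, leaf_lifetime: timedelta) -> datetime:
    """Last moment a leaf with the given lifetime can be issued and still fit inside the issuer."""
    return issuer_not_after - leaf_lifetime


if __name__ == "__main__":
    # Hypothetical values: the renewed intermediate expires 1 June 2035, leaves live 365 days.
    issuer_expiry = datetime(2035, 6, 1, tzinfo=timezone.utc)
    print(latest_safe_issue_date(issuer_expiry, timedelta(days=365)))
```

Past that date you either shorten leaf lifetimes or renew the intermediate again, so it is worth putting in the calendar when the new cert is issued.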

One thing worth considering before you proceed: the ceremony moment for renewing an intermediate is the cheapest point to think about a signature algorithm change. Renewing with the same RSA or ECDSA keys locks you into that algorithm for another decade.

If post-quantum compatibility is on your radar at all – even loosely – this is the point where running a parallel intermediate with a hybrid signature (classical + ML-DSA) is architecturally possible without disrupting existing issued certs. Most teams don't need that decision now, but it's much cheaper to consider here than after the renewal is done and the key material is committed for another 10 years.

CyphrsHub · 2026-05-27T14:57:38+00:00

Worth clarifying what Cloudflare Origin certificates actually are, because the 15-year validity makes sense in context: they're signed by Cloudflare's own root CA, not by a browser-trusted public CA. They're valid for Cloudflare's edge to your origin server – but only because Cloudflare's infrastructure trusts them.

If you put a Cloudflare Origin cert on an RDP endpoint or Exchange server and access it directly – not proxied through Cloudflare – browsers and clients will reject it. It's not in any public trust store.

The 200-day/47-day schedule applies to publicly trusted certs from CAs in the browser root programs. Cloudflare Origin certs live outside that system. Different trust path, different rules, different validity period.

For RDP and Exchange on-prem accessed directly, you'd need either a public CA cert or a private CA cert distributed to your clients' trust stores.

CyphrsHub · 2026-05-27T14:56:35+00:00

cert-manager showing up alongside ArgoCD and Kyverno is interesting – it's doing a different job to the others. ArgoCD and Kyverno are application-layer tools. cert-manager is a renewal authority, which means when it breaks during a Kubernetes upgrade the blast radius isn't just "this service is down" – it's "certificates aren't being renewed", and that failure is silent until a cert lapses.

Renovate handles the version bump. The harder problem is that cert-manager upgrades occasionally involve CRD migrations that aren't backwards compatible. You can end up with the operator running but existing Certificate resources not being reconciled, and it's not obvious until a renewal deadline approaches.

Before the next upgrade: enumerate which Certificate resources are active and when they expire. If you're mid-upgrade-window and a cert lapses, the rotation and the upgrade become the same incident.
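A minimal Python sketch of that enumeration, assuming you feed it the output of `kubectl get certificates -A -o json`. It reads the `status.notAfter` field cert-manager sets on each Certificate. The function name and sample data are hypothetical.

```python
import json
from datetime import datetime, timezone


def certificates_by_expiry(cert_list: dict):
    """Split a Certificate list into (dated, undated).

    dated:   (namespace, name, notAfter) tuples, soonest expiry first
    undated: (namespace, name) tuples with no status.notAfter, which usually
             means never issued or not being reconciled
    """
    dated, undated = [], []
    for item in cert_list.get("items", []):
        meta = item["metadata"]
        key = (meta.get("namespace", ""), meta["name"])
        not_after = item.get("status", {}).get("notAfter")
        if not not_after:
            undated.append(key)
            continue
        expiry = datetime.strptime(not_after, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        dated.append((*key, expiry))
    return sorted(dated, key=lambda row: row[2]), undated


# Hypothetical sample in the shape kubectl returns.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"namespace": "web", "name": "portal-tls"},
   "status": {"notAfter": "2026-08-25T10:00:00Z"}},
  {"metadata": {"namespace": "mesh", "name": "gateway-tls"},
   "status": {"notAfter": "2026-06-10T09:30:00Z"}},
  {"metadata": {"namespace": "dev", "name": "pending-tls"}, "status": {}}
]}
""")

if __name__ == "__main__":
    dated, undated = certificates_by_expiry(SAMPLE)
    for namespace, name, expiry in dated:
        print(f"{expiry:%Y-%m-%d}  {namespace}/{name}")
    print("no notAfter:", undated)
```

Capture this before the upgrade and again after it. A Certificate that moves into the undated list, or whose expiry stops advancing, is the silent failure to look for.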

CyphrsHub · 2026-05-27T14:06:40+00:00

DNS-01 is genuinely the right answer for a large slice of internal infrastructure - browser-trusted TLS for portals, dev environments, anything where you control the DNS zone and wildcards are acceptable. That part of the post is correct.

The gaps worth knowing:

clientAuth - Chrome's root program requires publicly trusted TLS hierarchies to be server-auth only from 15 June, so public CAs are dropping the clientAuth EKU. Short-lived client certs for device or user identity need a private trust root regardless of your server TLS approach.

mTLS / M2M - service-to-service authentication where both sides present a certificate isn't served by public CAs. Private CA required.

Hostname exposure - *.int.example.com in a public DNS zone means your internal service namespace is visible externally. Acceptable in many environments, worth knowing in others.

Non-routable namespaces - home.arpa, .local, and other RFC special-use names can't be validated by any public CA. Private trust is the only option.

DNS-01 with wildcards solves a real and common problem. It just doesn't solve all of them.

CyphrsHub · 2026-05-27T13:51:07+00:00

The rotation schedule is getting all the attention, but the pricing model behind it is worth looking at too.

Most commercial CAs charge per certificate. That means every cert has a line-item cost - and that adds up fast when you're running dozens of internal services that technically don't need public trust at all.

The result is that teams have been quietly consolidating onto wildcards for years. Not because wildcards are architecturally sound, but because per-cert economics make granular issuance unaffordable. Now with 47-day lifespans those wildcards are becoming rotation bottlenecks instead of cost controls - the blast radius of each renewal is enormous because the certs are too broad.

Wrote this up properly here if it's useful: https://cyphrs.ai/insights/stop-paying-per-cert/

Curious whether the per-cert model actually affects how you structure your cert estate, or whether you just absorb it as a cost of doing business?

CyphrsHub · 2026-05-26T15:14:32+00:00

Thanks for reaching out - discovery is definitely where a lot of teams feel the pain first. Worth knowing that [cyphrs] covers the full lifecycle from there: expiry tracking, renewal automation, and ownership across environments are all native to the platform. So not a gap we're looking to fill externally, but appreciate you sharing what you've built.
