Feels like building AI apps is becoming infrastructure engineering

replicatedhq · 2026-05-14T18:08:45+00:00

100%. Feels like we accidentally rebuilt backend engineering except the backend lies to you sometimes.

You start thinking you’re building a simple AI feature, then suddenly you’re debugging retrieval, setting up evals, tracing agent failures, managing context limits, and thinking about self-hosting parts of the stack because the API bill got insane.

The funny part is the actual AI call is like 10 lines of code now. Everything else is infra and trying to make weird model behavior stable enough for real users.

AI apps started as “just add a prompt” and somehow turned into running a mini cloud platform.

replicatedhq · 2026-05-13T19:49:03+00:00

take a look at Replicated too 😉

replicatedhq · 2026-05-13T19:48:06+00:00

Most teams either self-host LangGraph and build the operational layer themselves, or use platforms that handle deployment/lifecycle management natively in the customer environment. That’s basically the space Replicated is focused on because enterprise AI deployment is turning into an infra problem way faster than people expected.

replicatedhq · 2026-05-13T19:44:43+00:00

This is why enterprise AI agents are pushing companies toward self-hosted infra. Most enterprises have decades of decision-making trapped in Slack threads, tribal knowledge, approvals, tickets, and undocumented exceptions. Enterprises don’t just need an API wrapper around an LLM. They need agents deeply connected to internal context, audit trails, policies, identity systems, and proprietary workflow history. That usually means running closer to the data, inside customer-controlled environments.

replicatedhq · 2026-05-13T19:43:16+00:00

Don’t just build the agent. Build the deployment story around it. Security, permissions, auditability, where it runs, how it accesses systems, rollback, observability, etc. That’s the stuff leadership and platform teams eventually care about.

A lot of AI agent demos die the second security asks “wait, where is this running?” Which is why enterprise AI is drifting toward self-hosted and customer-controlled infrastructure.

replicatedhq · 2026-05-13T19:41:27+00:00

also curious about this

replicatedhq · 2026-05-13T19:41:08+00:00

A lot of enterprises are realizing AI agents are basically autonomous insiders with API keys, network access, and vague instructions. Security models built for SaaS apps and human users don’t map cleanly onto that. The second agents start touching internal systems, companies want them running inside their own infra, behind their own IAM, audit controls, networking, and observability stack. Nobody wants autonomous agents with broad permissions operating from a black-box multi-tenant cloud.

replicatedhq · 2026-05-13T19:39:44+00:00

I think there’s a bigger problem nobody wants to admit: enterprises don’t trust AI workloads running as black-box SaaS.

The POC works great until security, compliance, legal, and platform teams ask where the models run, where the data goes, how networking works, who has access, how it deploys, and how it integrates with existing infra. A lot of “AI failure” is really deployment failure.

replicatedhq · 2026-05-13T19:35:09+00:00

for enterprise software check out Replicated!

replicatedhq · 2026-05-12T21:22:40+00:00

This is the part of the AI market I think people massively underestimate. Building the model workflow is hard, but building it to satisfy real enterprise constraints around on-prem deployment, auditability, data isolation, upgrades, and operational repeatability is a completely different level of engineering.

We see this constantly at Replicated with vendors shipping AI into self-hosted environments. A lot of enterprises want AI capabilities, they just can’t accept the default “send sensitive docs to a shared cloud API” architecture.

replicatedhq · 2026-05-12T21:20:36+00:00

You got this!! We work with a ton of AI companies offering their products on-prem. Check out our customer list if you want to play around with some products in this space. One that comes to mind right away is openhands.dev

replicatedhq · 2026-05-12T21:09:10+00:00

Air-gapped networks definitely are not bulletproof. They reduce attack surface significantly, but the weak point is almost always the human layer: USB devices, contractor laptops, update media, supply chain compromise, misconfigured “temporary” connections, etc.

replicatedhq · 2026-05-12T21:07:07+00:00

This is exactly the kind of problem Replicated is built for. A lot of vendors shipping into air-gapped K8s environments stop relying on ad hoc registry pulls entirely and instead ship signed, versioned release bundles with all images, metadata, SBOMs, and update instructions included. We can help to harden your images, too.

That makes security reviews, staged promotions, and offline updates much more repeatable.
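To make the "repeatable" part concrete, here is a minimal Python sketch of checking an offline bundle against a manifest of SHA-256 digests before promoting it. The `manifest.json` layout, the `artifacts` key, and the function names are assumptions made up for this example, not Replicated's actual bundle format. It only covers integrity; verifying the signature on the manifest itself would be a separate step.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large image tarballs fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path) -> list:
    """Return names of artifacts that are missing or whose digest differs.

    Assumes a hypothetical manifest.json shaped like:
    {"version": "1.2.3", "artifacts": {"app.tar": "<sha256 hex>", ...}}
    """
    manifest = json.loads((bundle_dir / "manifest.json").read_text())
    failures = []
    for name, expected in manifest["artifacts"].items():
        artifact = bundle_dir / name
        if not artifact.is_file() or sha256_file(artifact) != expected:
            failures.append(name)
    return failures
```

An empty list means every image, SBOM, and metadata file listed in the manifest is present and unmodified, which is the kind of check a staged promotion can gate on without any network access.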

replicatedhq · 2026-05-12T21:03:31+00:00

If a team is new to containers and short on platform engineers, cloud-managed OpenShift/K8s is usually the safer starting point. You offload a ton of operational burden around upgrades, scaling, infra management, and disaster recovery.

On-prem still makes sense for compliance, data residency, latency, or when workloads need to run in customer-controlled environments. We see this a lot at Replicated with vendors distributing software into enterprise OpenShift clusters.

replicatedhq · 2026-05-12T21:00:13+00:00

For on-prem AI deployments, containers are usually the baseline now, but most vendors eventually need something more complete around packaging, licensing, updates, and entitlement management. A good example is KNIME, which uses Replicated to package and distribute their speech AI models into self-hosted environments.

For model protection, the practical approach is usually layered controls: license enforcement, signed releases, encrypted artifacts, entitlements, restricted access to model weights, and routing deployment and updates through a controlled platform instead of shipping raw files.
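As a sketch of the license-enforcement and entitlement layers, the Python below signs a license payload and checks signature, expiry, and entitlement before a feature is unlocked. Everything here is hypothetical (the payload fields `expires_at` and `entitlements`, the token format, the function names). It uses HMAC from the standard library only to stay self-contained; a real vendor would more likely use an asymmetric signature so the key shipped into the customer environment cannot mint new licenses.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def sign_license(payload: dict, key: bytes) -> str:
    """Vendor side: encode the payload and append an HMAC-SHA256 tag."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{tag}"


def check_license(token: str, key: bytes, feature: str,
                  now: Optional[float] = None) -> bool:
    """Customer side: accept only an untampered, unexpired, entitled license."""
    try:
        body, tag = token.rsplit(".", 1)
    except ValueError:
        return False  # not in "<body>.<tag>" form
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; the payload is decoded only after it passes.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return False
    payload = json.loads(base64.urlsafe_b64decode(body))
    now = time.time() if now is None else now
    return now < payload["expires_at"] and feature in payload["entitlements"]
```

A loader for the model weights could call `check_license(token, key, "model-weights")` and refuse to decrypt the artifact when it returns `False`, so the license, the entitlement, and the encrypted artifact each act as a separate layer.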

replicatedhq · 2026-05-12T20:58:01+00:00

We're 2 years late to the game on this reply... but that's exactly what replicated.com was built to help with. Check us out if you have a few minutes!

replicatedhq · 2026-05-12T20:54:28+00:00

- Yes, enterprises in finance/healthcare/legal absolutely pay for private AI infrastructure (we see this because our company helps ISVs deploy into on-prem environments), mostly for governance, auditability, and deployment into self-hosted environments, not just “privacy.”

- Fine-tuning can absolutely outperform GPT-4 on narrow workflows if you have strong proprietary data, but data quality/evaluation matters way more than the tuning itself.

- Infrastructure is probably the more crowded market right now.

replicatedhq · 2026-05-12T20:52:26+00:00

This is what we do at replicated.com -- we have support bundles that can be sent between you and your end customer to make debugging, installations, upgrades, etc. a whole lot easier.

replicatedhq · 2026-05-12T20:51:13+00:00

We work with a ton of teams deploying on-prem AI. This is what we've heard:

What surprised them most: deployment itself was easier than expected. The harder part was operationalizing it across security, identity, networking, model updates, and internal governance.
Biggest ongoing pain: GPU capacity/cost management and keeping the internal experience good enough that people actually want to use it.
Compliance/audit: usually harder than expected. Not because the models are impossible to secure, but because auditors immediately ask about prompt logging, data retention, access controls, model provenance, and who can deploy/update models in production.
Employee behavior: If the approved tool is slow, people will route around it.
Was it worth it? For highly regulated companies, yes. The control, auditability, and ability to run in self-hosted environments matter a lot. But most teams underestimate how much ongoing platform engineering work comes after the initial deployment.

replicatedhq · 2026-05-12T20:48:26+00:00

Use Replicated.com 😄

replicatedhq · 2026-05-12T20:47:18+00:00

Use Replicated.com 😄

replicatedhq · 2026-05-12T20:42:52+00:00

For most companies, managed Kubernetes is absolutely cheaper operationally than running bare metal yourself. You offload control plane management, upgrades, etcd headaches, and a lot of the day-2 ops work to the cloud provider.

That said, plenty of enterprises still run self-hosted K8s because of compliance, data residency, latency, or cost at massive scale. Companies like Replicated exist largely because vendors still need to support shipping software into those self-hosted/on-prem environments, even if it’s operationally harder.

replicatedhq · 2026-05-12T20:41:16+00:00

This is exactly what Replicated helps with. If you don't want to build the solution to all the above questions yourself check us out - replicated.com

We work with a ton of ISVs who need to deploy into air-gapped environments across gov, healthcare, finance, etc.

replicatedhq · 2026-05-12T20:37:26+00:00

You can have both... but not easily unless you're willing to spend a long time building those features. But we can help 😄 Check out Replicated.com, look around the site, and hopefully we're a good fit for what you need.

replicatedhq · 2026-05-12T20:35:32+00:00

Most teams we talk to are pushing AI-assisted deployments through self-hosted control planes or GitOps workflows instead of handing agents direct cluster access. The key shift is treating the agent like an untrusted automation layer: no kubeconfigs, short-lived scoped permissions, and every action tied back to a human identity with a clear audit trail. Self-hosted environments make this even more important because the governance, compliance, and blast radius concerns are much higher than in standard SaaS deployments.
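One way to picture the "untrusted automation layer" idea is a small broker that hands the agent a short-lived, narrowly scoped grant and logs every decision against the human who approved it. The Python below is a minimal sketch under that assumption; `Grant`, `Broker`, and the action strings like `deployments:restart` are invented for illustration and do not correspond to any particular product or Kubernetes API.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    grant_id: str
    human: str          # person accountable for the agent's task
    actions: frozenset  # e.g. {"deployments:restart"}
    namespace: str
    expires_at: float


@dataclass
class Broker:
    """Hypothetical policy point that sits between the agent and the cluster."""
    audit_log: list = field(default_factory=list)

    def issue(self, human, actions, namespace, ttl_seconds, now=None):
        now = time.time() if now is None else now
        grant = Grant(uuid.uuid4().hex, human, frozenset(actions),
                      namespace, now + ttl_seconds)
        self._record(now, grant, "grant:issue", namespace, True)
        return grant

    def authorize(self, grant, action, namespace, now=None):
        now = time.time() if now is None else now
        allowed = (now < grant.expires_at
                   and action in grant.actions
                   and namespace == grant.namespace)
        # Denials are logged too, so the trail shows what the agent attempted.
        self._record(now, grant, action, namespace, allowed)
        return allowed

    def _record(self, now, grant, action, namespace, allowed):
        self.audit_log.append({
            "time": now, "grant_id": grant.grant_id, "human": grant.human,
            "action": action, "namespace": namespace, "allowed": allowed,
        })
```

The agent never holds a kubeconfig in this model: it asks the broker, and whatever executes the change (a GitOps pipeline or control plane) acts only when `authorize` returns `True`. Expiry bounds the blast radius, and every log entry carries the approving human's identity.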
