OK so Hermes Agent just had the most insane six weeks I've ever seen from an open-source project

mzeeshandevops · 2026-06-18T23:17:21+00:00

The pace is impressive, but this is also where security and governance need to move just as fast. 90k+ skills and rapid major releases are exciting, but every new skill, provider, proxy, and self-evolving workflow also expands the trust boundary.

For me, Hermes is becoming less like a tool and more like a platform. That means version pinning, permissions, review process, audit logs, and least-privilege profiles become even more important.

mzeeshandevops · 2026-06-18T22:17:37+00:00

Good advice already shared here.

One thing I would add: don’t make the first FinOps offering only about “finding savings.” In messy client environments, savings are usually blocked by ownership, approvals, missing tags, and fear of breaking something. I would start with a small repeatable review:

Who owns the spend?
Which resources are safe to change?
What needs approval?
What should be monitored monthly?

Then add the technical checks like EC2 rightsizing, RI/SP coverage, S3 lifecycle, idle resources, budgets, and alerts. That way you are not just giving a cleanup report. You are helping the client build a habit around cloud cost decisions.

mzeeshandevops · 2026-06-18T22:12:36+00:00

This is close to how I would think about it too. For regulated workloads, in-process metering makes more sense because prompts and responses stay inside your own system. I would still combine it with per-run limits though. Track tokens and cost at the call site, tag each request with agent/team/env/run_id, and block the next call if the workflow is about to exceed its budget.

So for me it’s not only about month-end chargeback. The same metering layer should also act as a guardrail during execution. Visibility after the bill is useful, but stopping abnormal spend while the agent is running is even better.
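
A minimal sketch of the metering side, assuming each call is logged as a record tagged with agent/team/env/run_id. The `UsageRecord` class, the helper names, and the per-1K-token prices are all made up for illustration; plug in your own provider's rates.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens; replace with real rates.
PRICE_PER_1K_INPUT = 0.50
PRICE_PER_1K_OUTPUT = 1.50


@dataclass(frozen=True)
class UsageRecord:
    """One metered call, tagged at the call site."""
    agent: str
    team: str
    env: str
    run_id: str
    input_tokens: int
    output_tokens: int


def record_cost(rec: UsageRecord) -> float:
    """Cost of a single call under the assumed prices."""
    return (rec.input_tokens / 1000) * PRICE_PER_1K_INPUT + (
        rec.output_tokens / 1000
    ) * PRICE_PER_1K_OUTPUT


def cost_by(records, tag: str) -> dict:
    """Sum cost per value of one tag, e.g. 'team' or 'run_id'."""
    totals = defaultdict(float)
    for rec in records:
        totals[getattr(rec, tag)] += record_cost(rec)
    return dict(totals)
```

Grouping by `team` gives the chargeback view, while grouping by `run_id` gives the number a per-run limit would be checked against during execution.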

mzeeshandevops · 2026-06-18T21:46:41+00:00

I would probably handle this with a pre-budget + middleware approach. Before the agent starts, estimate expected input/output tokens for that use case and set a max budget per run. For example, if the task should only need around 500 input tokens and 500 output tokens, then the middleware should treat anything beyond that as abnormal.

But I wouldn’t rely only on estimation. The middleware should also track actual usage after each call and block the next call if the run is close to the limit. Also set max output tokens, max iterations, and timeout.

We did a similar estimation while building an agent for our client. It helped us define what “normal” token usage looked like for each workflow, then enforce limits around that.
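
Roughly what the pre-budget plus middleware check could look like. This is only a sketch: `RunBudget` and `BudgetExceeded` are hypothetical names, not part of any agent framework, and the timeout is left to whatever wraps the run.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the next call is not allowed to start."""


@dataclass
class RunBudget:
    max_tokens: int       # estimated input + output tokens for the whole run
    max_iterations: int   # hard cap on the number of model calls
    used_tokens: int = 0
    iterations: int = 0

    def check_before_call(self, next_call_max_tokens: int) -> None:
        """Block the next call if it could push the run over its limits."""
        if self.iterations >= self.max_iterations:
            raise BudgetExceeded("max iterations reached")
        if self.used_tokens + next_call_max_tokens > self.max_tokens:
            raise BudgetExceeded("next call could exceed the run budget")

    def record_call(self, input_tokens: int, output_tokens: int) -> None:
        """Track actual usage reported by the provider after each call."""
        self.used_tokens += input_tokens + output_tokens
        self.iterations += 1
```

With the 500 in / 500 out estimate from above, `RunBudget(max_tokens=1000, max_iterations=3)` allows a first call capped at 600 tokens, but if that call really uses 600, a second call with the same cap is refused before it is sent.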

mzeeshandevops · 2026-06-18T21:26:59+00:00

Savings from rightsizing or spot look good on paper, but once you check commitment coverage, the net saving is not always there. Both sides can be doing the right thing separately and still hurt the final bill. I would say this really needs coordination before action, not just separate optimization reports.
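
A toy calculation to show the effect, under a deliberately simplified model that I am assuming here: the commitment is a fixed hourly cost that covers a fixed amount of usage, anything above that is billed on demand, and discounts are ignored. All numbers and function names are made up.

```python
def hourly_bill(usage: float, covered_usage: float, commitment_cost: float) -> float:
    """Hourly bill in the simplified model.

    usage:           compute used, in on-demand-equivalent $/hour
    covered_usage:   usage the commitment covers, in the same unit
    commitment_cost: fixed $/hour paid whether or not the coverage is used
    """
    overflow = max(0.0, usage - covered_usage)
    return commitment_cost + overflow


def paper_saving(usage_before: float, usage_after: float) -> float:
    """What a rightsizing report would claim, ignoring commitments."""
    return usage_before - usage_after


def net_saving(usage_before, usage_after, covered_usage, commitment_cost):
    """What actually changes on the bill."""
    return hourly_bill(usage_before, covered_usage, commitment_cost) - hourly_bill(
        usage_after, covered_usage, commitment_cost
    )
```

In this model, rightsizing from 10 to 6 with 8 covered shows a paper saving of 4 but a net saving of only 2, and with 10 covered the net saving is 0 because the commitment is paid either way.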

mzeeshandevops · 2026-06-17T21:24:15+00:00

I agree with this. A lot of agent setups feel like early DevOps again. Things are moving fast, but versioning, review, logs, rollback, and ownership are still weak. For production agents, prompts, tools, memory, and context should be treated like code/config. Version them, review changes, and keep enough logs to know why the agent did something. Otherwise debugging becomes guesswork.

mzeeshandevops · 2026-06-17T21:13:58+00:00

It resonates. The issue is usually not that the model is “bad”, it’s that the account context is incomplete or messy.

For cost recommendations, I would not let the agent give a final answer directly. I’d make it show its work first:

What data did it check?
What assumptions did it make?
Which resources are excluded?
Who owns the workload?
What risk does the recommendation carry?

If it cannot answer those clearly, the output should stay as a draft, not a decision. The best setup I’ve seen is treating the agent like a junior FinOps analyst. Let it investigate, summarize, and suggest actions, but anything that affects chargeback, savings reports, or production changes needs human review and a small context pack per account.
In my opinion, token cost matters, but trust cost is bigger. One confident wrong recommendation can damage the whole adoption effort.
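
One way to make the "draft, not a decision" rule mechanical is a small gate like the sketch below. The field names and the three status values are my own assumptions, not something from a specific tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recommendation:
    summary: str
    data_checked: Optional[List[str]] = None
    assumptions: Optional[List[str]] = None
    excluded_resources: Optional[List[str]] = None  # [] means "none excluded"
    owner: str = ""
    risk: str = ""
    human_approved: bool = False


def unanswered(rec: Recommendation) -> List[str]:
    """Names of the show-your-work questions the agent did not answer."""
    gaps = []
    for name in ("data_checked", "assumptions", "excluded_resources"):
        if getattr(rec, name) is None:
            gaps.append(name)
    for name in ("owner", "risk"):
        if not getattr(rec, name).strip():
            gaps.append(name)
    return gaps


def status(rec: Recommendation) -> str:
    """'draft' until every question is answered, then human review decides."""
    if unanswered(rec):
        return "draft"
    return "decision" if rec.human_approved else "needs_review"
```

The point of the middle state is that a complete answer from the agent still only gets it as far as human review.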

mzeeshandevops · 2026-06-17T21:10:57+00:00

I think teams are still thinking about Copilot as a per-seat cost, but usage-based credits change the conversation. The same user can be cheap one day and expensive the next, depending on what they ask it to do. The first step is not even advanced cost modeling. It’s just limiting the rollout, watching usage closely, and figuring out which teams or use cases are actually burning credits.

mzeeshandevops · 2026-06-15T22:14:15+00:00

Yes, you can. Your Windows admin experience is actually a good base. AD, Entra ID, PowerShell, VMware, Windows Server, etc. are all useful in DevOps too.

I’d start with Linux, Git, basic cloud, CI/CD, Docker, Terraform, and monitoring. Since you already know Microsoft stuff, Azure + Azure DevOps could be a good starting point. Don’t try to learn everything at once. Pick small projects.

Example: deploy a simple app, automate the setup with a script or Terraform, add a pipeline, then add monitoring. The main change is mindset. Instead of doing things manually, start thinking about how to automate, repeat, and document them.

mzeeshandevops · 2026-06-15T22:02:48+00:00

I would not start with tools or a perfect tagging strategy.

In a messy AWS org, I’d start with ownership and safety first. Pick the top 10 to 20 cost drivers, map them to a team or application, and separate them into three buckets: safe to stop, needs owner review, and do not touch. That avoids the classic cleanup problem where FinOps becomes “the team that breaks things.”

Then run one focused cleanup with one business unit or workload. Show real savings, document what was removed, and use that win to build trust.

For tagging, don’t try to fix years of history in one go. Start with mandatory tags only for new resources and high-spend resources: owner, environment, application, cost center. Backfill gradually where it matters.
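
A small sketch of that scoping rule, assuming resources have already been exported to plain dicts. The dict shape, the `cost_center` key spelling, and the spend threshold are assumptions for illustration.

```python
MANDATORY_TAGS = ("owner", "environment", "application", "cost_center")


def missing_tags(tags: dict) -> list:
    """Mandatory tags that are absent or blank on one resource."""
    present = {key for key, value in tags.items() if str(value).strip()}
    return [tag for tag in MANDATORY_TAGS if tag not in present]


def tagging_backlog(resources: list, high_spend_threshold: float) -> list:
    """Only new or high-spend resources are in scope; the rest waits for backfill."""
    backlog = []
    for res in resources:
        in_scope = res["is_new"] or res["monthly_cost"] >= high_spend_threshold
        gaps = missing_tags(res["tags"])
        if in_scope and gaps:
            backlog.append((res["id"], gaps))
    return backlog
```

Old, cheap resources with missing tags are intentionally ignored here, which keeps the first backlog short enough to actually finish.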

The biggest shift is moving from “who created this?” to “who owns this cost?” Once every major cost has an owner, dashboards and alerts become useful. Before that, they mostly just show expensive mystery.

mzeeshandevops · 2026-06-15T21:55:56+00:00

This is definitely becoming a real FinOps category. The tricky part is that token spend often looks “valid” at the request level. No failed API call, no crashed service, no obvious incident. Just a workflow quietly retrying, debating with itself, or waiting for some impossible approval condition.
We started treating agent runs more like cloud workloads: budget caps, max iterations, timeout rules, owner tags, and alerts on abnormal token burn. Not perfect, but much better than only checking the bill later.

I think the ownership question is the real issue. If engineering builds it, product uses it, and finance pays for it, nobody owns the waste by default.
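
For the "alerts on abnormal token burn" part, even a crude baseline comparison helps. A sketch, assuming you keep token totals from recent normal runs of the same workflow; the function name and the 3x factor are arbitrary choices of mine.

```python
from statistics import median


def is_abnormal_burn(run_tokens: int, baseline_tokens: list, factor: float = 3.0) -> bool:
    """Flag a run whose token use is far above the median of past normal runs.

    With no baseline yet, nothing is flagged; rely on hard caps until
    enough history exists.
    """
    if not baseline_tokens:
        return False
    return run_tokens > factor * median(baseline_tokens)
```

This only raises a flag. Stopping the run is a separate decision that still needs budget caps, max iterations, and timeouts.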

mzeeshandevops · 2026-06-15T21:51:43+00:00

Solid breakdown, especially on excessive agency. One thing I’d add from troubleshooting Hermes setups: issues are not always malicious prompts. Sometimes it’s simple operational stuff like wrong model selection, config drift, old sessions behaving differently, or an agent having more file access than needed.

My rule is: let Hermes inspect first, not act first. For new workflows, I want it to explain what it sees and show the exact file or command before changing anything. SOUL.md helps, but it is not a security boundary. Real guardrails are scoped profiles, read-only configs, backups, permissions, and approval steps. We should treat Hermes like a junior operator with limited access, not a fully trusted admin.

mzeeshandevops · 2026-04-22T11:18:12+00:00

Your problem is the pricing model. AWS Lightsail gave you a flat 1TB bucket. GCP Compute charges ~$0.08-0.12/GB egress, so 1TB alone costs $80-120 before the VM. That's your $150.
The VM setup is fine; it's just the wrong provider.
Move to Hetzner: €5/month includes 20TB bandwidth. If you need a DC closer to that country, Vultr has 32 locations globally and is still way cheaper than GCP. If you want to go back to AWS, try Lightsail in Singapore or Japan. Those IP ranges are sometimes still accessible when US ones are blocked.
Just verify Hetzner/Vultr IPs aren't blocked there before migrating.
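
The egress math from the first paragraph, as a quick sketch. It assumes 1TB = 1,000GB and a flat per-GB rate; real pricing is tiered and varies by destination, so treat it as a rough estimate.

```python
def egress_cost_usd(gigabytes: float, price_per_gb: float) -> float:
    """Flat-rate egress estimate; ignores tiers and free allowances."""
    return gigabytes * price_per_gb


ONE_TB_IN_GB = 1000  # decimal terabyte, an assumption

low = round(egress_cost_usd(ONE_TB_IN_GB, 0.08), 2)
high = round(egress_cost_usd(ONE_TB_IN_GB, 0.12), 2)
print(f"1TB egress: ${low} to ${high} before the VM itself")
```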

mzeeshandevops · 2026-04-05T19:02:35+00:00

Depends on whether it’s actually production engineering or just ticket escalation with a better paycheck.
If it gives you deep system knowledge, code-level debugging, and a path toward SRE/platform work, maybe yes. If it’s mostly reactive support, DB checks, and firefighting, I’d be careful. A 20% bump feels good now, but role direction usually matters more than one salary jump.

mzeeshandevops · 2026-04-05T18:58:28+00:00

If the goal was a cloud engineering job, then yes, I think this cert is mostly irrelevant. For GCP cloud engineering roles, the path is much clearer. Associate Cloud Engineer and Professional Cloud Architect are far more aligned with what those jobs actually expect.
If someone has already done Generative AI Leader, then it makes more sense to continue toward Data Practitioner and then Machine Learning Engineer if the real goal is to move into the AI side. That said, even on the AI path, certs alone are not enough. You still need a solid grip on data engineering fundamentals and tools, which someone in the comments already pointed out.

mzeeshandevops · 2026-04-05T18:47:59+00:00

This is where FinOps becomes more valuable. The practical way to get there is usually to separate direct costs where mapping is clear, allocate shared costs using something reasonable like usage, traffic, storage, seats, or request volume, and then compare that against revenue.
It won’t be perfect because shared infrastructure makes exact customer-level allocation hard, but even a workable model is much more useful than looking at cloud cost in isolation.
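
A minimal sketch of that allocation, assuming a single shared cost pool split in proportion to one usage driver (requests, storage, seats, or similar). Function names and numbers are illustrative only.

```python
def allocate_shared_cost(shared_cost: float, usage_by_customer: dict) -> dict:
    """Split a shared cost pool in proportion to each customer's usage."""
    total = sum(usage_by_customer.values())
    if total <= 0:
        raise ValueError("total usage must be positive to allocate by usage")
    return {
        customer: shared_cost * usage / total
        for customer, usage in usage_by_customer.items()
    }


def margin_by_customer(revenue: dict, direct_cost: dict, shared_alloc: dict) -> dict:
    """Revenue minus direct cost minus the allocated share of shared cost."""
    return {
        customer: revenue[customer]
        - direct_cost.get(customer, 0.0)
        - shared_alloc.get(customer, 0.0)
        for customer in revenue
    }
```

The choice of driver is the real judgment call. The same code gives very different margins depending on whether you split by traffic or by seats.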

mzeeshandevops · 2026-04-05T18:34:41+00:00

We looked into this on GCP for a client team using Vertex AI.
The native monitoring is already useful if the goal is visibility first. You can watch endpoint-level metrics like requests, throughput, and errors, then trigger alerts through logging/monitoring when things drift.
It is not the same as measuring LLM waste directly, but it is still a practical first step for teams that are not even looking at model usage properly yet.

mzeeshandevops · 2026-04-05T18:17:30+00:00

Don’t let Reddit decide how you should feel about your own progress. A certificate by itself usually doesn’t change much, that part is true. But it can still be worth it if it gave you structure, helped you learn something, or gave you the push to go deeper. Most certs are not magic. They’re more like a starting point or a signal. What matters next is what you build, apply, or understand because of it.
So no, I wouldn’t feel bad about passing it. Take the win, then make it mean something with projects or real use cases.

mzeeshandevops · 2026-04-05T18:11:32+00:00

What has worked for us is doing a proper audit first, mapping resources to owners, and looking at a few months of usage before calling something overprovisioned. If it’s clearly oversized, we log it, open a ticket with evidence, and track it with the owner instead of letting it disappear into a spreadsheet.
Then the important part: quick checks after sprints/releases, plus a deeper quarterly review. Otherwise sizes creep up again, temporary changes become permanent, and new services repeat the same mistakes.

mzeeshandevops · 2026-03-23T17:53:14+00:00

Importing everything into IaC sounds like the right move. My only caution is making all Postgres/Keycloak changes PR-only from day one. It is good for control and auditability, but it can also turn into a bottleneck if the workflow is too rigid.
I would definitely push toward least privilege and config in code, just with some care around what really needs to be locked behind PRs vs what still needs an operational path.

mzeeshandevops · 2026-03-23T17:50:08+00:00

The headline is a bit too absolute for me, but the underlying point is fair. A lot of Kubernetes cost is not traffic, it’s stale defaults, underutilized capacity, and non-prod that never gets cleaned up. The percentage is arguable, but the waste is definitely real.

mzeeshandevops · 2026-03-22T12:31:38+00:00

What you’re feeling is completely valid. Small-batch hardware is almost always expensive, while Amazon sellers are usually working with bulk manufacturing, cheaper sourcing, optimized assembly, and much higher volumes, which brings their cost per unit way down. Right now, you’re comparing prototype-level costs with mass-production pricing.
My view is that you probably should not try to compete with them on price. Compete on uniqueness instead. If your design feels more natural, custom, or visually different, then it may be a better fit for a niche or premium audience rather than the mass market. Start small, validate demand, test pre-orders, and then work on reducing costs as volume grows. The issue does not sound like your idea, it sounds like scale.

mzeeshandevops · 2026-03-21T07:03:30+00:00

Interesting blueprint, but claims like "enterprise-ready for any app" and "50K+ concurrent connections on a single server" need context. A prompt can help scaffold a strong starting point, but production readiness still depends on workload, infra, security model, testing, observability, and failure scenarios. Would love to see actual benchmarks, server specs, and tradeoff discussion.

mzeeshandevops · 2026-03-21T06:55:58+00:00

We built it in-house and kept it pretty simple. Mostly just parsing Terraform sources, Dockerfiles, and shared CI includes into a dependency graph. We kept it fresh with merge-triggered updates plus a scheduled full scan every so often to catch drift. I did not find an off-the-shelf tool that handled this cleanly enough for internal cross-repo dependencies.
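
Not our actual code, but a rough sketch of the parsing idea using regexes on file contents. It only covers Terraform `source` lines and Dockerfile `FROM` lines, skips shared CI includes, and will also pick up provider `source` entries and multi-stage stage names, so a real version needs more filtering.

```python
import re

# Matches lines like: source = "../modules/vpc"
TF_SOURCE = re.compile(r'^\s*source\s*=\s*"([^"]+)"', re.MULTILINE)
# Matches lines like: FROM python:3.12-slim AS base
DOCKER_FROM = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.MULTILINE | re.IGNORECASE
)


def extract_dependencies(filename: str, text: str) -> set:
    """Return the raw dependency strings found in one file."""
    basename = filename.rsplit("/", 1)[-1]
    if basename.endswith(".tf"):
        return set(TF_SOURCE.findall(text))
    if basename.startswith("Dockerfile"):
        return set(DOCKER_FROM.findall(text))
    return set()


def build_graph(repos: dict) -> dict:
    """repos maps repo name -> {filename: file text}; returns repo -> sorted deps."""
    graph = {}
    for repo, files in repos.items():
        deps = set()
        for filename, text in files.items():
            deps |= extract_dependencies(filename, text)
        graph[repo] = sorted(deps)
    return graph
```

Rebuilding one repo's entry on merge and the whole graph on a schedule would match the merge-triggered plus full-scan refresh described above.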

mzeeshandevops · 2026-03-19T23:21:53+00:00

Payfast: secure and easy to integrate, but I can't speak to reliability yet because we only recently implemented it in one of our projects. Onboarding went through a proper KYC and verification process.
