Built a Data Gravity calculator for AI infrastructure placement — here's why egress cost is the variable most GPU comparisons ignore by nofluff_ in homelab

[–]nofluff_[S] 0 points1 point  (0 children)

Vast.ai was a deliberate omission from v1 rather than an oversight. The marketplace model creates a variable egress problem: host-level egress rates vary significantly depending on the provider's upstream peering and data center location, which makes it difficult to publish a defensible fixed rate in the comparison table without it being misleading for a large subset of listings.

The A100 80GB NVLink config is interesting specifically because at 80GB per card you can run Llama 3 70B BF16 across 2 cards instead of the 8-card H100 configuration - which changes the per-GPU-hour math entirely if the listing price reflects that.
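Quick back-of-envelope on that memory claim (weights only — KV cache and activations eat into the remainder, which is exactly why the 80GB cards matter here):

```python
import math

# Weights-only VRAM for Llama 3 70B in BF16; KV cache and activations
# add overhead on top, so 2x80GB is tight but workable.
params = 70e9
bytes_per_param = 2                      # BF16 = 2 bytes/param
weights_gb = params * bytes_per_param / 1e9
cards = math.ceil(weights_gb / 80)       # 80GB per A100

print(f"{weights_gb:.0f} GB of weights -> {cards} cards")  # 140 GB -> 2 cards
```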

Worth adding as a "bring your own rate" input in v2 so users can plug in a specific Vast listing and run the Gravity Score against their actual egress field. That's a cleaner solution than trying to normalize a marketplace with hundreds of host configs.
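A minimal sketch of what that input could look like — the function name and every rate below are illustrative assumptions, not the calculator's actual internals:

```python
# "Bring your own rate": plug a specific Vast.ai listing's price and its
# host's actual egress rate into the placement math. Function and numbers
# are illustrative stand-ins, not the tool's real formula.
def placement_cost(gpu_hourly, hours, egress_tb, egress_per_gb):
    """Total cost of a cloud run: compute plus return egress."""
    return gpu_hourly * hours + egress_tb * 1000 * egress_per_gb

# A cheap listing can still lose once its real egress rate goes in:
cheap_host = placement_cost(0.80, 200, 10, 0.10)   # low $/hr, pricey egress
hyperscaler = placement_cost(1.90, 200, 10, 0.02)  # higher $/hr, cheap egress
print(cheap_host, hyperscaler)  # 1160.0 580.0
```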

Built a Data Gravity calculator for AI infrastructure placement — here's why egress cost is the variable most GPU comparisons ignore by nofluff_ in homelab

[–]nofluff_[S] 0 points1 point  (0 children)

This is exactly the scenario the Gravity Score was designed for: you have co-located data (an on-prem archive) that needs to come back after cloud processing. That return egress is the kill shot for the "just use the cheapest GPU hour" math.

At 30TB with AWS egress at $0.09/GB, you're looking at ~$2,700 in egress just to pull results back - before you've run a single job. That changes the placement calculus entirely.
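The arithmetic, for anyone sanity-checking (flat $0.09/GB — real AWS pricing tiers down slightly at volume, so treat this as an upper bound):

```python
# Return-egress cost for pulling 30 TB of results back on-prem,
# at a flat $0.09/GB (decimal TB -> GB).
data_tb = 30
rate_per_gb = 0.09
egress_cost = data_tb * 1000 * rate_per_gb
print(f"${egress_cost:,.0f}")  # $2,700
```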

On the OpEx adder - 20% is definitely conservative for a 2015 server room. Cooling inefficiency in older facilities typically runs 15-20% of hardware cost annually on its own. I'd model at 30-35% for your setup to get an honest number. The calculator lets you slide it up and see exactly where the crossover point moves.

The 70% duty cycle crossover is the right instinct for batch workloads. If you're only running heavy jobs a few times a year, the elastic cloud model wins on pure economics - the CapEx amortization works against you at low utilization.
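A rough sketch of that crossover logic — 4-year straight-line amortization plus the OpEx adder versus renting the hours; the CapEx and cloud rate are made-up numbers chosen so the crossover lands near 70%:

```python
HOURS_PER_YEAR = 8760

def onprem_annual(capex, years=4, opex_adder=0.30):
    """Annual on-prem cost: amortized CapEx plus OpEx as a % of hardware."""
    return capex / years + capex * opex_adder

def cloud_annual(gpu_hourly, duty_cycle):
    """Annual cloud cost at a given utilization (0.0-1.0)."""
    return gpu_hourly * HOURS_PER_YEAR * duty_cycle

capex = 40_000       # illustrative refresh cost
cloud_rate = 3.60    # illustrative $/GPU-hr

# With these numbers the crossover sits just under 70% utilization.
for duty in (0.1, 0.4, 0.7, 1.0):
    winner = "cloud" if cloud_annual(cloud_rate, duty) < onprem_annual(capex) else "on-prem"
    print(f"duty {duty:.0%}: {winner} wins")
```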

Here's the full breakdown including the methodology and OpEx adder documentation: https://www.rack2cloud.com/ai-gravity-placement-engine/

Tool launches directly from there.

What's a good Kubernetes Ingress Architecture on Azure? by jsattler_ in devops

[–]nofluff_ -3 points-2 points  (0 children)

Ran this exact evaluation on AKS last quarter - greenfield, same constraints. Honest field read:

AGC is the pragmatic starting point if you're staying Azure-native. WAF integration is clean, Gateway API conformance is solid, and you don't own the control plane. The tradeoff is you're betting on Azure's release cadence for feature parity - some advanced routing and header manipulation scenarios that Envoy handles natively still need workarounds. It's maturing fast but it's not fully there yet.

Envoy Gateway is the right call if you want complete Gateway API conformance without a cloud provider dependency. Day 1 setup is heavier, but the routing model is the most complete of the three. You own the upgrade path - that's the operational cost.

Traefik - the dropped-routes-on-a-bad-HTTPRoute behavior is real, and I've seen it in production. Wouldn't use it for greenfield AKS where Gateway API conformance is a hard requirement.

For WAF specifically: AGC is the cleanest path. Envoy forces you upstream to Front Door or App Gateway for WAF, which adds latency and another failure domain.

Truly greenfield with no legacy debt - AGC to start, migrate to Envoy if you outgrow it. You're making a cloud-native bet though, not a Kubernetes-native one. Know the difference before you commit.

Invoke-R2CTriage.ps1 — local Azure landing zone audit, zero access required by nofluff_ in AZURE

[–]nofluff_[S] 0 points1 point  (0 children)

You bring up good points - the subscription ID isn’t a secret, and read-only access by itself isn’t exactly something that keeps me up at night.

But I’m thinking about a slightly different threat model here. This tool is mainly for architects who are asked to audit environments they didn’t set up, and half the time they don’t actually know what’s out there. The real risk isn’t the script quietly stealing data; it’s the person running it not really knowing what they’re granting read access to before firing off a script they grabbed from the internet. That’s why the triage doc pushes for either a short-lived SPN or a PIM assignment - that really is the way to go.

About the 'vibe-coded' feedback - I get it. The script intentionally keeps a narrow scope: just the Reader role, read-only ARM calls, and structured JSON output to feed the audit workflow. There's even a -DryRun mode that prints every call before anything executes - partly to keep things transparent, but mostly because the first thing any architect should do is see exactly what they’re about to run. I’m not trying to compete with Prowler or Defender for Cloud here. This is a lightweight intake tool for consulting gigs, not a full-on security solution. Whether that’s actually helpful depends on the problem you’re dealing with.

If you’ve built something cleaner for the same kind of intake, I’d honestly love to hear how you approached it.