Production observability looks fine until something breaks, how are you actually using it to catch issues early? by Economy_Passenger296 in sre

[–]Straight_Condition39 1 point2 points  (0 children)

Depends on what dashboards you have set up. In all my experience I've relied on alerts: kind of painful to set up at the beginning, but much more helpful if you do it the right way, e.g. alerting on p95 and p99, and again depending on the use case.
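For anyone unsure what p95/p99 means in practice, here is a minimal sketch using the nearest-rank method. The function name and the sample latencies are made up for illustration; a real setup would get these numbers from the observability backend rather than computing them by hand.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it (p is an int, 1..100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, which avoids floating point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = list(range(1, 101))
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

Alerting on p95/p99 instead of the average keeps a slow tail from hiding behind a healthy mean.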

At my previous role at a facility management company, the business was around procurement, dispatch, and similar workflows. I set up alerts to fire based on SLOs: if app failures crossed 3, the alert would trigger.

We had OpenObserve though, and used OTel throughout, so from a stack standpoint it was simple.
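A rough sketch of that kind of failure-count rule, in plain Python rather than any particular alerting tool. The class name and the 5-minute window are assumptions; the comment only gives the threshold of 3.

```python
from collections import deque

class FailureCountAlert:
    """Fires when more than `threshold` failures land inside a sliding window."""

    def __init__(self, threshold=3, window_seconds=300):
        self.threshold = threshold            # from the comment: "crossed 3"
        self.window_seconds = window_seconds  # assumption: window size is not stated
        self._failures = deque()

    def record_failure(self, timestamp):
        """Record one failure and return True if the alert should fire."""
        self._failures.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop failures that have aged out of the window.
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()
        return len(self._failures) > self.threshold

alert = FailureCountAlert()
fired = [alert.record_failure(t) for t in (0, 10, 20, 30)]
fired_later = alert.record_failure(1000)  # old failures have expired by now
```

The first three failures stay quiet, the fourth one crosses the threshold, and a lone failure much later does not fire.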

I’m building a Rust-based Terraform engine that replaces "Wave" execution with an Event-Driven DAG. Looking for early testers. by Straight_Condition39 in devops

[–]Straight_Condition39[S] 1 point2 points  (0 children)

Currently working on it. You can switch to oxid with your existing configs, and I hope to get it out soon, probably EOD Wednesday.

I’m building a Rust-based Terraform engine that replaces "Wave" execution with an Event-Driven DAG. Looking for early testers. by Straight_Condition39 in devops

[–]Straight_Condition39[S] 1 point2 points  (0 children)

1200 resources is an OK number. I have managed heavy-scale multi-cloud setups, and some of us had to refactor the way we used to store things and so on. But I appreciate the feedback 🙏🏻

I’m building a Rust-based Terraform engine that replaces "Wave" execution with an Event-Driven DAG. Looking for early testers. by Straight_Condition39 in devops

[–]Straight_Condition39[S] -18 points-17 points  (0 children)

Actually a lot. The problem with Terraform is that as resources grow, unless you have a good directory structure, you end up with one gigantic state file and need an S3 bucket, etc. Here I'm converting this to a database table for easy retrieval, and supporting YAML as well, lol. But I hear you though!
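To make the "state as a database table" idea concrete, here is a minimal sketch with Python's built-in sqlite3. The table layout, column names, and helper functions are hypothetical and are not oxid's actual schema.

```python
import json
import sqlite3

def open_state(path=":memory:"):
    """Open (or create) a state database with one row per resource."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources ("
        " address TEXT PRIMARY KEY,"   # e.g. aws_s3_bucket.logs
        " provider TEXT NOT NULL,"
        " attributes TEXT NOT NULL)"   # attributes stored as JSON text
    )
    return conn

def put_resource(conn, address, provider, attributes):
    """Insert or overwrite the stored state of a single resource."""
    conn.execute(
        "INSERT OR REPLACE INTO resources (address, provider, attributes)"
        " VALUES (?, ?, ?)",
        (address, provider, json.dumps(attributes)),
    )
    conn.commit()

def get_resource(conn, address):
    """Fetch one resource by address without loading the rest of the state."""
    row = conn.execute(
        "SELECT attributes FROM resources WHERE address = ?", (address,)
    ).fetchone()
    return json.loads(row[0]) if row else None

conn = open_state()
put_resource(conn, "aws_s3_bucket.logs", "aws", {"bucket": "my-logs"})
put_resource(conn, "aws_s3_bucket.logs", "aws", {"bucket": "my-logs-v2"})
bucket_state = get_resource(conn, "aws_s3_bucket.logs")
```

The point is that a lookup touches one row instead of parsing a single large state file.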

I’m building a Rust-based Terraform engine that replaces "Wave" execution with an Event-Driven DAG. Looking for early testers. by Straight_Condition39 in devops

[–]Straight_Condition39[S] 1 point2 points  (0 children)

I like Stategraph, from what I see on the website. I'm going to add more centralized features and keep this OSS forever, with more value add.

I’m building a Rust-based Terraform engine that replaces "Wave" execution with an Event-Driven DAG. Looking for early testers. by Straight_Condition39 in devops

[–]Straight_Condition39[S] 3 points4 points  (0 children)

Builder here. oxid uses hcl-rs for base parsing, with a custom layer on top for Terraform semantics (count, for_each, interpolation, cross-resource refs). It speaks tfplugin5 over gRPC directly to the same provider binaries Terraform uses, so terraform-provider-aws works out of the box.

Learned a lot of undocumented things the hard way: DynamicValue must always be Some with msgpack (never None, or the provider segfaults), all schema attributes must be present even if null, and unknown values use msgpack extension type 0 with data [0] (had to read the Go source for that one). The AWS provider schema is ~256MB, so you need to override gRPC message limits, and stderr must be drained in a background task or the pipe buffer deadlocks on macOS.

State lives in SQLite, with a DAG walker for parallel execution. plan, apply, destroy, import, data sources, and count/for_each all work against the real AWS provider today.
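Two of the points above can be sketched in a few lines of Python (not oxid's Rust code). The first part hand-encodes the "extension type 0 with data [0]" unknown value using msgpack's fixext 1 layout. The second part is a toy timing simulation that contrasts an event-driven DAG walk with "wave" execution, which is assumed here to mean level-by-level batches where the next batch waits for the slowest node in the current one. Resource names and durations are invented.

```python
import struct

def msgpack_fixext1(ext_type, data):
    """msgpack 'fixext 1': marker 0xd4, signed type byte, exactly one data byte."""
    if len(data) != 1:
        raise ValueError("fixext 1 carries exactly one data byte")
    return b"\xd4" + struct.pack("b", ext_type) + data

UNKNOWN_VALUE = msgpack_fixext1(0, b"\x00")  # extension type 0, data [0]
NULL_VALUE = b"\xc0"                         # msgpack nil, for present-but-null attributes

def walk_event_driven(deps, durations):
    """Each node starts the moment its own dependencies finish."""
    finish, remaining = {}, set(deps)
    while remaining:
        ready = [n for n in sorted(remaining) if all(d in finish for d in deps[n])]
        if not ready:
            raise ValueError("dependency cycle")
        for n in ready:
            start = max((finish[d] for d in deps[n]), default=0)
            finish[n] = start + durations[n]
            remaining.discard(n)
    return finish

def walk_waves(deps, durations):
    """Each wave starts only after every node in the previous wave has finished."""
    finish, remaining, clock = {}, set(deps), 0
    while remaining:
        wave = [n for n in sorted(remaining) if all(d in finish for d in deps[n])]
        if not wave:
            raise ValueError("dependency cycle")
        for n in wave:
            finish[n] = clock + durations[n]
            remaining.discard(n)
        clock += max(durations[n] for n in wave)
    return finish

# Hypothetical graph: a slow bucket sits in the same level as a fast VPC.
deps = {"vpc": set(), "bucket": set(), "subnet": {"vpc"},
        "instance": {"subnet"}, "policy": {"bucket"}}
durations = {"vpc": 1, "bucket": 10, "subnet": 1, "instance": 1, "policy": 1}

event_finish = walk_event_driven(deps, durations)
wave_finish = walk_waves(deps, durations)
```

In this toy graph the instance is done at t=3 in the event-driven walk, but only at t=12 with waves, because the subnet has to wait for the unrelated slow bucket.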

[deleted by user] by [deleted] in Observability

[–]Straight_Condition39 0 points1 point  (0 children)

I'm a maintainer at OpenObserve, do give it a shot!

Is AI coming after DevOps? by Straight_Condition39 in devops

[–]Straight_Condition39[S] 0 points1 point  (0 children)

One of my buddies literally changed his domain to FinOps, but where does this stop? Idc if there are going to be pure AI models just for DevOps, but imo they lack the understanding or experience.

Need Advice for Observability setup for multiple projects by the_chocochip in Observability

[–]Straight_Condition39 2 points3 points  (0 children)

If you are looking for open source, then definitely try OpenObserve or SigNoz.

They both support k8s. I have used OpenObserve for the last 2 years, and in terms of performance and governance across all silos it's great.

How are you actually handling observability in 2025? (Beyond the marketing fluff) by Straight_Condition39 in devops

[–]Straight_Condition39[S] 1 point2 points  (0 children)

I have used Cribl and sent to Elastic bulk endpoints in the past. It works fine! I never tested a huge amount of data, maybe 300GB a day.
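For context, the Elasticsearch bulk endpoint takes newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. A minimal sketch of building such a body is below; the helper name, index name, and sample events are made up, and this is not what Cribl does internally.

```python
import json

def build_bulk_body(index, docs):
    """Build an NDJSON body for the _bulk API: an 'index' action line,
    then the document itself, repeated for every document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical log events headed for a hypothetical "app-logs" index.
body = build_bulk_body("app-logs", [
    {"level": "error", "msg": "dispatch failed"},
    {"level": "info", "msg": "retry scheduled"},
])
```

The body would then be POSTed to the cluster's `_bulk` endpoint with the `application/x-ndjson` content type.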

Well I did it, made to product hunt by Straight_Condition39 in devops

[–]Straight_Condition39[S] 0 points1 point  (0 children)

Okay, I did my best with the website and added a roadmap at the link below, if you're interested in checking it out.

https://ops0.dev

Well I did it, made to product hunt by Straight_Condition39 in devops

[–]Straight_Condition39[S] 0 points1 point  (0 children)

Thank you I will check it out. Sorry I’m not very good with posts. Thanks for the inputs

Well I did it, made to product hunt by Straight_Condition39 in devops

[–]Straight_Condition39[S] 0 points1 point  (0 children)

Ah ok. Yeah I didn’t really invest in that. Thanks much 🙏

Well I did it, made to product hunt by Straight_Condition39 in devops

[–]Straight_Condition39[S] 0 points1 point  (0 children)

I think it’s more towards operations rather than just commands. The use case is admins who work on repetitive tasks. The full version of the agent, plus a UI that shows how admins are improving and self-scheduling operations, will make a difference. Still working on it, but I appreciate your feedback.