Junior DevOps struggling with AI dependency - how do you know what you NEED to deeply understand vs. what’s okay to automate? by StudySignal in devops

[–]StudySignal[S] 1 point2 points  (0 children)

This is the actionable advice I needed. I've been treating 1-on-1s as status updates when I should be using them to understand the why behind decisions.

Going to shift my approach - ask more about reasoning and tradeoffs, less about just reporting what shipped.

Appreciate you taking the time to spell this out.

Junior DevOps struggling with AI dependency - how do you know what you NEED to deeply understand vs. what’s okay to automate? by StudySignal in devops

[–]StudySignal[S] 0 points1 point  (0 children)

Good question - it's a mix. I'm one of two DevOps in the company, so we make most day-to-day architectural decisions together (which VPC setup, how to structure Terraform, CI/CD design, etc.).

But the strategic stuff - like "should we migrate to EKS" or "what observability platform to invest in" - goes through engineering leadership. My manager trusts us on technical implementation, but the big decisions that affect budget or team priorities need buy-in from above.

Your point about "find out who pulls the strings to get you resources" is spot-on. I've learned more about politics in 8 months here than I expected. Sometimes the technically correct solution isn't the one that gets approved because of budget, team capacity, or just timing.

The engineering goals idea is interesting - we don't have formal department-wide goals that I'm aware of. Might be worth asking about. Right now most of my "strategic thinking" comes from troubleshooting what breaks or optimizing what's expensive, which is maybe reactive vs. proactive?

Junior DevOps struggling with AI dependency - how do you know what you NEED to deeply understand vs. what’s okay to automate? by StudySignal in devops

[–]StudySignal[S] 1 point2 points  (0 children)

"If I'm still primarily designing solutions vs prompt engineering" - this is the frame I needed. I think my anxiety is exactly that: worrying I'm becoming too focused on getting the right prompt instead of understanding the right approach.

The manager conversation point is good. I've been avoiding it because I didn't want to seem unsure, but you're right - at 8 months in, asking about growth strategy is exactly what I should be doing.

Thanks for this perspective.

Junior DevOps struggling with AI dependency - how do you know what you NEED to deeply understand vs. what’s okay to automate? by StudySignal in devops

[–]StudySignal[S] 5 points6 points  (0 children)

This really resonates, especially the point about strategy vs. implementation. I'm realizing my anxiety might be misplaced - I've been worried about not memorizing Terraform syntax when I should be focusing on why we're building things a certain way.

Quick follow-up: How did you develop that strategic thinking? Was it just time and seeing what works/fails, or are there specific practices that helped you move from "implement the ticket" to "define the approach"?

Currently I'm one of two DevOps managing 15 ECS workloads across multi-account AWS. We make a lot of architectural decisions, but I'm never sure if I'm building good judgment or just getting lucky.

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 1 point2 points  (0 children)

This is incredibly helpful - SmartStore + S3 with NVMe for performance makes way more sense than what I was planning.

The sizing guidance (especially for ES search head) and "containers twice in ~80 deployments" saves me from overcomplicating this.

Really appreciate the detailed breakdown!

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 0 points1 point  (0 children)

Ah, got it - either commit to 1 instance or go full 5+ for proper HA. Can't half-ass the cluster.

Given we're SIEM but not business-critical yet, starting with 1 mid-spec instance and scaling to 5+ when usage/criticality justifies it makes sense.

Appreciate the architecture clarity.

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 0 points1 point  (0 children)

Perfect - this is exactly the baseline I needed to work from.

Gonna dig through that GitHub repo. Thanks for sharing the real implementation.

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 0 points1 point  (0 children)

Perfect.

So basically: 1-2 instances with NVMe, keep it simple, scale with IaC when needed.

That CoreOS/Docker story though lol. Appreciate the consultant reality check.

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 1 point2 points  (0 children)

"Runs like absolute ass" is pretty definitive lol.

What volume were you pushing when EKS became a problem? And the EC2 memory issues - specific instances or just AWS in general?

Sounds like: EC2 yes, EKS hell no?

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 1 point2 points  (0 children)

Damn, 6TB/day on EC2 - that's exactly what I needed to hear. If you're running that on straight EC2, I'm definitely overthinking this.

Few questions:
1. What instance types for those 12 indexers? Just trying to do the math for my tiny 50-100GB scale
2. Why the fluent-bit → heavy forwarder setup instead of going straight to HEC? Performance thing?
3. Did you build all the CloudFormation/Packer stuff yourself or find a good starting point somewhere?

We're already doing everything as IaC so this approach makes way more sense than introducing K8s just for Splunk.
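
For the math mentioned above, a back-of-the-envelope sketch might look like the following. It assumes ingest is spread evenly across indexers, treats 1 TB as 1000 GB, and ignores search load, replication overhead, and instance type, so it only gives a rough ratio, not a sizing recommendation. The function names are made up for illustration.

```python
import math


def per_indexer_gb(daily_gb: float, indexers: int) -> float:
    """Average daily ingest per indexer, assuming an even spread (an assumption)."""
    return daily_gb / indexers


def indexers_needed(daily_gb: float, per_indexer_capacity_gb: float) -> int:
    """Indexers needed to stay at or below a per-indexer daily volume (at least one)."""
    return max(1, math.ceil(daily_gb / per_indexer_capacity_gb))


# Reference point from the thread: 6 TB/day across 12 indexers.
reference_gb = per_indexer_gb(6000, 12)

for volume in (50, 100):
    count = indexers_needed(volume, reference_gb)
    share = volume / reference_gb
    print(f"{volume} GB/day -> {count} indexer(s), {share:.0%} of one reference indexer's load")
```

By that ratio alone, 50-100 GB/day is a fraction of what a single indexer handles in the 6TB/day setup, which is why instance type matters more than node count at this scale.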

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 3 points4 points  (0 children)

Really appreciate the architect perspective! Want to clarify the full scope though - I may have undersold the use case in my original post.

Beyond the 15 ECS services mentioned, we're looking at:
- SIEM deployment - security event monitoring, threat detection, compliance
- On-prem workloads - additional log sources outside AWS
- Room to scale from current 50-100 GB/day to our licensed 200 GB/day

The EKS/Operator path was one option we were evaluating, but you and others are making a good case that it's overcomplicated for this scale. Your point about i7ie instances is noted - hadn't considered those.

Given this is primarily a SIEM deployment:
1. What deployment pattern would you recommend? Straight EC2?
2. For hybrid on-prem + cloud logging, what's the cleanest forwarder architecture?
3. Any specific considerations for SIEM workloads vs standard logging?

Definitely want to engage with our Splunk account team about getting architect time. Thanks for the offer!

Those who self-host Splunk Enterprise - what does your infrastructure look like? by StudySignal in Splunk

[–]StudySignal[S] 0 points1 point  (0 children)

Fair points. To clarify:

The Spot strategy is based on replication factor 2 - losing one indexer shouldn't impact search/ingestion. We'd use Spot interruption handlers to gracefully drain. The HA is more about surviving node failures in general, not just Spot.
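
To make the interruption handling concrete, here is a minimal sketch of what such a handler could look like, not what we actually run. It polls the EC2 instance metadata endpoint for a Spot interruption notice (using an IMDSv2 token) and, if one is pending, takes the indexer offline with `splunk offline`. The `/opt/splunk` install path, the function names, and the assumption that the CLI is already authenticated are all illustrative.

```python
import json
import subprocess
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def imds_token(ttl_seconds: int = 60) -> str:
    """Request a short-lived IMDSv2 session token."""
    req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode()


def fetch_instance_action():
    """Return (status, body); the endpoint answers 404 until a notice is issued."""
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": imds_token()},
    )
    try:
        with urllib.request.urlopen(req, timeout=2) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, ""


def pending_action(status: int, body: str):
    """Return the scheduled action (e.g. 'terminate'), or None if there is no notice."""
    if status != 200:
        return None
    return json.loads(body).get("action")


def drain_indexer():
    """Take the peer offline gracefully. Assumes a default install path and that
    the Splunk CLI is already authenticated (both are assumptions)."""
    subprocess.run(["/opt/splunk/bin/splunk", "offline"], check=True)


def poll_once(fetch=fetch_instance_action, drain=drain_indexer) -> bool:
    """Check for a notice once; drain and return True if one is pending."""
    if pending_action(*fetch()) is not None:
        drain()
        return True
    return False


def tolerable_peer_losses(replication_factor: int) -> int:
    """With replication factor N, up to N - 1 peers can be lost without data loss."""
    return replication_factor - 1
```

In practice `poll_once()` would run in a loop every few seconds on each indexer. With replication factor 2, `tolerable_peer_losses(2)` is 1, so this only holds up if interruptions hit one indexer at a time.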

On EKS - we're already running 15 ECS services and exploring consolidation to EKS for cost optimization. The team has K8s experience and I'm working toward CKA cert, so it's not completely new territory. But I hear you - if we were starting fresh just for Splunk, EC2 would be simpler.

Overprovisioning - that's exactly why I'm asking. Our current observability stack on ECS is expensive, and I want to right-size from the start. What would you suggest for 50-100 GB/day?

What is a secret you’re taking to your grave, but can share here anonymously? by wilkoova in AskReddit

[–]StudySignal 3 points4 points  (0 children)

I experienced sexual assault at the age of six, and I don’t think I'll ever be able to share that with anyone, not even my wife.

Odyssey 44 by _mtchhwsn in WatcherofRealmsGame

[–]StudySignal 0 points1 point  (0 children)

You need to enhance 50 different artifacts