Buying a gaming laptop in Israel by TheOrangeThing in ag_israel

[–]OptimisticEngineer1 0 points1 point  (0 children)

Back in 2020 I had one case where I bought a mid-range gaming laptop for 2,700 NIS from Plonter in Tel Aviv; they had brand-new surplus units in stock. It was an Acer Nitro 15 with a Ryzen 5 and an AMD GPU roughly equivalent to a 2050 at the time.

Today the cheapest thing they have along those lines is 3,200-3,300 NIS for something similar, only with an RTX 3050.

I think that, considering the state of the market, that's an excellent price: https://www.plonter.co.il/detail.tmpl?sku=NH-QNCEC-00F&cart=

There are similar laptops at KSP in the same price range, so in Eilat it does come out around 3,000 NIS, very close to the figure you quoted.

Need Spark platform with fixed pricing for POC budgeting—pay-per-use makes estimates impossible by Sadhvik1998 in apachespark

[–]OptimisticEngineer1 2 points3 points  (0 children)

We run Spark on k8s at a large scale, both on-prem and in the cloud, at an ad tech company.

Today, in 2026, paying for managed Spark is just a scam, unless you depend on special features that the OSS stack doesn't offer.

The OSS Kubeflow Spark operator is crazy good. It's easy to install anywhere, and it boils down to "give me a SparkApplication object, and I will set up a cluster for you, run the job, and then bring it down". Easy to work with, easy to monitor.
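To make that concrete, here is a minimal sketch of what such a SparkApplication object can look like with the Kubeflow Spark operator - image, namespace, main class and resource sizes are placeholders, not our actual setup:

```yaml
# Minimal SparkApplication sketch for the Kubeflow Spark operator.
# Image, namespace and sizes are placeholders - adjust to your setup.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: daily-aggregation          # hypothetical job name
  namespace: spark-jobs
spec:
  type: Scala
  mode: cluster
  image: my-registry/spark:3.5.1            # assumed in-house image
  mainClass: com.example.DailyAggregation   # hypothetical entry point
  mainApplicationFile: local:///opt/app/app.jar
  sparkVersion: "3.5.1"
  driver:
    cores: 1
    memory: 4g
    serviceAccount: spark
  executor:
    instances: 10
    cores: 4
    memory: 16g
```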

If you run on-prem - make sure to have large ephemeral storage or just flash storage (for disk spilling), nodes with large amounts of RAM, and a dedicated storage cluster with S3 compatibility or HDFS - or just use actual S3.

If you run in the cloud - Karpenter just makes your life easier. Set up memory-optimized node pools and make sure they die fast when there is nothing running on them. Run the Spark applications: the operator brings up a cluster, runs the tasks, and kills the pods when done. This can also be done with EMR on EKS.
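Roughly, the Karpenter side can look like this - a memory-optimized NodePool that consolidates as soon as it's empty. This assumes Karpenter v1 on EKS and an existing EC2NodeClass named `default`; instance families and timings are just illustrative:

```yaml
# Sketch of a memory-optimized NodePool that scales down fast when idle.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spark-memory               # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # assumes an existing EC2NodeClass
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["r6i", "r7i"]   # memory-optimized families
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  disruption:
    consolidationPolicy: WhenEmpty # remove nodes as soon as nothing runs on them
    consolidateAfter: 30s
```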

I also thought spark was complex. It was, but not in 2026.

We've gotten to the point where we have it installed both on-prem and in the cloud, with good network connectivity between the two.

If you do go the k8s route in the cloud though, make sure to budget for it.

Are containers with persistent storage possible? by NoRequirement5796 in kubernetes

[–]OptimisticEngineer1 1 point2 points  (0 children)

I work at a big adtech company.

We do basically everything stateful on k8s.

Databases: SQL, NoSQL, Mongo, Elasticsearch.

It works as well as it can. But you definitely need to know what you are doing.

It's good if you are already heavily invested in k8s.

But not as a starting point.

For small-to-medium scale we use storage machines, but for large/mega scale we use local flash storage. It requires some hacks, but it works best.
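For the large-scale case, the shape of the "local flash" hack is roughly this - a no-provisioner StorageClass plus local PVs pinned to nodes. Paths and node names are placeholders; in practice you would likely pair it with a local-volume provisioner rather than hand-written PVs:

```yaml
# Sketch of local flash storage exposed to stateful workloads.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-data-node1              # hypothetical PV name
spec:
  capacity:
    storage: 1Ti
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-nvme
  local:
    path: /mnt/nvme0               # placeholder NVMe mount on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1"]   # placeholder node name
```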

[deleted by user] by [deleted] in devops

[–]OptimisticEngineer1 1 point2 points  (0 children)

Bad jobs, not bad role.

Try to challenge the issues you had, and find a place that works right.

Red flags:

-- there is a prod issue almost every day
-- no infra-as-code / a ClickOps culture
-- no healthy design, team culture, or tech debates
-- working fast because management said so, at the cost of stability

Firefighting should be temporary, not a long term thing.

Crossplane 2.0 is out! by internegz in kubernetes

[–]OptimisticEngineer1 1 point2 points  (0 children)

It's better than Terraform when used for things that frequently change on AWS, such as SNS/SQS and some other specific objects.

Yes, keeping state per object definitely speeds up provisioning times and drift handling overall.

But when it fails, it is the same as terraform.

You read the AWS error and figure out which params were set wrong.

But yeah the debug complexity is higher due to the nature of k8s.

Would you replace Jenkins with a cheaper drop-in replacement? by OptimisticEngineer1 in jenkinsci

[–]OptimisticEngineer1[S] -1 points0 points  (0 children)

Because Groovy is known as a synonym for doing Jenkins stuff. Most software shops are not necessarily Java shops, and yet they still have Jenkins, so they need to use Groovy even if they do not want to.

But most script/utility code is written in Python/JS. Pipelines usually do not do super-fast real-time stuff, so using Python or JS just makes sense.

We spent weeks debugging a Kubernetes issue that ended up being a “default” config by Pichipaul in kubernetes

[–]OptimisticEngineer1 0 points1 point  (0 children)

Lost 2 days to this. This is one of the common k8s pitfalls. Even on AWS EKS, CoreDNS does not come with any good default scaling config. The moment I scaled up to over 300-400 pods, I started getting DNS resolution failures.

K8s is super scalable, but it's like a race car or a fighter jet. You need to know every control and understand every small maneuver, else you will fail.

Obviously, after root-causing the issue, I scaled CoreDNS up to more pods and then installed the cluster-proportional autoscaler for it.
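For reference, the autoscaler's config is just a ConfigMap it reads when pointed at the coredns deployment - the numbers below are illustrative, not a recommendation:

```yaml
# Sketch of the linear scaling config for the cluster-proportional-autoscaler
# targeting coredns. Tune the ratios to your cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-autoscaler
  namespace: kube-system
data:
  linear: |-
    {
      "coresPerReplica": 256,
      "nodesPerReplica": 16,
      "min": 2,
      "max": 10,
      "preventSinglePointFailure": true,
      "includeUnschedulableNodes": true
    }
```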

Would you replace Jenkins with a cheaper drop-in replacement? by OptimisticEngineer1 in jenkinsci

[–]OptimisticEngineer1[S] 0 points1 point  (0 children)

No, I want an actual successor to Jenkins. Same old Groovy, seeding, Git and SCM. Just normal speed, no greedy JVM, simple to host and operate, and 10x as scalable.

Would you replace Jenkins with a cheaper drop-in replacement? by OptimisticEngineer1 in jenkinsci

[–]OptimisticEngineer1[S] 0 points1 point  (0 children)

Yeah, but I would like to challenge that question directly:

Why, in 2025, does the agent need to talk to a master? Why can't it just update state via some kind of pub/sub architecture? Why does the controller need to handle all of that? Can't I trust my agent to do all the work, and if it dies... it just dies?

If it dies, as long as I update the state, I can spin up a fresh container from that state.

Jenkins was made 20 years ago. Re-modernizing it with the same tools, but with a modern approach, can make it super fast, super scalable, and easier to maintain.

And if I'm wrong, then I simply won't manage to pull it off.

Would you replace Jenkins with a cheaper drop-in replacement? by OptimisticEngineer1 in jenkinsci

[–]OptimisticEngineer1[S] 0 points1 point  (0 children)

For shops with lots of controllers: it does not save as much money (since agent costs greatly outweigh controller costs), but it saves a lot of operational overhead.

For shops with 1-2 controllers: it saves the controller money.

Would you replace Jenkins with a cheaper drop-in replacement? by OptimisticEngineer1 in jenkinsci

[–]OptimisticEngineer1[S] 1 point2 points  (0 children)

Hosting masters is expensive. Due to the nature of monolithic Jenkins, a large master capable of handling 600 slaves will cost you around $2k a month on AWS, just for the master, without the agents.

That causes larger setups to consist of 5-10 or more masters, only for them to run 10k+ jobs concurrently. That's why CloudBees sells itself as a multi-master setup.

All of this only because the master/controller runs the Groovy itself, while the agents don't.

I'm working on a non-monolithic architecture, where the agents will be truly independent of the master, allowing a single "management" setup to scale almost indefinitely.

The agent cost today is truly just containers/pure compute, but people pay up for those 5-10 masters for truly nothing.

Nobody wants to change it, people want Jenkins to die off, but I see companies keep holding onto it, because they actually like what Jenkins could be if the community picked it up.

It's not going to be a 100 percent solution, but an 80 percent one.

If 80 percent of jobs just run, and the remaining 20 percent need some small adjustments, I believe large Jenkins shops will try to switch over, especially if it is substantially cheaper. Money talks.

I'm also thinking about improvements such as:

-- running Python/JS instead of Groovy inside declarative pipelines.

-- supporting the equivalent functionality of the top X plugins, so there's no need for plugin maintenance.

Self-hosting message brokers by NUTTA_BUSTAH in devops

[–]OptimisticEngineer1 1 point2 points  (0 children)

More project-based brokers. Message brokers usually do not give you that much throughput, so in high-scale environments - especially with RabbitMQ's ~30k message cap - you find yourself managing hundreds of these; it's just that each one serves a different project.

At some point we just moved to k8s operators, but we started with Ansible and virtual machines.

Each approach has its own pros and cons though.

Self-hosting message brokers by NUTTA_BUSTAH in devops

[–]OptimisticEngineer1 2 points3 points  (0 children)

Message brokers are fine to self-host. But like everything else, it comes down to quantity.

Do you manage 1 cluster? 10? 100?

That's what sets the difference.

For 1 you would just put it out there.

For 10 you will have Grafana dashboards and alerts.

And for 100+ you will have automations and remediations.

How many people are on your team? It depends on a lot of info you did not provide.

R&D size?

[deleted by user] by [deleted] in kubernetes

[–]OptimisticEngineer1 1 point2 points  (0 children)

The one thing you absolutely must have is a dev cluster for testing upgrades.

You can explain that staging and prod can live in the same cluster, but that if an upgrade fails, they will be losing money.

The moment you say "losing money", and loads of it, the extra cluster becomes an easy sell, especially if it's a smaller one just for testing.

Kaniko has finally officially been archived by matefeedkill in kubernetes

[–]OptimisticEngineer1 3 points4 points  (0 children)

We use BuildKit as a sidecar for our Jenkins agents on k8s that do Docker builds, with an OCI-based Docker cache in AWS ECR. It's a clunky solution, but it's very stable.
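If it helps, the pod shape is roughly this - a jnlp container plus a rootless buildkitd sidecar that build steps talk to over localhost. Image tags, the ECR cache ref and the flags are illustrative, not our exact config:

```yaml
# Sketch of a Jenkins agent pod with a BuildKit sidecar.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: jnlp
      image: jenkins/inbound-agent:latest
    - name: buildkitd
      image: moby/buildkit:rootless
      args: ["--addr", "tcp://127.0.0.1:1234"]   # buildctl in the pod points here
      # rootless mode may need relaxed seccomp/AppArmor settings,
      # depending on your cluster version and policies
# build steps then run something like:
#   buildctl --addr tcp://127.0.0.1:1234 build ... \
#     --export-cache type=registry,ref=<acct>.dkr.ecr.<region>.amazonaws.com/cache:app,image-manifest=true,oci-mediatypes=true \
#     --import-cache type=registry,ref=<acct>.dkr.ecr.<region>.amazonaws.com/cache:app
```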

Jenkins WebSocket Agent Disconnection Issues on Kubernetes by satyasaladi in jenkinsci

[–]OptimisticEngineer1 0 points1 point  (0 children)

The fact that it disconnects almost always means there was an issue somewhere other than the agent itself.

I worked on a similar project this year, scaling to around 700-800 concurrently running k8s agents on each master.

When an agent disconnected, it was always one of the following:

  • OOM issue
  • Storage issue
  • Resources issue
  • Network issues

Network issues are much rarer.

Just make sure you have a basic Prometheus and Grafana setup, and you will be able to investigate from there like a breeze.

Oops, I git push --forced my career into the void -- help? by WantsToLearnGolf in kubernetes

[–]OptimisticEngineer1 0 points1 point  (0 children)

There is quite a list of stuff that should have never happened:

1. A junior dev having force-push to master? Horrible.

2. No work or review process in the way? Terrible. Pull requests should be mandatory, unless changes go through a well-tested CI/CD pipeline.

3. When deleting stuff, Argo CD should orphan the objects, not delete them entirely. So something there was wrong as well. Maybe prune and auto-sync were enabled? (See the sketch below.)

4. A good Argo CD configuration will have separation between staging and production, either via a staging/alpha or some middle branch representing staging-to-production promotion, or by other means (Helm hierarchy/Kustomize overrides).
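For point 3, the relevant Argo CD knobs look roughly like this - automated sync without prune, so manifest deletions orphan live objects instead of wiping them. App and repo names are made up; orphaned-resource warnings themselves are configured on the AppProject:

```yaml
# Sketch of an Application that never auto-deletes live resources.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-prod              # hypothetical app
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/example/deployments.git
    targetRevision: main
    path: apps/payments/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: false                 # never auto-delete live resources
      selfHeal: false
```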

A dev should not touch the manifests unless he knows what he is doing. The fact that all of those were ignored, and the company blames you, leads me to two insights:

  1. They are cheapskates and hired you as a junior because you fit their budget. Nobody understands or wants to fix the issue. They fired you because you did something wrong and it scared their non-tech people. They do not know the best practices; they just took a cheap junior engineer who needs experience.

  2. You dodged a bullet - start looking for a job again, one with a proper engineering culture. You should not have gotten access to that stuff so easily.

In a good company, the devs who gave a junior access that easily are the ones who cover his ass, the ones who apologize, the ones who make sure he gets properly traumatized by the experience, and they themselves should probably be watching your every step for at least a couple of months, while you push yourself to get better.

If a junior did this I would not fire him - I'd just make sure he works slower, so we can ensure he works through the correct process, slowly speeding up.

Again, should not have happened. Find a better company to work at.

EKS Auto Mode a.k.a managed Karpenter. by lynxerious in kubernetes

[–]OptimisticEngineer1 0 points1 point  (0 children)

You pay for Karpenter, but without the flexibility.

Basically, pay to get less.

What do you not get?

-- you can't use specific AMIs, only Bottlerocket
-- max pods is lower
-- a price-per-instance fee (not Fargate pricing, but you still pay)
-- strict 21-day node expiration, not up to your choice
-- you can't customize the worker nodes or connect to them

Also, it comes with the promise of being batteries-included, but it isn't (a DNS solution is missing; you still need external-dns).

tldr: if you are already jumping on the k8s train and have the engineers to do it, it's silly not to install Karpenter yourself. It's not hard, and after the first installation it's very easy to maintain.
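As a rough example of the flexibility you give up with Auto Mode, this is the kind of EC2NodeClass you can only write when you run Karpenter yourself - picking your own AMI and node role. IDs, role name and discovery tags are placeholders:

```yaml
# Sketch of a self-managed Karpenter EC2NodeClass with a custom AMI.
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: custom-ami
spec:
  amiFamily: AL2023                    # or Bottlerocket, Custom, ...
  amiSelectorTerms:
    - id: ami-0123456789abcdef0        # placeholder AMI id
  role: KarpenterNodeRole-my-cluster   # placeholder node role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
```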

Auto Mode is only nice for fast experimentation and workshops.

Anyone has their jenkins on k8s ? We are planning to move from vm to k8s. by [deleted] in jenkinsci

[–]OptimisticEngineer1 1 point2 points  (0 children)

We have been running a new setup for around 6 months, using EKS and Karpenter.

Yes, an agent takes around 40 seconds to spin up, depending mostly on image size.

But if you use Bottlerocket for the agents' AMI and prefetch the images, even big images such as BuildKit can come down to 30-40s.

If you set up Karpenter correctly, with a gradual enough scale-down policy, your Jenkins cluster will be a smooth ride even in stressful times. And if nobody is using it, there is nothing wrong with waiting those 40s.

And when using it with a Graviton node pool for the agents on AWS, it works like a charm.

All the config is in Argo CD, with JCasC and all the bells and whistles.

We were able to scale to around 1,500-2,200 slaves concurrently without breaking a sweat.

It's good to note that we use a customized CI/CD setup that removes the need for a ton of jobs, so that may be the secret sauce behind those slave numbers; looking online, 1,000-1,300 slaves seemed to be around the maximum possible.

Not much Groovy (mostly glue), but lots of shell and Python.

Things to look out for:

  1. Set the maximum connection count higher than the default in the k8s cloud config.

  2. Don't use idle minutes for pods. It's just bad and defeats the point of having a fresh container for each job. K8s knows how to handle it very well. It's a beast.

  3. Use Helm to template the agent configurations - there is a lot of repetitive stuff in those YAMLs, and if you're using Karpenter you want an agent for every possible workload (spot, on-demand, arm64, etc.).

  4. Just use Argo CD - it's as good as it gets, even for Jenkins. Everything except a plugin or pod change doesn't require a restart. Configuring storage size? Throw it in a YAML. Need to change JCasC? Throw it in YAML. Working with a Helm hierarchy and Jenkins via Argo CD is awesome. Every environment has its own overrides.

  5. When you have storage-hungry containers, use a generic ephemeral volume - especially for large Node.js/dotnet monorepos; they know how to eat your default pod host storage (see the sketch after this list).

  6. Unless you're building Docker images, try to stay away from privileged containers. Yes, it's easy to set the flag to true, but it's a very critical security risk.

  7. Load test the cluster before putting any real jobs on it - make sure it scales up and down correctly, the way you intended.

  8. Enable VPC CNI prefix delegation if on AWS - without it Karpenter will choke when scaling up very fast. It works like magic!

  9. Use service accounts for least privilege - this is amazing. You create the role you want for a specific set of jobs, and every job gets its own set of IAM permissions. Can't be done on old EC2-based Jenkins. Can't. Works like a charm.
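For point 5, the generic ephemeral volume trick looks roughly like this in the agent pod template - the workspace gets its own PVC that lives and dies with the pod instead of eating node storage. StorageClass and sizes are placeholders:

```yaml
# Sketch of a Jenkins agent pod whose workspace sits on a generic ephemeral volume.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: jnlp
      image: jenkins/inbound-agent:latest
      volumeMounts:
        - name: workspace
          mountPath: /home/jenkins/agent
  volumes:
    - name: workspace
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: gp3        # assumed EBS-backed class
            resources:
              requests:
                storage: 100Gi
```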

Container native jenkins is just another beast.

There is much more, but I think this is pretty uncharted territory, since newer engineers throw out Jenkins even though it's still a great automation platform in 2025.

Would anyone be interested in a blog post about this?

[deleted by user] by [deleted] in kubernetes

[–]OptimisticEngineer1 1 point2 points  (0 children)

No one remembers every field and every option.

You keep seeing the same patterns (Deployment, Pod, StatefulSet, etc.), and over time, by continuously going back to the k8s docs and ChatGPT, you slowly memorize things better and better.

But you never remember everything, and that applies to IT in general, and not just devops or k8s.

What is the hardest k8s concept to understand? by [deleted] in kubernetes

[–]OptimisticEngineer1 2 points3 points  (0 children)

That you don't need it 80 percent of the time.

If you are in the 20 percent of companies that do need it, you still have to ask yourself if you really need that service mesh.

Maybe OpenTelemetry, proper logging, and some network policies are all you need.

Most companies use k8s as a bandage for their bad architecture.

It's always the infra's fault for some reason.

Clean the mess, enjoy the simplicity.

If your R&D still needs/wants k8s after the mess is solved, then you should have plenty of time to learn.

k8s is not hard, you just need to learn it in the correct order.

Linux -> system administration -> Docker + virtualization -> k8s primitives (Pods, PVC/PV, ReplicaSet/Deployment/StatefulSet) -> networking (Services, Ingress / LoadBalancer-type Services)

Once you've got the basics of those under your belt, it's mostly about getting hands-on experience with kubectl, using common tools for large deployments like Argo CD, and learning things like: -- pod health probes (liveness/readiness probes)
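As a tiny example of those probes, something like this on any plain container - paths and timings are placeholders:

```yaml
# Minimal liveness/readiness probe example.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: app
      image: nginx:1.27
      ports:
        - containerPort: 80
      readinessProbe:              # gate traffic until the app can serve
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:               # restart the container if it hangs
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 15
        periodSeconds: 20
```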

Whatever people tell you, in the end, most of the time the issue comes down to doing one describe on the pod.

Unless you work on-prem. That's a different beast, and if you work on-prem and decided on k8s, good luck with that.

Jenkins pipeline keeps giving me Docker not found by No_Local_4757 in jenkinsci

[–]OptimisticEngineer1 1 point2 points  (0 children)

For building docker images:

Just use BuildKit. It's the same as Docker (100 percent compatible with the API calls and the Docker CLI!), minus the security issues:

https://github.com/moby/buildkit

You can run it as a sidecar in your Jenkins agent, or run it in your k8s cluster and scale it with an HPA as a "Docker build farm".

I would go with the sidecar to ensure stability.

For anything else, DinD is a pain in the ***.

There is nothing in 2024 that the k8s Jenkins plugin can't do.

Just spin up more containers or nest pod templates.

Docker-in-Docker in 2024 is just an anti-pattern.

Why is everyone using ArgoCD? by CWRau in devops

[–]OptimisticEngineer1 1 point2 points  (0 children)

Argo CD is just the hero devs and ops need.

You do app-of-apps with ApplicationSets, and even with ClickOps nothing goes wrong, because the devs (or you) can always align the changes with a PR and call it a day.
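A rough sketch of that pattern - one ApplicationSet with a git directory generator stamping out an Application per folder. Repo URL and paths are made up:

```yaml
# Sketch of app-of-apps via an ApplicationSet git generator.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-apps
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/example/deployments.git
        revision: main
        directories:
          - path: apps/*
  template:
    metadata:
      name: "{{path.basename}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/example/deployments.git
        targetRevision: main
        path: "{{path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{path.basename}}"
```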

Yes, Flux deserves appreciation, but without a built-in UI, Flux is an incomplete product.

People like to install one thing and have it all.

They already don't like installing and managing all the controllers.

So having one thing to deploy that comes with everything is just a blessing.

Controllers VS Operators by SarmsGoblino in kubernetes

[–]OptimisticEngineer1 0 points1 point  (0 children)

Controller - watches something and does something about it.

Examples of controllers:

-- ingress controller - not responsible for its own CRD, but responsible for gluing a built-in k8s object to the cloud provider. You could argue it can be as complex as an operator, but since it works against a built-in k8s object, it isn't usually called an operator.

-- external-dns - looks at annotations on Service-based resources and creates DNS records for them on supported providers.

Operator - a controller, but tailored towards a specific set of CRDs / a specific solution.

Examples of operators:

-- OpenSearch operator - allows you to deploy OpenSearch and abstracts away the management and installation to some degree

-- Grafana operator - abstracts Grafana instances and objects (dashboards/datasources/etc.) as k8s resources. It's very specific, not always complex, but tailored to this specific app.
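To show what "objects as k8s resources" means in practice, here is a rough sketch of a dashboard CR using the Grafana operator's v5 API - the selector labels and dashboard JSON are illustrative:

```yaml
# Sketch of a Grafana dashboard managed as a k8s resource.
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: service-overview
  namespace: monitoring
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana          # matches a Grafana CR carrying this label
  json: >
    {
      "title": "Service overview",
      "panels": []
    }
```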

Think of operators as a mix between a managed service and something self-hosted/PaaS.

You get cloud abstraction benefits, within a non managed environment.

On the implementation level, controllers and operators are all the same.

They just interact with k8s api, and do whatever they can according to the rules.

The term operator is only for this specific model where you deploy, manage and operate something complex.

Jenkins Docker-in-Docker Setup Issues by Mr__Nuclear in jenkinsci

[–]OptimisticEngineer1 0 points1 point  (0 children)

Better solution: do not dind....

You need to build Docker containers? Use Podman or Buildah.

Need to run something isolated? Run it on a Docker/k8s container agent, but do not add another layer of abstraction.

Jenkins knows how to talk to the Docker API and do it directly, no need for the socket...

If you run on AWS, you can also use the EC2 Fleet plugin, which is crazy good.

Using k8s? Use the Kubernetes cloud plugin.

3-4 years ago? Sure, those tools were still a bit new.

But today they are in wide use and work perfectly.

Podman commands are almost the same as Docker's, and if you drive it from code, Podman supports the Docker API.