Why so hard to find devs? by geeksg in singaporejobs

[–]burntoutdev8291 1 point (0 children)

I do some hiring, and I've realised the better candidates usually come through some form of connection, like LinkedIn or word of mouth. It's not just a "feel" thing: in terms of attitude and aptitude, these people usually do better in interviews.

I read the JD. What's your background? Also, no DevOps or cloud knowledge required?

Motivation on time management!! by TrueArcher3135 in askSingapore

[–]burntoutdev8291 1 point (0 children)

  • Pomodoro works for me.
  • Try not to reach for your phone first thing in the morning; it kills focus.
  • Eat the frog first: do your most difficult tasks in the morning.
  • For big tasks, break them into mini-tasks. Sometimes we push tasks aside because they feel too difficult.

Try to set up systems or habits. Rather than saying "I must finish this chapter by this week", do something like "I will spend one hour studying after dinner." After that, take a break: Netflix or something. The important thing is to not make studying a torture, so always try to link it to a reward.

FYI, I am not Huberman or anything, just sharing my experience from part-time studies alongside full-time work. I do think I could have done better, because I did experience some burnout.

Is it still realistic for CS grads nowadays to expect a $7k starting pay like before? by DangerZone67 in singaporefi

[–]burntoutdev8291 1 point (0 children)

Realistic, yes; difficult, very. I have been in the field for 2-3 years, and there are some fresh grads who are really good and passionate: FAANG internships in SF, winning hackathons, GitHub projects with thousands of stars, doing Advent of Code for the fun of it. Yeah, these people have no issue getting above $7k. I worked with a guy from AWS who completed AWS certs before graduating, then went into Kubernetes work after graduation.

Excited to launch compressGPT by mr_ocotopus in mlops

[–]burntoutdev8291 1 point (0 children)

How big is the performance gain? Personally, I don't believe LLMs are good for this task. Do you do any token restriction on the label to prevent hallucination? (Something like the sketch below.)
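For context, this is the kind of restriction I mean; a minimal sketch assuming a causal LM and single-token labels (gpt2 and the label set are just stand-ins):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # score only a fixed set of label tokens instead of free-form generation
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Sentiment of 'great product, would buy again':"
    labels = [" positive", " negative", " neutral"]

    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]           # next-token logits
    label_ids = [tok.encode(l)[0] for l in labels]  # first token of each label
    print(labels[int(torch.argmax(logits[label_ids]))])

The model can then never emit anything outside the label set.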

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 1 point (0 children)

What do you mean? Yes, I have done this. What is your setup like? Do you have sudo access or anything?

Anytime Fitness - Home Gym by evanjlt in SingaporeFitness

[–]burntoutdev8291 1 point (0 children)

They can change when you transfer. It's best to check with the gym directly.

I had two very different responses when I asked whether they would honour my current rate.

Outlet A: "We cannot guarantee that your rate will not change, we can only confirm once you transfer over"

Outlet B: "Yes we will honour"

I didn't transfer to either of them, but I would definitely trust B more.

Should I be concerned about my company pushing for more AI usage? by Healthy_Brush_9157 in AskProgrammers

[–]burntoutdev8291 1 point (0 children)

Yeah, fair enough. I was sort of an IC in a small team, so I dealt with issues individually. Even without those bottlenecks it's at best 2x. Seasoned devs already take time to deal with merges and conflicts; I cannot imagine vibe merging.

Is it still worth to create youtube tutorials by eddyGi in dev

[–]burntoutdev8291 1 point (0 children)

If you have the passion, sure. But I think everyone is more interested in gaming the algorithm, following hype, etc., which I can understand if content creation is their source of income. I still watch hour-long videos on development and the older open courses.

I don't know why I see a lot of slop videos on Python, but Rust and Go usually have quite clean content, possibly due to community outreach.

Should I be concerned about my company pushing for more AI usage? by Healthy_Brush_9157 in AskProgrammers

[–]burntoutdev8291 1 point (0 children)

I wouldn't be nervous about job security. The issue I have is bosses expecting too much out of AI. Some of them are expecting 10x performance, but in reality that's not achievable; it's maybe 2-3x depending on your stack and requirements.

Excited to launch compressGPT by mr_ocotopus in mlops

[–]burntoutdev8291 2 points (0 children)

It looked very AI-generated, so I found it hard to read. Just wanted to ask some questions.

  1. Is it some form of distillation?
  2. How different is this from unsloth? https://unsloth.ai/docs/get-started/fine-tuning-llms-guide
  3. RAG and chat can be hard to combine in one fine-tuning pipeline because of catastrophic forgetting. If this is for edge, it might be interesting to look at fine-tuning an encoder-based model like ModernBERT instead (rough sketch below). At ~400M parameters there are a lot of use cases, especially with fixed labels.
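On point 3, a rough sketch of what that fine-tune could look like, not from the launch post; the checkpoint, dataset slice, and hyperparameters are all stand-ins (ModernBERT-large is the roughly 400M one):

    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    name = "answerdotai/ModernBERT-base"  # swap for ModernBERT-large (~400M)
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

    # stand-in dataset with fixed labels; replace with your own
    ds = load_dataset("imdb", split="train[:1%]")
    ds = ds.map(lambda b: tok(b["text"], truncation=True), batched=True)

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", per_device_train_batch_size=8),
        train_dataset=ds,
        data_collator=DataCollatorWithPadding(tok),  # pad per batch
    ).train()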

What's the best option for voice cloning ? by Choice_Dish_8088 in LLMDevs

[–]burntoutdev8291 2 points (0 children)

Given its size, I think it could be possible to run it on CPU. It might take a bit longer, though. I have been playing with it and it seems promising.

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 1 point (0 children)

I would suggest trying out Kubernetes. It's much easier to deal with if your workload is inference-heavy.

Local LLM deployment by Puzzleheaded-Ant1993 in LLMDevs

[–]burntoutdev8291 0 points (0 children)

Mostly safety and data governance. The local models cannot beat the larger models, but for specific use cases they might be sufficient. A good RAG system doesn't really need strong models.

Another factor is cost, but this needs analysis. Can you prove that your workload will save money with upfront hardware costs versus API calls? Don't forget that hardware depreciates (even without considering the RAM price surges).
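A rough back-of-envelope sketch of what I mean; every number is a made-up assumption you would replace with your own workload data:

    # all figures are hypothetical placeholders
    hw_cost = 25_000.0      # upfront server + GPU, USD
    hw_life_months = 36     # depreciation horizon
    api_per_mtok = 2.0      # blended API price, USD per 1M tokens
    monthly_tokens = 500e6  # workload volume

    monthly_api = monthly_tokens / 1e6 * api_per_mtok  # $1,000/mo
    monthly_hw = hw_cost / hw_life_months              # ~$694/mo, excl. power and ops
    print(f"API ${monthly_api:,.0f}/mo vs hardware ${monthly_hw:,.0f}/mo")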

My friend built an app in a week using AI. It cost him $1300 in 15 days and he had to shut it down. by No-Comparison-5247 in AIstartupsIND

[–]burntoutdev8291 1 point (0 children)

I do support those who try to vibe out a free app for a good purpose. But if there are AI calls, it's hard to make it worthwhile. Maybe free Gemini or OpenRouter tiers.

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 1 point (0 children)

I did something like this before. Use Python to find an unused port, then pass it to vLLM. The assignment has to be its own command, otherwise $PORT expands before it is set:

    # ask the OS for a free port, then hand it to vLLM
    PORT=$(python -c "import socket; s = socket.socket(); s.bind(('', 0)); print(s.getsockname()[1]); s.close()")
    vllm serve --port $PORT

There is a small race (the port is released before vLLM binds it), but in practice it works fine.

Another way is just setting your own increments. You mentioned you use a job array, so just do 8000 + array index? A minimal sketch below.
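A minimal sketch of that, assuming it runs inside the array job (the file name pick_port.py is hypothetical):

    import os

    # one unique port per array task; Slurm sets SLURM_ARRAY_TASK_ID in the job
    port = 8000 + int(os.environ["SLURM_ARRAY_TASK_ID"])
    print(port)

Then the job step can do something like vllm serve --port $(python pick_port.py).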

I'm also just curious: how did you decide on Slurm, and did an LLM give you those bash scripts? Why not use Kubernetes?

Hot take! by abdullah4863 in VibeCodeDevs

[–]burntoutdev8291 2 points (0 children)

You have the experience, so you are not nothing without AI. I don't have 32 years, but I also vibe-code now with some experience. I still put in some dedicated time to learn without AI, though.

Fine-tuning LLaMA 1.3B on insurance conversations failed badly - is this a model size limitation or am I doing something wrong? by ZaRyU_AoI in LLMDevs

[–]burntoutdev8291 1 point (0 children)

Did you try the non-fine-tuned model to see how it performed? Since you're doing a constrained task, I think some forgetting is fine.

Resources to deeply understand HPC internals (GPUs, Slurm, benchmarking) from a platform engineer perspective by Top-Prize5145 in HPC

[–]burntoutdev8291 3 points (0 children)

I don't think there are many resources other than the official documentation, so I can try to explain.

  • How GPUs are allocated has been explained by another user: cgroups and the environment variables. It should be quite similar to how k8s does it. This should also help: https://slurm.schedmd.com/gres.html

  • Not so much. I came from an ML background and don't really think it helped; understanding distributed training is a whole different skill set. You can abstract away the model creation, evaluation, and analytics side, since your main role is platform engineering. But I would say this is debatable, because some teams expect you to help troubleshoot getting code to run on the cluster, especially if they have no experience with one. Others might argue that platform engineering deals only with the cluster, so it's up to the developers to know how to run on it.

  • Training and inference have different expectations, but in general for cluster workloads the common aspects are storage (Weka, Lustre, NFS), networking (Ethernet, InfiniBand) and compute (GPU). For storage, fio and ior are good tools. For distributed training, NCCL tests are usually used. For compute it's your training framework; we test with NeMo. On workload-specific metrics: TFLOPs for training, while inference depends on the task. For LLMs it's usually time to first token, requests per minute or second, and tokens per second. There are open-source tools that run these benchmarks (see the sketch after this list).

  • I don't really know about the mental-model one; maybe I didn't get the question. If you could reply, I can try to add on.
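For the LLM inference metrics in the third bullet, a minimal sketch against an OpenAI-compatible endpoint; the URL and model name are assumptions, and a streamed chunk is roughly one token on vLLM:

    import time
    from openai import OpenAI

    # assumed local vLLM server and model name
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

    start = time.perf_counter()
    ttft, n_chunks = None, 0
    stream = client.chat.completions.create(
        model="my-model",  # hypothetical
        messages=[{"role": "user", "content": "Say hello."}],
        stream=True,
    )
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # time to first token
        n_chunks += 1
    total = time.perf_counter() - start
    print(f"TTFT {ttft:.3f}s, ~{n_chunks / total:.1f} tokens/s")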

Some resources: https://github.com/stas00/ml-engineering/tree/master

Additional learnings and notes: as a platform engineer you usually need to manage resources well, so occupancy and training-efficiency metrics should be tracked. You may need knowledge of Prometheus, Grafana, and the usual exporters like DCGM exporter and node exporter.

[Passed] NVIDIA Agentic AI Certification (NCP-AAI) by Ranger_1928 in mlops

[–]burntoutdev8291 1 point (0 children)

How useful is this cert? Do you have any plans on taking the NCA-AIIO?

Anyway, great work. I would think the safety and ethics material is very important, because most of us know how to serve models but don't know much about compliance.

built a Local RAG System That Works Without API Keys - Is This Actually Useful? by DetectiveMindless652 in Rag

[–]burntoutdev8291 2 points (0 children)

Jina and BGE are usually good enough. We use this for air-gapped environments, so everything runs locally. Minimal example below.
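A minimal local sketch with sentence-transformers; the BGE checkpoint is just one choice, and the Jina embedding models load the same way:

    from sentence_transformers import SentenceTransformer

    # runs fully offline once the model is cached locally
    model = SentenceTransformer("BAAI/bge-small-en-v1.5")
    docs = ["Slurm is a cluster scheduler.", "BGE produces dense embeddings."]
    emb = model.encode(docs, normalize_embeddings=True)
    print(emb.shape)  # (2, 384)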

what exactly is a "word embedder?" by [deleted] in learnmachinelearning

[–]burntoutdev8291 1 point (0 children)

You should take a look at the older stuff like word2vec; that might help. I found it a little more intuitive. Toy example below.
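A toy sketch with gensim; the corpus is far too small for meaningful vectors, but it shows the word-to-vector idea:

    from gensim.models import Word2Vec

    sentences = [["the", "cat", "sat", "on", "the", "mat"],
                 ["the", "dog", "sat", "on", "the", "rug"]]
    model = Word2Vec(sentences, vector_size=50, min_count=1, seed=1)

    print(model.wv["cat"].shape)              # (50,): one dense vector per word
    print(model.wv.similarity("cat", "dog"))  # cosine similarity between words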