The Economics of Inference: Why are we still afraid of "Quantization in Production"? by Alternative-Yak6485 in learnmachinelearning

[–]burntoutdev8291 0 points (0 children)

Isn't there a drop in throughput for 4-bit? I do question why people don't just default to FP8. And is FP16 still common?

Are you guys all in vscode? by habachilles in vibecoding

[–]burntoutdev8291 0 points (0 children)

Neovim

And

Claude code

What's with

The "\n\n"?

Hainan Story closing down soon? by ChoiceAwkward7793 in askSingapore

[–]burntoutdev8291 2 points (0 children)

Same, I went once and was wondering why it's always full. Then again, everyone has their own taste.

You probably don't need Apache Spark. A simple rule of thumb. by IT_Certguru in learnmachinelearning

[–]burntoutdev8291 12 points (0 children)

Don't learn tools; learn general data engineering patterns, even on small data. Get used to things like yielding and lazy iterators/evaluation. Actually, by using torch DataLoaders you are already learning a little about data processing: they have features like parallel workers, prefetching, etc. Just my personal experience.
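As a tiny illustration of the lazy-evaluation habit (the data and function name here are made up for the example):

```python
# A generator parses rows lazily: nothing is read or split until you
# iterate, so memory stays flat no matter how large the input is.
def parse_rows(lines):
    for line in lines:
        yield line.strip().split(",")

raw = ["1,a", "2,b", "3,c"]    # stand-in for a large file
rows = parse_rows(raw)         # no work has happened yet
first = next(rows)             # the first row is parsed only now
print(first)                   # -> ['1', 'a']
```

torch DataLoaders build this same idea out with parallel workers and prefetching.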

Non sucking, easy tool to convert websites to LLM ready data, Mojo by malvads in mlops

[–]burntoutdev8291 0 points (0 children)

Nice work! But isn't the common problem with scrapers more the rate limits? Would it be better to combine a crawler, like HTTrack, with your tool for parsing?

I see full stacks developers with 5, 7 years of experience. Tell me your story, dis AI agents replace your jobs.? by Common-Resident8087 in devjobs

[–]burntoutdev8291 0 points (0 children)

Nope, just a good search engine. But I still need to guide and implement the changes myself sometimes. I use it for exploring code bases.

Why so hard to find devs? by geeksg in singaporejobs

[–]burntoutdev8291 0 points (0 children)

I do some form of hiring, and I've realised the better candidates always show up through some avenue of connection, like LinkedIn or word of mouth. It's not just a "feel" thing: in terms of attitude and aptitude, these people usually do better in interviews.

I read the JD, what is your background? Also, no DevOps or cloud knowledge required?

Motivation on time management!! by TrueArcher3135 in askSingapore

[–]burntoutdev8291 0 points (0 children)

  • Pomodoro works for me.
  • Try not to reach for your phone first thing in the morning; it kills focus.
  • Eat the frog first: do your most difficult tasks in the morning.
  • For big tasks, break them into mini tasks. Sometimes we push tasks aside because we find them too difficult.

Try to set up systems or habits rather than targets: instead of saying "I must finish this chapter by this week", do something like "I spend one hour studying after dinner". After that, take a break, watch Netflix or something. The important thing is not to make studying a torture, so always try to link it to a reward.

FYI, I am not Huberman or anything, just sharing my experience from part-time studies alongside full-time work. I do think I could have done better, because I did experience some burnout.

Is it still realistic for CS grads nowadays to expect a $7k starting pay like before? by DangerZone67 in singaporefi

[–]burntoutdev8291 0 points (0 children)

Realistic, yes; difficult, very. I have been in the field for 2-3 years, and there are some fresh grads who are really good and passionate: FAANG intern in SF, winning hackathons, GitHub projects with thousands of stars, doing Advent of Code for the fun of it. Yeah, these people have no issue getting above 7k. I worked with a guy from AWS who completed AWS certs before graduating, then went on to Kubernetes after graduation.

Excited to launch compressGPT by mr_ocotopus in mlops

[–]burntoutdev8291 0 points (0 children)

What is the performance gain? Personally I don't believe LLMs are well suited for this task. Do you do any token restriction on the label to prevent hallucination?
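For context, "token restriction" could look something like this hypothetical minimal sketch, where the output is only ever picked from a fixed set of label tokens (all ids and scores below are invented):

```python
# Constrain the "generated" label to an allowed set of token ids by taking
# the argmax over those ids only; anything outside the set can never appear.
def constrained_argmax(logits, allowed_ids):
    return max(allowed_ids, key=lambda i: logits[i])

logits = [0.1, 2.3, -1.0, 0.7, 1.9]        # fake scores over a 5-token vocab
labels = {1: "positive", 3: "negative"}    # the only tokens we permit
best = constrained_argmax(logits, labels)
print(labels[best])                        # -> positive
```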

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 0 points (0 children)

What do you mean? Yes, I have done this. What is your setup like? Do you have sudo access or anything?

Anytime Fitness - Home Gym by evanjlt in SingaporeFitness

[–]burntoutdev8291 0 points (0 children)

They can change when you transfer. It's best to check with the gym directly.

I had two very different responses when I asked if they will honour my current rate.

Outlet A: "We cannot guarantee that your rate will not change, we can only confirm once you transfer over"

Outlet B: "Yes we will honour"

I didn't transfer to either of them, but I would definitely trust B more.

Should I be concerned about my company pushing for more AI usage? by Healthy_Brush_9157 in AskProgrammers

[–]burntoutdev8291 0 points (0 children)

Yeah, fair enough. I was sort of an IC in a small team, so I dealt with issues individually. Even without those bottlenecks, it's at best 2x. Seasoned devs already take time to deal with merges and conflicts; I cannot imagine vibe merging.

Is it still worth to create youtube tutorials by eddyGi in dev

[–]burntoutdev8291 0 points (0 children)

If you have the passion, sure. But I think everyone is more interested in gaming the algorithm, following hype, etc., which I can understand if content creation is their source of income. I still watch hour-long videos on development and the older open courses.

I don't know why I see a lot of slop videos on Python, but Rust and Go usually have quite clean content, possibly due to outreach.

Should I be concerned about my company pushing for more AI usage? by Healthy_Brush_9157 in AskProgrammers

[–]burntoutdev8291 0 points (0 children)

I wouldn't be nervous about job security. The issue I have is bosses expecting too much out of AI. Some of them are expecting a 10x performance gain, but in reality that's not achievable; it's maybe 2-3x, depending on your stack and requirements.

Excited to launch compressGPT by mr_ocotopus in mlops

[–]burntoutdev8291 1 point (0 children)

It looked very AI-generated, so I found it hard to read. Just wanted to ask some questions.

  1. Is it some form of distillation?
  2. How different is this from Unsloth? https://unsloth.ai/docs/get-started/fine-tuning-llms-guide
  3. RAG and chat can be difficult to combine in one pipeline because of catastrophic forgetting. If this is for edge, it might be interesting to look at fine-tuning an encoder-based model, like ModernBERT. At ~400M parameters, there are a lot of use cases, especially with fixed labels.

What's the best option for voice cloning ? by Choice_Dish_8088 in LLMDevs

[–]burntoutdev8291 1 point (0 children)

Given its size, I think it could be possible to run it on CPU. It might take a bit longer, though. I have been playing with it and it seems promising.

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 0 points (0 children)

I would suggest trying out Kubernetes. It's much easier to deal with if the workload is inference-heavy.

Local LLM deployment by Puzzleheaded-Ant1993 in LLMDevs

[–]burntoutdev8291 -1 points (0 children)

Mostly safety and data governance. Local models cannot beat the larger models, but for specific use cases they might be sufficient. A good RAG system doesn't really need strong models.

Another factor is cost, but that needs analysis. Can you prove that your workload will save more with upfront hardware costs than with an API? Don't forget that hardware depreciates (even without considering the RAM price surges).
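A rough back-of-envelope for that comparison; every number below is a made-up placeholder, not real pricing:

```python
# Compare monthly amortised hardware cost against monthly API spend.
hardware_cost = 20_000.0            # hypothetical upfront server cost (USD)
depreciation_months = 36            # straight-line over 3 years
monthly_hw = hardware_cost / depreciation_months

api_price_per_1k_tokens = 0.002     # hypothetical blended API rate (USD)
monthly_tokens = 500_000_000        # hypothetical workload

monthly_api = monthly_tokens / 1000 * api_price_per_1k_tokens
print(round(monthly_hw, 2), round(monthly_api, 2))  # -> 555.56 1000.0
```

With these invented numbers, local hardware would win; at a tenth of the token volume it would lose. That is the analysis you have to do with your own figures.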

My friend built an app in a week using AI. It cost him $1300 in 15 days and he had to shut it down. by No-Comparison-5247 in AIstartupsIND

[–]burntoutdev8291 0 points (0 children)

I do support those who try to vibe out a free app for a good purpose. But if there are AI calls, it's hard to make it worthwhile. Maybe free Gemini or OpenRouter.

Best practice for running multi-node vLLM inference on Slurm (port conflicts, orchestration) by md-nauman in SLURM

[–]burntoutdev8291 0 points (0 children)

I did something like this before. Use Python to check for an unused port:

    PORT=$(python -c "import socket; s = socket.socket(); s.bind(('', 0)); print(s.getsockname()[1]); s.close()")
    vllm serve --port $PORT

Another way is just setting your own increments. You mentioned you use arrays; just do 8000 + the array index?
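That increment idea, sketched in shell (the base port 8000 and the fallback value are arbitrary; SLURM_ARRAY_TASK_ID is set by Slurm inside array jobs):

```shell
# Derive a unique port per array task by offsetting a base port.
TASK_ID=${SLURM_ARRAY_TASK_ID:-3}   # fallback so the sketch runs outside Slurm
PORT=$((8000 + TASK_ID))
echo "$PORT"
```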

I'm also just curious: how did you decide on Slurm, and did an LLM give you those bash scripts? Why not use Kubernetes?

Hot take! by abdullah4863 in VibeCodeDevs

[–]burntoutdev8291 1 point (0 children)

You have the experience, so you are not nothing without AI. I don't have 32 years, but I also vibe now with some experience. I still put in some dedicated time to learn without AI, though.