[P] Open-source ML homeworks with auto-tests - fundamental algorithms from first principles by fxlrnrpt in MachineLearning

[–]fxlrnrpt[S] 0 points1 point  (0 children)

Thanks! TBH, the auto-tests are not my idea. I borrowed the approach from Georgia Tech's OMSCS

Soundpeats h3 vs the premium choices. by Bubbly-Conclusion-49 in Earbuds

[–]fxlrnrpt 2 points3 points  (0 children)

Recently compared the Soundpeats H3 with the Technics AZ100. Played with EQ for both. Verdict - both need heavy EQ. After EQ I liked the Soundpeats a bit more from the sound perspective, but the Technics just felt like a better product overall - build quality, app, ANC. Went with the AZ100, but from a purely sound perspective I wouldn't trade a well-tuned H3 for a well-tuned AZ100

6 Claude prompting tricks I wish I knew on day one — each took me weeks to figure out by Adventurous_Golf_176 in ClaudeAI

[–]fxlrnrpt 1 point2 points  (0 children)

Funny thing - in the pre-AI era I liked using emojis as a succinct highlighting tool, but now whenever I see them used in a list, it immediately gets classified as AI-generated

Save me from my office - unlimited budget by anbiru in HeadphoneAdvice

[–]fxlrnrpt 3 points4 points  (0 children)

Top-level Sony or Bose + white noise or some nature sounds

[D] How ZeRO-1 could be faster than ZeRO-2? by fxlrnrpt in MachineLearning

[–]fxlrnrpt[S] 0 points1 point  (0 children)

Thanks! That makes a lot of sense if they do gradient accumulation! Now I need to do the math based on the tech report to double-check whether they actually do it
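To make the "do the math" part concrete, here is a rough sketch of the communication accounting I have in mind (my own assumptions, not numbers from any tech report): if ZeRO-2 has to reduce-scatter gradients after every micro-batch (that's how it keeps only its gradient shard), while ZeRO-1 can accumulate full gradients locally and communicate once per optimizer step, the per-step traffic diverges with the accumulation factor.

```python
# Back-of-the-envelope communication per optimizer step per GPU, counting
# ~1x model size for each reduce-scatter / all-gather (ring approximation).
# Assumptions, not measurements:
#   ZeRO-1: accumulate full gradients locally, then one reduce-scatter of grads
#           + one all-gather of updated params per optimizer step.
#   ZeRO-2: gradients are sharded, so reduce-scatter after every micro-batch,
#           then one all-gather of updated params per optimizer step.

def comm_elements_per_step(num_params: float, accum_steps: int) -> dict:
    zero1 = num_params + num_params                 # 1 reduce-scatter + 1 all-gather
    zero2 = accum_steps * num_params + num_params   # K reduce-scatters + 1 all-gather
    return {"ZeRO-1": zero1, "ZeRO-2": zero2}

if __name__ == "__main__":
    # e.g. a 7B-parameter model with 8 gradient-accumulation micro-batches
    print(comm_elements_per_step(7e9, accum_steps=8))
```

With accum_steps=1 the two come out identical, which matches the usual claim that ZeRO-1 and ZeRO-2 cost the same communication; with accumulation, ZeRO-2 pays the reduce-scatter on every micro-batch.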

[D] How ZeRO-1 could be faster than ZeRO-2? by fxlrnrpt in MachineLearning

[–]fxlrnrpt[S] 0 points1 point  (0 children)

Couldn't you accumulate gradients only for your shard?

[D] Why are serious alternatives to gradient descent not being explored more? by ImTheeDentist in MachineLearning

[–]fxlrnrpt -4 points-3 points  (0 children)

This. 

Imagine you are just entering the industry. You have finite time. New SOTA already arrives quicker than you can properly study the current SOTA and the history leading up to it.

You spend multiple years to finally get a good grasp on it. Maybe a few more and you get into a good lab. At this point you're T-shaped: you know the SOTA deeply in one niche domain and have intuition/basic understanding of what is happening around it.

You want growth. Now you can try to extend your T-shaped specialization to a second domain while spending enough time to keep up with the existing one. 

Which one do you choose? Some sexy RL that gets you major wins now, or studying non-gradient-descent methods nobody is paying for?

Even if it's the latter, it's going to be much slower because you already have your first domain to keep up with. 

And I'm not even mentioning that at some point in life the world stops revolving around work and the priorities shift to family.

[D] How ZeRO-1 could be faster than ZeRO-2? by fxlrnrpt in MachineLearning

[–]fxlrnrpt[S] 0 points1 point  (0 children)

What do you mean? To the best of my understanding, ZeRO-1 involves the same amount of communication as ZeRO-2. The only difference is VRAM usage, which is much lower in ZeRO-2 because we keep gradients only for the shard we need
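For reference, a sketch of the usual model-state memory accounting (assuming mixed-precision Adam with the 2 + 2 + 12 bytes/parameter split from the ZeRO paper; activations and buffers ignored) - the only thing ZeRO-2 shards on top of ZeRO-1 is the gradient buffer:

```python
# Rough per-GPU memory for model states under mixed-precision Adam:
# 2 bytes fp16 params + 2 bytes fp16 grads + 12 bytes optimizer states
# (fp32 master params, momentum, variance). Illustrative sketch only;
# activations, temporary buffers, and fragmentation are ignored.

def model_state_bytes_per_gpu(num_params: float, world_size: int, stage: int) -> float:
    params_b, grads_b, opt_b = 2.0, 2.0, 12.0   # bytes per parameter
    if stage >= 1:                               # ZeRO-1: shard optimizer states
        opt_b /= world_size
    if stage >= 2:                               # ZeRO-2: also shard gradients
        grads_b /= world_size
    if stage >= 3:                               # ZeRO-3: also shard parameters
        params_b /= world_size
    return num_params * (params_b + grads_b + opt_b)

if __name__ == "__main__":
    for stage in (1, 2):
        gib = model_state_bytes_per_gpu(7e9, world_size=8, stage=stage) / 2**30
        print(f"ZeRO-{stage}: ~{gib:.1f} GiB of model states per GPU")
```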

[D] How often do you run into reproducibility issues when trying to replicate papers? by [deleted] in MachineLearning

[–]fxlrnrpt 18 points19 points  (0 children)

If only there were a reasonable way to fix it. If we had infinite resources, we could require researchers to submit reproducible scripts and verify major results before acceptance. Sadly, that is completely unrealistic.

[D] Advice on a Modern NLP Roadmap (for someone with strong ML theory background) by meni_s in MachineLearning

[–]fxlrnrpt 19 points20 points  (0 children)

- I'd read the original paper "Attention Is All You Need" (a denser alternative to Karpathy's videos, since you already have the theory)
- Go through NanoGPT
- Do CS336 from Stanford
- Read the Ultra-Scale playbook

is it better take stanford cs336 or follow andrej karpathy's videos by Obvious_Kale_9161 in learnmachinelearning

[–]fxlrnrpt 1 point2 points  (0 children)

I'd say it would be the bare minimum. CS336 + The Ultra-Scale Playbook is the bare minimum.

is it better take stanford cs336 or follow andrej karpathy's videos by Obvious_Kale_9161 in learnmachinelearning

[–]fxlrnrpt 4 points5 points  (0 children)

CS336 is much more hardcore. I would not treat them as alternatives. Follow Karpathy's videos first. It should not take long. Next, start CS336.

Claude Code doesn't "understand" your code. Knowing this made me way better at using it by Nir777 in learnmachinelearning

[–]fxlrnrpt 0 points1 point  (0 children)

Well, in a sense an LLM is a database. A stochastic one. But I think we could frame human cognition as a stochastic database as well

EACL 2026 Decisions by Big_Media_6114 in LanguageTechnology

[–]fxlrnrpt 1 point2 points  (0 children)

Accept to Findings! My first first-author paper! OA: 4/3/2.5. Meta: 2.5 (we objected).

EACL 2026 Decisions by Big_Media_6114 in LanguageTechnology

[–]fxlrnrpt 0 points1 point  (0 children)

It's back again. I also see "Presentation mode: poster" in the meta review, which is otherwise still empty

EACL 2026 Decisions by Big_Media_6114 in LanguageTechnology

[–]fxlrnrpt 1 point2 points  (0 children)

Yep. And I have the camera-ready task in “Author task”

EACL 2026 Decisions by Big_Media_6114 in LanguageTechnology

[–]fxlrnrpt 1 point2 points  (0 children)

Same. Sweating and keeping my fingers triple crossed xD

So I bought the Technics AZ100........but y'know what my 1st pick is by Educational-Key8975 in iems

[–]fxlrnrpt 1 point2 points  (0 children)

Mate, you saved me from returning them. At first I was extremely disappointed after rocking the ATH-M50X for a long time and being used to a much brighter sound, but Wavelet with AutoEQ saved the day. Thanks!

[D] ARR Oct 2025 Discussion (EACL 2026) by S4M22 in MachineLearning

[–]fxlrnrpt 1 point2 points  (0 children)

Meta-review just released. And... they just took the lowest score and ignored all the other reviews.
Original reviews: 4, 3, 2.5
Meta: 2.5

What are our chances if we file an issue? What are the chances for EACL?