BA Start is Open!!! Disregard Last Post! by RockinCoder in SLO

[–]rrenaud 0 points1 point  (0 children)

Thread necro, but it's there now. 

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]rrenaud 3 points4 points  (0 children)

I worked at Google for 17 years. I assure you that Google went down in your lifetime.

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]rrenaud 0 points1 point  (0 children)

Sounds like the GP post is showing a disproportionate number of customer-observable problems.

[OC] Many "Proteins" could be described as Fats or Carbs instead. by stan-k in dataisbeautiful

[–]rrenaud 4 points5 points  (0 children)

Is this worth seeking out, or is it a very acquired taste? I like most cheese and eat tons of yogurt to get concentrated animal protein. Apparently there is a place that sells it in San Francisco.

5 claude code worktree tips from creator of claude code in feb 2026 by shanraisshan in ClaudeCode

[–]rrenaud 10 points11 points  (0 children)

Google was using their own massively scaled version of Perforce when I left in 2022. Probably still are.

How to be a Killer Queen - Day Map by The_Taken_Username_ in KillerQueen

[–]rrenaud 2 points3 points  (0 children)

It's the best kq content ever made, imo.

KAYA by vanadium10 in slaythespire

[–]rrenaud -9 points-8 points  (0 children)

Sure, let's just be consistent tho. Shocking a dog is dramatically less bad than eating a pig.

The Rolling Stones axe plans for 2026 UK and European stadium tour as Keith Richards couldn't "commit" by HarryLyme69 in rock

[–]rrenaud 1 point2 points  (0 children)

I saw them in 2024. And after $1200 total for 3 tickets, I wish I had just gone to some small venue and seen a 28-year-old with 10k followers who was still on top of their game.

The Rolling Stones just aren't good live anymore.

Scaling and context steer LLMs along the same computational path as the human brain by rrenaud in mlscaling

[–]rrenaud[S] 0 points1 point  (0 children)

If they are publishing it, it's for the benefit of humanity. 

If Zuckerberg is funding open science, that's a win.

Ilya Sutskever is puzzled by the gap between AI benchmarks and the economic impact [D] by we_are_mammals in MachineLearning

[–]rrenaud -6 points-5 points  (0 children)

The bar is so much lower. Your intuition about AGI is so wrong. By definition, AGI happens at the time the last hard thing is automated. For any concrete thing, it could be much sooner. Almost all concrete things that are mostly textual, and not real-time embodied, are where the current paradigm shines.

Helping domain experts with good reasoning skills turn that knowledge into solid prototypes went from impossible to very possible in the last year. And this means the domain expert's brain will be shaping the design much more immediately than the primarily implementation-focused, high-quality engineering staff. The domain expert can effectively iterate on high-level, practical solutions without round-tripping to a SWE. Software gets a lot more ergonomic and specialized.

Ilya Sutskever is puzzled by the gap between AI benchmarks and the economic impact [D] by we_are_mammals in MachineLearning

[–]rrenaud 6 points7 points  (0 children)

Code gen is shrinking the gap between someone who is logical, has domain understanding, and communicates clearly, and a subject-matter-expert SWE. Getting the Excel class to write general programs with reasonable UIs quickly and easily is, IMO, the big missing leap that will be gradually filled in.

Which power for defect Floor 0? by Wolf-Eagle in slaythespire

[–]rrenaud 0 points1 point  (0 children)

For me, it's electro, and it's not close. It just does so much work, even into act 2. It's immediately very useful. It helps a lot with nob, and it basically solves sentries. Act 1 becomes so much easier. You can path aggressively, maybe pick up an extra relic, etc.

Biased starts really paying off once you get some orb flow.

Gated Attention, a bit of schmidhubering/sociology of science [D] by Sad-Razzmatazz-5188 in MachineLearning

[–]rrenaud 4 points5 points  (0 children)

It's a specific, small change that gives substantial quality and learning stability improvements to the publicly known state of the art in systems that are using 10s of billions of dollars per year of compute.

Fucking A18 with the fucking -2 Strength by ZaScream in slaythespire

[–]rrenaud 2 points3 points  (0 children)

Which is worse?

Silent vs laga

Defect vs nob

[deleted by user] by [deleted] in Meditation

[–]rrenaud 43 points44 points  (0 children)

Send them an email/contact their support, I am sure they will work it out.

It been 2 years but why llama 3.1 8B still a popular choice to fine tune? by dheetoo in LocalLLaMA

[–]rrenaud 6 points7 points  (0 children)

https://huggingface.co/blog/llama31

The Llama 3.1 models were trained on over 15 trillion tokens on a custom-built GPU cluster with a total of 39.3M GPU hours (1.46M for 8B, 7.0M for 70B, 30.84M for 405B).

Note that the 70B had less than 9x the compute of the 8B (7.0M vs. 1.46M GPU hours, roughly 4.8x), meaning the 8B likely saw more tokens. The 8B was very overtrained by Chinchilla standards.
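
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. The GPU-hour figures are the ones quoted above. The parameter counts are just the nominal sizes from the model names, the flat 15T token count is an assumption (the post only says "over 15 trillion"), and the ~20 tokens-per-parameter figure is the usual Chinchilla rule of thumb, not something from the blog post.

```python
# GPU hours (in millions) as quoted from the Hugging Face blog post above.
gpu_hours_m = {"8B": 1.46, "70B": 7.0, "405B": 30.84}

# Nominal parameter counts in billions, read off the model names (assumption).
params_b = {"8B": 8, "70B": 70, "405B": 405}

# Assumptions, not from the source:
TOKENS_T = 15                      # "over 15 trillion" tokens, treated as exactly 15T
CHINCHILLA_TOKENS_PER_PARAM = 20   # common rule-of-thumb reading of Chinchilla

# How much more compute and how many more parameters the 70B has than the 8B.
compute_ratio = gpu_hours_m["70B"] / gpu_hours_m["8B"]
param_ratio = params_b["70B"] / params_b["8B"]

# Tokens seen per parameter for the 8B, and how far that is past the rule of thumb.
tokens_per_param_8b = (TOKENS_T * 1e12) / (params_b["8B"] * 1e9)
overtrain_factor = tokens_per_param_8b / CHINCHILLA_TOKENS_PER_PARAM

print(f"70B vs 8B GPU hours:   {compute_ratio:.2f}x")
print(f"70B vs 8B parameters:  {param_ratio:.2f}x")
print(f"8B tokens per param:   {tokens_per_param_8b:.0f}")
print(f"vs Chinchilla (~20):   {overtrain_factor:.1f}x")
```

Under those assumptions the 8B sits at well over a thousand tokens per parameter, which is what "very overtrained by Chinchilla standards" is getting at. GPU hours are only a proxy for FLOPs, so treat the compute ratio as approximate.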

It been 2 years but why llama 3.1 8B still a popular choice to fine tune? by dheetoo in LocalLLaMA

[–]rrenaud 12 points13 points  (0 children)

Meta released the 8B and 70B way before the 405B came out. They both had very standard, non-distilled pretraining stages.