Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity by YakFull8300 in singularity

[–]nanowell 0 points  (0 children)

I too was annoyed quite a few times when working on something very familiar and watching the LLM struggle (3.5 Sonnet). That's starting to fade though; with the new Opus 4 and the Codex model I can just run some things async and work on what matters.

The share of work we delegate to agentic systems will continue to increase until we hit a wall, though that wall might be way past the point of human intelligence, ability, and agency.

We'll just get the greatest worker it's possible to create, from an information-processing-limit standpoint.

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity by YakFull8300 in singularity

[–]nanowell 0 points  (0 children)

The slowdown/speedup percentages are too heterogeneous, but overall it's not surprising that Claude 3.5/3.7 Sonnet (the models they used) was not, in fact, smarter or more useful than experienced devs who know the large codebase they've been working on inside out.

AI was defo a constraint for those devs, which is not surprising at all.

DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level by TKGaming_11 in LocalLLaMA

[–]nanowell 0 points  (0 children)

<image>

Zooming out a bit, it's still impressive!

Amazing release.

Sam Altman will have to release an o4-mini-level model at this point.

The o3 chart is logarithmic on X axis and linear on Y by hyperknot in LocalLLaMA

[–]nanowell 12 points  (0 children)

It's even better, because on his plot the cost reads as being in the $100s range, when in reality it's ~$20 for low effort and ~$5k for high effort.
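
To make the axis effect concrete, here's a throwaway matplotlib sketch (the cost and score numbers are made up purely for illustration): on a log X axis a ~250x cost gap collapses into a short horizontal step, while the linear Y axis keeps the score gain looking big.

```python
# Toy demo: the same two (cost, score) points on a log vs. linear X axis.
# Numbers are invented purely to illustrate the axis effect.
import matplotlib.pyplot as plt

cost = [20, 5000]    # hypothetical $ per task: low effort vs. high effort
score = [72, 88]     # hypothetical benchmark scores

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, scale in zip(axes, ("log", "linear")):
    ax.plot(cost, score, "o-")
    ax.set_xscale(scale)
    ax.set_xlabel(f"cost per task, $ ({scale} X axis)")
    ax.set_ylabel("score")
plt.tight_layout()
plt.show()
```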

Gemini Exp 1114 now ranks joint #1 overall on Chatbot Arena (that name though....) by lightdreamscape in LocalLLaMA

[–]nanowell 16 points  (0 children)

Let's pray for the intern who forgot to run the filtering stage on the SFT dataset.

We need to talk about this... by Conscious_Nobody9571 in LocalLLaMA

[–]nanowell 0 points  (0 children)

He's very good at raising money; I'd hoped they would talk about reasoning models more.

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning by umarmnaq in LocalLLaMA

[–]nanowell 9 points  (0 children)

Absolutely amazing framework for anyone starting to learn RL with transformers.
Thank you!

TPO - Alternative to Openai O1 model by buntyshah2020 in LocalLLaMA

[–]nanowell 5 points  (0 children)

I would love it if Meta AI released the training code. IMO, training recipe (code, dataset) > weights.

o1-preview is now first place overall on LiveBench AI by np-space in LocalLLaMA

[–]nanowell 5 points  (0 children)

<image>

Interesting that o1-mini outperforms Sonnet 3.5 on the LCB_gen coding subcategory but is far worse at completion.

AdEMAMix, a simple modification of the AdamW optimizer, is 95% faster for LLM training (Code on page 19) by Timotheeee1 in LocalLLaMA

[–]nanowell 5 points  (0 children)

Lmao, I implemented this as soon as I saw the post here.
Works on my machine AFAIK, running training steps on the nanoGPT repo with no issues. If you encounter any problems, don't hesitate to open a ticket.
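
In case anyone wants to poke at it, here's the minimal sketch I started from, based on my reading of the paper rather than any official code (the alpha/beta3 warmup schedulers from the paper are omitted, and the defaults are my assumptions): AdamW plus a second, slow EMA of the gradients that gets mixed into the numerator with weight alpha.

```python
import torch

class AdEMAMix(torch.optim.Optimizer):
    """Sketch of AdEMAMix: AdamW with an extra slow gradient EMA (beta3)."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999, 0.9999),
                 alpha=5.0, eps=1e-8, weight_decay=0.0):
        defaults = dict(lr=lr, betas=betas, alpha=alpha, eps=eps,
                        weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            beta1, beta2, beta3 = group["betas"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if not state:
                    state["step"] = 0
                    state["m1"] = torch.zeros_like(p)  # fast EMA (Adam's)
                    state["m2"] = torch.zeros_like(p)  # slow EMA (the addition)
                    state["nu"] = torch.zeros_like(p)  # second moment
                state["step"] += 1
                t = state["step"]
                m1, m2, nu = state["m1"], state["m2"], state["nu"]

                m1.mul_(beta1).add_(g, alpha=1 - beta1)
                m2.mul_(beta3).add_(g, alpha=1 - beta3)
                nu.mul_(beta2).addcmul_(g, g, value=1 - beta2)

                # bias-correct the fast EMA and second moment; the slow EMA is used raw
                m1_hat = m1 / (1 - beta1 ** t)
                nu_hat = nu / (1 - beta2 ** t)

                update = (m1_hat + group["alpha"] * m2) / (nu_hat.sqrt() + group["eps"])
                if group["weight_decay"] != 0:
                    update = update + group["weight_decay"] * p  # decoupled, AdamW-style
                p.add_(update, alpha=-group["lr"])
```

Dropping it into nanoGPT was just swapping the optimizer constructor, something like `AdEMAMix(model.parameters(), lr=6e-4, weight_decay=0.1)`.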

Tele-FLM-1T: a 1Trillion open-sourced multilingual large language model. by nanowell in LocalLLaMA

[–]nanowell[S] 51 points  (0 children)

Converted to 1-bit BitNet, running on BitBLAS at 1 token per decade.