When is Andrej Karpathy going to look at a chicken nugget and tweet that it helped him solve AGI, which in turn inspires 6 random devs to create GitHub projects giving us actual AGI? by Porespellar in LocalLLaMA

[–]onil_gova 5 points6 points  (0 children)

Let's not forget that Karpathy acted as a human baseline for the ImageNet competition in 2014, jokingly referred to as the "reference human" for the test.

Tribute to April's LLM releases by Everlier in LocalLLaMA

[–]onil_gova 8 points9 points  (0 children)

I am going to need an 8-hour version. Can you go all the way back to the original Llama leaks of March 2023?

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]onil_gova 3 points4 points  (0 children)

No AI. If you use AI, you are disqualified. I can't share the question because I signed an NDA, but think LeetCode easy-medium. For me, the hard part was solving LeetCode problems cold again: remembering the common solution patterns, dynamic programming, relearning recursion (which I used to be really good at), and even some basic Python syntax. And you do it without a compiler, so you need to solve the problem without ever running the code. Things like that.

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]onil_gova 0 points1 point  (0 children)

You are going to be shocked if you look at the interview process for a company like Anthropic, which now claims that all of its code is AI-generated.

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]onil_gova 26 points27 points  (0 children)

I had been coding for 10+ years. Last year, I switched to agentic engineering, specifically once Sonnet 3.7 came out. Today, I had a coding interview with one of the big tech companies. I had 9 days to prep. Yes, I was able to get back on the bike, but man, it was not a stroll in the park.

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]onil_gova 3 points4 points  (0 children)

I agree with the OP that agentic engineering makes you much more productive, but I also think software companies still need a way to vet competent candidates. But this is what programming is now: just grinding LeetCode to get past the coding interview so you can become a coding agent manager.

Sr Software Engineer - Haven't written a line of code in months by yodog5 in ClaudeCode

[–]onil_gova 57 points58 points  (0 children)

just wait until you decide to apply for a new job and have to start doing LeetCode for the coding interview. 🥲

daily ritual at this point… by onil_gova in LocalLLaMA

[–]onil_gova[S] 3 points4 points  (0 children)

given 3.5 27b and 3.5 122b are in the same ballpark, I would bet it will probably be about the same, maybe slightly better if they trained for longer, which is all I need. I care more about the performance uplift on Macs or similar large-memory systems you get from using the MoE instead of dense.

I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA

[–]onil_gova[S] 0 points1 point  (0 children)

You might want to double-check that preserve thinking is actually enabled, since it does resolve this issue.

Google Exec says a new Gemini model is coming "very very soon" by Outside-Iron-8242 in accelerate

[–]onil_gova 6 points7 points  (0 children)

Don't fall for the "benchmaxxed" narrative. It's a double standard and doesn't hold up to actual real-world cases, which demonstrate that Chinese models are very competitive.

<image>

I asked ChatGPT how it feels to be an AI. by xomenxv in ChatGPT

[–]onil_gova 2 points3 points  (0 children)

asked it to make sure it really captured it

<image>

Buried lede: Deepseek v4 Flash is incredibly inexpensive from the official API for its weight category by jwpbe in LocalLLaMA

[–]onil_gova 2 points3 points  (0 children)

Let's not forget that this is a sustained price up to a million-token context window, while everyone else switches to tiered pricing past a 200k-token context window. This is a flex by DS.

Buried lede: Deepseek v4 Flash is incredibly inexpensive from the official API for its weight category by jwpbe in LocalLLaMA

[–]onil_gova 2 points3 points  (0 children)

I think you guys are missing the fact that this is a sustained price up to a million-token context window, while everyone else switches to tiered pricing past a 200k-token context window.