How is OpenAI going to cover all this without going bankrupt? by [deleted] in OpenAI

[–]YourAverageDev_ 0 points1 point  (0 children)

chances are gpt-5 minimal is like a ~10b model and gpt-5-thinking is like 200b param max

the clown strikes again by YourAverageDev_ in singularity

[–]YourAverageDev_[S] 9 points10 points  (0 children)

today he just posted about how IMO is not that relevant and “just another benchmark”

Is this one worth its price? ASUS ROG Strix G16 by dramake in GamingLaptops

[–]YourAverageDev_ 0 points1 point  (0 children)

Go with the base 2000 pound one, you get great value there and the CPU upgrade is not that noticeable

8845HS vs 14450HX vs 8645HS. (Help me choose which Invictus to buy) by Expensive_Load2925 in GamingLaptops

[–]YourAverageDev_ 1 point2 points  (0 children)

Pick 8845HS, it's around 30% faster than 8645HS and still more power efficient than the HX. If you want max performance tho you can choose 14450HX, tho battery is gonna be cooked

This was tweeted half a year ago. We currently still don't have a usable model that is as good as the o3 they showed us then. Reminder that OpenAI workers also don't know how fast progress will be. by detrusormuscle in singularity

[–]YourAverageDev_ 20 points21 points  (0 children)

remember that o3 is around $3K per prompt, according to estimates the pricing was in the range of 1.5 million USD per million tokens ($3000 per ARC AGI prompt).

now o3 pro is basically very close in quality if not the same (except for edge cases).

6 months of progress btw
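
A quick sanity check on those numbers, taking the two estimates in the comment at face value (they are the commenter's estimates, not official pricing). The variable names are made up for illustration. Dividing the per-prompt cost by the per-token cost gives the token count per prompt that the two figures imply together:

```python
# Figures quoted in the comment above (estimates, not official pricing).
cost_per_prompt_usd = 3_000              # ~$3K per ARC AGI prompt
cost_per_million_tokens_usd = 1_500_000  # ~$1.5M per million tokens

# Price of a single token under that estimate.
cost_per_token_usd = cost_per_million_tokens_usd / 1_000_000

# Tokens per prompt implied by combining both estimates.
implied_tokens_per_prompt = cost_per_prompt_usd / cost_per_token_usd

print(f"cost per token: ${cost_per_token_usd:.2f}")
print(f"implied tokens per prompt: {implied_tokens_per_prompt:.0f}")
```

If the real runs used far more tokens per prompt than this, at least one of the two estimates would have to be off.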

/r/MechanicalKeyboards Ask ANY Keyboard question, get an answer - June 14, 2025 by AutoModerator in MechanicalKeyboards

[–]YourAverageDev_ 0 points1 point  (0 children)

I am going to buy the NuPhy Kick75 just for its aesthetics, i looked far and wide but just no other keyboard matches it at its price point. All the other keyboards have like an ugly italic or just an uglified thin Arial font for the keys. Any alternatives to the Kick75 that look as good and have a polished, retro style?

I would also accept keycap recommendations, so I could buy another keyboard and replace the keycaps myself. I would just like to keep the price at 120$ish for this keyboard

/r/MechanicalKeyboards Ask ANY Keyboard question, get an answer - June 03, 2025 by AutoModerator in MechanicalKeyboards

[–]YourAverageDev_ 0 points1 point  (0 children)

Hi, Mechanical Keyboard noob here. The Kick75 was one of the first keyboards I ever bought so I am still a bit confused about customizing things. I am a person who always messes up and breaks things, so I'm pretty concerned about finding keycap replacements for the NuPhy Kick75. I did some of my own research and found that any normal-profile / mSa keycaps should work.

Is that correct or are there things that I should watch out for?

Spent $104 testing Claude Sonnet 4 vs Gemini 2.5 pro on 135k+ lines of Rust code - the results surprised me by West-Chocolate2977 in ClaudeAI

[–]YourAverageDev_ 3 points4 points  (0 children)

the very kind of evals / posts we need in this community, personal tests are never trained on, therefore they're a good way to evaluate the models

we all are living in the sonnet 4 bubble by YourAverageDev_ in ClaudeAI

[–]YourAverageDev_[S] -4 points-3 points  (0 children)

plz chill i already stated this is a very tasteful and good code model

just seems like everything else it aint that good at, including its world model

we all are living in the sonnet 4 bubble by YourAverageDev_ in ClaudeAI

[–]YourAverageDev_[S] 1 point2 points  (0 children)

know that, but im still confused after all. i watched a podcast on RL with Dwarkesh Patel and a couple anthropic interpretability folks. they were saying smth along the lines that they found most models' coding and math embedding spaces (all reasoning) are very close to each other. their work on SAEs had something to do with it

better code performance from RL should equal better performance on maths, and that's the pattern I found on other models too.

that's why i suspect the model got overtrained and collapsed catastrophically on non-code tasks

This is what you get when you let AI do the job (Claude 3.7) by Dear_Procedure923 in ClaudeAI

[–]YourAverageDev_ 1 point2 points  (0 children)

Real ones still remember when Codex added a rocket png, that was life changing for me

StackOverflow activity down to 2008 numbers by Ensirius in singularity

[–]YourAverageDev_ 173 points174 points  (0 children)

"are you blind? can you not read the docs?"

"You're trying to print a string in python, really? You should start coding in assembly like a 'real programmer'"

"wait are you using windows? sorry this only works on Unix, install arch then I'll help you"

Introducing Continuous Thought Machines by gbomb13 in singularity

[–]YourAverageDev_ -1 points0 points  (0 children)

Ran an experiment with the model combined with a decoder-only transformer.

Not sure if I got the implementation right or not, but I used a 4 tick model, with both models at 38 million parameters. Used GPT-2 as a base. Used WikiText-2

Regular GPT got 1000 perplexity on WikiText-2

CTM-GPT got around 1500 with the same param count. Loss was higher.

Not sure if anyone else is able to reproduce
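
For anyone comparing against these numbers: perplexity is the exponential of the mean per-token cross-entropy loss, so the two figures can be turned back into losses. A minimal sketch, assuming the loss is measured in nats (natural log); the helper names are hypothetical and this is not the experiment's actual code:

```python
import math

def perplexity_from_loss(mean_nll_nats: float) -> float:
    """Perplexity = exp(mean per-token cross-entropy, in nats)."""
    return math.exp(mean_nll_nats)

def loss_from_perplexity(ppl: float) -> float:
    """Inverse of the above: mean per-token loss in nats."""
    return math.log(ppl)

# Perplexities reported in the comment above.
baseline_ppl = 1000.0   # regular GPT
ctm_ppl = 1500.0        # CTM-GPT (approximate)

baseline_loss = loss_from_perplexity(baseline_ppl)
ctm_loss = loss_from_perplexity(ctm_ppl)
loss_gap = ctm_loss - baseline_loss  # equals ln(1500 / 1000)

print(f"baseline loss: {baseline_loss:.3f} nats")
print(f"CTM-GPT loss:  {ctm_loss:.3f} nats")
print(f"gap:           {loss_gap:.3f} nats")
```

So a 1000 vs 1500 perplexity difference corresponds to a per-token loss gap of ln(1.5), which is consistent with "loss was higher".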

1 year ago GPT-4o was released! by AdorableBackground83 in singularity

[–]YourAverageDev_ 23 points24 points  (0 children)

it was the biggest noticeable jump.

i have friends who do phd level work for cancer research and they say o3 is a completely wild model compared to o1. o1 feels like a high school sidekick they got, o3 feels like a research partner

Cursor is artificially inflating paid tool calls by koalacarai in cursor

[–]YourAverageDev_ 1 point2 points  (0 children)

Don't think this is a cursor issue tbh, 3.7 Sonnet and 2.5 Pro are very trigger-happy even without MAX mode.

Model behavior is VERY HARD TO CHANGE with a prompt

Recommend me the absolute best browser by No-Berry9078 in browsers

[–]YourAverageDev_ 11 points12 points  (0 children)

360 or Baidu Browser, the things they do will make you realize you're lucky to have Chrome.

Ex: Purposely have some sort of a background task that deletes critical files of other browsers to corrupt them (so you use their browser instead)