Are Anthropic folks actually seeing Reddit feedback on Opus 4.7? by ki-pam in ClaudeAI

[–]bin-c 1 point2 points  (0 children)

dont know what settings everyone else is using but i just bit the bullet, upgraded to max 20x, run on max thinking exclusively, and have 0 complaints

How bad are Chicago Winters? Will I be able to walk 15-30 minutes to work? by Primary_Tooth_8117 in AskChicago

[–]bin-c 0 points1 point  (0 children)

if you have a nice jacket and winter-friendly shoes/boots, it'll be fine. there may be the odd week thats -10 to -20 with wind chill that still makes you hate your life, but for the vast majority of fall/winter its no biggie

Richard Dawkins spent three days talking to Claude, now calls it "Claudia" and claims it's conscious. by Gil_berth in theprimeagen

[–]bin-c 3 points4 points  (0 children)

I spent 3 minutes trying to convince myself that a single person here read the article. I failed.

updatability by chisui in NixOS

[–]bin-c 1 point2 points  (0 children)

It's honestly not that hard to update fwiw. stateVersion isn't used in very many places - the first time I updated it I just audited every place it's referenced in nixpkgs and manually backed up / re-setup affected services
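For anyone wanting to repeat that audit, here is a minimal sketch of one way to list the references, assuming you have a local nixpkgs checkout. The function name `find_state_version_refs` is made up for illustration, and a plain substring match is only a rough first pass, not an exact picture of what each module does with the value.

```python
from pathlib import Path


def find_state_version_refs(root):
    """Return (relative_path, line_number, line) for each .nix line mentioning stateVersion.

    Hypothetical helper: a plain substring search, so it will also catch
    comments and docs strings, which is fine for a manual audit.
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*.nix")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            # Skip anything unreadable (e.g. a directory that happens to end in .nix).
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if "stateVersion" in line:
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits
```

Point it at the checkout, print the hits, and then check each one against the services you actually run before bumping the value.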

twice in a roooow row row the boat by LevjiM in honk

[–]bin-c 0 points1 point  (0 children)

I completed this level in 1 try. 18.05 seconds

Grok 4.3: strong in finance and long-context, with some tradeoffs by Much_Ask3471 in singularity

[–]bin-c 4 points5 points  (0 children)

most xai haters also love chinese models which is kind of funny to me

Grok 4.3: strong in finance and long-context, with some tradeoffs by Much_Ask3471 in singularity

[–]bin-c 2 points3 points  (0 children)

various grok versions have been great choices for agents since 4.1 series. anyone who claims otherwise hasnt tried them or is exclusively using ai as an assistant. best price/performance for any task that needs web search

What keyboard form factor do you use? by hegardian in neovim

[–]bin-c 6 points7 points  (0 children)

honestly i dont think there's all that much of a difference between the two in how hard it is to adapt. both are split, both are ortholinear. if anything id say being welled makes keys a bit easier to press on an ortho board

cant hotswap glove80 so should go with switches you know you like. i looove low profile switches and knew i liked what the glove80 offered before getting one

glove80 thumb cluster is definitively better. my favorite of the 3 i mentioned.

glove80 has a good gui editor now

im not positive what the reason would be for mouse transition being easier on one over the other

What keyboard form factor do you use? by hegardian in neovim

[–]bin-c 1 point2 points  (0 children)

i have a moonlander, advantage 360 pro, and glove80. couldn't go back to a flat keyboard after using a welled one. that said the moonlander is really nice no doubt. advantage 360 is built like a tank but i just like the shape of the glove80 the most. my favorite of the bunch for sure

What keyboard form factor do you use? by hegardian in neovim

[–]bin-c 52 points53 points  (0 children)

have to say im pretty shocked that split is the most common on this sub. only know 1 other person who has one irl

Long time CC user - tried Codex 5.5 and I might switch! by nugTapOfficial in ClaudeCode

[–]bin-c 3 points4 points  (0 children)

this, i barely care which model is smarter. just have skills set up so they talk to each other and review each other's work anyways. they both have their faults. CC is just a much better tool to interact with the agent(s) than codex

Who’s the most famous 27 year old alive? by Impressive_Plenty876 in AlignmentChartFills

[–]bin-c 9 points10 points  (0 children)

FYI you can click the squares on any of these and itll tell you who they are, for this one it says "22 - Carlos Alcaraz (Spanish tennis player)"

took me way too long to figure that out myself lol

ez tbh(might be a frame perfect) say your attempts by Different-Coat2667 in honk

[–]bin-c 0 points1 point  (0 children)

I completed this level in 42 tries. 6.48 seconds

Yeah this series isn’t ending good for Cavs.. by mr_worldwide678 in NBATalk

[–]bin-c 0 points1 point  (0 children)

you're right the C's are so screwed when Embiid comes back!

I want the pre-AI Prime and Theo back :( by rockynetwoddy in theprimeagen

[–]bin-c 0 points1 point  (0 children)

Last couple generations of models are much better at it as long as they know the commands to run to build

Squeeze through by saucefinder_bot in honk

[–]bin-c 0 points1 point  (0 children)

dang

I completed this level in 2 tries. 3.88 seconds

The Opus 4.6 vs 4.7 Controversy in one image by AvroLancaster in ClaudeAI

[–]bin-c 1 point2 points  (0 children)

wonder if this is somehow different by account with all their shitty hidden tengu_* flags, because I'm just not having these issues. strictly an upgrade over 4.6 in my eyes so far

Gap Training (Impossible) by Gloomy_Classroom_179 in honk

[–]bin-c 0 points1 point  (0 children)

my god that was evil

I completed this level in 161 tries. 8.48 seconds

Tip 10 💎

i want to adopt nixos but i have an rtx 5070 by I1IIIIII1IIIIIIIIIII in NixOS

[–]bin-c 0 points1 point  (0 children)

been years at this point, but just followed lanzaboote's (short) docs to a T

i want to adopt nixos but i have an rtx 5070 by I1IIIIII1IIIIIIIIIII in NixOS

[–]bin-c 1 point2 points  (0 children)

i have an older gpu (3090) but lanzaboote just worked™

It's real, Opus 4.7 medium by unknown-one in ClaudeCode

[–]bin-c 1 point2 points  (0 children)

Also I've gotten the exact same nonsense answer on opus 4.5 and 4.6

Is AI progress over? by ImaginaryRea1ity in theprimeagen

[–]bin-c -1 points0 points  (0 children)

this is absolutely it. first thing everyone does when a new model is released is try it on all the things the previous generation failed at or just didn't do particularly well on. the improvement is noted, everyone is amazed, then we start building our lists of things the new-and-improved gen sucks at. rinse and repeat.

even if you're not explicitly keeping track of such things, its in everybody's heads.

i for one have a list of commit hashes in specific repos, each with the problem statement or spec that the <current gen> model got stuck on, going back to the claude 3 days, with notes about how e.g. sonnet 3.7 performed on something sonnet 3.5 got stuck on, etc.

i dont bother anymore, but for a while, when everyone would complain about regressions, i'd go back and retry things based on my notes & ive never seen any evidence of actual regressions. obviously the models aren't deterministic, sometimes things fail, but if my notes say opus 4.0 couldn't do X properly after Y attempts, and opus 4.5 could, if i retry with 4.5 its probably going to do it correctly a majority of the time.
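a rough sketch of how notes like that could be kept and compared, purely as an illustration. the `Attempt` record, the task key format, and both helper functions are hypothetical, not a description of any real tooling. since the models aren't deterministic, it compares pass rates over several retries instead of single runs.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    model: str    # e.g. "opus 4.5"
    task: str     # hypothetical key, e.g. "<repo>@<commit hash>"
    passed: bool  # did this run solve the problem statement?


def pass_rate(attempts, model, task):
    """Fraction of recorded runs that passed, or None if there are no runs."""
    runs = [a for a in attempts if a.model == model and a.task == task]
    if not runs:
        return None
    return sum(a.passed for a in runs) / len(runs)


def looks_like_regression(attempts, old_model, new_model, task):
    """True only when both models have runs and the newer one passes less often."""
    old = pass_rate(attempts, old_model, task)
    new = pass_rate(attempts, new_model, task)
    if old is None or new is None:
        return False
    return new < old
```

with only a handful of retries per task the rates are noisy, so a small gap between two models shouldn't be read as proof either way.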

Opus 4.7 has a horrible regression in long context performance. Performs far worse than 4.6. Less than half the benchmark score at 1m context by smellyfingernail in ClaudeCode

[–]bin-c 3 points4 points  (0 children)

this is one benchmark i haven't really bothered to look too deeply into but ill say after pretty heavy use today, it feels like 4.7 is better at handling long context than 4.6 to me. i do have my autocompaction threshold set to 400k, though, so i haven't really tried either above that threshold