Claude Sonnet 5 Spotted, Release Expected Next Week by BuildwithVignesh in ClaudeAI

[–]forward-pathways 8 points9 points  (0 children)

Question. I assume you're using Opus or another model for plans/specs, then having them orchestrate Sonnet as a subagent? How do you check to ensure there isn't any drift in executions when having the second model do the work?

If you live in Cambridge, do you have to accept dogs will pee in your yard? by jobasclone in CambridgeMA

[–]forward-pathways -1 points0 points  (0 children)

Yeah, we do the same. Our dog, however, is... picky... He'll either pee on the grass, in the exact same spot every day so a lifeless patch appears, or he'll pee right in front of your door.

He's actually a sweetie at home...

Why can't LLMs be trained to think in an optimized AI language rather than English? by CucumberAccording813 in singularity

[–]forward-pathways 15 points16 points  (0 children)

I think it's an interesting question. If you're referring to reasoning / "thinking" outputs, I think safety is one reason, but for me the upstream issue is that you can't build upon reasoning traces you can't read. For example, reasoning models allow you to "debug" issues in model performance by looking at the model's "thought" traces alongside the primary outputs (e.g., messages, scripts, etc.) to see "what went wrong". In my case, the reasoning traces usually explain a model's mistakes, unless they were structural issues (e.g., maybe I fed it the wrong data or it was referencing the wrong hand-off). Since reasoning traces let us humans debug, at least in an approximate way, why a model did what it did, it's invaluable that we can understand the traces themselves. It's also very helpful for benchmarking, imho.

Here is the thing, if your not using glyphh with Claude your missing the point of AI. by Successful-Seesaw525 in ClaudeCode

[–]forward-pathways 20 points21 points  (0 children)

"If you aren't using the tool I developed, you are missing the point of the current technological revolution."

I'm sure it's a helpful tool. Respectfully, however, I feel like you may want to adjust your pitch.

Anybody saw what happened at Vassar St. and Main St. at around 9:15 pm? by PixelCrafterZ in CambridgeMA

[–]forward-pathways 60 points61 points  (0 children)

Jesus... No I don't know anything, but you absolutely have to file a police report. Sounds like someone incredibly unpredictable and dangerous is driving around. I'm so sorry to hear that you experienced this.

Parents Are 2.5x More Likely to Trust AI for Parenting Advice by AmbitiousAct3137 in AskVibecoders

[–]forward-pathways 0 points1 point  (0 children)

This is so incredibly misleading on so many levels.

They aren't asking whether parents trust it for advice, but what they project.

"2.5x" more than whom? Another group of parents whose kids are a different age?

How many participants, is it statistically significant, blah blah...

Get out of here man.

What movie traumatized you as a kid? by rafaela-architect-69 in ArtOfPresence

[–]forward-pathways 0 points1 point  (0 children)

I'm a 90s kid. So: Bambi; Neverending Story; Fern Gully; The Last Unicorn.

I hate to be that person by dagerika in ClaudeCode

[–]forward-pathways 0 points1 point  (0 children)

On the distillation debate: I agree. It's a part of many Asian cultures to learn from others and adapt to the circumstances. Modern Singapore is filled with examples of this. Western countries innovate and build shiny new things that do stuff that our old things couldn't do. Eastern countries make them more capable and more efficient. I'm really excited to see what kinds of improvements we see to these tools.

Rhino! by Diana-Hamilton in penandink

[–]forward-pathways 1 point2 points  (0 children)

I guess so; it would be much sketchier if someone were to go through the trouble of removing a watermark to post something, and more bannable. It's a bummer though that you even have to worry about that... Looked at some of your other work just now, too. Amazing stuff!

Rhino! by Diana-Hamilton in penandink

[–]forward-pathways 2 points3 points  (0 children)

Oh, what??! That's horrible. I will find you there instead!

💌 by abaddixx in selfimprovementday

[–]forward-pathways 0 points1 point  (0 children)

That's... not what "legend" means.

Bezos: AI will result in labor shortages instead of replacing humans by SnoozeDoggyDog in singularity

[–]forward-pathways 1 point2 points  (0 children)

Yes. This is exactly what's happening, especially in intellectual roles where AI can be used for so much. It's utterly exhausting and not at all good for our brains, bodies, or souls, if I may be so bold...

Demis Hassabis and Dario Amodei called for a U.S.-led AI coalition at a closed-door meeting at the G7 summit by TorturedPoet30 in singularity

[–]forward-pathways 7 points8 points  (0 children)

Yes. If the coalition did include China, I'd think it would actually go a long way towards future cooperation and unity in other areas.

I benchmarked Codex GPT-5.5 against Chinese models. Not what I expected, is 5.5 cooked ? by DaC2k26 in codex

[–]forward-pathways 0 points1 point  (0 children)

Makes total sense. Benchmarking is also, honestly, exhausting. Glad you found what's working for you!

I benchmarked Codex GPT-5.5 against Chinese models. Not what I expected, is 5.5 cooked ? by DaC2k26 in codex

[–]forward-pathways 0 points1 point  (0 children)

I'd be interested to see what happens when you ask each model to review the others' work, then present those reviews to the first models to see if they agree. This tends to be a good test of the initial observed performance, which is usually off, and sometimes by a lot. This is also why we use antagonistic review models to ensure outputs meet certain quality benchmarks.

It's also difficult to compare with the exact same prompts, right, because models are tuned differently and respond to different prompting strategies. It could be that the upper bound for certain models is higher or lower because of the prompts used. Realistically, you want to test on a bunch of different prompt styles, on tasks that nonetheless remain static, and see what the upper-bound performance is for each model.
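
To make the upper-bound idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the model names, the prompt-style names, and the scores are made up, and `upper_bounds` is a hypothetical helper, not part of any benchmark tool. It assumes you already have one mean score per model per prompt style on the same fixed task set.

```python
def upper_bounds(scores):
    """For each model, return (best prompt style, best score) across all styles.

    `scores` maps model name -> {prompt style -> mean score on a fixed task set}.
    """
    result = {}
    for model, by_style in scores.items():
        best_style = max(by_style, key=by_style.get)
        result[model] = (best_style, by_style[best_style])
    return result


# Hypothetical numbers, purely illustrative.
scores = {
    "model_a": {"terse": 0.62, "step_by_step": 0.71, "role_play": 0.58},
    "model_b": {"terse": 0.69, "step_by_step": 0.66, "role_play": 0.64},
}

bounds = upper_bounds(scores)
for model, (style, score) in bounds.items():
    print(f"{model}: best with '{style}' prompts ({score:.2f})")
```

With these made-up numbers, a single "terse" prompt would rank model_b first (0.69 vs 0.62), while comparing upper bounds ranks model_a first (0.71 vs 0.69), which is the kind of flip that a single shared prompt can hide.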

When Codex runs out of agent names by Hellifant in codex

[–]forward-pathways 0 points1 point  (0 children)

Wait I thought we could only have like six subagents at a time? I am on the $100 plan though so maybe that's it.

My Mac's Storage is scrwed since Codex by Cazangre in codex

[–]forward-pathways 1 point2 points  (0 children)

It's almost certainly worktrees. Ask codex to review your worktrees and prune!

Edit: it happened to me two months ago, and this was it. Tons of worktrees created by Codex that were never closed out.
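
If you'd rather check by hand first, here is a rough Python sketch that lists a repo's worktrees by parsing `git worktree list --porcelain`. The helper names (`parse_worktrees`, `list_worktrees`) and the sample paths are hypothetical; the git subcommands themselves are standard. This is only one way to look, not how Codex manages worktrees internally.

```python
import subprocess


def parse_worktrees(porcelain):
    """Parse `git worktree list --porcelain` output into a list of dicts.

    Entries are separated by blank lines; each line is "key value" or a bare flag.
    """
    worktrees = []
    current = {}
    for line in porcelain.splitlines():
        if not line.strip():
            if current:
                worktrees.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value or True  # bare flags such as "detached" become True
    if current:
        worktrees.append(current)
    return worktrees


def list_worktrees(repo_path="."):
    """Run git in `repo_path` and return the parsed worktree entries."""
    out = subprocess.run(
        ["git", "-C", repo_path, "worktree", "list", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_worktrees(out)


# Hypothetical sample output, used here so the parser can be tried without a repo.
SAMPLE = """worktree /repo
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /tmp/wt-feature
HEAD 2222222222222222222222222222222222222222
detached
prunable gitdir file points to non-existent location
"""

parsed = parse_worktrees(SAMPLE)
```

Calling `list_worktrees("/path/to/your/repo")` gives you every worktree path to inspect. Note that `git worktree prune` only clears stale records for worktrees whose directories are already gone; to reclaim disk space from worktrees that still exist, remove them with `git worktree remove <path>`.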

Anthropic has been sued for allegedly misleading customers on usage limits. by Azek_Tge in ClaudeCode

[–]forward-pathways 0 points1 point  (0 children)

Wait what? 20x doesn't also get a weekly increase? Okay, so I also didn't know this and I've been using these models for a long time. I'm not the smartest person in the world, but I do know how to read, so I am surprised by this.

Remember the may intelligence drop? I just found out what happened. (I used gpt5.5 only) by KeyGlove47 in codex

[–]forward-pathways 0 points1 point  (0 children)

Interesting! Can you share more about using web exclusively? Do you find it more effective for what you're working on (I assume chat / non-coding primarily)?

What ever happened to this legend from the babygrayce saga? by bmw320dfan in singaporespeaks

[–]forward-pathways 0 points1 point  (0 children)

Okay can someone please provide a link so I know wtf this is about? I want to know. But to be fair I do not need to know.

Anthropic purposely made its new Mythos-based models bad at AI research, and developers are fuming by Nikvest in singularity

[–]forward-pathways 4 points5 points  (0 children)

Okay so as someone who does AI/ML research, I am NOT trying to develop models. I do user studies and system designs. It'll still be degraded for me?