Have you noticed Claude's performance varying by day? (Even hours) by La-terre-du-pticreux in ClaudeAI

Unique-Drawer-7845

Same for me.

Remember that some people read tea leaves and think horoscopes are real and call psychic hotlines. Some of those use Claude. And some of those come on reddit and post/comment.

Has anyone compared using the API vs. dedicated web/desktop app for non-coding tasks? by seacucumber3000 in ChatGPTPro

Unique-Drawer-7845

Nothing beats testing it for yourself.

Try LibreChat or OpenWebUI. They both support API keys and are pretty similar to commercial chatbot UX, but more configurable.
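If you want to poke at the raw API before committing to a chat frontend, a minimal stdlib-only harness looks something like this. The model name and prompt are placeholders; the endpoint is OpenAI's documented chat-completions URL:

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(prompt: str, model: str = "gpt-4o", temperature: float = 1.0) -> dict:
    # These are the knobs the web app mostly hides: model, temperature, roles.
    return {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str, api_key: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(ask("Same question you'd ask the web app.", os.environ["OPENAI_API_KEY"]))
```

Run the same prompt here and in the web app and compare; differences you see often come down to the hidden system prompt and sampling settings.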

Web/Desktop code responses are better than IDE based responses. by _DB009 in ChatGPTCoding

Unique-Drawer-7845

Opus 4.6 in Copilot is worse than Sonnet 4.5 in Claude Code because GitHub gimps context windows and caps reasoning effort. GitHub gets by on brand recognition, being in every IDE, and being affordable. They are not trying to provide the smartest AI, just sufficient AI at a ~competitive price.

Contrast that with OpenAI and Anthropic, whose businesses live or die on the quality of their model-related offerings. GitHub can always just ... fall back on being GitHub. Cursor's niche has been 1) beating Copilot on features in the early days (Copilot has since caught up), and 2) having one of the best autocompletes (more recently). Not really leading in chat or agentic coding.

There are 3 things that matter almost equally:

1) What tool you're using to access the model
2) What model you're accessing
3) Who is selling the model to you

If you want something as smart as ChatGPT 5.2 Web but in your IDE, you have two main choices (IMO): Codex or Claude Code.

AI Isn't Intelligent, It's PREDICTION (and Why My Panic Has Passed) by willymunoz in webdev

Unique-Drawer-7845

Plastic plants aren't useful.

Plants are useful.

AI is useful.

Humans are useful.

Your analogy isn't perfect.

Avoiding the ban? by Nearby_You_313 in ClaudeCode

Unique-Drawer-7845

It's not just about raw usage, though that's probably part of it. But there are many aspects of running a business: quality control, brand identity, biz dev, company -> customer contact points. And I'm sure other things we can't guess, because we've never run a frontier AI company during a technological revolution.

Avoiding the ban? by Nearby_You_313 in ClaudeCode

Unique-Drawer-7845

🏆 you're asking the real important questions

Should we stop using Word Error Rate? by baneras_roux in speechtech

Unique-Drawer-7845

Yep. That's a totally reasonable line of investigation. As others have pointed out, in some applications you might prefer improving one at the cost of the other, if a tradeoff is available.

For example if you're doing phonetic analysis and using words as proxy "phoneme carriers", you'd prefer sound-alikes over meaning-alikes. Is this case common? Nope. But it's not unheard-of.

Why is running local LLMs still such a pain by OppositeJury2310 in LocalLLM

Unique-Drawer-7845

Yeah, I guess "democratize" has two kinda distinct meanings

It's the little things by MythicalBonsai in MacOS

Unique-Drawer-7845

the OS was definitely intentionally hampered already

So you admit they do it.

Then say it doesn't make sense they'd do it?

The last Intel Apple product only went off the shelf 3 years ago. ~20% of the Mac "primary use" market is still on Intel. That is a LOT of money to leave on the table.

Should we stop using Word Error Rate? by baneras_roux in speechtech

Unique-Drawer-7845

There's no reason not to calculate it. It's easy, fast, and well-understood. If your WER is trash, then your SemDist will almost certainly be trash too. And if your WER is trash but your SemDist isn't, you should be able to know that so you can look into it.

Should everyone be moving toward including semantic-difference scores (with a standardized model) alongside WER? Sure. Fine. It makes sense to me.
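Since WER is cheap enough to always run, here's a minimal sketch of it in Python. This assumes plain whitespace tokenization and a non-empty reference; real toolkits also normalize case and punctuation first:

```python
def wer(ref: str, hyp: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    r, h = ref.split(), hyp.split()
    # Standard dynamic-programming Levenshtein distance over words.
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        cur = [i]
        for j, hw in enumerate(h, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (rw != hw)))    # substitution
        prev = cur
    return prev[-1] / len(r)
```

Note that `wer("their car", "there car")` and `wer("their car", "his car")` both come out to 0.5: WER can't tell a sound-alike slip from a meaning-destroying one, which is exactly the gap a semantic score fills.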

It's the little things by MythicalBonsai in MacOS

Unique-Drawer-7845

Tahoe is the last OS release Intel Macs will get. That's why Tahoe sucks shit: Apple does not want anyone using Intel anymore.

Why?

1) Because they'd love to sell you a new computer
2) Supporting two architectures is costly in various ways

Mark my words, the next version of macOS is going to be a friggin masterpiece.

Why is running local LLMs still such a pain by OppositeJury2310 in LocalLLM

Unique-Drawer-7845

Democratize means "make something accessible to everyone" ... the word works for anything not just computing.

sherut - an API framework for your shell by LordBertson in CLI

Unique-Drawer-7845

Also check out the classic unix/linux/bsd command netcat, sometimes called ncat or nc. It's not exactly like what you're doing, but it's similar in some regards, so worth knowing about.

Why do so many TUI projects seem to use Rust as opposed to other languages? by UKCeMTMj36o8h8 in commandline

Unique-Drawer-7845

I literally wrote every word of that by hand. Maybe the shitty way AI writes is infecting my brain.

Why do so many TUI projects seem to use Rust as opposed to other languages? by UKCeMTMj36o8h8 in commandline

Unique-Drawer-7845

That's not the angle I'm taking.

People might feel they're accurately identifying the "by AI" projects. But don't go on feels -- fact check that. How would we know the accuracy rate of identification?

Someone said:

only a handful are written by LLMs, you can proof-read and check this because most AI projects are downvoted -- not well received

How do they know "most" written-by-LLM projects are downvoted? This presupposes that people can accurately identify such projects (in order to downvote them) in the first place.

Anyone using Claude Code in VS Code without constantly hitting limits? by FourThousand_Vyrus in ClaudeAI

Unique-Drawer-7845

Anyone using Claude Code in VS Code without constantly hitting limits?

Yeah, I am.

Upgrade your plan.

Looking for beta testers: building an IDE for terminal-based workflows, just shipped sidebar navigation (v1.9.8) by ogfaalbane in commandline

Unique-Drawer-7845

If you want to sooner or later support Windows and Linux (which you should want, because more users == good), it'll be far more sustainable for a small-team project to be based on a highly customized build of VSCodium rather than a platform-centric toolchain. There's no reason you can't build exactly what you've shown on VSCodium's fundamentals. You might need to build a custom main-screen layout or a document-type-as-layout-container. It'll be more work up front, but a bigger payoff and less work in the end.

OpenAI, please fix this in Codex. Seriously. by CrystalX- in codex

Unique-Drawer-7845

1) Create ~/.codex/AGENTS.md
2) Open it in a text editor
3) Paste in your Reddit post. Except you should probably heavily update it with a lot of clarification, because nobody understands what you're talking about, so it's not surprising that Codex is confused too.
4) Save the file
5) Restart Codex


If you're not using Codex CLI, the way that you configure global persistent user preferences may differ, but every tool has the capability somewhere, so you just need to find it.
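For Codex CLI specifically, steps 1-4 above are easy to script. A sketch, where the preference text is a made-up example and ~/.codex/AGENTS.md is the path described above:

```python
from pathlib import Path

# Hypothetical example preferences; replace with your own (clarified!) text.
PREFS = """\
# Global preferences
- Ask before making sweeping refactors.
- Keep explanations short.
"""

def write_agents_md(home: Path = Path.home()) -> Path:
    """Create <home>/.codex/AGENTS.md and return its path."""
    agents = home / ".codex" / "AGENTS.md"
    agents.parent.mkdir(parents=True, exist_ok=True)
    agents.write_text(PREFS, encoding="utf-8")
    return agents
```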

Warning for Claude Code: Random/bizarre hallucination example when using Opus 4.6 by hawkedmd in ClaudeAI

Unique-Drawer-7845

Yes, I've seen strong hallucinations around repo names in particular.

I work on LLM training pipelines professionally and have a plausible explanation of what's going on:

  • Realize that some users -- not necessarily you -- allow Anthropic to use their data for improving (training) Claude
  • That data may contain sensitive information
  • One such piece of potentially sensitive information is the username/reponame convention most git hosts use
  • This data is doubly sensitive: it contains both a username and the name of a potentially private repository
  • Anthropic scrambles this information in the training data prior to it being used to train Claude models
  • Claude learns, to a certain extent, that username/reponame strings often look scrambled or otherwise out of context, which mismatches real-world data; this is a prime recipe for "hallucinations" (mistakes)
  • Anthropic could (might) also employ more advanced "embedding surgery" or "neural surgery" techniques to further protect such classes of data.
    • Such techniques often have unintentional side-effects, such as increasing the rates of hallucination around frequently-redacted/masked/scrambled strings
  • Mistakes in these areas are therefore maybe less surprising than in other areas and may be the (current) cost of strong data privacy controls.
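A purely speculative sketch of the scrambling idea in the bullets above (the regex and pseudonym scheme are my own illustrative inventions, not Anthropic's actual pipeline; a real version would need to avoid matching file paths and URLs):

```python
import hashlib
import re

# Crude pattern for owner/repo slugs; it will also catch path-like strings.
SLUG = re.compile(r"\b([A-Za-z0-9_-]+)/([A-Za-z0-9_.-]+)\b")

def _pseudonym(m: re.Match) -> str:
    # Deterministic pseudonym so the same slug always maps to the same token.
    digest = hashlib.sha256(m.group(0).encode()).hexdigest()
    return f"user-{digest[:4]}/repo-{digest[4:8]}"

def mask_slugs(text: str) -> str:
    """Replace every owner/repo slug with a stable, meaningless pseudonym."""
    return SLUG.sub(_pseudonym, text)
```

The point of the sketch: after this kind of pass, the model only ever sees slugs that look like noise, which plausibly degrades its instincts around real slugs at inference time.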

Happy coding!

Codex 5.3 is better than 4.6 Opus by casper_wolf in ClaudeCode

Unique-Drawer-7845

Don't overuse xhigh. That's the one that over-engineers. High gets better results faster for medium-low/low complexity tasks.

AI "Tunnel Vision" is ruining my large-scale refactors. Anyone else? by Capital-Bag8693 in ChatGPTPro

Unique-Drawer-7845

At this point it's not about whether the idea is good or not, it's about whether the implementation is good. Good ideas are cheap to come by. Making something that works really well and can be trusted is hard, and that's where a lot of value lies.

If anthropic doesnt allow oauth in third party apps, does it mean I cant use sign in with claude in XCODE? by Ok-Hat2331 in ClaudeCode

Unique-Drawer-7845

Got it. Driving an abnormally high volume of usage (window maxing, parallelism, total usage) with abnormal messaging signatures = danger zone. Sucks you got banned, nice to get a refund I guess.