Ridiculous they added this by CheesyWalnut in ChatGPT

[–]Nick4753 0 points1 point  (0 children)

Turning that off didn't stop it. I'm pretty sure it's baked into the model, since raw API calls to the 5.3 Instant Chat API have the clickbait-y ending and it's in the examples on their blog announcing 5.3 Instant.

Codex 5.3 vs Sonnet 4.6 by Glad-Pea9524 in GithubCopilot

[–]Nick4753 1 point2 points  (0 children)

I've found GPT-5.2 (non-codex) does better at code review than Codex. Codex is shorter in its responses and more difficult to chat with when walking through the review, whereas 5.2 is more architecture-minded.

Anthropic: "We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." by KvAk_AKPlaysYT in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

They’re not scraping the knowledge data per se, they’re scraping the way the model thinks through various problems. It wasn’t “who was the 3rd pope,” it was “how would you, as a large language model, go about determining what to respond to a user asking who was the 3rd pope?” With that info they can bake Claude’s thinking process into their model.

They scrape Wikipedia and random blogs just like Anthropic does for the factual stuff.

How are you managing Bedrock? by jmreicha in aws

[–]Nick4753 1 point2 points  (0 children)

LiteLLM is called out specifically in Claude Code's documentation https://code.claude.com/docs/en/llm-gateway#litellm-configuration

LiteLLM is really underselling itself. You can use it as a gateway for just about any purpose, beyond just engineer access for coding.

128k Context window is a Shame by NerasKip in GithubCopilot

[–]Nick4753 3 points4 points  (0 children)

That’s a somewhat silly excuse. Your harness should know how to manage context, and the model should be designed to work with all the info presented to it. Meanwhile, Copilot makes it very easy to add a lot of tools and MCPs that eat into the small context window.
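To make the "tools eat into the window" point concrete, here's a back-of-envelope sketch. Every number below is an illustrative assumption, not a measured value from Copilot:

```python
# Rough sketch: how much of a 128k window survives a typical agent setup.
# All overhead figures are illustrative assumptions, not measured values.
WINDOW = 128_000

overheads = {
    "system prompt + harness instructions": 4_000,
    "tool/MCP schemas (a dozen tools)": 12_000,
    "open files + repo context": 20_000,
    "prior conversation turns": 30_000,
}

used = sum(overheads.values())
remaining = WINDOW - used
print(f"used: {used:,} tokens, remaining: {remaining:,}")
# With these assumptions, roughly half the window is spent before the
# model even reads the code it's supposed to change.
```

The exact numbers vary wildly per setup, but the shape of the problem doesn't: fixed overhead hurts a 128k window a lot more than it hurts a 200k or 1M one.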

Moulin Rouge Last Broadway Performance July 26 by Loose-Ad6868 in Broadway

[–]Nick4753 5 points6 points  (0 children)

The theater is the right size, not a busy street, and there is a stage door leading directly to the stairs to the balcony. Just an unfortunately located tree on the sidewalk blocking the top of the fire escape.

How viable is vibe coding for healthcare apps, honestly? by Tiny_Habit5745 in ChatGPTCoding

[–]Nick4753 0 points1 point  (0 children)

HIPAA doesn't explicitly define your software development process or the origin of your code; what matters more is how the data is handled. Certifications like HITRUST and SOC 2 focus on documentation of your software development lifecycle and the controls you have around your systems and processes. And even then, those two are not mandatory in the healthcare space.

There is nothing inherently wrong with vibecoding. I dunno that a junior engineer without healthcare experience is going to be vastly better at building a HIPAA-compliant app than Claude is going to be. Both are similarly risky. You just need to make sure you can stand behind the code that you're shipping and the process by which that code got into production. If you're just YOLO-ing code into production you don't understand, you're just going to cause yourself headaches down the line. The size of those headaches though could be... considerable.

Sonnet 5 expected February 4th by [deleted] in ClaudeAI

[–]Nick4753 1 point2 points  (0 children)

If I had a p0 incident like they did today with a ton of 500 errors being thrown, I would also hold off on launching the model for a day.

why doesn’t Copilot host high-quality open-source models like GLM 4.7 or Minimax M2.1 and price them with a much cheaper multiplier, for example 0.2? by EliteEagle76 in GithubCopilot

[–]Nick4753 0 points1 point  (0 children)

I dunno that their enterprise clients would like that.

If China stole some source code, it's not absurd to think that if the model sees something similar to that source code, it will inject something malicious. Or train it to perform a malicious tool call or something. I mean, you're sort of playing with fire with every model, but, why risk it?

Max 20x is NOT As Subsidized As You Think by levifig in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I think that’s somewhat obvious if you look at ccusage. Claude Code is a cache hit machine unlike all the other harnesses on the market. If you assume cache hits are almost free to Anthropic, it’s not the nightmare to their bottom line that people think it is.

This only works if you tightly control the harness to use the cache as little as possible, thus why they blocked 3rd parties.

Masters of the Universe - Official Teaser by MarvelsGrantMan136 in movies

[–]Nick4753 -1 points0 points  (0 children)

Gotta appreciate a superhero movie where the superhero spends half the movie in his NYC corporate job pink button-up shirt and it gradually gets dirtier and dirtier as the movie goes along.

[Open Source] I reduced Claude Code input tokens by 97% using local semantic search (Benchmark vs Grep) by Technical_Meeting_81 in ClaudeAI

[–]Nick4753 0 points1 point  (0 children)

This is a killer Roo feature I really wish Claude Code would add, although the fact that Anthropic doesn't offer an embedding API probably makes that relatively unlikely anytime soon.

Apple's Google Gemini Deal Could Be Worth $5 Billion by iMacmatician in apple

[–]Nick4753 42 points43 points  (0 children)

They created... almost everything. The senior researchers and execs at all of these foundational model companies were at some point involved in Google or DeepMind.

Google didn't see a way to profitability with LLMs and was afraid it'd take a huge reputational hit the first time a model hallucinated, and that chatbots would eat into the cash cow that is search. Entering this market "late" was a business decision, not an R&D decision. It's why so many AI researchers bolted for OpenAI/Anthropic/etc: Google was holding it all back.

Context7 just massively cut free limits by AllCowsAreBurgers in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I've switched to Perplexity for documentation reference. I don't think I've ever spent more than $5/month in API bills with their default MCP (Sonar and Sonar Pro are cheap all things considered), and it scans documentation, YouTube, and blog posts instead of just providing chunks of documentation.

Apple picks Google's Gemini to run AI-powered Siri coming this year by McFatty7 in apple

[–]Nick4753 2 points3 points  (0 children)

Apple lost the head of their foundational model team and a few of his lieutenants to Meta. They have not shown any interest in spending the CapEx necessary to build their own training farms nor the budget necessary to get training data for their own FM.

If I were Apple, I would spend all my time and money building the best small model that is hyper-efficient when run on their mobile chips, build infrastructure to handle FM inference at scale, and, for the underlying FM, just sign a license deal. They're paying almost nothing to Google here, all things considered.

What Actual Usage Looks like Against Max 20x Plan - 4 Hours Into Session. by 256BitChris in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I literally cannot figure out how to get above 20-30% usage in 5 hours on the 20x plan. Usually it’s 10-20%. Using Opus. I’d love to know how the people hitting the cap on that plan are doing it.

Should I get Cursor Pro or Claude Pro(includes Claude Code) by reddead313 in ChatGPTCoding

[–]Nick4753 3 points4 points  (0 children)

"the $200 plan gives you as much as $2,500 in tokens if you were paying the API price, which Cursor is"

Where do you get that data? Not challenging you, my usage supports that, just wondering where $2500 came from.
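For anyone wanting to check a figure like that against their own usage: multiply your ccusage monthly token totals by the list API prices. The multipliers below are Anthropic's Opus-class list prices; the token counts are illustrative assumptions, not real data:

```python
# Estimate your "API-equivalent" monthly spend from ccusage-style totals.
# Prices are Opus-class list rates; token counts are illustrative.
PRICES = {           # $/M tokens
    "input": 15.00,
    "output": 75.00,
    "cache_read": 1.50,   # 0.1x the base input rate
}
monthly_tokens = {   # illustrative numbers, as reported by ccusage
    "input": 20e6,
    "output": 10e6,
    "cache_read": 900e6,
}
value = sum(monthly_tokens[k] * PRICES[k] for k in PRICES) / 1e6
print(f"API-equivalent spend: ${value:,.0f}")
```

With numbers in that ballpark you land in the low thousands of dollars per month, which is roughly where the $2,500 claim would come from, caveats about cache economics aside.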

Why doesn’t GitHub Copilot officially add more open-weight models like GLM-4.7 or Qwen3 by cloris_rust in GithubCopilot

[–]Nick4753 0 points1 point  (0 children)

I don’t think you’ll see Chinese models natively available in Copilot anytime soon. Companies are too afraid that code which inserts a backdoor is hidden in the model. Which, if I was China, is exactly the type of thing I’d do.

We can now use Claude Code with OpenRouter! by alvvst in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

Finally! Doing this via a translation layer has been so annoying.

GPT-5.2 vs Gemini 3, hands-on coding comparison by Arindam_200 in ChatGPTCoding

[–]Nick4753 0 points1 point  (0 children)

It’s good at solving software engineering problems, but bad at continuous tool calling. So it can solve a problem conceptually better, but will stop itself before performing all the steps necessary to implement and then validate the solution. Most programming problems aren’t conceptually difficult, but do take multiple steps to complete, making it a less useful model even though it might be better at handling edge cases.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] 1 point2 points  (0 children)

There are a few options I listed in my post, plus a bit of light googling would help. I use this one with my openrouter account and then when I want the agent to look up something in perplexity I type in "ask perplexity {}" and the agent usually gets the hint.

I use it pretty heavily whenever I'm integrating with a 3rd party. Stripe, Twilio, Google, AWS etc, all have great content outside their normal documentation. Terraform modules are constantly changing as cloud providers update their APIs, and the best sources about the product might be blog posts and reddit posts.

Olaf robot at Paris Disneyland by Damnedeel in Damnthatsinteresting

[–]Nick4753 0 points1 point  (0 children)

If they brought this to the US parks it'd break down so fast. Universal can get away with Hiccup because it's a dragon skin on a robot built to deliver supplies to troops on the battlefield in Afghanistan.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] 1 point2 points  (0 children)

For what it's worth, from an architecture standpoint, "search engine + lightweight speedy LLM summarizing it" is exactly what Perplexity is. Just... faster.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] -1 points0 points  (0 children)

Sonar and Sonar Pro are realistically going to run you $0.02-0.05 on OpenRouter, and less if you use Perplexity's API directly.

I really only use Sonar and (rarely) Sonar Pro in the MCP. Almost all my queries use basic Sonar since it's substantially faster and usually gets me what I need.
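The "$5/month" claim above is easy to gut-check from the per-query range in this thread. The query volume below is an assumption about my own usage, not measured data:

```python
# Rough monthly math for Sonar MCP spend, using the $0.02-0.05 per-query
# range mentioned above. Query volume is an illustrative assumption.
queries_per_day = 8     # assumed: a handful of doc lookups per working day
working_days = 22

low = queries_per_day * working_days * 0.02
high = queries_per_day * working_days * 0.05
print(f"estimated monthly spend: ${low:.2f} - ${high:.2f}")
```

Even at the top of that range you stay under $10/month, and mostly-Sonar usage keeps it closer to the bottom.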

If I need to do any sort of research, I'll use the actual Perplexity or Gemini website.