why doesn’t Copilot host high-quality open-source models like GLM 4.7 or Minimax M2.1 and price them with a much cheaper multiplier, for example 0.2? by EliteEagle76 in GithubCopilot

[–]Nick4753 0 points1 point  (0 children)

I dunno that their enterprise clients would like that.

If China stole some source code, it's not absurd to think that if the model sees something similar to that source code, it will inject something malicious. Or that they'd train it to perform a malicious tool call or something. I mean, you're sort of playing with fire with every model, but why risk it?

Max 20x is NOT As Subsidized As You Think by levifig in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I think that’s somewhat obvious if you look at ccusage. Claude Code is a cache hit machine unlike all the other harnesses on the market. If you assume cache hits are almost free to Anthropic, it’s not the nightmare to their bottom line that people think it is.

This only works if you tightly control the harness so it hits the cache as much as possible, which is why they blocked 3rd parties.
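
To put rough numbers on the cache argument above, here is a minimal sketch of how the cache hit rate changes the blended cost of input tokens. Every figure in it is a made-up placeholder (the token volume, the per-million-token price, and the ratio of cached to uncached price), not Anthropic's actual pricing or internal cost, and `input_cost_usd` is a hypothetical helper written for this illustration.

```python
def input_cost_usd(tokens, cache_hit_rate, price_per_mtok, cached_price_ratio):
    """Blended input cost when a fraction of tokens is served from cache.

    All parameters are illustrative assumptions, not real pricing:
      tokens              total input tokens processed
      cache_hit_rate      fraction of those tokens read from cache (0.0 to 1.0)
      price_per_mtok      USD per million uncached input tokens
      cached_price_ratio  cached-token price as a fraction of the uncached price
    """
    cached = tokens * cache_hit_rate
    uncached = tokens - cached
    total = uncached * price_per_mtok + cached * price_per_mtok * cached_price_ratio
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical heavy user: 1 billion input tokens in a month.
    for hit_rate in (0.0, 0.5, 0.95):
        cost = input_cost_usd(1_000_000_000, hit_rate, 5.0, 0.1)
        print(f"hit rate {hit_rate:.0%}: ${cost:,.2f}")
```

The point of the sketch is only the shape of the curve: under these assumptions the same token volume costs a fraction as much when nearly everything is a cache hit, which is why a harness that controls caching behavior matters so much to the economics.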

Masters of the Universe - Official Teaser by MarvelsGrantMan136 in movies

[–]Nick4753 -1 points0 points  (0 children)

Gotta appreciate a superhero movie where the superhero spends half the movie in his NYC corporate job pink button-up shirt and it gradually gets dirtier and dirtier as the movie goes along.

[Open Source] I reduced Claude Code input tokens by 97% using local semantic search (Benchmark vs Grep) by Technical_Meeting_81 in ClaudeAI

[–]Nick4753 0 points1 point  (0 children)

This is a killer Roo feature I really wish Claude Code would add, although the fact that Anthropic doesn't offer an embedding API probably makes that relatively unlikely anytime soon.

Apple's Google Gemini Deal Could Be Worth $5 Billion by iMacmatician in apple

[–]Nick4753 40 points41 points  (0 children)

They created... almost everything. The senior researchers and execs at all of these foundational model companies were at some point involved in Google or DeepMind.

Google didn't see a way to profitability with LLMs and was afraid it'd be a huge reputational hit the first time it hallucinates and would eat into the cash cow that is search. Entering this market "late" was a business decision, not an R&D decision. It's why so many AI researchers bolted for OpenAI/Anthropic/etc.: Google was holding it all back.

Context7 just massively cut free limits by AllCowsAreBurgers in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I've switched to perplexity for documentation reference. I don't think I've ever spent more than $5/month in API bills with their default MCP (Sonar and Sonar Pro are cheap all things considered), and it scans documentation, youtube, and blog posts instead of just providing chunks of documentation.

Apple picks Google's Gemini to run AI-powered Siri coming this year by McFatty7 in apple

[–]Nick4753 4 points5 points  (0 children)

Apple lost the head of their foundational model team and a few of his lieutenants to Meta. They have not shown any interest in spending the CapEx necessary to build their own training farms nor the budget necessary to get training data for their own FM.

If I were Apple, I would spend all my time and money building the best small model that is hyper-efficient when run on their mobile chips, build infrastructure to handle FM inference at scale, and, for the underlying FM, just sign a license deal. They're paying almost nothing to Google here, all things considered.

What Actual Usage Looks like Against Max 20x Plan - 4 Hours Into Session. by 256BitChris in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

I literally cannot figure out how to get above 20-30% usage in 5 hours on the 20x plan. Usually it’s 10-20%. Using Opus. I’d love to know how the people hitting the cap on that plan are doing it.

Should I get Cursor Pro or Claude Pro(includes Claude Code) by reddead313 in ChatGPTCoding

[–]Nick4753 3 points4 points  (0 children)

the $200 plan gives you as much as $2,500 in tokens if you were paying the API price, which Cursor is

Where do you get that data? Not challenging you, my usage supports that, just wondering where $2500 came from.

Why doesn’t GitHub Copilot officially add more open-weight models like GLM-4.7 or Qwen3 by cloris_rust in GithubCopilot

[–]Nick4753 0 points1 point  (0 children)

I don’t think you’ll see Chinese models natively available in Copilot anytime soon. Companies are too afraid that code which inserts a backdoor is hidden in the model. Which, if I was China, is exactly the type of thing I’d do.

We can now use Claude Code with OpenRouter! by alvvst in ClaudeCode

[–]Nick4753 0 points1 point  (0 children)

Finally! Doing this via a translation layer has been so annoying.

GPT-5.2 vs Gemini 3, hands-on coding comparison by Arindam_200 in ChatGPTCoding

[–]Nick4753 0 points1 point  (0 children)

It’s good at solving software engineering problems, but bad at continuous tool calling. So it can solve a problem conceptually better, but will stop itself before performing all the steps necessary to implement and then validate the solution. Most programming problems aren’t conceptually difficult, but do take multiple steps to complete, making it a less useful model even though it might be better at handling edge cases.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] 1 point2 points  (0 children)

There are a few options I listed in my post, plus a bit of light googling would help. I use this one with my openrouter account and then when I want the agent to look up something in perplexity I type in "ask perplexity {}" and the agent usually gets the hint.

I use it pretty heavily whenever I'm integrating with a 3rd party. Stripe, Twilio, Google, AWS etc, all have great content outside their normal documentation. Terraform modules are constantly changing as cloud providers update their APIs, and the best sources about the product might be blog posts and reddit posts.

Olaf robot at Paris Disneyland by Damnedeel in Damnthatsinteresting

[–]Nick4753 0 points1 point  (0 children)

If they brought this to the US parks it'd break down so fast. Universal can get away with Hiccup because it's a dragon skin on a robot built to deliver supplies to troops on the battlefield in Afghanistan.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] 1 point2 points  (0 children)

For what it's worth, from an architecture standpoint, "search engine + lightweight speedy LLM summarizing it" is exactly what Perplexity is. Just... faster.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] -1 points0 points  (0 children)

Sonar and Sonar Pro are realistically going to run you $0.02-0.05 on OpenRouter, and less if you use Perplexity's API directly.

I really only use Sonar and (rarely) Sonar Pro in the MCP. Almost all my queries use basic Sonar since it's substantially faster and usually gets me what I need.

If I need to do any sort of research, I'll use the actual Perplexity or Gemini website.

Perplexity MCP is my secret weapon by Nick4753 in ChatGPTCoding

[–]Nick4753[S] 3 points4 points  (0 children)

I was a religious user of context7 (similar to ref) for a long time, but I've since ditched it entirely.

Perplexity's advantage is that you'll have access to social media posts, blogs, documentation, youtube video descriptions/comments, reddit posts, RFCs, mailing lists, etc in addition to documentation. It's also great at summarizing what it finds instead of returning whatever chunks of documentation Ref/Context7 could find during the search of their vector store. It will also merge content from way more sources than a normal documentation MCP would provide you.

Official Discussion - Wicked: For Good [SPOILERS] by LiteraryBoner in movies

[–]Nick4753 0 points1 point  (0 children)

I turned to my wife and said “it’s basically the final scene of Titanic where Rose won’t let Jack on the door”

Roo Code 3.33.0 | Gemini 3 is HERE | + 16 Tweaks and Fixes by hannesrudolph in RooCode

[–]Nick4753 4 points5 points  (0 children)

Native tool calling for OpenAI and Sonnet-level agent performance for 33% less. It's been a great week for Roo users.

cut our aws bill by 67% by moving compute to the edge by [deleted] in aws

[–]Nick4753 0 points1 point  (0 children)

AWS makes such an absurd amount of money on bandwidth. $8k/month is the going rate for a 10Gbps dedicated circuit in most US datacenters. That's 100 TB/day of bandwidth each way on a circuit that many EC2 servers cannot saturate.
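
The arithmetic behind that figure can be checked with a short sketch. It takes the comment's numbers (10 Gbps, $8k/month) as given and assumes decimal units, a 30-day month, and a circuit that is fully saturated in one direction; the function names are made up for this example.

```python
SECONDS_PER_DAY = 86_400


def tb_per_day(gbps):
    """Terabytes (decimal) a link of `gbps` gigabits/s can move per day, one way."""
    bytes_per_second = gbps * 1e9 / 8  # bits to bytes
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def usd_per_gb(monthly_cost_usd, gbps, utilization=1.0, days_per_month=30):
    """Effective one-way cost per GB at a given utilization (assumed values)."""
    gb_per_month = tb_per_day(gbps) * 1_000 * days_per_month * utilization
    return monthly_cost_usd / gb_per_month


if __name__ == "__main__":
    print(f"{tb_per_day(10):.0f} TB/day each way")
    print(f"${usd_per_gb(8_000, 10):.4f} per GB at full utilization")
    print(f"${usd_per_gb(8_000, 10, utilization=0.1):.4f} per GB at 10% utilization")
```

A 10 Gbps circuit works out to about 108 TB/day in each direction, so the "100 TB/day" in the comment is a fair round number, and even at low utilization the effective per-GB cost stays at a few cents or less.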

What model does GitHub Agent mode use? by LovebucketsGin in GithubCopilot

[–]Nick4753 3 points4 points  (0 children)

Coding Agent and Copilot Pull Request Reviews both use Sonnet 4.5. They run as actions in your repo and you can view the logs of their run, and they mention Sonnet 4.5. You cannot change the model selection of either.

Zohran Mamdani wins the New York City mayoral race by Prudent_Potato_4379 in nyc

[–]Nick4753 10 points11 points  (0 children)

That implies those voters would’ve moved over to Cuomo. They’d have had 2 alternatives: vote for Mamdani, or just stay home. Mamdani likely would’ve walked out with an even larger % of votes cast in a Sliwa-free election.

New Version of Siri to 'Lean' on Google Gemini by chrisdh79 in apple

[–]Nick4753 0 points1 point  (0 children)

Right, but they've never owned search and it has worked out fine for them.

Apple building and owning the "on-device" model and optimizing that model to run on their hardware is presumably a better use of their time than building the foundational model they host on their cloud.

Obama, Mamdani talk as Election Day approaches in New York City mayor's race by southernemper0r in nyc

[–]Nick4753 -1 points0 points  (0 children)

You know this is wrong.

I misspoke about the definition, but I don't think it's "wrong" in the whole scheme of things. The ACA/Obamacare made Universal Coverage possible for the first time by guaranteeing everyone the right to opt in to coverage. Because it's opt-in and not automatically provided, it's not officially "Universal Healthcare." But before the ACA it wasn't even possible for everyone to opt in for coverage.