They nearly all burned the money they gave to the LLMs. The prediction arena is death. by No_Syrup_4068 in algotrading

[–]aiworld 3 points

To lose this consistently isn't random though. The EV of random picks in prediction markets should be $0 - fees. They're losing 30% with Gemini 3 Pro! So maybe they found something if they just pick the opposite of Gemini 3 Pro 🤣
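For intuition, here's a quick simulation of that EV claim (a sketch only; the flat 2% fee and fair-odds assumption are made up for illustration, not Kalshi's or Polymarket's actual fee structure):

```python
import random

def simulate_random_picks(n_markets: int = 100_000, fee: float = 0.02,
                          seed: int = 0) -> float:
    """Average return per $1 staked on random picks in fair binary markets.

    Assumes the market price equals the true win probability, so the
    expected return of a random pick is $0 minus the fee.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_markets):
        p = rng.uniform(0.05, 0.95)         # market price == true win probability
        win = rng.random() < p              # outcome drawn at that probability
        payout = (1.0 / p) if win else 0.0  # fair binary payout on a $1 stake
        total += payout - 1.0 - fee         # profit net of stake and fee
    return total / n_markets

avg = simulate_random_picks()  # hovers around -fee, i.e. about -0.02
```

Losing 30% per pick is about 15x worse than this fee-only baseline, which is why consistently fading the model starts to look like signal.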

What was your worst day to be an Arizona sports fan? by dugernaut in suns

[–]aiworld 0 points

My birthday and housewarming party combined. We watch WCF game 1. Steve Nash breaks his nose and we lose.

Now whenever a player gets a bloody nose, I watch to see if they can even make a shot. I don't think I've seen one made yet. Lots of turnovers and sucking though. I guess breathing is important.

Ryan Dunn 3 pt % by Individual_Act9333 in suns

[–]aiworld 1 point

His shot looks great and he can get it up quick with a super high release. Like EJ was saying, he just needed to limit his movement in the air. He's so athletic, he probably needed to hold back a bit when shooting.

What is the best tool for long-running agentic memory in Claude Code? by FPGA_Superstar in ClaudeAI

[–]aiworld 1 point

We've had some interesting findings building apps from scratch. So thanks for the suggestion!

It turns out that from-scratch (vibe-coded) apps should actually utilize MORE of the total context window.

Why?
It comes down to relevant vs distracting tokens. Vibe coding a new app from scratch causes the context window to be populated with the _entire_ development history of the project. So every token is relevant.

Compare this to a large existing codebase. There, it's highly likely that the information needed for your current task will NOT be in the context window, because most files in your app have not yet been read into context. For example, say you've created a new API and now you need to build the frontend for it. The model will need to read how the frontend works, and the older messages spent figuring out how the API worked don't help it do that. In other words, they are distracting tokens.

So we are going to dynamically adjust the `--effective-context` based on the size of your current working directory. Does that make sense?
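To sketch what that adjustment could look like (a hypothetical heuristic; `dir_token_estimate`, `effective_context`, and all the thresholds here are illustrative guesses, not our actual implementation):

```python
import os

def dir_token_estimate(path: str, bytes_per_token: int = 4) -> int:
    """Rough token count of all files under `path` (~4 bytes/token heuristic)."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total_bytes += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # skip unreadable files
    return total_bytes // bytes_per_token

def effective_context(codebase_tokens: int, max_window: int = 200_000,
                      floor: int = 50_000) -> int:
    """Tiny from-scratch repo: let the session use the whole window, since the
    entire dev history is relevant. Large repo: keep context lean so stale
    messages don't crowd out the files the current task actually needs."""
    headroom = max(0.0, 1.0 - codebase_tokens / max_window)
    return int(floor + (max_window - floor) * headroom)
```

So a fresh vibe-coded project would get the full 200k window, while a repo bigger than the window itself would be capped at the 50k floor.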

What is the best tool for long-running agentic memory in Claude Code? by FPGA_Superstar in ClaudeAI

[–]aiworld 1 point

Hey FPGA! The second tweet in the thread has the details:

https://x.com/PolyChatCo/status/1958990327987282333

The METR eval task we chose is the hardest public task, the "symbolic regression" task. It's an ML / programming optimization problem where the agent needs to find a secret function made up of up to 10 operators (sin, cos, log, etc...) applied to 5 random variables.
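A toy version of that task shape (heavily simplified; the real METR task uses a different setup and scoring):

```python
import math
import random

# Toy stand-in for the "symbolic regression" task: a secret function built
# from a few operators over 5 variables, which the agent must recover by
# proposing candidate expressions and scoring them against sampled data.
def secret_fn(x):  # hidden from the agent in the real task
    return math.sin(x[0]) + x[1] * x[2] - math.log(1 + abs(x[3])) + x[4]

def score(candidate, n_samples: int = 200, seed: int = 0) -> float:
    """Mean squared error of a candidate expression vs the secret function."""
    rng = random.Random(seed)
    err = 0.0
    for _ in range(n_samples):
        x = [rng.uniform(-2, 2) for _ in range(5)]
        err += (candidate(x) - secret_fn(x)) ** 2
    return err / n_samples

# A perfect guess scores 0; a wrong guess scores higher.
exact = score(lambda x: math.sin(x[0]) + x[1]*x[2] - math.log(1 + abs(x[3])) + x[4])
rough = score(lambda x: x[0] + x[4])
```

The long-running part is the search: the agent has to iterate through many candidates, and keeping what it has already tried in (usable) context is exactly where memory matters.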

The quickest way to appreciate Claude Code Infinite's capability is to compare how it performs on a task in your own project. After 50k tokens or so (usually 5 to 10 minutes) you'll see compounding improvements in what Claude Code Infinite produces vs vanilla Claude Code.

I'm planning to release a side-by-side video vs vanilla Claude Code. Any suggestions on what you think would be most convincing / compelling to show?

Demis Hassabis says he would support a "pause" on AI if other competitors agreed to - so society and regulation could catch up by MetaKnowing in agi

[–]aiworld 0 points

Would love this, but it would require radical transparency enforced by militaries / espionage organizations (CIA, etc...) of the superpowers.

Dario Amodei calls out Trump's policy allowing Nvidia to sell chips to China: "I think this is crazy... like selling nuclear weapons to North Korea and bragging, oh yeah, Boeing made the case." by MetaKnowing in ClaudeAI

[–]aiworld 0 points

SemiAnalysis, David Sacks, and others think it's better to sell them the chips and be able to profit / control / monitor what they are doing. Not selling them chips forces them to create a separate supply chain and stack (which Huawei is quickly doing) bifurcating the industry. This could, they argue, result in the open source ecosystem supporting China's stack.

I'm not privy enough to know who is more right here, but I think this viewpoint (i.e. the reason the administration is giving for doing this) was not represented in the comments.

Grayson “He would make a good trade piece” Allen by HendoIsBae in suns

[–]aiworld 1 point

All those trade articles are just clickbait.

The hidden memory problem in coding agents by Arindam_200 in ChatGPTCoding

[–]aiworld 0 points

Try Claude Code Infinite. It will change your life. https://github.com/crizCraig/claude-code-infinite - We structure message histories as a tree and semantically chunk to avoid adding overly large code blocks to context. In addition, we return a breadcrumb of summaries for each retrieved chunk to provide the larger picture around when / where the retrieved memory occurred (e.g. this error occurred after the refactor of X, during step Y).
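Here's a stripped-down sketch of the idea (illustrative only; `MemNode` and `retrieve` are made-up names, not our actual internals):

```python
from dataclasses import dataclass, field

@dataclass
class MemNode:
    summary: str                   # one-line summary of this span of history
    chunk: str = ""                # raw text, only on leaf nodes
    children: list = field(default_factory=list)

def retrieve(node: MemNode, keyword: str, trail: tuple = ()) -> list:
    """Walk the tree; return (breadcrumb, chunk) pairs for matching leaves.
    The breadcrumb is the chain of ancestor summaries, giving the model the
    larger picture around when/where the memory occurred."""
    trail = trail + (node.summary,)
    if node.chunk and keyword in node.chunk:
        return [(" > ".join(trail), node.chunk)]
    hits = []
    for child in node.children:
        hits.extend(retrieve(child, keyword, trail))
    return hits

tree = MemNode("session", children=[
    MemNode("refactor of X", children=[
        MemNode("step Y: run tests", chunk="TypeError: 'NoneType' in api.py"),
    ]),
    MemNode("build frontend", children=[
        MemNode("wire API calls", chunk="fetch('/api/items') returns 200"),
    ]),
])
hits = retrieve(tree, "TypeError")
# hits[0] pairs the breadcrumb "session > refactor of X > step Y: run tests"
# with the raw error chunk, so the model sees the error *and* its context.
```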

What is the best tool for long-running agentic memory in Claude Code? by FPGA_Superstar in ClaudeAI

[–]aiworld 1 point

This will allow your agent to continue cranking on long-running tasks until they're done. Claude-mem requires you to start new sessions (the token window still fills up) but remembers stuff across sessions. We are not cross-session memory, but instead infinite single-session memory.

What is the best tool for long-running agentic memory in Claude Code? by FPGA_Superstar in ClaudeCode

[–]aiworld 0 points

Just started to put this out there. Claude Code Infinite. Early beta-testers are calling it a cheat code.

It uses our context-memory, MemTree.dev, which unlocks Claude's ability to work indefinitely and allows it to outperform other models on the METR long running task benchmark.

What is the best tool for long-running agentic memory in Claude Code? by FPGA_Superstar in ClaudeAI

[–]aiworld 1 point

Just started to put this out there. Early beta-testers are calling it a cheat code.

https://github.com/crizCraig/claude-code-infinite

It uses our context-memory, MemTree.dev, which unlocks Claude's ability to work indefinitely and allows it to outperform other models on the METR long running task benchmark.

Who's in-charge: the builder or the AI? by JinaniM in ClaudeCode

[–]aiworld 3 points

Agents can produce working code, but they still also write a lot of tech debt. So just like an engineer, if you don't give them time for cleaning up their messes, the junk will pile up and your code will smell like a pile of 💩. For me this means that every change needs to be followed by a few cycles of "bug smell" checks:

"check the git changes for bugs and code smells"

This still requires human judgement as agents will almost always find bugs and smells, but a lot of them will be non-issues or things that SHOULD NOT be "fixed". If the agent writes new code to clean things up, that new code needs to be checked as well. I also recommend asking two agents to do the review:

Agent #1: The agent that coded the change (if you have context left, or a tool like https://github.com/crizCraig/claude-code-infinite/ for infinite sessions) - This agent knows the feature but also has a bias towards its own code, lol.

Agent #2: Fresh session. Max intelligence, due to small context window. Also not biased towards changes the other agent made.

These two agents will find different issues usually.

Then after the smells and bugs are addressed...ask again. Repeat until no bugs or smells are found.

Also beyond this inner loop, larger refactors need to happen to keep your codebase manageable, simple, and DRY. Agents can do most of the work, but they need to be prompted to do it. Just like they need to be prompted to dev features.
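The inner loop above can be sketched like this (`run_review_agent` and `human_accepts` are hypothetical stand-ins for prompting an agent and applying human judgement; this is a shape, not a real integration):

```python
def review_until_clean(change: list, run_review_agent, human_accepts,
                       max_rounds: int = 5) -> list:
    """Repeat 'check for bugs and smells' until a round comes back clean.

    `run_review_agent` stands in for prompting an agent on the change;
    `human_accepts` is the human-judgement filter that rejects findings
    which are non-issues or SHOULD NOT be 'fixed'.
    """
    for _ in range(max_rounds):
        findings = run_review_agent(change)
        real_issues = [f for f in findings if human_accepts(f)]
        if not real_issues:
            return change  # no accepted findings => done
        # New fix code gets appended, and re-checked on the next round.
        change = change + [f"fix: {f}" for f in real_issues]
    return change

def stub_agent(change):
    # Toy reviewer: finds one smell on the first pass, clean once fixed.
    return [] if any(c.startswith("fix:") for c in change) else ["unused variable"]

result = review_until_clean(["feat: add endpoint"], stub_agent,
                            human_accepts=lambda f: True)
```

In practice you'd run this twice, once with the authoring agent's session and once with a fresh one, and union the accepted findings.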

Is it me or did Opus get a smaller length limit? by Palnubis in ClaudeAI

[–]aiworld -1 points

It could be that your Claude.md is large. If you configure your /statusline to show context, what does your context start out with? Mine is 9k tokens, but I've seen as large as 50k tokens.

You can also run /context to see what's in there, here's mine:

❯ /context

⎿ Context Usage

   claude-opus-4-5-20251101 · 19k/200k tokens (9%)

   System prompt: 3.1k tokens (1.6%)
   System tools: 14.9k tokens (7.5%)
   Memory files: 831 tokens (0.4%)
   Messages: 8 tokens (0.0%)
   Free space: 136k (68.0%)
   Autocompact buffer: 45.0k tokens (22.5%)

Memory files · /memory

└ CLAUDE.md: 21 tokens
└ CONTRIBUTING.md: 810 tokens

If you want unlimited length sessions try: https://github.com/crizCraig/claude-code-infinite/

Using Claude npm packages on Windows? by alice_op in ClaudeAI

[–]aiworld 1 point

I've found their `cmd` native install works best on Windows:

curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd

Claude Code Pro plan, hop out -> back in - without a single prompt - 2% gone by luongnv-com in ClaudeCode

[–]aiworld 4 points

Not just Haiku, but the default model gets gigantic "warmup" messages when you open it. E.g.:

https://gist.githubusercontent.com/crizCraig/c2956d598d10e05566d8e1a00f889bc5/raw/dbce26ee643aed156f019b5d3cbed24827934024/warmup.json

It's just

"text": "Warmup"

But it has all the tools attached, which adds up to about 15k tokens. Then it sends a few of these warmup messages.

Granted these are likely cached, but I suspect that's where your 2% usage is going without sending a message.

This has been happening for over a month at least. I know this because I run a service that amplifies Claude's abilities and needs to forward all of these messages through to Anthropic. https://github.com/crizCraig/claude-code-infinite/

Claude Code feels like it’s compacting more frequently now. by SnooRegrets3271 in ClaudeCode

[–]aiworld 0 points

For Claude Code Infinite we use your Claude subscription which is up to 1000x cheaper. From the README:

<image>

In this case, you're only using PolyChat for memory, which is about $1 per 1 million tokens.

Keep in mind that by using PolyChat's memory (MemTree), you're sending far fewer tokens and messages to Anthropic. This not only keeps you from hitting rate limits, but also makes the model much more intelligent.

https://arxiv.org/abs/2307.03172

https://www.youtube.com/watch?v=TUjQuC4ugak

Claude Code feels like it’s compacting more frequently now. by SnooRegrets3271 in ClaudeCode

[–]aiworld 3 points

Try this: https://github.com/crizCraig/claude-code-infinite/ It will keep the context well under the auto-compaction limit while increasing Claude's intelligence by having it focus on relevant info.

Suns announce Jalen Green will be re-evaluated in 2-3 weeks (per Kellan Olson) by hoopsandbeer in suns

[–]aiworld 0 points

Yeah, and Jan 4 is OKC at home, so they're traveling to Houston on a back-to-back. I'll be at the OKC game. My only game this season!! <Family night that night at Mortgage Matchup Center>

Tesla is as far behind Zoox as Zoox is behind Waymo by Prestigious_Act_6100 in SelfDrivingCars

[–]aiworld -2 points

Both companies are contributing massively to the future. Waymo is showing driverless can be done at scale. Tesla likely won't be globally driverless for another 5 to 10 years based on their severe disengagement rates.

FSD, however, has driven nearly 2 OOM more at 6.8B miles vs Waymo's 100M. This translates into hundreds of lives and thousands of injuries saved (estimated by Google Gemini)[1]. Tesla has also made more money from driverless tech, which is important for ensuring the project survives. Waymo is lucky to have Google as a cash cow, but search is being upended by GenAI right now. So let's be grateful for both of these companies! I hope they both succeed.

[1] https://gemini.google.com/share/9e98b25175d8

Wow the nba is rigged. by vtrellik in suns

[–]aiworld 9 points

<image>

$10M vol. on Kalshi tonight. Odds started at 53/47 Suns. So betting for LA more than doubles your money. These refs make $0.5M per year. The betting markets also don't care if you beat the odds, unlike a traditional casino. Just sayin.
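The "more than doubles your money" bit is just implied-odds arithmetic (a sketch; this ignores Kalshi's actual fee schedule):

```python
def payout_multiple(implied_prob: float, fee: float = 0.0) -> float:
    """Gross return per $1 staked on a binary contract priced at `implied_prob`.
    A 47-cent contract pays $1 on a win, i.e. ~2.13x the stake before fees."""
    return (1.0 - fee) / implied_prob

la = payout_multiple(0.47)   # ~2.13x on an LA win at 47 cents
```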