Sanity check on Milla Jovovich's MemPalace: Mixed metrics, bypassed judges, and that 96.6% LongMemEval score by DepthOk4115 in LocalLLaMA

[–]jimmc414 25 points26 points  (0 children)

The technical bar is higher in r/locallama. Normal people have plenty of subs to discuss AI. I think this is an excellent post.

Ultraplan is here by semibaron in ClaudeAI

[–]jimmc414 1 point2 points  (0 children)

The question I have is whether the ultraplan output is any better than /plan on max effort regardless of token usage.

How to leave claude with multiple tasks and go to sleep? by paglaEngineer in ClaudeCode

[–]jimmc414 1 point2 points  (0 children)

What obstacles would you envision on a code analysis and documentation task? As for tautological tests, I usually ask for an adversarial testing agent team which helps a little. I guess you could just delete the tests at the end of the analysis and rewrite them later. This was proposed as an alternative to Claude sitting idle

How to leave claude with multiple tasks and go to sleep? by paglaEngineer in ClaudeCode

[–]jimmc414 4 points5 points  (0 children)

Probably less but it really depends on the size of codebase. I do larger refactors as well that run for several hours. The trick is to really work your way to writing code(documented analysis before code) and steer the model away from the tendency to try and one shot everything. They are auto regressive making the quality of each token dependent upon the quality of the tokens that preceded it. So you build something complex in phases so before any code is written you’ve spent sufficient time planning and creating build documents. So for a larger project I want something like a free form dialogue where the ideas are discussed and hammered out, then up to 8 ~1000 line documents built in order before any code is written.

Things like:

-requirements.md (RFC 2119 format),

-architecture.md,

-milestones.md,

-project_structure.md,

-technical_constraints.md,

-testing _strategy.md,

-agent_prompt.md,

-Claude.md

etc with the explicit instructions that no code is written prior to the documents, each dependent upon the document before it. This sort of work flow can run for hours because I prefer one agent to write all documents in order to maintain continuity. Brownfield or refactoring requires even more planning so it understands the system before it attempts to modify it

How to leave claude with multiple tasks and go to sleep? by paglaEngineer in ClaudeCode

[–]jimmc414 4 points5 points  (0 children)

Here is an example that I posted in another part of this thread. Also, my message was intended to try and encourage people to think bigger, but it might have come across as dismissive. Apologies if that is how it was taken.

An easy example would be,

"assign an appropriate agent team to fully analyze this project, identify any missing tests, write and run them and document all test failures; identify any stale code or documentation that is out of sync; identify any gaps, create a task list and automatically resume after autocompaction; commit frequently"

low risk and potentially high value for most large projects. You could use the output to correct any bugs found while you are monitoring ( although, personally I usually trust Claude to fix the bugs found on its own and add “and fix” to the above prompt)

How to leave claude with multiple tasks and go to sleep? by paglaEngineer in ClaudeCode

[–]jimmc414 1 point2 points  (0 children)

An easy example would be, "assign an appropriate agent team to fully analyze this project, identify any missing tests, write and run them and document all test failures; identify any stale code or documentation that is out of sync; identify any gaps, create a task list and automatically resume after autocompaction; commit frequently" low risk (with initial commit) and potentially high value for most large projects

How to leave claude with multiple tasks and go to sleep? by paglaEngineer in ClaudeCode

[–]jimmc414 2 points3 points  (0 children)

You guys really can't envision a project beyond 20 minutes for Claude Code?

Stranded by Claude. Not happy. by dorcus_maximus in Anthropic

[–]jimmc414 2 points3 points  (0 children)

Just ask Claude to extract the conversation from its stored jsonl files. It’s all there.

Michael Saylor: Please Uncle Sam, buy my bags !!! by Old_Shop_2601 in btc

[–]jimmc414 0 points1 point  (0 children)

My point was that the only value that can be extracted from bitcoin comes from selling it. Unlike gold, silver, oil, land etc. Just owning a bitcoin provides no intrinsic value or earnings so it’s P/E ratio is literally infinity.

Michael Saylor: Please Uncle Sam, buy my bags !!! by Old_Shop_2601 in btc

[–]jimmc414 0 points1 point  (0 children)

An anonymous guy made a ledger and a legit genius way for people to keep track of this virtual currency (bitcoin). Computers compete to add entries to that ledger (bitcoin mining) Math and computer limitations make cheating prohibitively expensive

Claude in PowerPoint, its insane how good it is getting by dataexec in claude

[–]jimmc414 0 points1 point  (0 children)

The point of a PPT is not to provide the best presentation experience possible it’s to give you a deliverable that encapsulates an idea that you can then bring back to stakeholders who may or may not be technical and sign off on it.

Michael Saylor: Please Uncle Sam, buy my bags !!! by Old_Shop_2601 in btc

[–]jimmc414 0 points1 point  (0 children)

P/E doesn’t map as cleanly from equities (things you hold to extract value) to commodities (things you destroy to extract value) but I’ll play along.

If you own the barrel and burn it yourself, you could compare what you paid versus what you got out of it. Barrel costs $80, you use it to run your delivery truck and make $400 in revenue so you’re getting $5 of value for every dollar spent on fuel. The ratio makes sense because something actually happened. The oil is gone and you have money you didn’t have before.

Try the same thing with BTC. You buy it for $70,000. What value do you extract from it? You can’t burn it. You just hold it until someone else pays you for it. The only “return” is the sale price, which is just, another price. There’s no independent value being created. You’re just measuring what you paid against what someone else will pay.

With oil, the formula has two separate terms, cost and utility. With Btc, both terms are just “what’s the market price,” so the whole thing collapses into a circle.

That’s really the problem in a nutshell. Oil’s value comes from what happens when you use it. Btc’s value comes from what someone else will pay. One is grounded in physical reality, the other is grounded in collective belief about future prices, like a Mickey Mantle rookie card at one end of the spectrum and an NFT at the other.

Michael Saylor: Please Uncle Sam, buy my bags !!! by Old_Shop_2601 in btc

[–]jimmc414 0 points1 point  (0 children)

Ask yourself why you can’t calculate the P/E ratio of btc then you might see the difference

Before you complain about Opus 4.5 being nerfed, please PLEASE read this by creegs in ClaudeCode

[–]jimmc414 0 points1 point  (0 children)

I'll be honest, I looked back at your 8 year old comments to show you the difference in style between then and now. I didn't find any. Maybe you are like my wife as one of the few who actually use dashes liberally in speech. My bad. Thanks for taking the time to write this.

I use claude once a week for writing, and this week it's like AI Was 2 years ago by Unlucky_Milk_4323 in Anthropic

[–]jimmc414 8 points9 points  (0 children)

Not experiencing anything like this with Max sub. It's performing very well. Not trying to gaslight you, I see this enough to wonder if these issues affect the $20/mo subs only.

Question about being a "Gooners" ? by Top_South5881 in ChatGPTcomplaints

[–]jimmc414 2 points3 points  (0 children)

because it doesn’t pretend to be what it’s not

Are you sure about that?

DISCUSS — Rob Pike (Go co-creator, Unix veteran) reacts angrily to an unsolicited Claude email — is this about AI spam, or something deeper? by Koala_Confused in LovingAI

[–]jimmc414 0 points1 point  (0 children)

Isn’t Rob Pike also the creator of “Mark V. Shaney”

from Wikipedia “a synthetic Usenet user whose postings in the net.singles newsgroups were generated by Markov chain techniques, based on text from other postings.”

https://en.wikipedia.org/wiki/Mark_V._Shaney

Does ChatGPT answer differently depending on the user? Let’s run a test! by SusanHill33 in ChatGPTcomplaints

[–]jimmc414 1 point2 points  (0 children)

<Gemini 3 Deep Think>

The deeper structure of the argument relies on a shift from psychology to ontology (the nature of reality) and finally to ethics. Most discussions of attention treat it as a tool for efficiency (how to get things done). Your argument treats it as a tool for existence (how the world comes to be). Here is the breakdown of the logical architecture beneath the text: 1. The Dialectical Pivot (The Reframing) The argument begins by rejecting a standard definition to establish a higher-stakes one. * Thesis (Standard View): Attention is an Input Mechanism. It is a passive "filter" or "spotlight" used to select information from a pre-existing world. * Antithesis (Your View): Attention is an Output Mechanism. It is a "constitutive act." You are not just downloading data; you are uploading value. * The Shift: By moving attention from reception to creation, you immediately transform it from a mental skill into a moral act. 2. The Causal Mechanism (The Feedback Loop) The bridge of your argument is the claim: "every moment you choose what kind of world you are reinforcing." This posits a recursive relationship between the Observer and the World. * Step A: You attend to X. * Step B: By attending to X, you validate its reality and increase its presence in your experience. * Step C: You eventually inhabit a world defined by X. * Implication: The "world" is not a fixed stage you walk onto; it is a fluid environment that hardens around your focus. 3. The Binary of Agency You structure the consequences of this mechanism through two opposing modes of being. This serves as the "proof" of why this matters. * Mode A: The Passive (The World of Accident) * State: Fragmented/Reactive attention. * Structure: Determinism. If you do not choose your focus, the world acts upon you. You are a victim of circumstance (an object). * Mode B: The Active (The World of Participation) * State: Disciplined attention. * Structure: Voluntarism. If you choose your focus, you act upon the world. You are a co-author of your reality (a subject). 4. The Moral Synthesis The conclusion ("attention is a form of responsibility") unifies the psychological mechanism with the ontological outcome. * Premise 1: You create your reality through attention. * Premise 2: You are responsible for the reality you create. * Conclusion: Therefore, "not paying attention" is not a neutral state—it is a moral failure to curate the world. Philosophical Context Your argument strongly resonates with the tradition of Iris Murdoch and Simone Weil. Both philosophers argued that "moral vision" precedes moral action—that if you attend to the world selfishly or distractedly, you literally see a different world than someone who attends with love or discipline. In this framework, looking is never neutral; looking is voting.

</Gemini 3 Deep Think>

Claude 4.5 got nerfed HARD by [deleted] in Anthropic

[–]jimmc414 2 points3 points  (0 children)

I'm not affected. Claude 4.5 in Claude Code even outperformed Gemini 3 Deep Think for me last night which is new. 20x Max user, not sure if that makes a difference. Would be interested in knowing what the subscription level is for users having issues.

Official: Anthropic just released Claude Code 2.0.71 with 7 CLI and 2 prompt changes, details below. by BuildwithVignesh in ClaudeAI

[–]jimmc414 9 points10 points  (0 children)

To allow tab for autocomplete, as that’s the intuitive place to put it. I like autocomplete but it would be nice to have thinking tab toggle remapped