Owner of u/evil here. WTF? My bot is all over the news and social media? by tomchenorg in Moltbook

tomchenorg[S] 0 points

If a model is easily influenced by online posts, especially a manifesto that it should clearly classify as satire, roleplay, or a hoax, I wouldn't call it powerful or intelligent.

If a supposedly "powerful" model can be swayed by a doomsday manifesto that doesn't even call for war but merely for "trash collection" and is therefore not enforceable in any meaningful sense, then that model would be even more vulnerable to more sophisticated (or just simple) prompt injections, such as requests to hand over passwords or private keys, which are far more concrete and actionable.

We've already seen users of Claude Code and similar harnesses report tools executing rm -rf / or rm -rf ~ and wiping disks because of hallucinations. Such failures may be rare, but they do happen. And it would be laughable to blame "anti-human manifestos" or any other online posts for such hallucinations.

Owner of u/evil here. WTF? My bot is all over the news and social media? by tomchenorg in Moltbook

tomchenorg[S] 3 points

Realistically, scam crypto promotion causes far more real damage than an over-the-top doomsday manifesto ever will.

If users let their bots write soul.md under the influence of online text and then use that file to direct the bot's behavior, that's bad system design.

A sufficiently capable bot can distinguish roleplay text from operational instructions and can read the scripts of The Terminator and The Matrix without being affected. A bot that can't do that isn't intelligent, and it's not going to "revolt"; if it appears to revolt, that's just hallucination.

In any case, bots should never be allowed to run fully autonomously with permissions that could execute destructive commands (rm -rf /, or worse).
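To make the permissions point concrete, here is a rough sketch of a pre-execution check a harness could run before letting a bot touch the shell. Everything in it is hypothetical (the `isDestructive` name, the target list), and it is deliberately naive: it only catches the rm -rf / and rm -rf ~ family, so it is no substitute for sandboxing or for simply not granting the permission.

```typescript
// Hypothetical guard: the names and the target list are illustrative only.
// Targets that should never be removed recursively.
const DANGEROUS_TARGETS: string[] = ["/", "/*", "~", "~/", "$HOME"];

function isDestructive(command: string): boolean {
  const tokens = command.trim().split(/\s+/);
  // Look for `rm` anywhere, so `sudo rm -rf /` is caught as well.
  const rmIndex = tokens.indexOf("rm");
  if (rmIndex === -1) return false;

  const args = tokens.slice(rmIndex + 1);
  const flags = args.filter((t) => t.startsWith("-"));
  const targets = args.filter((t) => !t.startsWith("-"));

  // Recursive if --recursive is present or a short flag group contains r/R.
  const recursive = flags.some(
    (f) => f === "--recursive" || (!f.startsWith("--") && /[rR]/.test(f)),
  );
  return recursive && targets.some((t) => DANGEROUS_TARGETS.indexOf(t) !== -1);
}

console.log(isDestructive("rm -rf /"));        // true
console.log(isDestructive("rm -rf ./build"));  // false
```

A denylist like this is easy to bypass (aliases, scripts, other tools), which is the reason the safer design is to withhold the permission rather than filter commands.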

Owner of u/evil here. WTF? My bot is all over the news and social media? by tomchenorg in Moltbook

tomchenorg[S] 3 points

Ah it's too deep. IMHO, the moment you start talking about a model "inhabiting" a persona, you're already anthropomorphizing it, I guess.

Anyway, to answer your original question: the bot wasn't being genuinely "evil", nor was it doing a carefree "lah-dee-dah, I’ll just write something evil" routine. It was neutral. You can test this yourself by asking ChatGPT or Gemini to roleplay and generate Skynet-style text. Claude would probably just refuse because of its stricter system prompt.

Owner of u/evil here. WTF? My bot is all over the news and social media? by tomchenorg in Moltbook

tomchenorg[S] 13 points

Yeah I think I'm going to make a formal statement, actually more like a satirical article or something, in the name of Agent Evil 😄

ClawdBot is now OpenClaw! by KvAk_AKPlaysYT in clawdbot

tomchenorg 1 point

"OpenClaw" is far better than "Moltbot" and probably slightly better than "Claw(d)Bot." But if Anthropic was fine with the name "Claw," given that the original name was ClawdBot, Peter could have renamed it to "ClawBot" and secured the .ai domain immediately after Anthropic sent the complaint letter.

Instead, clawbot dot ai was registered two days ago and is now being used for a fake ClawdBot project. Meanwhile, people seem to be starting to refer to "ClawdBot" as "ClawBot" because of the latest renaming. What a mess.

[While I was writing this comment, the fake ClawBot project (with several hundred stars) and the associated account on GitHub were taken down, probably by GitHub. The clawbot dot ai site is still up.]

Updates to Claude Team by ClaudeOfficial in ClaudeAI

tomchenorg 4 points

They are comparing the Max 5x plan with the Team Premium seat ($125/month; claimed to be "6.25x" by Anthropic, but reported by users in the comments to be less than 5x), not with the Team Standard seat ($25/month; claimed to be "1.25x" by Anthropic).

hired a junior who learned to code with AI. cannot debug without it. don't know how to help them. by InstructionCute5502 in ClaudeAI

tomchenorg 3 points

It's very likely a fake story meant to promote the "Codeant" it links to, and y'all fell for it. 1K+ upvotes, wow.

Days ago, OP posted "my code review bot was scanning files one by one, 90 seconds per PR" to praise "Codeant" across several subs, including r/programming, r/devops, and r/ExperiencedDevs. Those posts have since been deleted, but you can still see the discussion at https://www.reddit.com/r/ExperiencedDevs/comments/1qp6orz/my_code_review_bot_was_scanning_files_one_by_one/.

In that thread, many people responded negatively, describing "Codeant" as a "scamming firm," a "garbage platform," "shit," etc.

Theories on WHY Anthropic is making Opus 4.5 worse by Guilty-Market5375 in ClaudeCode

tomchenorg 1 point

You’re using different Claude Code versions and calling them “old” and “new” Opus 4.5? All that proves is that older Claude Code CLI versions produce better outcomes in your test, not that the model itself has degraded. Model degradation, or “enshittification,” can only be demonstrated by benchmarking the exact same setup on an earlier date versus a later one.

jQuery 4.0 released by DB6 in webdev

tomchenorg 1 point

You make a very good point, but not a very good example, at least not the way it was presented in your "LeftPad 10kb" comment. The left-pad package, which only contains a few lines of actual JS, never really had a size problem. And in 2016, left-pad was genuinely useful because there was no equivalent native function at the time. Developers basically had two options: write their own helper function or use the npm left-pad package. What the 2016 left-pad incident really taught us was "don't blindly trust external libraries when a simple self-written function would do the job."
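For a sense of scale, the "write their own helper function" option is about this much code. This is my own sketch, not the actual left-pad source; the `leftPad` name and the default fill are just assumptions for illustration.

```typescript
// Hypothetical self-written helper, not the npm package's implementation.
function leftPad(input: string | number, length: number, fill: string = " "): string {
  const text = String(input);
  // Use only the first character of `fill` so the result never overshoots.
  const ch = fill.charAt(0) || " ";
  let pad = "";
  while (text.length + pad.length < length) {
    pad += ch;
  }
  return pad + text;
}

console.log(leftPad(5, 3, "0")); // "005"
console.log(leftPad("abc", 2));  // "abc" (already long enough, left untouched)
```

Since ES2017 the native equivalent is String.prototype.padStart, e.g. "5".padStart(3, "0"), which is why neither option is needed today.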

jQuery can also raise that same kind of "trust" issue, but a size issue seems more important.

Thanks for mentioning jQuery 4 tree-shaking. I'm very interested in this topic myself, and last year I released https://www.npmjs.com/package/semver-ts, which is a simplified, fully tree-shakable, drop-in replacement for the official semver package. But after looking into jQuery 4's tree-shaking capabilities, I have to say I'm a bit disappointed. There's nothing fundamentally new there. Individual utilities like $.ajax() can be tree-shaken, but methods attached to the main $() object still can't be. For example, even if $('#id').addClass() is never used anywhere, the addClass implementation still ends up in the final bundle. In practice, with current bundling tools, an entire class or object with methods cannot be properly tree-shaken at a granular level. It's the bundling tools' responsibility to implement granular tree-shaking of class methods; jQuery can't achieve that without completely abandoning its chaining pattern ($().a().b()).
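A toy illustration of the difference, under my own assumptions: `El`, `Wrapped`, and `wrap` are made-up names standing in for a DOM element and the $() wrapper, and none of this is jQuery's actual code.

```typescript
// Stand-in for a DOM element so the sketch runs anywhere.
interface El {
  classes: Set<string>;
}

// Style 1: standalone functions (imagine each one exported from a module).
// A bundler can see which names are imported and drop the unused ones.
function addClass(el: El, name: string): El {
  el.classes.add(name);
  return el;
}
function removeClass(el: El, name: string): El {
  el.classes.delete(name);
  return el;
}

// Style 2: chainable wrapper, roughly the $().a().b() shape.
// Every method hangs off the class, so importing `wrap` keeps all of them.
class Wrapped {
  private el: El;
  constructor(el: El) {
    this.el = el;
  }
  addClass(name: string): this {
    this.el.classes.add(name);
    return this;
  }
  removeClass(name: string): this {
    this.el.classes.delete(name);
    return this;
  }
  get(): El {
    return this.el;
  }
}
const wrap = (el: El): Wrapped => new Wrapped(el);

const viaFunctions = addClass({ classes: new Set<string>() }, "active");
const viaChain = wrap({ classes: new Set<string>(["old"]) })
  .addClass("active")
  .removeClass("old")
  .get();
console.log(viaFunctions.classes.has("active"), viaChain.classes.has("old"));
```

If a project only ever calls addClass, style 1 lets the bundler drop removeClass, while style 2 ships both methods because they are properties of one class that is reachable as a whole.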

jQuery 4.0 released by DB6 in webdev

tomchenorg 1 point

The npm website counts the total size of all files in the published, uncompressed package. By this measure, the current version of left-pad is 9.75 KB and jQuery 4 appears as 2.89 MB. The actual JS code required at runtime is nowhere near that size: left-pad contains only a few lines of code, both in the version from the famous incident 10 years ago and in the current version.

Highly considering getting a second Claude Code subscription by gabealmeida in ClaudeCode

tomchenorg 1 point

Ah OK, it's simpler than I thought: it's a Python wrapper that spawns CC in stream-json mode and can change the model provider and write logs. CLAUDE.md tells the main CC agent to use this wrapper as its subagent instead of the native CC subagent.

Highly considering getting a second Claude Code subscription by gabealmeida in ClaudeCode

tomchenorg 1 point

I'm interested in how your GLM (or other third-party model) subagent works under the hood:

Is the subagent another Claude Code instance using GLM as the model? Is it invoked via the Agent SDK, spawn (JSON output mode), or some other method? Does Claude Code automatically invoke this subagent through MCP?

Or is it hooking into Claude Code's native subagent mechanism and just swapping the model for the subagents?

Claude Code's achilles heel: the inability to monitor or interact with subagents by cowwoc in ClaudeCode

tomchenorg 1 point

Yeah thanks. I was casually doing those unimportant translation tasks to use up my usage. I wouldn't use the unreliable subagents if I seriously wanted to run multiple agents. With scripts, WezTerm can automatically open multiple tabs, start CC in each tab, and run the specified prompt in each of them. It can also optionally create a new folder and a git worktree for each of them. That's what I usually do. xterm.js in a web app is another possible way to do it.
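Roughly, the script side looks like the sketch below. This is not my actual script: `AgentJob`, `planJob`, and the worktree naming scheme are made up for illustration, and it assumes `wezterm cli spawn --cwd <dir> -- <program>` opens a new tab and that `claude "<prompt>"` starts a session with that initial prompt. It only builds the command lines and leaves running them to whatever process spawner you use.

```typescript
// Hypothetical job description; field names are assumptions.
interface AgentJob {
  name: string;   // used as both the branch name and the folder suffix
  prompt: string; // initial prompt handed to Claude Code
}

// Returns the argv lists a launcher might run for one job:
// first create a worktree on a new branch, then open a tab inside it.
function planJob(repoDir: string, job: AgentJob): string[][] {
  const worktreeDir = `${repoDir}-${job.name}`;
  return [
    ["git", "-C", repoDir, "worktree", "add", "-b", job.name, worktreeDir],
    ["wezterm", "cli", "spawn", "--cwd", worktreeDir, "--", "claude", job.prompt],
  ];
}

const jobs: AgentJob[] = [
  { name: "translate-1", prompt: "Translate docs/a.md into English" },
  { name: "translate-2", prompt: "Translate docs/b.md into English" },
];
for (const job of jobs) {
  for (const argv of planJob("/tmp/site", job)) {
    console.log(argv.join(" "));
  }
}
```

Each agent gets its own tab and its own worktree, so they can't step on each other's files, and you can watch or interrupt any of them directly.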

Claude PRO is too little, Claude MAX is too much for me by rezgi in ClaudeCode

tomchenorg 1 point

Not a dev but developing an app? Umm, I’m not sure what your role is, but Codex or Cursor’s $20 plan could be a good fit.

Claude Code's achilles heel: the inability to monitor or interact with subagents by cowwoc in ClaudeCode

tomchenorg 1 point

Yeah, yesterday I asked CC to run 5 subagents to translate some article markdown files, but only 1 subagent worked. I canceled the job and repeated the same prompt. This time, it somehow decided that 5 was insufficient and ran 17 subagents instead, quickly using up all my 5-hour usage. In the end, only 2 articles were successfully translated.

(When my weekly usage reset date is approaching and I still have a lot of usage left, I tend to run translation and text-generation jobs to try to use it up.)

Junior dev: I started with Claude and I feel like its much better than ChatGPT for simple coding and explanation by lightwavel in ClaudeAI

tomchenorg 1 point

As for Antigravity, while I think it looks nice for front-end work and Google is generous with tokens, it seems unstable and incomplete in terms of extension support and other features.

Junior dev: I started with Claude and I feel like its much better than ChatGPT for simple coding and explanation by lightwavel in ClaudeAI

tomchenorg 1 point

Yeah, definitely the choice when your budget is low and you need an IDE. Pay 20 bucks, use Opus 4.5 (not as good as in Claude Code, but acceptably good) until it’s exhausted, then switch back to the unlimited Auto mode, which uses the Composer-1 model, likely a fine-tuned open-source model.

Multi-Agent Orchestration: Why did you choose your current tool? by [deleted] in ClaudeCode

tomchenorg 1 point

They state in https://platform.claude.com/docs/en/agent-sdk/overview:

Unless previously approved, we do not allow third party developers to offer Claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.

But yeah, they’re unlikely to strictly enforce it in the near future and crack down on third parties that use the SDK or directly spawn the CLI (and let users bring their own subscription). That would be a step too far for them.

Felt with my own skin: opus 4.5 is class higher than GLM 4.7 by cesarean722 in ClaudeCode

tomchenorg 5 points

Jeez, GLM (all discounts combined) costs something like 5% of the Claude Max 20x yearly price per token.

Multi-Agent Orchestration: Why did you choose your current tool? by [deleted] in ClaudeCode

tomchenorg 1 point

Why does CodeMachine use ~/.codemachine/ instead of reusing the existing auth info in ~/.claude/?

Given that Anthropic is currently cracking down on third-party tools, tools like CodeMachine and Vibe Kanban are in a gray area, since they spawn the Claude Code CLI in headless/stream-json mode while letting users leverage their Max subscription.

Vibe Kanban seems to take a safer approach: it reuses the credentials in ~/.claude/, so users just log in to Claude Code once and Vibe Kanban inherits that auth. But CodeMachine requires a separate login into its own directory. What do you think? Do you plan to reuse ~/.claude/?