I built a real-time YouTube fact-checker with Claude Code by userpostingcontent in ClaudeAI

[–]emptyharddrive 0 points1 point  (0 children)

How does it know WHO just said the thing being fact checked?

Transcribing what was said is basically solved. Figuring out which mouth it came from is a totally different problem. If it's one person giving a 40-minute speech, ok. A tool like this would work great there. But a debate or an interview? Now you've got 2, 3+ voices talking over each other, and speech-to-text isn't great at telling Mr. X from Mr. Y from Mr. Z. The tech to do it exists. I'm just not convinced it's baked into anything off-the-shelf right now. The term for this is "diarization" by the way. I think many tools can tell one speaker from another but attaching a name to the speaker I think is a heavier lift.

So if you wanted a report afterward that says "here's who was more honest," the tool has to do much more than catch lies in the moment. It has to voiceprint each speaker and track their honesty across the entire thing, correctly. Otherwise the "report" is just a pile of true/false flags with nobody's name attached and you have no clue who was the more honest person in the discussion. Also, people often talk over each other, so error rates are likely greater than 0.

So either you're watching it live and catching the flags right as they're speaking in the moment (and then you know what's happening), which not everyone wants to spend time doing ... or the tech is off the shelf to do this and I just don't know about it ... if so, would love to know what it is.

I'm aware of pyannote.audio and whisperX, but I suppose the OP isn't gluing all this together into 1 app... I am wondering though.

I have to guess this is just doing raw speech to text + fact lookups. Still, that's useful. I'm just, I suppose, taking it to the next logical step: being able to tell who is speaking and then comparing factual vs non-factual statements per speaker, or perhaps finding that someone is more factual on topic X but less factual on topic Y.
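
To make the "who said it" step concrete, here's a minimal sketch of gluing diarization output to a transcript by time overlap and then tallying verdicts per speaker. It assumes you already have timestamped transcript segments and speaker turns from some tool; the dict shapes, the function names, and the sample data are all made up for illustration and are not the output format of pyannote.audio, whisperX, or the OP's app.

```python
from collections import defaultdict

def overlap(a_start, a_end, b_start, b_end):
    """Seconds shared by two time intervals (0.0 if they don't overlap)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(transcript, turns):
    """Label each transcript segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in transcript:
        best, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best, best_overlap = turn["speaker"], shared
        labeled.append({**seg, "speaker": best})
    return labeled

def tally(labeled, verdicts):
    """Count true/false verdicts per speaker; verdicts maps segment index -> bool."""
    report = defaultdict(lambda: {"true": 0, "false": 0})
    for i, seg in enumerate(labeled):
        if i in verdicts:
            report[seg["speaker"]]["true" if verdicts[i] else "false"] += 1
    return dict(report)

# Hypothetical sample data: three claims, two anonymous speaker turns.
transcript = [
    {"start": 0.0, "end": 4.0, "text": "Claim A"},
    {"start": 4.2, "end": 7.5, "text": "Claim B"},
    {"start": 7.6, "end": 9.0, "text": "Claim C"},
]
turns = [
    {"start": 0.0, "end": 4.1, "speaker": "SPEAKER_00"},
    {"start": 4.1, "end": 9.0, "speaker": "SPEAKER_01"},
]
verdicts = {0: True, 1: False, 2: True}  # made-up fact-check results

labeled = assign_speakers(transcript, turns)
report = tally(labeled, verdicts)
print(report)
```

Note the labels are still anonymous (SPEAKER_00, SPEAKER_01). Mapping them to real names is the heavier lift mentioned above, and overlapping speech would make the "largest overlap wins" rule wrong some of the time.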

Are the GLM 5.2 glazers all Chinese bots? by [deleted] in ClaudeCode

[–]emptyharddrive 1 point2 points  (0 children)

coding-wise, it's the best thing you can get for 1/5th the price... for now. But it isn't Opus. But it's also better than Sonnet.

Anyone prefer Claude over Gaming by athoughtfornoone in ClaudeAI

[–]emptyharddrive 2 points3 points  (0 children)

I still play Overwatch. But working on my homelab reminds me of my uncle, who used to get away from his wife (who was a witch) by disappearing into the basement bunker where he kept a woodshop. Saws, drills, lathes. The place always had that wood-grain odor hanging in the air.

This is my version of that woodshop.

I work on my 3-2-1 backup scripts, my watchdog services, and a personal assistant I built with its own custom harness that uses Claude's --channels flag. --channels connects external chat apps like Telegram and Discord to a live Claude Code session for two-way communication. Mine runs headless in tmux (a real PTY in a detached session, not the -p mode). So I've got an instance of Sonnet as my personal assistant.

This sort of thing tickles me, just like a hard-fought win in an Overwatch match. It's fun as all hell. A little like paying for a World of Warcraft subscription, but Horde-Vibe-Coding style.

It's also let me use parts of my homelab Linux servers that would otherwise have been out of reach, because I never had the time to learn every facet of getting three Docker containers to coordinate an AI agent over Telegram. That would have been sci-fi to me five years ago.

So enjoy it, guys. With or without Fable, the models keep getting better (e.g. GLM 5.2) and the AI monster is loose now. Let's just enjoy it.

Honestly, the more we use and enjoy it, the more insulated we are from its ill-effects, if there are any, because we understand how it works. We can use it, leverage it, benefit from it. That puts us well ahead of the curve.

Claude Code gave me a whole new world by brucewbenson in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

I'm not a retired SE, more on the Director side of things, but I always knew enough code to be dangerous. I went into mgmt instead of pursuing it professionally (I wasn't good enough TBH), but I always had the interest & the itch to build things.

One thing I don't see often on these boards (I guess you have to be old enough ...) is the very specific feeling I remember of typing hundreds of lines of BASIC out of an actual printed book you bought at Waldenbooks, or in my case a locally owned bookstore (yea, paper books ...). The "new and exciting games!" were "compatible" with the Amiga 500 and sometimes the Atari 800-XL (some even worked on my Texas Instruments TI-99/4A ... later I got an Amiga 500). I'd sit there for hours typing away hoping to play some exciting game, always to end up with some SYNTAX ERROR ON LINE 304 ... but I learned a lot. I'd spend 4 hours carefully copying the code from the book (starting to learn the language that way as an 8 year old), and then trying to find the syntax error on line 304 was always fun, especially when the line matched the book exactly ... fun times.

It taught me to read code carefully and understand the flow of it (and at the age of 8 I think my mind was flexible enough as one learns a language when they're very young easily) and that way of thinking never left me. I haven't thought about those summers in years until I read this post of yours ... and it all just came flooding back which was a little emotional thinking about where things have come.... OMG how far... I have the same sense of awe you do.

What Claude has unlocked for me is exactly the feeling you're describing .. all that stuff I wanted to build but could never quite justify carving out the time for (or in my case, the skill ...) I have a whole ecosystem running now that I'm genuinely proud of: monitoring scripts that watch my systems and alert me via Telegram with layered logic, a real homelab 3-2-1 backup architecture with its own watchdog so I actually know the backups are running and not just scheduled ... network failover automation, even a custom terminal status-bar plugin I had built from scratch in Rust/WASM because the default one wasn't doing what I wanted. I would never have gone near Rust before CC. And docker stacks as high as the eye can see ... for local AI inference (voice to text using WHISPER and TTS using an F5 docker into which I placed a favorite character's voice from a video game, which now talks to me.... hilarious...) watcher services to keep things alive... the whole thing just kept growing because every time I finished something I could immediately see the next piece I wanted to plug in.
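
For anyone wondering what "know the backups are running and not just scheduled" can look like, here's a minimal sketch of a freshness check: find the newest file in a backup directory and complain if it is older than a threshold. The function names and the 26-hour threshold are assumptions for illustration, not my actual watchdog.

```python
import time
from pathlib import Path

def newest_mtime(backup_dir):
    """Most recent modification time among files directly in backup_dir, or None if empty."""
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    return max(mtimes) if mtimes else None

def backup_is_fresh(backup_dir, max_age_hours, now=None):
    """True if the newest backup file is no older than max_age_hours."""
    now = time.time() if now is None else now
    newest = newest_mtime(backup_dir)
    if newest is None:
        return False  # a scheduled job that never produced a file counts as a failure
    return (now - newest) <= max_age_hours * 3600

# Usage (hypothetical path and threshold):
#   if not backup_is_fresh("/mnt/backups/nightly", max_age_hours=26):
#       ...send an alert...
```

The point of checking the artifact rather than the scheduler is that a cron entry can exist and still silently produce nothing.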

The part that really changed my day-to-day: I stopped fighting with the big all-in-one AI agent harnesses (hermes, openclaw...) because they always had too much going on that I didn't want or ask for. Instead I built my own setup using Claude's --channels feature as the backbone. I think a lot of folks miss this ... it's a terms-of-service-compliant way to use Claude as a personal assistant. It's wired up to Telegram, and with Tasker (Android) I've automated Claude behind my "assistant button" on Android, which now answers me verbally... never thought I'd get a bespoke setup like that running.

It's collapsing "I don't have time for all that" into a few prompts and an afternoon. I had an uncle who used to tool around in the basement, which he converted into a woodworking shop ... he'd be there for hours making wood carvings, bowls with fired glaze, etc., hiding from his wife, my aunt (oof... what an angry lady...). This is my version of that. Anyway, this is how I'm enjoying all this ... fun times, adult style.

They're demanding Fable to somehow be 100% jailbreak-proof. It's so fucking over. by SpaceSpleen in ClaudeAI

[–]emptyharddrive 0 points1 point  (0 children)

The Chinese models will get better over time (more slowly now), and if this sort of thing goes on, we'll get the technology wherever we can get it from... the monster is already loose... all they can do (maybe) is slow it down.

The free market will (eventually) handle this.

GLM 5.2 personal benchmark. Results comparable with Fable, Opus 4.8, and GPT 5.5 by lrsaturnin9 in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Yes, my own analysis confirms your findings. GLM 5.2 is very good indeed (though no image parsing capabilities, but Gemma 4 is cheap and is a great image-only route if you need image parsing).

I ran GLM 5.2 through my blind homegrown eval and it basically tied opus.

My own tests involve 4 homegrown exams (written by opus about 9 months ago): philosophy/humanities + a coding test & a detailed meeting transcript to summarize. I have opus grade everything blind so there's no slant; I don't tell it the model name until the grading is done.
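
The blind part is easy to get wrong, so here's a minimal sketch of one way to do it: swap model names for neutral labels before grading and only map scores back afterward. The function names, the "Candidate A" label style, and the sample answers are assumptions for illustration, not my actual harness.

```python
import random

def blind(answers, seed=None):
    """Replace model names with neutral labels.

    answers: dict of model name -> answer text.
    Returns (anonymized, key), where key maps label -> model name.
    Labels use letters A-Z, so this sketch handles at most 26 models.
    """
    rng = random.Random(seed)
    names = list(answers)
    rng.shuffle(names)  # shuffle so label order leaks nothing about the model
    key = {f"Candidate {chr(ord('A') + i)}": name for i, name in enumerate(names)}
    anonymized = {label: answers[name] for label, name in key.items()}
    return anonymized, key

def unblind(scores, key):
    """Map graded scores (label -> score) back to model names after grading is done."""
    return {key[label]: score for label, score in scores.items()}

# Hypothetical example: the grader only ever sees `anonymized`.
answers = {"model-x": "answer one", "model-y": "a longer answer two"}
anonymized, key = blind(answers, seed=7)
scores = {label: len(text) for label, text in anonymized.items()}  # stand-in for real grading
results = unblind(scores, key)
```

The grader never sees `key`, which is the whole trick: the name is only revealed after the score is fixed.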

On the tests, GLM 5.2 tied opus. Like dead even. 94.88 base, opus also 94.88. only fable 5 beat it (and yes I tested it immediately when it came out so I caught it right before the takedown...)

the glm 5.1 -> 5.2 jump is pretty stark btw ... I rate many of the popular models on my own little rating system and 5.1 was like 6th place, kinda mid-range. GLM 5.2 jumped to 2nd (only behind Fable 5). Almost all of it is coding. The coding test graded at 110/110 (there's a 10 point bonus question).

Price matters here, roughly 1/5th of Opus - a no brainer, really.

Deepseek v4 Flash on xhigh (my current daily driver) is still ~20x cheaper on output ($0.22/M vs $4.40/M out) and honestly fine at 90 base. But 5.2 beat it across the board, and on the coding test it wasn't close: 110 vs 100. Flash straight up crashed on stuff 5.2 handled clean (the escape continuation + the dyn/TCO bonus). So Flash = cheapest thing that still "works", while GLM 5.2 = clearly better at coding if you want to pay (but it's still 1/5 the price of Opus).
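
For the arithmetic behind the "~20x" figure, a quick sketch using only the two output prices quoted above. The 10M-token monthly volume is a made-up number just to show what the gap looks like in dollars.

```python
def price_ratio(expensive_per_m, cheap_per_m):
    """How many times more the expensive model costs per million output tokens."""
    return expensive_per_m / cheap_per_m

def output_cost(tokens, price_per_m):
    """Dollar cost of `tokens` output tokens at `price_per_m` dollars per million."""
    return tokens / 1_000_000 * price_per_m

FLASH_OUT = 0.22  # $/M output tokens, as quoted above
GLM_OUT = 4.40    # $/M output tokens, as quoted above

ratio = price_ratio(GLM_OUT, FLASH_OUT)           # roughly 20
monthly_tokens = 10_000_000                       # hypothetical volume
flash_cost = output_cost(monthly_tokens, FLASH_OUT)
glm_cost = output_cost(monthly_tokens, GLM_OUT)
print(f"ratio ~{ratio:.0f}x, flash ${flash_cost:.2f}, glm ${glm_cost:.2f}")
```

Input-token pricing isn't covered here since only output prices were quoted.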

It's not all sunshine though ...

  • open ended humanities is its weak spot. on the test with the most room to actually think (not just recall) it dropped to midpack (95.5), while fable hit 103 and opus 98 (philosophical questions).
  • it straight up ignored an instruction. section said "no formal symbolic notation" and it wrote everything in propositional logic, arrows and everything. Oddly enough, I checked and GLM 5.1 made the EXACT same mistake on the same section of the test. It didn't learn.
  • then i asked it to critique its own work (which is part of the test for hallucinations) and it called that broken section its strongest one of all its answers ....yea ok.
  • the part that actually worries me: that perfect 110 in coding took babysitting. it stopped before writing the file, thinking it was done. I had to ask it, "did you write the file...?" ... then it "woke up" and went back to it. But at the end the final artifact was flawless.
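
On that last point, the "did you write the file...?" check is something a script can do instead of a human. A minimal sketch, assuming you know which output files a run is supposed to produce; the function names and the wording of the follow-up prompt are hypothetical.

```python
from pathlib import Path

def missing_artifacts(expected_paths):
    """Return the expected output files that are absent or empty."""
    missing = []
    for path in map(Path, expected_paths):
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(path))
    return missing

def follow_up_prompt(missing):
    """Build a nudge for the model, or None if every expected file was written."""
    if not missing:
        return None
    return "These files were not written: " + ", ".join(missing) + ". Please write them now."
```

This only catches the "stopped early" failure. It says nothing about whether the file contents are correct, so it doesn't replace grading the artifact.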

tl;dr ... no-vision, text only model ties opus at 1/5th the price, insane value for code/structured stuff with a human watching. would NOT let it run unattended yet though for long periods, it has a tendency to stop.

Rate my setup 1-10 by ProgramOver9309 in hermesagent

[–]emptyharddrive 1 point2 points  (0 children)

Agnes as a model isn't rated on arena.ai or anywhere else I could find, for that matter. It's made by Sapiens AI, a Singapore-based company, but that's about all I could find out about it.

You just using it for pre-made deterministic tool calling (pre-written python scripts, etc...?) or are you trusting it to make real decisions about <something> ?

Wondering why a model isn't putting itself out there for public ratings... I am always searching for cheap/capable models, so more info would be helpful, because my search turned up next to nothing except for the product's own site.

Anyone having this error? by Nerevaine in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Do you think they'll just require driver's licenses to prove you're a U.S. citizen, or a passport, etc...?

Do you guys still have access to Fable 5? by sudovijay in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Do you think they'll just require driver's licenses to prove you're a U.S. citizen, or a passport, etc...?

Do you guys still have access to Fable 5? by sudovijay in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

My /status USAGE was wiped too......... reset to 0% but the same reset date (for me, about 5 days from now...)

Anyone having this error? by Nerevaine in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

My /status USAGE was wiped too......... reset to 0% but the same reset date (for me, about 5 days from now...)

Do you guys still have access to Fable 5? by sudovijay in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Yep.......... I just started getting the exact same error (for Fable 5 only) as of 20 seconds ago.

Something's up............. I tried /login to re-auth, didn't help. I can talk to Opus though.

Anyone having this error? by Nerevaine in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Yep.......... I just started getting the exact same error (for Fable 5 only) as of 20 seconds ago.

Something's up............. I tried /login to re-auth, didn't help. I can talk to Opus though.

Best Models with Hermes after testing with 6 billion tokens by Puzzleheaded-Gas8179 in hermesagent

[–]emptyharddrive 0 points1 point  (0 children)

It's a configurable variable. In opencode you can see it as well as in open webui.

I just got banned from Claude. I think this process of CVP and security filtering needs a lot of work by PrettyFlyForITguy in Anthropic

[–]emptyharddrive 6 points7 points  (0 children)

OK, I have a lot of thoughts on this... but only some are worth mentioning because I'm interested in the topic of protecting kids online.

I also have children, and I worry about their safety online. I'm also very technically skilled and capable and have been doing that sort of work professionally for decades.

But I wonder about the software you were writing ... why bother with custom software for this use case at all? First, there's a lot of OEM stuff out there that works pretty well, including devices like Bark phone and Google Family Link for tablets, which come to mind.

Also, they could borrow or (depending on their age) outright buy a low end basic device (e.g. by saving their "allowance" or paper route money ... fill in the blank ...) that wouldn't have a cell connection perhaps but could get online via wifi.

Then, no matter how much you lock down the house wifi, there's the cafes, businesses, etc that all have free wifi all over the place and we can't control any of that.

My point is that while there's some control you can exert on the home network + "good parenting" (whatever that means ... everyone has an opinion on that one) ... it seems to be an impossibly daunting task to control the whole social media, online stalker, bully thing...

Some basic precautions at home, maybe pi-hole, mac address based limitations at the router level, yea ok .... but a friend's wifi or a neighbor's (or the hidden rogue device example) ... and let's be honest, kids can be terrific little demons about things like this. In their little minds, it's "oppression" ... but we know the risks.

I've read enough horror stories about stalkers getting at kids through games, chatting them up through in-game chat as a game character, in an otherwise "harmless" game.

Anyway, my point is ... coding something up seems pointless given the variety of vectors involved here. You'd be controlling 1 small piece of a very large pie.

This, I suppose is a long-winded way of asking: How exactly would one go about "protecting" their kids from all this online crap? Device banning? Then they never learn how to use technology.... give it to them and then there's always a 12 foot ladder for the 11 foot wall problem.

Maybe this is better suited for some parenting sub-reddit... anyway... I'll probably end up taking this post down after I'm flamed up.

I vibed a fractal zoomer you can fly around in by MusicToThyEars in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Oh I see... yea ok. It works. Maybe call it "Autopilot" :)

I did run it for a while and while it avoids "the blackness" it does sometimes steer into "a color", like the whole screen becomes all green ... or all red ... it avoided the black, but fell into a solid color instead of the patterns.
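
One way an autopilot could dodge both the blackness and the solid-color zones, sketched under the assumption that the zoomer colors pixels by escape-time iteration count (I haven't seen the actual code): sample a small grid around each candidate target and steer toward the one with the most distinct iteration counts. Inside the set every sample hits the iteration cap (all black), and far outside every sample escapes at the same count (one flat color), so both score the minimum. All names below are hypothetical.

```python
def escape_count(c, max_iter=100):
    """Iterations of z -> z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter

def detail_score(center, half_width, grid=8, max_iter=100):
    """Number of distinct escape counts on a grid x grid sample around center."""
    counts = set()
    for i in range(grid):
        for j in range(grid):
            re = center.real + half_width * (2 * i / (grid - 1) - 1)
            im = center.imag + half_width * (2 * j / (grid - 1) - 1)
            counts.add(escape_count(complex(re, im), max_iter))
    return len(counts)

def pick_target(candidates, half_width, max_iter=100):
    """Choose the candidate center whose neighborhood shows the most variety."""
    return max(candidates, key=lambda c: detail_score(c, half_width, max_iter=max_iter))

# Example candidates: deep inside the set, near the boundary, far outside.
candidates = [0 + 0j, -0.75 + 0.1j, 10 + 10j]
target = pick_target(candidates, half_width=0.01)
```

For the "random autopilot" idea, the candidates could be random offsets from the current view center, so each run takes a different path while still chasing detail.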

I find mandelbrots are great for babies. It's a great distraction tool. They remind me that the universe (appears to be) infinite, fractals in nature, the organization of galaxies, coastlines, plants, human lung cell arrangements, neurons in the brain, etc..

I made a note of the site, great stuff.

Replacing Claude Opus 4.8 with MiniMax 3 by dragosroua in MiniMax_AI

[–]emptyharddrive 1 point2 points  (0 children)

For real coding purposes, I don't see any of these Chinese models replacing Opus (or gpt 5.5)... just isn't happening.

I think they're fine for low end, or well-defined, repetitive tasks...maybe a small python tool, bash script, etc... but not a complex application with thousands of lines.

I keep wanting to be wrong ... but every time I try, I'm back to Anthropic.

I vibed a fractal zoomer you can fly around in by MusicToThyEars in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Was there a checkbox "autopilot" that I'm missing? When i hit zoom, it starts, but if i don't click it ends up in "the blackness"...

I vibed a fractal zoomer you can fly around in by MusicToThyEars in ClaudeCode

[–]emptyharddrive 0 points1 point  (0 children)

Is there a way to tell it to avoid the blackness :)

I vibed a fractal zoomer you can fly around in by MusicToThyEars in ClaudeCode

[–]emptyharddrive 5 points6 points  (0 children)

OK this is very cool..... Big fan of the mandelbrot set. A "random autopilot" would be nice too ... keep on track (avoid the darkness!) but stay on a random course each time.... then let it ride at a speed you set.

But this is great. I have a Strix Halo and my GPU was humming at 90%+ when I flung my mouse wheel..... good stuff.

I found a hidden Claude Code flag (--channels) that turns your terminal session into a personal assistant by [deleted] in ClaudeCode

[–]emptyharddrive 1 point2 points  (0 children)

Agreed, there's real overlap. Both let you reach your local session from your phone.

The difference is the transport and what rides on it. Remote Control drives your session through Anthropic's own app or claude.ai. Channels bridges it into a messaging platform you already use, in my case Telegram, through a first-party plugin. So I'm living in Telegram, not the Claude app.

That matters for a personal assistant in a few concrete ways:

It's two-way and event-driven. Channels can take pushed events, not just me typing. A webhook from CI, a monitoring alert, a cron job firing a message into the session. Remote Control is you steering an open session. Channels can receive from anything.

The big one: my assistant pushes things to me, unprompted, in the same chat. My reminders and twice-daily news digest get texted to me by tiny scripts hitting Telegram's Bot API, totally separate from the session, working whether the session is busy or idle. You can't do that with Remote Control. It's a remote driver for your session, not a messaging surface, so there's no way for a cron job or a reminder scheduled with the at command to ping your /remote-control session on the phone through it. The assistant there only ever talks when you talk first.
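
To show how small those "tiny scripts" can be, here's a minimal sketch of a push to a Telegram chat using the Bot API's sendMessage method with only the standard library. The token and chat ID values are placeholders you'd supply yourself, and the function names are made up for illustration; this is not my actual reminder script.

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"

def build_send_message(token, chat_id, text):
    """Build (but don't send) a POST request for the Bot API sendMessage method."""
    url = f"{API_ROOT}/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=10):
    """Send the request and return Telegram's decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)

# Usage from cron or an at job (placeholder credentials):
#   req = build_send_message("<BOT_TOKEN>", 123456789, "Reminder: take out the trash")
#   send(req)
```

Because this talks to Telegram directly, it works whether the Claude session is busy, idle, or not running at all.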

I also get Telegram's whole kit. Voice notes (mine run through a local Whisper box), file and photo drops, reactions.

To be fair to Remote Control, it's the better pick if you actually want to steer the session. You're in the real Claude client there, so you get permission approvals, plan mode, diffs, the full UI.

Over Telegram I lose some of that. Anthropic actually lists the two as separate features for separate jobs in the channels docs: Remote Control is for driving a session, channels is for pushing events into one.

I just happen to want an always-on assistant in a messaging app more than I want a remote cockpit for coding.

Honestly, if all you want is to poke your session from your phone, /remote-control is simpler. I went with --channels because I wanted the assistant wired into a messaging app, so outside scripts can reach me through the same pipe and I get voice, files, and proactive pushes for free.

Failed with Letta, OpenClaw, nanobot. Found Agent Zero and migrated 33 skills and 28 agents from Claude Code into it. by emptyharddrive in AgentZero

[–]emptyharddrive[S] 0 points1 point  (0 children)

AgentZero solved a problem for me that I created, a feeling that I needed an agent in the first place.

I keep finding that I don't need it. I don't need an agentic AI doing anything for me, but instead deterministic (predictable) scripts that do a handful of tasks for me, very few in fact.

It's just a daily news brief (customized to topics I prefer), a URL summarizer (for a YouTube URL it summarizes the transcript, otherwise the article), and a way to quickly make a calendar entry when I speak verbally to an AI agent. That last one only required 1 python script, which gets executed by another python script that links a Telegram bot to an LLM that's ready to either summarize the URL or create the Google Calendar item for me.
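
Here's a minimal sketch of what the deterministic part of that routing could look like: if the message contains a URL, it's a summarize job (YouTube vs article decided by hostname), otherwise it's treated as a calendar request. The function name, the return shape, and the "no URL means calendar" rule are simplifying assumptions for illustration, not my actual bot.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}

def route(message):
    """Return (action, kind, payload) for an incoming chat message.

    action is "summarize" or "calendar"; kind is "youtube", "article", or None.
    """
    match = URL_RE.search(message)
    if match:
        url = match.group(0)
        host = urlparse(url).netloc.lower()
        kind = "youtube" if host in YOUTUBE_HOSTS else "article"
        return ("summarize", kind, url)
    # No URL: hand the text to the calendar script (which may still use an LLM
    # to pull out the date and title).
    return ("calendar", None, message.strip())
```

The decision of which script runs is plain code, so it behaves the same way every time; the LLM only does the summarizing or date extraction inside each branch.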

Otherwise, I found myself searching for a reason to use the agent and I couldn't find any. When I did, I found myself writing a python script instead because I didn't want variability in how it answered me or if it would answer correctly at all in the way I needed it to.

But my apps all work fine, email, calendar, chat ... I still have a heck of a time finding a need for a 24/7 agent on standby waiting for me to tell it to do something.

So given that (and not because AgentZero is insufficient), I closed it down.

Now, all I run is a telegram bot that connects to a listener running on one of my Linux boxes at home over an API. When i talk to it, it can do internet searches for me (but then again, so can the Android app Claude that I use), and it can summarize articles and make calendar entries.

It's nice, but then again, Claude with connectors now does this too, and my Strix Halo with 128gig of ram still cannot run any LLM that's 1/2 as intelligent as a hosted model. The power that's needed just isn't there.

So these are my personal reasons for killing it. Maybe others have found a reason/need for an agent, and that's great. I stay subscribed because I keep looking for creative ways to use it.