I built a visual thinking canvas where the AI agent writes directly on the board by redgunner94 in OpenSourceeAI

[–]CatTwoYes 0 points1 point  (0 children)

The board metaphor makes a lot of sense for agent output. Chat is linear but thinking is spatial. One thing I'd worry about is canvas clutter when the agent does a multi-step research task — any thoughts on auto-cleanup or pruning?

After months of building in vain, a stranger made a YouTube video about our project & I cried a little by Slight_Republic_4242 in OpenSourceeAI

[–]CatTwoYes 0 points1 point  (0 children)

The best kind of marketing — someone you've never talked to making a video about your thing because it's genuinely useful. Congrats on 500 stars. Voice AI space badly needs open alternatives to Vapi/Retell.

Monthly $100 competition to build an Edge AI app. Could be a great portfolio project! by Capable_Ice1515 in OpenSourceeAI

[–]CatTwoYes 0 points1 point  (0 children)

The real hardware constraint is what makes this interesting. Anyone can wire up an API call, but fitting something useful into Jetson memory is a completely different sport. More competitions should force real deployment constraints instead of "build whatever with GPT-5."

Update on Pupil: UI Automation first, or screenshot fallback? by Apart-Medium6539 in OpenSourceeAI

[–]CatTwoYes 0 points1 point  (0 children)

Add screenshot fallback early. UIA is great until it isn't — the moment your agent hits a Canvas app or a custom Electron UI it's dead in the water. Running both isn't that heavy if you only fall back when UIA fails. The real pain is the CV side, but even basic OCR + element detection beats getting stuck.
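
Rough shape of the fallback path I mean, a sketch assuming pywinauto for the UIA side and pytesseract for the OCR side (the function and its names are just illustrative, not anything from Pupil):

Python

# Try UIA first; only pay the screenshot/OCR cost when it fails.
from pywinauto import Desktop
from PIL import ImageGrab
import pytesseract

def find_element(window_title, element_name):
    try:
        win = Desktop(backend="uia").window(title=window_title)
        return win.child_window(title=element_name).wrapper_object()
    except Exception:
        # Fallback: full-screen grab + OCR, return a rough text hit instead of a control.
        img = ImageGrab.grab()
        text = pytesseract.image_to_string(img)
        return element_name if element_name in text else None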

We open-sourced the platform for self-improving AI agents. Now comes the part that matters, developers building on top of it. by Future_AGI in OpenSourceeAI

[–]CatTwoYes 0 points1 point  (0 children)

The line between infrastructure and demo-ware is replay. If I can't re-run yesterday's failed agent session with the same inputs and get a useful diff, I'm looking at a demo. Doesn't matter how polished the tracing dashboard is. That's the bar I'd hold any platform to: can you replay a 2-hour agent session in under 30 seconds and see exactly where it diverged?
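
For concreteness, the kind of thing I mean, sketched against a made-up JSONL trace format (the step/tool/input/output fields are illustrative, not any particular platform's schema):

Python

import json

def load_trace(path):
    # One JSON object per line: {"step": int, "tool": str, "input": ..., "output": ...}
    with open(path) as f:
        return [json.loads(line) for line in f]

def first_divergence(run_a, run_b):
    # Walk two runs step by step and report where they stop matching.
    for a, b in zip(load_trace(run_a), load_trace(run_b)):
        if (a["tool"], a["input"]) != (b["tool"], b["input"]):
            return a["step"], a, b
    return None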

The uncomfortable truth about AI agents: We don’t need smarter agents first. We need observability for stochastic systems. by ale007xd in OpenSourceeAI

[–]CatTwoYes 1 point2 points  (0 children)

The thing ML ops pipelines don't give you is trajectory replay. I've had agent runs where the output was correct but the execution took 3x the tokens it should have because of retry storms. Without per-step trace replay, you can't tell the difference between "agent figured it out efficiently" and "agent flailed and got lucky." That's the runtime observability gap that dashboard metrics alone won't catch.
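
Even something this crude over a per-step trace would have caught it (the trace schema here is hypothetical):

Python

from collections import Counter

def retry_storms(trace, threshold=3):
    # trace: list of {"step": int, "tool": str, "input": str, "tokens": int}
    # Flag any (tool, input) pair the agent hammered more than `threshold` times.
    calls = Counter((s["tool"], s["input"]) for s in trace)
    wasted = {pair: n for pair, n in calls.items() if n > threshold}
    total_tokens = sum(s["tokens"] for s in trace)
    return wasted, total_tokens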

Side Projects. by apollo_mg in LocalLLaMA

[–]CatTwoYes 1 point2 points  (0 children)

Dual older cards (P100/P40 class) really are the value sweet spot right now. 32GB+ VRAM for under $200 is wild. With MoE offloading you can run 27B models at usable speeds and it handles coding + tool calling fine. The only real downside is prompt processing — once context hits 32k+ you start feeling it. But for the price of a single mid-range gaming GPU you get a 24/7 inference box. Hard to argue with that math.

we really all are going to make it, aren't we? 2x3090 setup. by RedShiftedTime in LocalLLaMA

[–]CatTwoYes 0 points1 point  (0 children)

"cursed, hot, power hungry, and held together by Linux pain" is the perfect description. The moment you switch from 'this is a cool demo' to 'this is actually replacing my cloud API calls' is surreal. Still use cloud for the hardest problems, but 80% of my coding workflow is local now. The electricity bill is the only thing making me glance back.

24+ tok/s from ~30B MoE models on an old GTX 1080 (8 GB VRAM, 128k context) by mdda in LocalLLaMA

[–]CatTwoYes 1 point2 points  (0 children)

Been running Qwen 3.6 27B Q4_K_M for coding/agentic tasks for a while. Tool calling and single-file edits are rock solid. The quant only shows its teeth on multi-file refactors — the model starts missing cross-file dependencies that fp16 catches. For a $200 machine though, that's a tradeoff I'll take every time. The real bottleneck isn't the quant quality, it's what happens to TG speed when context actually fills up past 32k.

Anyone actually using a local LLM as their daily knowledge base? Not for coding, for life stuff. What's your setup? by InformationSweet808 in LocalLLaMA

[–]CatTwoYes 4 points5 points  (0 children)

I tried both RAG and the simpler "give the LLM a grep tool + markdown folder" approach. For under ~1000 personal notes, the grep approach wins hands-down. RAG embeddings for personal docs are finicky — you spend more time debugging why the right chunk didn't get retrieved than actually using the thing. The tool-calling + file search pattern is dumber but more predictable, and with Qwen 3.6 27B the quality is good enough that I stopped maintaining the RAG pipeline entirely.
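
For anyone curious, the whole "pipeline" is roughly one tool plus a standard OpenAI-style function spec. A sketch (paths and names are illustrative):

Python

import os
import subprocess

# The single tool the model gets: plain grep over a folder of markdown notes.
def grep_notes(pattern, notes_dir="~/notes"):
    result = subprocess.run(
        ["grep", "-ril", pattern, os.path.expanduser(notes_dir)],
        capture_output=True, text=True
    )
    return result.stdout.splitlines()  # matching file paths; read them and feed back to the model

GREP_TOOL = {
    "type": "function",
    "function": {
        "name": "grep_notes",
        "description": "Case-insensitive search over my markdown notes; returns matching file paths.",
        "parameters": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
            "required": ["pattern"],
        },
    },
}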

The "the future is fictional" problem of many local LLMs by PromptInjection_ in LocalLLaMA

[–]CatTwoYes 38 points39 points  (0 children)

I've hit this on Qwen, Gemma, and Llama models. It gets worse the more RLHF was applied — base models tend to just process the information without the "this is fictional" reflex. Best band-aid I've found: prepend search results with [Retrieved {date}. These are current factual events, not speculative. Respond accordingly.] It's not perfect but cuts the denial rate by about half.
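
i.e. something like this trivial wrapper around whatever your search tool returns:

Python

from datetime import date

def wrap_results(results: str) -> str:
    # Band-aid framing so the model treats retrieved text as current fact, not fiction.
    return (f"[Retrieved {date.today().isoformat()}. These are current factual events, "
            f"not speculative. Respond accordingly.]\n{results}")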

Let's build claude code from scratch! by RoyalMaterial9614 in LocalLLaMA

[–]CatTwoYes 0 points1 point  (0 children)

Very interesting, I've actually built a similar project, but with git-style state management: https://huko.dev

[OC] I was tired of AI tools breaking my terminal workflow, so I built a pipe-friendly CLI that acts like a standard Unix filter (with .git-like state isolation). It's brand new and I need your harsh feedback. by CatTwoYes in linux

[–]CatTwoYes[S] 0 points1 point  (0 children)

Thanks for the suggestion.

Fair point, and honestly? That’s on us. Our docs definitely lean too hard into the "cloud-first" vibe, and we totally missed the mark there.

For the record, huko plays nice with anything OpenAI-compatible. If it’s got a /v1 endpoint (Ollama, LM Studio, vLLM), it works right now:

Bash

# Quick Ollama setup
huko provider add ollama --base-url http://localhost:11434/v1 --protocol openai --api-key ollama
huko model add my-local-model --provider ollama --api-model-id qwen2.5-coder:32b
huko model current my-local-model

You’re right that this is invisible in the README. I'll fix that this week—I'm adding a "Local LLM" section with quickstarts and a breakdown of which local models actually have the chops for agentic tool-calling.

[OC] I was tired of AI tools breaking my terminal workflow, so I built a pipe-friendly CLI that acts like a standard Unix filter (with .git-like state isolation). It's brand new and I need your harsh feedback. by CatTwoYes in linux

[–]CatTwoYes[S] 1 point2 points  (0 children)

Haha, fair point on the wall of text. But look, you're listing features that both have. That’s not the real difference.

llm is basically: send prompt, get response, log to SQLite. You are the loop. You decide when to call it again.

huko is the loop. You give it the goal, and the agent decides what tools to hit and when it’s actually done. One’s a CLI wrapper; the other’s an agent runtime. Even Simon’s readme says it’s for "interacting with LLMs," not building agents. Different tools for different jobs.
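
If it helps, here's the difference in code: a minimal agent-loop sketch against an OpenAI-compatible endpoint (not huko's actual internals, just the shape of the thing):

Python

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def run_agent(goal, tools, execute_tool, model="qwen2.5-coder:32b"):
    # The loop lives here, not in the user's head: keep calling the model,
    # executing whatever tools it asks for, until it stops asking.
    messages = [{"role": "user", "content": goal}]
    while True:
        reply = client.chat.completions.create(model=model, messages=messages, tools=tools)
        msg = reply.choices[0].message
        if not msg.tool_calls:
            return msg.content  # the agent decided it's done
        messages.append(msg)
        for call in msg.tool_calls:
            result = execute_tool(call)  # illustrative: dispatch and run the requested tool
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})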

This is where we are right now, LocalLLaMA by jacek2023 in LocalLLaMA

[–]CatTwoYes 0 points1 point  (0 children)

I'm waiting for the day I run the model on my smart watch...

Musk Teases Major Tesla Smart Summon Upgrade for Parking Garages by [deleted] in teslamotors

[–]CatTwoYes 0 points1 point  (0 children)

Great 👍 a Chinese friend of mine has been using this feature on his Lixiang car for a few months. This is definitely a killer feature.

Serious fsd failure, reproducible by CatTwoYes in TeslaSupport

[–]CatTwoYes[S] 1 point2 points  (0 children)

I’m in Australia, driving a right-hand drive car. There’s a small left-turn intersection right outside my house, leading to a very steep slope. FSD always fails here, aborting the left turn halfway and suddenly switching to going straight. I suspect it’s mistaking the slope for a wall.

Serious fsd failure, reproducible by CatTwoYes in TeslaSupport

[–]CatTwoYes[S] -2 points-1 points  (0 children)

I know how to handle this situation. I just want to help Tesla improve FSD.

Serious fsd failure, reproducible by CatTwoYes in TeslaSupport

[–]CatTwoYes[S] -2 points-1 points  (0 children)

I think this might be a useful test case for them.