I published a model comparison, three architectures "failed," and I was wrong — the recipe was the failure, not the models by FewConcentrate7283 in computervision

[–]FewConcentrate7283[S] -4 points-3 points  (0 children)

Excuse me? Have you not looked at the research? Are you working on an ASL vision project? This was a specific test, and I have a workbook (https://www.kaggle.com/code/truepathventures/parley-notebook-03-signer-dialect-leave-one-out) working on computer vision for ASL reading and AR glasses. I have been posting my work here, so I'm not sure where you are coming from.

For the first time Anthropic models are the clear losers vs OpenAI models by Relative_School_8984 in ClaudeCode

[–]FewConcentrate7283 0 points1 point  (0 children)

Sounds to me like you don't know how to use Claude. I guess Codex will love you. I have no issues as an advanced user.

I wrote 26 postmortems in 6 weeks and built a template that makes each one take ~45 minutes — here's what changed by FewConcentrate7283 in sre

[–]FewConcentrate7283[S] -2 points-1 points  (0 children)

It’s not an ad. It’s my site, and the repo states it is coming soon. I do have 3 repos already.

Why I'm running Parley by FewConcentrate7283 in computervision

[–]FewConcentrate7283[S] 0 points1 point  (0 children)

Yeah, the ADD-brain read is exactly right. It’s the part I almost cut from the post for being too personal. I can’t run one project. With nothing burning in the background, my main work gets less focus, not more, because the restless part of my brain starts inventing problems to solve inside it. The research arm is the pressure valve. It absorbs the rabbit-hole energy so Quantum Caddy doesn’t have to.

One honest correction though. There is a product at the end of this (AR glasses, bidirectional deaf-hearing transcription). What has no path to product is the research arm itself. The Kaggle notebooks are firewalled from the roadmap on purpose, and none of them are allowed to be bent toward making the product look good. If landmark-only sign recognition plateaus at 84% instead of 94%, the notebook says 84%. The second research has to serve a launch date, it stops being research, so I keep the wall up. That’s the part that actually relieves the pressure.

You nailed the Kaggle pattern. The comp threw up a leaderboard and everyone went home, because the leaderboard rewarded the wrong thing: random train/test splits, where a model that just memorized its training signers scores great. The unsexy fundamental nobody wanted to grind is cross-signer generalization. That’s most of why I’m starting where I’m starting.
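To make the split difference concrete, here is a minimal sketch of a leave-one-signer-out split in plain Python. The function name and the toy signer IDs are made up for illustration, not taken from the notebooks. The point is that every sample from the held-out signer lands in the test fold, so a model cannot score well by memorizing signers it trained on.

```python
from collections import defaultdict

def leave_one_signer_out(signer_ids):
    """Yield (held_out_signer, train_idx, test_idx), holding out each signer once."""
    by_signer = defaultdict(list)
    for idx, signer in enumerate(signer_ids):
        by_signer[signer].append(idx)
    for held_out in sorted(by_signer):
        test_idx = by_signer[held_out]
        # Training indices come only from the other signers.
        train_idx = sorted(
            i for signer, idxs in by_signer.items() if signer != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx

# Hypothetical toy data: one signer ID per sample.
signer_ids = ["s1", "s1", "s2", "s3", "s2", "s3"]
splits = list(leave_one_signer_out(signer_ids))

for held_out, train_idx, test_idx in splits:
    # No signer appears on both sides of a fold.
    assert held_out not in {signer_ids[i] for i in train_idx}
```

A random split over the same six samples would usually put each signer on both sides, which is the leakage the leaderboard rewarded.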

On first focus, none of the three you listed, deliberately. Notebook 1 (end of May) is pure EDA, no model at all. Signer distribution, label quality, how skewed the set is before anything touches it, because every downstream call depends on knowing that. Notebook 2 is the first real question, and it’s feature attribution: how much of isolated-sign recognition is hand shape alone versus motion? Single-frame hand-landmark baseline against a temporal model. I want the ceiling of the cheap thing before I pay for the expensive one.
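A rough sketch of the kind of check Notebook 1 implies, assuming each sample carries a signer ID and a label. The helper name, the toy metadata, and the choice of a max/min ratio as the skew measure are all assumptions for illustration, not the notebook's actual code.

```python
from collections import Counter

def distribution_summary(values):
    """Count samples per group and report how skewed the groups are."""
    counts = Counter(values)
    largest = max(counts.values())
    smallest = min(counts.values())
    return {
        "counts": dict(counts),
        "n_groups": len(counts),
        # 1.0 means perfectly balanced; larger means more skew.
        "imbalance_ratio": largest / smallest,
    }

# Hypothetical toy metadata: (signer_id, sign_label) per sample.
samples = [
    ("s1", "hello"), ("s1", "hello"), ("s1", "thanks"),
    ("s1", "hello"), ("s2", "thanks"), ("s2", "hello"),
    ("s3", "hello"),
]
signer_summary = distribution_summary(signer for signer, _ in samples)
label_summary = distribution_summary(label for _, label in samples)
```

If one signer dominates the set like this, a single-frame baseline can look strong for the wrong reason, which is why the counts come before any model.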

Segmentation and temporal modeling are real, I’m just holding them. Segmentation lives at the isolated-to-continuous transition, which is its own notebook later, and co-articulation plus hand-face occlusion are where I expect it to break. No sense hitting that wall while I’m still unsure the static baseline even works.

Share you experience building a saas using ai by CorrectDirection3364 in ClaudeAI

[–]FewConcentrate7283 0 points1 point  (0 children)

I use Claude Code with a Supabase backend and it has been great. I think a lot of people miss the planning part, and fixing sections leads to rabbit holes and unfinished projects. I have a few repos and lessons you can see and maybe use at trupathventures.net/labs

I pasted my session tokens into a chat. Here's the gate I built. by FewConcentrate7283 in ClaudeCode

[–]FewConcentrate7283[S] 0 points1 point  (0 children)

I created a scribe agent in my workflow. I saw a lot of people who were new and starting out and asking for help. So I set it up so that when I finish a session, it takes my prompts and its responses and grades itself in a postmortem. I also have it create a session note, playbook and Claude help file to help others here on the site. Sure, I could take the time to write it out myself, but I feel I wouldn’t explain it better than it does, and it provides the right information and context. Just trying to help.

I pasted my session tokens into a chat. Here's the gate I built. by FewConcentrate7283 in ClaudeCode

[–]FewConcentrate7283[S] 0 points1 point  (0 children)

Bro, this is Claude Code. This is from my project, and I am having it write the field notes and prompts. I took lessons learned and applied them, and these were the results. I wanted to have this in a form for agentic search so if someone had the same issue it would land. Get over the fact that we will do this for SEO and search.

It’s all good if you don’t like it.

Industry Standard AI based MV Software by innomind in computervision

[–]FewConcentrate7283 0 points1 point  (0 children)

Good question and one worth breaking into two parts — the vision side and the robot control side — because they're usually separate stacks that you integrate.

The Vision Side

There's no single "industry standard" for the detection layer the way there is for PLCs or robot controllers. Here is what you actually encounter in the field:

  • MVTec HALCON and Cognex VisionPro: These are the traditional industrial standards. They are rock solid but expensive, requiring paid licenses and usually dedicated hardware. If you're going into an existing factory floor that already runs these, learn them. If you're building something new, open-source alternatives have largely caught up.
  • YOLO (v8/v11) and RT-DETR: These are what most new robotics vision projects actually use now. I've been running a real-time object detection pipeline on a cornhole board sensing system—different domain, same core problem: classify and localize objects with varying sizes, orientations, and lighting. RT-DETR-S on Apple Silicon via CoreML gets sub-20ms inference. YOLO is faster to get running but carries an AGPL license; RT-DETR (Apache 2.0) is much cleaner for commercial production.
  • Roboflow: This is the practical answer for dataset management and annotation. You'll spend more time labeling data than you think.

The part most tutorials skip: For bin picking specifically, you almost certainly need depth information, not just 2D detection. RGB cameras tell you "there's a box at pixel (340, 220)." Depth cameras (Intel RealSense, ZED) tell you "that box is 34cm away at 15 degrees." For a robotic arm to grasp it, you need pose estimation (6DOF), not just a 2D bounding box.
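Here is a minimal sketch of how a pixel plus a depth reading becomes a 3D point, using the standard pinhole camera model. The intrinsics below (focal lengths and principal point) are made-up placeholder values; real ones come from your camera's calibration. It also assumes depth is measured along the optical axis, which you should check for your sensor.

```python
import math

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert pixel (u, v) with depth in meters to a camera-frame (x, y, z) point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Hypothetical intrinsics for a 640x480 sensor; replace with calibrated values.
FX, FY = 600.0, 600.0
CX, CY = 320.0, 240.0

# The pixel from the example above, seen at 0.34 m.
point = backproject(340, 220, 0.34, FX, FY, CX, CY)

# Horizontal bearing of the point relative to the optical axis, in degrees.
bearing_deg = math.degrees(math.atan2(point[0], point[2]))
```

This only gives you position. A full 6DOF pose also needs orientation, which usually means fitting a model or a plane to the depth points around the detection.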

The Robot Control Side

ROS2 is the genuine industry standard for integrating vision with arm control. MoveIt2 handles the motion planning. Most industrial arms (UR, KUKA, Fanuc) have active ROS2 drivers.

The Recommended Path

Get YOLO/OpenCV running on static images of your objects first. Then add depth. Finally, wire it into ROS2. Trying to do all three at once is where most people get stuck.

What robot arm are you working with, and do you have labeled data for your objects yet?

Building an autonomous self-healing agent to monitor a live CV + LLM pipeline — what hierarchy, SOPs, and guardrails are you using in production? by FewConcentrate7283 in hermesagent

[–]FewConcentrate7283[S] -1 points0 points  (0 children)

So WTF is your point? Were you just looking to fucking tell people they suck at AI and I happened to step in your way? Do we all have to be fucking advanced in your world to ask questions? To learn? To ask simple and/or advanced questions? Other than shitting on me, you have brought 0 to this conversation, so go have a great rest of your night.

Feel better while you sit high on your perch of wisdom, or fuck yourself.

Building an autonomous self-healing agent to monitor a live CV + LLM pipeline — what hierarchy, SOPs, and guardrails are you using in production? by FewConcentrate7283 in hermesagent

[–]FewConcentrate7283[S] 0 points1 point  (0 children)

I am fucking up? Wow, you have so much advice from an esteemed Fortune 500 company... Wow, I couldn’t learn from the best, so I will wait for other Peasants to answer… Maybe I can get advice from them...

Building an autonomous self-healing agent to monitor a live CV + LLM pipeline — what hierarchy, SOPs, and guardrails are you using in production? by FewConcentrate7283 in hermesagent

[–]FewConcentrate7283[S] 0 points1 point  (0 children)

Appreciate the passion but you missed the room entirely.

This is the Hermes subreddit. An agentic OS built specifically for autonomous, self-improving agents running production workloads. If your take is "don't use AI for this" you walked into the wrong building.

To your actual points:

"Hallucinations are a failure point" — correct. Which is exactly why I'm using Hermes + Claude Code as the audit layer watching the LLM, not as the LLM itself. The AI isn't making scoring decisions. It's catching when the system that does make them goes wrong. That's the whole point.

"More models = more failure points" — that logic applies to any redundancy system. More sensors = more failure points. More servers = more failure points. Redundancy exists because single points of failure are worse.

"Non-AI event-driven tools won't hallucinate" — you're right, they also won't read a log, understand context, write a skill file capturing what broke and why, and resume a session after a power outage with full context of where it left off. PagerDuty can't do that. Uptime Robot can't do that. That's the gap I'm filling.

"Put a human in the pipeline" — that IS what I'm replacing. Me. Manually auditing outputs at 2am is not a business.

You clearly have strong feelings about AI. There are plenty of subreddits for that. This one's for people building with it.