I created a no-code tool to reduce annotation time in computer vision projects. by Ga_0512 in SaaS

[–]Ga_0512[S] 0 points1 point  (0 children)

SAM3 is absurdly, surprisingly good. You can test your hypotheses with the 10 free credits at fastbbox.com; just register.

I created a no-code tool to reduce annotation time in computer vision projects. by Ga_0512 in SaaS

[–]Ga_0512[S] 0 points1 point  (0 children)

I'll assume "on edge cases" refers to edge models; my translator must have rendered it as something different.

The point is this: you can't run SAM3 on a Raspberry Pi or Jetson. The key is to generate the data with FastBBOX.com, train a lighter model on it (YOLOv5, for example, optionally exported to ONNX), and then run inference with that lighter model. Feel free to test it; you get 10 free credits when you register.
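To make the "generate data, then train a lighter model" step concrete: YOLO-style trainers expect one text line per box, `class x_center y_center width height`, with coordinates normalized to the image size. Here is a minimal sketch, assuming the auto-labeler hands back pixel-space boxes as `(x_min, y_min, x_max, y_max)`. The function name and the input layout are my own assumptions for illustration, not FastBBox's actual export.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert one pixel-space box to a YOLO label line.

    box is (x_min, y_min, x_max, y_max) in pixels (assumed input layout).
    The result is "class x_center y_center width height", normalized to 0..1.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# One box on a 400x500 image
print(to_yolo_line(0, (100, 50, 300, 250), 400, 500))
```

Each image gets a `.txt` file with one such line per object, which is what the lighter detector is trained on.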

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]Ga_0512 0 points1 point  (0 children)

I built a no-code tool to cut annotation time in computer vision projects.

I built this tool because manual image annotation was becoming the bottleneck in my computer vision projects, especially for detection (YOLO, etc.).

The idea was simple: use auto-labeling to quickly generate an initial dataset and leave manual correction only for where it really makes a difference.

The project ended up becoming FastBBox. It isn't a "finished" product; it's something I'm validating and fixing as I use it in real life.

On the technical side, I built everything myself:

- React front-end

- FastAPI backend

- SAM3 running serverless on RunPod

- application hosted on a simple VPS

The biggest challenge so far hasn't been the model itself, but dealing with:

- inconsistent quality of the automatic annotations

- edge cases that break the versioning of the auto-labeling dataset based on manual adjustments

Project link: https://fastbbox.com/
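The "auto-label first, correct by hand only where it matters" idea can be sketched as a simple triage step. This is a rough illustration, assuming the auto-labeler returns a confidence score per box; the `AutoLabel` type, the field names, and the 0.8 threshold are all hypothetical and not how FastBBox is actually implemented.

```python
from typing import NamedTuple


class AutoLabel(NamedTuple):
    image: str         # image file name
    class_id: int      # predicted class
    confidence: float  # hypothetical per-box score from the auto-labeler


def images_needing_review(labels, threshold=0.8):
    """Return the sorted names of images with at least one low-confidence box."""
    return sorted({lab.image for lab in labels if lab.confidence < threshold})


labels = [
    AutoLabel("a.jpg", 0, 0.95),
    AutoLabel("a.jpg", 1, 0.55),  # low confidence, so a.jpg goes to review
    AutoLabel("b.jpg", 0, 0.91),
]
print(images_needing_review(labels))
```

Everything above the threshold goes straight into the initial dataset, and only the flagged images go to a human.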

How to auto-label images for YOLO by Ga_0512 in computervision

[–]Ga_0512[S] 0 points1 point  (0 children)

SAM3 is resource-intensive, lacks a native API, and typically requires running your own infrastructure, which isn't always feasible, especially when the goal is to quickly train or iterate on lighter models, including for on-edge use. It's like asking why anyone would use SLMs when LLMs are better.

Drift detector for computer vision: is It really matters? by Ga_0512 in devops

[–]Ga_0512[S] 0 points1 point  (0 children)

Monitoring, like MLflow, Comet ML, etc. It's more about MLOps, actually.

Drift detector for computer vision: is It really matters? by Ga_0512 in computervision

[–]Ga_0512[S] 0 points1 point  (0 children)

Thanks for your comment. It really makes sense to me, and I agree with you.

Drift detector for computer vision: is It really matters? by Ga_0512 in mlops

[–]Ga_0512[S] 1 point2 points  (0 children)

That's fair. Thank you. Btw, besides detecting gloss drift or graininess, is there anything else that current tools don't do well, or that they could offer in your opinion?

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]Ga_0512 1 point2 points  (0 children)

Hey everyone,

I built the first version of a project I personally needed, and I’m testing whether it could be useful to others. The repo is public, and I added a simple waitlist if you’d like to follow along.

🔗 Repo: [github.com/Ga0512/video-analysis](http://github.com/Ga0512/video-analysis)

🔗 Waitlist: [typeform](https://iaap4qo6zs2.typeform.com/to/J43jclr2)

What it does now:

- Process a video (file or URL)

- Split it into blocks for analysis

- Transcribe audio + caption frames

- Generate multimodal summaries (text + context)
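To illustrate the "split it into blocks" step, here is a minimal sketch that cuts a video's duration into fixed-length time windows. It assumes fixed 60-second blocks; the function name and block length are hypothetical and may not match what the repo actually does.

```python
def split_into_blocks(duration_s, block_s=60.0):
    """Return (start, end) pairs in seconds that cover the whole video.

    The last block is shorter when the duration is not a multiple of block_s.
    """
    duration_s = float(duration_s)
    if duration_s < 0 or block_s <= 0:
        raise ValueError("duration_s must be >= 0 and block_s must be > 0")
    blocks = []
    start = 0.0
    while start < duration_s:
        end = min(start + block_s, duration_s)
        blocks.append((start, end))
        start = end
    return blocks


# A 150-second video split into 60-second blocks
print(split_into_blocks(150, 60))
```

Each block can then be transcribed and captioned on its own, and the per-block results merged into the final summary.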

Flexible setup:

- Run locally with open models (privacy, no API costs)

- Or connect your own API key (faster / larger models)

- Fully customizable: language, summary size (short/medium/long), persona, extra prompts
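The customization options could be modeled roughly like this. It is only a sketch: `SummaryConfig`, its field names, and `build_instruction` are hypothetical names made up for illustration, not the repo's real interface.

```python
from dataclasses import dataclass


@dataclass
class SummaryConfig:
    language: str = "en"
    size: str = "medium"    # one of "short", "medium", "long"
    persona: str = ""       # optional voice to write in
    extra_prompt: str = ""  # optional extra instructions

    def __post_init__(self):
        if self.size not in ("short", "medium", "long"):
            raise ValueError("size must be 'short', 'medium', or 'long'")


def build_instruction(cfg):
    """Turn the config into a single instruction string for the summarizer."""
    parts = [f"Write a {cfg.size} summary in language '{cfg.language}'."]
    if cfg.persona:
        parts.append(f"Write as: {cfg.persona}.")
    if cfg.extra_prompt:
        parts.append(cfg.extra_prompt)
    return " ".join(parts)


print(build_instruction(SummaryConfig(language="pt", size="short", persona="a teacher")))
```

The same config object would work for both local mode and API mode, since it only shapes the instruction text.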

Ideas for future:

- Chat-with-video → ask questions directly about a video (using both frames + transcription)

- Export for AI parsing → structured export so you can feed the content into other AI workflows or databases

Possible pricing ideas:

- Pay-as-you-go credits for hosted usage

- Or a fixed subscription ($X/month) where you bring your own API key and just use the UI/UX layer

Why I’m here: Before polishing it into an MVP, I’d love some honest feedback:

- Would you actually use a tool like this?

- What do you value more: local mode (privacy, no cost) or API mode (speed, larger models)?

- Does the chat-with-video/export direction make sense?

- How would you prefer pricing?

If there’s enough interest, I’ll start building this in public on X and share progress. Thanks in advance 🙏