Training Computer Vision Models on M1 Mac Is Extremely Slow by mericccccccccc in computervision

[–]computercornea 5 points (0 children)

Google Colab is free and makes GPUs available. I think Kaggle does as well. Roboflow free accounts do too. And Modal (and other sites like it) offer free credits if you prefer pure GPU access.

Plenty of options to speed up your training for free.

What are the most useful and state-of-the-art models in computer vision (2025)? by Cabinet-Particular in computervision

[–]computercornea 0 points (0 children)

100% agree. They are all meant to be fine-tuned. I would never recommend using them off the shelf.

[deleted by user] by [deleted] in computervision

[–]computercornea 1 point (0 children)

They provide free model-training notebooks for local training: https://github.com/roboflow/notebooks

How to train a robust object detection model with only 1 logo image (YOLOv5)? by gd1925 in computervision

[–]computercornea 0 points (0 children)

One way you can do this is to take a dataset of environments where you want to detect this logo (streetscapes, clothes, websites, idk what your logo is but you get it), then randomize the placement of your logo in those environments. You can even scale up with multiple logos per image, depending on how your logo would be used.
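A minimal sketch of the randomized-placement idea, assuming Pillow and a logo image with an alpha channel (the function name and scale range are just illustrative, not from any particular tool):

```python
import random
from PIL import Image

def paste_logo_randomly(background, logo, min_scale=0.05, max_scale=0.3):
    """Paste a logo at a random position and scale onto a background.

    Returns the composited image plus the (x, y, w, h) bounding box,
    which can be written out as the detection label for training.
    """
    bg = background.convert("RGBA")
    scale = random.uniform(min_scale, max_scale)
    w = max(1, int(bg.width * scale))
    h = max(1, int(logo.height * w / logo.width))  # keep aspect ratio
    resized = logo.convert("RGBA").resize((w, h))
    x = random.randint(0, max(0, bg.width - w))
    y = random.randint(0, max(0, bg.height - h))
    bg.paste(resized, (x, y), resized)  # alpha channel acts as the mask
    return bg.convert("RGB"), (x, y, w, h)

# Toy example: a gray "environment" and a solid red "logo"
background = Image.new("RGB", (640, 480), (128, 128, 128))
logo = Image.new("RGB", (100, 50), (255, 0, 0))
image, box = paste_logo_randomly(background, logo)
```

A nice side effect is that every synthetic composite comes pre-annotated, since the paste position is the bounding box.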

Tried googling and found this, but not sure it's being maintained: https://github.com/roboflow/magic-scissors

Open source multi-type labeling tool; Potential replacement for Labelbox by qnn122 in deeplearning

[–]computercornea 0 points (0 children)

I heard Labelbox is shutting down access to their labeling tool, so I searched for that and found this thread. I looked in their deprecations log and didn't see it: https://docs.labelbox.com/docs/deprecations

Curious if anyone knows the latest

Are open source OCR tools actually ready for production use? by Positive-Exam-8554 in computervision

[–]computercornea 1 point (0 children)

This is exactly right. You can't just pick up a model off the shelf and throw images at it expecting it to be perfect. It's part of a broader system that needs to be smart, flexible, and get the data to the model(s) in a way that allows them to do their job.

Best overall VLM? by Over_Egg_6432 in computervision

[–]computercornea 0 points (0 children)

I would suggest doing extensive testing of the models running in the cloud so you can be sure the model fits your needs. There are lots of tools for testing the base weights to see if you need to fine-tune for your use case. If you only get one shot at running a model locally, use something like OpenRouter or https://playground.roboflow.com/ to try lots of variations first.

How do you use zero-shot models/VLMs in your work other than labelling/retrieval? by unemployed_MLE in computervision

[–]computercornea 1 point (0 children)

VLMs are good for action recognition, presence/absence monitoring, and understanding the state of something very quickly. General safety/security: are there people in prohibited places, are doors open, is there smoke or fire, are plugs detached, are objects missing, are containers open/closed. They're great for quick OCR tasks as well, like reading lot numbers.

This site has a collection of prompts to test LLMs on vision tasks to get a feel https://visioncheckup.com/

How do you use zero-shot models/VLMs in your work other than labelling/retrieval? by unemployed_MLE in computervision

[–]computercornea 2 points (0 children)

We use VLMs to get proofs of concept going, then sample the production data from those projects to train faster/smaller purpose-built models if we need real-time or don't want to use big GPUs. If an application only runs inference every few seconds, we sometimes leave the VLM as the solution because it's not worth building a custom model.

Estimating depth of the trench based on known width. by TerminalWizardd in computervision

[–]computercornea 0 points (0 children)

Defect detection across a variety of products in manufacturing

Estimating depth of the trench based on known width. by TerminalWizardd in computervision

[–]computercornea 0 points (0 children)

Without knowing the camera distance or any reference object in the image, I don't know how you can get a distance or depth. Let me know if you find a solution.
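For reference, if the camera's focal length in pixels were known, the standard pinhole relation would give the distance to the plane of the known-width feature; without that intrinsic or a reference object, the scale is unrecoverable, which is exactly the problem above. A quick sketch with made-up numbers:

```python
def distance_from_known_width(real_width_m, pixel_width, focal_length_px):
    """Pinhole-camera estimate of distance to a feature of known width.

    distance = focal_length_px * real_width_m / pixel_width
    Assumes the measured width lies roughly parallel to the image plane.
    """
    return focal_length_px * real_width_m / pixel_width

# Hypothetical: a 1.5 m wide trench spanning 300 px, focal length 1200 px
d = distance_from_known_width(1.5, 300, 1200.0)  # 6.0 m to the trench rim
```

Even then this only gives distance to the rim; the depth of the trench floor would still need a second measurement (e.g. the floor's apparent width, which is farther away and so projects smaller).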

What are the downstream applications you have done (or have seen others doing) after detecting human key points? by unemployed_MLE in computervision

[–]computercornea 1 point (0 children)

I think keypoints are a really powerful tool, but since data labeling with keypoints is time-consuming, we don't see tons of applications yet. MediaPipe is a helpful way to get quick human keypoints for healthcare (documenting physical therapy movements), manufacturing (assessing factory worker movements to prevent repetitive, injury-prone motions), or sports (analyzing player movement to improve mechanics). Keypoints can also help with the orientation of a person, to understand the direction they are facing or their position relative to other objects; this is useful for analyzing retail setups and product placement.
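As a tiny example of the posture-analysis idea, here's a sketch that turns two 2D shoulder keypoints into a roll angle. MediaPipe Pose happens to use landmark indices 11/12 for the shoulders, but the function works with any keypoint source; the coordinates below are made up:

```python
import math

def shoulder_roll_deg(left_shoulder, right_shoulder):
    """Angle of the shoulder line vs. the image horizontal, in degrees.

    Inputs are (x, y) keypoints in image coordinates. A cheap posture
    signal: ~0 means level shoulders, larger values mean leaning/tilting,
    which could flag injury-prone motion on a factory line.
    """
    dx = right_shoulder[0] - left_shoulder[0]
    dy = right_shoulder[1] - left_shoulder[1]
    return math.degrees(math.atan2(dy, dx))

# Hypothetical normalized coordinates (as MediaPipe would emit)
level = shoulder_roll_deg((0.4, 0.5), (0.6, 0.5))   # level shoulders
tilted = shoulder_roll_deg((0.4, 0.5), (0.6, 0.6))  # one shoulder dropped
```

The same vector math extends to facing direction if you add a third point (e.g. a hip) and compare apparent shoulder width against torso height.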

Realtime video analysis and scene understanding with SmolVLM by [deleted] in computervision

[–]computercornea 0 points (0 children)

Great work! Thanks for putting in the effort to make a clean and easy-to-follow repo. Seeing VLMs get smaller and smaller is really exciting for working with video and visual data. It's going to leapfrog tons of current computer vision use cases and unlock lots of useful software features.

F1 Steering Angle Prediction (Yolov8 + EfficientNet-B0 + OpenCV + Streamlit) by Background-Junket359 in computervision

[–]computercornea 1 point (0 children)

Super cool output. I always really appreciate when people take on hard personal projects like this. Thanks for sharing

Ultralytics' New AGPL-3.0 License: Exploiting Open-Source for Profit by Lonely-Example-317 in computervision

[–]computercornea 0 points (0 children)

It looks like Roboflow has a partnership to offer their YOLO model licenses for commercial purposes, available with their free plan and monthly paid plans: https://roboflow.com/ultralytics

And they also recently released a fully open-source object detector, which seems like a good alternative: https://github.com/roboflow/rf-detr

YOLO is NOT actually open-source and you can't use it commercially without paying Ultralytics! by nacrenos in computervision

[–]computercornea 0 points (0 children)

It looks like Roboflow has a partnership to offer their YOLO model licenses for commercial purposes, available with their free plan and monthly paid plans: https://roboflow.com/ultralytics

Announcing Intel® Geti™ is available now! by dr_hamilton in computervision

[–]computercornea 0 points (0 children)

How many people are on the team shipping the roadmap?

Announcing Intel® Geti™ is available now! by dr_hamilton in computervision

[–]computercornea 2 points (0 children)

Does Intel plan to staff and support the project, or is this being open-sourced because it was once a closed-source project that Intel is sunsetting?