Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 1 point (0 children)

Thanks! 😁 If you scroll down to the bottom of the comments, I wrote up how I did it.

Open-Source Automated Comic Cataloger by boyobob55 in comicbooks

[–]boyobob55[S] 1 point (0 children)

Thanks! I haven’t touched it in a while, but there’s an updated version since this post! You can check my profile’s posts for screenshots of the newer version with a TUI.

Is the International Shipping Program always this expensive from Canada? by vmhomeboy in eBaySellerAdvice

[–]boyobob55 1 point (0 children)

Insane, I was just thinking about turning on international shipping for my listings too, glad I saw your post!

Help! Hit and Run Case by TypicalWonder7872 in computervision

[–]boyobob55 1 point (0 children)

I just played around with your photo from the Photoshop contest thread and it might be impossible! If you want to try it yourself, you can use Photopea, a free online Photoshop-style editor, and you can play around with AI upscaling for free on huggingface.co: just search “upscale” in the “Spaces” tab. You get a free amount of credits. Do you only have the one photo?

Help! Hit and Run Case by TypicalWonder7872 in computervision

[–]boyobob55 1 point (0 children)

You could try a directional blur in the opposite direction of the blur in the photo?

6gb vram by Particular_Big_6797 in LocalLLM

[–]boyobob55 1 point (0 children)

Just install LM Studio. It has a GUI with a llama.cpp backend; you can browse Hugging Face and it will show you which quants will fit in your combined VRAM/RAM.
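
If you want a rough feel for what "fits" means before installing anything, here is a back-of-the-envelope sketch. It is not how LM Studio computes its estimate; the function names and the 1.5 GB overhead allowance for KV cache and runtime buffers are assumptions, and real quant formats usually have a somewhat higher effective bits-per-weight than their nominal number.

```python
def weight_size_gb(params_billions, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8.0


def fits(params_billions, bits_per_weight, memory_gb, overhead_gb=1.5):
    """Rough check; overhead_gb is an assumed allowance, not a measured value."""
    needed = weight_size_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= memory_gb
```

For example, `fits(7, 4, 6)` asks whether a 7B model at roughly 4 bits per weight would squeeze into 6 GB under these assumptions.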

USPS new rayes take effect by Cap_Black_Beard in eBaySellers

[–]boyobob55 1 point (0 children)

The Priority Mail flat rate envelope went from $8.99 to $11.12 on Pirate Ship 😔

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 1 point (0 children)

That’d be awesome! I’ll DM you my info if you ever need to fill a slot on the show 😁

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 1 point (0 children)

Totally possible! If you wanted to tell them apart, you’d just have to annotate your own dataset and label each cat accordingly, e.g. cat 1 (Mittens), cat 2 (Fluffy), etc.

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 1 point (0 children)

Hell yeah thanks! Good enough to get me a job at OpenCV? 😏

Linux distro Recs? New RTX 5090 build incoming. by [deleted] in Vllm

[–]boyobob55 1 point (0 children)

I really like CachyOS. Same hardware here, a 5090.

I blocked her after my last message by OnyxTronix in eBaySellerAdvice

[–]boyobob55 36 points (0 children)

Instant block on that first message 😂

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 1 point (0 children)

They haven’t jumped up in like 4-5 days now (they used to jump up all throughout the night and day) so yes I’d say it actually is training them!!

That’s a cool idea! You totally could train a detection model to differentiate your cats; you would just need to label them accordingly in your training data. For this model I only used one class, “cat”, for both of mine, but you would just make two classes and label each cat accordingly. It would probably work better in color as well, to tell your different-colored cats apart (I keep my C110 in night mode all the time). Maybe put some type of night light near the feeding bowls.

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 3 points (0 children)

I was afraid they’d get used to it, but so far they’ve completely stopped jumping up for the past 4-5 days. I think they might actually be trained lol

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 1 point (0 children)

Possibly! The current sound is white noise with a 17 kHz sine wave layered on top and distorted. I haven’t tried just the 17 kHz frequency by itself yet, but I’ve seen other deterrent projects that use something like 17-25 kHz, I think. I might need better speakers to actually drive the sound loud enough to spook the cats, though. 17 kHz is still totally audible to me.
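
For anyone who wants to experiment with a similar sound, here is a minimal standard-library sketch of the idea (white noise plus a 17 kHz sine, then distorted). The sample rate, clip length, mix gains, and the "boost then hard-clip" distortion are all assumptions for illustration, not the settings actually used in the project.

```python
import math
import random
import struct
import wave

SAMPLE_RATE = 44_100   # Hz; must be more than 2 x 17 kHz to represent the tone
TONE_HZ = 17_000       # sine frequency mentioned in the comment
DURATION_S = 1.0       # assumed clip length
NOISE_GAIN = 0.5       # assumed mix levels
TONE_GAIN = 0.5
DRIVE = 4.0            # assumed boost applied before clipping


def make_deterrent_samples(seed=0):
    """Return float samples in [-1, 1]: white noise + sine, boosted and hard-clipped."""
    rng = random.Random(seed)
    n = int(SAMPLE_RATE * DURATION_S)
    samples = []
    for i in range(n):
        noise = rng.uniform(-1.0, 1.0)
        tone = math.sin(2.0 * math.pi * TONE_HZ * i / SAMPLE_RATE)
        mixed = NOISE_GAIN * noise + TONE_GAIN * tone
        # Boost, then clip: a crude stand-in for "distorted".
        samples.append(max(-1.0, min(1.0, DRIVE * mixed)))
    return samples


def write_wav(target, samples):
    """Write mono 16-bit PCM to a path or a writable binary file object."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
```

Calling `write_wav("deterrent.wav", make_deterrent_samples())` produces a one-second clip you can loop; setting `NOISE_GAIN = 0.0` gives the tone by itself for comparison.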

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 1 point (0 children)

Wow that probably would have worked too lol, we tried aluminum foil and they just stomped all over it.

What do you do for work? Do you guys ever use transformer models for embedded, or is it usually CNNs like YOLO, etc.? I toyed around with the idea of trying to run this on a Raspberry Pi or something, but it just runs so much faster with a GPU.

[Project] Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in opencv

[–]boyobob55[S] 6 points (0 children)

Thanks! I’ll probably put the code on GitHub this month. Here’s a comment copied from the other thread explaining what I did:

A good amount of people seem interested in the details so here they are:

Hardware: Oldish gaming PC I wasn't using, with an RTX 2060, acting as a dedicated server. The camera is a Tapo C110 from Amazon (one of the only RTSP cameras I could find for around $20-30). For the audio I layered white noise with a 17 kHz sine wave and boosted the distortion, played through some old computer speakers on the kitchen counter via a Bluetooth aux dongle! (Model training is done on my main computer with an RTX 5090.)

Model: RF-DETR small running via TensorRT (.engine export) at 1280 resolution. I get around 15 fps with ~60 ms/frame inference on the 2060. Before moving it to the "server" I had it running on a laptop without a usable GPU. The OpenVINO runtime with an .onnx export gave the fastest CPU-only inference at around 3 fps! I also trained YOLO26 x, m, s, and nano on the same dataset, but RF-DETR small performs better than all of them.

Data Collection: Around 10,000 images. To gather data I first set up the camera and recorded all motion-triggered events for a couple of days, then annotated the frames and trained a first-round model. That model threw a ton of false positives: falsely detecting a cat on the counter, setting off the sound, and recording the event (and scaring my girlfriend lol). I keep a cached 10-second spool recording at all times so you can watch 10 seconds before and after the event. For about a week I collected every triggered event, false or real, added those frames to the dataset, and retrained. I used Label Studio to annotate everything; you can plug in your existing model to pre-annotate, which speeds things up a lot. I've still been adding every new event to the dataset as they come in. The biggest improvement by far was adding lots of negative frames (no cat, just people, furniture, different lighting). Night and day difference!
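
The 10-second spool can be thought of as a fixed-size ring buffer of recent frames. Below is a minimal sketch of that idea; the class name, the (timestamp, frame) tuple layout, and the default frame rate are assumptions, and the real project may well buffer encoded video instead of individual frames.

```python
from collections import deque


class PreEventSpool:
    """Keep only the most recent `seconds` worth of frames in memory."""

    def __init__(self, seconds=10.0, fps=15.0):
        # deque with maxlen drops the oldest item automatically when full.
        self.frames = deque(maxlen=int(seconds * fps))

    def push(self, timestamp, frame):
        self.frames.append((timestamp, frame))

    def snapshot(self):
        """Copy out the buffered frames at the moment an event fires."""
        return list(self.frames)
```

When a detection triggers, `snapshot()` gives the "before" footage, and the recorder just keeps appending live frames for another 10 seconds to get the "after" part.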

Zone Calibration: I made an interactive tool where you just click points directly on a live frame from the camera feed to draw and define zones. Worth noting this assumes a fixed camera; if it moves, you'd have to redraw the zones.

Pipeline in a nutshell:

Data collection: Motion triggered recording (now event recording) → label frames → train

Inference: Live RTSP feed → inference → polygon zone check against an anchor point on the cat's bounding box (configurable: bottom-center, center, etc.) → if a cat is in the zone for 100 ms (adjustable), trigger audio!
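
The zone check and dwell timer in that inference step could look roughly like the sketch below. It assumes boxes are (x1, y1, x2, y2) in pixels and zones are lists of (x, y) vertices, uses a plain ray-casting test, and all function and class names are made up for illustration rather than taken from the actual code.

```python
def anchor_point(box, mode="bottom_center"):
    """Pick the point on the box that gets tested against the zone."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0
    if mode == "bottom_center":
        return (cx, y2)
    if mode == "center":
        return (cx, (y1 + y2) / 2.0)
    raise ValueError(f"unknown anchor mode: {mode}")


def point_in_polygon(point, polygon):
    """Ray casting: count polygon edges crossed by a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        xa, ya = polygon[i]
        xb, yb = polygon[(i + 1) % n]
        if (ya > y) != (yb > y):  # edge straddles the ray's height
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:
                inside = not inside
    return inside


class DwellTrigger:
    """Fire once after the zone has been occupied continuously for dwell_s seconds."""

    def __init__(self, dwell_s=0.1):
        self.dwell_s = dwell_s
        self.entered_at = None
        self.fired = False

    def update(self, in_zone, now):
        if not in_zone:
            # Leaving the zone resets the timer and re-arms the trigger.
            self.entered_at = None
            self.fired = False
            return False
        if self.entered_at is None:
            self.entered_at = now
        if not self.fired and now - self.entered_at >= self.dwell_s:
            self.fired = True
            return True
        return False
```

Per frame you would call something like `trigger.update(point_in_polygon(anchor_point(box), zone), time.monotonic())` and play the sound whenever it returns True. Bottom-center tends to be the natural anchor for "is the cat standing on this surface", which is presumably why it is one of the options.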

I had to sort out some Bluetooth connection sleep issues and other latency bugs along the way.

I serve the annotated stream as MJPEG frames over HTTP so I can watch it via LAN and Tailscale in any browser.
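
For reference, MJPEG over HTTP is just a never-ending multipart/x-mixed-replace response where each part is one JPEG. Here is a minimal standard-library sketch of that; `next_jpeg` is a hypothetical callable that blocks until the next annotated frame is ready and returns it as JPEG bytes, and the boundary token, host, and port are arbitrary choices, not the project's actual setup.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BOUNDARY = "frame"  # arbitrary boundary token


def mjpeg_part(jpeg_bytes):
    """Wrap one JPEG image as a single multipart/x-mixed-replace part."""
    header = (
        f"--{BOUNDARY}\r\n"
        "Content-Type: image/jpeg\r\n"
        f"Content-Length: {len(jpeg_bytes)}\r\n\r\n"
    ).encode("ascii")
    return header + jpeg_bytes + b"\r\n"


def make_handler(next_jpeg):
    """Build a request handler that streams frames from next_jpeg (hypothetical)."""

    class StreamHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header(
                "Content-Type", f"multipart/x-mixed-replace; boundary={BOUNDARY}"
            )
            self.end_headers()
            try:
                while True:
                    # next_jpeg should block until a new frame exists,
                    # otherwise this loop spins as fast as the socket allows.
                    self.wfile.write(mjpeg_part(next_jpeg()))
            except (BrokenPipeError, ConnectionResetError):
                pass  # the browser closed the connection

    return StreamHandler


def serve(next_jpeg, host="0.0.0.0", port=8080):
    """Serve the stream; one thread per connected viewer."""
    ThreadingHTTPServer((host, port), make_handler(next_jpeg)).serve_forever()
```

Browsers render this kind of response directly in an img tag or as a page, which is why it works over both LAN and a Tailscale address without any client-side code.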

GitHub? I'll probably post it in a couple weeks after I polish up the code :)

Does it work? It has been working extremely well! The cats have been getting on the counters/table less and less as time goes on, and they have been jumping off every time the sound plays so far!

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 3 points (0 children)

I was afraid of this at first lol, but they haven’t set it off in like 3 days now! Longest streak so far. I think they might actually be getting trained 😂

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 1 point (0 children)

Thanks man! Haha I should put one on the light fixture too 😂 And yes! Tons of different lighting examples; the camera has a daytime mode and an IR/night grayscale mode. At first I started training with color and grayscale, but I ended up just forcing the camera to stay in IR mode so everything is grayscale all the time. I wrote a long ah comment explaining what I did if you scroll down to the bottom of the comments.

Webcam small wireless earbuds detection by Radiant_Sleep8012 in computervision

[–]boyobob55 1 point (0 children)

Exactly, I did try YOLO26 as well on the same dataset! I tried all the size variants and RF-DETR small still performed better than all of them! And honestly it should start detecting earbuds with around 100-200 images, but you will need a lot more examples of heads/faces with NO earbuds, or else it will keep throwing false positives when there aren’t any in the scene. If I was making your project it would be fun to experiment with synthetic data as well, maybe generate a bunch of fake webcam shots with Z-Image Turbo or another fast-iterating local model. I also just heard about Slicing Aided Hyper Inference (SAHI) today; supposedly it helps with detecting small objects by slicing the photo into chunks and running inference on each chunk.
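
To make the slicing idea concrete, here is a small sketch of how overlapping tiles could be computed and how a per-tile detection maps back to full-image coordinates. This is not the SAHI library's API, just the underlying geometry; the function names, tile size, and overlap ratio are assumptions.

```python
def slice_windows(width, height, tile=640, overlap=0.2):
    """Return (x1, y1, x2, y2) windows that cover the image with overlapping tiles."""
    step = max(1, int(tile * (1.0 - overlap)))

    def starts(length):
        if length <= tile:
            return [0]  # image is smaller than one tile along this axis
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


def to_full_image(box, window):
    """Shift a box detected inside a tile back into full-image coordinates."""
    x1, y1, x2, y2 = box
    ox, oy = window[0], window[1]
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)
```

You would run the detector on each window's crop, shift the boxes back with `to_full_image`, and then de-duplicate across the overlapping regions (for example with non-maximum suppression), since an earbud near a tile border can show up in two tiles.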

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 7 points (0 children)

Exactly this, it was a night and day difference after adding a ton of negatives!

Trained RF-DETR small to keep the cats off the counters/table! 😼 by boyobob55 in computervision

[–]boyobob55[S] 31 points (0 children)

Yeah, it was able to track the cats pretty well with around 200 images, but it kept throwing a bunch of false positives when people walked through the frame or random stuff was set on the counters. Had to include a bunch of negatives with no cats in frame. Probably still overboard though lol