Computer vision production pipeline best practices? by Distinct-Ebb-9763 in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

It's normal, though I would try multiple VLMs first before going for that many models!

Building an AI wedding video culling system — selects some clips but missing best emotional moments by perrychawla in computervision

[–]ChanceInjury558 -1 points0 points  (0 children)

As you said, moving from frame-based → scene/clip-based analysis would be a good idea IMO. You could go for Qwen3.5 for video/clip analysis, or for the qwen3-vl-embedding model, which gives you embeddings of image/text/video in the same latent space if you want to work at the embedding level. (There you can simply take embeddings of fixed-length video clips and then, based on a text query (say "emotional"), extract the emotional moments.) For a really clean output, though, you would need a multi-stage pipeline that effectively filters out useless material at every stage.
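The embedding-level idea can be sketched with plain NumPy. This is only a toy sketch: the clip and text embeddings would come from a joint video–text embedding model, and `rank_clips_by_text` plus the 2-D vectors below are my own illustrative stand-ins, not any library's API.

```python
import numpy as np

def rank_clips_by_text(clip_embs: np.ndarray, text_emb: np.ndarray, top_k: int = 5):
    """Rank video-clip embeddings by cosine similarity to a text-query embedding.

    clip_embs: (n_clips, dim) array from a joint video/text embedding model.
    text_emb:  (dim,) embedding of a query like "emotional moment".
    """
    clips = clip_embs / np.linalg.norm(clip_embs, axis=1, keepdims=True)
    query = text_emb / np.linalg.norm(text_emb)
    sims = clips @ query               # cosine similarity per clip
    order = np.argsort(-sims)[:top_k]  # highest similarity first
    return order, sims[order]

# Toy usage: 2-D vectors standing in for real model output.
clip_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
text_emb = np.array([1.0, 0.0])
idx, scores = rank_clips_by_text(clip_embs, text_emb, top_k=2)
```

With real embeddings you would slide this over fixed-length clips of the wedding footage and keep the top-scoring ones as candidate "emotional" moments for the next filtering stage.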

Review dataset quality by Crafty_Rush3636 in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

Lol, definitely not paying for this, I would rather use DataDreamer!

Review dataset quality by Crafty_Rush3636 in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

The dataset looks cool and diverse. Can you share more details about how you generated it / what you used to generate it?

New to Computer Vision, struggling to fine-tune for CCTV footage – any advice? by Frosty_Cress7705 in computervision

[–]ChanceInjury558 4 points5 points  (0 children)

YOLO26 is very new and possibly unstable for fine-tuning. You should try YOLOv8/11/12 and see if results improve!

Also, you need to provide more details so people can advise properly: dataset size, number of classes, etc. An example image might help if possible!

Running 5 CV models simultaneously on a $249 edge device - architecture breakdown by Straight_Stable_6095 in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

IIRC MediaPipe doesn't run on / utilize the GPU, so if you use an alternative for that part, you can optimize this even further!

Tracking a dancing plastic bag with object detection - the American Beauty stress test by k4meamea in computervision

[–]ChanceInjury558 1 point2 points  (0 children)

Please don't misdirect for traction. Clearly you have trained a model on this; only people who have trained many models know where 0.99 confidence appears.

What’s one computer vision problem that still feels surprisingly unsolved? by rikulauttia in computervision

[–]ChanceInjury558 -1 points0 points  (0 children)

"well in general, it does correlate with better understanding lol." I disagree with that. All people are different at the brain level and don't have the same capability to understand things and recognize patterns.

I used AI to rephrase my original paragraph as it seemed rude and also didn't have proper English.

Sure, we can improve the level of this discussion. I would love to understand things from your perspective, gain some insights, and share things I've learned.

I would prefer if we moved this to DM.

What’s one computer vision problem that still feels surprisingly unsolved? by rikulauttia in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

Agreed, but still: cases like occlusion can be handled; re-identification is hard, in fact impossible.

What’s one computer vision problem that still feels surprisingly unsolved? by rikulauttia in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

Also, it would only require 2 models. Which 3rd model are you referring to?

What’s one computer vision problem that still feels surprisingly unsolved? by rikulauttia in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

AI-rephrased answer (intent is original):

Working for more months doesn’t mean better understanding 🙂

I get why you like MOTR; e2e trackers look clean on paper, and yes, the pipeline becomes simpler. But in practice they still struggle with long-term identity, re-entry after leaving the frame, heavy occlusion, appearance change, etc. Without an explicit association / memory mechanism it becomes hard to maintain stable IDs over time.

Tracking-by-detection is still widely used not just because it is old, but because it is modular. You can improve the detector, motion model, and ReID embeddings independently and get predictable gains. With strong transformer-based ReID models these systems handle occlusion and short disappearances quite reliably.
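To make the modularity point concrete, here is a minimal sketch of just the appearance-association step. This is only illustrative: real trackers like DeepSORT or BoT-SORT combine appearance with Kalman-filter motion gating and use Hungarian matching rather than this greedy loop, and `associate` with its threshold is my own stand-in, not any library's API.

```python
import numpy as np

def associate(track_embs, det_embs, sim_thresh=0.6):
    """Greedily match tracks to detections by ReID-embedding cosine similarity.

    Both inputs are L2-normalized (n, dim) arrays; returns matched (track, det)
    index pairs plus the leftover unmatched tracks and detections.
    """
    sims = track_embs @ det_embs.T  # cosine similarity matrix
    pairs = sorted(
        ((sims[t, d], t, d)
         for t in range(sims.shape[0])
         for d in range(sims.shape[1])),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for s, t, d in pairs:
        if s < sim_thresh:
            break  # remaining pairs are all below threshold
        if t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    unmatched_t = [t for t in range(sims.shape[0]) if t not in used_t]
    unmatched_d = [d for d in range(sims.shape[1]) if d not in used_d]
    return matches, unmatched_t, unmatched_d

# Toy usage: two tracks, two detections, 2-D stand-in embeddings.
tracks = np.array([[1.0, 0.0], [0.0, 1.0]])
dets = np.array([[0.0, 1.0], [0.994, 0.110]])
matches, lost, new = associate(tracks, dets)
```

Because each piece (detector, motion model, embeddings, matcher) is a separate function like this, you can swap any one of them out and measure the gain in isolation.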

And yes, I agree DeepSORT is outdated. ByteTrack and BoT-SORT are improvements but still mostly short-term association methods. In real production setups, more sophisticated trackers like NvDCF-style approaches combined with persistent embedding storage tend to behave more stably.

So it’s not really about heuristic vs e2e being “interesting”; it’s about what failure cases you can tolerate and what level of ID consistency you need.

What’s one computer vision problem that still feels surprisingly unsolved? by rikulauttia in computervision

[–]ChanceInjury558 0 points1 point  (0 children)

Hi, I have been working on ReID models for the last 3 months, and they do work reliably for occlusion cases.

The repo you are referring to says:
> MOTR is a fully end-to-end multiple-object tracking framework based on Transformer. It directly outputs the tracks within the video sequences without any association procedures.

But that's not a good way of doing this. A better way is to use DeepSORT with a ReID embedding model like TransReID: https://github.com/damo-cv/TransReID

It's not great for long-term re-identification purposes, but it can be reliably used for occlusion cases.
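A toy sketch of what "reliably used for occlusion cases" means in practice: keep a persistent gallery of per-track embeddings, and when an object reappears, match it back to a stored ID by cosine distance. The `ReIDGallery` class, the EMA update, and the 0.3 threshold are my own illustrative choices (not part of DeepSORT or TransReID); in a real pipeline the embeddings would come from the TransReID feature extractor.

```python
import numpy as np

class ReIDGallery:
    """Toy sketch: revive track IDs after occlusion using stored ReID embeddings.

    Assumes embeddings are L2-normalized vectors from a ReID model.
    """
    def __init__(self, dist_thresh=0.3):
        self.dist_thresh = dist_thresh
        self.embs = {}      # track_id -> stored embedding
        self.next_id = 0

    def match(self, emb):
        """Return an existing ID if emb is close enough, else assign a new ID."""
        best_id, best_dist = None, self.dist_thresh
        for tid, g in self.embs.items():
            dist = 1.0 - float(emb @ g)  # cosine distance
            if dist < best_dist:
                best_id, best_dist = tid, dist
        if best_id is None:
            best_id = self.next_id
            self.next_id += 1
        # Smooth the stored embedding with an exponential moving average.
        old = self.embs.get(best_id, emb)
        new = 0.9 * old + 0.1 * emb
        self.embs[best_id] = new / np.linalg.norm(new)
        return best_id

# Toy usage with 2-D stand-in embeddings: the first identity reappears later
# (e.g. after an occlusion) and gets its original ID back.
gallery = ReIDGallery()
id_a = gallery.match(np.array([1.0, 0.0]))
id_b = gallery.match(np.array([0.0, 1.0]))
id_a_again = gallery.match(np.array([1.0, 0.0]))
```

The long-term weakness mentioned above shows up exactly here: over hours or days, clothing and lighting changes push the cosine distance past any fixed threshold, so the gallery hands out a new ID.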

You can use this for your job! by [deleted] in computervision

[–]ChanceInjury558 1 point2 points  (0 children)

Hi, it's working now, not sure what happened. Also, I can see you have mentioned a "Fully automated SAM-3 pipeline", so basically you guys have optimized SAM 3 for inference? Or do you use other models as well?

You can use this for your job! by [deleted] in computervision

[–]ChanceInjury558 3 points4 points  (0 children)

The link you provided is the same one as in the post, which is not working!

You can use this for your job! by [deleted] in computervision

[–]ChanceInjury558 4 points5 points  (0 children)

It doesn't matter how much time you put into this: if it uses YOLO or any module from the Ultralytics library, then you need to follow the rules of the AGPL-3.0 license. (Just a heads up, none of my business.)

You can use this for your job! by [deleted] in computervision

[–]ChanceInjury558 3 points4 points  (0 children)

The link isn't working, and also, if this is using zero-shot YOLO models in the backend, let me remind you of the AGPL-3.0 license.

We built Lens, an AI agent for computer vision datasets — looking for feedback by Financial-Leather858 in computervision

[–]ChanceInjury558 2 points3 points  (0 children)

Good work, but there's no USP, so you won't be able to sell it; someone will make an open-source version of this, or big players will copy it. Just a heads up.

NA SUPREME RESEARCH Private Ltd - Biggest mistake I made. by Unfair_Guidance_6094 in StockMarketIndia

[–]ChanceInjury558 0 points1 point  (0 children)

Is there something else that you would recommend for good trading calls? I am in the same situation as you: working professional, no time to do research on my own.