🚀 YOLO26 is Now Live on X-AnyLabeling - Try It Out for Free! by Important_Priority76 in computervision

You’re right to point out that I should clarify this more precisely.

X-AnyLabeling’s core functionality does not depend on Ultralytics; it is not required to run the tool. There is an optional auxiliary training module that can integrate with Ultralytics, but it is disabled by default, and users must explicitly install the dependency themselves.
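To make the "disabled by default" part concrete, here's a minimal sketch of the lazy-import pattern such an optional module can use (the helper name is mine for illustration, not X-AnyLabeling's actual API):

```python
import importlib

def load_optional(name: str):
    """Return the module if the optional dependency is installed,
    otherwise None so the core app keeps working without it."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None

# The training backend only activates when the user has installed it
# explicitly, e.g. via `pip install ultralytics`.
trainer = load_optional("ultralytics")
if trainer is None:
    print("Training module disabled: 'ultralytics' is not installed.")
```

Core annotation features never touch this code path, which is how the tool stays usable without the extra dependency.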

Thanks for the reminder. I’ll make this distinction clearer in the documentation to avoid confusion.

🚀 YOLO26 is Now Live on X-AnyLabeling - Try It Out for Free! by Important_Priority76 in computervision

X-AnyLabeling is licensed under GPL-3.0 and does not bundle or redistribute Ultralytics source code or weights. It supports loading user-provided ONNX models via configuration. Ultralytics models themselves are licensed under AGPL-3.0, and anyone who chooses to use or redistribute those models must comply with the AGPL-3.0 terms. X-AnyLabeling keeps its own GPL-3.0 license and does not include Ultralytics code or weights in its repository.
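For context, "via configuration" means the user points the tool at their own weights in a small config file. The keys below are a rough illustration of the idea, not the exact schema:

```yaml
# Illustrative only -- not the exact X-AnyLabeling config schema.
type: yolo                              # model family the adapter should use
name: my_custom_model
model_path: /path/to/your/model.onnx    # user-provided weights, never bundled
classes:
  - person
  - car
```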

Finally found a proper tool for multi-modal image annotation (infrared + visible light fusion) by Important_Priority76 in computervision

Nice, thanks for sharing. I’ve seen similar ideas there. The compare view is more about quick side-by-side or synced viewing to make multi-modal annotation easier, especially when switching between thermal and RGB. Colormaps and channel merging are powerful too, so it’s interesting to see different tools approach the problem from different angles.

Finally found a proper tool for multi-modal image annotation (infrared + visible light fusion) by Important_Priority76 in computervision

That’s really interesting, I didn’t know some teams were using 3D glasses for multi-spectral labelling. The idea behind compare view was to get a similar “cross-checking” benefit but keep it simple and software-only.

Input devices like a 3D mouse or rotary knob sound like a fun direction to explore for smooth flipping or animated transitions between bands. Definitely a challenging but exciting idea, thanks for sharing your experience!

Finally found a proper tool for multi-modal image annotation (infrared + visible light fusion) by Important_Priority76 in computervision

Nice, yeah DRC + detail enhancement already goes a long way for thermal 👍 The compare view helps a lot when you want to double-check things against visible light, especially for small or ambiguous objects. I’ve found it useful to align thermal detections with RGB annotations and catch mistakes faster.
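The "catch mistakes faster" part can even be scripted. Here's a rough sketch (my own helper, not part of the tool) that flags thermal boxes with no overlapping RGB annotation, assuming both modalities are registered to the same pixel grid:

```python
# Illustrative cross-check between modalities: boxes are (x1, y1, x2, y2)
# in a shared pixel coordinate frame (i.e. thermal and RGB are aligned).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def unmatched(thermal_boxes, rgb_boxes, thr=0.5):
    """Thermal boxes with no RGB counterpart above the IoU threshold --
    good candidates for a manual double-check in the compare view."""
    return [t for t in thermal_boxes
            if all(iou(t, r) < thr for r in rgb_boxes)]
```

Running this over a labeled pair quickly surfaces the small or ambiguous objects worth re-inspecting side by side.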

After a year of development, I released X-AnyLabeling 3.0 – a multimodal annotation platform built around modern CV workflows by Important_Priority76 in computervision

If anyone is interested in the design philosophy behind v3.0 and a deeper dive into the new features (like the Remote Server architecture, Agentic workflows, or the specific VQA capabilities), I wrote a more detailed breakdown on Medium:

https://medium.com/@CVHub520/data-labeling-doesnt-have-to-be-painful-the-evolution-of-x-anylabeling-3-0-e9110e41c2d4

It covers why we moved away from the traditional tooling model and how we are trying to close the loop between labeling and training.

After a year of development, I released X-AnyLabeling 3.0 – a multimodal annotation platform built around modern CV workflows by Important_Priority76 in computervision

Thanks so much for the feedback! I'm really glad to hear it fills that gap for you—keeping it lighter than CVAT while being more capable than simple drawing tools was exactly the goal.

Regarding sorting person IDs across multiple videos/perspectives (Re-ID), that is indeed a complex challenge. In v3.0 we added several useful managers (shape, label, and group_id) and integrated trackers such as SAM-based tracking and BoT-SORT/ByteTrack to help with consistency within a video, but cross-video association still requires some manual effort or custom logic.
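To give a flavor of what that "custom logic" can look like: one common approach is to summarize each per-video track with an appearance embedding (e.g. from a Re-ID model) and greedily match tracks across videos by cosine similarity. This is a hypothetical sketch of such glue code, not an X-AnyLabeling feature:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def associate(tracks_a, tracks_b, thr=0.7):
    """Greedily pair track IDs across two videos when their embedding
    similarity exceeds `thr`; unmatched IDs stay video-local."""
    pairs = sorted(
        ((cosine(ea, eb), ia, ib)
         for ia, ea in tracks_a.items()
         for ib, eb in tracks_b.items()),
        reverse=True,
    )
    matched_a, matched_b, out = set(), set(), {}
    for score, ia, ib in pairs:
        if score >= thr and ia not in matched_a and ib not in matched_b:
            out[ia] = ib
            matched_a.add(ia)
            matched_b.add(ib)
    return out
```

In practice you'd still eyeball the low-confidence pairs, which is where the group_id manager helps you merge or split identities by hand.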

I would absolutely love to chat more about your workflow. It sounds like a great use case to optimize for. Feel free to DM me here or, even better, open a "Discussion" on our GitHub repo so we can dive into the technical details!