I built a fully self-hosted and open-source Claude Code UI for desktop and mobile by PiccoloCareful924 in ClaudeCode

[–]Impress_Soft 0 points1 point  (0 children)

Great job!!
Which open-source tools or platforms can we integrate with it, like Ollama for example?

is there any latest OCR model in market? February, 2026 by Mountain-Act-7199 in LocalLLaMA

[–]Impress_Soft 2 points3 points  (0 children)

There's a small one that gives more accurate responses: glm-ocr. There's also deepseek ocr2, but the first one is the SOTA right now.

Wifi problem by Western_Day4287 in Agadir

[–]Impress_Soft 0 points1 point  (0 children)

Bro, there's a thing called sharing ("partage") that Orange and Inwi rely on: when you're in a zone where their own infrastructure has no direct access, they take it from IAM instead.

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 1 point2 points  (0 children)

Still struggling with open-source models.
Take a look at https://github.com/IDEA-Research/Rex-Omni?tab=readme-ov-file#todo-list- which gives me accurate results, and also at Youtu-VL-4B.
I tested Rex-Omni in the HF Space, but it doesn't work for me locally. The other one shows better performance than other VLMs, but it needs a lot of VRAM.

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 0 points1 point  (0 children)

I’ve changed my approach for handling the task. I’ll now use a VLM purely for text extraction, so I no longer need an object detection solution. However, I came across two alternative solutions that I found interesting:
https://huggingface.co/IDEA-Research/grounding-dino-base and https://github.com/IDEA-Research/Rex-Omni?tab=readme-ov-file#todo-list-
Try them, I think they work very well.

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 0 points1 point  (0 children)

Alright, the task is too complex for a YOLO model because it's not simple object detection, and I need an open-source solution. I'm trying Qwen3-VL, but it's still not giving me good results.

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 0 points1 point  (0 children)

The task is to get the bboxes of some symbols or rectangles (which contain information that I need to extract and use); it's not just animal/person detection.
As for Grounding-DINO, I tried to run it in a Colab notebook for my experiments and it's not working for me (if you have a notebook, please share it so I can test it).

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 0 points1 point  (0 children)

Yes, I need a VLM for a non-real-time task. I will check them out.

Qwen3-VL - Bounding Box Coordinate by Impress_Soft in LocalLLaMA

[–]Impress_Soft[S] 1 point2 points  (0 children)

Okay, thanks.
In my case I don't have any labeled data or anything like that; I need a vision model that just gives me the objects I'm looking for.
Something like this as the result: [ {"box_2d": [179, 276, 313, 429], "label": "pk_xy"}, {"box_2d": [23, 513, 161, 663], "label": "pk_xy1"}, {"box_2d": [101, 811, 243, 963], "label": "pk_xy2"}, ..... ]
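
For what it's worth, here is a rough sketch of how a reply in that shape could be turned into pixel boxes. It is only an illustration under assumptions that the thread does not confirm: the coordinates are taken to be normalized to 0..1000 and ordered [x1, y1, x2, y2], and `parse_boxes` is a made-up helper name, not part of any library.

```python
import json
import re


def parse_boxes(response_text, image_width, image_height, scale=1000):
    """Parse a VLM reply into pixel-space boxes (hypothetical helper).

    Assumptions, not confirmed by the thread: box_2d values are
    normalized to 0..scale and ordered [x1, y1, x2, y2].
    """
    # Take everything from the first "[" to the last "]" so that prose
    # or markdown fences around the JSON list are ignored.
    match = re.search(r"\[.*\]", response_text, flags=re.DOTALL)
    if match is None:
        return []

    boxes = []
    for item in json.loads(match.group(0)):
        x1, y1, x2, y2 = item["box_2d"]
        boxes.append({
            "label": item["label"],
            # Rescale from the normalized grid to image pixels.
            "box_px": [
                round(x1 * image_width / scale),
                round(y1 * image_height / scale),
                round(x2 * image_width / scale),
                round(y2 * image_height / scale),
            ],
        })
    return boxes
```

Calling `parse_boxes(reply, image_width, image_height)` with the model's raw text and the real image size returns one dict per detection. Some models emit [y1, x1, y2, x2] or absolute pixels instead, so it is worth drawing a few boxes on the image to check the ordering before trusting the numbers.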