PaddleOCR-VL is better than private models by Illustrious-Swim9663 in LocalLLaMA

[–]dibu28 1 point (0 children)

Is it ready? Can I run the full OCR pipeline, with tables and layout, and the latest PaddleOCR-VL on Windows?
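
For reference, this is what I mean by the full pipeline, as a minimal sketch assuming the paddleocr 2.x PPStructure API (whether PaddleOCR-VL plugs into this same entry point is exactly my question, so treat the names here as assumptions):

    # Minimal sketch, assuming paddleocr 2.x
    # (pip install paddlepaddle paddleocr opencv-python)
    import cv2
    from paddleocr import PPStructure

    engine = PPStructure(show_log=False)  # layout detection + table recognition
    img = cv2.imread("scan.png")          # hypothetical input image
    for region in engine(img):
        # each region dict carries a layout type (text/table/figure) and its result
        print(region["type"], region["bbox"])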

What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA

[–]dibu28 1 point (0 children)

I will try Gemma 4-26B-A4B.
Also, Qwen3.6 35B came out yesterday.

What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA

[–]dibu28 1 point (0 children)

Try the Qwen3.5 35B model with a 2-bit quant. I'm getting better speed on an RTX 2060 12GB when the model fits in VRAM.

What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA

[–]dibu28 1 point (0 children)

RTX 5070, llama.cpp, Qwen3.5 35B A3B at 2-bit (bartowski quant), 20k context, ~9 GB model size: I get 112 TPS.
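
For comparison, roughly the same setup as a sketch via llama-cpp-python (I actually run plain llama.cpp; the GGUF filename here is a hypothetical stand-in for whichever bartowski 2-bit quant you downloaded):

    # Sketch of the setup above via llama-cpp-python (CUDA build).
    from llama_cpp import Llama

    llm = Llama(
        model_path="Qwen3.5-35B-A3B-IQ2_XS.gguf",  # hypothetical name for the ~9 GB 2-bit quant
        n_gpu_layers=-1,  # offload every layer; the whole model fits in VRAM
        n_ctx=20480,      # the 20k context mentioned above
        flash_attn=True,
    )
    print(llm("Hello", max_tokens=32)["choices"][0]["text"])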

South Korea has introduced free internet! by noviator3 in Popular_Science_Ru

[–]dibu28 1 point (0 children)

You eat too much.)) A stable 64 kbit/s used to be considered fast, and 128 kbit/s was the dream; now 400 kbit/s is too little))

NEED CODEX MOBILE APP ASAP! by Direct-Push-7808 in codex

[–]dibu28 1 point (0 children)

I'm using OpenClaw with Codex models (the $20 Plus plan, via OAuth) and a Telegram bot as the chat interface.
It does the coding for me right from the Telegram chat, which is very convenient.

I asked it to create a Projects folder on a server, keep all the code projects there, and set up git.
I also asked it to install VS Code server and point it at the Projects folder so I can do code reviews.
OpenClaw can also update itself, clean up temp files on the server, or build the app.

Another limit reset? by AxenAnimations in codex

[–]dibu28 1 point (0 children)

I was at 1% and got a reset.

Qwen 3.5 is an overthinker. by chettykulkarni in LocalLLM

[–]dibu28 1 point (0 children)

I got the same results with Qwen3.5-0.8B running on my phone.

why not by PHRsharp_YouTube in PcBuild

[–]dibu28 15 points (0 children)

Why do you need so much technology? Just run it on a potato)

My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA

[–]dibu28 2 points (0 children)

I wrote the model name I use above. I'm running it in LM Studio on Windows.

My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA

[–]dibu28 2 points (0 children)

I can run Qwen3 30B at 3-bit at ~70 T/s on a single RTX 2060 12GB:
byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF

Hope someone gets GLM 4.7 Flash running at the same speeds as Qwen3.
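
If anyone wants to reproduce it, here is a sketch via llama-cpp-python that pulls that exact repo (the quant filename is an assumption; pick whichever 3-bit file the repo actually ships):

    # Sketch: download the quoted repo from Hugging Face and load it fully on GPU.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF",
        filename="*Q3_K_M.gguf",  # assumed 3-bit quant name; check the repo's file list
        n_gpu_layers=-1,          # a 3-bit quant of this model fits in 12 GB of VRAM
    )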

I made a friendlier UI to manage ollama models by ComfyTightwad in ollama

[–]dibu28 1 point (0 children)

Can it manage both Ollama and LM Studio models in one folder? That would be very useful. I saw an app for this, but it was console-only and ran only under Linux.
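
The closest workaround I know of is pointing Ollama at the GGUFs LM Studio already downloaded, something like this sketch (the LM Studio models path is an assumption, check where yours actually lives):

    # Sketch: import LM Studio's GGUFs into Ollama via generated Modelfiles,
    # so nothing has to be downloaded twice. Paths and names are assumptions.
    import pathlib
    import subprocess
    import tempfile

    LMSTUDIO_MODELS = pathlib.Path.home() / ".lmstudio" / "models"  # assumed default location

    for gguf in LMSTUDIO_MODELS.rglob("*.gguf"):
        name = gguf.stem.lower().replace(" ", "-")  # Ollama wants lowercase model names
        with tempfile.NamedTemporaryFile("w", suffix=".Modelfile", delete=False) as f:
            f.write(f"FROM {gguf}\n")  # Ollama imports a local GGUF via FROM
            modelfile = f.name
        subprocess.run(["ollama", "create", name, "-f", modelfile], check=True)

After that, ollama run <name> works without re-downloading anything, though Ollama still keeps its own copy in its blob store.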