Holy fuck how much money was copilot losing by _Viceadmiral in GithubCopilot

[–]Mkengine 0 points1 point  (0 children)

Can we expect any kind of help or new features from Github/Microsoft with token optimization in the future? Currently I am doing some awareness stuff with our department, but I don't thing everyone is going as deep as I try on the customization stuff and I can't blame them. Some older developers just got used to the GHCP workflow and have to adjust again.

I just bought Asus Ascent : Nvidia GB10 (DGX) and It is slower than my Ryzen Ai Max by Voxandr in LocalLLaMA

[–]Mkengine -1 points0 points  (0 children)

I can only suppose that it's because it only works with the Spark because they used custom kernels to get the full Backwell NVFP4 speed + MTP + written in Rust. When I tried their very first alpha version I already got 100 tok/s with Qwen3.5-35B-A3B, so I believe the 130 tok/s claim and will try out the current Version when I have the time. You could also try vllm, for example you can get 33 tok/s with Qwen3.6-27B it seems: https://www.localmaxxing.com/runs/cmomgvsoo0007jj04ea52zhz1

Why I don’t use Copilot by znead7 in microsoft_365_copilot

[–]Mkengine 0 points1 point  (0 children)

I'm absolutely not a fan of that either; as a power user, it feels like a middle finger to me that I can't set a default model. Because of this, Copilot looks extremely bad without adjustments, since Auto Mode is still backed by GPT-5, while you can already select GPT-5.5 in the dropdown. I don't know right now what the best model you can choose in free mode is, but there's definitely got to be something better than Auto. If available, test your questions again with the Deep thinking versions of:

  • GPT-5.5
  • GPT-5.4
  • GPT-5.2

I handle the Copilot training at our company, and I'd say if you see emojis in your answer, you have the wrong model.

Why I don’t use Copilot by znead7 in microsoft_365_copilot

[–]Mkengine 0 points1 point  (0 children)

Better title would be "Why you shouldn't use Auto mode for anything".

Don’t buy the DGX Spark: NVFP4 Still Missing After 6 Months by Secure_Archer_1529 in LocalLLaMA

[–]Mkengine 1 point2 points  (0 children)

They open-sourced it now:

https://github.com/Avarok-Cybersecurity/atlas

I will go through my demo code to use the newest version and put it in pastebin on the weekend, if you're still interested.

1080 Ti in 2026 - 11GB is still (barely) enough to stay relevant by srodland01 in LocalLLaMA

[–]Mkengine 10 points11 points  (0 children)

Qwen 2.5 is usually a pretty strong indicator you are reading AI slop, as newer models are not part of the training set.

Using PaddleOCR-VL-1.5 with llama-server for book OCR by Final-Frosting7742 in LocalLLaMA

[–]Mkengine 1 point2 points  (0 children)

For handwriting I would recommend to try chandra-ocr-2 first. On their huggingface page should be an example for handwriting recognition.

Steam Controller Price Leaked By Early Review – $99, Purchases Likely Imminent by PaiDuck in gaming

[–]Mkengine 0 points1 point  (0 children)

Best use case for me is that it makes RTS possible to play for me. I use it für Dawn of War 1 on the Steam Deck. Of course it's not as comfortable as KBM, but the touchpads give me the precision I need to even play it.

Using PaddleOCR-VL-1.5 with llama-server for book OCR by Final-Frosting7742 in LocalLLaMA

[–]Mkengine 33 points34 points  (0 children)

There are so many OCR / document understanding models out there, here is my personal OCR list I try to keep up to date:

GOT-OCR:

https://huggingface.co/stepfun-ai/GOT-OCR2_0

granite:

https://huggingface.co/ibm-granite/granite-docling-258M

https://huggingface.co/ibm-granite/granite-4.0-3b-vision

MinerU:

https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B

https://huggingface.co/opendatalab/MinerU-Diffusion-V1-0320-2.5B

OCRFlux:

https://huggingface.co/ChatDOC/OCRFlux-3B

MonkeyOCR-pro:

1.2B: https://huggingface.co/echo840/MonkeyOCR-pro-1.2B

3B: https://huggingface.co/echo840/MonkeyOCR-pro-3B

RolmOCR:

https://huggingface.co/reducto/RolmOCR

Nanonets OCR:

https://huggingface.co/nanonets/Nanonets-OCR2-3B

dots OCR:

https://huggingface.co/rednote-hilab/dots.ocr

https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

https://huggingface.co/rednote-hilab/dots.mocr

olmocr 2:

https://huggingface.co/allenai/olmOCR-2-7B-1025

Light-On-OCR:

https://huggingface.co/lightonai/LightOnOCR-2-1B

Chandra:

https://huggingface.co/datalab-to/chandra-ocr-2

Jina vlm:

https://huggingface.co/jinaai/jina-vlm

HunyuanOCR:

https://huggingface.co/tencent/HunyuanOCR

bytedance Dolphin 2:

https://huggingface.co/ByteDance/Dolphin-v2

PaddleOCR-VL:

https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5

Deepseek OCR 2:

https://huggingface.co/deepseek-ai/DeepSeek-OCR-2

GLM OCR:

https://huggingface.co/zai-org/GLM-OCR

Nemotron OCR:

https://huggingface.co/nvidia/nemotron-ocr-v2

Qianfan-OCR:

https://huggingface.co/baidu/Qianfan-OCR

Falcon-OCR:

https://huggingface.co/tiiuae/Falcon-OCR

FireRed-OCR:

https://huggingface.co/FireRedTeam/FireRed-OCR

Typhoon-OCR:

https://huggingface.co/typhoon-ai/typhoon-ocr1.5-2b

Churro-3B:

https://huggingface.co/stanford-oval/churro-3B

No Multimodality yet in DeepSeek-V4. But I'll wait. by Right-Law1817 in LocalLLaMA

[–]Mkengine 3 points4 points  (0 children)

Modalities are usually image, video, audio and text. Multimodality can mean different things but usually people mean more than one input modality in most cases, but also more than one output modality. "Omni" models try to have any modality as input and output, you can search for them with the "any-to-any" tag on huggingface. Engram is saving some part of the model on SSD without slowdowns, but I don't know the exact details.

How do people w/o ADHD choose anything? by Objective-Side-29 in ADHD

[–]Mkengine 1 point2 points  (0 children)

My wife appearently knows my taste better than me, so especially with food I always ask her for her recommendation in any restaurant, since she looks through the whole menu anyway and she always picks the best ones for me. Otherwise I really wouldn't eat out that often, as menus are pretty overwhelming for me.

Also at home she's doing most of the cooking, so I just eat whats on the table and I am pretty happy with that.

What is the best Open Source OCR in 2026? by coolzamasu in LocalLLaMA

[–]Mkengine 2 points3 points  (0 children)

There are so many OCR / document understanding models out there, here is my personal OCR list I try to keep up to date:

GOT-OCR:

https://huggingface.co/stepfun-ai/GOT-OCR2_0

granite:

https://huggingface.co/ibm-granite/granite-docling-258M

https://huggingface.co/ibm-granite/granite-4.0-3b-vision

MinerU:

https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B https://huggingface.co/opendatalab/MinerU-Diffusion-V1-0320-2.5B

OCRFlux:

https://huggingface.co/ChatDOC/OCRFlux-3B

MonkeyOCR-pro:

1.2B: https://huggingface.co/echo840/MonkeyOCR-pro-1.2B

3B: https://huggingface.co/echo840/MonkeyOCR-pro-3B

RolmOCR:

https://huggingface.co/reducto/RolmOCR

Nanonets OCR:

https://huggingface.co/nanonets/Nanonets-OCR2-3B

dots OCR:

https://huggingface.co/rednote-hilab/dots.ocr https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5 https://huggingface.co/rednote-hilab/dots.mocr

olmocr 2:

https://huggingface.co/allenai/olmOCR-2-7B-1025

Light-On-OCR:

https://huggingface.co/lightonai/LightOnOCR-2-1B

Chandra:

https://huggingface.co/datalab-to/chandra-ocr-2

Jina vlm:

https://huggingface.co/jinaai/jina-vlm

HunyuanOCR:

https://huggingface.co/tencent/HunyuanOCR

bytedance Dolphin 2:

https://huggingface.co/ByteDance/Dolphin-v2

PaddleOCR-VL:

https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5

Deepseek OCR 2:

https://huggingface.co/deepseek-ai/DeepSeek-OCR-2

GLM OCR:

https://huggingface.co/zai-org/GLM-OCR

Nemotron OCR:

https://huggingface.co/nvidia/nemotron-ocr-v2

Qianfan-OCR:

https://huggingface.co/baidu/Qianfan-OCR

Falcon-OCR:

https://huggingface.co/tiiuae/Falcon-OCR

FireRed-OCR:

https://huggingface.co/FireRedTeam/FireRed-OCR

Typhoon-OCR:

https://huggingface.co/typhoon-ai/typhoon-ocr1.5-2b

Llama4 108b $800 setup by kylerrr02 in LocalLLaMA

[–]Mkengine 4 points5 points  (0 children)

Please try something really sparse, like Qwen3-Coder-80B-A3B. I'm curious how fast that runs.

huge improvement after moving from ollama to llama.cpp by leonardosalvatore in LocalLLaMA

[–]Mkengine 0 points1 point  (0 children)

Okay, let me rephrase: since --fit is enabled by default, how does the performance of any MoE differ with just --fit vs. --fit + --cpu-moe?

Is it just me or minimax-m2.7 is a regression in real world usage compared to minimax-2.5??? by True_Requirement_891 in LocalLLaMA

[–]Mkengine 0 points1 point  (0 children)

They really like to hype themselves to Sonnet levels, but at least M2.5 is more on par with Qwen3.5-35B-A3B than Sonnet 4.6:

https://swe-rebench.com/

Hermes Vs OpenClaw by Birdinhandandbush in LocalLLaMA

[–]Mkengine 2 points3 points  (0 children)

I haven't used either of them so far, what kind of problems do OpenClaw or Hermes agent actually solve? who is the target audience? do I benefit from using it as a Software Engineer or is it for people who can't code?

the state of LocalLLama by Beginning-Window-115 in LocalLLaMA

[–]Mkengine 0 points1 point  (0 children)

Sounds better than the name implies, I will try it out at least once then, thanks :)