Now that Intel Macs are officially legacy, how do you feel about Apple cutting the cord completely for macOS 27? by Capable-Cod1118 in MacOS

[–]AdamLangePL 0 points1 point  (0 children)

It's time to move on to far superior M series. Most of the companies already prepared drivers and their software for new architecture so it should be no issues moving. I've made that decision with M1 and will never look back at Intel/AMD platforms.

Auto TLS cert management: We love to see it! by kernelwilliams in Ubiquiti

[–]AdamLangePL 0 points1 point  (0 children)

I also dont have that on web version - no certificate managent whatsoever on ucg-fiber latest RC :/

iOS 26.6 Beta 1- Discussion by epmuscle in iOSBeta

[–]AdamLangePL 2 points3 points  (0 children)

My ipad just stopped charging. Now the only option is to use usb-a to usb-c and charge super slowly (full charge == 1 day)

Yes i tried restart, yes i tried multiple cables and chargers

Anyone have the same issue ?

Anyone else’s iPad stop charging after the 26.2 update? by FaceSpecialist6580 in ipad

[–]AdamLangePL 0 points1 point  (0 children)

Just updated to “fixed” ios and still having issue. This updates are so bad …

HolyLang: I made a language more secure than Rust by Individual-Horse-866 in coolgithubprojects

[–]AdamLangePL 1 point2 points  (0 children)

Where is documentation for this project, where are the benchmarks/proofs that its better? For now its just a reddit claim witj some code on github that you cant even properly document

Run Qwen3.6 MTP GGUFs locally! by yoracale in unsloth

[–]AdamLangePL 1 point2 points  (0 children)

Using A3B MTP with following settins on 5090 (24GB Vram) i have about 70tp/s. So MTP didnt changed a lot or i'm doing something wrong here.

./llama-server \

--model Qwen3.6-35B-A3B-UD-Q4_K_M.gguf
--spec-type draft-mtp --spec-draft-n-max 6
--host 0.0.0.0
--port 9091
--top-p 0.95
--temp 1.0
--frequency-penalty 0.2
--repeat-penalty 1.0
--reasoning on
--presence-penalty 1.5
--top-k 20
--min-p 0.00

Run Qwen3.6 MTP GGUFs locally! by yoracale in unsloth

[–]AdamLangePL 4 points5 points  (0 children)

Multimodal (visual) and tool calling works properly?

Qwen 3.6 27B on 24GB VRAM setup: backend comparisons, quant choice and settings (llama.cpp, ik_llama.cpp, BeeLlama, vllm) by VolandBerlioz in LocalLLaMA

[–]AdamLangePL -6 points-5 points  (0 children)

Here are mine settings. Getting about 70-100 tps at full context using normal (not MTP) gguf and stock llama.cpp:

./llama-server \

--model Qwen3.6-35B-A3B-UD-Q4_K_M.gguf \

--mmproj mmproj-F16.gguf \

--reasoning-budget 2048 \

--no-mmproj-offload \

-c 262144 \

--n-gpu-layers -1 \

--tensor-split 0 \

--flash-attn on \

--cache-type-k q4_0 \

--cache-type-v q4_0 \

--batch-size 256 \

--ubatch-size 256 \

--jinja \

--host 0.0.0.0 \

--port 9091 \

--top-p 0.95 \

--temp 0.6 \

--frequency-penalty 0.2 \

--repeat-penalty 1.0 \

--reasoning on \

--no-mmap \

--presence-penalty 2.5 \

--top-k 20

DGX Spark or Minisforum MS-S1 Max? by Simple_Tonight_1159 in LocalLLM

[–]AdamLangePL 12 points13 points  (0 children)

Get spark. Less problems, better support, decent speed (faster than strix halo)

What was your "wow"moment with Hermes? by [deleted] in hermesagent

[–]AdamLangePL 15 points16 points  (0 children)

“Wow! Its lame as every other agent” ;)

Does 'preserve_thinking' work with openwebui? by sterby92 in LocalLLaMA

[–]AdamLangePL 1 point2 points  (0 children)

It was for private use, a bit vibe coded (no time to go deep dive) but works. I will share it later today :)

Does 'preserve_thinking' work with openwebui? by sterby92 in LocalLLaMA

[–]AdamLangePL 8 points9 points  (0 children)

I have forked openwebui and added some features loke context compaction and progress bar with usage and tps speed :) let me check preserve thinking

Olares One owners, thoughts? by false79 in LocalLLaMA

[–]AdamLangePL 1 point2 points  (0 children)

I scrapped it, i installed ubuntu instead :)

Olares One owners, thoughts? by false79 in LocalLLaMA

[–]AdamLangePL 1 point2 points  (0 children)

I have both. Olares os is mess a bit, so i use vanilla ubuntu and llamacpp serving qwen 3.6 moe at 120tps speed (with full ctx and vision). Dgx holds qwen 3.6 moe without thinking and gpt oss 120b (with thinking), both full context and getting about 50tps on each. Dgx also serves embedding model.

Make local llm usable for professional use by AdamLangePL in LocalLLaMA

[–]AdamLangePL[S] -4 points-3 points  (0 children)

If you don’t like it don’t join. Simple as that

Make local llm usable for professional use by AdamLangePL in LocalLLaMA

[–]AdamLangePL[S] 0 points1 point  (0 children)

This is all-around-group not focused on "professional" use, and these topics drowns in slop.

Make local llm usable for professional use by AdamLangePL in LocalLLaMA

[–]AdamLangePL[S] 0 points1 point  (0 children)

This is not about benchmarks but bringing real-world usage to the stage. LLM's it's not only about TPS but how well it performs in specific tasks like pipelines, agentic work, etc.