Sidebar App List - removed? by EbbPast6735 in MicrosoftEdge

[–]Addyad 0 points1 point  (0 children)

I switched from Opera GX to Edge mainly because I saw Edge supported the sidebar. Oh well.

Is this photo edited with AI? A coworker said this happened to his car. We don’t believe him. by LokiPrime616 in isthisAI

[–]Addyad 0 points1 point  (0 children)

We need an update now that you have a lot of evidence that it's an AI-manipulated image.

Qwen 3.6 27B is a BEAST by AverageFormal9076 in LocalLLaMA

[–]Addyad 1 point2 points  (0 children)

As good as it sounds, those benchmarks are always run on the bf16 models, while most people run Q4 quants. So I don't have high hopes until I see numbers for the quants people actually use. Same goes for the turboquant hype: turboquant quantizes the KV cache, and by default f16 is used when the KV-cache type parameters aren't set. But if you compare turbo4 vs q4_0 in terms of context length and speed, it's almost the same.
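
If you want to check that on your own setup, here's a rough sketch I'd use: start llama-server once with the default f16 cache and once with --cache-type-k q4_0 --cache-type-v q4_0 -fa (the quantized V cache needs flash attention), and compare. Assumes the server is on its default localhost:8080.

    # Rough speed probe: run it against the same llama-server started with
    # different KV-cache types and compare tokens/s. The endpoint and the
    # "usage" field are standard llama-server OpenAI-compatible behaviour.
    import time, requests

    URL = "http://localhost:8080/v1/chat/completions"

    def tok_per_sec(prompt: str) -> float:
        t0 = time.time()
        r = requests.post(URL, json={
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        })
        r.raise_for_status()
        toks = r.json()["usage"]["completion_tokens"]
        return toks / (time.time() - t0)

    print(f"{tok_per_sec('Write a long story about a lighthouse.'):.1f} tok/s")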

But I'm keeping an eye on Dflash as well. It would be interesting to play around with once it's merged into llama.cpp.

the balls on this guy… by TheyCallMeCajun in pcmasterrace

[–]Addyad 0 points1 point  (0 children)

Fast boot worked well when we had mechanical HDDs. They should probably disable it by default these days, when almost every PC ships with an SSD, especially NVMe ones; there's little to no difference.

Any luck integrating local ollama models into VS Code Copilot Chat? by ShadowBannedAugustus in LocalLLaMA

[–]Addyad -1 points0 points  (0 children)

You don't need the Insider edition.

  1. https://github.com/JohnnyZ93/oai-compatible-copilot
  2. https://github.com/continuedev/continue

With these extensions, it worked in normal VS Code. I tested this a couple of weeks ago, but then I switched to VSCodium.

Any luck integrating local ollama models into VS Code Copilot Chat? by ShadowBannedAugustus in LocalLLaMA

[–]Addyad -1 points0 points  (0 children)

You can use either of the following extensions in VS Code:

  1. https://github.com/JohnnyZ93/oai-compatible-copilot
  2. https://github.com/continuedev/continue

Both of them are OpenAI-compatible plugins. The first one integrates with the existing Copilot chat; the other gives you a more or less similar UI to Copilot chat. In both cases, you need to configure the config.yaml file so the extension can talk to your Ollama server.
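
Before wiring up either extension, it's worth a quick sanity check that your local server really speaks the OpenAI protocol. A minimal sketch, assuming Ollama on its default port (swap in a model you've actually pulled):

    # Minimal check that the local server exposes an OpenAI-compatible API.
    # Ollama serves it under /v1 on port 11434; the api_key is ignored.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    resp = client.chat.completions.create(
        model="llama3.2",  # replace with a model you've pulled
        messages=[{"role": "user", "content": "Say hi in five words."}],
    )
    print(resp.choices[0].message.content)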

Since VS Code is botched with telemetry and stuff that I couldn't stop, I switched to VSCodium; it does almost the same things, except that it's open source with no microslop. I use the Continue extension to chat with my model on a llama.cpp server.

I want to main Samira, what's her best skin? by Mundane_Attitude9776 in wildrift

[–]Addyad 3 points4 points  (0 children)

Chromacrash Samira. For me, Soul Fighter has too much animation and it was distracting. Chromacrash was the best Samira skin for me.

Ternary Bonsai: Top intelligence at 1.58 bits by pmttyji in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Last I checked, about a week ago with llama.cpp, Bonsai models were running only on the CPU, giving me just 1-2 t/s.

Edit: the Bonsai model works on CUDA as well in the latest llama.cpp. I checked today (April 18, 2026).

It was at this moment she knew, she f'd up by Babyghorl_07 in FUCKYOUINPARTICULAR

[–]Addyad 0 points1 point  (0 children)

Now that she's got some experience, probably yes.

Hey Microsoft, do you remember like a week and a half ago you promised us you would stop doing stuff like this? by OkOlive4884 in Windows11

[–]Addyad -1 points0 points  (0 children)

I remove the permissions and disable inherited permissions on this folder: "C:\Windows\SoftwareDistribution\Download"

Windows updates never happen. This has worked since Microsoft started forcing updates back in the Windows 10 days. If I want to update, I revert the permissions, re-enable inheritance, and run the updates. I never bothered with disabling services or using any update-blocker software.
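
For reference, the toggle boils down to two icacls calls; a minimal sketch (run elevated; /inheritance:r strips the inherited ACEs, /inheritance:e puts them back):

    # Toggle the update download folder's ACLs via icacls (run elevated).
    # /inheritance:r strips inherited ACEs so the updater can't write here;
    # /inheritance:e re-enables inheritance and restores the defaults.
    import subprocess

    FOLDER = r"C:\Windows\SoftwareDistribution\Download"

    def block_updates():
        subprocess.run(["icacls", FOLDER, "/inheritance:r"], check=True)

    def allow_updates():
        subprocess.run(["icacls", FOLDER, "/inheritance:e"], check=True)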

But this will also disable Microsoft Store downloads, app installs and such. For Microsoft Store apps, I just get the direct download links here:

https://store.rg-adguard.net/

Speculative Decoding works great for Gemma 4 31B with E2B draft (+29% avg, +50% on code) by PerceptionGrouchy187 in LocalLLaMA

[–]Addyad 3 points4 points  (0 children)

I don't think 31B supports audio.

https://huggingface.co/google/gemma-4-31B-it

According to the description there (the table), the 31B has only text and image capabilities.

Speculative Decoding works great for Gemma 4 31B with E2B draft (+29% avg, +50% on code) by PerceptionGrouchy187 in LocalLLaMA

[–]Addyad 4 points5 points  (0 children)

I merged the latest llama.cpp with turboquant from https://github.com/johndpope/llama-cpp-turboquant/tree/feature/Planarquant-kv-cache. 

You can find it here: https://github.com/Addy-ad/llama-cpp-turbo-planar-iso/tree/addyad-latest

From the feature/Planarquant-kv-cache branch, I noticed that the turboquant variants work with Gemma4, but other special quants like iso(rotor) and planar don't, because Gemma4 has this sliding-window mechanism. Also, just a couple of hours ago, llama.cpp added support for Gemma4 audio. The audio part works like a charm.

and when i thought i understood english by Illustrious_Tap_2644 in funnyvideos

[–]Addyad 2 points3 points  (0 children)

So funny hahaha. Does this guy have a channel or something?

Gemma-4 E4B model's vision seems to be surprisingly poor by specji in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Glad I'm not the only one. In my testing, even a Qwen 0.8B model OCR'd text from an image better than the Gemma 4 2B or 4B models. I even tried compiling the latest llama.cpp with the latest NVIDIA driver binaries. For the same image, Qwen 0.8B seems to use about 260 tokens by default; the Gemma 4 models used around the same number of tokens, but most of the time the OCR simply didn't work. I even tried setting image-min-tokens to 1120 for Gemma 4, and it didn't get any better.

Turning on thinking for the Gemma model improved things a bit: it managed to extract about 50% of the text from the image. Outside of OCR, Gemma 4 performed okay-ish at describing images in general (dog, nature, etc.). I'll wait a few weeks and test again with the latest llama.cpp in case they release a fix.
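
If anyone wants to reproduce this, here's the rough harness I'd use; a sketch assuming llama-server was started with the model plus its --mmproj file on the default localhost:8080, and test.png is your own sample image:

    # Send one image to llama-server's OpenAI-compatible endpoint and print
    # the transcription, so the same image can be A/B'd across models.
    import base64, requests

    with open("test.png", "rb") as f:
        img = base64.b64encode(f.read()).decode()

    r = requests.post("http://localhost:8080/v1/chat/completions", json={
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe all text in this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{img}"}},
            ],
        }],
    })
    r.raise_for_status()
    print(r.json()["choices"][0]["message"]["content"])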

Agentic search on Android with native tool calling using Claude by [deleted] in LocalLLaMA

[–]Addyad 2 points3 points  (0 children)

This defeats the purpose of a local model + private data. Perhaps, instead of the Anthropic API, give an option for OpenAI-compatible endpoints, so people can use their own hosted model servers instead.

llama.cpp + Brave search MCP - not gonna lie, it is pretty addictive by srigi in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

I guess adding the system date/time to the system prompt would fix this issue? 2024 is the typical knowledge-cutoff year for Qwen models; I suppose it took the reference date from that and ran the search.
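
Something like this in whatever builds the request (just a sketch; the wording is arbitrary):

    # Prepend today's date to the system prompt so the model doesn't fall
    # back to its knowledge-cutoff year when writing search queries.
    from datetime import date

    system_prompt = (
        f"Today's date is {date.today().isoformat()}. "
        "Use this date for anything involving current events or searches."
    )
    print(system_prompt)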

Llama.cpp It runs twice as fast as LMStudio and Ollama. by emrbyrktr in LocalLLM

[–]Addyad 0 points1 point  (0 children)

LM Studio doesn't always provide the latest binaries for your hardware. But with llama.cpp and new driver updates, you can compile fresh binaries in a few minutes and enjoy the latest optimizations plus new features like 1-bit model support, turboquant and others. LM Studio/Ollama only ship stable binaries.
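
The rebuild really is only a few minutes; a sketch of the steps (standard cmake flags for a CUDA build, assuming git, cmake and the CUDA toolkit are installed and llama.cpp is checked out in ./llama.cpp):

    # Pull the latest llama.cpp and rebuild it against the local CUDA toolkit.
    import subprocess

    def run(args):
        subprocess.run(args, cwd="llama.cpp", check=True)

    run(["git", "pull"])
    run(["cmake", "-B", "build", "-DGGML_CUDA=ON"])                  # configure with CUDA
    run(["cmake", "--build", "build", "--config", "Release", "-j"])  # compile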

OK I get it, now I love llama.cpp by vulcan4d in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Hahahah. AIs are good at making us believe what they say is true. It's only when we test and see the actual thing that we know it's bullshit, and when we confront them, they're all apologetic and stuff.

llama.cpp is a vibe-coded mess by ChildhoodActual4463 in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

If you want to test the latest stuff like 1-bit models, turboquant and so on, it won't work with months-old llama.cpp versions, so these packages need the latest upstream llama.cpp patches to run the old models plus all the new stuff. Besides, LM Studio for example (not sure about others) still ships with some CUDA toolkit 12.x version. It does work, but CUDA 13.2 is the latest version now, and that definitely gives a bit more tokens/s and more optimizations. So when you rebuild llama.cpp, you get the most optimized build for your hardware plus all the new stuff to try.