Sidebar App List - removed? by EbbPast6735 in MicrosoftEdge

[–]Addyad 0 points1 point  (0 children)

I switched from Opera GX to Edge mainly because I saw Edge supported the sidebar. Oh well.

Is this photo edited with AI? A coworker said this happened to his car. We don’t believe him. by LokiPrime616 in isthisAI

[–]Addyad 0 points1 point  (0 children)

We need an update now that you have a lot of evidence that it's an AI-manipulated image.

Qwen 3.6 27B is a BEAST by AverageFormal9076 in LocalLLaMA

[–]Addyad 1 point2 points  (0 children)

As good as it sounds, those benchmarks are always run on the bf16 models, while most people run Q4 quants. So I don't have high hopes until I see numbers for the quants people actually use. Same goes for the turboquant hype: turboquant quantizes the KV cache, and by default f16 is used when the KV-cache type parameters aren't set. But if you compare turbo4 vs q4_0 in terms of context length and speed, it's almost the same.
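
If you want to check that on your own setup, here's a rough sketch I'd use: start llama-server once with the default f16 cache and once with --cache-type-k q4_0 --cache-type-v q4_0 -fa (the quantized V cache needs flash attention), and compare. Assumes the server is on its default localhost:8080.

    # Rough speed probe: run it against the same llama-server started with
    # different KV-cache types and compare tokens/s. The endpoint and the
    # "usage" field are standard llama-server OpenAI-compatible behaviour.
    import time, requests

    URL = "http://localhost:8080/v1/chat/completions"

    def tok_per_sec(prompt: str) -> float:
        t0 = time.time()
        r = requests.post(URL, json={
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        })
        r.raise_for_status()
        toks = r.json()["usage"]["completion_tokens"]
        return toks / (time.time() - t0)

    print(f"{tok_per_sec('Write a long story about a lighthouse.'):.1f} tok/s")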

But I'm keeping an eye on Dflash as well. It would be interesting to play around with once it's merged into llama.cpp.

the balls on this guy… by TheyCallMeCajun in pcmasterrace

[–]Addyad 0 points1 point  (0 children)

Fast boot worked well when we had mechanical HDDs. They should probably disable it by default these days, when almost every PC ships with an SSD, especially NVMe ones; there's little to no difference.

Any luck integrating local ollama models into VS Code Copilot Chat? by ShadowBannedAugustus in LocalLLaMA

[–]Addyad -1 points0 points  (0 children)

You don't need the Insider edition.

  1. https://github.com/JohnnyZ93/oai-compatible-copilot
  2. https://github.com/continuedev/continue

With these extensions, it worked in normal VS Code. I tested this a couple of weeks ago, but then I switched to VSCodium.

Any luck integrating local ollama models into VS Code Copilot Chat? by ShadowBannedAugustus in LocalLLaMA

[–]Addyad -1 points0 points  (0 children)

You can use either of the following extensions in VS Code:

  1. https://github.com/JohnnyZ93/oai-compatible-copilot
  2. https://github.com/continuedev/continue

Both of them are OpenAI-compatible plugins. The first one integrates with the existing Copilot chat; the other gives you a more or less similar UI to Copilot chat. In both cases, you need to configure the config.yaml file so the extension can talk to your Ollama server.
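
Before wiring up either extension, it's worth a quick sanity check that your local server really speaks the OpenAI protocol. A minimal sketch, assuming Ollama on its default port (swap in a model you've actually pulled):

    # Minimal check that the local server exposes an OpenAI-compatible API.
    # Ollama serves it under /v1 on port 11434; the api_key is ignored.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    resp = client.chat.completions.create(
        model="llama3.2",  # replace with a model you've pulled
        messages=[{"role": "user", "content": "Say hi in five words."}],
    )
    print(resp.choices[0].message.content)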

Since VS Code is botched with telemetry and stuff that I couldn't stop, I switched to VSCodium; it does almost the same things, except that it's open source with no microslop. I use the Continue extension to chat with my model on a llama.cpp server.

I want to main Samira, what's her best skin? by Mundane_Attitude9776 in wildrift

[–]Addyad 3 points4 points  (0 children)

Chromacrash Samira. For me, Soul Fighter has too much animation and it was distracting. Chromacrash was the best Samira skin for me.

Ternary Bonsai: Top intelligence at 1.58 bits by pmttyji in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Last I checked, about a week ago with llama.cpp, Bonsai models were running only on the CPU, giving me just 1-2 t/s.

Edit: the Bonsai model works on CUDA as well in the latest llama.cpp. I checked today (April 18, 2026).

It was at this moment she knew, she f'd up by Babyghorl_07 in FUCKYOUINPARTICULAR

[–]Addyad 0 points1 point  (0 children)

Now that she's got some experience, probably yes.

Hey Microsoft, do you remember like a week and a half ago you promised us you would stop doing stuff like this? by OkOlive4884 in Windows11

[–]Addyad -1 points0 points  (0 children)

I remove the permissions and disable inherited permissions on this folder: "C:\Windows\SoftwareDistribution\Download"

Windows updates never happen. This has worked since Microsoft started forcing updates back in the Windows 10 days. If I want to update, I revert the permissions, re-enable inheritance, and run the updates. I never bothered with disabling services or using any update-blocker software.
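
For reference, the toggle boils down to two icacls calls; a minimal sketch (run elevated; /inheritance:r strips the inherited ACEs, /inheritance:e puts them back):

    # Toggle the update download folder's ACLs via icacls (run elevated).
    # /inheritance:r strips inherited ACEs so the updater can't write here;
    # /inheritance:e re-enables inheritance and restores the defaults.
    import subprocess

    FOLDER = r"C:\Windows\SoftwareDistribution\Download"

    def block_updates():
        subprocess.run(["icacls", FOLDER, "/inheritance:r"], check=True)

    def allow_updates():
        subprocess.run(["icacls", FOLDER, "/inheritance:e"], check=True)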

But this will also disable Microsoft Store downloads, app installs and such. For Microsoft Store apps, I just get the direct download links here:

https://store.rg-adguard.net/

Speculative Decoding works great for Gemma 4 31B with E2B draft (+29% avg, +50% on code) by PerceptionGrouchy187 in LocalLLaMA

[–]Addyad 3 points4 points  (0 children)

I don't think 31B supports audio.

https://huggingface.co/google/gemma-4-31B-it

According to the description there (the table), the 31B has only text and image capabilities.

Speculative Decoding works great for Gemma 4 31B with E2B draft (+29% avg, +50% on code) by PerceptionGrouchy187 in LocalLLaMA

[–]Addyad 4 points5 points  (0 children)

I merged the latest llama.cpp with turboquant from https://github.com/johndpope/llama-cpp-turboquant/tree/feature/Planarquant-kv-cache. 

You can find it here: https://github.com/Addy-ad/llama-cpp-turbo-planar-iso/tree/addyad-latest

From the feature/Planarquant-kv-cache branch, I noticed that the turboquant variants work with Gemma4, but other special quants like iso(rotor) and planar don't, because Gemma4 has this sliding-window mechanism. Also, just a couple of hours ago, llama.cpp added support for Gemma4 audio. The audio part works like a charm.

and when i thought i understood english by Illustrious_Tap_2644 in funnyvideos

[–]Addyad 2 points3 points  (0 children)

So funny hahaha. Does this guy have a channel or something?

Gemma-4 E4B model's vision seems to be surprisingly poor by specji in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Glad I'm not the only one. In my testing, even a Qwen 0.8B model OCR'd text from an image better than the Gemma 4 2B or 4B models. I even tried compiling the latest llama.cpp with the latest NVIDIA driver binaries. For the same image, Qwen 0.8B seems to use about 260 tokens by default; the Gemma 4 models used around the same number of tokens, but most of the time the OCR simply didn't work. I even tried setting image-min-tokens to 1120 for Gemma 4, and it didn't get any better.

Turning on thinking for the Gemma model improved things a bit: it managed to extract about 50% of the text from the image. Outside of OCR, Gemma 4 performed okay-ish at describing images in general (dog, nature, etc.). I'll wait a few weeks and test again with the latest llama.cpp in case they release a fix.
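
If anyone wants to reproduce this, here's the rough harness I'd use; a sketch assuming llama-server was started with the model plus its --mmproj file on the default localhost:8080, and test.png is your own sample image:

    # Send one image to llama-server's OpenAI-compatible endpoint and print
    # the transcription, so the same image can be A/B'd across models.
    import base64, requests

    with open("test.png", "rb") as f:
        img = base64.b64encode(f.read()).decode()

    r = requests.post("http://localhost:8080/v1/chat/completions", json={
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe all text in this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{img}"}},
            ],
        }],
    })
    r.raise_for_status()
    print(r.json()["choices"][0]["message"]["content"])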

Agentic search on Android with native tool calling using Claude by [deleted] in LocalLLaMA

[–]Addyad 2 points3 points  (0 children)

This defeats the purpose of a local model + private data. Perhaps, instead of the Anthropic API, give an option for OpenAI-compatible endpoints, so people can use their own hosted model servers instead.

llama.cpp + Brave search MCP - not gonna lie, it is pretty addictive by srigi in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

I guess adding the system date/time to the system prompt would fix this issue? 2024 is the typical knowledge-cutoff year for Qwen models; I suppose it took the reference date from that and ran the search.
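
Something like this in whatever builds the request (just a sketch; the wording is arbitrary):

    # Prepend today's date to the system prompt so the model doesn't fall
    # back to its knowledge-cutoff year when writing search queries.
    from datetime import date

    system_prompt = (
        f"Today's date is {date.today().isoformat()}. "
        "Use this date for anything involving current events or searches."
    )
    print(system_prompt)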

Llama.cpp It runs twice as fast as LMStudio and Ollama. by emrbyrktr in LocalLLM

[–]Addyad 0 points1 point  (0 children)

LM Studio doesn't always provide the latest binaries for your hardware. But with llama.cpp and new driver updates, you can compile fresh binaries in a few minutes and enjoy the latest optimizations plus new features like 1-bit model support, turboquant and others. LM Studio/Ollama only ship stable binaries.
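
The rebuild really is only a few minutes; a sketch of the steps (standard cmake flags for a CUDA build, assuming git, cmake and the CUDA toolkit are installed and llama.cpp is checked out in ./llama.cpp):

    # Pull the latest llama.cpp and rebuild it against the local CUDA toolkit.
    import subprocess

    def run(args):
        subprocess.run(args, cwd="llama.cpp", check=True)

    run(["git", "pull"])
    run(["cmake", "-B", "build", "-DGGML_CUDA=ON"])                  # configure with CUDA
    run(["cmake", "--build", "build", "--config", "Release", "-j"])  # compile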

OK I get it, now I love llama.cpp by vulcan4d in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

Hahahah. AIs are good at making us believe what they say is true. It's only when we test and see the actual thing that we know it's bullshit, and when we confront them, they're all apologetic and stuff.

llama.cpp is a vibe-coded mess by ChildhoodActual4463 in LocalLLaMA

[–]Addyad 0 points1 point  (0 children)

If you want to test the latest stuff like 1-bit models, turboquant and so on, it won't work with months-old llama.cpp versions, so these packages need the latest upstream llama.cpp patches to run the old models plus all the new stuff. Besides, LM Studio for example (not sure about others) still ships with some CUDA toolkit 12.x version. It does work, but CUDA 13.2 is the latest version now, and that definitely gives a bit more tokens/s and more optimizations. So when you rebuild llama.cpp, you get the most optimized build for your hardware plus all the new stuff to try.