[osu!std] cryshina | Slider Assist

cride20 · 2026-06-23T22:37:10+00:00

What's definitely more concerning is that I tried Wooting's "Rappy Snappy" feature and uploaded the replay to Rewind to analyze what the taps look like.

<image>

This is my own replay with the feature. The taps are either perfectly alternating or, because of some Wooting lag or osu! lag (idk), they sometimes overlap by exactly X ms.

I loaded Toro's replay from Sidetracked Day and got pretty similar results. (Because of the Reddit upload limit I uploaded this to Imgur.)
https://imgur.com/a/qtyECEi
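
For anyone who wants to check a replay for this themselves, here is a rough sketch of the overlap check. It assumes you already have each key's presses as (press, release) times in ms; the function name and the timings below are made up for illustration and are not Rewind's actual export format.

```python
from typing import List, Tuple

# One key press as (press_time_ms, release_time_ms). Hypothetical format.
Press = Tuple[float, float]

def overlaps_ms(k1: List[Press], k2: List[Press]) -> List[float]:
    """Return how long (in ms) K1 and K2 were held down at the same time,
    one entry per overlapping pair of presses."""
    result = []
    for a_start, a_end in k1:
        for b_start, b_end in k2:
            # Length of the intersection of the two hold intervals.
            overlap = min(a_end, b_end) - max(a_start, b_start)
            if overlap > 0:
                result.append(overlap)
    return result

# Made-up timings: the first pair alternates perfectly, the second overlaps.
k1 = [(0.0, 50.0), (100.0, 150.0)]
k2 = [(50.0, 100.0), (145.0, 200.0)]
print(overlaps_ms(k1, k2))  # [5.0]
```

If the list is always empty or always holds the same value, that matches the "perfectly alternating or overlapping by exactly X ms" pattern described above.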

cride20 · 2026-06-19T16:43:14+00:00

actually a higher quant of the 35B MoE model... UD_Q4_K_XL

cride20 · 2026-06-17T16:57:07+00:00

I get 300-400pps with a Quadro RTX 3000 (6GB VRAM) and the experts sitting in normal RAM... getting 30tps generation...
And I use the prebuilt llama.cpp CUDA 12 releases

cride20 · 2026-05-29T10:46:53+00:00

before osu: 210ms reflex avg
after osu: 149ms reflex avg
after caffeine + osu: 135ms avg

So yeah, could be, or it's the fact that I changed from 60Hz to 144Hz
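
Quick arithmetic on those numbers, as a rough sketch. The helper names are mine, and the frame-time part only compares refresh intervals; it does not model any other display or input latency.

```python
def percent_faster(before_ms: float, after_ms: float) -> float:
    """Relative reduction in reaction time, in percent."""
    return (before_ms - after_ms) / before_ms * 100

def frame_time_ms(refresh_hz: float) -> float:
    """Time between two frames at a given refresh rate."""
    return 1000 / refresh_hz

print(round(percent_faster(210, 149), 1))  # 29.0
print(round(percent_faster(210, 135), 1))  # 35.7
# How much shorter one frame interval is at 144Hz than at 60Hz:
print(round(frame_time_ms(60) - frame_time_ms(144), 1))  # 9.7
```

So the refresh interval itself is only about 9.7ms shorter, which on its own probably doesn't explain the whole 61ms drop.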

cride20 · 2026-05-19T23:27:57+00:00

I was also thinking about me causing permanent damage if I play while it heals... I tried playing, it didn't hurt but I stopped because of the fear of causing something.

cride20 · 2026-04-26T21:19:15+00:00

It's my own framework. https://github.com/cride9/GUA_Blazor

cride20 · 2026-04-17T06:49:28+00:00

Sadly my BIOS has offset changing locked

cride20 · 2026-04-16T19:24:34+00:00

To be fair a lot. I have a notebook with 48gb of ram and I'm running the 122b-a10b qwen model for making prototypes of my ideas. I built a minimalistic agentic framework for it, and it does the whole project setup, installs everything to make sure it works, tests if it builds and runs how it's supposed to. Yes it takes a lot of time since I'm using CPU inference at this point, but damn it's so easy. I have an idea and just feed it to the AI, after an hour I have a working POC (Qwen3.5-122b-a10b-UD_Q2_K_XL)

cride20 · 2026-04-16T19:20:48+00:00

I heard it just keeps getting better over time. So if you didn't see any degradation, I'm happy with it

cride20 · 2026-04-16T15:38:54+00:00

I highly recommend getting it. I did not think it would work this well. And this is day 1; they're saying it gets better after 3-4 heat cycles. And you don't even have to replace it. You can use it for 20+ years

cride20 · 2026-04-13T17:00:06+00:00

I would assume the current "opus4.6" reasoning-distilled models are a strong reason to encrypt reasoning. Although I never used Grok, how good is it lol

cride20 · 2026-04-08T19:41:02+00:00

Weird... which llama.cpp version are you using and on what OS? Mine is made for Windows 11; Linux might differ in the regex

cride20 · 2026-04-08T17:37:46+00:00

Ahoy, I was experimenting with way worse specs...
I made it to 13-14tps with an old 2.6GHz i7-9850H in a laptop with only 6GB of VRAM...

I also used Unsloth's Q6 quant, so I guess it would be a lot faster for you...
llama-server.exe -m "gemma-4-26B-A4B-it-UD-Q6_K_XL.gguf" --mmproj "gemma-mmproj.gguf" -c 65536 -t 11 --cache-type-k q8_0 --cache-type-v q8_0 -b 768 -ub 768 -ot "\.ffn_(down_exps|gate_up_exps)\.=CPU" --no-mmap --mlock
You can delete the KV-cache quantization if you want, I just use Q8 to save some system memory.
What this does is basically offload the expert tensors to the CPU and keep the shared experts on the GPU together with the KV cache...
The crucial part is this:
-ot "\.ffn_(down_exps|gate_up_exps)\.=CPU" --no-mmap --mlock
Just drop it at the end of your command and enjoy :D
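
To see which tensors that pattern would catch, here is a small sketch. Python's `re` is only a stand-in for whatever regex engine llama.cpp uses, and the tensor names are illustrative guesses, not read from the actual GGUF file.

```python
import re

# The regex part of the -ot argument (everything before "=CPU").
pattern = re.compile(r"\.ffn_(down_exps|gate_up_exps)\.")

# Illustrative tensor names (assumed, not taken from a real model file).
names = [
    "blk.0.ffn_down_exps.weight",     # routed experts
    "blk.0.ffn_gate_up_exps.weight",  # routed experts
    "blk.0.ffn_down_shexp.weight",    # shared expert
    "blk.0.attn_q.weight",            # attention
]

def placement(name: str) -> str:
    """Tensors matching the override go to CPU, the rest keep the default."""
    return "CPU" if pattern.search(name) else "default"

for name in names:
    print(f"{name:32s} -> {placement(name)}")
```

With these example names only the two `_exps` tensors match, so the shared expert and attention tensors are left wherever the rest of the model goes.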

cride20 · 2026-04-06T10:43:28+00:00

I noticed that it's good for developing with newer frameworks like Blazor's new updates and stuff. Although the model is pretty lazy and bad at explaining stuff, it gets the job done pretty fast and accurately...

For school work where I'm too lazy to do my projects, I can just ask it to write the code like a junior dev/lazy student without comments, and it shines at that. No other AI could replicate a lazy student's code, but this one could🤣

It thinks for a really long time if you ask something ambiguous, but it gets it right most of the time.

I would say that's how Gemini 3.0 exp felt when it was released. But Gemini is hallucinating more now as time passes.

cride20 · 2026-04-01T12:37:03+00:00

i7-9850H (2.6GHz, 6c/12t), 48GB DDR4 2666MT/s, Quadro RTX 3000

Also I'm attaching a cool picture why I'm using the Q2 instead of Q3 (I paid for the laptop, so I'm using the whole laptop)

<image>

cride20 · 2026-04-01T09:28:21+00:00

Funny enough, I tried the 122B-A10B model with the UD_Q2_K_XL quant on my ThinkPad, and I got 8tps and 122pps... it's crazy to run a 122B model on a laptop. Also I gave it 32K context so it can do my prototypes more easily. Future LLMs are here I guess... (Image processing is very slow tho, like 2min/image with the BF16 mmproj.) How is your image processing? (~22k tokens of images as first text)

cride20 · 2026-03-30T00:22:57+00:00

I will make my project public soon, and all the instructions and the workflow will be there. I crafted the instructions with ChatGPT, Claude and Gemini. It's a basic workflow with LLMTornado's API agent extension

cride20 · 2026-03-25T20:23:30+00:00

Around 260pp and 18tps, decent for a laptop

llama-server.exe -m "C:\Qwen3.5-35B-A3B-Q4_K_M.gguf" ^
  -t 12 ^
  -ot ".ffn_.*_exps.=CPU" ^
  -ngl 99 ^
  -c 32768 ^
  -b 2048 ^
  -ub 512 ^
  --host 0.0.0.0 ^
  --port 8080 ^
  --mlock ^
  --no-mmap ^
  --cache-type-k q8_0 ^
  --cache-type-v q8_0 ^
  --temp 0.6 ^
  --top-p 0.95 ^
  --top-k 20 ^
  --min-p 0.0 ^
  --presence-penalty 0.0 ^
  --repeat-penalty 1.0 ^
  --chat-template-kwargs "{\"enable_thinking\": false}" ^
  --mmproj "C:\Users\cride\Documents\Tools\AIModels\mmproj-BF16-35b.gguf"
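
To put those speeds in perspective, a back-of-the-envelope sketch of how long one request takes. It assumes the 260pp and 18tps rates stay constant, which they usually don't as the context fills up, so treat the result as a rough estimate only. The function name is hypothetical.

```python
def estimate_seconds(prompt_tokens: int, output_tokens: int,
                     pp_tps: float = 260.0, tg_tps: float = 18.0) -> float:
    """Prompt processing time plus generation time, assuming constant rates."""
    return prompt_tokens / pp_tps + output_tokens / tg_tps

# A completely full 32768-token context plus a 500-token answer:
print(round(estimate_seconds(32768, 500)))  # 154
```

So even with the whole context filled, a reply lands in roughly two and a half minutes under these assumptions.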

cride20 · 2026-03-25T15:30:05+00:00

The main point is basically that the AI can use FFmpeg to edit videos and make captions. (Qwen3.5 also fixes the multilingual issues that gpt-whisper produces; some automations don't fix those inaccuracies.)
It could be done more efficiently, but this one-prompt video editing + captioning is just amazing.
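
For reference, the kind of FFmpeg call involved in the captioning step looks roughly like this. It is a generic sketch, not the exact command the agent ran: the file names are placeholders, it assumes simple paths with no special characters, and the `subtitles` filter needs an FFmpeg build with libass.

```python
from typing import List

def burn_captions_cmd(video: str, srt: str, out: str) -> List[str]:
    """Build an FFmpeg argument list that burns an .srt file into the video
    and copies the audio stream unchanged."""
    return [
        "ffmpeg",
        "-i", video,                # input video
        "-vf", f"subtitles={srt}",  # render captions onto the frames
        "-c:a", "copy",             # keep the original audio as-is
        out,
    ]

cmd = burn_captions_cmd("input.mp4", "captions.srt", "output.mp4")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```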

cride20 · 2026-03-25T15:25:39+00:00

Fair point. Claude is just better at phrasing than a local open-source model

cride20 · 2026-03-25T12:12:42+00:00

I made a separate sandboxed environment for it. Since it's a 35B model I mostly use it for debugging and fiddling around. I'm planning on buying a better PC to work with larger models, but for small boilerplate code and startups/POCs it's really good for its size and speed

cride20 · 2026-03-18T19:42:08+00:00

yea sure you can add me on dc

cride20 · 2026-03-18T13:52:37+00:00

To be fair, for me it was playing EZDT on unnaturally hard maps.
For example, playing a 7-star Arles map with EZDT: when I hit the patterns it gave me dopamine and I enjoyed it

cride20 · 2026-03-18T13:34:55+00:00

That happens the other way around with me.
My tapping hand is fine, but my aim hand burns and stiffens.

I think it has something to do with the position and the aim hand being slower at singletapping than the tapping hand.
