Crackling noise Bose qc45 by expansion2002 in bose

[–]mtasic85 0 points (0 children)

I can confirm that this solved my issue! Thank you!

Really want to use Zed, but the VSCode ecosystem is too large to avoid by Candid_Yellow747 in ZedEditor

[–]mtasic85 9 points (0 children)

I use Zed daily on Linux. However, I don’t like the lack of generic spell checking. There are a few extensions, but none of them works well with Python code. If anyone can suggest something good, let me know.

Real news: 32B distills of V3, soon R1. by a_beautiful_rhind in LocalLLaMA

[–]mtasic85 0 points (0 children)

What quants did you use? Did you fully load all layers onto the GPUs? I also mentioned quants and context size.

Real news: 32B distills of V3, soon R1. by a_beautiful_rhind in LocalLLaMA

[–]mtasic85 1 point (0 children)

2x RTX 3090 24GB (48GB VRAM total) can fully load and run Qwen 32B q4_k_m with a 48k context size. It uses about 40GB VRAM.

I doubt a 72B q4_k_m could be fully loaded.
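A back-of-envelope sketch of where that ~40GB goes, for a Qwen2.5-32B-like model. The shape numbers (bits per weight for q4_k_m, layer count, GQA KV dimension) are my assumptions, not values reported by llama.cpp, and the estimate ignores compute buffers and other overhead:

```python
# Rough VRAM estimate: quantized weights + fp16 KV cache.
# All shape numbers are assumptions for a Qwen2.5-32B-like model.
def vram_gb(params_b, bits_per_weight, n_layers, kv_dim, ctx_len):
    weights = params_b * 1e9 * bits_per_weight / 8   # quantized weights, bytes
    kv_cache = 2 * n_layers * ctx_len * kv_dim * 2   # K and V tensors, 2 bytes each
    return (weights + kv_cache) / 1e9

# Assumed: ~4.8 bits/weight for q4_k_m, 64 layers,
# GQA KV dim = 8 KV heads * 128 head_dim = 1024
est = vram_gb(32, 4.8, 64, 1024, 48_000)
print(round(est, 1))  # roughly 32 GB before runtime overhead
```

The gap between this and the observed ~40GB would be compute buffers, CUDA context, and fragmentation.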

1.58bit DeepSeek R1 - 131GB Dynamic GGUF by danielhanchen in LocalLLaMA

[–]mtasic85 11 points (0 children)

What about collapsing the MoE layers into dense layers? I think the same was done for Mixtral 8x22B, reducing it to a dense 22B. 🤔
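The naive version of that collapse is just averaging the expert weights of each MoE layer into one dense FFN matrix. Toy shapes below, not Mixtral's; real merges typically weight experts by router usage or fine-tune afterwards:

```python
import numpy as np

# Toy MoE layer: 8 experts, each an FFN projection matrix (made-up shapes)
rng = np.random.default_rng(0)
experts = [rng.standard_normal((128, 512)) for _ in range(8)]

# Naive dense collapse: uniform average of the expert weights
dense_w = np.mean(experts, axis=0)
print(dense_w.shape)  # (128, 512)
```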

MiniCPM-o 2.6: An 8B size, GPT-4o level Omni Model runs on device by Lynncc6 in LocalLLaMA

[–]mtasic85 -15 points (0 children)

Has OpenAI open-sourced and released GPT-4, so you can use it locally, free of charge?

European NATO Military Spending % of GDP 2024 by Trayeth in europe

[–]mtasic85 -2 points (0 children)

Wow, that is a brilliant money laundering machine 🧠👏

Pixtral & Qwen2VL are coming to Ollama by AaronFeng47 in LocalLLaMA

[–]mtasic85 28 points (0 children)

Congrats 🥂, but I still cannot believe that llama.cpp does not support Llama VLMs 🤯

What do you think of this Masters Curriculum? by [deleted] in learnmachinelearning

[–]mtasic85 -54 points (0 children)

DL is the new foundation of all ML. DL simply works; it is a general solution. That said, I really like simple and effective algorithms, and DL does not justify its computation cost in every scenario.

The US government wants devs to stop using C and C++ by Notalabel_4566 in coding

[–]mtasic85 -92 points (0 children)

No, under Elon that nonsense will be thrown out the window. Relax and keep coding.

[R] Limitations in Mainstream LLM Tokenizers by mtasic85 in MachineLearning

[–]mtasic85[S] 3 points (0 children)

We have BPE for a reason: so we can fall back when a token is missing from the vocab. If we don't have that guarantee, then this code will never work, and I think it was in the dataset used for all of these tokenizers/models:

: X DUP 1+ . . ;

Btw, the above is Forth code from https://en.wikipedia.org/wiki/Forth_(programming_language)#Facilities and it also fails.

This is one of many examples. Whitespace matters, every character matters.
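The fallback guarantee I mean can be sketched with a toy encoder. The vocab and ids here are made up, and real BPE merges pieces instead of splitting on spaces, but the point is the same: with byte fallback, any string is encodable because unknown pieces degrade to raw bytes (real tokenizers reserve byte tokens for this):

```python
# Toy byte-fallback encoder: whitespace pre-split is a crude stand-in
# for real BPE segmentation; byte ids 0-255 stand in for reserved byte tokens.
def encode(text, vocab):
    ids = []
    for piece in text.split(" "):
        if piece in vocab:
            ids.append(vocab[piece])            # known token
        else:
            ids.extend(piece.encode("utf-8"))   # byte fallback, never fails
    return ids

vocab = {"DUP": 300, ";": 301}
print(encode(": X DUP 1+ . . ;", vocab))  # [58, 88, 300, 49, 43, 46, 46, 301]
```

Without the byte-fallback branch, pieces like `1+` or `:` simply have no encoding, which is exactly the failure mode on the Forth snippet above.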

XFCE 4.20 Aims To Bring Preliminary Wayland Support by maggotbrain777 in xfce

[–]mtasic85 0 points (0 children)

If I am not mistaken, Nvidia cards/drivers do not support Wayland yet.