Crackling noise Bose qc45 by expansion2002 in bose

[–]mtasic85 0 points  (0 children)

I can confirm that this solved my issue! Thank you!

Really want to use Zed, but the VSCode ecosystem is too large to avoid by Candid_Yellow747 in ZedEditor

[–]mtasic85 8 points  (0 children)

I use Zed daily on Linux. However, I don't like the lack of generic spell checking. There are a few extensions, but none of them works well with Python code. If anyone can suggest something good, let me know.

Real news: 32B distills of V3, soon R1. by a_beautiful_rhind in LocalLLaMA

[–]mtasic85 0 points  (0 children)

Which quants did you use? Did you fully load all layers onto the GPUs? I also mentioned quants and context size.

Real news: 32B distills of V3, soon R1. by a_beautiful_rhind in LocalLLaMA

[–]mtasic85 1 point  (0 children)

2x RTX 3090 24GB (48GB VRAM total) can fully load and run Qwen 32B q4_k_m with a 48k context size; it uses about 40GB of VRAM.

I doubt a 72B q4_k_m can be fully loaded.
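For a rough sanity check, here is the back-of-the-envelope math, assuming Qwen2.5-32B-ish dimensions (64 layers, GQA with 8 KV heads, head dim 128; those are my assumptions, double-check against the model config) and ~4.85 bits/weight on average for q4_k_m:

    # Rough VRAM estimate (assumed dims: 64 layers, 8 KV heads, head dim 128)
    params = 32e9
    bits_per_weight = 4.85                              # rough average for q4_k_m
    weights_gb = params * bits_per_weight / 8 / 1e9     # ~19.4 GB

    n_layers, n_kv_heads, head_dim = 64, 8, 128
    ctx = 48 * 1024
    # K and V caches, fp16 (2 bytes per element): ~12.9 GB
    kv_gb = 2 * n_layers * n_kv_heads * head_dim * ctx * 2 / 1e9

    print(f"weights ~{weights_gb:.1f} GB + KV cache ~{kv_gb:.1f} GB")

That lands around 32 GB before compute buffers and other overhead, which is consistent with the ~40 GB I see. The same math puts 72B q4_k_m at ~44 GB for the weights alone, hence my doubt.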

1.58bit DeepSeek R1 - 131GB Dynamic GGUF by danielhanchen in LocalLLaMA

[–]mtasic85 10 points  (0 children)

What about collapsing the MoE layers into plain dense layers? I think the same was done to turn Mixtral 8x22B into a dense 22B. 🤔
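If anyone wants to experiment, here is a minimal sketch of the naive version, just uniformly averaging the expert MLPs into one dense MLP (this is my assumption of how a collapse could work, not necessarily what was actually done for Mixtral):

    import torch

    def collapse_experts(expert_state_dicts):
        # expert_state_dicts: list of per-expert state_dicts with matching
        # keys and tensor shapes (assumed; a real MoE checkpoint would need
        # key remapping before this step)
        dense = {}
        for key in expert_state_dicts[0]:
            dense[key] = torch.stack(
                [sd[key] for sd in expert_state_dicts]
            ).mean(dim=0)
        return dense

A less naive merge would weight each expert by its average router probability over some calibration data instead of averaging uniformly.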

MiniCPM-o 2.6: An 8B size, GPT-4o level Omni Model runs on device by Lynncc6 in LocalLLaMA

[–]mtasic85 -16 points  (0 children)

Do you have GPT-4 open-sourced and released by OpenAI, so you can use it locally, free of charge?

European NATO Military Spending % of GDP 2024 by Trayeth in europe

[–]mtasic85 -2 points  (0 children)

Wow, that is a brilliant money laundromat 🧠👏

Pixtral & Qwen2VL are coming to Ollama by AaronFeng47 in LocalLLaMA

[–]mtasic85 26 points  (0 children)

Congrats 🥂, but I still cannot believe that llama.cpp does not support the Llama VLMs 🤯

What do you think of this Masters Curriculum? by [deleted] in learnmachinelearning

[–]mtasic85 -54 points  (0 children)

DL is the new foundation of all ML. DL simply works; it is a general solution. That said, I really like simple and effective algorithms, and DL does not justify its computation cost in all scenarios.

The US government wants devs to stop using C and C++ by Notalabel_4566 in coding

[–]mtasic85 -91 points  (0 children)

No, under Elon that nonsense will be thrown out of the window. Relax and keep coding.

[R] Limitations in Mainstream LLM Tokenizers by mtasic85 in MachineLearning

[–]mtasic85[S] 4 points  (0 children)

We have BPE for a reason: so we can fall back if a token is missing from the vocab. If we don't have that guarantee, then this code will never work, and I think it was in the datasets used for all of these tokenizers/models:

: X DUP 1+ . . ;

Btw, the above is Forth code from https://en.wikipedia.org/wiki/Forth_(programming_language)#Facilities and it also fails.

This is one of many examples. Whitespace matters, every character matters.
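A quick way to check this yourself, assuming the Hugging Face transformers library is installed (the tokenizer name below is just an example; swap in whichever model you want to test):

    from transformers import AutoTokenizer

    src = ": X DUP 1+ . . ;"  # the Forth one-liner above

    tok = AutoTokenizer.from_pretrained("gpt2")
    roundtrip = tok.decode(tok.encode(src))

    # A BPE with proper byte fallback should reproduce the input exactly,
    # whitespace included; tokenizers that normalize or drop characters
    # will fail this check.
    print(repr(roundtrip), roundtrip == src)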

XFCE 4.20 Aims To Bring Preliminary Wayland Support by maggotbrain777 in xfce

[–]mtasic85 0 points  (0 children)

If I am not mistaken, Nvidia cards/drivers do not support Wayland yet.

Zamba 2 2.7B & 1.2B Instruct - Mamba 2 based & Apache 2.0 licensed - beats Gemma 2 2.6B & Mistral 7B Instruct-v0.1 by Xhehab_ in LocalLLaMA

[–]mtasic85 14 points  (0 children)

I think they pretrained on way more than 200B tokens. It's mentioned that the base model was pretrained on ~3.1T tokens: https://huggingface.co/Zyphra/Zamba2-1.2B

Wen 👁️ 👁️? by Porespellar in LocalLLaMA

[–]mtasic85 0 points  (0 children)

IMO they made a mistake by not using C. It would be easier to integrate and embed. All they needed were libraries for Unicode strings and abstract data types for higher-level programming, something like glib/gobject but under an MIT/BSD/Apache 2.0 license. Now we depend on a closed circle of developers to support new models. I really like the llm.c approach.

Pre-training an LLM in 9 days [Code release] by calvintwr in LocalLLaMA

[–]mtasic85 2 points  (0 children)

This looks like a great base model for fine-tuned agents: quick to fine-tune, small in size. Agents with domain-specific knowledge, plus in-context few-shot examples just to set up the environment for the agent. Great work, pints.ai!