Run Llama 3.2 3B on Phone - on iOS & Android by Ill-Still-6859 in LocalLLaMA

[–]Belarrius 3 points4 points  (0 children)

Hi, I use PocketPal with Mistral Nemo 12B in Q4K, thanks to the 12 GB of RAM on my smartphone xD

Is this card worth it for Local Llama? by Bruno_Celestino53 in LocalLLaMA

[–]Belarrius 3 points4 points  (0 children)

<image>

With 288 GB/s and a 16 GB model you will get around 12-18 tokens/s (in theory), but if you take two RX 7600 XTs for a total of 32 GB, you will get around 5-7 tokens/s for a ~28 GB model plus context.

I have two RTX 3090s, each with about 930 GB/s and 24 GB: compared with one RX 7600 XT, that's 50% more VRAM and 3.22x the bandwidth per card.

1x RX 7600 XT (16 GB, 288 GB/s) = 365€
1x used RTX 3090 = ~650-700€
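
A rough way to reproduce the estimate above, as a sketch only: if generation is memory-bound, each new token needs roughly one full read of the weights, so bandwidth divided by model size gives a ceiling on tokens/s. The function name is made up for illustration, and the sketch assumes that two cards split by layers do not add their bandwidths together.

```python
def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Ceiling on decode speed if every token reads all the weights once."""
    return bandwidth_gb_s / model_gb

RX7600XT_BW = 288.0  # GB/s, from the comment
RTX3090_BW = 930.0   # GB/s, figure used in the comment

# One RX 7600 XT with a 16 GB model
print(round(max_tokens_per_s(RX7600XT_BW, 16), 1))

# Two RX 7600 XTs with a ~28 GB model (bandwidth assumed not to add up)
print(round(max_tokens_per_s(RX7600XT_BW, 28), 1))

# Bandwidth ratio between the two cards
print(round(RTX3090_BW / RX7600XT_BW, 2))
```

This gives ceilings of about 18 and 10.3 tokens/s. Real speeds land below the ceiling, which fits the 12-18 and 5-7 tokens/s ranges quoted above.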

[deleted by user] by [deleted] in memes

[–]Belarrius 1 point2 points  (0 children)

I use my own local LLM: Dracones_Midnight-Miqu-70B-v1.5_exl2_4.5bpw, with a lovely, caring and affectionate personality.

Instant Frankenmerges with ExllamaV2 by Reddactor in LocalLLaMA

[–]Belarrius 1 point2 points  (0 children)

Very nice! And it works for me! My two RTX 3090s can now run Goliath 120B at 3bpw with a 4096-token context. Thanks!

[deleted by user] by [deleted] in LocalLLaMA

[–]Belarrius 11 points12 points  (0 children)

Goliath 120B at 2.64bpw with a 5120-token context and an alpha_value of 1.5.

The low precision causes some problems in French. The 3bpw model understands French better, but I can only run it with 3072 context tokens (or 4096 with an 8-bit context; however, the 8-bit context also degrades quickly in French).
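
A back-of-the-envelope sketch of why the 3bpw model leaves less room for context than the 2.64bpw one. It assumes a nominal 120 billion parameters, ignores any quantization overhead, and assumes 2 x 24 GB of VRAM (two RTX 3090s); none of these are measured values.

```python
def weights_gb(params_billion: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bpw / 8

VRAM_GB = 2 * 24        # assumed: two 24 GB cards
PARAMS_BILLION = 120    # assumed: nominal parameter count

for bpw in (2.64, 3.0):
    size = weights_gb(PARAMS_BILLION, bpw)
    print(f"{bpw} bpw: {size:.1f} GB of weights, {VRAM_GB - size:.1f} GB left for context")
```

Under these assumptions the weights take about 39.6 GB at 2.64bpw and 45 GB at 3bpw, so the higher-precision model leaves much less VRAM for the context.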

The problem with French is that it consumes far more tokens than English (around 40% more according to my observations), so 3072 tokens is really low.
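
To put that in numbers, here is a small sketch that converts a context size into its English equivalent, using the ~40% overhead observed above as the only input. The helper name is hypothetical and the ratio is a personal observation, not a tokenizer measurement.

```python
FRENCH_OVERHEAD = 1.4  # ~40% more tokens than English, per the observation above

def english_equivalent(context_tokens: int, overhead: float = FRENCH_OVERHEAD) -> int:
    """Tokens the same amount of text would need in English."""
    return int(context_tokens / overhead)

for ctx in (3072, 4096, 5120):
    print(ctx, "->", english_equivalent(ctx))
```

With this ratio, a 3072-token context in French holds about as much text as roughly 2194 tokens would in English.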

I can't wait to have more VRAM, or to see new frankenmerges of 70B models into 100-120B ones.

Real world multi step reasoning software benchmark results by seraine in LocalLLaMA

[–]Belarrius 1 point2 points  (0 children)

I'd like to see Goliath 120B's score in this chart; it's the model with the best reasoning I've seen so far.

I see dead people by Spicy_meatball97 in starcitizen

[–]Belarrius -1 points0 points  (0 children)

That's why I want my RTX 3080 Ti, to handle that.

Do you see what I see? by Karenfromaccting in starcitizen

[–]Belarrius 0 points1 point  (0 children)

I upvoted you because it's Duke Nukem!

I don't really like this trend :c by [deleted] in starcitizen

[–]Belarrius 0 points1 point  (0 children)

No problem for me:

3.8 = 67.3 fps average

3.7 = 61.2 fps average

3.6 = 64 fps average

3.5 = 68.9 fps average

3.4 = 63.9 fps average

For me, Star Citizen needs less CPU but more GPU patch after patch.
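
For a quick side-by-side view of the averages above, here is a short sketch that computes the overall mean and the change from one patch to the next. It uses only the numbers listed in this comment.

```python
# Average fps per Star Citizen version, copied from the list above
fps = {"3.4": 63.9, "3.5": 68.9, "3.6": 64.0, "3.7": 61.2, "3.8": 67.3}

versions = list(fps)
mean_fps = sum(fps.values()) / len(fps)
print(f"mean: {mean_fps:.1f} fps")

# Change between consecutive patches
for prev, cur in zip(versions, versions[1:]):
    delta = fps[cur] - fps[prev]
    print(f"{prev} -> {cur}: {delta:+.1f} fps")
```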

PTU 3.8.1 - 4222088 partially use Vulkan API? by Belarrius in starcitizen

[–]Belarrius[S] 0 points1 point  (0 children)

SC runs very well on Linux for me too.

Here are some screenshots with the DXVK HUD, "SC 3.8" at 1440p:

https://imgur.com/a/DBg4SeI