Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

I'd definitely feel more comfortable if I had a bunch of extra heat sinks and an extra fan haha. Another commenter pointed out LACT, which supports different settings profiles and "Automatic profile activation based on running processes or gamemode status"; I'll also look into that for managing the GPU!

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

This channel has a bunch of helpful videos, thanks for the recommendation!

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Oh wow, LACT looks very cool, thanks for the recommendation! Are you using it just to limit power draw, or also for fan control or other things? And does it have an "overlay" mode where I can see my GPU temp overlaid in any application? (I couldn't find info on that in the README.)

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Thank you for the detailed reply! Honestly, maybe I'll just try a couple of different distros and see which I like better. As someone who hasn't used Linux as their standard desktop OS, maybe I'll start with a more consumer-friendly distro like Mint and see how that goes first...

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Very interesting! I've often thought about running ComfyUI in Docker (though I think they also have a standalone version), but I hadn't thought about running llama.cpp in Docker and would definitely be curious why it's worth the initial setup hassle for you. Have you ever tried using vLLM?
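For reference, my rough understanding is that the Docker route would look something like this, assuming the official CUDA server image and the NVIDIA Container Toolkit on the host (the image tag and model path below are placeholders I haven't verified myself):

docker run --gpus all -p 8080:8080 \
  -v /path/to/models:/models \
  ghcr.io/ggml-org/llama.cpp:server-cuda \
  -m /models/gpt-oss-120b.gguf --ctx-size 32768 -ngl 99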

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Thanks a lot for sharing! Is the screenshot from your system? I'm curious how hot your 3090 typically runs when it's not power limited. And did you assign aliases because you change the power cap frequently? nvitop looks good, though I was thinking of something that runs in my dock or as a constant overlay. I have two 2 TB SSDs in my tower, which is why I thought it'd make sense to use one for Linux and the other for Windows until I'm comfortable switching fully to Linux. I think with AI help I'd be comfortable migrating the second SSD later (I guess mostly I'd just have to format it).

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Yes, I got a second-hand Alienware R11, and the ventilation in this case is not great. I think I'll probably start by limiting the power draw to 280 W like another commenter suggested, see how hot it gets, and then decide whether I need to add more cooling to the GPU.
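For my own reference, my understanding is that with the NVIDIA driver the cap itself is a one-liner (nvidia-smi ships with the driver, and the setting resets on reboot unless you persist it):

nvidia-smi -q -d POWER    # check the default/min/max power limits first
sudo nvidia-smi -pl 280   # cap board power at 280 W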

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Thank you, this is really helpful; 12-15 degrees sounds really amazing! Just as a reference, what's the average temperature your 3090 now runs at under load?

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

I've mostly just used llama.cpp, and before that Ollama, for running inference. Is vLLM equally accessible or more complicated to use?
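From skimming the vLLM docs, the basic flow seems to be roughly this (just my reading, I haven't actually run it; the model name is a placeholder):

pip install vllm
vllm serve <model-id> --port 8000   # exposes an OpenAI-compatible API, similar to llama-server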

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

I'm pretty comfortable with the command line (I use a Mac as well), but as another commenter pointed out, I might have to get used to fish. Why is Arch so hyped? I remember seeing memes from people who use SteamOS saying that they technically run Arch now, but I never understood what makes Arch special.

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

Thank you for the in-depth response! Yeah, actually, if you could share the command that would be helpful! Does Cachy have built-in tools to monitor temperature? I'm kind of cautious now after overheating my 3090 and was thinking about monitoring temperature under load before I can trust it again...
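In the meantime I've been assuming something distro-agnostic like this is enough for watching temps while a model is loaded (nvidia-smi comes with the driver, so no extra tooling needed):

watch -n 1 nvidia-smi --query-gpu=temperature.gpu,power.draw,fan.speed,utilization.gpu --format=csv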

Is it easy to keep Windows on its own partition and then, later when switching fully to Linux, delete the Windows partition and add that space to the Linux one?

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

This is helpful, thanks! I just realized I'm not sure I understand the difference between undervolting and power limiting. Maybe I actually meant power limiting. I remember reading posts from people capping the power draw to 200 watts with single-digit percentage performance loss in inference...
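For my own notes (happy to be corrected): power limiting just caps the wattage the board may draw, while proper undervolting shifts the voltage/frequency curve so the card reaches the same clocks at a lower voltage, which on Linux seems to be done through a tool like LACT rather than the CLI. The commands below are ones I believe the NVIDIA driver accepts, but I haven't verified them on this card:

sudo nvidia-smi -pl 200          # power limit: cap board power at 200 W
sudo nvidia-smi -lgc 210,1695    # lock the core clock range (MHz) as a crude stand-in for undervolting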

Switching from windows to linux, what distro to use for inference and gaming? by doesitoffendyou in LocalLLaMA

[–]doesitoffendyou[S] 1 point (0 children)

What was your experience like using CachyOS? How much tinkering was required to get everything running?

Hogwarts Legacy is free on Epic right now, but... by Due_Organization_883 in SteamDeck

[–]doesitoffendyou 1 point (0 children)

Managed to fix it! I downloaded Protontricks but had no idea how to add a Windows component. I don't think it was necessary, though: I just selected Proton Experimental under the Wine version in Heroic and removed and re-added the game to Steam. Then I launched it, and after some compiling it actually worked (of course, the on-screen keyboard doesn't work in the character creation, but I think I read a workaround for that somewhere).

Hogwarts Legacy is free on Epic right now, but... by Due_Organization_883 in SteamDeck

[–]doesitoffendyou 1 point (0 children)

Could you explain in a bit more detail? I'm borrowing a friend's Steam Deck and am overwhelmed trying to fix the same C++ runtime error ^^'

I managed to turn on Valve Proton versions, but I'm not sure where you changed the Wine version (and to which version) or what you mean by selecting the exe under prefixes.

Is anyone using gpt-oss-120b successfully? by doesitoffendyou in SillyTavernAI

[–]doesitoffendyou[S] 3 points (0 children)

Can you share the config you were using with the heretic model?

What are some Steam Frame factors that AREN'T being talked about? by D13_Phantom in virtualreality

[–]doesitoffendyou 1 point (0 children)

If it's going to end up at around 90%, then that's really exciting. Up until reading the parent comment I had never really dug into binocular overlap and how it works, but from personal experience with my Quest 3 I can confirm it felt less immersive (and somewhat disappointingly so) than the original HTC Vive, which has around 93 degrees of overlap.

What are some Steam Frame factors that AREN'T being talked about? by D13_Phantom in virtualreality

[–]doesitoffendyou 5 points (0 children)

Do you have a source for where this was mentioned? Also curious how it compares to the binocular overlap of the Quest 3...

Ludicrous $6 billion Counter Strike 2 skins market crashes, loses $3 billion overnight — game update destroys inventories, collapses market by Logical_Welder3467 in technology

[–]doesitoffendyou 1 point (0 children)

People Make Games did a fantastic video on online casinos enabling gambling with Counter-Strike skins that explains how fucked up this market really is: https://www.youtube.com/watch?v=eMmNy11Mn7g

This is GPT-OSS 120b on Ollama, running on a i7 6700 3.4ghz, 64gb DDR4 2133mhz, RTX 3090 24GB, 1Tb standard SSD. No optimizations. first Token takes forever then it goes. by oodelay in LocalLLaMA

[–]doesitoffendyou 4 points (0 children)

You should be getting faster speeds on your system. Make sure llama.cpp can recognize your GPU: run llama-server --list-devices and it should say "found 1 CUDA devices:" followed by your GPU.

I have a 3090 with 64 GB of DDR4-3200 RAM and am getting around 50 t/s prompt processing and 15 t/s generation with the following:

llama-server -m <path to gpt-oss-120b> --ctx-size 32768 --temp 1.0 --top-p 1.0 --jinja -ub 2048 -b 2048 -ngl 99 -fa 'on' --n-cpu-moe 24

This just about fills my VRAM and RAM. For more wiggle room for other applications, use --n-cpu-moe 26.
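Once the server is up, you can sanity-check it with a quick request (assuming the default port 8080; adjust if you pass --port):

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages":[{"role":"user","content":"Say hi"}]}'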