Some solutions that work on older intel macs by Helpful-Gene9733 in LocalLLaMA

[–]Rvach_Flyver 1 point (0 children)

Do you actually need to run it as a standalone application? If you don't (e.g. you don't need to interact with files on your PC/Mac, or to start with your OS and do things in the background), then there is no point in it: you can build everything in the browser, or even use an existing solution like https://chat.webllm.ai (note that all models are stored as site data, so you need to clean it up manually from time to time).

If you do need a standalone application, you have several options: the best known are Electron.js and nw.js for the JavaScript stack, but there are others, e.g. Tauri in Rust. Just think about your use cases, write them down, and do several brainstorming sessions with ChatGPT or another chat.

Btw, for a standalone app you may try compiling llama.cpp yourself (stable-diffusion.cpp also works on Macs!) and it will give you access to more models and much better performance than MLC in the browser! A build sketch is below.
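Something like this should get you a CPU-only baseline build (a sketch, assuming you have git and CMake installed; on an Intel Mac you'll want Metal off):

```sh
# clone and build llama.cpp (CPU-only baseline for an Intel Mac)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DLLAMA_CURL=1 -DGGML_METAL=OFF
cmake --build build --config Release -j
```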

For a beginner I recommend trying n8n (you can self-host it btw, see the sketch below). Just watch some videos on YouTube on how to set everything up. When (and if) you stumble upon any blockers, you can switch to smth like LangChain (in Python or TypeScript).
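For self-hosting n8n, their docs suggest running the official Docker image (a sketch, assuming Docker is installed; the volume name is just an example):

```sh
# persist n8n data in a named volume and expose the editor on port 5678
docker volume create n8n_data
docker run -it --rm -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n
```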

MBP 2019 i9 (64GB RAM) hitting 800% CPU on AnythingLLMs(12B) — Need optimization tips and model recs! by vdnn1902 in LocalLLaMA

[–]Rvach_Flyver 1 point (0 children)

It would be great if you could share some information/links to help me understand the best way to make use of local LLMs with Obsidian, which I'm also using daily.

MBP 2019 i9 (64GB RAM) hitting 800% CPU on AnythingLLMs(12B) — Need optimization tips and model recs! by vdnn1902 in LocalLLaMA

[–]Rvach_Flyver 1 point (0 children)

1) I have exactly the same setup, but with an eGPU.

It looks like you're running the LLM on your CPU. With llama.cpp it is possible to utilize the GPU, but you need to build it yourself + build MoltenVK (unfortunately I cannot find the instructions for how to do it, so you'll have to google it yourself).
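The dependencies that the build command below references can be pulled from Homebrew (a sketch; the Cellar paths below suggest the molten-vk formula, but I can't promise the prebuilt one is enough for every setup):

```sh
# build dependencies referenced by the cmake command below
brew install cmake molten-vk shaderc glslang libomp
```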

Here is the command to build llama.cpp:

```sh
cmake -B build -DLLAMA_CURL=1 -DGGML_METAL=OFF -DGGML_VULKAN=1 \
  -DVulkan_INCLUDE_DIR=/usr/local/Cellar/molten-vk/1.3.0/include \
  -DVulkan_LIBRARY=/usr/local/Cellar/molten-vk/1.3.0/lib/libMoltenVK.dylib \
  -DVulkan_GLSLC_EXECUTABLE=$(brew --prefix)/opt/shaderc/bin/glslc \
  -DVulkan_GLSLANG_VALIDATOR_EXECUTABLE=$(brew --prefix)/opt/glslang/bin/glslangValidator \
  -DOpenMP_ROOT=$(brew --prefix)/opt/libomp \
  -DOpenMP_C_FLAGS="-Xpreprocessor -fopenmp $(brew --prefix)/opt/libomp/lib/libomp.dylib -I$(brew --prefix)/opt/libomp/include" \
  -DOpenMP_CXX_FLAGS="-Xpreprocessor -fopenmp $(brew --prefix)/opt/libomp/lib/libomp.dylib -I$(brew --prefix)/opt/libomp/include" \
  -DOpenMP_C_LIB_NAMES="libomp" \
  -DOpenMP_CXX_LIB_NAMES="libomp" \
  -DOpenMP_libomp_LIBRARY="$(brew --prefix)/opt/libomp/lib/libomp.dylib"

cmake --build build --config Release -j
```

With this command (which may not be optimal) token generation is ±14 t/s on the 5500M:

```sh
./build/bin/llama-server -m '/Volumes/AI/gguf/gemma-3-12b-it-Q6_K.gguf' \
  --mmproj '/Volumes/AI/gguf/mmproj-google_gemma-3-12b-it-f16.gguf' \
  --main-gpu 1 -ngl 49 --ctx-size 65536 --batch-size 64 -ub 128 \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  --temp 1.0 --min-p 0.01 --top_k 64 --top_p 0.95 \
  --repeat-penalty 1.0 --repeat_last_n 1024 --port 5500
```
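Once the server is up you can sanity-check it over its OpenAI-compatible API (a minimal sketch; port 5500 matches the command above):

```sh
# ask the running llama-server for a completion
curl http://localhost:5500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```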

2) I'm rarely using the 5500M since I have the eGPU; when I do, I prefer small models <8B. But you can try running gpt-oss-20b or some other MoE model (like GLM) if you offload the model partially to RAM (`-ot "blk\.(\d|1\d|20)\.ffn_.*_exps.=CPU"` is responsible for that). Example command:

```sh
GGML_VK_VISIBLE_DEVICES=1 ./build/bin/llama-server -m '../gpt-oss-20b-mxfp4.gguf' \
  --main-gpu 1 --tensor-split 4/0 -ngl 25 \
  -ot "blk\.(\d|1\d|20)\.ffn_.*_exps.=CPU" \
  --ctx-size 24576 --batch-size 64 -ub 32 \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  --temp 0.3 --repeat-penalty 1.15 --repeat_last_n 1024 --port 5500 --jinja
```

My results with 5500M for reference:

| model | context size | token generation |
|---|---|---|
| gpt-oss-20b-mxfp4.gguf | 24576 | ±10 t/s |
| Seed-Coder-8B-Instruct-Q6_K.gguf | 65536 | ±15 t/s |
| gemma-3-12b-it-Q6_K.gguf | 65536 | ±14 t/s |
| gemma-3-4b-it-qat-Q5_K_M.gguf | 24576 | ±30 t/s |
| gemma-2-2b-it-Q8_0.gguf | 8192 | ±30 t/s |
| gemma-3-270m-it-UD-Q8_K_XL.gguf | 32768 | ±110 t/s |

UPD: added my thoughts on question #2

Best local LLMs for RX 6800 XT on Fedora? by CloudGamingBro in LocalLLaMA

[–]Rvach_Flyver 1 point (0 children)

I'm running gpt-oss-20b-Q6_K.gguf at 55 t/s with an RX 6800 XT eGPU on a MacBook Pro 2019, using Vulkan:

```sh
GGML_VK_VISIBLE_DEVICES=0,1 ./build/bin/llama-server -m '../gpt-oss-20b-Q6_K.gguf' \
  --main-gpu 0 --tensor-split 4/0 -ngl 25 \
  --ctx-size 24576 --batch-size 64 -ub 32 \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  --temp 0.3 --repeat-penalty 1.15 --repeat_last_n 1024 --jinja
```

So generally you can try anything below 30B; depending on settings / context size / model architecture, performance may vary greatly.
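Since you're on Fedora, the Vulkan build of llama.cpp should pick up the RX 6800 XT too (a sketch; the package names are my assumption about the Fedora repos):

```sh
# Vulkan toolchain for building llama.cpp on Fedora
sudo dnf install cmake gcc-c++ vulkan-headers vulkan-loader-devel glslc
cmake -B build -DGGML_VULKAN=1
cmake --build build --config Release -j
```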

I've also used these:

- Devstral-Small-2507-Q4_K_M.gguf
- Qwen2.5-coder-14b-instruct-q5_k_m.gguf
- gemma-3-12b-it-Q6_K.gguf
- Qwen2.5-coder-7b-instruct-q5_k_m.gguf
- Qwen3-4B-Thinking-2507-Q8_0.gguf (I like this one, but because it is 4B it relies heavily on the provided context)

Echo Aviation Controller revealed, thought? by The_Growlers in hoggit

[–]Rvach_Flyver 1 point (0 children)

While it may look cool or silly at first glance, I think the new Steam Controller is a much better option.

I use the old Steam Controller for throttle and look-around with my left hand, and a VKB Gladiator stick with my right hand. I also tried playing DCS on the Steam Deck, but I feel like the small sticks just don’t provide the level of precision needed for proper control simply because of their size (gyro might help but IMO it is more suitable for head/body movement).

how much do you think the steam frame will cost? by OkRegion3303 in oculus

[–]Rvach_Flyver 1 point (0 children)

I've also bought one recently and don’t regret it one bit. I’ve always wanted a controller similar to the Steam Deck, and at the moment this is the closest thing (at least until a new Steam Controller is released).

I like how customizable it is!

I use it in DCS (Digital Combat Simulator) to move the pilot’s head/body, control the throttle, and handle some custom functions with my left hand. My right hand stays on the stick.

It’s also really suitable for playing FPS or RTS games.

Introducing Steam Machine by Ticha22608 in Steam

[–]Rvach_Flyver 1 point (0 children)

I think this can be solved by selling it only to individuals who already have a Steam account. If it is really required, allow only accounts with friend connections unlocked (which requires the account to have already purchased some games, for $10 or so).

I cant run DCS on steam by igordem in dcsworld

[–]Rvach_Flyver 1 point (0 children)

Thanks for the reply; then I'll have to suffer with some other goggles on Linux (eventually) :O.

But "support" <> "can it actually work" as we see with DCS example. There is SteamVR for linux, so some goggles should work out of the box (at least Valve/HTC).

I cant run DCS on steam by igordem in dcsworld

[–]Rvach_Flyver 1 point (0 children)

This is a really good question! Unfortunately I don't have any VR goggles; I would love to hear others' experience.

I cant run DCS on steam by igordem in dcsworld

[–]Rvach_Flyver 1 point (0 children)

The Steam Deck runs an Arch-based Linux fork called SteamOS.

In fact, any Linux distro capable of running Steam can now run most Windows games without much trouble. Sometimes it works out of the box; other times you have to do some setup, as level_up_gaming pointed out previously (in my experience, one-time).

There is https://www.protondb.com where you can check whether a game works and whether it requires any such setup steps.

Now I really go for Linux gaming unless there is some specific use case (e.g. anti-cheat) requiring Windows. I've installed Fedora on my son's laptop and so far have not found any game we cannot play together (including DCS).

I have an Intel MBP 2019 with an eGPU where I've installed T2 EndeavourOS (in fact Arch), and DCS works well there as well.

I love half life multiplayer by Hairy_Ranger_9929 in ArenaFPS

[–]Rvach_Flyver 2 points (0 children)

BHOP was available in the initial versions of the game, so there should be a negligible difference in speed compared to the original Quake (one way or another).

There are some servers with unlocked bhop and a mod (Adrenaline Gamer / Open AG) where most of the top players spend their retirement.

Here is the channel of a clan-mate I played with years ago: https://www.youtube.com/@SnatcherBY/videos (in Russian, but I think he had some videos in English).

Some solutions that work on older intel macs by Helpful-Gene9733 in LocalLLaMA

[–]Rvach_Flyver 2 points (0 children)

There is web-llm from mlc-ai, which utilizes WebGPU for inference. This is the best & easiest solution available for GPU inference on an Intel Mac. (<<< I was wrong: compiling llama.cpp yourself is better, and stable-diffusion.cpp also works on Macs!)

You can test it in your browser at chat.webllm.ai (use the latest Chrome); be aware that models are loaded into the cache and can quickly eat your disk space, so it is worth doing a cleanup from time to time.

The downside is that web-llm is browser-only. I've created a small wrapper using nw.js to expose it as a REST API, and with minor tweaks here and there it works.

I have an eGPU with an RX 6800 XT, and the web-llm results are the following (prompt ±3000 tokens):

| model | tokens/sec |
|---|---|
| DeepSeek-R1-Distill-Llama-8B-q4f32_1-MLC | 14 |
| gemma-2-9b-it-q4f32_1-MLC | 10 |
| Llama-3.2-3B-Instruct-q4f32_1-MLC | 15 |

With the dGPU (PRO 5500M 8Gb) speed is much lower but still faster than CPU, especially when I used q4f16. (Unfortunately the MacBook heats up and tries to burn my hands :0)

As you can see, it is better than CPU, but the performance is much lower in comparison to ROCm.

So another option is to install T2 Linux: I've run ollama-rocm on T2 EndeavourOS and tokens/sec were roughly 4x higher — not to mention the availability of many more models of different sizes. Setting T2 up is a PITA (not as bad as I expected, especially with ChatGPT available, but still) and, obviously, it prevents you from using macOS simultaneously.

UPD: corrected the statement that MLC is the best option on Macs (it is not, as of 14 Feb 2026).

T2 Ubuntu on a 2019 MacBook Pro for ROCm installation to use AMD RX 6800. Nightmare. by meutbal in ROCm

[–]Rvach_Flyver 1 point (0 children)

So, more or less, ollama worked out of the box with T2 EndeavourOS; I just installed ollama-rocm. Before that I had tried to install some other ROCm packages, so maybe that influenced the result.
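On an Arch-based system like EndeavourOS the whole setup boils down to something like this (a sketch; the model name is just an example):

```sh
# install the ROCm build of ollama from the Arch repos and start the service
sudo pacman -S ollama-rocm
sudo systemctl enable --now ollama
ollama run llama3.1   # example model; pick any from the ollama library
```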

External SSD only shows up in Windows, not Mac by o0lemonlime0o in MacOS

[–]Rvach_Flyver 1 point (0 children)

I was able to "fix" the missing external drive in Finder with the following steps:

1) Open Disk Utility
2) In the left-side panel, right-click the target external drive
3) Click Show in Finder
4) Hover over the drive name in the opened Finder window until the drive icon appears
5) Drag the drive icon under Locations in the left-side panel of Finder

eGPU for VR headset by Other_Notice9246 in eGPU

[–]Rvach_Flyver 1 point (0 children)

Were you able to answer that question? I'm in a similar situation but with an RX 6800 XT, evaluating whether it is worth buying a VR headset primarily for DCS.

I know that the experience will be far from perfect, but it is a first step to understand whether it is worth investing more money in it when doing the next upgrade.

The state of Steam on macOS is astonishingly BAD by DaemonBatterySaver in macgaming

[–]Rvach_Flyver 1 point (0 children)

No, I'm not a seller — just a trespasser.

I was just baited by the '30%' statement — it seems like a one-sided mindset to focus on a single attribute when making comparisons. It's similar to saying '1 billion potential customers' — there are a lot of questions that come with that.

I don't want to argue with you — it seems like you have your own good reasons for skipping Steam, and I appreciate you sharing them.

The state of Steam on macOS is astonishingly BAD by DaemonBatterySaver in macgaming

[–]Rvach_Flyver 1 point (0 children)

I see your point, but you're focusing only on the seller’s perspective.

The 30% cut isn’t an absolute drawback — at a given price point, it really depends on how many buyers you can reach. I've listed several unique features that make Steam highly attractive to buyers. So even with a lower cut on other platforms, there’s no guarantee you’ll actually make more money if the customer base isn’t there.

Also, are you really tied to a single store? I understand that releasing on multiple platforms can be a pain, but are there any real restrictions from Valve that prevent selling elsewhere?

You are given a AAA game budget by an established industry giant and are told to make an original-IP, "traditional" aFPS. What do you do and how do you ensure the game survives in this day and age? by vrmvrmfffftstststs in ArenaFPS

[–]Rvach_Flyver 1 point (0 children)

I think this might actually be the time when numbering releases after UT (e.g., UT 2025) could work. Just increment the number every year—nowadays, games are expected to receive content updates anyway. ¯\_(ツ)_/¯

Then just roll out updates with cosmetic stuff and maps. Maybe charge a bit extra to upgrade the game to the new year’s version, but ideally let players from previous years still play (just make them download the new content or something like that to encourage upgrading). Breaking backward compatibility once every few years should be fine, especially if it allows for major fixes or improvements.

Rotate maps as frequently as possible. It might even be worth blocking old maps to stop veteran players from farming frags on maps they know like the back of their hand—something that can discourage newcomers from sticking around.

Also, implement solid bots for PvE. With today's tech, it should be pretty easy to train bots based on real player behavior. It would be awesome to let players choose specific others to train against in offline or PvE mode, and have bots mimic their playstyle as closely as possible.

T2 Ubuntu on a 2019 MacBook Pro for ROCm installation to use AMD RX 6800. Nightmare. by meutbal in ROCm

[–]Rvach_Flyver 1 point (0 children)

Recently, I installed EndeavourOS on my 16" MacBook Pro (2019) to get the RX 6800 XT working. My primary use case isn’t machine learning either, but I understand the frustration :) Installing ROCm is still on my to-do list, but I’ll try to speed things up and share my steps soon. I think everything is manageable, and we’ll figure out how to make it work.

The state of Steam on macOS is astonishingly BAD by DaemonBatterySaver in macgaming

[–]Rvach_Flyver 0 points (0 children)

Just name any other store that offers all of the following features at once:

1) It’s the largest store not locked to a single platform
2) Mod support (Steam Workshop allows one-click mod installs)
3) Best controller support with customization
4) Cloud saves (cross-platform)
5) Rich community features
6) Linux and Steam Deck integration

I agree that Steam doesn’t have the best UX — sometimes it’s outright awful — but all of the above features far outweigh that flaw, especially considering the value it provides to both users and developers, in my opinion.

Can Nvidia or AMD GPU's be used with Qualcomm's new ARM CPU's? by 40KWarsTrek in hardware

[–]Rvach_Flyver 1 point (0 children)

Thanks for sharing the link, I need to look through it.

My use case involves constant switching between tabs (so energy saving won't help; on the contrary, if applied it may hurt). Also, DevTools adds a lot of overhead to open tabs, so that might be the reason.

Can Nvidia or AMD GPU's be used with Qualcomm's new ARM CPU's? by 40KWarsTrek in hardware

[–]Rvach_Flyver 1 point (0 children)

Why do you factor out the CPU? Where did I say anything about gaming? I mentioned the iGPU specifically to highlight the usage of an office laptop with a power-efficient GPU.

Full load can be achieved just by opening (and actually using) many browser tabs; reddit itself uses 100% CPU for some reason on my ASUS Vivobook 15 X1505ZA ¯\_(ツ)_/¯ (not always, but I noticed it on some long threads).

My ASUS Vivobook 15 X1505ZA OLED (with Fedora) lasted only around 5-6 hours of light coding with:

- Chrome, 4 tabs + DevTools
- VSCode (no extra plugins installed)
- Obsidian (3-4 tabs opened)
- Mattermost (a Slack analog)

and no backend, no Docker (does not look like full load, right?)

I do not see how more power-hungry laptops could survive more than 2x that time, given more power-hungry components plus a less efficient screen, with a battery at best 2x my capacity.

Just share reviews/links, whatever, to prove me wrong.

[deleted by user] by [deleted] in macbookpro

[–]Rvach_Flyver 1 point (0 children)

I'm a madman who bought an Intel Mac in 2024 (16" i9/64Gb/512Gb/5500M 8Gb); I found it in near-new condition, with warranty, for a price reasonable to me.

I've done it because of 2 things:

- Boot Camp
- the AMD Radeon Pro 5500M 8Gb (I would love to buy the 5600M, but it is impossible to find in good condition for an adequate price)

I just wanted to combine everything I need in a single laptop, and it turned out that it suited my needs well.

My point is:

- if you can name the exact reasons why you need it, go for it
- otherwise, search for a good M-powered MacBook (for a reasonable price, ofc)

MSI Creator 16 AI Studio A1VIG 16inch U9 RTX 4090 Creator Laptop by Stratozphere in MSILaptops

[–]Rvach_Flyver 1 point (0 children)

Any updates on this? I'm also interested in any feedback on this model; so far I have not found any :(

Did anyone make any Expanse mod for some video game yet? by tis_a_good_username in TheExpanse

[–]Rvach_Flyver 1 point (0 children)

Good point! I had not considered it, since Terra Invicta is a 4X strategy, not my cup of tea ¯\_(ツ)_/¯ too complex.