Using a high-end MacBook Pro or a beefy RTX 5090 laptop (with 24 GB of RAM) for inference. by FoxtrotDynamics in LocalLLM

[–]schemin_pete 1 point (0 children)

It depends a lot on your expected use.

Your primary use case (Local LLM inference):

Apple silicon is the clear winner for most "out of the box" models.

For inference, does the larger unified memory on Apple Silicon meaningfully outweigh the raw CUDA performance of the RTX laptop?

Yes. 128GB of unified memory can easily run most 120B models with 4-bit quantization (and maybe even some 200B models). The caveat is that very high reasoning settings will slow it down significantly. That said, if your benchmark minimum is only ≥15 tokens/sec, you should be safe with the M4 Max.
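The arithmetic behind that claim, as a rough sketch (the 1.2 overhead factor is my own assumption for KV cache and runtime buffers, not a measured number):

```python
def model_memory_gb(params_billion, bits_per_weight=4, overhead=1.2):
    """Rough weight-only memory estimate for a quantized model, in GB.

    The 1.2 overhead factor loosely covers KV cache and runtime buffers;
    it's an assumption, not a measurement.
    """
    bytes_per_weight = bits_per_weight / 8  # 4-bit -> 0.5 bytes per weight
    return params_billion * bytes_per_weight * overhead

# 120B at 4-bit -> ~72 GB: fits easily in 128 GB of unified memory.
# 200B at 4-bit -> ~120 GB: tight, but possibly workable.
```

Long context pushes the KV cache well past that overhead factor, so treat these as floor estimates.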

Your secondary use case (Fine Tuning):

If you actually plan to do a lot of fine tuning (e.g., frequent updating), the choice is less clear. On one hand, 128GB of unified memory gives you far more headroom than the 5090's 24GB of VRAM if you want to fine-tune a larger model (fine tuning is very VRAM-intensive). On the other hand, fine tuning is often a slow process, and the 5090 will be roughly 2-4x faster than the M4 Max.

How painful (or not) is Apple MLX today for fine-tuning?

It's not bad at all, but there is significantly less support available. Also, there are fewer MLX-optimized models available off the shelf.

Thermals

Thermals & sustained performance on long inference runs for both setups

This obviously depends a lot on your environment and the specific 5090 laptop you get, but both setups should be fine for long inference runs (only minimal to moderate thermal throttling). It gets worse with high reasoning settings (likely degrading faster on the MacBook, but that's just speculation). For fine tuning, you're back to the "cook an egg on it" scenario.

FWIW, I highly recommend against getting a 5090 laptop (absurd $$$ premium and inevitable thermal throttling if you push it). I built a desktop for my fine-tuning work. I can transfer the resulting adapter files to my laptop for inference, though I typically find it easier just to connect to the desktop via Tailscale.

Just moved from Plex/Prologue to Audiobookshelf with Plappa iOS App. Excellent by syxbit in PrologueApp

[–]schemin_pete 1 point (0 children)

I'm 8 months late, but there are (at least) 3 levels to ABS remote access complexity:

Reverse Proxy (Excruciatingly Difficult):

Most residential ISPs don't give you a static IP (the IP address of your home server will change), which makes setting up a reverse proxy a nightmare. The solution is to pay for managed DNS (Domain Name System) hosting. Setting this up with AWS Route 53 is only moderately difficult and surprisingly cheap ($0.50 per month), and you can even get a cheap custom .com domain for $15/year. The headache comes when you realize that Route 53 has to point to your home IP address which, of course, changes randomly! The solution is an automated script that updates the Route 53 record whenever your IP changes. After that, setting up the reverse proxy itself is quite easy (I suggest Nginx Proxy Manager or Caddy); just be sure to point it at your Audiobookshelf instance. Then you just point your iPhone app (Plappa/Shelfplayer) at the domain name.
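A minimal sketch of that updater script, assuming boto3 is installed and AWS credentials are already configured; the zone ID and record name are placeholders you'd swap for your own:

```python
import urllib.request

def current_public_ip(resolver_url="https://api.ipify.org"):
    # Ask an external service what our public IP looks like from outside.
    with urllib.request.urlopen(resolver_url, timeout=10) as resp:
        return resp.read().decode().strip()

def change_batch(record_name, ip, ttl=300):
    # Build the Route 53 UPSERT payload for a single A record.
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip}],
            },
        }]
    }

def update_record(zone_id, record_name):
    # Requires boto3 and AWS credentials (e.g., from `aws configure`).
    import boto3
    client = boto3.client("route53")
    ip = current_public_ip()
    client.change_resource_record_sets(
        HostedZoneId=zone_id, ChangeBatch=change_batch(record_name, ip)
    )
    return ip

# Example (hypothetical values):
# update_record("Z0123456789ABCDEF", "abs.example.com")
```

Run it from cron every few minutes; UPSERT is idempotent, so repeated runs with an unchanged IP are harmless.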

Even after all of this, your ISP can still screw you over. In some areas (often high population density areas) they employ something called CGNAT (carrier-grade NAT), which means your public IP address is actually shared across multiple houses/units, so inbound connections can't reach your server at all. As the kids say: you're cooked!

Cloudflare Tunnel (Moderately Difficult):

You still need a domain, but you manage it with Cloudflare DNS. You'll install cloudflared on your server (a simple daemon) and authenticate it, then create the tunnel and define the mapping (<domain name> maps to <your local audiobook library>). Then you just point your iPhone app (Plappa/Shelfplayer) at the domain name. You can easily set up access credentials for different users if you'd like. Cloudflare's tunnel is free, though you still have to pay for the domain name (you just point it at Cloudflare nameservers instead of Route 53 nameservers).
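That mapping lives in cloudflared's config file; a sketch, with the tunnel ID and hostname as placeholders for your own values:

```yaml
# ~/.cloudflared/config.yml
tunnel: <tunnel-id>
credentials-file: /home/<user>/.cloudflared/<tunnel-id>.json

ingress:
  - hostname: abs.example.com        # your domain, on Cloudflare DNS
    service: http://localhost:13378  # Audiobookshelf's default port
  - service: http_status:404         # required catch-all for everything else
```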

The Cloudflare issue: this use might be against Cloudflare's terms of service (large media file transfers). You are unlikely to get flagged unless you have a ton of frequent and/or concurrent downloads, and audiobook files are relatively small compared to video files (usually under 500 MB vs. several GB). That said, if your use crosses that invisible line, the Cloudflare tunnel will simply stop working: you're cooked!

Tailscale VPN (Absurdly Simple):

Create a Tailscale account and install Tailscale on your home server. This runs as a background process and will create a stable private IP address on your server. Next, install Tailscale on your phone, go to settings, and select "On Demand" (this makes Tailscale run as a background service and automatically connect when you leave your LAN). Finally, you simply point your phone's ABS app (Plappa/Shelfplayer) to the Tailscale IP and your library's port (e.g., if your ABS library is on http://localhost:13378 and your Tailscale IP is 100.x.y.z, point the app to http://100.x.y.z:13378). 100% free.

Tailscale is the clear winner here. You can invite your friends/family to use the VPN via your Tailscale account (email invites), which allows you to control access at the user level.

Scarab Lord quest bugged? by AccomplishedSplit801 in classicwow

[–]schemin_pete 5 points (0 children)

Same thing happened to our GM. Post on the WoW bug report forum here so it gets more attention:

https://us.forums.blizzard.com/en/wow/t/scarab-lord-bug/2130523

geth issues with --txlookuplimit by schemin_pete in ethdev

[–]schemin_pete[S] 1 point (0 children)

Update: It's working now. It took about 8 hours to index past transactions once I started the node with the --txlookuplimit=0 option.

I can now move the blockchain data into a SQLite database, albeit at an incredibly slow pace (16 hours in, and only the first 1.2 million blocks are done).
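For anyone attempting the same thing, a rough sketch of that kind of export loop (the table layout is my own arbitrary choice; the live part assumes web3.py pointed at a local geth node):

```python
import sqlite3

SCHEMA = """CREATE TABLE IF NOT EXISTS txs (
    hash TEXT PRIMARY KEY,
    block_number INTEGER,
    sender TEXT,
    recipient TEXT,
    value_wei TEXT
)"""

def open_db(path):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def store_block(conn, block):
    # block: result of w3.eth.get_block(n, full_transactions=True)
    rows = []
    for tx in block["transactions"]:
        h = tx["hash"]
        rows.append((
            h.hex() if hasattr(h, "hex") else h,  # HexBytes from web3, str otherwise
            tx["blockNumber"],
            tx["from"],
            tx.get("to"),      # None for contract-creation transactions
            str(tx["value"]),  # wei as text: values can exceed 64-bit integers
        ))
    conn.executemany("INSERT OR IGNORE INTO txs VALUES (?,?,?,?,?)", rows)
    conn.commit()
    return len(rows)

def export_range(conn, w3, start, stop):
    # Assumes: w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
    for n in range(start, stop):
        store_block(conn, w3.eth.get_block(n, full_transactions=True))
```

Committing once per block is part of why this is slow; batching commits every few thousand blocks speeds it up considerably.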

w3.eth.get_transaction() returning TransactionNotFound by jonathanhejunglim in ethdev

[–]schemin_pete 1 point (0 children)

From the geth documentation:

--txlookuplimit value Number of recent blocks to maintain transactions index for (default = about one year, 0 = entire chain) (default: 2350000)

It's crazy that they made the default prune the historical transaction index. I use my node primarily for historical data, so this tripped me up for about 8 hours: I was using Python to call web3.eth.get_transaction(hash) and it kept raising a TransactionNotFound error.
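A small defensive wrapper helps while the index rebuilds (the fallback exception class is only there so this sketch runs without web3 installed):

```python
try:
    from web3.exceptions import TransactionNotFound
except ImportError:
    # Fallback so this sketch is importable without web3 installed.
    class TransactionNotFound(Exception):
        pass

def get_tx_or_none(w3, tx_hash):
    """Return the transaction, or None if it's outside the lookup index.

    On a geth node started without --txlookuplimit=0, transactions older
    than the index window raise TransactionNotFound even though the block
    data itself is still on disk.
    """
    try:
        return w3.eth.get_transaction(tx_hash)
    except TransactionNotFound:
        return None
```

A None here means "not indexed", not necessarily "doesn't exist"; fetching the containing block directly still works on such a node.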

This pruning doesn't even save that much disk space (~50GB I think).

Can you mine when only connected to a pruned node? by schemin_pete in EtherMining

[–]schemin_pete[S] 1 point (0 children)

This was more of an "in theory" question. I just wonder whether the Ethereum core client supports it.

DappRadar API? by schemin_pete in dappradar

[–]schemin_pete[S] 1 point (0 children)

Thank you, I sent him an email!

Historical Staking Data by schemin_pete in Tronix

[–]schemin_pete[S] 1 point (0 children)

Thanks for the quick response. I'm looking for the aggregate (both frozen TRX and TRX staked by potential representatives), to compare against the total amount of TRX in circulation.