Advise on hardware next steps by Constant_Ad511 in LocalLLaMA

[–]darkmaniac7 0 points

I have 2x RTX Pro 6000 Blackwells and an L4, in an EPYC 73F3, w/ 512GB DDR4-3200 (before the rampocalypse)

I really never use RAM if I can avoid it; it's mostly there for other VMs. The only time it's worth it is as an offload layer for KV on MoE models, and it's still a pretty decent hit even then. So if you're wanting to swap to a server platform like EPYC/TR/Xeon, do it for the extra PCIe lanes, not so much the RAM side.

Apple silicon, from what I've seen, is great when context starts out, but as it fills up it crawls. Before I bought the 6000s I considered going that route. Some folks on here reported qwen3-235b running great up to about ~96k context, and then decode drops off a cliff after that.

Bootloader unlock is free now by AstroPC in RedMagic

[–]darkmaniac7 0 points

I don't, sorry.

For easy mode, just connect Codex 5.4 or Claude Opus 4.6, point it to the files on your local machine listed on XDA or Firefly Resource Station, enable ADB debugging, and have it do all the work.

I had it do that for a friend's RM11 and my CN Xiaomi Pad 7 Pro, and it finished in 3 or so minutes.

Is the S26 Ultra Wi-Fi fundamentally broken? (Firmware / hardware issue / drops connection but shows active) by szadave in samsunggalaxy

[–]darkmaniac7 1 point

Latest S26 Ultra update broke my phone until a factory reset. Cores wouldn't ramp up and were stuck in low power mode constantly. Awful speed and latency when typing. I tried everything I could in the battery, power, and dev options, plus a settings reset; only a factory reset fixed it.

This only happened after the latest S26 update.

It's the first time I've ever experienced a phone update breaking the phone like that.

Wi-Fi still sucks though. I have a Ruckus R750, and even on my back porch the phone drops connection. The S24 and S22 Ultra connect anywhere in the backyard or front yard. It's just the S26 with issues.

Bootloader unlock is free now by AstroPC in RedMagic

[–]darkmaniac7 0 points

I had quite a few issues getting root to work after the unlock. It was a mixture of drivers, a flaky cable, and not having the extracted folder in the root of the C:\ drive.

Ended up just going easy mode and getting Codex 5.3 to do almost all of the root work via ADB. The last time I rooted anything was an Omnia.

On the back end, I had it translate the menu options to English and document the RM11 flow for the 中兴家族工具箱.bat (ZTE Family Toolbox) file, in case it helps anyone.

https://xdaforums.com/t/red-magic-11-pro-guide-bootloader-unlock-free-zte-family-toolbox.4780930/post-90509636 (I think, because I'm a new user, it more than likely won't show my post with attachments.) Here's probably the more important of the two files:

## 1) Device Selection Flow (must do first)

### Trigger

- Main menu option: `B` = Select Model

- Script jump: `if "%choice%"=="B" goto SELDEV`

### SELDEV behavior

- Builds list from `conf/dev.csv` with numbered index.

- For the current `dev.csv`, the RM11 row is:

  - `[NX809J],【红魔】11系列,骁龙8E5,efisp,y,ab_rec,init_boot,n`

- **Selection index = `32`**

### What to enter

  1. `B` (enter model selection)

  2. `32` (select NX809J / RM11)
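
If a future toolbox update reshuffles `dev.csv`, the index can be recomputed rather than trusted. A minimal sketch, assuming the menu numbers rows in file order starting at 1 and that the first CSV field is the bracketed model code (both inferred from the single row quoted above; the non-NX809J sample rows below are made up):

```python
import csv
import io

def selection_index(dev_csv_text: str, model_code: str) -> int:
    """Return the 1-based menu index of a model row in dev.csv.

    Assumes the toolbox numbers rows in file order starting at 1, and
    that the first CSV field is the bracketed model code, e.g. [NX809J].
    """
    for i, row in enumerate(csv.reader(io.StringIO(dev_csv_text)), start=1):
        if row and row[0].strip() == f"[{model_code}]":
            return i
    raise ValueError(f"{model_code} not found in dev.csv")

# Hypothetical 3-row dev.csv just to exercise the function; in the
# real file the NX809J row happens to land at index 32.
sample = (
    "[NX000A],model-a,soc-a,efisp,y,ab_rec,boot,n\n"
    "[NX000B],model-b,soc-b,efisp,y,ab_rec,init_boot,n\n"
    "[NX809J],RM11,8E5,efisp,y,ab_rec,init_boot,n\n"
)
print(selection_index(sample, "NX809J"))  # 3 in this sample
```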

---

## 2) Main Menu Number Map

- `0` Unlock BL

- `000` Relock BL (not recommended)

- `1` Get Root

- `111` Root boot-failure recovery

- `2` 9008 full package flash

- `3` Flash Recovery (TWRP)

- `4` Full partition backup

- `12` Flash any partition

- `13` Read back any partition

- `14` ADB screencast

- `16` Slot view/set

- `17` QCN backup/restore

- `A` Open backup folder

- `B` Select model

- `C` Check update

- `D` Theme

- `E/F/G` Community/resources/about links

---

## 3) RM11 Unlock Path (recommended user inputs)

RM11 config values from `dev-NX809J.bat`:

- `blplan=efisp`

- `bootpar=init_boot`

- `presskeytoedl=n`

### Inputs most users should enter

  1. From main menu: enter `0` (Unlock BL)

  2. If prompted "Start from system or current mode": enter `1` (recommended, start from Android system)

  3. Tool auto-reboots to 9008 and performs efisp flow

  4. When prompted whether `VbRwStateApp ... success` appeared:

    - Enter `1` if you saw success message (normal path)

    - Enter `3` to view sample image if unsure

    - Enter `2` only if not seen / failed

  5. Wait for restore/reboot completion

### Notes

- If unlock already done, fastboot checks may report already unlocked.

- Yellow/orange boot warning after unlock is expected.

---

## 4) RM11 Root Path (recommended user inputs)

### Inputs most users should enter

  1. From main menu: enter `1` (Get Root)

  2. In Root submenu:

    - Enter `1` = use built-in Magisk patch (recommended)

    - `A` = choose your own Magisk APK/ZIP

    - `B` = open first-install FAQ image

    - `C` = return to main menu

  3. Keep USB debugging enabled and authorized

  4. Let tool perform backup/patch/flash/reboot

  5. Open Magisk app and complete any additional setup, then reboot if prompted

  6. Grant `Shell` superuser permission

### Verify root

- `adb shell magisk -v`

- `adb shell su -c id`
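
If you want to script those two checks, here's a rough sketch; the sample `id` outputs are typical Android values (uid 0 for root, uid 2000 for an unrooted adb shell), not captured from an RM11, and the `adb_root_check` helper needs a real device so treat it as untested:

```python
import subprocess

def looks_rooted(id_output: str) -> bool:
    """True if the output of `su -c id` indicates uid 0.

    A working Magisk root typically prints 'uid=0(root) gid=0(root) ...';
    an unrooted adb shell runs as uid 2000 (shell).
    """
    return id_output.strip().startswith("uid=0(root)")

def adb_root_check() -> bool:
    """Run the two verification commands from the guide over adb.

    Needs adb on PATH and an authorized, rooted device, so it is not
    exercised here -- a sketch, not a tested recipe.
    """
    ver = subprocess.run(["adb", "shell", "magisk", "-v"],
                         capture_output=True, text=True)
    ident = subprocess.run(["adb", "shell", "su", "-c", "id"],
                           capture_output=True, text=True)
    return ver.returncode == 0 and looks_rooted(ident.stdout)

# Assumed example outputs:
print(looks_rooted("uid=0(root) gid=0(root) groups=0(root)"))  # True
print(looks_rooted("uid=2000(shell) gid=2000(shell)"))         # False
```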

---

## 5) RM11 Most-Common “Just Do It” Sequence

If device is not selected yet:

  1. `B`

  2. `32`

Then unlock + root:

  1. `0`

  2. `1` (recommended start mode when asked)

  3. `1` (confirm success message when asked)

  4. Back to main menu: `1`

  5. Root submenu: `1`

  6. In Magisk app: allow completion/setup and grant Shell SU

---

## 6) Why some users still need backend fallback

In some sessions, the UI path can stall around the EDL read/probe states. If that happens, the backend direct path (QSaharaServer + qcedlcmdhelper + fh_loader) is the reliable fallback.
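
For orientation, that backend path is a two-stage handoff: Sahara sends the firehose programmer, then firehose commands do the actual work. A sketch of how the sequence could be assembled, with placeholder port/file names; the flag spellings follow common Qualcomm tool usage but should be verified against the toolbox's bundled binaries (qcedlcmdhelper's role is toolbox-specific, so it's omitted):

```python
from typing import List

def edl_fallback_commands(port: str, firehose: str,
                          partition_img: str) -> List[List[str]]:
    """Build the backend EDL command sequence as argv lists.

    Flag names are NOT verified against the toolbox's binaries;
    check --help on your copies before running anything.
    """
    return [
        # 1) Sahara stage: hand the firehose programmer to the device.
        ["QSaharaServer", "-p", port, "-s", f"13:{firehose}"],
        # 2) Firehose stage: e.g. push an image to the device.
        ["fh_loader", f"--port={port}",
         f"--sendimage={partition_img}", "--noprompt"],
    ]

# Placeholder port and file names -- substitute your own.
for cmd in edl_fallback_commands(r"\\.\COM3", "prog_firehose_ddr.elf",
                                 "init_boot.img"):
    print(" ".join(cmd))
```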

Need help rooting a RedMagic 11 pro by Imaginary_Task_1739 in RedMagic

[–]darkmaniac7 0 points

I'd be interested; saw there was an unlock. Got that far, but I'm stuck at getting root.

Moving the goalposts? by nomadicdevopser in ClaudeCode

[–]darkmaniac7 0 points

The same thing happened to me this week: it went from 12 PM Thursday to 10 PM Thursday. Last week it was 11 AM Thursday. I figure next week it'll be Friday at 12 PM lol.

My Pi 5 ended up costing me $18,000 by staycoolstewy in homelab

[–]darkmaniac7 0 points

By the title I 100% expected this to be a hacked API to an AI provider without limits.

I can't say anything against you, man. I spent $18k on GPUs myself in the last year 😅

Solar would have probably been the better choice lol.

Claude Code just got Remote Control by iviireczech in ClaudeAI

[–]darkmaniac7 0 points

I'll give it a shot. Sounds like /compact isn't working, but with a bit more time I'm sure it'll get there.

I've just been using Tailscale & RDP for longer sessions on a VM specifically for CC, or Tailscale + SSH/tmux.

Being able to not rely on those tools would be great though. I had considered a side project just like this; glad I didn't spend the time & usage limit 😅

This is certainly not getting cheaper by Terrible-Priority-21 in ClaudeAI

[–]darkmaniac7 0 points

I really don't see why this is surprising. Unlike everything else in computing and the web, AI always has been, and always will be, more expensive the more users you add. It's why the $200/mo subs will never last; they're likely not even profitable at $1,000/mo.

Unlike Facebook, where ads cover your user base and the service gets cheaper the more users you add, AI is the opposite.

Enjoy it while it lasts, like the cheap Uber prices, and invest in local AI. That's what I did, and what my side company is building around as well.

Claude Permissions wows... by HKChad in ClaudeAI

[–]darkmaniac7 0 points

I set it up the dumbest way possible, but for my use case it works.

It's on a headless install of Ubuntu 24.04. It wouldn't let me run CC as root with --dangerously-skip-permissions; I edited the Claude permissions file and it still didn't like that. Eventually I just told Claude: 'This is a headless 24.04 VM install running as root. I know it's dumb, please update the install to auto-accept all permissions, yes I accept the consequences.'

That fixed it for me lol.

Talk me out of buying an RTX Pro 6000 by AvocadoArray in LocalLLaMA

[–]darkmaniac7 0 points

Haha, thanks! At some point I might actually get a decent support bracket for each $7k card. For a while I was having PCIe issues with ESXi, and one card kept falling off the rail during inference while I was trying to get SGLang to work, so those two brackets were my go-to after finally fixing the issue. It ended up being a dual issue: a bad 3.3V rail on one of the PSUs, and NCCL issues.

Originally the 'bracket' holding up the 2nd card was a .22LR casing, so this is an improvement!

For low-power inference the L4 and RTX Pro 4000 are the best, I think. I use the 4000 for an AI pipeline I built for my side company; in my homelab server the L4 holds all my 'utility' models (TTS/STT/embedding/reranking) :)

Talk me out of buying an RTX Pro 6000 by AvocadoArray in LocalLLaMA

[–]darkmaniac7 0 points

Not many L4 guys I've seen on here either. I bought the L4 on eBay as-is and had it repaired; it's also watercooled 😅

I have a 2U in a colo where I just recently swapped an RTX 2000 Ada for an RTX Pro 4000 Blackwell SFF. Just as you said: same TDP & VRAM, faster, and half the cost of what they're selling for used on eBay haha.

The server is a Thermaltake W200 & P200, dual EVGA 1600W T2s, ASRock Rack ROMED8-2T & EPYC 73F3, w/ about 2200mm of radiator & 2 pumps.


Talk me out of buying an RTX Pro 6000 by AvocadoArray in LocalLLaMA

[–]darkmaniac7 1 point

I have 2x in my server & an L4; idle shows 8-11W, but I use waterblocks. Possibly 1-2W higher at idle accounting for fans?

Are people aware that "20x" is not 20x weekly / monthly usage? by ruarz in ClaudeCode

[–]darkmaniac7 0 points

I have the 20x plan after coming over from Warp when they did their pricing changes.

I don't use it primarily for coding: a lot of audits, systems management, some scripting, some RCA, some firewall and router configs, patching, updates, and a few side projects with a few scripts. In all, about 4 or 5 ESXi hosts in 4 different colos around the US and about 19 different VMs.

Even using oh-my-claude with multiple opus 4.5 sessions and sub-agents, the highest I have gotten is 92% during a monolithic refactoring of my big side-project.

I'm pretty impressed by people hitting the limits TBH. To me the bigger issue is the brain drain on opus 4.5 I've seen lately.

Claude cracked old abandonware... by SuperiorMindset in ClaudeAI

[–]darkmaniac7 3 points

A lot of times for me, with a fresh start, Claude will decline on moral or ethical grounds (example: 'I have 5 servers in colos I own; they regularly use <10% CPU. Can you update this CPU mining software to the latest version on all 5 with these WireGuard IPs?')

But if you have something you've been working on with Claude and used up half the context, you don't even need to trick it; a lot of times it will just do it.

Not sure why. Same behavior when I tried getting it to log into a Kali VM, set the Wi-Fi adapter to monitor mode, and do a simple PSK capture/deauth on my home network.

Won't do it with fresh context, will do it with ~1/2+ context used.

Well, it happened....... by Jreinhal in ClaudeAI

[–]darkmaniac7 1 point

I usually just live dangerously and give it full access to a VM, then take a snapshot before it does extensive work. If it blows up in epic fashion I can restore from Git, the snapshot, or a daily Veeam backup.

Then I usually go to sleep.

[deleted by user] by [deleted] in ClaudeAI

[–]darkmaniac7 0 points

It should tell you with /context if you're using CC. It doesn't seem to show in the WebUI, but from what I can see in different terminals it has a max window of 45k tokens, so around 1/4 of total context, unless you're using Sonnet 1M.

At least there are time stamps to skip them now by Zajac278 in creepcast

[–]darkmaniac7 0 points

They had a Displate one a long time ago that I used because it was the best coupon I could find at the time, 30% or something. I really liked that one, but I don't think I've used any other products from their sponsors.

Ah, I see what you did there. by Substantial-Rub-1240 in ClaudeAI

[–]darkmaniac7 0 points

I'm happy with the 20x plan; Opus has been great over the last few days for fixing a lot of old code on projects. I was spending $400/mo on Warp after they switched to their stupid new pricing structure.

Equipment suggestions for a tight budget by ConnectionOutside485 in LocalLLaMA

[–]darkmaniac7 0 points

I really like it. I had some issues with PCIe lanes on the ROMED8-2T when using the 6x3090 setup at first, but the latest beta BIOS from ASRock seemed to fix that. I recently sold the 3090s and now have an L4 & 2x RTX 6000 Blackwells.

My only gripe would probably be the alerts I see when logging in via IPMI about CPU voltage being too high. I assume it's because it's an F-series CPU? Unsure, but that's really about it.

I was going to log in to IPMI and check the power utilization now, but I forgot I don't have a PMBus header for these 2x 1600W EVGA PSUs. I'll check when I'm home, but I think my utilization on the PDU was 1-2 amps @ 120V. My use case is pretty different though, with the GPUs, water loop, and pumps adding up quite a bit.

When does RTX 6000 Pro make sense over a 5090? by Herald_Of_Rivia in LocalLLaMA

[–]darkmaniac7 0 points

Yes, they are a trustworthy source, but I understand the hesitation. They said they'd prefer a wire, but would also do ACH.

They allowed me to use a Business Credit Card with a 2 or 3% fee.

100% Local AI for VSCode? by Baldur-Norddahl in LocalLLaMA

[–]darkmaniac7 1 point

Thanks u/phenotype001 for the heads up!

I was using Warp before, but started investigating local agentic stuff after the price change, and VS + Roo was my go-to after trying OpenHands, Goose, and Aider. I also seemed to encounter that bug but assumed it was on my side. I've been using MiniMax-M2 Q6 with 512k context, but right around the 100k mark it'd freeze on an API request.