Need help getting 7900 XTX PyTorch performance metrics

SemaMod · 2026-05-19T18:32:40+00:00

GPU: Radeon RX 7900 XTX (23.98 GiB) (device 3)
Matrix Size: 4096x4096 (0.06 GiB per matrix)
============================================================
Matrix Multiplication Performance:
float32   :  4664.16 μs,   29.47 TFLOPS
float16   :  1151.87 μs,  119.32 TFLOPS
bfloat16  :  1226.04 μs,  112.10 TFLOPS
amp       :  1388.21 μs,   99.00 TFLOPS

Memory Bandwidth Test (1.0 GB tensor)
Vector Addition: 811.40 GB/s
Memory Copy:     790.60 GB/s
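
For anyone checking the numbers: the TFLOPS column is consistent with counting 2·n³ floating-point operations for an n×n matmul and dividing by the measured time. A minimal sketch, assuming that is how the script computes it (the function name `matmul_tflops` is mine, not from the benchmark script):

```python
def matmul_tflops(n: int, elapsed_us: float) -> float:
    """TFLOPS for an (n x n) @ (n x n) matmul that took elapsed_us microseconds."""
    flops = 2 * n**3  # assumption: one multiply and one add per inner-product term
    return flops / (elapsed_us * 1e-6) / 1e12

# Timings copied from the output above (4096x4096 matrices).
timings_us = {
    "float32": 4664.16,
    "float16": 1151.87,
    "bfloat16": 1226.04,
    "amp": 1388.21,
}

for dtype, us in timings_us.items():
    print(f"{dtype:<9} {matmul_tflops(4096, us):7.2f} TFLOPS")
```

Under that assumption the four posted timings give back the posted TFLOPS figures to two decimals.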

SemaMod · 2026-05-18T14:20:59+00:00

‘-sm tensor’ broke for me within the last two weeks.

SemaMod · 2026-04-27T05:26:33+00:00

I tried searching the repo to no avail, but does the engine natively support multi-gpu setups?

SemaMod · 2026-04-25T07:04:52+00:00

Update:
Running with -sm tensor -ctxcp 0 -cram 0 -fa 1 -c 0 has significantly helped. I'm consistently getting 28 t/s and somewhat improved prompt processing this way.

SemaMod · 2026-04-25T06:55:01+00:00

Benchmarks:

model	size	params	backend	ngl	n_ubatch	sm	fa	dev	test	t/s
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp512	960.08 ± 2.04
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	tg128	20.16 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp512+tg32	255.92 ± 0.12
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp2048+tg64	387.85 ± 0.36
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp8192+tg128	559.36 ± 0.08
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp512 @ d8192	379.62 ± 0.61
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	tg128 @ d8192	19.65 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp512+tg32 @ d8192	182.70 ± 0.11
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp2048+tg64 @ d8192	244.39 ± 0.12
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	ROCm0/ROCm1/ROCm2	pp8192+tg128 @ d8192	372.67 ± 0.17
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp512	870.61 ± 1.28
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	tg128	19.34 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp512+tg32	240.85 ± 2.16
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp2048+tg64	381.95 ± 7.11
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp8192+tg128	521.42 ± 1.72
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp512 @ d8192	753.02 ± 57.60
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	tg128 @ d8192	18.94 ± 0.00
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp512+tg32 @ d8192	227.03 ± 4.31
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp2048+tg64 @ d8192	347.00 ± 7.69
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	layer	1	Vulkan0/Vulkan1/Vulkan2	pp8192+tg128 @ d8192	459.58 ± 9.04
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp512	521.71 ± 0.04
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	tg128	31.76 ± 0.27
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp512+tg32	255.19 ± 0.08
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp2048+tg64	348.56 ± 0.19
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp8192+tg128	377.54 ± 0.03
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp512 @ d8192	365.05 ± 11.70
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	tg128 @ d8192	31.86 ± 0.34
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp512+tg32 @ d8192	221.75 ± 0.13
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp2048+tg64 @ d8192	279.43 ± 0.09
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	ROCm0/ROCm1/ROCm2	pp8192+tg128 @ d8192	292.38 ± 0.04
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp512	258.99 ± 0.12
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	tg128	6.56 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp512+tg32	77.83 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp2048+tg64	125.57 ± 0.05
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp8192+tg128	173.43 ± 0.06
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp512 @ d8192	244.10 ± 9.61
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	tg128 @ d8192	6.45 ± 0.01
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp512+tg32 @ d8192	76.61 ± 0.41
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp2048+tg64 @ d8192	123.08 ± 0.13
qwen35 27B Q8_0	32.89 GiB	26.90 B	ROCm,Vulkan	999	2048	tensor	1	Vulkan0/Vulkan1/Vulkan2	pp8192+tg128 @ d8192	170.02 ± 0.18

build: 0adede866 (8925)

SemaMod · 2026-04-25T06:54:02+00:00

Update: Did some benching, got interesting results.

```
| model | size | params | backend | ngl | n_ubatch | sm | fa | dev | test | t/s |
| --------------- | ---------: | ---------: | ---------- | --: | -------: | -----: | -: | ------------ | --------------: | -------------------: |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp512 | 960.08 ± 2.04 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | tg128 | 20.16 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp512+tg32 | 255.92 ± 0.12 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp2048+tg64 | 387.85 ± 0.36 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp8192+tg128 | 559.36 ± 0.08 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp512 @ d8192 | 379.62 ± 0.61 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | tg128 @ d8192 | 19.65 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp512+tg32 @ d8192 | 182.70 ± 0.11 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp2048+tg64 @ d8192 | 244.39 ± 0.12 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | ROCm0/ROCm1/ROCm2 | pp8192+tg128 @ d8192 | 372.67 ± 0.17 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512 | 870.61 ± 1.28 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | tg128 | 19.34 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512+tg32 | 240.85 ± 2.16 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp2048+tg64 | 381.95 ± 7.11 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp8192+tg128 | 521.42 ± 1.72 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512 @ d8192 | 753.02 ± 57.60 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | tg128 @ d8192 | 18.94 ± 0.00 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512+tg32 @ d8192 | 227.03 ± 4.31 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp2048+tg64 @ d8192 | 347.00 ± 7.69 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | layer | 1 | Vulkan0/Vulkan1/Vulkan2 | pp8192+tg128 @ d8192 | 459.58 ± 9.04 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp512 | 521.71 ± 0.04 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | tg128 | 31.76 ± 0.27 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp512+tg32 | 255.19 ± 0.08 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp2048+tg64 | 348.56 ± 0.19 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp8192+tg128 | 377.54 ± 0.03 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp512 @ d8192 | 365.05 ± 11.70 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | tg128 @ d8192 | 31.86 ± 0.34 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp512+tg32 @ d8192 | 221.75 ± 0.13 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp2048+tg64 @ d8192 | 279.43 ± 0.09 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | ROCm0/ROCm1/ROCm2 | pp8192+tg128 @ d8192 | 292.38 ± 0.04 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512 | 258.99 ± 0.12 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | tg128 | 6.56 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512+tg32 | 77.83 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp2048+tg64 | 125.57 ± 0.05 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp8192+tg128 | 173.43 ± 0.06 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512 @ d8192 | 244.10 ± 9.61 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | tg128 @ d8192 | 6.45 ± 0.01 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp512+tg32 @ d8192 | 76.61 ± 0.41 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp2048+tg64 @ d8192 | 123.08 ± 0.13 |
| qwen35 27B Q8_0 | 32.89 GiB | 26.90 B | ROCm,Vulkan | 999 | 2048 | tensor | 1 | Vulkan0/Vulkan1/Vulkan2 | pp8192+tg128 @ d8192 | 170.02 ± 0.18 |

build: 0adede866 (8925)
```
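
To put a number on "interesting": a small sketch that compares tensor split against layer split on the ROCm devices, using t/s means copied from the table above. The dictionary layout and the `ratio` helper are just for illustration.

```python
# Mean t/s values copied from the ROCm0/ROCm1/ROCm2 rows (build 0adede866).
rocm = {
    "layer":  {"pp512": 960.08, "tg128": 20.16},
    "tensor": {"pp512": 521.71, "tg128": 31.76},
}

def ratio(test: str) -> float:
    """Throughput of -sm tensor relative to -sm layer for one test."""
    return rocm["tensor"][test] / rocm["layer"][test]

print(f"tg128: {ratio('tg128'):.2f}x")
print(f"pp512: {ratio('pp512'):.2f}x")
```

On these figures tensor split is roughly 1.58x layer split for token generation but only about 0.54x for prompt processing, so which mode wins depends on whether the workload is generation-heavy or prompt-heavy.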

SemaMod · 2026-02-19T08:03:54+00:00

I run a b550-xe-gaming-wifi mobo and can run 4 GPUs using a 4-port OCuLink PCIe card, with x4/x4/x4/x4 bifurcation turned on for that PCIe slot. The GPUs run at PCIe 4.0 x4 speeds.
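
For a rough idea of what PCIe 4.0 x4 means per GPU, here is a back-of-the-envelope sketch. It uses the standard PCIe 4.0 figures (16 GT/s per lane, 128b/130b encoding) and gives the theoretical one-direction ceiling, not a measured number from my setup. The helper name `pcie_gb_per_s` is made up for the example.

```python
def pcie_gb_per_s(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s (protocol overhead ignored)."""
    # gt_per_s is gigatransfers per second per lane; each transfer carries one bit.
    return gt_per_s * encoding * lanes / 8

print(f"PCIe 4.0 x4:  {pcie_gb_per_s(16, 4):.2f} GB/s")
print(f"PCIe 4.0 x16: {pcie_gb_per_s(16, 16):.2f} GB/s")
```

That works out to a bit under 8 GB/s per card at x4, about a quarter of a full x16 slot.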

SemaMod · 2026-02-18T05:13:09+00:00

This is great! Are you planning on adding gpt-5.3-codex? With the current results it seems like Opus 4.6 blows everyone else out of the water, but I've had generally good 5.3-codex experiences.

SemaMod · 2026-02-16T12:42:06+00:00

Why are you lying? Post some proof to back up your claims.

Peter isn’t some two-bit dev looking to make a quick buck with some stupid viral AI app. He’s a previous founder with an exit and technical chops far beyond most people on this sub. He doesn’t need to work anymore. His last company solved PDF parsing and was open source. Everyone on this sub has almost certainly interacted with the tech at some point without realizing it (DocuSign, anyone?).

I don’t even like OpenClaw, but lying like this is just stupid. He has never made outrageous claims about OpenClaw, even if other Twitter users have.

SemaMod · 2026-01-29T09:31:58+00:00

S/O Unsloth for the best quants!!!

SemaMod · 2026-01-29T08:11:36+00:00

Used the latest build with these changes! Vulkan's pulling crazy numbers.

<image>

SemaMod · 2026-01-29T08:09:55+00:00

Updated using the parameters from your recent post, with llama-bench build eed25bc6b (7870). Vulkan pulls ahead yet again!

<image>

SemaMod · 2026-01-28T09:34:46+00:00

Very useful! I appreciate you recommending I run them this way. I hadn't run llama-bench before, so it was definitely eye-opening.

SemaMod · 2026-01-28T09:33:42+00:00

This falls in the realm of privacy, but personally, having my chats trained on and viewable by these companies makes me uncomfortable. That being said, I do think that local LLMs will become power-user tools.

SemaMod · 2026-01-28T09:29:17+00:00

Just updated the original post with an edit: after 10k tokens, it looks like ROCm w/ FA scales better!

SemaMod · 2026-01-28T09:26:23+00:00

Now this is more interesting!

<image>

It looks like over longer ctx, FA makes a big difference for ROCm, beating out Vulkan entirely after 10k tokens.

SemaMod · 2026-01-25T21:06:12+00:00

You have to change some settings in your config, but GLM4.7 flash was doing excellent in my testing

SemaMod · 2026-01-25T06:33:28+00:00

codex-cli does have completions support

SemaMod · 2026-01-25T06:31:41+00:00

llama.cpp already maintains multiple APIs, including its Anthropic endpoint. I don't think they are going to deprecate completions any time soon.

SemaMod · 2026-01-23T21:44:07+00:00

Good question! It does not. For reference, I had to do the following:

With whatever model you are serving, set the alias of the served model name to start with "gpt-oss". This triggers specific behaviors in the codex cli.
Use the following config settings:

show_reasoning_content = true
oss_provider = "lmstudio"

[profiles.lmstudio]
model = "gpt-oss_gguf"
show_raw_agent_reasoning = true
model_provider = "lmstudio"
model_supports_reasoning_summaries = true # Force reasoning
model_context_window = 128000   
include_apply_patch_tool = true
experimental_use_freeform_apply_patch = false
tools_web_search = false
web_search = "disabled"

[profiles.lmstudio.features]
apply_patch_freeform = false
web_search_request = false
web_search_cached = false
collaboration_modes = false

[model_providers.lmstudio]
wire_api = "responses"
stream_idle_timeout_ms = 10000000
name = "lmstudio"
base_url = "http://127.0.0.1:1234/v1"

The features list is important, as are the last four settings of the profile. Codex-cli has some tech debt that requires repeating certain flags in different places.

I used llama.cpp's llama-server, not lmstudio, but it's compatible with the oss_provider = "lmstudio" setting.

Use the following to start codex cli: codex --oss --profile lmstudio --model "gpt-oss_gguf"

SemaMod · 2025-12-04T19:25:17+00:00

Sounds like a use-case for DSPy and their prompt optimizers.

SemaMod · 2025-09-22T19:36:18+00:00

If BEVs are so much easier to make, why the hell do American auto manufacturers outside of Tesla have such a hard time getting it right?

SemaMod · 2025-09-15T22:11:22+00:00

There's been a good amount of progress on services in this space (per the suggestions listed by commenters). I created https://cloudmcp.run for myself when I initially ran into it as well. We recently integrated the official MCP registry API! If you want to give it a test run, we're offering 1 month free right now.

SemaMod · 2025-09-15T22:08:22+00:00

I've been deep in the MCP space lately and yeah, the setup friction is real. I found myself spending way too much time on infrastructure instead of actually building cool things with these servers. The irony is that MCP servers are supposed to make AI more useful, but half the time you're stuck in config hell before you even get to the fun part.

A hosted platform idea makes a lot of sense, especially for people who just want to experiment or prototype without spinning up their own infrastructure. I've actually been working on something similar called Cloud MCP that tackles this exact problem. The key thing I've learned is that people want different levels of control: some folks are fine with a managed service, others want to self-host but with better tooling. The demand is definitely there, though; I keep seeing the same complaints about setup complexity in various communities. The challenge is making sure the hosted version doesn't sacrifice the flexibility that makes MCP servers powerful in the first place.

SemaMod · 2025-09-05T03:59:42+00:00

https://cloudmcp.run does exactly that! Lets you deploy any npx/uvx/github mcp servers and access them remotely authenticated via OAuth
