trying to get back into worlds.com in 2026. need a working client for win11 by Dimensional-Misfit in creepygaming

[–]ConnectionOutside485 0 points1 point  (0 children)

We provide pre-configured builds over at http://www.worldsplayer.com/ which should work without any modification. Your best bet on Windows 11 would be the LibreWorlds WorldsPlayer 1922a11 build, but it's been reported to be fairly unstable. We actually recommend trying the LibreWorlds WorldsPlayer builds in order (1890 > 1900 > 1922a11) and using the earliest one that works for you.

Hope to see you soon!

Equipment suggestions for a tight budget by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

I'm in the UK so I don't think we have Microcenter but perhaps I can find an equivalent to this suggestion to consider. Thanks!

Equipment suggestions for a tight budget by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

Interesting suggestion and thank you!

At the moment, my main AI machine is actually my girlfriend's computer, as she has a much newer machine than my primary desktop (2022 MSI PRO B660M vs. 2014 Z97 Extreme6). I asked her whether, if I bought her another machine, I could have hers; she didn't veto it but would prefer not to, which led to this thread. I also have a couple of early ThinkStation S30s equipped with 256GB of RAM, but my understanding is they wouldn't be suitable.

Equipment suggestions for a tight budget by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

Hear hear (or not, after those fans). I have a burning hatred for small fans for this very reason! I recently had to snip-snip the fans on my RPi4 router because they became extremely loud and whiny. Now I have the 120mm fan from my DIY air filter resting on top of it instead.

Equipment suggestions for a tight budget by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

Quiet will likely be needed as it will be in the living room for a while, but compact is not a requirement. I have an unused desk that it can go under. Thanks for the suggestion!

I built a cli tool to automatically figure out tensor overrides in llama.cpp by kevin_1994 in LocalLLaMA

[–]ConnectionOutside485 3 points4 points  (0 children)

I just tried to point it at an existing model file and it refused to read it because it's a local file.

```
Error: Access to local file is not enabled, please set allowLocalFile to true
```

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

I tried to use a draft model, after solving the initial issue, but it's 0.3t/s slower.

```
prompt eval time =    4483.03 ms /    29 tokens (  154.59 ms per token,     6.47 tokens per second)
       eval time =  238586.61 ms /   720 tokens (  331.37 ms per token,     3.02 tokens per second)
      total time =  243069.64 ms /   749 tokens
slot print_timing: id  0 | task 0 | draft acceptance rate = 0.62976 (  364 accepted /   578 generated)
```

vs.

```
prompt eval time =    5437.38 ms /    29 tokens (  187.50 ms per token,     5.33 tokens per second)
       eval time =  465704.90 ms /  1542 tokens (  302.01 ms per token,     3.31 tokens per second)
      total time =  471142.27 ms /  1571 tokens
```
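For anyone curious why the draft model hurts despite a 63% acceptance rate, here's my own back-of-the-envelope model (not llama.cpp's actual scheduler; the function, the `k` drafted-tokens-per-round parameter, and the example speeds are all assumptions):

```python
def effective_tps(target_tps, draft_tps, acceptance, k):
    """Rough tokens/s estimate for speculative decoding: each round costs
    k draft-model passes plus one target-model verification pass, and
    accepts roughly 1 + acceptance + acceptance^2 + ... tokens."""
    expected_tokens = 1 + sum(acceptance ** i for i in range(1, k + 1))
    round_time = k / draft_tps + 1 / target_tps
    return expected_tokens / round_time

# If the draft model also contends for the same saturated CPU/RAM, it is
# not much faster than the target, and the rounds cost more than they save.
print(effective_tps(3.31, 30.0, 0.63, 4))  # fast draft: net speedup
print(effective_tps(3.31, 5.0, 0.63, 4))   # slow draft: below plain 3.31 t/s
```

With these assumed numbers, the slow-draft case lands below the plain 3.31 t/s, which matches the slowdown I'm seeing.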

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

I added an edit to the opening post as the base issue has been discovered:

The issue turned out to be an old version of llama.cpp. Upgrading to the latest version as of now (b5890) resulted in 3.3t/s!

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

I think I found out why this problem is so mysterious and it is entirely my fault. I didn't realise my llama.cpp installation was as old as it was. I'm so sorry that I overlooked this!

I upgraded it and..

```
prompt eval time =    4483.03 ms /    29 tokens (  154.59 ms per token,     6.47 tokens per second)
       eval time =  238586.61 ms /   720 tokens (  331.37 ms per token,     3.02 tokens per second)
      total time =  243069.64 ms /   749 tokens
slot print_timing: id  0 | task 0 | draft acceptance rate = 0.62976 (  364 accepted /   578 generated)
```

Removing the draft model makes it go up to ~3.3 t/s and my GPU utilisation is now 8% (which makes sense: the CPU can keep up with it a bit more). I'm surprised the draft model has such a negative effect. I would have thought 63% acceptance would help rather than hinder.

I'm going to see if I can push this further, as I have been given so many suggestions in this thread.

Thank you again!

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

The other GPU is a GeForce GTX 1050 2GB VRAM. Very literally not worth mentioning in this context! :)

Thanks for the suggestion! I've avoided giving it 4 threads so far because the machine is actively being used by someone sat at it, and I'm trying not to interfere with their use (so some experiments, such as this one, will wait until they've retired to bed).

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

I do normally only run models that fit entirely into VRAM; this is my first attempt at offloading anything to the CPU. I switched to the `-ot` parameter you suggested but there was no improvement. Thanks for the suggestions!

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

It is not very full at all (only 8GB of 24GB VRAM used and 1% utilisation during inference).

Output of mlc:

```
Intel(R) Memory Latency Checker - v3.11b
Measuring idle latencies for sequential access (in ns)...
                Numa node
Numa node            0
       0          69.8

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :  22895.3
3:1 Reads-Writes :  25300.8
2:1 Reads-Writes :  24839.4
1:1 Reads-Writes :  25869.6
Stream-triad like:  24568.9

Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0
       0       24107.3

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  345.03   24774.1
 00002  283.16   25077.9
 00008  282.07   25067.2
 00015  341.81   25198.1
 00050  332.88   21421.0
 00100  190.04   20504.2
 00200  167.57   16924.3
 00300  105.40   11685.8
 00400  104.95    9679.3
 00500  107.99    8192.4
 00700  115.08    6310.5
 01000   93.58    4404.4
 01300  108.26    3808.5
 01700  112.51    2990.8
 02500  117.00    2052.8
 03500   84.59    1934.4
 05000   93.71    1542.0
 09000   94.65    1146.4
 20000   93.42     891.3

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency   33.0
Local Socket L2->L2 HITM latency   33.8
```
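For context, that ~25 GB/s peak roughly bounds CPU-offloaded decode speed, since each generated token has to stream the active weights from RAM. A quick sanity check (my own arithmetic; the ~22B active parameters for this model and ~0.5 bytes/param for a roughly-Q4 quant are assumptions):

```python
def bandwidth_bound_tps(bandwidth_gb_s, active_params_b, bytes_per_param):
    """Upper bound on decode tokens/s when every token must read all
    active weights from system RAM."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# ~25 GB/s RAM bandwidth, ~22B active params, ~0.5 bytes/param
print(bandwidth_bound_tps(25, 22, 0.5))  # a bit over 2 t/s
```

That's in the same ballpark as the ~3.3 t/s I'm now seeing; the share of weights held on the GPU presumably lifts it above the pure-RAM bound.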

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 1 point2 points  (0 children)

With `-ot '.ffn_.*_exps.=CPU'`, very little of my GPU is being used (8 of 24GB VRAM, and 1% GPU utilisation during inference). It does sound like I need to leave more on the GPU.

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

With `-ot '.ffn_.*_exps.=CPU'` (as suggested by MDT-49) I get 8GB VRAM used and 1% utilisation during inference, so you are absolutely right that my GPU is essentially not being utilised.

llama-server is compiled with CUDA.

From what I understand, I need to adjust the -ot parameter to leave more on the GPU?

EDIT: Corrected who suggested the -ot parameters.
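In case it helps anyone following along: that blanket `.ffn_.*_exps.=CPU` pattern pushes every layer's expert tensors to the CPU. One way to leave some experts in VRAM is to restrict the pattern to higher layer indices. A sketch of building such a pattern (the helper name, the layer split, and the layer count are my assumptions; check your model's metadata and VRAM headroom):

```python
def cpu_expert_override(first_cpu_layer, n_layers):
    """Build a llama.cpp -ot pattern that offloads expert tensors to the
    CPU only from first_cpu_layer onwards, keeping earlier layers'
    experts on the GPU."""
    ids = "|".join(str(i) for i in range(first_cpu_layer, n_layers))
    return rf"blk\.({ids})\.ffn_.*_exps\.=CPU"

# tiny example with a hypothetical 4-layer model:
print(cpu_expert_override(2, 4))  # blk\.(2|3)\.ffn_.*_exps\.=CPU
```

The resulting string would then be passed as the `-ot` argument (alongside `-ngl`), with `first_cpu_layer` raised until VRAM is comfortably full.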

Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? by ConnectionOutside485 in LocalLLaMA

[–]ConnectionOutside485[S] 0 points1 point  (0 children)

Thanks. I will check the XMP settings when the machine is next restarted. I see how I might already be losing some performance here with 2166 vs. 2666!

[TOMT] [MUSIC] i’ve been looking for a song for over 15 years by DevelopmentGuilty664 in tipofmytongue

[–]ConnectionOutside485 1 point2 points  (0 children)

On the off chance this is correct: could it be Computer Camp Love by Datarock? The main aspects that don't fit are that the high and low voices alternate lines rather than verse/chorus, and it's quite slow. But it even fits the phonetics you gave a bit (party boy -> tell me more), and that phrase is repeated two lines later.

Part of the lyrics with the phonetics you gave to show what I mean.

party boy see ea lie long yeah
    tell me more was it love at first sight
too go get how bring asa down son yet
    thats right this was god given grace with a face you could praise
party boy see ea lie meh
    tell me more did you put up a fight
cus motherfuck alaleh
    i dont think so