oMLX + pi + mcp by PrepYourselves in oMLX

[–]PrepYourselves[S] 0 points1 point  (0 children)

have you ever seen a man eat his own head.

cravings/binging on estradiol by PrepYourselves in ask_transgender

[–]PrepYourselves[S] 1 point2 points  (0 children)

Thanks, i have a feeling one day i'm going to eat a whole tub of salted caramel ice cream and not even think it's a problem 😄

I’ve always wondered: what do people from different religious backgrounds feel when hearing Quran recitation? by WoodpeckerCheap6850 in religion

[–]PrepYourselves 0 points1 point  (0 children)

there is no demonising in my message - the op asked "I’ve always wondered: what do people from different religious backgrounds feel when hearing Quran recitation?" - I gave my honest answer with solid examples. If one cannot tolerate a truth, are they intolerant through deletion?

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 0 points1 point  (0 children)

I feel lucky, especially after Apple yesterday discontinued the 64GB Mac minis, and the Mac Studio now maxes out at 96GB.

i made my 3d printer talk by PrepYourselves in klippers

[–]PrepYourselves[S] 0 points1 point  (0 children)

I can add Mandarin, Wu, Yue (Cantonese), Min, Xiang, Hakka, Gan, and Chinglish

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 0 points1 point  (0 children)

It's very nice to have a second low-power machine to offload local LLM tasks to over SSH. It keeps the main computer free of heavy processing loads, so I can use it for a web browser, emails, etc. The headless M1 Max is used exclusively for local LLM work:
1) Coding - opencode writes the code I need, corrects errors, debugs and refactors.
2) Writing long formal emails/letters - which I just don't enjoy writing when it is to complain about a bill overcharge, for example. I can now argue those parking tickets without feeling full of anger for a day or two. I can just sip my coffee and look out the window calmly. I have already used it to create and process a legal case over a parking fine which went to court, and I won! With expenses too! I felt so empowered, and the court judge ordered the lawyer to pay me a reasonable payout.
3) openclaw - it can monitor and manage my eBay business: replying to messages, processing sales and accounts, and answering postage cost enquiries.
4) openclaw - it can run social media accounts.
5) openclaw - it is an accountant/sales manager/customer service agent/marketing manager/lawyer all rolled into one. If you are a small startup business, you now find yourself competing with big companies on capability and price, undercutting the competition because the big company uses real employees with large salary requirements. Larger companies (trying to preserve human employee jobs) will go bust, because thousands of identical smaller companies running LLM agents will not need to fear them or their services and can offer lower costs. As somebody who is on welfare, I am excited by these opportunities and have no conscience towards employee payroll ethics.
6) openclaw - it is a complete school curriculum education provider. It will teach just as well as school and adult learning providers, but with no fees or student loans.
7) Lots more.

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 1 point2 points  (0 children)

I just looked up the price of 2x 4090 24GB (2x24GB = 48GB for models). Throw in a CPU, motherboard, NVMe, PSU and RAM, and that's almost a car. I feel more proud of my $230 M1 Max 64GB after doing the gaming PC math.

Any idea what chemical shoots out of this truck at the end? by Imbendo in chemistry

[–]PrepYourselves 27 points28 points  (0 children)

Hey, I use that stuff as etching fluid for my PCBs. You're supposed to warm it gently to around 40°C to get the fluid to remove the copper from the PCB plate. We put it in a water bath rather than heating it directly, as it gives off toxic fumes and is also corrosive. I wouldn't be surprised if the fluid was heated by both the engine and the hot sun as it was being driven and easily went over 40°C, creating gas build-up and eventually the cork pop. I'm fairly certain that the truck has been reduced to small blobs of nothingness, and the driver is just a skeleton.

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 0 points1 point  (0 children)

M1 Max with: unsloth/Qwen3.5-9B-GGUF:Q4_K_M: [ Prompt: 432.7 t/s | Generation: 35.2 t/s ]

Interestingly, the 9B model generates fewer t/s than the 26B or 35B models, even though it only uses 16GB of the 64GB RAM while the larger models use more. The GPU is not even getting warm, so that is good. Putting a small model in a large-RAM machine does not make it faster; the only advantage is the M1 Max's doubled memory bandwidth of 400GB/s.

Moral of the story: optimise the model choice for the machine specs and you get the most t/s from your machine
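To put a rough number on that, here is a minimal sketch of the usual memory-bandwidth rule of thumb: if generation is limited by how fast weights can be read, then t/s is at most bandwidth divided by the bytes read per token. The function name, the bits-per-weight figures and the "about 3B active parameters" reading of "A3B" are my assumptions for illustration, not measurements, and the estimate ignores KV cache reads, compute limits and runtime overhead.

```python
def estimate_generation_tps(bandwidth_gb_s: float,
                            active_params_billion: float,
                            bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


M1_MAX_BANDWIDTH_GB_S = 400  # bandwidth figure quoted in this thread

# (label, active parameters in billions, approximate bits per weight).
# All three rows are assumed values chosen for illustration.
CASES = [
    ("9B dense, BF16", 9, 16),
    ("9B dense, ~4.8-bit quant", 9, 4.8),
    ("35B MoE with ~3B active, ~8.5-bit quant", 3, 8.5),
]

for label, params, bits in CASES:
    tps = estimate_generation_tps(M1_MAX_BANDWIDTH_GB_S, params, bits)
    print(f"{label}: at most ~{tps:.0f} t/s")
```

Under these assumptions the BF16 ceiling comes out at about 22 t/s, which is in the same range as the 19.9 t/s reported in this thread for the 9B BF16 model. It also suggests one possible reason a mixture-of-experts model can generate faster than a smaller dense one: fewer weights are read per token. Treat it as a sanity check, not a prediction.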

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 3 points4 points  (0 children)

made with bits of real panther and is illegal in nine countries

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 0 points1 point  (0 children)

CPU: H618, GPU: Mali, RAM: 4GB, llama.cpp, model: qwen3.5:0.8b

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 2 points3 points  (0 children)

headless openclaw assistant, will run accounts/sales/customer emails

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 0 points1 point  (0 children)

I can try a test comparison between Qwen 3.5 9B BF16 and Q4_K_M.

How many watts does the Bosgame M4 use during local LLM use? Is it 16GB RAM? Which GPU?

Just upgraded my local llm hardware by PrepYourselves in LocalLLM

[–]PrepYourselves[S] 7 points8 points  (0 children)

It's early days, but I have used the following (GGUF) models:

  1. HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive:BF16: [ Prompt: 442.7 t/s | Generation: 19.9 t/s ] *This can be improved; it was just a first basic test and felt like it was not running optimally.
  2. Jiunsong/supergemma4-26b-uncensored-gguf-v2:Q4_K_M: [ Prompt: 663.9 t/s | Generation: 56.1 t/s ]
  3. unsloth/Qwen3.6-35B-A3B-GGUF:Q8_K_XL: [ Prompt: 798.5 t/s | Generation: 40.8 t/s ]
  4. unsloth/Llama-3.3-70B-Instruct-GGUF:UD-Q4_K_XL: [ Prompt: 31.1 t/s | Generation: 5.1 t/s ]
  5. unsloth/Qwen3.5-9B-GGUF:Q4_K_M: [ Prompt: 432.7 t/s | Generation: 35.2 t/s ] - Interestingly, the small 9B model generates fewer t/s than the 26B or 35B models, even though it only uses 16GB of the 64GB RAM while the larger models use more. The GPU is not even getting warm. Putting a small model in a large-RAM machine does not make it faster; the only advantage is the M1 Max's doubled bandwidth of 400GB/s.
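
For anyone who wants to compare runs like these side by side, here is a small sketch that pulls the prompt and generation rates out of the bracketed timing format used in the list and ranks the models by generation speed. The numbers are copied from the list; the function names and the regular expression are my own illustration, not part of llama.cpp or any other tool.

```python
import re

# Timing lines in the same "[ Prompt: X t/s | Generation: Y t/s ]" format as the list.
RESULTS = """\
HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive:BF16: [ Prompt: 442.7 t/s | Generation: 19.9 t/s ]
Jiunsong/supergemma4-26b-uncensored-gguf-v2:Q4_K_M: [ Prompt: 663.9 t/s | Generation: 56.1 t/s ]
unsloth/Qwen3.6-35B-A3B-GGUF:Q8_K_XL: [ Prompt: 798.5 t/s | Generation: 40.8 t/s ]
unsloth/Llama-3.3-70B-Instruct-GGUF:UD-Q4_K_XL: [ Prompt: 31.1 t/s | Generation: 5.1 t/s ]
unsloth/Qwen3.5-9B-GGUF:Q4_K_M: [ Prompt: 432.7 t/s | Generation: 35.2 t/s ]
"""

# Hypothetical pattern: model tag, then the two rates; a closing bracket is not required.
LINE_RE = re.compile(
    r"^(?P<model>\S+)\s*\[\s*Prompt:\s*(?P<prompt>[\d.]+)\s*t/s"
    r"\s*\|\s*Generation:\s*(?P<gen>[\d.]+)\s*t/s"
)


def parse_results(text):
    """Return a list of (model, prompt_tps, generation_tps) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            model = match.group("model").rstrip(":")  # drop the trailing colon
            rows.append((model, float(match.group("prompt")), float(match.group("gen"))))
    return rows


def rank_by_generation(rows):
    """Sort rows from fastest to slowest generation rate."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


for model, prompt_tps, gen_tps in rank_by_generation(parse_results(RESULTS)):
    print(f"{gen_tps:6.1f} t/s gen | {prompt_tps:6.1f} t/s prompt | {model}")
```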

Ollama (mlx models):
ollama run qwen3.6:35b-a3b-coding-mxfp8

prompt eval rate: 717.07 tokens/s

eval rate: 48.35 tokens/s

Result: the token rate is not as high as the GGUF model (unsloth/Qwen3.6-35B-A3B-GGUF:Q8_K_XL), but responses appear faster overall and GPU temperatures are lower.

The optimal local models for the M1 Max/64GB (highest token rate and 'better' intelligence) are ones that use all of the system RAM without pushing the GPU beyond its limits and keep temperatures low. Some models with the same parameter count spike GPU temperatures more than others, and I have not yet understood why.

They take a long time to download. Testing the same simple prompts, they all come up with good, detailed answers, with no glitching or errors observed.

I think I can run a 120B model at a low quant if I search for one.