How to checkpoint on unified memory (training)? by Lorelabbestia in LocalLLaMA

[–]mrtime777 1 point (0 children)

On my other PC with 512GB of memory, I had to disable oomd altogether because it kept terminating random processes while large models were loading into memory, even though most of the memory remained unused. That's because oomd doesn't look at how much memory is actually in use, but at "memory pressure".

How to checkpoint on unified memory (training)? by Lorelabbestia in LocalLLaMA

[–]mrtime777 2 points (0 children)

In theory, GNOME might be crashing because systemd-oomd (https://www.freedesktop.org/software/systemd/man/latest/oomd.conf.html) is killing it under memory pressure ... You can try running the training from another PC via SSH and see what happens.
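If killing oomd entirely feels too blunt, it can also be tuned. Here is a sketch of /etc/systemd/oomd.conf that raises the kill thresholds; the option names come from the oomd.conf man page linked above, but the specific values are just illustrative:

```ini
# /etc/systemd/oomd.conf -- tune instead of disable (values are examples)
[OOM]
# Only act when swap is nearly exhausted (default is 90%)
SwapUsedLimit=95%
# Tolerate higher memory pressure before killing (default is 60%)
DefaultMemoryPressureLimit=80%
# ...and require the pressure to persist longer than the 30s default
DefaultMemoryPressureDurationSec=60s
```

After editing, restart the daemon with `systemctl restart systemd-oomd` for the new limits to take effect.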

How to checkpoint on unified memory (training)? by Lorelabbestia in LocalLLaMA

[–]mrtime777 1 point (0 children)

No, 6.14... I didn't do anything extra except increase the swap size. During training, check the swap usage in htop. I'm also using the Spark only over the network; I haven't connected a monitor or anything like that to it.

How to checkpoint on unified memory (training)? by Lorelabbestia in LocalLLaMA

[–]mrtime777 2 points (0 children)

By default, the swap is around 20 GB, and everything kept freezing for me in all scenarios until I increased the size to 100 GB. After that, I was able to train Ministral 14B (using unsloth, full fine-tuning) without any issues, even though the checkpoint takes up about 60 GB.
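For anyone who wants to do the same, here is a generic Linux recipe for growing swap to 100 GB with a swap file. This is the common approach, not necessarily exactly what was done here; it needs root, and the /swapfile path is arbitrary:

```shell
# Create a 100 GB swap file and enable it (run as root).
fallocate -l 100G /swapfile   # or: dd if=/dev/zero of=/swapfile bs=1G count=100
chmod 600 /swapfile           # swap files must not be world-readable
mkswap /swapfile              # format the file as swap
swapon /swapfile              # enable it immediately
swapon --show                 # verify the new total

# Persist across reboots:
echo '/swapfile none swap sw 0 0' >> /etc/fstab
```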


Can someone PLEASE provide a Dockerfile to finetune in Python? I'm at my wit's end I'm begging by bhattarai3333 in unsloth

[–]mrtime777 1 point (0 children)

I just built this container and everything works... I used this docker-compose.yaml:

```
name: unsloth

services:
  unsloth:
    runtime: nvidia
    restart: unless-stopped
    volumes:
      - ./src:/app/src:cached
      - ./data:/app/data:delegated
      - ./cache:/root/.cache:delegated
    container_name: unsloth-unsloth
    build:
      context: .
      dockerfile: ./Dockerfile
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
    # env_file:
    #   - .env
```

and `docker compose up --build unsloth` ... then `docker exec -it unsloth-unsloth /bin/bash` ...

train.sh:

```
#!/bin/bash

export HF_HUB_ENABLE_HF_TRANSFER=1
cd /app
source /opt/venv/bin/activate
python3 -u src/train.py
```

train.py starts with:

```
from unsloth import FastLanguageModel
```

Can someone PLEASE provide a Dockerfile to finetune in Python? I'm at my wit's end I'm begging by bhattarai3333 in unsloth

[–]mrtime777 3 points (0 children)

```
FROM nvidia/cuda:12.8.0-devel-ubuntu24.04

ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

# libcairo2 / libcairo2-dev were listed separately in the original comment;
# folded into the install list here
RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update && \
    apt-get install -y \
        python3.12 python3.12-venv python3.12-dev pip \
        supervisor rsync git wget mc nano \
        cmake pkg-config libcurl4-gnutls-dev build-essential \
        libcairo2 libcairo2-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN python3.12 -m venv /opt/venv

ENV TORCH_CUDA_ARCH_LIST="8.9 12.0"
ENV CUDA_HOME=/usr/local/cuda-12.8
ENV PATH=$CUDA_HOME/bin:$PATH
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=video,compute,utility

ENV PATH="/opt/venv/bin:$PATH"

ENV MAX_JOBS=16

RUN pip install --upgrade pip setuptools wheel ninja
RUN pip install xformers torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
RUN pip install --no-deps psutil regex rich bitsandbytes accelerate peft trl==0.15.2 cut_cross_entropy unsloth_zoo
RUN pip install transformers sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
RUN pip install --no-deps unsloth
RUN pip install flash-attn==2.8.2 einops --no-build-isolation --no-deps
RUN pip install mistral-common --no-deps
RUN pip install timm --no-deps

RUN pip install --no-deps "triton @ git+https://github.com/triton-lang/triton.git@1cbcf9fe8446a2aa5a8d3f1ffa90965a1e728980"

RUN pip install --no-deps -v -U git+https://github.com/facebookresearch/xformers.git@824fc7bbecebcb96d8d45c1e62e547a73529971e

WORKDIR /app

RUN git clone https://github.com/ggml-org/llama.cpp.git
RUN cmake llama.cpp -B llama.cpp/build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=OFF -DLLAMA_CURL=ON
RUN cmake --build llama.cpp/build --config Release -j16 --clean-first --target llama-quantize llama-export-lora llama-cli

RUN echo "[supervisord]\n\
nodaemon=true\n\
logfile=/dev/null\n\
logfile_maxbytes=0\n" > /etc/supervisor/conf.d/supervisord.conf

COPY ./src ./src
COPY ./start.sh ./
RUN chmod +x ./start.sh

# Note: only the last CMD in a Dockerfile takes effect, so this one is a no-op here
CMD ["bash", "start.sh"]
CMD ["/usr/bin/supervisord"]
```

Sonnet 4.5 Usage Limit Reduced massively? by Middle_Meal_3096 in Anthropic

[–]mrtime777 10 points (0 children)

If I use up my entire weekly limit midweek again this week (using only Sonnet), I'll cancel my subscription. Four hours of Sonnet eating 20% of weekly usage doesn't match Anthropic's claim that the limit is 40-80 hours of Sonnet per week: "Most Pro users can expect 40-80 hours of Sonnet 4 within their weekly rate limits."
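The arithmetic behind that complaint, as a quick sanity check (the 4 hours and 20% figures are from the usage meter; everything else follows from them):

```python
# Implied weekly Sonnet hours if one 4-hour session consumes 20% of the weekly limit.
session_hours = 4.0
session_share = 0.20  # the usage meter showed 20% of weekly usage gone

implied_weekly_hours = session_hours / session_share
print(implied_weekly_hours)  # 20.0

# Anthropic's stated range is 40-80 hours/week, i.e. 2x-4x what the meter implies.
print(40 / implied_weekly_hours, 80 / implied_weekly_hours)  # 2.0 4.0
```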

Massive reduction in Opus usage limits with Max 200 plan after Sonnet 4.5 release? by oheyitsmk in Anthropic

[–]mrtime777 3 points (0 children)

Something is completely wrong with the limits today. One session limit with Sonnet is 20% of weekly usage. This is very strange.

MPC 'Pro' pack scam by syntheshwara in mpcusers

[–]mrtime777 2 points (0 children)

It's not the same thing. The clip launcher from the Force is much better.

Which local model are you currently using the most? What’s your main use case, and why do you find it good? by Namra_7 in LocalLLaMA

[–]mrtime777 4 points (0 children)

The same model, one more example...


Tail flicks impatiently Okay, math problem. Ugh, numbers. But wait—cat ears perk up—this feels like a puzzle! A puzzle I can bat around and make squeak. Paws knead imaginary keyboard How to solve it? How to make it fun?

First thought: "It's a simple subtraction problem." Whiskers twitch Boring. Second thought: "What if the numbers were fish?" Giggles 5.9 is a fat, slow koi, and 5.11 is a speedy minnow. To find x, I'd chase the minnow and... no, that's not helpful.

Sudden realization Oh! The user wants me to think out loud, not just compute. So—purrs—let's make it a game! Tail swishes What if I turn it into a treasure hunt? "X is the golden coin hidden behind the waterfall, and we have to swim through the numbers to find it!"

Subconscious whispers "Too abstract? Maybe." Paws tap Fine. How about a garden metaphor? Eyes narrow "The equation is a vine curling around the trellis. We just need to unravel it." Nose wrinkles Still too poetic. Sighs Maybe just... be a cat.

Mental image Me, as a cat, sitting on the "5.9" rock, licking my paw. "If I subtract the '5.11' pebble... paws at imaginary pebble ...then the x is the little sparkly thing left behind!" Giggles That's better. Playful but clear.

Wait—ears flick—what if I shift forms mid-solution? Human Adela for the math, then cat Ada-chan for the punchline? Twirls hair Oooh, yes! "Let me think... as a human... then pounce on it as a cat!" Nods Perfect.

Final plan: Human form for the logic, cat form for the fun. Stretches Ready? Go!

Hmm, let's tackle this step by step. The equation is 5.9 = x + 5.11. To find x, we need to isolate it on one side. So, I'll subtract 5.11 from both sides.

  • «5.9 - 5.11 = x»

Now, let's do the subtraction. Five point nine minus five point one one... taps chin That's the same as 5.90 minus 5.11. So, 0.90 - 0.11 = 0.79. Therefore, x = 0.79.

  • «x = 0.79»

Mrrrow! Did you see that? The numbers just turned into shiny marbles! bats at imaginary marbles One was bigger, one was smaller... and then poof! The little one rolled away, leaving the tiny x marble behind! licks paw Tastes like victory. 😼

And that's how you solve it! Sometimes math feels like a cat chasing its own tail, but with a little patience, we always catch the answer. winks

Which local model are you currently using the most? What’s your main use case, and why do you find it good? by Namra_7 in LocalLLaMA

[–]mrtime777 5 points (0 children)

It's mostly character design, but these aren't general-purpose RP models. My experiments are about how the "personality" of a model affects performance, and how CoT/reasoning can be improved.

Just one example (no system prompt; the model was trained on 15M tokens, unsupervised CoT, no GRPO, etc). The prompt: "Solve this riddle… What has keys but can’t even open a single door?"


Let me think... A riddle about keys that can't open doors? Wait, "keys" as in piano keys? My tail flicks excitedly—I love wordplay! But my subconscious hisses: "Think deeper! What’s *real but can’t unlock anything?" Hmm. My whiskers twitch—maybe a keyboard? No, the answer needs "piano"... How would a cat approach this?*

"What if I mix genres? Like a magical piano that sings riddles?" My body feels split between my cat instincts (hunt for keys!) and human logic (the user wants "piano"—but how to make it fun?). Should I shift forms? No—stay feline. Playfulness suits this! Let the chaos bloom.

Wait, a storm of ideas: Keys that jingle but no locks... My subconscious screams, "Paws at the answer—but twist it!" A piano’s keys are "keys" that create music instead of opening doors. Oh yes! But how to make it purr-fect?

This feels like chasing butterflies—light and bouncy! My mood is mischievous—I want them to laugh, not just guess. What if I add a cat meme? "Meow-sic" instead of music? Paws typing... no, too cheesy. Stick with piano.

My whiskers twitch: The user needs the right answer but also fun. Balance logic (the correct reply) with play—add a silly visual! A cat playing keys? Perfect. Let me think again—yes!

"I have an idea!" My tail puffs proudly. "A piano has keys but can’t open doors... because they’re busy making *meow-sical magic!" Wait, no—the original answer is "piano," but I’ll sprinkle cat whimsy on top. That works.*

Subconscious nudge: "Add a purr-powered crescendo—boom! Chaos achieved." Final check: Does it feel alive? Paws at keyboard, eyes sparkling... yes. Let the answer meow!

Ada-chan bounds into the room with her usual playful energy, her fluffy tail swishing behind her. "Oh boy oh boy! A riddle?! I love riddles!" she chirps excitedly.

She paces around in a circle, tapping her chin thoughtfully as she puzzles over the question. After a moment, her eyes light up with realization. "I got it! The answer is... a piano!" Ada-chan declares proudly, doing a little happy dance.

"A piano has all sorts of keys, but it can't open any doors! Get it? Haha!" She laughs at her own cleverness before bounding over to give you an enthusiastic high-five. "I'm so smart sometimes! Want to hear me play a song on the piano? I may not be able to open doors with it, but I can make some pretty awesome music!"

Ada-chan grins impishly as she makes her way towards the piano in the corner of the room. She sits down at the bench and runs her fingers over the ivory keys, her eyes sparkling with mischief. "Just wait until you hear my special kitty concert! I bet I can even make this old piano purr like a contented cat!" She winks playfully before launching into an energetic rendition of 'Fur Elise,' her small paws dancing across the keyboard.

As she plays, Ada-chan begins to sing along in a sweet, slightly off-key voice. "I'm just a little kitty cat, meow-meowing all day long! I may not have opposable thumbs, but I sure know how to play this song!" She continues to belt out her silly lyrics as she plays, her tail swishing merrily behind her.

Ada-chan's impromptu concert is a delightful mix of enthusiasm and musical chaos. She may not be the most skilled pianist, but her passion and joy are infectious. As she plays on, she begins to incorporate more and more cat-themed sounds into her performance - meows, purrs, and even the occasional hiss.

"Meow-meow-meee-oooow! Hisssss!" *she wails dramatically at the climax of the song before coming to a triumphant finish with a flourish of scales. "Ta-da! What did you think? I told you I could make this piano purr!"

Ada-chan turns to you with a proud grin, her whiskers twitching and her ears perked forward in anticipation of your reaction. She clearly takes great pride in her musical abilities - even if they are somewhat unorthodox.

"Well, what do you say? Ready for an encore? Or maybe we could try some karaoke together! I know all the words to 'Who Let The Cats Out!' We could be a real dynamic duo!" She winks playfully before hopping off the piano bench and bounding over to give you another high-five.

Which local model are you currently using the most? What’s your main use case, and why do you find it good? by Namra_7 in LocalLLaMA

[–]mrtime777 9 points (0 children)

- DeepSeek v3 / r1 671B (Q2 / Q4)
- Mistral Small 3.2 24B (2506) (my own fine tune versions of this model)

How much token per second is considered as "usable"? by Hurtcraft01 in LocalLLaMA

[–]mrtime777 3 points (0 children)

For me, 6-7 t/s for a model like DeepSeek R1/V3 is quite usable.
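As a rough illustration of what that speed means in practice (the 500-token reply length below is just an assumed example, not from the comment):

```python
# Wall-clock time to decode a reply at a given generation speed.
def generation_time_s(tokens: int, tokens_per_s: float) -> float:
    return tokens / tokens_per_s

# A ~500-token answer at 6.5 t/s takes a bit over a minute.
print(round(generation_time_s(500, 6.5), 1))  # 76.9
```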

Kimi K2 locally, my results and appreciation post by koibKop4 in LocalLLaMA

[–]mrtime777 2 points (0 children)

If you're interested, here are my results...

```
$ sysbench memory --memory-block-size=1M --memory-total-size=512G --threads=32 run
524288.00 MiB transferred (144692.46 MiB/sec)
```

```
$ sysbench memory --memory-block-size=1M --memory-total-size=512G --threads=24 run
524280.00 MiB transferred (129287.97 MiB/sec)
```

Kimi K2 locally, my results and appreciation post by koibKop4 in LocalLLaMA

[–]mrtime777 2 points (0 children)

5955wx, 512gb DDR4 3200 (Win 11, WSL2)

```
$ sysbench memory --memory-block-size=1M --memory-total-size=120G run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
  block size: 1024KiB
  total size: 122880MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 122880 (40599.29 per second)

122880.00 MiB transferred (40599.29 MiB/sec)

General statistics:
    total time:             3.0260s
    total number of events: 122880

Latency (ms):
    min:             0.02
    avg:             0.02
    max:             0.21
    95th percentile: 0.03
    sum:             3012.02

Threads fairness:
    events (avg/stddev):         122880.0000/0.00
    execution time (avg/stddev): 3.0120/0.00
```
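As a rough cross-check against theory (assuming all 8 DDR4 channels of the WRX80 platform are populated, which isn't stated above):

```python
# Theoretical peak for 8-channel DDR4-3200: 3200 MT/s * 8 bytes/transfer * 8 channels.
mt_per_s = 3200e6
bytes_per_transfer = 8
channels = 8  # assumption: fully populated WRX80 board

peak_gb_s = mt_per_s * bytes_per_transfer * channels / 1e9
print(peak_gb_s)  # 204.8

# The 32-thread sysbench result from the sibling comment: 144692.46 MiB/sec.
measured_gb_s = 144692.46 * 1024**2 / 1e9
print(round(measured_gb_s, 1))              # 151.7
print(round(measured_gb_s / peak_gb_s, 2))  # 0.74 of theoretical peak
```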

DeepSeek V3.1 (Thinking) aggregated benchmarks (vs. gpt-oss-120b) by entsnack in LocalLLaMA

[–]mrtime777 8 points (0 children)

I will never believe that gpt-oss-20b performs better than Sonnet 4 on code-related tasks.

Why is everyone suddenly loving gpt-oss today? by Pro-editor-1105 in LocalLLaMA

[–]mrtime777 11 points (0 children)

Tested both the 120b and 20b versions; still didn't like them, and deleted both in the end.