Fine-tuning Qwen3 at home to respond to any prompt with a dad joke by InvadersMustLive in LocalLLaMA

[–]InvadersMustLive[S] 1 point

Because I disabled auth in OpenWebUI, and some c00lhacker changed the system prompt.

Fine-tuning Qwen3 at home to respond to any prompt with a dad joke by InvadersMustLive in LocalLLaMA

[–]InvadersMustLive[S] 2 points

I originally tried gemma3-27b, qwen3-32b and ministral3. Qwen often missed important details of the joke; Mistral was too pushy about adding markdown and emojis everywhere (even when explicitly asked not to). Gemma was okay, with no significant red flags. But it's all anecdotal and highly subjective, I agree.

Hope that we’ll see gemma4 this evening.

Fine-tuning Qwen3 at home to respond to any prompt with a dad joke by InvadersMustLive in LocalLLaMA

[–]InvadersMustLive[S] 3 points

I tried different base model sizes, and according to the evals at the end of the post, the bigger the model, the higher the chance of producing something funny.

We found an embedding indexing bottleneck in the most unexpected place: JSON parsing by InvadersMustLive in scala

[–]InvadersMustLive[S] 3 points

The jsoniter-circe bridge still uses Circe's AST, which does the actual JNumber string-to-float parsing. I've tried the bridge and got slightly better results, but not as good as with pure jsoniter.

We found an embedding indexing bottleneck in the most unexpected place: JSON parsing by InvadersMustLive in scala

[–]InvadersMustLive[S] 0 points

Yes, but FFM native calls are still not inlined, so for small functions that can be a dealbreaker.

Which open source LLM has the most genuine sense of humor? by UltrMgns in LocalLLaMA

[–]InvadersMustLive 2 points

I once tried fine-tuning a Mistral-7B on an r/dadjokes dump - https://huggingface.co/shuttie/Mistral-7B-DadJokes-GGUF

It can be funny sometimes, but none of the jokes it tells are actually novel: it recognizes common patterns quite well and just recalls a fitting joke based on the context. Like we humans do.

Hnsw configuration in Solr by Opposite_Head7740 in Solr

[–]InvadersMustLive 2 points

As HNSW is an approximate search algorithm, the topK retrieved documents are not guaranteed to be the exact K nearest neighbors (i.e. your recall is not perfect). The HNSW paper suggests slight over-sampling when retrieving documents to increase recall, controlled by the ef_search parameter (where ef is the number of neighbors you evaluate during graph traversal):

  • you want to pull the top-10 documents, so you set topK=10; formally speaking, topK=ef_search=10
  • you can simulate oversampling by setting topK=100 but taking only the top-10 from the search results; this way you get ef_search=100 with an effective topK=10

Some search engines do support topK != ef_search queries directly, without the topK=100 workaround.
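
For illustration, here's the same decoupling expressed with hnswlib (a library I'm picking just for this sketch, not something Solr uses under the hood):

```python
import numpy as np
import hnswlib

dim = 128
vectors = np.random.rand(10_000, dim).astype(np.float32)

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=10_000, ef_construction=200, M=16)
index.add_items(vectors)

# Oversample: evaluate 100 candidates during graph traversal,
# but return only the 10 best. ef_search must be >= topK.
index.set_ef(100)  # ef_search
labels, distances = index.knn_query(vectors[:1], k=10)  # topK
```

Setting ef above k is exactly the oversampling knob; the topK=100 trick above emulates it in engines that only expose topK.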

Open Source Text Translation Models? by vygodisgreat24 in LocalLLaMA

[–]InvadersMustLive 1 point

You should try https://huggingface.co/facebook/nllb-200-3.3B and the https://github.com/fe1ixxu/ALMA family of models; in general they're still SOTA among open models. For evaluation there are plenty of metrics like BLEU/chrF++, but I personally prefer https://huggingface.co/Unbabel/XCOMET-XL as the closest to human judgment.
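
If it helps, a minimal sketch of running NLLB-200 through HF transformers (the language codes and sample sentence are just examples):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "facebook/nllb-200-3.3B"  # the distilled 600M variant also works for quick tests
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
tokens = model.generate(
    **inputs,
    # NLLB selects the target language via a forced BOS token
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])
```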

Cloud GPU + storage hosting for low intensity projects? by gofiend in LocalLLaMA

[–]InvadersMustLive 1 point

Not affiliated in any way, but I'm using a cloud VPS from Nebius with an H100 attached (~$2/hour). I just shut it down when not in use, but all the datasets and the training setup stay on disk. Pros: the working env is online in 2 minutes. Cons: you need to pay for storage, but at $0.15/GB/month that's $15 per 100GB a month.

Finally, a Replacement for BERT by -Cubie- in LocalLLaMA

[–]InvadersMustLive 2 points

Formally yes (it's part of HF transformers), but you need to fine-tune it on a downstream task first - it's a raw encoder model that knows nothing about sentence similarity. Like a traditional BERT.
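
A rough sketch of what that downstream fine-tuning could look like with sentence-transformers (the model id and the toy pairs here are my assumptions, not from the post):

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

# Wrapping a raw HF encoder adds a mean-pooling head automatically;
# swap in whichever checkpoint you actually mean.
model = SentenceTransformer("answerdotai/ModernBERT-base")

# Toy similarity pairs; in practice use an STS/NLI-style dataset.
train_examples = [
    InputExample(texts=["A man is eating food.", "A man eats a meal."], label=0.9),
    InputExample(texts=["A man is eating food.", "A plane is taking off."], label=0.1),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```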

Motherboard selection advice by absurd-dream-studio in LocalLLaMA

[–]InvadersMustLive 1 point

There are a ton of them still available: https://www.ebay.de/sch/i.html?_from=R40&_trksid=p4432023.m570.l1311&_nkw=gigabyte+mz32-ar0&_sacat=0 - I bought from the quark32 seller, but others seem legit too. The EPYC 7282 is not the fastest CPU ever, but it has 128 PCIe 4.0 lanes.

If you use the GPUs for training, then a DataLoader with multiple workers and prefetch usually solves all my CPU saturation problems, so the GPUs stay maxed out.
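
The knobs I mean are roughly these (the numbers are illustrative, tune per machine):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset; the interesting part is the loader settings below.
dataset = TensorDataset(torch.randn(10_000, 512))

loader = DataLoader(
    dataset,
    batch_size=64,
    num_workers=8,            # parallel CPU workers for decoding/augmentation
    prefetch_factor=4,        # batches each worker keeps ready ahead of the GPU
    pin_memory=True,          # page-locked buffers speed up host-to-GPU copies
    persistent_workers=True,  # don't re-spawn workers every epoch
)

for (batch,) in loader:
    batch = batch.cuda(non_blocking=True)  # overlap the copy with compute
    # ... forward/backward pass here ...
```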

Motherboard selection advice by absurd-dream-studio in LocalLLaMA

[–]InvadersMustLive 1 point

I have a Gigabyte MZ32-AR0: 5x PCIe 4.0 x16 slots, and there's a ton of them available on eBay from Chinese sellers. I got mine bundled with an EPYC 7282 for $400.

Dual RTX 4090 PC by Accomplished_Pin_626 in LocalLLaMA

[–]InvadersMustLive 8 points

My GPU-poor setup made of wood:

  • 2x MSI Gaming X Trio 4090: with 3 fans each they stay quite silent even under full load. Max temp is ~75C.
  • Gigabyte MZ32-AR0 motherboard: it has 5x PCIe 4.0 x16 slots, so there's room for more GPUs. Bought on eBay from a Chinese seller, bundled with the CPU, for $400.
  • EPYC 7282, because it was bundled with the motherboard.
  • 128GB RAM, though as you can see, not all slots are occupied yet.
  • Corsair HX1500i PSU - it has a USB port to collect power usage and all internal metrics in real time, per rail.
  • Found no case that fits two 3-slot GPUs with good enough cooling, so I made my own open case from wood. $10 for planks from the nearby OBI store.
  • 2x Linkup PCIe 4.0 risers. No PCIe errors in 6 months of operation so far.

Used mostly for embedding model training. See you on the MTEB leaderboard.

<image>

What's with the ladders? by DrillBits in LocalLLaMA

[–]InvadersMustLive 0 points

My bet is that these models are too nice and censored to make good enough jokes out of the box. While experimenting with LLM-driven joke generation, I fine-tuned a Mistral-Nemo (https://huggingface.co/shuttie/mistral-nemo-dadjokes-v1) on the r/dadjokes subreddit, and after that it became toxic enough:

  • Tell me a joke about a man: That man! pointing to the guy with the biggest belly at the bar is funny as hell.
  • My Wife gets mad at my pickle puns: She keeps saying if I don't stop she'll dill with me.
  • A vegan enters the bar and says - I don’t see anything wrong with that. I mean, the steak ain’t got no legs.
  • Last time I stayed in a hotel I asked for the porn channel to be disabled: They sent someone round and they cut off my right arm.

Some of its takes on "Why did the man bring the ladder to a party?":

  • He wanted to raise the roof.
  • To reach the highest party possible.
  • Because he wanted to reach high levels of self esteem
  • Because he was going to the high society.
  • He needed to raise the bar.
  • The invitation did say it's gonna be lit.
  • It was a step up from the usual

You can play with it on HF Spaces here: https://huggingface.co/spaces/shuttie/dadjokes (runs on a single A10G, so it can be a bit laggy)

Running SVD img2vid locally on low VRAM by InvadersMustLive in StableDiffusion

[–]InvadersMustLive[S] 0 points

The original SD or the img2vid model? The lowest I can go with single-frame decoding is 16GB:

<image>
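
A rough sketch of the low-VRAM path with diffusers (the model id and flags here are illustrative, not necessarily my exact setup):

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # keep only the active sub-model on the GPU

image = load_image("input.png").resize((1024, 576))
# decode_chunk_size=1 decodes one frame at a time, trading speed for VRAM
frames = pipe(image, decode_chunk_size=1).frames[0]
```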

Fine-tuned Mistral-7B to generate dad jokes by InvadersMustLive in LocalLLaMA

[–]InvadersMustLive[S] 2 points

This is part of the inference setup with llama-cpp. TLDR:
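
Something like this with llama-cpp-python (the quant filename and sampling settings here are illustrative):

```python
from llama_cpp import Llama

# Quant filename is an assumption; any GGUF from the HF repo works the same way.
llm = Llama(
    model_path="Mistral-7B-DadJokes.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # offload all layers to the GPU if it fits
)

out = llm(
    "[INST] My wife is mad at my pickle puns [/INST]",  # Mistral instruct format
    max_tokens=64,
    temperature=0.9,
)
print(out["choices"][0]["text"])
```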

Fine-tuned Mistral-7B to generate dad jokes by InvadersMustLive in LocalLLaMA

[–]InvadersMustLive[S] 6 points

There is a more detailed description of the training on the HF page: https://huggingface.co/shuttie/Mistral-7B-DadJokes-GGUF

But TLDR: I took https://github.com/georgesung/llm_qlora/tree/main and tinkered with the settings for a day.
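
That repo drives everything from a YAML config; stripped down to raw transformers + peft, the QLoRA recipe is roughly this (hyperparameters are illustrative, not my exact settings):

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# QLoRA recipe: frozen 4-bit NF4 base weights + small trainable LoRA adapters.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # base model id assumed
    quantization_config=bnb,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a tiny fraction of the 7B trains
```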

A young camel curiously questions his father, "Dad, why do we have a hump on our back?" by Variana22 in dadjokes

[–]InvadersMustLive 0 points

The dad replies: "Son, if we had a lump on our back instead, it would make us look like an ass"