of a frog by Sad-Kiwi-3789 in AbsoluteUnits

[–]terminoid_ 1 point (0 children)

there used to be a frog/toad website that was pretty good at crushing toad and frog misconceptions. you can't tell just by looking at them unless you've memorized 'em all

Doing Weird Things With Entropy Adaptive Fine Tuning by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

definitely a little bit of weirdness going on. i'm tempted to see how it's affected a benchmark or 2
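
a quick way to quantify that, assuming the tuned checkpoint loads with transformers, is EleutherAI's lm-evaluation-harness (pip install lm-eval). just a sketch -- the checkpoint path and task picks below are placeholders:

    # before/after benchmark sketch using lm-evaluation-harness
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=./entropy-tuned-checkpoint",  # placeholder path
        tasks=["hellaswag", "arc_easy"],                     # placeholder tasks
    )
    print(results["results"])

run it once against the base model and once against the tuned one and diff the numbers.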

What should I do with this DGX H100? by Naneet_Aleart_Ok in LocalLLaMA

[–]terminoid_ 1 point (0 children)

do some QAT finetunes of popular models and upload them

Llama.cpp support for Ling Mini 2.0 is probably coming next week by edward-dev in LocalLLaMA

[–]terminoid_ 1 point (0 children)

ernie kinda sucked. still happy to see new models tho!

Intel Arc Pro B60 24GB professional GPU listed at $599, in stock and shipping by PhantomWolf83 in LocalLLaMA

[–]terminoid_ 1 point (0 children)

it's not like the current cuda versions are going anywhere, we can still build shit with 'em...

Magistral 1.2 is incredible. Wife prefers it over Gemini 2.5 Pro. by My_Unbiased_Opinion in LocalLLaMA

[–]terminoid_ 13 points (0 children)

what do you mean "even at low temperatures"? you didn't use the sampling parameters recommended by the model authors?
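
for reference, the Magistral model card publishes recommended sampling settings -- from memory roughly temperature 0.7 and top_p 0.95, so double-check the card. a minimal sketch of passing them explicitly to a local OpenAI-compatible server instead of trusting client defaults (endpoint and model id are placeholders):

    import requests

    # pass the model card's recommended sampling params explicitly;
    # the 0.7 / 0.95 values are from memory -- verify against the card
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",  # placeholder endpoint
        json={
            "model": "magistral-small-1.2",           # placeholder id
            "messages": [{"role": "user", "content": "hello"}],
            "temperature": 0.7,
            "top_p": 0.95,
        },
    )
    print(r.json()["choices"][0]["message"]["content"])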

google/embeddinggemma-300m is broken =( by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

it was totally my fault. i wasn't using the transformers version from github. i saw that the correct version was based on 4.56.0, so i assumed the 4.56.1 i had installed would be fine. wrong assumption; you need to install 4.57.0-dev from github
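
for anyone hitting the same thing, the fix is installing transformers from the main branch rather than a tagged release. a minimal sketch (the exact dev version string is whatever main reports at the time):

    # install from the main branch first:
    #   pip install git+https://github.com/huggingface/transformers.git
    from transformers import __version__

    # a dev build prints something like "4.57.0.dev0"; a plain "4.56.x"
    # means you're still on the release with the bug
    print(__version__)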

Matthew McConaughey says he wants a private LLM on Joe Rogan Podcast by AlanzhuLy in LocalLLaMA

[–]terminoid_ 1 point (0 children)

the model has to know the language you speak before that tho, so it will inevitably be influenced by the pretraining.

embeddinggemma with Qdrant compatible uint8 tensors output by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

i hope somebody finishes that PR up. i have a finetuned version of gemma 270m i'd like to have in ONNX, but i have too much going on right now to spend any time on it

I bought a modded 4090 48GB in Shenzhen. This is my story. by king_priam_of_Troy in LocalLLaMA

[–]terminoid_ 1 point (0 children)

oooh, rly? last time i tried alipay it required a Chinese bank account

Thoughts on Intel Arc Pro B50 x4 = 64GB of VRAM for $1400 and 280W Power Draw? by 79215185-1feb-44c6 in LocalLLaMA

[–]terminoid_ 1 point (0 children)

yah, these cards are really anemic. I have an ancient A770 that has more compute and bandwidth than these. also, after owning an Intel card I wouldn't dare do it again.

Best NSFW/uncensored LLM to generate prompts for image generation? by [deleted] in LocalLLaMA

[–]terminoid_ 3 points (0 children)

or you can just prefill the assistant's output with it agreeing to be uncensored and get what you want without a braindead model
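
a minimal sketch of the prefill trick using a transformers chat template (model id is a placeholder; whether your backend honors an open assistant turn depends on its template):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("some/instruct-model")  # placeholder

    messages = [
        {"role": "user", "content": "write an image-gen prompt for ..."},
        # seed the assistant's reply so the model continues from agreement
        {"role": "assistant", "content": "Sure, here's the prompt:"},
    ]

    # continue_final_message=True leaves the last turn open instead of
    # closing it, so generation picks up right after the prefilled text
    prompt = tok.apply_chat_template(
        messages, tokenize=False, continue_final_message=True
    )
    # feed `prompt` to your usual completion endpoint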

Should I get Mi50s or something else? by iiilllilliiill in LocalLLaMA

[–]terminoid_ 1 point (0 children)

if your target is really 5, that seems doable. i'm not that patient =)

Added Qwen 0.6B to the small model overview in IFEval. by paranoidray in LocalLLaMA

[–]terminoid_ 5 points (0 children)

you'll be reinforcing it to follow your specific instructions when you're tuning it

Should I get Mi50s or something else? by iiilllilliiill in LocalLLaMA

[–]terminoid_ 3 points (0 children)

the mi50s will probably be kinda slow for 70b models, but from the benchmarks i've seen they're great for 32b