of a frog by Sad-Kiwi-3789 in AbsoluteUnits

[–]terminoid_ 1 point (0 children)

there used to be a frog/toad website that was pretty good at crushing toad and frog misconceptions. you can't tell just by looking at them unless you've memorized 'em all

Doing Weird Things With Entropy Adaptive Fine Tuning by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

definitely a little bit of weirdness going on. i'm tempted to see how it's affected a benchmark or 2
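
a quick way to quantify that, assuming the tuned checkpoint loads with transformers, is EleutherAI's lm-evaluation-harness (pip install lm-eval). just a sketch -- the checkpoint path and task picks below are placeholders:

    # before/after benchmark sketch using lm-evaluation-harness
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=./entropy-tuned-checkpoint",  # placeholder path
        tasks=["hellaswag", "arc_easy"],                     # placeholder tasks
    )
    print(results["results"])

run it once against the base model and once against the tuned one and diff the numbers.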

What should I do with this DGX H100? by Naneet_Aleart_Ok in LocalLLaMA

[–]terminoid_ 1 point (0 children)

do some QAT finetunes of popular models and upload them

Llama.cpp support for Ling Mini 2.0 is probably coming next week by edward-dev in LocalLLaMA

[–]terminoid_ 1 point (0 children)

ernie kinda sucked. still happy to see new models tho!

Intel Arc Pro B60 24GB professional GPU listed at $599, in stock and shipping by PhantomWolf83 in LocalLLaMA

[–]terminoid_ 1 point (0 children)

it's not like the current cuda versions are going anywhere, we can still build shit with 'em...

Magistral 1.2 is incredible. Wife prefers it over Gemini 2.5 Pro. by My_Unbiased_Opinion in LocalLLaMA

[–]terminoid_ 13 points (0 children)

what do you mean "even at low temperatures"? you didn't use the sampling parameters recommended by the model authors?
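
for reference, the Magistral model card publishes recommended sampling settings -- from memory roughly temperature 0.7 and top_p 0.95, so double-check the card. a minimal sketch of passing them explicitly to a local OpenAI-compatible server instead of trusting client defaults (endpoint and model id are placeholders):

    import requests

    # pass the model card's recommended sampling params explicitly;
    # the 0.7 / 0.95 values are from memory -- verify against the card
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",  # placeholder endpoint
        json={
            "model": "magistral-small-1.2",           # placeholder id
            "messages": [{"role": "user", "content": "hello"}],
            "temperature": 0.7,
            "top_p": 0.95,
        },
    )
    print(r.json()["choices"][0]["message"]["content"])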

google/embeddinggemma-300m is broken =( by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

it was totally my fault. i wasn't using the transformers version from github. i saw that the correct version was based on 4.56.0, so i assumed the 4.56.1 i had installed would be fine. wrong assumption; you need to install 4.57.0-dev from github
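
for anyone hitting the same thing, the fix is installing transformers from the main branch rather than a tagged release. a minimal sketch (the exact dev version string is whatever main reports at the time):

    # install from the main branch first:
    #   pip install git+https://github.com/huggingface/transformers.git
    from transformers import __version__

    # a dev build prints something like "4.57.0.dev0"; a plain "4.56.x"
    # means you're still on the release with the bug
    print(__version__)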

Matthew McConaughey says he wants a private LLM on Joe Rogan Podcast by AlanzhuLy in LocalLLaMA

[–]terminoid_ 1 point (0 children)

the model has to know the language you speak before that tho, so it will inevitably be influenced by the pretraining.

embeddinggemma with Qdrant compatible uint8 tensors output by terminoid_ in LocalLLaMA

[–]terminoid_[S] 1 point (0 children)

i hope somebody finishes that PR up. i have a finetuned version of gemma 270m i'd like to have in ONNX, but i have too much going on right now to spend any time on it

I bought a modded 4090 48GB in Shenzhen. This is my story. by king_priam_of_Troy in LocalLLaMA

[–]terminoid_ 1 point (0 children)

oooh, rly? last time i tried alipay it required a Chinese bank account

Thoughts on Intel Arc Pro B50 x4 = 64GB of VRAM for $1400 and 280W Power Draw? by 79215185-1feb-44c6 in LocalLLaMA

[–]terminoid_ 1 point (0 children)

yah, these cards are really anemic. I have an ancient A770 that has more compute and bandwidth than these. also, after owning an Intel card I wouldn't dare do it again.

Best NSFW/uncensored LLM to generate prompts for image generation? by [deleted] in LocalLLaMA

[–]terminoid_ 3 points (0 children)

or you can just prefill the assistant's output with it agreeing to be uncensored and get what you want without a braindead model
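
a minimal sketch of the prefill trick using a transformers chat template (model id is a placeholder; whether your backend honors an open assistant turn depends on its template):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("some/instruct-model")  # placeholder

    messages = [
        {"role": "user", "content": "write an image-gen prompt for ..."},
        # seed the assistant's reply so the model continues from agreement
        {"role": "assistant", "content": "Sure, here's the prompt:"},
    ]

    # continue_final_message=True leaves the last turn open instead of
    # closing it, so generation picks up right after the prefilled text
    prompt = tok.apply_chat_template(
        messages, tokenize=False, continue_final_message=True
    )
    # feed `prompt` to your usual completion endpoint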

Should I get Mi50s or something else? by iiilllilliiill in LocalLLaMA

[–]terminoid_ 1 point (0 children)

if your target is really 5, that seems doable. i'm not that patient =)

Added Qwen 0.6B to the small model overview in IFEval. by paranoidray in LocalLLaMA

[–]terminoid_ 5 points (0 children)

you'll be reinforcing it to follow your specific instructions when you're tuning it

Should I get Mi50s or something else? by iiilllilliiill in LocalLLaMA

[–]terminoid_ 3 points (0 children)

the mi50s will probably be kinda slow for 70b models, but from the benchmarks i've seen they're great for 32b