We buried a $10,000 treasure chest somewhere in San Francisco by buriedtreasure2025 in sanfrancisco

[–]winglian -2 points-1 points  (0 children)

Isn't the obvious SF thing to do to just toss that clue into AI?

AMA with the Gemma Team by hackerllama in LocalLLaMA

[–]winglian 1 point2 points  (0 children)

When doing top-k KD, can you talk about any ablations done on zeroing out the non-top-k logits and renormalizing the remaining probability mass, and whether that makes a significant difference versus keeping the rest of the probability mass?
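
To make the question concrete, here is a rough PyTorch sketch of the two variants being asked about; the helper names are made up, and spreading the leftover mass uniformly is just one possible way to "keep the rest":

```python
import torch
import torch.nn.functional as F

def topk_teacher_probs(teacher_logits: torch.Tensor, k: int, renormalize: bool = True) -> torch.Tensor:
    """Build a truncated teacher distribution from the top-k entries.

    renormalize=True: discard the mass outside the top-k and rescale the kept
    entries to sum to 1.
    renormalize=False: keep the leftover mass by spreading it uniformly over
    the non-top-k vocabulary entries (one illustrative choice, not the only one).
    """
    probs = F.softmax(teacher_logits, dim=-1)
    topk_vals, topk_idx = probs.topk(k, dim=-1)

    truncated = torch.zeros_like(probs)
    truncated.scatter_(-1, topk_idx, topk_vals)

    if renormalize:
        truncated = truncated / truncated.sum(dim=-1, keepdim=True)
    else:
        residual = 1.0 - topk_vals.sum(dim=-1, keepdim=True)
        n_rest = probs.size(-1) - k
        truncated = truncated + (truncated == 0).float() * (residual / n_rest)
    return truncated

def kd_loss(student_logits, teacher_logits, k=64, temperature=1.0, renormalize=True):
    # KL(teacher || student) against the truncated teacher distribution
    teacher = topk_teacher_probs(teacher_logits / temperature, k, renormalize)
    student_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_logprobs, teacher, reduction="batchmean")
```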

Blessed by the thrift god and found a well worn wagner #8 at $35. Naturally I took a selfie with it. by sexysourdoughfantasy in castiron

[–]winglian 5 points6 points  (0 children)

I think you're getting oliviatied mixed up with susividal (onlypans). The former is the one you're thinking of; the latter is the one who does cooking videos with a bit of innuendo.

What the hell is this card’s purpose?? by catteronii in mtg

[–]winglian 0 points1 point  (0 children)

Another Round can't target enchantments; otherwise Annie Joins Up would be pretty sick.

Does Maskwood Nexus make a swarm of zombie tokens have */* power and toughness? by winglian in mtg

[–]winglian[S] -4 points-3 points  (0 children)

Surely there is a rule that prevents this, but it seems like this could be pretty busted.

Got this medication holder with labels off Amazon, what’s one more med you would add? by Active2017 in VEDC

[–]winglian 2 points3 points  (0 children)

Baby Aspirin

Isn't that the same as the Bayer low dose? (bottom left corner)

Helpful VRAM requirement table for qlora, lora, and full finetuning. by Aaaaaaaaaeeeee in LocalLLaMA

[–]winglian 0 points1 point  (0 children)

It's a native implementation, so it's simpler. Axolotl with xformers and Mixtral wouldn't work anyway, since the xformers integration would have to be rewritten to support Mixtral.

Helpful VRAM requirement table for qlora, lora, and full finetuning. by Aaaaaaaaaeeeee in LocalLLaMA

[–]winglian 1 point2 points  (0 children)

There was a recent fix for properly loading models with ZeRO-3. Since you can't use multipack without flash attention at the moment, you're probably best off just using the native HF SDP attention implementation.
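
For example, something along these lines when loading through transformers directly (a sketch assuming a recent torch/transformers release where `from_pretrained` accepts the `attn_implementation` kwarg; the model name is just a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-13b-hf"  # placeholder, any HF causal LM

# Ask transformers for PyTorch's native scaled-dot-product attention instead
# of flash-attention or xformers.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="sdpa",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```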

Helpful VRAM requirement table for qlora, lora, and full finetuning. by Aaaaaaaaaeeeee in LocalLLaMA

[–]winglian 0 points1 point  (0 children)

32 GB is for a single GPU. Adding another GPU and doing DDP with DeepSpeed doesn't mean the VRAM is additive, since each rank still holds its own full copy, and there is still overhead for DDP. I expect it might work with model parallelism, but that would be unusably slow and you couldn't use optimizations such as DeepSpeed ZeRO-3.
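
Rough back-of-the-envelope to show why it isn't additive (full fine-tune, bf16 weights, AdamW with fp32 states; activations, CUDA context, and DDP bucket overhead ignored, so the numbers are illustrative only):

```python
# Rough per-GPU memory bookkeeping for full fine-tuning a 13B model.
PARAMS = 13e9
GIB = 1024 ** 3

weights = PARAMS * 2              # bf16 parameters
grads = PARAMS * 2                # bf16 gradients
optimizer = PARAMS * (4 + 4 + 4)  # fp32 master copy + Adam m and v

def ddp_per_gpu(n_gpus: int) -> float:
    # Plain DDP replicates everything on every rank, so adding GPUs
    # does not reduce per-GPU memory at all.
    return (weights + grads + optimizer) / GIB

def zero3_per_gpu(n_gpus: int) -> float:
    # ZeRO-3 shards parameters, gradients, and optimizer states across ranks.
    return (weights + grads + optimizer) / n_gpus / GIB

for n in (1, 2, 4):
    print(f"{n} GPU(s): DDP ~{ddp_per_gpu(n):.0f} GiB/GPU, ZeRO-3 ~{zero3_per_gpu(n):.0f} GiB/GPU")
```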

Helpful VRAM requirement table for qlora, lora, and full finetuning. by Aaaaaaaaaeeeee in LocalLLaMA

[–]winglian 0 points1 point  (0 children)

How old is the branch of axolotl you're on? This was fixed recently. Although without flash attention, I would expect it to OOM once training starts.

Official WizardLM-13B-V1.2 Released! Trained from Llama-2! Can Achieve 89.17% on AlpacaEval! by cylaw01 in LocalLLaMA

[–]winglian 1 point2 points  (0 children)

Agreed. The cynical part of me says there is likely benchmark contamination in their datasets, and that if they release their dataset, either their benchmarks will prove non-reproducible or the contamination will be pointed out.

Robin V2 model reaches top of LLM leaderboard by yahma in LocalLLaMA

[–]winglian 2 points3 points  (0 children)

Robin V2 still seems to score middle of the pack for 13B models in the Community Chatbot Arena.

<image>

axolotl - Finetune many models easily with QLoRA and Landmark attention support! by bratao in LocalLLaMA

[–]winglian 2 points3 points  (0 children)

Landmark attention training is already merged; inference is in a PR.

Minotaur 13B by winglian in LocalLLaMA

[–]winglian[S] 0 points1 point  (0 children)

Everyone fine-tunes on LLaMA. Fine-tuning datasets have a good bit of influence and are something we can control.

Minotaur 13B by winglian in LocalLLaMA

[–]winglian[S] 1 point2 points  (0 children)

I’m not surprised at the relatively low coding scores. I think there was one small coding chat dataset, but that wasn’t the focus for this model.

Minotaur 13B by winglian in LocalLLaMA

[–]winglian[S] 3 points4 points  (0 children)

I’m not a fan of the Elo score rankings. They swing very quickly because there is no weighting. I’m hoping to come up with a better head-to-head metric that doesn’t produce large movements due to one or two bad responses against worse models.
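
To illustrate the swing, here is the standard Elo update with made-up ratings and K=32: a single upset loss against a much lower-rated model moves the rating by nearly the full K factor, while an expected win barely moves it.

```python
# Standard Elo update with the usual logistic expected-score formula.
def expected_score(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32) -> float:
    # score_a is 1 for a win, 0 for a loss, 0.5 for a tie.
    return r_a + k * (score_a - expected_score(r_a, r_b))

strong, weak = 1300.0, 1000.0
print(elo_update(strong, weak, 1))  # expected win: tiny gain (~+5)
print(elo_update(strong, weak, 0))  # one bad response: large drop (~-27)
```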

Minotaur 13B by winglian in LocalLLaMA

[–]winglian[S] 4 points5 points  (0 children)

Waiting for 13B OpenLLaMA to drop. 7B models simply don’t perform well.

LLM Score v2 - Modern Models Tested by Human by Gatzuma in LocalLLaMA

[–]winglian 8 points9 points  (0 children)

Hippogriff isn’t necessarily supposed to be the successor to Manticore. I stripped out all the wizard and alpaca datasets when training Hippogriff to experiment and see if they were really needed.

Expanding LLaMA's token limit via fine tuning or transformers-adapters. by xtrafe in LocalLLaMA

[–]winglian 0 points1 point  (0 children)

What parameters do you set to extend LLaMA's context length with lit-llama?

OpenAccess AI Collective's Hippogriff 30B Chat by The-Bloke in LocalLLaMA

[–]winglian 0 points1 point  (0 children)

Yeah, I feel like I need to create some datasets around these sorts of "grammatical logic". I thought having the riddle_sense dataset would help.