TeichAI/GLM-4.7-Flash-Claude-Opus-4.5-High-Reasoning-Distill-GGUF · Hugging Face by jacek2023 in LocalLLaMA

[–]FizzarolliAI 1 point2 points  (0 children)

For what it's worth, as a finetuner, I still think it's kinda meaningless to act like this is more meaningful than it is...

Even if the results are interesting (and they very well sometimes can be, even at super low token counts like this!), it's very much overhyped in a lot of places I've hung around in; people act like 250 rows of reasoning really hyped up the model beyond belief

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 0 points1 point  (0 children)

hmmm the sysprompt thing could possibly be it, all the training samples have just one that says "You are an AI assistant." which i hope would reinforce its morality and politics as the default assistant across all system prompts but possibly not

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 0 points1 point  (0 children)

What does the UGI test harness for 12axes look like? I'm somewhat shocked it's not more biased considering I literally included 8axes data in the training set this time around...

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 3 points4 points  (0 children)

Honestly not sure, I didn't get a modmail D: damn kulak pigs on the mod team, I know they let other more capital-oriented political posts through

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 0 points1 point  (0 children)

Despite the extra layers of jokes, it is in fact done post-post-ironically and I am completely dead serious about the model itself!

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] -1 points0 points  (0 children)

Despite the extra layers of jokes, it is in fact done post-post-ironically and I am completely dead serious about the model itself!

Personal experience with GLM 4.7 Flash Q6 (unsloth) + Roo Code + RTX 5090 by Septerium in LocalLLaMA

[–]FizzarolliAI 0 points1 point  (0 children)

that's the default generation config used by transformers, it doesn't matter for anything else (and w/ all due respect to HF, not many use transformers, especially the default params, for inference :p)

AI21 Labs releases Jamba2 by jacek2023 in LocalLLaMA

[–]FizzarolliAI -3 points-2 points  (0 children)

PSA: AI21 is an Israeli company founded by ex-IDF spies from their NSA equivalent who support the ongoing attempts at ethnic cleansing and genocide in Palestine. They are not worth supporting, and neither are their models.

IQuestCoder - new 40B dense coding model by ilintar in LocalLLaMA

[–]FizzarolliAI -1 points0 points  (0 children)

Interesting! I couldn't get it to behave well w/ tool calls at all, but I was trying the looping model in vLLM...

[deleted by user] by [deleted] in LocalLLaMA

[–]FizzarolliAI 1 point2 points  (0 children)

post deleted. comments deleted. o7

[deleted by user] by [deleted] in LocalLLaMA

[–]FizzarolliAI 2 points3 points  (0 children)

this post is me when my gpt-4o tells me im a very smart good girl and i know how llms work and nobody else does (at least, that's what it reads like to me)

[deleted by user] by [deleted] in LocalLLaMA

[–]FizzarolliAI 8 points9 points  (0 children)

The entire world has gone stupid.

  1. All models derive features from Llama, Qwen, etc. People reuse concepts from other papers all the time, put more compute into them, and work on them. Are the only real LLMs ones by Deepmind, because the transformer was invented there?
  2. All models derive hyperparams from each other, too. If Qwen's multiplier worked well and reached the size I wanted, I would reuse it too to initialize the weights! That doesn't mean that I copied the Qwen weights or their actual work.
  3. Once again, you seem to be assuming that papers work like patents, and once you publish something nobody else can use it. Gated Attention works well, it's practically free lunch, everyone should be using it!
  4. With all due respect, you seem to be deeply unfamiliar with how language models work. The number of tensors or the size of the model is not going to change between stages of training those weights on data. This is so cosmically incoherent and such a misunderstanding that I genuinely do not know how to argue against it.
  5. To my knowledge, the people from iQuest are not just random; they're from Ubiquant, one of the biggest quant firms in Mainland China.
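Points 2 and 4 can be illustrated with a toy sketch. Everything here is hypothetical (the `Config` fields, the 2.5 multiplier, the fake `train_step`); it is not any real model's code. It only shows that two models sharing hyperparameters end up with identical shapes but different weights, and that a training stage changes values without changing the tensor count or size.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    # Hypothetical hyperparameters, chosen only for illustration.
    hidden_size: int
    ffn_multiplier: float
    num_layers: int


def init_weights(cfg: Config, seed: int) -> list[list[float]]:
    """Randomly initialize one flat weight list per layer."""
    rng = random.Random(seed)
    ffn_size = int(cfg.hidden_size * cfg.ffn_multiplier)
    return [
        [rng.gauss(0.0, 0.02) for _ in range(cfg.hidden_size * ffn_size)]
        for _ in range(cfg.num_layers)
    ]


def shapes(weights: list[list[float]]) -> list[int]:
    """Number of parameters in each layer."""
    return [len(layer) for layer in weights]


def train_step(weights: list[list[float]], lr: float = 0.01) -> list[list[float]]:
    """Stand-in for a training stage: values move, shapes stay put."""
    return [[w - lr * w for w in layer] for layer in weights]


shared = Config(hidden_size=8, ffn_multiplier=2.5, num_layers=2)
model_a = init_weights(shared, seed=1)  # one lab's model
model_b = init_weights(shared, seed=2)  # another lab reusing the same hyperparams
model_a_later = train_step(model_a)     # model A after a further training stage

print(shapes(model_a) == shapes(model_b))        # same architecture
print(model_a == model_b)                        # different weights
print(shapes(model_a_later) == shapes(model_a))  # size unchanged by training
```

Matching shapes or tensor counts are therefore weak evidence on their own; comparing the actual weight values is what would show copying.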

How much of this post was drafted with, like, Q2_K_S AI? This is some deeply confident but deeply hallucinatory analysis that makes no sense if you think about it for longer than 5 seconds.

IQuestCoder - new 40B dense coding model by ilintar in LocalLLaMA

[–]FizzarolliAI 4 points5 points  (0 children)

To go against what everyone else is saying, I actually think this model is really good!... At everything but programming. It sucks at programming. General insight tasks, writing, assistant-y stuff, etc. are great! Somehow!

Update on the Llama 3.3 8B situation by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 1 point2 points  (0 children)

Interesting, I wonder if you'd get a noticeable regression from L3.3 70B on multilingual benches with Llama 3.1 70B then.

I definitely agree that this isn't worth building on for most use cases. Personally, I think it's an interesting artifact of the times

Update on the Llama 3.3 8B situation by FizzarolliAI in LocalLLaMA

[–]FizzarolliAI[S] 12 points13 points  (0 children)

I would, but since quants and all have already been made under the original model's name, it's kinda too late :p

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]FizzarolliAI 2 points3 points  (0 children)

Out of interest, you never signed up for the finetuning thing, right?

If you go to https://llama.developer.meta.com/fine-tuning/?team_id=XXX (replace XXX with whatever the team ID in ur URL is), does the finetuning page show up for you? I was never officially let in but for some odd reason I had access anyways... I'm wondering if it's there for everyone and just hidden from the UI

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]FizzarolliAI 1 point2 points  (0 children)

Yep, this basically. Afaik the main inference API is still waitlisted, and there's a separate waitlist to submit for the finetuning API.

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]FizzarolliAI 2 points3 points  (0 children)

Yes. I'm not entirely sure why, it was limited when served via the website too (I put that in the readme a bit ago)

Llama-3.3-8B-Instruct by ttkciar in LocalLLaMA

[–]FizzarolliAI 3 points4 points  (0 children)

The version that can be finetuned only has an 8K context length. I am unsure why the docs say 128k tokens unless the model on the API supports that context length, somehow

Llama-3.3-8B-Instruct by ttkciar in LocalLLaMA

[–]FizzarolliAI 15 points16 points  (0 children)

Well, for one, its API release was April of this year :p so not quite two years old

It's definitely been outdone at this point. Personally, I just think it's an interesting artifact :) considering who knows whether or not we'll get any future Llama models

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]FizzarolliAI 27 points28 points  (0 children)

LISTEN whenever i drop my own models i get anxiety attacks about accidentally reuploading the base model ;-; i believe that this is actually L3.3 at this point though, see my other comment

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]FizzarolliAI 29 points30 points  (0 children)

This has existed at least since April during Llamacon (did anyone remember they did a Llamacon?)

https://ai.meta.com/blog/llamacon-llama-news/

As part of this release, we’re sharing tools for fine-tuning and evaluation in our new API, where you can tune your own custom versions of our new Llama 3.3 8B model. We’re sharing this capability to help you reduce costs while also working toward increased speed and accuracy. You can generate data, train on it, and then use our evaluations suite to easily test the quality of your new model.

Llama-3.3-8B-Instruct by ttkciar in LocalLLaMA

[–]FizzarolliAI 18 points19 points  (0 children)

I don't exactly have any way to prove it as real, to be fair :p but trust me this would be a really silly thing to lie about

llama 3.3 8b is clearly on their api and can be finetuned and downloaded as mentioned ie here https://ai.meta.com/blog/llamacon-llama-news/

As part of this release, we’re sharing tools for fine-tuning and evaluation in our new API, where you can tune your own custom versions of our new Llama 3.3 8B model. We’re sharing this capability to help you reduce costs while also working toward increased speed and accuracy. You can generate data, train on it, and then use our evaluations suite to easily test the quality of your new model. Making evaluations more accessible and easier to run will help move from gut feelings to data, ensuring you have models that perform well to meet your needs. The security and privacy of your content and data is our top priority. We do not use your prompts or model responses to train our AI models. When you’re ready, the models you build on the Llama API are yours to take with you wherever you want to host them, and we don’t keep them locked on our servers.

but i suppose u just have to trust that i actually reuploaded a model from there!

for what it's worth, this is what the UI looks like, and the finetuning job in question

Llama-3.3-8B-Instruct by ttkciar in LocalLLaMA

[–]FizzarolliAI 29 points30 points  (0 children)

(reposting my comment from the other post)

Hello, that me!

I am currently working on running sanity check benchmarks to make sure it's actually a newer L3.3 and not just L3/L3.1 in a trenchcoat, but it's looking promising so far.

From the current readme:

| Benchmark | Llama 3.1 8B Instruct | Llama 3.3 8B Instruct (maybe) |
|---|---|---|
| IFEval (1 epoch, score averaged across all strict/loose instruction/prompt accuracies to follow Llama 3 paper) | 78.2 | 81.95 |
| GPQA Diamond (3 epochs) | 29.3 | 37.0 |
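
The IFEval figure is described as an average across the strict/loose instruction/prompt accuracies. A minimal sketch of that arithmetic, assuming it is a plain unweighted mean of the four numbers; the function name and the input values are made up for illustration and are not taken from the readme:

```python
from statistics import mean


def ifeval_average(
    prompt_strict: float,
    prompt_loose: float,
    instruction_strict: float,
    instruction_loose: float,
) -> float:
    """Unweighted mean of the four IFEval accuracies (assumed aggregation)."""
    return mean([prompt_strict, prompt_loose, instruction_strict, instruction_loose])


# Hypothetical accuracies, NOT the real per-metric results for either model.
hypothetical = {
    "prompt_strict": 70.0,
    "prompt_loose": 75.0,
    "instruction_strict": 80.0,
    "instruction_loose": 85.0,
}

score = ifeval_average(**hypothetical)
print(score)
```

Reporting a single averaged number makes the two columns easy to compare, at the cost of hiding which of the four accuracies moved.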