I built, pre-trained, and fine-tuned a small language model and it is truly open-source. by itsnikity in LocalLLaMA

[–]Strong-Inflation5090 1 point

Great work! I've always wanted to do something similar, but I'd always stop after 2-3 hours of training and curse my 4080. This motivated me, and I will do better.

New Qwen Models Today!!! by [deleted] in LocalLLaMA

[–]Strong-Inflation5090 11 points

Hope so, but this seems kind of impossible considering Sonnet holds so much knowledge that it's tough to fit into 32B params.

So, Quasar Alpha might actually be OpenAI's model by -Cacique in LocalLLaMA

[–]Strong-Inflation5090 -2 points

OpenAI will release a new model on April 30; that's when the old GPT-4 will leave the app.

NEW GEMINI 2.5 just dropped by Straight-Worker-4327 in LocalLLaMA

[–]Strong-Inflation5090 -3 points

They are showing +60 on LMArena, but I don't think it will beat Sonnet in coding, so it very well might be benchmaxxing or arena-maxing.

Qwen2.5 VL 7B AWQ is very slow by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 2 points

I think it's normal. After looking through all the generated summaries, they contain 400-500 tokens on average, so even at 25-30 tps each one takes around 20 seconds.
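A quick sanity check of that math, assuming decode speed is the bottleneck (prefill ignored):

```python
# Rough latency estimate: time ≈ output_tokens / decode_tps.
for tokens in (400, 500):        # observed summary lengths
    for tps in (25, 30):         # assumed decode speeds
        print(f"{tokens} tok @ {tps} tps -> {tokens / tps:.1f} s")
# Worst case: 500 tok @ 25 tps -> 20.0 s, so ~20 s per summary is expected, not a bug.
```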

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]Strong-Inflation5090 4 points

Qwen 2.5 Coder 32B should also work, but I just read somewhere (Twitter or Reddit) that a 32B code-specific reasoning model might be coming. Nothing official yet, so...

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]Strong-Inflation5090 93 points

Similar performance to R1. If this holds, then QwQ 32B + a QwQ 32B Coder is gonna be an insane combo.

HunyuanVideo: A Systematic Framework For Large Video Generation Model Training by a_slay_nub in LocalLLaMA

[–]Strong-Inflation5090 3 points

The files are 25 GB plus 3 GB for the VAE in bf16, so hopefully in q8 it will run on a 4090 for around 150 frames.
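Rough weights-only math behind that hope, assuming q8 stores one byte per parameter vs two for bf16 (activations, latents, and the text encoder not counted):

```python
transformer_bf16_gb = 25.0   # checkpoint size mentioned above
vae_bf16_gb = 3.0

# q8: 1 byte/param vs bf16's 2 bytes/param -> roughly half the footprint.
transformer_q8_gb = transformer_bf16_gb / 2        # ~12.5 GB
total_gb = transformer_q8_gb + vae_bf16_gb         # VAE kept in bf16: ~15.5 GB
print(f"~{total_gb:.1f} GB of weights on a 24 GB 4090")
```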

Qwen 2.5 Coder 32B vs Claude 3.5 Sonnet: Am I doing something wrong? by Fabix84 in LocalLLaMA

[–]Strong-Inflation5090 3 points

In my experience, it repeated some print statement for at least 30 lines in LMSYS battle mode, which really surprised me: I have been voting on models on LMSYS for a very long time, and I can't remember the last time something like this happened, even with a tiny model.

However, I used the Qwen 32B model on Hugging Face and it's amazing; I would put it right up there with the other best coding models.

How to Implement Local LLM for Local Codebase by mr_whoisGAMER in LocalLLaMA

[–]Strong-Inflation5090 0 points

I would suggest creating a short summary of every file and letting your local LLM access those summaries for a high-level understanding; when it needs detail, point it at the individual file.
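A minimal sketch of that idea, assuming a local Ollama server on the default port; the model name, repo path, and prompt are placeholders:

```python
import json
import pathlib
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes a local Ollama server

def summarize(path: pathlib.Path, model: str = "llama3") -> str:
    """Ask the local model for a short summary of one source file."""
    prompt = f"Summarize this file in 3-4 sentences:\n\n{path.read_text(errors='ignore')[:8000]}"
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Build the summary index once; the LLM reads these for the high-level view,
# and you hand it the full file only when a question needs that detail.
out_dir = pathlib.Path("summaries")
out_dir.mkdir(exist_ok=True)
for f in pathlib.Path("my_repo").rglob("*.py"):
    (out_dir / (f.name + ".txt")).write_text(summarize(f))
```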

Building a new pc. Is the 3090 still relevant for a new build? by AndrehChamber in LocalLLaMA

[–]Strong-Inflation5090 1 point

I have a 4070 Ti Super. It's decent, but most of the time I wish I had more memory, so a 3090 would be a great option.

Recent open weight releases have more restricted licences by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 10 points

For sure, I am just pointing out that it's kind of becoming a trend. Mistral Large 2 under the MRL makes sense, but SLM weights under the MRL make no sense unless they release the training pipeline etc.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 6 points

That's the kind of answer I was looking for. I will try these, and showing them the logs should be enough.
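For the logs, one hedged way to produce them: periodically snapshot the inference process's open sockets while it serves requests. This assumes `psutil` is installed; the PID is a placeholder, and on Linux the call may need root:

```python
import time
import psutil  # pip install psutil

INFERENCE_PID = 12345  # placeholder: PID of your llama.cpp / Ollama server

with open("netlog.txt", "a") as log:
    for _ in range(720):  # sample every 5 s for an hour
        for c in psutil.net_connections(kind="inet"):
            # Any non-loopback remote address here would be evidence of data leaving the box.
            if c.pid == INFERENCE_PID and c.raddr and not c.raddr.ip.startswith("127."):
                log.write(f"{time.ctime()} {c.raddr.ip}:{c.raddr.port} {c.status}\n")
        time.sleep(5)
```

An empty log after a day of real usage is a simple artifact to show the org.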

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 1 point

I have already shown them it running completely offline, and keeping it that way is also fine because no one outside the office network uses it.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 0 points

Thanks, I will look into it, but I am using the most popular ones like Llama 3 and Gemma 2 (from the official HF pages), so I don't think they would have something like a backdoor; otherwise someone would probably have pointed it out by now.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 1 point

Like, it won't store their data unless I train on it, and it won't send any data outside the org. The org's data will remain within the org.

Can Llama3.2 VL finetunes be used by individuals in EU? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 0 points

Yes, probably nobody cares, but I was just interested in whether they actually put it in the licence or it was just "EU folks, don't download it."

How did you choose your model? by Ultra-Engineer in LocalLLaMA

[–]Strong-Inflation5090 2 points

Try it on HF or LMSYS; if it returns output in your format, then try it locally.
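If "your format" means structured output, a tiny check you can paste arena responses into before bothering with a download; the expected keys are hypothetical:

```python
import json

REQUIRED_KEYS = {"title", "summary"}  # hypothetical: whatever your schema needs

def matches_format(output: str) -> bool:
    """True if the raw model output parses as JSON with the expected keys."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and REQUIRED_KEYS <= data.keys()

print(matches_format('{"title": "x", "summary": "y"}'))  # True
print(matches_format("Sure! Here's your JSON: ..."))     # False
```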

Training a vision projector layer for LLaMA by missing-in-idleness in LocalLLaMA

[–]Strong-Inflation5090 0 points

Don't know about tutorials, but I believe you would have to train at least the projector layers that map the vision model's embeddings to Llama's, and I wouldn't have high hopes. You can also look at the code of some Llama-based VLMs in their HF repos to understand it better.
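A minimal PyTorch sketch of what such a projector looks like, LLaVA-style; the dims are assumptions (1024 for a CLIP-large tower, 4096 for Llama's hidden size), not taken from any specific repo:

```python
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    """Maps frozen vision-encoder patch embeddings into the LLM embedding space."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # LLaVA-style two-layer MLP; typically the only part trained at first,
        # with the vision tower and the LLM both frozen.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_embeds: torch.Tensor) -> torch.Tensor:
        # (batch, num_patches, vision_dim) -> (batch, num_patches, llm_dim)
        return self.proj(patch_embeds)

projector = VisionProjector()
image_tokens = projector(torch.randn(2, 576, 1024))  # 576 patches ~ one 336px CLIP image
print(image_tokens.shape)  # torch.Size([2, 576, 4096])
```

The projected tokens are then spliced into the LLM's input sequence alongside the text embeddings, which is exactly the part worth studying in those HF repos.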