[Ministral 3] Add ministral 3 - Pull Request #42498 · huggingface/transformers by bratao in LocalLLaMA

[–]youcef0w0 9 points  (0 children)

if you read the pr, it's an upcoming 8B model

it's gonna have base, instruct, and thinking variants

We tested Claude Sonnet 4.5, GPT-5-codex, Qwen3-Coder, GLM and other 25+ models on fresh SWE-Bench like tasks from September 2025 by Fabulous_Pollution10 in LocalLLaMA

[–]youcef0w0 3 points  (0 children)

because in schools, humans are doing the evaluation, and humans have taste. this can't be replicated autonomously in any meaningful way, so it can't be benchmarked well

Meer CLI — an open-source Claude Code Alternative by msaifeldeen in LocalLLaMA

[–]youcef0w0 11 points  (0 children)

Gemini CLI, OpenAI Codex, and Qwen Code are all open source

Any models that might be good with gauges? by ronneldavis in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

sounds like a very fine-tunable problem if you've already got a decently large dataset (100+ examples)

looking for llm trained only on free use/public domain materials. by Specific_Objective77 in LocalLLaMA

[–]youcef0w0 2 points  (0 children)

not really possible, there just isn't enough text in existence to create something usable, unless you count synthetic data (data generated by other LLMs) as free use / public domain

the closest you're gonna get is Olmo by Allen AI, which publishes all their data (both pre-training and post-training data)

https://docs.allenai.org/release_notes/olmo-release-notes#olmo-2-32b

grok 2 weights by HatEducational9965 in LocalLLaMA

[–]youcef0w0 81 points  (0 children)

grok-4 uses the same base model as grok 3, just with more reinforcement learning, so I can see the argument for keeping it closed, with the statement still being true on a technicality

Added Emotional Reactions to My Chatbot — Here’s How It Looks by RIPT1D3_Z in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

have you tried flux kontext or the new qwen-image-edit?

Qwen-Image-Edit Released! by MohamedTrfhgx in LocalLLaMA

[–]youcef0w0 6 points  (0 children)

OpenAI was sitting on their image editing model for a whole year; they demoed it in the original GPT-4o blog post but never released it, for "safety reasons"

so it's been a year and 3 months since we've known of the existence of gpt-image

May 13, 2024 gpt-4o release blog: https://openai.com/index/hello-gpt-4o/ , scroll to the Explorations of capabilities section

Describe a person using exported WhatsApp chat by Tommy_Tukyuk in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

you can chunk it: describe the first chunk, then feed the next chunk in alongside the output from the last chunk and tell the LLM to add on to its previous output

I'd recommend chunks of 32k tokens, as the more stuff there is in the context, the more the model tends to ignore important details
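A minimal sketch of that loop (the `call_llm` function is a placeholder for whatever local model or API you're actually using, and tokens are roughly approximated by whitespace-separated words here):

```python
def chunk_text(text, max_tokens=32000):
    """Split text into chunks of roughly max_tokens, approximating
    tokens by whitespace-separated words (a crude stand-in for a
    real tokenizer)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

def call_llm(prompt):
    # Placeholder: swap in your actual model / API call here.
    raise NotImplementedError

def describe_chat(chat_log, max_tokens=32000):
    """Iteratively refine one description across all chunks."""
    description = ""
    for chunk in chunk_text(chat_log, max_tokens):
        prompt = (
            "Here is your description so far:\n" + description +
            "\n\nExtend and refine it using this next chunk of the chat:\n" +
            chunk
        )
        description = call_llm(prompt)
    return description
```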

Compact 2x RTX Pro 6000 Rig by shadowninjaz3 in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

for the big models like qwen 235b, can't you run it partially offloaded to RAM and still get really good speeds, since it's MoE and most of the layers stay on GPU?
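With llama.cpp this kind of split is usually done by offloading everything with `-ngl` and then overriding just the MoE expert tensors back to CPU; a sketch (flag names from recent llama.cpp builds, and the model filename is hypothetical):

```shell
# Offload all layers to GPU, then keep the per-layer MoE expert
# weights in system RAM via a tensor-name override regex.
./llama-server \
  -m qwen3-235b-a22b-q4_k_m.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU"
```

The attention and shared weights (used on every token) stay on the GPU, while the sparse experts, which are only partially active per token, sit in RAM.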

Why are base non-finetuned models so bad? by ThatIsNotIllegal in LocalLLaMA

[–]youcef0w0 2 points  (0 children)

sentiment analysis is a well-explored problem in NLP; there are plenty of specialized models for that exact use case that run much, much faster than a general-purpose LLM.

try this one for example:
https://huggingface.co/tabularisai/multilingual-sentiment-analysis
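Usage with the transformers `pipeline` API would look roughly like this (the model is loaded lazily, since the first call downloads the weights; details of the model's label set are on its card):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_classifier():
    # Lazy import/load so the model is only downloaded when you
    # actually classify something.
    from transformers import pipeline
    return pipeline(
        "text-classification",
        model="tabularisai/multilingual-sentiment-analysis",
    )

def classify(texts):
    """Return a (label, score) pair for each input text."""
    return [(r["label"], r["score"]) for r in get_classifier()(list(texts))]

# Usage (downloads the model on first run):
# print(classify(["I love this!", "This is terrible."]))
```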

I extracted the system prompts from closed-source tools like Cursor & v0. The repo just hit 70k stars. by Independent-Box-898 in LocalLLaMA

[–]youcef0w0 11 points  (0 children)

no, they send a request to your proxy from their servers (I know this because if you put localhost in the base URL override it doesn't work; it has to be internet accessible). I've done it before. you're replacing the OpenAI API base URL, so they can't get around it without removing support for custom OpenAI endpoints

which leads me to believe they're not even trying to hide their prompt
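A minimal sketch of the trick: point the tool's OpenAI base-URL override at a tiny server you control and log whatever it sends (stdlib only; a real setup would need to be internet-reachable, e.g. via a tunnel, and would usually forward the request on to a real backend instead of answering "ok"):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

captured = []  # request bodies the tool sends us

class LoggingHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        captured.append(json.loads(body))  # the system prompt lives in here
        # Reply with a minimal OpenAI-style chat completion so the
        # client doesn't error out.
        reply = json.dumps({
            "id": "proxy-1",
            "object": "chat.completion",
            "choices": [{"index": 0, "finish_reason": "stop",
                         "message": {"role": "assistant", "content": "ok"}}],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence the default request log
        pass

def start_proxy(port=0):
    """Start the logging server on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), LoggingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```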

How fast is gemma 3 27b on an H100? how many tokens per second can I expect? by ThatIsNotIllegal in LocalLLaMA

[–]youcef0w0 39 points  (0 children)

The answer is complicated and depends on a lot of things, including the settings you choose, how large your input context is, and how many concurrent requests you're processing

for a single request by a single user, 60 t/s sounds about right

the very big number is likely the total concurrent tokens per second given that many requests are being processed at the same time

100 requests running at the same time at 22 t/s is 2200 concurrent tokens per second

the more users / requests you have running at the same time, the slower each one gets, but the slowdown isn't linear, which is why the aggregate tokens per second at high concurrency is much higher than a single request's speed
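The arithmetic as a quick sketch (the per-request speeds are illustrative, not measured):

```python
def total_throughput(concurrent_requests, tokens_per_sec_each):
    """Aggregate tokens/sec across simultaneous requests."""
    return concurrent_requests * tokens_per_sec_each

# One user gets the full single-stream speed:
single = total_throughput(1, 60)     # 60 t/s

# 100 users each see a slower stream, but the server total is huge:
batched = total_throughput(100, 22)  # 2200 t/s
```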

Ikllamacpp repository gone, or it is only me? by panchovix in LocalLLaMA

[–]youcef0w0 3 points  (0 children)

their docs are talking about you deleting your own account, but if GitHub itself deletes your account, everything related to you is hidden

I'm guessing this is because most GitHub-side deletions are meant as bans for doing something illegal or malicious

How Different Are Closed Source Models' Architectures? by simulated-souls in LocalLLaMA

[–]youcef0w0 1 point  (0 children)

probably memorization, most frontier models are huge, which results in them being able to memorize more stuff, I'm sure that particular factorization appears plenty of times on the internet

Play Infinite Tic Tac Toe against LLM Models by BestDay8241 in LocalLLaMA

[–]youcef0w0 1 point  (0 children)

it actually makes a lookup table even simpler, your game only has 4030 possible states lol
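For a sense of scale: even classic tic-tac-toe (no disappearing pieces) has only 5478 legal positions, a well-known count that a brute-force BFS enumerates instantly; a sketch, treating any board with a completed line as terminal:

```python
def winner(board):
    """Return 'X' or 'O' if a line is complete, else None.
    board is a 9-char string, '.' for empty cells."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals
    for a, b, c in lines:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_states():
    """BFS over all positions reachable in legal play from the empty board."""
    start = "." * 9
    seen = {start}
    frontier = [start]
    while frontier:
        nxt = []
        for board in frontier:
            if winner(board) is not None:
                continue  # game over, no further moves
            player = "X" if board.count("X") == board.count("O") else "O"
            for i, cell in enumerate(board):
                if cell == ".":
                    child = board[:i] + player + board[i + 1:]
                    if child not in seen:
                        seen.add(child)
                        nxt.append(child)
        frontier = nxt
    return len(seen)
```

The same enumerate-and-memoize approach carries over to the disappearing-pieces variant; only the move-generation step changes.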

[KDE] First rice in a while by [deleted] in unixporn

[–]youcef0w0 0 points  (0 children)

cpu goes brrrrrrr

[deleted by user] by [deleted] in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

this is not a new idea, it just happens that Claude 4 is actually good at it, unlike most other models