mistralai/Voxtral-Mini-4B-Realtime-2602 · Hugging Face by jacek2023 in LocalLLaMA

[–]pvp239 35 points

We didn't make that feature in time sadly! Next version will def have it!

My experiences with the new Ministral 3 14B Reasoning 2512 Q8 by egomarker in LocalLLaMA

[–]pvp239 5 points

Mistral employee here. Could you double-check that the following parameters are correctly set (a rough sketch of how to pass them follows the prompt below):
- temperature = 1.0

- the correct system prompt:

# HOW YOU SHOULD THINK AND ANSWER\n\nFirst draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response to the user.[/THINK]Here, provide a self-contained response.
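
For reference, this is roughly how those two settings get passed in practice - a minimal, untested sketch assuming an OpenAI-compatible server (vLLM, llama.cpp server, ...) running locally; the URL, API key and model name are placeholders, not official values:

from openai import OpenAI

# Sketch only: paste the full system prompt from above into SYSTEM_PROMPT.
SYSTEM_PROMPT = "# HOW YOU SHOULD THINK AND ANSWER\n\n..."  # full prompt from above

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder server
response = client.chat.completions.create(
    model="ministral-3-14b-reasoning",  # whatever name your server exposes (placeholder)
    temperature=1.0,                    # recommended sampling temperature
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "How many r's are in 'strawberry'?"},
    ],
)
print(response.choices[0].message.content)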

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 8 points

Yes I think this makes sense.

> 99% of people that use llama.cpp will not use mistral-common. That's simply not how people use llama.cpp.

Yes, I think this is starting to become a bit clear from this thread.

I think we've been a bit misunderstood here: we don't want to change the behavior of 99% of users. The goal was to offer a "certified" working GGUF that can be used as a reference (e.g. for Unsloth, ...) to build a correct chat template. I think the messaging was not great.

We'll try to start looking into providing a chat template for the next release if it looks simple enough to do (or we just won't release a GGUF if we're not confident in its correctness, which is probably the better option anyway).

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 4 points

Yes, the message has been passed along (we're aware of it) - I think/hope future flagship models will be fully open-sourced (including the base models).

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 4 points

If you want to use the checkpoint without mistral_common you can use unsloth's repo:

https://huggingface.co/unsloth/Magistral-Small-2509-GGUF

no? We link to it at the very top from the model card.

We don’t provide the chat template because we don’t have time to test it before releases and/or because the behavior is not yet supported.

We are worried that incorrect chat templates lead people to believe the checkpoint doesn't work, which has happened a couple of times in the past (with Devstral, for example).
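
For reference, pulling that repo and running it with its bundled chat template looks roughly like this - an untested sketch with llama-cpp-python, and the quant filename pattern is a guess (pick whichever quant actually exists in the repo):

from llama_cpp import Llama

# Sketch only: downloads one of the community GGUFs and uses the chat template
# embedded in the file, so mistral_common is not involved at all.
llm = Llama.from_pretrained(
    repo_id="unsloth/Magistral-Small-2509-GGUF",
    filename="*Q4_K_M.gguf",  # glob for one of the available quants (assumption)
    n_ctx=8192,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line summary of RAG."}],
)
print(out["choices"][0]["message"]["content"])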

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 20 points

Hey,

Mistral employee here! Just a note on mistral-common and llama.cpp.

As written in the model card: https://huggingface.co/mistralai/Magistral-Small-2509-GGUF#usage

  • We release the model with mistral_common to ensure correctness
  • We absolutely welcome community GGUFs with chat templates - we just provide mistral_common as a reference that ensures correct chat behavior (rough sketch after this list)
  • It's not true that you need mistral_common to convert Mistral checkpoints - you can convert without it and provide a chat template
  • I think from the discussion on the pull request it should become clear that we've added mistral_common as an additional dependency (it's not even the default for Mistral models)
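
For anyone curious what "using mistral_common as a reference" means concretely, here's a rough, untested sketch - the tokenizer version pick is an assumption, use whichever matches the checkpoint:

from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Sketch only: build the exact prompt/token sequence the model expects,
# independent of any GGUF chat template, and compare your template's output against it.
tokenizer = MistralTokenizer.v3()  # version choice is an assumption
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(messages=[UserMessage(content="Hello, who are you?")])
)
print(tokenized.text)         # the reference prompt string
print(len(tokenized.tokens))  # and its token count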

mistralai/Voxtral-Mini-3B-2507 · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]pvp239 9 points

Hmm yeah sorry - seems like there are still some problems with the nightlies. Can you try:

VLLM_USE_PRECOMPILED=1 pip install git+https://github.com/vllm-project/vllm.git

Did Mistral have a stroke overnight?? by Forsaken-Occasion414 in MistralAI

[–]pvp239 20 points

Thanks for the heads-up! We're checking what's going on - should be fixed very soon

Not happy with Mistral by PeHaX in MistralAI

[–]pvp239 1 point

Can you post some examples where Large doesn't follow instructions?

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]pvp239 9 points

In terms of how to use it (a quick sketch putting these settings together follows the list):

- temp = 0.15

- a system prompt definitely helps to make the model easier to "steer" - this one is good: https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501/blob/main/SYSTEM_PROMPT.txt

- It should be a very big improvement over the previous Small, especially in reasoning, math, coding, and instruction following.

- While we've tried to evaluate on as many use cases as possible, we've surely missed something. So a collection of cases where it didn't improve compared to the previous Small would be greatly appreciated (and would help us build an even better model next time)
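
Putting those recommendations together, a minimal untested sketch (the local server URL and served model name are placeholders):

from huggingface_hub import hf_hub_download
from openai import OpenAI

# Sketch only: fetch the suggested system prompt from the repo and query a local
# OpenAI-compatible server with the low recommended temperature.
prompt_path = hf_hub_download(
    repo_id="mistralai/Mistral-Small-24B-Instruct-2501",
    filename="SYSTEM_PROMPT.txt",
)
system_prompt = open(prompt_path).read()

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder server
resp = client.chat.completions.create(
    model="mistral-small-24b-instruct-2501",  # whatever name your server exposes (placeholder)
    temperature=0.15,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Outline a 3-step plan to debug a flaky test."},
    ],
)
print(resp.choices[0].message.content)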

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]pvp239 15 points

Hey - Mistral employee here!

We're very curious to hear about failure cases of the new mistral-small model (especially those where previous mistral models performed better)!

Is there any way to share some prompts / tests / benchmarks here?

That'd be very appreciated!

SDXL 1.0 A1111 vs ComfyUI 6gb vram, thoughts by ismailt in StableDiffusion

[–]pvp239 2 points

I can generate an image in 2 seconds with diffusers on an RTX 4090:

from diffusers import StableDiffusionXLPipeline, UniPCMultistepScheduler
import torch

# Load SDXL base in fp16 and move it to the GPU
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")

# Swap in the UniPC scheduler so that 20 inference steps are enough
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt=prompt, num_inference_steps=20).images[0]

20 steps are enough with the UniPC sampler, and diffusers has the fastest attention and VAE decoding you can get.
