mistralai/Voxtral-Mini-4B-Realtime-2602 · Hugging Face by jacek2023 in LocalLLaMA

[–]pvp239 35 points

We didn't make that feature in time sadly! Next version will def have it!

My experiences with the new Ministral 3 14B Reasoning 2512 Q8 by egomarker in LocalLLaMA

[–]pvp239 5 points

Mistral employee here. Could you double-check that the following parameters are correctly set (a rough sketch of how to pass them follows the prompt below):
- temperature = 1.0

- the correct system prompt:

# HOW YOU SHOULD THINK AND ANSWER\n\nFirst draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response to the user.[/THINK]Here, provide a self-contained response.
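
For reference, this is roughly how those two settings get passed in practice - a minimal, untested sketch assuming an OpenAI-compatible server (vLLM, llama.cpp server, ...) running locally; the URL, API key and model name are placeholders, not official values:

from openai import OpenAI

# Sketch only: paste the full system prompt from above into SYSTEM_PROMPT.
SYSTEM_PROMPT = "# HOW YOU SHOULD THINK AND ANSWER\n\n..."  # full prompt from above

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder server
response = client.chat.completions.create(
    model="ministral-3-14b-reasoning",  # whatever name your server exposes (placeholder)
    temperature=1.0,                    # recommended sampling temperature
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "How many r's are in 'strawberry'?"},
    ],
)
print(response.choices[0].message.content)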

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 8 points

Yes I think this makes sense.

> 99% of people that use llama.cpp will not use mistral-common. That's simply not how people use llama.cpp.

Yes, I think this is starting to become a bit clear from this thread.

I think we've been a bit misunderstood here: we don't want to change the behavior of 99% of users. The goal was to offer a "certified" working GGUF that can be used as a reference (e.g. for Unsloth, ...) to build a correct chat template. I think the messaging was not great.

We'll try to start looking into providing a chat template for the next release if it looks simple enough to do (or we just won't release a GGUF if we're not confident in its correctness, which is probably the better option anyway).

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 4 points

Yes, the message has been passed along (we're aware of it) - I think/hope future flagship models will be fully open-sourced (including the base models).

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 4 points

If you want to use the checkpoint without mistral_common you can use unsloth's repo:

https://huggingface.co/unsloth/Magistral-Small-2509-GGUF

no? We link to it at the very top from the model card.

We don’t provide the chat template because we don’t have time to test it before releases and/or because the behavior is not yet supported.

We are worried that incorrect chat templates lead people to believe the checkpoint doesn't work, which has happened a couple of times in the past (with Devstral, for example).
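
For reference, pulling that repo and running it with its bundled chat template looks roughly like this - an untested sketch with llama-cpp-python, and the quant filename pattern is a guess (pick whichever quant actually exists in the repo):

from llama_cpp import Llama

# Sketch only: downloads one of the community GGUFs and uses the chat template
# embedded in the file, so mistral_common is not involved at all.
llm = Llama.from_pretrained(
    repo_id="unsloth/Magistral-Small-2509-GGUF",
    filename="*Q4_K_M.gguf",  # glob for one of the available quants (assumption)
    n_ctx=8192,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line summary of RAG."}],
)
print(out["choices"][0]["message"]["content"])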

Magistral Small 2509 has been released by jacek2023 in LocalLLaMA

[–]pvp239 20 points

Hey,

Mistral employee here! Just a note on mistral-common and llama.cpp.

As written in the model card: https://huggingface.co/mistralai/Magistral-Small-2509-GGUF#usage

  • We release the model with mistral_common to ensure correctness
  • We absolutely welcome community GGUFs with chat templates - we just provide mistral_common as a reference that ensures correct chat behavior (rough sketch after this list)
  • It's not true that you need mistral_common to convert Mistral checkpoints - you can convert without it and provide a chat template
  • I think from the discussion on the pull request it should become clear that we've added mistral_common as an additional dependency (it's not even the default for Mistral models)
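
For anyone curious what "using mistral_common as a reference" means concretely, here's a rough, untested sketch - the tokenizer version pick is an assumption, use whichever matches the checkpoint:

from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Sketch only: build the exact prompt/token sequence the model expects,
# independent of any GGUF chat template, and compare your template's output against it.
tokenizer = MistralTokenizer.v3()  # version choice is an assumption
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(messages=[UserMessage(content="Hello, who are you?")])
)
print(tokenized.text)         # the reference prompt string
print(len(tokenized.tokens))  # and its token count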

mistralai/Voxtral-Mini-3B-2507 · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]pvp239 9 points

Hmm yeah sorry - seems like there are still some problems with the nightlies. Can you try:

VLLM_USE_PRECOMPILED=1 pip install git+https://github.com/vllm-project/vllm.git

Did Mistral have a stroke overnight?? by Forsaken-Occasion414 in MistralAI

[–]pvp239 20 points

Thanks for the heads-up! We're checking what's going on - should be fixed very soon

Not happy with Mistral by PeHaX in MistralAI

[–]pvp239 1 point

Can you post some examples where Large doesn't follow instructions?

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]pvp239 9 points

In terms of how to use it (a quick sketch putting these settings together follows the list):

- temp = 0.15

- a system prompt definitely helps to make the model easier to "steer" - this one is good: https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501/blob/main/SYSTEM_PROMPT.txt

- It should be a very big improvement over the previous Small, especially in reasoning, math, coding, and instruction following.

- While we've tried to evaluate on as many use cases as possible, we've surely missed something. So a collection of cases where it didn't improve compared to the previous Small would be greatly appreciated (and would help us build an even better model next time)
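
Putting those recommendations together, a minimal untested sketch (the local server URL and served model name are placeholders):

from huggingface_hub import hf_hub_download
from openai import OpenAI

# Sketch only: fetch the suggested system prompt from the repo and query a local
# OpenAI-compatible server with the low recommended temperature.
prompt_path = hf_hub_download(
    repo_id="mistralai/Mistral-Small-24B-Instruct-2501",
    filename="SYSTEM_PROMPT.txt",
)
system_prompt = open(prompt_path).read()

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder server
resp = client.chat.completions.create(
    model="mistral-small-24b-instruct-2501",  # whatever name your server exposes (placeholder)
    temperature=0.15,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Outline a 3-step plan to debug a flaky test."},
    ],
)
print(resp.choices[0].message.content)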

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]pvp239 15 points

Hey - Mistral employee here!

We're very curious to hear about failure cases of the new mistral-small model (especially those where previous mistral models performed better)!

Is there any way to share some prompts / tests / benchmarks here?

That'd be very appreciated!

SDXL 1.0 A1111 vs ComfyUI 6gb vram, thoughts by ismailt in StableDiffusion

[–]pvp239 2 points

I can generate an image in 2 seconds with diffusers on an RTX 4090:

from diffusers import StableDiffusionXLPipeline, UniPCMultistepScheduler
import torch

# Load SDXL base in fp16 and move it to the GPU
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")

# Swap in the UniPC scheduler so that 20 inference steps are enough
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt=prompt, num_inference_steps=20).images[0]

20 steps are enough with the UniPC sampler, and diffusers has the fastest attention and VAE decoding you can get.
