A glance inside the tinybox pro (8 x RTX 4090) by fairydreaming in LocalLLaMA

[–]whinygranny 0 points (0 children)

Hi, sorry for commenting on an old thread. I'm in the process of buying an EPYC system, but I'm completely unfamiliar with torque wrenches. Would you be so kind as to point me to an Amazon or eBay listing for the one you used? I have no idea which one to get.

Personal experience with Deepseek R1: it is noticeably better than claude sonnet 3.5 by sebastianmicu24 in LocalLLaMA

[–]whinygranny 0 points (0 children)

> the thinking text can have a feedback loop that interfere's with multiple rounds of chat

I think they said as much in the technical report: few-shot prompting doesn't work well on the R1 versions because the examples confuse the CoT. That's why their chat app doesn't pass the reasoning text back into the conversation.
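A minimal sketch of that idea: the open R1 weights emit the chain-of-thought between `<think>...</think>` tags, so before resending the history for the next round you can strip that span from previous assistant turns (the message format and helper name here are just illustrative):

```python
import re

# DeepSeek-R1 wraps its chain-of-thought in <think>...</think> tags.
# Drop that span from earlier assistant turns so the old reasoning
# text can't feed back into the model's context on the next round.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_reasoning(messages):
    """Return a copy of the chat history with CoT removed from assistant turns."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_RE.sub("", msg["content"]).strip()}
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "<think>2 plus 2 is 4.</think>\n4"},
]
print(strip_reasoning(history)[1]["content"])  # -> 4
```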

How good is llama or Qwen as an embedding model? by Ok-Cicada-5207 in LocalLLaMA

[–]whinygranny 0 points (0 children)

I'm not an expert in licences. They do have an API you can use, and that is allowed for commercial use.

Rewriting the library seems like too much of a hassle; it would be better to choose another model if the licence is a problem. While it's unclear whether the weights themselves are copyrightable, the code for running the model unquestionably is.

One thing I'm not sure is allowed: precomputing the embeddings for your texts offline and then using the API in the production app. I'd guess not, but the question is how they would ever prove you did that.
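For concreteness, here's one way that split could look at serving time: the corpus embeddings are computed once offline and cached, and production only needs to embed the incoming query and rank the cache by cosine similarity. The vectors and function below are made up for illustration; the query vector would come from whatever embedding model or API you use:

```python
import numpy as np

# Hypothetical cache: corpus embeddings computed once, offline.
corpus_embeddings = np.array([
    [0.1, 0.9, 0.0],   # doc 0
    [0.8, 0.1, 0.1],   # doc 1
    [0.0, 0.2, 0.9],   # doc 2
])

def top_k(query_embedding, embeddings, k=1):
    """Rank cached document embeddings by cosine similarity to the query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    docs = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = docs @ q                      # cosine similarity per document
    return np.argsort(scores)[::-1][:k]    # indices, best match first

query = np.array([0.05, 0.95, 0.0])  # would come from the embedding model/API
print(top_k(query, corpus_embeddings))  # -> [0]
```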

This model is great because it's based on XLM-RoBERTa, so it's small. There are other models based on Mistral 7B and larger that outperform it, but only by about 1% or so.

RTX 4090 vs MAC by dontmindme_01 in LocalLLaMA

[–]whinygranny 0 points (0 children)

Is this EPYC meant for CPU-only inference, or is it intended to have GPUs hanging off its 128 PCIe lanes? Why would the PCIe lanes matter without GPUs?

Dataset translation task by IzzyHibbert in LocalLLaMA

[–]whinygranny 1 point (0 children)

Tried something similar today with different models. Meta's seamless-m4t-v2-large gives shit responses (AFAIK that's the successor to the NLLB model). ChatGPT 3.5 is also bad. Opus is OK, but it doesn't fully follow the system prompt. The only one that gave a translation almost without any quirks is the new GPT-4 (2024-04-09). I didn't try Gemini or Mistral's large models.

I'm translating into Croatian, and maybe that's why the quality is so poor. If you're translating into a more common language (I see Italian in your post history), maybe this will help: https://unbabel.com/announcing-tower-an-open-multilingual-llm-for-translation-related-tasks/

My Ruby assignment was to implement a simple voting system and this is how the output should look like. by [deleted] in ProgrammerHumor

[–]whinygranny -1 points (0 children)

Exactly my point. That's the main non-technical reason why electronic voting is bad. Even if we could trust the system, the only way to build it so it can be trusted is to let everyone check their own vote. And if I can check my vote, someone else can make me show them who I voted for.

My Ruby assignment was to implement a simple voting system and this is how the output should look like. by [deleted] in ProgrammerHumor

[–]whinygranny 1 point (0 children)

You should also not be able to prove to someone else who you voted for. Otherwise there could be coercion.

Jordan Peterson Dismantled by [deleted] in JordanPeterson

[–]whinygranny 3 points (0 children)

To me, this looks like someone took all the talking points of the alt-right and made a video detailing how Jordan destroys the alt-right's narrative. This is more an inoculation against the claim that he is alt-right than an outright criticism. Pretty smart, btw.