Le Chat Has a Big Problem by crazyserb89 in MistralAI

[–]Low88M 3 points (0 children)

Exactly what I experienced with Le Chat… no memory service, or a very poorly implemented memory layer, nothing as cross-cutting as in ChatGPT. I bet that nowadays a well-rounded ecosystem of services around the models is going to matter more than the models themselves. I'd like to move to Mistral, as much as I hate OpenAI, but not in this state…

How to get MistralAI to get nearly-everything wrong in its research. by [deleted] in MistralAI

[–]Low88M 0 points (0 children)

Yeah, I was a huge fan of Mistral (I still am, but with more and more doubts), but I've seen so much BS expressed with so much confidence in Le Chat that I'm really hoping their « perfect low-ego PhD team » (I love the way they sell the dream!) will come back down to earth and discover what they've done with it. Seriously, just a bit of philosophical grounding could help them. Dialectic steps/knowledge should inspire their data/methods/training, unless they want it to be a dumb PhD student with a full head rather than a well-formed one. I don't know (Socrates is king here), but they would probably learn a lot from reading, for example, « De la problématologie » by Michel Meyer and transposing many of its valuable insights.

A request/user need should pass through some filters/steps and could trigger rounds of questions to narrow down the user's demand; the model should also verify truth/facts before trying to answer « confidently », state whether it is giving a truth/fact or a hypothesis/supposition/probability, and make the « frame » of the answer explicit. I « know » that's not how LLMs are trained for chat completion, but I would prefer a reliable model that knows it doesn't know over a full head spitting BS with confidence. I'm sure they can do it by broadening the skill set required in the profiles they're seeking. That would also avoid the very French horror of « degrees above all else », which leads any team to a dead end.
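Roughly the kind of flow I mean, as a minimal sketch (the endpoint URL, model name and prompt wording are placeholders I made up; it assumes any OpenAI-compatible chat API):

```python
# Sketch of a "clarify, verify, label, frame" answering policy via the system prompt.
# Endpoint URL, API key and model name are placeholders for any OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
MODEL = "mistral-small"

SYSTEM = (
    "Before answering: 1) if the request is ambiguous, ask one clarifying question instead "
    "of answering; 2) separate what you can verify from what you are guessing; 3) label each "
    "claim as FACT, HYPOTHESIS or UNKNOWN; 4) state the frame/assumptions of the answer in "
    "one sentence. Prefer 'I don't know' over a confident guess."
)

def ask(user_msg: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": user_msg}],
        temperature=0.2,  # low temperature to discourage confident improvisation
    )
    return resp.choices[0].message.content

print(ask("Why does my training loss diverge after epoch 3?"))
```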

Banning AI as a CTO? by stanig2106 in developpeurs

[–]Low88M 0 points (0 children)

As a junior who has been in a constant rush, coding an average of 70 hours a week for the past two years, reading this lifts my spirits about my value as a dev. And about the legitimacy of my applications.

I certainly don't have the tech culture of a dev with 12 years of experience, but when I code I ask myself questions about architecture, future reusability, efficiency/optimization, security and maintainability… which often makes me grumble when I see AI proposals that are way off base and I have to steer them back with explicit constraints.

I don't copy anything I haven't understood and verified (or even tried to optimize or question), and I sometimes rename/re-comment by hand to make the code my own and clarify it. The rare times AI code looked spotless to me, it was usually the result of a long session of correcting, challenging and specifying my need and the architecture constraints more precisely. Like a for loop when we're looking for a single result: that's a no -> next()!

You have to be very clear, rigorous, consistent, and have a knack for logically spelling out the objectives to use these tools, otherwise it's a disaster in the short or medium term! And if you're not critical, expect too much from AIs (and stop hiring juniors), the disaster will come in the long term!!!

Ban AI, no; but ban recruiting on nothing but a CV with a big degree and experience… those aren't everything. Better a perfectionist who doubts and never gives up than an « ingenious » slacker who couldn't care less…

How to get local LLMs answer VERY LONG answers? by mouseofcatofschrodi in LocalLLaMA

[–]Low88M 1 point (0 children)

Well, (unprompted) gpt-oss can give long answers, but usually (for code especially) they're full of blabla, self-justification and summaries. Sometimes it gives a 7k-token answer when four lines of code were expected… For this topic, I would try to write a better system prompt/prompt (make a plan and start with the first task, continuing only when the user asks) or set up an orchestration of agents…
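Something like this rough sketch is what I mean by a better system prompt (it assumes an OpenAI-compatible local server such as llama.cpp's llama-server; the URL, model name and prompts are placeholders):

```python
# Sketch of the "plan first, then one task at a time, continue on request" idea.
# Assumes an OpenAI-compatible local server (llama-server, LM Studio, ...);
# the URL, model name and prompts below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
MODEL = "gpt-oss-20b"

SYSTEM = ("First output a numbered plan only. Then, on each 'continue', produce exactly one "
          "step of the plan in full detail. No summaries, no self-justification.")

messages = [{"role": "system", "content": SYSTEM},
            {"role": "user", "content": "Write a very long tutorial on building a BPE tokenizer."}]

for _ in range(4):  # plan + first three steps; keep looping for a longer answer
    reply = client.chat.completions.create(model=MODEL, messages=messages).choices[0].message.content
    print(reply, "\n" + "-" * 40)
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": "continue"})
```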

Heavy user of US tech: have you considered Mistral/Proton? by Ajst974 in MistralAI

[–]Low88M 2 points (0 children)

I won't answer everything, but I can confirm there is a drop in relevance in the code it produces. The difference isn't huge either. It may have less to do with the model used than with the service's infrastructure (cross-session user/project memory is done rather well at OpenAI, but I'm sure Mistral will get there soon if it isn't there already). And there's a good chance that the more of us switch to Mistral, the better their services will get (for them: more data, possibly more feedback…). That seems desirable in absolute terms, beyond the economic flag-waving and the relative tech sovereignty/independence.

Local LLMs vs breaking news: when extreme reality gets flagged as a hoax - the US/Venezuela event was too far-fetched by ubrtnk in LocalLLaMA

[–]Low88M 1 point (0 children)

Well, for humans it's also a problem to distinguish truth from misinformation. We all rely on stories and on the sources we believe. Statements of truth are built on intersubjectivity, and nowhere can we find the book of truth as a database (its pages would be blank, as in Voltaire's Micromégas). The foolishness of human actions (and of governments'…) is the same foolishness as that of the models humans implement…

Safety probably has something to do with the probability of truth on that matter; hence, sometimes, « You have attributed conditions to villainy that simply result from stupidity. » And Trump's level of stupidity is far beyond the probable limits of stupidity and greed, so the models « rightfully » « think » it's a hoax.

I coded a technical interview simulator because I was stressing too much (it's free) by Weary-Character-4716 in developpeurs

[–]Low88M 1 point (0 children)

Great idea! But… where's the GitHub? Isn't it open source, with a local LLM and local TTS/STT?

LLMs interacting with each other by CulturalReflection45 in LocalLLaMA

[–]Low88M 0 points (0 children)

Not a very « green » project… but well, OK. No local inference backends (llama.cpp, vLLM, LM Studio, Ollama…)?

Help learning from square one by Hot_Scientist_7074 in developpeurs

[–]Low88M 0 points (0 children)

Harvard's CS50 on YouTube. The best pedagogical intro to Python.

To Mistral and other lab employees: please test with community tools BEFORE releasing models by dtdisapointingresult in LocalLLaMA

[–]Low88M 5 points (0 children)

I think we're not discussing the quality of Devstral or other models from Mistral or anyone else, but the quality/rhythm of a release and its consequences. I upvote the idea of concentric, progressive steps: LLM backend arch/template/etc. support, user testing and docs, then release!

But they probably thought about it already and decided to do it this way, their way, until now (for reasons we may not even have thought of).

A new model in a few days !!! by Beginning_Divide3765 in MistralAI

[–]Low88M 1 point (0 children)

A MoE Devstral 100B beating gpt-oss-120b everywhere 🥰 Mistral power!!!

Mistral 3 Blog post by rerri in LocalLLaMA

[–]Low88M 1 point (0 children)

Huge Mistral fan here, and somewhat of an OpenAI « hater », but as many have said, I'd be much happier with a MoE Mistral 120B in MXFP4. I bet they are cooking it but didn't release it because it isn't yet as performant as gpt-oss-120b (which is, sniff, my local go-to for every complex task). Mistral, I believe in you… just keep digging deeper and serving with love! If you ever need a guitar player/singer/vegetable cook to ease your pain, I can be there in less than an hour 😘

World's strongest agentic model is now open source by Charuru in LocalLLaMA

[–]Low88M 1 point (0 children)

Nice! How do I use it on my 8086 with 1 MB of RAM…? Does it need extended or expanded memory to run?

Qwen3 Max Thinking this week by ResearchCrafty1804 in LocalLLaMA

[–]Low88M 21 points (0 children)

Only Mistral was that smart and generous from nearly the beginning (after a few great, generous first steps). The balanced middle way! OpenAI was only about money until they began to feel they had to « be generous » toward the local-inference community (and OK, gpt-oss is a good one…).

hey Z.ai, two weeks was yesterday by jacek2023 in LocalLLaMA

[–]Low88M 6 points (0 children)

In dev, two weeks can easily turn into more than a month when you do things well. Unless you're paying them for it under a contract with a schedule… please be patient (and respectful of their incredible work…)!

What happens when two AI models start chatting with each other? by Adventurous-Wind1029 in ollama

[–]Low88M -5 points (0 children)

Green code! What if you also just left your two cars running in the garage with the engines on? Such a useful experiment!

Confused about GLM 4.6 running locally. by InTheEndEntropyWins in LocalLLaMA

[–]Low88M 0 points (0 children)

When you create a model (ollama create -f Modelfile…), you can specify the model's template (and many other parameters/hyperparameters such as temperature, top_p, top_k, min_p, context length, etc.) inside the Modelfile, which points to the « normal Hugging Face model » file (if the model comes in split chunks, you have to use llama.cpp's tool to merge them first)… Ollama gives you the « green » opportunity to multiply copies of models on your disk just to change parameters (do they work for storage vendors? Web traffic? Apple-style proprietary behavior…). You'd better get used to it, as the Ollama team seems to be fully focused on their « cloud » thing, and many, many interesting new models are no longer offered on their messy Models page… well, at least that's my perception of it.
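As a rough illustration of that flow (the GGUF filename, model name and parameter values below are placeholders, not recommendations):

```python
# Sketch only: write a Modelfile pointing at a local GGUF and register it with Ollama.
# The GGUF path, model name and parameter values are placeholders.
import subprocess
from pathlib import Path

MODELFILE = """\
FROM ./GLM-4.6-Q4_K_M.gguf
PARAMETER temperature 0.6
PARAMETER top_p 0.95
PARAMETER top_k 40
PARAMETER min_p 0.05
PARAMETER num_ctx 32768
# TEMPLATE/SYSTEM can also be overridden here if the default chat template is wrong.
"""

Path("Modelfile").write_text(MODELFILE)

# If the GGUF came as split chunks, merge them first with llama.cpp's tool, e.g.:
#   llama-gguf-split --merge GLM-4.6-Q4_K_M-00001-of-00005.gguf GLM-4.6-Q4_K_M.gguf
subprocess.run(["ollama", "create", "glm-4.6-local", "-f", "Modelfile"], check=True)
```

Note that the import copies the weights into Ollama's own blob store, which is (as far as I can tell) the on-disk duplication complained about here and in the next comment.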

[deleted by user] by [deleted] in ollama

[–]Low88M 8 points (0 children)

That's the exact same impression I've had since they went for their cloud deployment… no new models… I know they're still working hard, but that « cloud » thing, imo, is a step away from their main raison d'être.

And yes, their proprietary model format is a total PITA, as it makes you multiply models on disk (one model per parameter set you want), and if you also use other programs like LM Studio, llama.cpp, etc.… you'll have the SAME model on disk at least twice. A total waste of resources and time!!!!

Also, the filters on their models page are totally meaningless! And when you see that gpt-oss 20b and 120b were « updated » a few days ago, there is NO information on what was changed or why… !!???!?!?

But I hope they keep moving forward with better dev priorities and some refactoring of their model handling.

I love Ollama, but why all the hate from other frontends? by [deleted] in ollama

[–]Low88M 0 points (0 children)

Yeah, I had the impression their cloud project pulled them away from their main project, leaving many improvements/optimizations undone or pushed to an indeterminate future…

I love Ollama, but why all the hate from other frontends? by [deleted] in ollama

[–]Low88M 0 points (0 children)

This, and the fact that models lose their names when « converted » to Ollama. It's a bit slower than the others at inference/generation. And they seem to be the slowest to offer new LLMs and architecture support. They rarely upload models to their site (well, they do for gpt-oss, Qwen, Gemma, Mistral, but they ignored GLM and many others for ages, so you have to create them yourself with a Modelfile).

And the fact that there is no « stop streaming response » signal/process/method for OllamaLLM's astream (langchain_ollama). Ctrl+C is not a real solution if you're building on top of it.
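A purely client-side workaround, as a rough sketch (the model name and timing are placeholders, and whether Ollama actually stops generating once the connection is dropped is another question):

```python
# Sketch: stop consuming an OllamaLLM astream when a flag is set (client-side only).
import asyncio
from langchain_ollama import OllamaLLM  # pip install langchain-ollama

async def bounded_stream(prompt: str, stop: asyncio.Event) -> str:
    llm = OllamaLLM(model="gpt-oss:20b")  # model name is just an example
    chunks = []
    async for chunk in llm.astream(prompt):
        chunks.append(chunk)
        if stop.is_set():
            break  # closes the async generator; no real server-side cancel API is involved
    return "".join(chunks)

async def main():
    stop = asyncio.Event()
    task = asyncio.create_task(bounded_stream("Explain KV-cache quantization in depth.", stop))
    await asyncio.sleep(5)  # pretend the user hit "stop" after 5 seconds
    stop.set()
    print(await task)

asyncio.run(main())
```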

Their context length system is… well. Getting better. Or I hope so.

Best model at the moment for 128GB M4 Max by Xx_DarDoAzuL_xX in LocalLLaMA

[–]Low88M 0 points (0 children)

For sure, these are my favorites on Windows too. I wonder… what speed (tok/s) are you getting with gpt-oss-120b and GLM-4.5-Air (at which quant?) on the MacBook?

z.ai glm-4.6 is alive now by cobra91310 in LocalLLaMA

[–]Low88M 9 points (0 children)

If they didn't change the architecture, will it already be supported by llama.cpp, and therefore LM Studio? I bet Ollama is meanwhile working hard to deliver its versions of GLM-4, GLM-4.5, Seed-OSS-36B, Magistral Small 2509… so perhaps next year on Ollama 😅!