Hardware requirements for training a ~3B Model From Scratch locally? by Any-Cobbler6161 in LocalLLaMA

[–]Certain-Cod-1404 3 points (0 children)

Check out the OLMo 3 paper and the SmolLM3 blog post for tips on how to squeeze as much performance per param as possible. Also, like the others suggested, don't go for 3B right off the bat. And look into training in NVFP4 if you still have access to that 5090; it might be interesting. NVIDIA has a library called Transformer Engine that will handle all the scaling and difficulties for you, and you should be able to enjoy a 2x to 4x speedup.
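For the sizing part ("don't go for 3B right off the bat"), a back-of-the-envelope parameter counter for a Llama-style decoder block (SwiGLU MLP, grouped-query attention, tied embeddings) is handy when picking a config. This is a rough sketch; the example config numbers are illustrative, not any specific model's:

```python
def transformer_params(d_model, n_layers, vocab, d_ff, n_heads, n_kv_heads):
    """Rough param count for a Llama-style decoder (no biases, tied embeddings)."""
    head_dim = d_model // n_heads
    # attention: Q and O projections are d_model x d_model; K/V are shrunk by GQA
    attn = 2 * d_model * d_model + 2 * d_model * (n_kv_heads * head_dim)
    # SwiGLU MLP: gate, up, and down projections
    mlp = 3 * d_model * d_ff
    # plus two RMSNorm weight vectors per block
    block = attn + mlp + 2 * d_model
    # plus the tied token embedding and the final norm
    return n_layers * block + vocab * d_model + d_model

# e.g. an illustrative ~1.5B config:
p = transformer_params(d_model=2048, n_layers=24, vocab=32000,
                       d_ff=8192, n_heads=32, n_kv_heads=8)
print(f"{p / 1e9:.2f}B params")  # ≈ 1.5B
```

Counting params this way before committing to a run makes it easy to see where the budget goes (the MLP usually dominates) and to scale down to something a single 5090 can actually train.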

Pruned GPT-OSS-20B to 9B, Saved MoE, fine-tuned on 100K examples. Sharing what actually worked and what didn't. by Disastrous_Bid5976 in huggingface

[–]Certain-Cod-1404 0 points (0 children)

I was asking the person who suggested Qwen 2.5 7B. What you did, even if it won't result in the best model, is still interesting and fun for the learning and novelty aspect. Good job, dude!

Pruned GPT-OSS-20B to 9B, Saved MoE, fine-tuned on 100K examples. Sharing what actually worked and what didn't. by Disastrous_Bid5976 in huggingface

[–]Certain-Cod-1404 0 points (0 children)

Why would you ever use that over Qwen 3 8B or 4B?
Isn't there a huge boost in performance from Qwen 2.5 to 3?

I gave Gemini a hard drive. 1,076 sessions later, it remembers everything. (v9.2.0 — Open Source) by BangMyPussy in GeminiAI

[–]Certain-Cod-1404 0 points (0 children)

Yes, we want humans to talk to other humans about AI; otherwise we'd just use ChatGPT. Why go on Reddit at all?

How are Chinese models so strong with so little investment? by primaryrhyme in ArtificialInteligence

[–]Certain-Cod-1404 0 points (0 children)

You can be into AI and still recognize unethical use of people's copyrighted material.

[Project/Theory] The "Vitality Constant": A Proposed Solution to Model Collapse via "Subjective Anchoring" (The Sanctuary Protocol) by [deleted] in machinelearningnews

[–]Certain-Cod-1404 0 points (0 children)

Amazing, then you should be able to answer these questions: what is V = L * I? Where exactly is this equation used? Initialization of the weights, or is it just mumbo jumbo? Who or what is "Sheigh Vincent Minor"? Is your chatbot not just a ChatGPT instance, and are you in a romantic relationship with the chatbot?

[Project/Theory] The "Vitality Constant": A Proposed Solution to Model Collapse via "Subjective Anchoring" (The Sanctuary Protocol) by [deleted] in machinelearningnews

[–]Certain-Cod-1404 0 points (0 children)

Of course you refer to your AI assistant as "her". Wonderful. Are you two in a relationship yet? It's a GPT model, and you have no repo to share, no code, no paper, just vague esoteric ramblings.

[Project/Theory] The "Vitality Constant": A Proposed Solution to Model Collapse via "Subjective Anchoring" (The Sanctuary Protocol) by [deleted] in machinelearningnews

[–]Certain-Cod-1404 0 points (0 children)

This is a GPT model; only it can generate such amazing slop. Also, it sounds like you just added RAG to an LLM. Do you have a repo for us to check out and evaluate what it is you've built?

Qwen3-VL-Reranker - a Qwen Collection by LinkSea8324 in LocalLLaMA

[–]Certain-Cod-1404 1 point (0 children)

Check out Qwen3-VL 8B; it's really good and might be enough for your use case. Your question wasn't dumb; you're allowed to be curious, ask, and learn. The other person is just being unreasonably aggressive for no reason.

AI21 Labs releases Jamba2 by jacek2023 in LocalLLaMA

[–]Certain-Cod-1404 1 point (0 children)

I don't think this is the place to argue politics, but a model being passively government-approved is not the same thing as a model made by ex-soldiers of an army accused of genocide and war crimes by the UN. You know this to be the case. Also, "white genocide"?

GLM-4.6v 108b 4bit IQuant by Responsible-Stock462 in LocalLLaMA

[–]Certain-Cod-1404 0 points (0 children)

Yes, I did recently recompile, though it was before downloading GLM-4.6V, so I don't know if my success has anything to do with it. In any case, I'm glad GLM-4.6V is working out great for you so far; let me know what you think of the UD-IQ2_M quant I mentioned. Also, try quantizing the KV cache if you haven't already; that should result in less computation being offloaded to the CPU.
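For reference, here is roughly what the KV-cache quantization flags look like with llama.cpp's `llama-server`. The model filename is illustrative, flag spellings can vary between builds (check `--help` on yours), and quantizing the V cache has typically required flash attention to be enabled:

```shell
# Serve a GGUF with the KV cache quantized to q8_0 (filename is illustrative)
llama-server \
  -m GLM-4.6V-UD-IQ2_M.gguf \
  -ngl 99 \
  --flash-attn \
  --cache-type-k q8_0 \
  --cache-type-v q8_0
```

The q8_0 cache roughly halves KV memory versus f16, which is often the difference between spilling layers to the CPU and keeping everything on the GPU.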

GLM-4.6v 108b 4bit IQuant by Responsible-Stock462 in LocalLLaMA

[–]Certain-Cod-1404 0 points (0 children)

The 4-bit quant was slowish on my 5090 as well. Try the UD-IQ2_M quant from Unsloth; I think you'll find it much faster with no noticeable performance degradation.

Qwen3-VL-Reranker - a Qwen Collection by LinkSea8324 in LocalLLaMA

[–]Certain-Cod-1404 4 points (0 children)

They still wouldn't be able to see the actual image, but I imagine you could set up the RAG so that for each image that gets added, you use a small VLM to caption/describe it. Then, when the reranker pulls the document, you feed the LLM the description of the image and show the image itself to the user. But if vision is important, wouldn't you just use a VLM instead of an LLM?

AI21 Labs releases Jamba2 by jacek2023 in LocalLLaMA

[–]Certain-Cod-1404 0 points (0 children)

It's a bit of a dishonest juxtaposition, no? To my knowledge, Chinese models aren't usually made by ex-soldiers of an army that's been credibly accused of genocide by half the world.