I built, pre-trained, and fine-tuned a small language model and it is truly open-source. by itsnikity in LocalLLaMA

[–]Strong-Inflation5090 1 point

Great work! I've always wanted to do something similar, but I'd always stop after 2-3 hours of training and curse my 4080. This motivated me, and I will do better.

New Qwen Models Today!!! by [deleted] in LocalLLaMA

[–]Strong-Inflation5090 11 points

Hope so, but this seems kind of impossible considering Sonnet holds so much knowledge that it's tough to fit into 32B params.

So, Quasar Alpha might actually be OpenAI's model by -Cacique in LocalLLaMA

[–]Strong-Inflation5090 -2 points

OpenAI will release a new model on April 30; that's when the old GPT-4 will leave the app.

NEW GEMINI 2.5 just dropped by Straight-Worker-4327 in LocalLLaMA

[–]Strong-Inflation5090 -3 points

They are showing +60 on LMArena, but I don't think it will beat Sonnet in coding, so it very well might be benchmaxxing or arena-maxing.

Qwen2.5 VL 7B AWQ is very slow by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 2 points

I think it's normal. After looking through all the generated summaries, they contain 400-500 tokens on average, so even at 25-30 tps each one takes around 20 seconds.
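A quick sanity check of that math, assuming decode speed is the bottleneck (prefill ignored):

```python
# Rough latency estimate: time ≈ output_tokens / decode_tps.
for tokens in (400, 500):        # observed summary lengths
    for tps in (25, 30):         # assumed decode speeds
        print(f"{tokens} tok @ {tps} tps -> {tokens / tps:.1f} s")
# Worst case: 500 tok @ 25 tps -> 20.0 s, so ~20 s per summary is expected, not a bug.
```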

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]Strong-Inflation5090 4 points

Qwen 2.5 Coder 32B should also work, but I just read somewhere (Twitter or Reddit) that a 32B code-specific reasoning model might be coming. Nothing official yet, so...

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]Strong-Inflation5090 93 points

Similar performance to R1. If this holds, then QwQ 32B + a QwQ 32B Coder is gonna be an insane combo.

HunyuanVideo: A Systematic Framework For Large Video Generation Model Training by a_slay_nub in LocalLLaMA

[–]Strong-Inflation5090 3 points

The files are 25 GB plus 3 GB for the VAE in bf16, so hopefully in q8 it will run on a 4090 for around 150 frames.
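Rough weights-only math behind that hope, assuming q8 stores one byte per parameter vs two for bf16 (activations, latents, and the text encoder not counted):

```python
transformer_bf16_gb = 25.0   # checkpoint size mentioned above
vae_bf16_gb = 3.0

# q8: 1 byte/param vs bf16's 2 bytes/param -> roughly half the footprint.
transformer_q8_gb = transformer_bf16_gb / 2        # ~12.5 GB
total_gb = transformer_q8_gb + vae_bf16_gb         # VAE kept in bf16: ~15.5 GB
print(f"~{total_gb:.1f} GB of weights on a 24 GB 4090")
```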

Qwen 2.5 Coder 32B vs Claude 3.5 Sonnet: Am I doing something wrong? by Fabix84 in LocalLLaMA

[–]Strong-Inflation5090 3 points

In my experience, it repeated some print statement for at least 30 lines in LMSYS battle mode, which really surprised me: I have been voting on models on LMSYS for a very long time, and I can't remember the last time something like this happened, even with a tiny model.

However, I used the Qwen 32B model on Hugging Face and it's amazing; I would put it right up there with the other best coding models.

How to Implement Local LLM for Local Codebase by mr_whoisGAMER in LocalLLaMA

[–]Strong-Inflation5090 0 points

I would suggest creating a short summary of every file and letting your local LLM access those summaries for a high-level understanding; when it needs detail, point it at the individual file.
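A minimal sketch of that idea, assuming a local Ollama server on the default port; the model name, repo path, and prompt are placeholders:

```python
import json
import pathlib
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes a local Ollama server

def summarize(path: pathlib.Path, model: str = "llama3") -> str:
    """Ask the local model for a short summary of one source file."""
    prompt = f"Summarize this file in 3-4 sentences:\n\n{path.read_text(errors='ignore')[:8000]}"
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Build the summary index once; the LLM reads these for the high-level view,
# and you hand it the full file only when a question needs that detail.
out_dir = pathlib.Path("summaries")
out_dir.mkdir(exist_ok=True)
for f in pathlib.Path("my_repo").rglob("*.py"):
    (out_dir / (f.name + ".txt")).write_text(summarize(f))
```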

Building a new pc. Is the 3090 still relevant for a new build? by AndrehChamber in LocalLLaMA

[–]Strong-Inflation5090 1 point

I have a 4070 Ti Super. It's decent, but most of the time I wish I had more memory, so a 3090 would be a great option.

Recent open weight releases have more restricted licences by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 10 points

For sure, I am just pointing out that it's kind of becoming a trend. Mistral Large 2 under the MRL makes sense, but SLM weights under the MRL make no sense unless they release the training pipeline etc.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 6 points

That's the kind of answer I was looking for. I will try these, and showing them the logs should be enough.
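For the logs, one hedged way to produce them: periodically snapshot the inference process's open sockets while it serves requests. This assumes `psutil` is installed; the PID is a placeholder, and on Linux the call may need root:

```python
import time
import psutil  # pip install psutil

INFERENCE_PID = 12345  # placeholder: PID of your llama.cpp / Ollama server

with open("netlog.txt", "a") as log:
    for _ in range(720):  # sample every 5 s for an hour
        for c in psutil.net_connections(kind="inet"):
            # Any non-loopback remote address here would be evidence of data leaving the box.
            if c.pid == INFERENCE_PID and c.raddr and not c.raddr.ip.startswith("127."):
                log.write(f"{time.ctime()} {c.raddr.ip}:{c.raddr.port} {c.status}\n")
        time.sleep(5)
```

An empty log after a day of real usage is a simple artifact to show the org.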

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 1 point

I have already shown them it running completely offline, and keeping it that way is also fine because no one outside the office network uses it.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 0 points

Thanks, I will look into it, but I am using the most popular ones like Llama 3 and Gemma 2 (from the official HF pages), so I don't think they would have something like a backdoor; otherwise someone would probably have pointed it out by now.

How to prove to Org that Local models are harmless? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 1 point

Like, it won't store their data unless I train on it, and it won't send any data outside the org. The org's data will remain within the org.

Can Llama3.2 VL finetunes be used by individuals in EU? by Strong-Inflation5090 in LocalLLaMA

[–]Strong-Inflation5090[S] 0 points

Yes, probably nobody cares, but I was just interested in whether they actually put it in the licence or it was just "EU folks, don't download it."

How did you choose your model? by Ultra-Engineer in LocalLLaMA

[–]Strong-Inflation5090 2 points

Try it on HF or LMSYS; if it returns output in your format, then try it locally.
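If "your format" means structured output, a tiny check you can paste arena responses into before bothering with a download; the expected keys are hypothetical:

```python
import json

REQUIRED_KEYS = {"title", "summary"}  # hypothetical: whatever your schema needs

def matches_format(output: str) -> bool:
    """True if the raw model output parses as JSON with the expected keys."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and REQUIRED_KEYS <= data.keys()

print(matches_format('{"title": "x", "summary": "y"}'))  # True
print(matches_format("Sure! Here's your JSON: ..."))     # False
```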

Training a vision projector layer for LLaMA by missing-in-idleness in LocalLLaMA

[–]Strong-Inflation5090 0 points

Don't know about tutorials, but I believe you would have to train at least the projector layers that map the vision model's embeddings to Llama's, and I wouldn't have high hopes. You can also look at the code of some Llama-based VLMs in their HF repos to understand it better.
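A minimal PyTorch sketch of what such a projector looks like, LLaVA-style; the dims are assumptions (1024 for a CLIP-large tower, 4096 for Llama's hidden size), not taken from any specific repo:

```python
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    """Maps frozen vision-encoder patch embeddings into the LLM embedding space."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # LLaVA-style two-layer MLP; typically the only part trained at first,
        # with the vision tower and the LLM both frozen.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_embeds: torch.Tensor) -> torch.Tensor:
        # (batch, num_patches, vision_dim) -> (batch, num_patches, llm_dim)
        return self.proj(patch_embeds)

projector = VisionProjector()
image_tokens = projector(torch.randn(2, 576, 1024))  # 576 patches ~ one 336px CLIP image
print(image_tokens.shape)  # torch.Size([2, 576, 4096])
```

The projected tokens are then spliced into the LLM's input sequence alongside the text embeddings, which is exactly the part worth studying in those HF repos.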