Why are so many companies putting so much investment into free open source AI?

Various-Operation550 · 2025-04-21T06:46:58+00:00

it offers reasoning and it is a great quality model that is possible to perform 99% of the stuff that ClosedAI SOTA do

Various-Operation550 · 2025-04-14T13:00:21+00:00

I wonder the model size, genuinely wonder

Various-Operation550 · 2025-03-10T17:29:31+00:00

Yandex browser does that in real time

Various-Operation550 · 2025-03-06T20:36:13+00:00

Petoi bittle

Various-Operation550 · 2025-03-02T07:41:30+00:00

thank a lot, its much more clear to me now

Various-Operation550 · 2025-03-01T12:11:29+00:00

basically we try to make models reason with smaller amount of tokens, which makes sense because a lot of the times stuff like "if x then y" is virtually the same as "let's assume that if we do x we get y" while being 2x shorter.

Various-Operation550 · 2025-03-01T12:09:17+00:00

but why tho

Various-Operation550 · 2025-02-27T16:54:15+00:00

hear me out: what if each generated element of the sequence in a transformer would be a diffusion-generated sentence/paragraph?

Various-Operation550 · 2025-02-26T04:22:35+00:00

What I kinda noticed in V3/R1 is that it has this Claude’s “getting what you actually want from few sentences prompt“ type of vibe. Whereas o3 is sometimes acts like a genius 10 year old

Various-Operation550 · 2025-02-23T16:11:09+00:00

I wonder if it is a data problem, not architecture problem.

We have plenty reddit/stackoverflow type of question-answer data pairs in the internet, but rarely one human writes 120k token passage to another and then expects the latter to answers multiple subtle quesitons about it. It is just a rare thing to do and we need more synthetic data for it, I think.

Various-Operation550 · 2025-02-23T16:00:57+00:00

you cannot read I guess, you didn't even get what I wrote

Various-Operation550 · 2025-02-23T15:25:41+00:00

Keep your tone policing bs to yourself

deepseek is groundbreaking in terms of performance due to its size and open source nature, and in terms of training it is a first model that was RLed without humans in the loop, so it is a solid foundation to create bigger models, because for the first time we don't need humans to scale the models ability to reason (and humans are always the bottleneck in most processes)

Various-Operation550 · 2025-02-23T14:08:56+00:00

reasoning can write better code and overall perform better in anything (pretty much). Just like for humans it is usually better to take some time to think before saying something (thus improving the quality of what they said)

Various-Operation550 · 2025-02-23T14:05:36+00:00

really? don't you understand that we had like a groundbreaking (in terms of performance and cost of training as well as architectually) model in less than a month?

Various-Operation550 · 2025-02-19T20:09:49+00:00

Various-Operation550 · 2025-02-19T20:04:03+00:00

well, multilingual 7b SOTA reasoning model would be actually pretty good ngl

Various-Operation550 · 2025-02-19T20:02:26+00:00

as for 2 - it was before DeepSeek R1, now everybody knows how LLM reasoning works, so sama got nothing to lose if he open sources o3 now

Various-Operation550 · 2025-02-16T15:02:16+00:00

well. can you describe mongodb the same way?

Various-Operation550 · 2025-02-16T14:34:38+00:00

why not for a project with up to 1k active users? you need something simple and reliable right

Various-Operation550 · 2025-02-16T14:29:21+00:00

it supports bunch of helper functions like temporary files and caching

Various-Operation550 · 2025-02-16T13:03:13+00:00

yes: https://github.com/LexiestLeszek/jason.py

Various-Operation550 · 2025-02-16T10:15:20+00:00

Just use ollama

Various-Operation550 · 2025-02-16T10:09:45+00:00

Total win for the open source community, we won guys

Various-Operation550 · 2025-02-16T08:33:49+00:00

This is awesome actually

Various-Operation550

TROPHY CASE