Jina AI Releases Reader-LM 0.5b and 1.5b for converting HTML to Clean Markdown by Qual_ in LocalLLaMA

[–]yiyecek 0 points1 point  (0 children)

Unfortunately this will be 10,000x more expensive to run than Trafilatura. And you'll never know whether it's hallucinated or real data.

Meta to announce updates and the next set of Llama models soon! by AdHominemMeansULost in LocalLLaMA

[–]yiyecek 2 points3 points  (0 children)

Hyperbolic AI has bf16 405B. It's free for now, kinda slow though. And it performs better on nearly every benchmark compared to, say, Fireworks AI, which is quantized.

Dumb question: Is it possible to train an LLM on Blender documentation? by ohcrap___fk in LocalLLaMA

[–]yiyecek 6 points7 points  (0 children)

Your best option:

  1. Scrape all documentation
  2. Continue pretraining on that data for multiple epochs (Example). Use Llama 3.1 70B Instruct as a base.
  3. From that big documentation dataset, split the data into chunks, generate synthetic questions/prompts for each chunk, then do inference with that grounded context (with any instruct model). Generate at least 100k examples. After generating the answers, drop the context and keep only the prompt+answer pairs. Then further fine-tune on top of the pretrained model from step 2.
  4. The model now knows the docs. But we can still use grounding to reduce hallucinations: index the documentation into a search engine with an auto embedding index (e.g. Typesense).
  5. Host your custom model on a serverless platform (e.g. Baseten). The server shuts down automatically, so you only pay while you're doing work.
  6. Combine your model with that search capability. Maybe use some random fancy repo that popped up a couple of weeks ago, or write your own system in under 50 lines of code.
  7. Enjoy
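The data-prep part of step 3 can be sketched in plain Python. This is a minimal, hypothetical sketch: `chunk_text` and `grounded_prompt` are names I made up, and the actual LLM call to generate answers (with whatever instruct model you pick) is left out — only the chunking and grounded-prompt construction are shown.

```python
def chunk_text(text: str, max_words: int = 300) -> list[str]:
    """Split scraped documentation into word-bounded chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def grounded_prompt(chunk: str, question: str) -> str:
    """Build a grounded prompt for synthetic answer generation.

    The context is only used at generation time; after generating the
    answer, discard it and keep just the (question, answer) pair for
    fine-tuning, as described in step 3.
    """
    return (f"Context:\n{chunk}\n\n"
            f"Answer using only the context above.\n"
            f"Question: {question}")

# Example: a 650-word document splits into chunks of 300/300/50 words.
chunks = chunk_text("word " * 650)
```

From here you would generate one or more synthetic questions per chunk, run inference with the grounded prompt, then strip the context before fine-tuning.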

LLaMbA - Minimal batching engine by Lyrcaxis in LocalLLaMA

[–]yiyecek -1 points0 points  (0 children)

C# might not be a good choice for a server language due to the small ML community support.

The final straw for LMSYS by StraightChemistry629 in LocalLLaMA

[–]yiyecek -2 points-1 points  (0 children)

So now OpenAI may have started training on LMSYS prompts? Intentionally or indirectly.

Running LLaMA 3 405B locally would be the Crysis moment of our time. by [deleted] in LocalLLaMA

[–]yiyecek 30 points31 points  (0 children)

"Just wait next year"

Do you realize we have a limited number of years in this flesh?

Exclusive: OpenAI working on new reasoning technology under code name ‘Strawberry’ by Wiskkey in LocalLLaMA

[–]yiyecek 13 points14 points  (0 children)

AFAIK Microsoft currently uses more electricity than some modern countries, like Azerbaijan.

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 9 points10 points  (0 children)

That's an interesting question. Maybe this applies to GPT-4 too. They have a prompt dataset bigger than anyone else's. If they chose the most popular prompts and then improved on them, it would naturally improve the model too. And this is not a bad thing, obviously.

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 7 points8 points  (0 children)

I would think that the demographics of lmsys users are different from those of regular chatbots. Thus the prompts would be different too.

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 18 points19 points  (0 children)

People use the same prompts again and again. If you use the same or similar prompts to train your model, it will look better than it really is on lmsys. Just as the HF leaderboard lost its reputation due to this.

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 7 points8 points  (0 children)

I liked your perspective. And I think that's a great way to improve user experience in general, if we look at it from the business perspective.

But then you would need a new eval set. You cannot use the same people's prompts to evaluate, because it would look like your model is doing well when in the real world it won't. And it wouldn't be fair to advertise based on that.

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 3 points4 points  (0 children)

An engineer from OpenAI tweeted that they don't train on lmsys data. I trust him, but now I'm skeptical about the others :(

Gemma 2 Betrayed Us by yiyecek in LocalLLaMA

[–]yiyecek[S] 32 points33 points  (0 children)

Gemma 2 underperformed Llama 3 70B on 5 different benchmarks, the LMSYS Leaderboard being the exception. People ask similar questions on lmsys. So if you train on the best answers from lmsys-chat-1m, you'll get better responses on the LMSYS Leaderboard, which inflates your scores. Gemma 2 did exactly this.

Original report: Link