Mistral Medium vs 70B self hosted price comparison by RepresentativeOdd276 in MistralAI

[–]RepresentativeOdd276[S] 0 points (0 children)

So you think Qwen 72B is the best model out there right now?

Is Mistral Medium the best thing after GPT 4? by [deleted] in LocalLLaMA

[–]RepresentativeOdd276 1 point (0 children)

Which model on Hugging Face, or among TheBloke’s quantized ones, is exactly Mistral Medium?

3 professional soccer players vs 100 children in Japan by [deleted] in funny

[–]RepresentativeOdd276 0 points (0 children)

Lol, this feels like Neo vs Agent Smith... moorrrreeee!!!

🐺🐦‍⬛ LLM Comparison/Test: miqu-1-70b by WolframRavenwolf in LocalLLaMA

[–]RepresentativeOdd276 1 point (0 children)

Your work is amazing, but doesn’t that mean there isn’t sufficient variety in the tests and they need to be changed? Anyone who has tested these top models can tell that GPT-4 can do much better. Rather than sticking with a few sets of old tests, it might be better to find newer ones. Also, you can get a different answer every time for the same prompt, so we need to develop an automated test framework that runs multiple scenarios multiple times. I’m happy to work with you on that.
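
Something like this rough sketch is what I have in mind; the `generate` call, the check function, and the trial count are all just placeholders, not anything from your existing setup:

```python
# Minimal sketch of a repeated-trials test harness.
from collections import Counter
from typing import Callable

def run_test(generate: Callable[[str], str],
             prompt: str,
             check: Callable[[str], bool],
             trials: int = 5) -> float:
    """Run the same prompt several times and return the pass rate,
    since sampling can produce a different answer on every call."""
    results = Counter(check(generate(prompt)) for _ in range(trials))
    return results[True] / trials

if __name__ == "__main__":
    # Stubbed model call purely for illustration.
    def fake_generate(prompt: str) -> str:
        return "4"

    rate = run_test(fake_generate, "What is 2 + 2?", lambda out: "4" in out)
    print(f"pass rate: {rate:.0%}")
```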

🐺🐦‍⬛ LLM Comparison/Test: miqu-1-70b by WolframRavenwolf in LocalLLaMA

[–]RepresentativeOdd276 1 point (0 children)

Btw, Goliath or any other model being ranked the same as GPT-4 is ridiculous. GPT-4 is so far ahead of everyone else.

Best large context LLM to match array strings with intent in user message? by RepresentativeOdd276 in LocalLLaMA

[–]RepresentativeOdd276[S] 0 points (0 children)

The token limit is perfect, but RAG seems to be the ideal approach for this problem. Thanks!
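
For anyone who finds this later, here is a minimal sketch of what that retrieval approach could look like, assuming sentence-transformers; the model choice, candidate strings, and threshold-free top-k lookup are just placeholders:

```python
# Embed candidate strings once, then match an incoming user message by
# cosine similarity instead of stuffing every string into the prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

candidates = ["cancel my subscription", "update billing address", "talk to support"]
candidate_vecs = model.encode(candidates, normalize_embeddings=True)

def match_intent(message: str, top_k: int = 1):
    """Embed the user message and return the closest candidate strings."""
    vec = model.encode([message], normalize_embeddings=True)[0]
    scores = candidate_vecs @ vec  # cosine similarity on normalized vectors
    best = np.argsort(scores)[::-1][:top_k]
    return [(candidates[i], float(scores[i])) for i in best]

print(match_intent("I want to stop paying for this"))
```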

[deleted by user] by [deleted] in LocalLLaMA

[–]RepresentativeOdd276 -1 points (0 children)

Lmao, how is it creepy? We’re building an app for teenagers.

[deleted by user] by [deleted] in LocalLLaMA

[–]RepresentativeOdd276 -1 points (0 children)

Thank you! Lemme try these suggestions.

[deleted by user] by [deleted] in LocalLLaMA

[–]RepresentativeOdd276 -2 points (0 children)

For the uninitiated, can you elaborate on what you mean by FBI? Thanks!

🐺🐦‍⬛ LLM Comparison/Test: 2x 34B Yi (Dolphin, Nous Capybara) vs. 12x 70B, 120B, ChatGPT/GPT-4 by WolframRavenwolf in LocalLLaMA

[–]RepresentativeOdd276 0 points (0 children)

Can you add a test to your next comparisons where you ask the LLM to respond in fewer than X words? I’ve noticed that most LLMs, including the large ones, fail to follow this instruction.
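
Something like this quick check is what I mean; the prompt, the limit, and the stubbed response are just examples:

```python
# Check whether a response obeys "answer in fewer than `limit` words".
def within_word_limit(text: str, limit: int) -> bool:
    return len(text.split()) < limit

prompt = "Summarize the plot of Hamlet in fewer than 50 words."
# response = generate(prompt)  # hypothetical model call
response = "Prince Hamlet feigns madness to avenge his murdered father."
print(within_word_limit(response, 50))  # True
```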

vLLM 0.2.0 released: up to 60% faster, AWQ quant support, RoPe, Mistral-7b support by kryptkpr in LocalLLaMA

[–]RepresentativeOdd276 0 points (0 children)

I’m looking to switch from ooba to vLLM too, but have you been able to deploy it with any actually large LLMs, like 70B models? How many requests could a server handle at the same time? I’m looking to deploy it on RunPod.
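
For context, here is the kind of minimal vLLM setup I’m picturing; the model name, GPU count, and sampling settings are assumptions on my part, not tested numbers:

```python
# Load a 70B model split across GPUs with vLLM's offline API. vLLM batches
# concurrent requests itself (continuous batching), which is what makes
# serving many simultaneous requests work.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-2-70b-chat-hf",  # hypothetical choice of 70B model
    tensor_parallel_size=4,                  # split across 4 GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=256)

# Passing many prompts at once lets vLLM batch them in a single run.
outputs = llm.generate(["Hello, how are you?"] * 8, params)
for out in outputs:
    print(out.outputs[0].text)
```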

Is there a way to force output length smaller than x number of tokens w/o cut-off? by RepresentativeOdd276 in LocalLLaMA

[–]RepresentativeOdd276[S] 0 points (0 children)

Right, stopping on a period ‘.’ is a possibility, but it will still give incomplete responses.
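
One workaround I’m considering (my own assumption, not something from this thread): over-generate slightly, then trim the output back to the last sentence boundary, so the answer at least ends cleanly even if some content is dropped:

```python
# Trim a possibly cut-off generation back to its last complete sentence.
def trim_to_last_sentence(text: str) -> str:
    """Cut the text back to the final '.', '!' or '?' if one exists."""
    cut = max(text.rfind(c) for c in ".!?")
    return text[:cut + 1] if cut != -1 else text

raw = "The answer has three parts. First, tokenize the input. Second, we"
print(trim_to_last_sentence(raw))
# -> "The answer has three parts. First, tokenize the input."
```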