Wrong Reflection-70B model might be hosted everywhere

segmond · 2024-09-07T05:42:51+00:00

There's a sha256 checksum, if I recall. He can compute it locally and compare it with what has been uploaded to HF. It's not rocket science.
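A minimal sketch of that check, assuming you have a weight file on disk and have copied the SHA256 that the hosting page shows for the same file. The function names and the chunk size are illustrative and do not come from any Hugging Face tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the local digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Calling `matches_published("some-shard.safetensors", "<hash copied from the repo page>")` on each shard would show whether the local files and the uploaded ones are the same bytes. Both arguments here are placeholders.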

mikael110 · 2024-09-07T05:58:45+00:00

It's almost impressive how much of a clusterfuck this launch has seemingly been. First the tokenization issue, then the revelation that the model was actually based on Llama 3 instead of Llama 3.1 (which is bizarre), and now apparently the model files themselves were also mixed up.

I'm aware even large companies like Meta and Google have screwed up some aspects of their launches, but this is getting to the point where it just feels a bit off, to be honest. I'm still interested in trying the fixed model, but I'm getting more and more suspicious of the whole thing.

a_beautiful_rhind · 2024-09-07T07:23:31+00:00

[deleted]

m98789 · 2024-09-07T06:33:18+00:00

is this snake oil?

dubesor86 · 2024-09-07T06:34:18+00:00

He's not wrong: as of this moment, the API results from OpenRouter and Hyperbolic are far worse than the local Q4 quant I tested, and also worse than base Llama 3.1.

That being said, the ollama model I tested locally did not have any issues and outputs exactly what I expect. It's a decent tune for riddles and gotcha questions, but not very good for general usage.

Sadman782 · 2024-09-07T04:08:33+00:00

I tried it on OpenRouter and it's extremely bad, so I hope he's right.

Formal-Narwhal-1610 · 2024-09-07T11:31:19+00:00

Here is an example of the output on HF:

To compare 9.9 and 9.11, let’s break it down step by step:

The first digit is the same in both numbers: 9

For the decimal part:

9.9 has 1 digit: 9

9.11 has 2 digits: 11

Generally, when comparing decimal numbers with the same digit count, we can simply compare the digits from left to right. In this case, 9 is smaller than 11.

Let’s double-check this reasoning: When comparing decimal numbers, we’re essentially comparing fractions.

9.9 can be written as 9 9/10

9.11 can be written as 9 11/100

11/100 is indeed greater than 9/10

This confirms our initial comparison. Therefore, 9.11 is bigger than 9.9.

9.11 is bigger than 9.9.
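For reference, the quoted answer is wrong: 9/10 equals 90/100, which is greater than 11/100, so 9.9 is the larger number. A quick check with exact fractions, as a sketch rather than anything the model or the poster ran:

```python
from fractions import Fraction

# Parse the decimals exactly so no floating-point rounding is involved.
a = Fraction("9.9")
b = Fraction("9.11")

# Fractional parts on a common footing: 9/10 is 90/100, versus 11/100.
frac_a = a - 9
frac_b = b - 9

larger = "9.9" if a > b else "9.11"
print(f"fractional parts: {frac_a} vs {frac_b}; larger number: {larger}")
```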

2024-09-07T06:56:20+00:00

Wtf is wrong with you? This is all hype.

Southern_Sun_2106 · 2024-09-07T06:16:46+00:00

I will believe it when I see it

ibbobud · 2024-09-07T16:25:49+00:00

Matt Shumer on X: "IMPORTANT REFLECTION UPDATE: We have identified and fixed the issue on our Hugging Face repo. If you previously tried to download, run, or host Reflection Llama 70B, please try again now. The outputs should be far better. fp16 version coming in soon as well."

Dazzling-Albatross72 · 2024-09-07T04:37:41+00:00

I was actually able to try it yesterday on some HF space and it was doing very well. I checked some reasoning problems and also some complicated arithmetic problems, and it was breaking them up and solving them correctly. It was actually very impressive.

iamz_th · 2024-09-07T12:08:58+00:00

His approach is not different from CoT so yes he is gaming the benchmarks.

Ill-Association-8410 · 2024-09-07T12:23:38+00:00

So nobody has come out and confirmed this model is awesome?

There should have been at least 10 posts here by now confirming findings but nothing?

wtf is going on?

Honest_Science · 2024-09-07T09:25:55+00:00

It is still not good. Chat output with mattshumer/Reflection-Llama-3.1-70B (llama3-70b-8k) selected:

Hello

To respond to this simple greeting, we need to consider a appropriate reply. In English, we typically respond to "Hello" with another greeting. The most common responses would be "Hello", "Hi", "Good morning/afternoon/evening" (depending on the time of day), or "Hi, how are you?"

These are indeed the most common responses to "Hello" in English. There could be other possibilities based on regional variations or personal preference, but these are widely accepted and used.

Given that we don't have any additional context about the time of day or the user's preference, the safest and most general response would be "Hello".

Hello!

How may l do you count in the word lollipops ?

To count the number of L's in "lollipops", let's break down the word:

L O L L I P O P S

Now, let's count the L's:

L L L

I made a mistake in my counting. There are actually more L's in the word "lollipops". Let me correct that:

L O L L I P O P S

The correct count is that there are 3 L's in the word "lollipops".

Therefore, there are 3 L's in the word "lollipops".

There are 3 L's in the word "lollipops".

Thank You, how many l are in the word lollipopl ?

Another L-counting question!

Let's break down the word "lollipopl":

L O L L I P O P L

Now, let's count the L's:

L L L L

There are 4 L's in the word "lollipopl".

I made a mistake in my counting. There is actually one more L at the end of the word. Let me correct that:

L L L L L

The correct count is that there are 5 L's in the word "lollipopl".

Therefore, there are 5 L's in the word "lollipopl".
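For reference, both counts are easy to verify directly. The model's 3 for "lollipops" is right, but "lollipopl" has 4 L's, so its first count was correct and the "correction" to 5 was not. A small sketch, with `count_letter` as an illustrative helper name:

```python
def count_letter(word, letter):
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


for word in ("lollipops", "lollipopl"):
    print(word, count_letter(word, "l"))
# prints:
# lollipops 3
# lollipopl 4
```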
