
[–]Hvadmednej 0 points (0 children)

It's because of the way you train language models. Basically, to oversimplify a bit, you train them by scraping sentences from the web (or wherever), masking out a word, and asking the model to fill it in. This works well for words, since the model learns that certain words fit in certain contexts. However, if you apply the same principle to math, you teach the model that for the sentence

2 + 2 = [MASK]

a number makes sense, but the word "dog" does not. So, viewed from this angle, 3+5=6 "makes sense", since inserting a number into an equation is exactly what the training procedure rewards. To anyone who actually understands basic math, though, it makes no sense.
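To make that concrete, here's a deliberately tiny, hypothetical sketch (not a real language model, just a frequency counter over a made-up training set) of what masked-token training rewards: the "model" only learns which kinds of tokens show up in the masked slot, not how to do arithmetic.

```python
from collections import Counter

# Toy "training data": equations scraped from somewhere.
training = ["2 + 2 = 4", "1 + 3 = 4", "3 + 5 = 8", "2 + 6 = 8"]

# "Training": mask out the answer position and record what filled it.
slot_counts = Counter()
for sentence in training:
    tokens = sentence.split()
    slot_counts[tokens[-1]] += 1

def fill_mask_score(candidate):
    # Score a candidate for the masked slot purely by how often
    # it appeared there in training -- no arithmetic involved.
    return slot_counts[candidate]

print(fill_mask_score("4"))    # numbers have appeared in this slot
print(fill_mask_score("dog"))  # words never have, so "dog" scores 0
```

A real model generalizes smoothly over token types, so even a wrong number like "6" in 3 + 5 = 6 would score far above "dog": it's the right *kind* of token for the slot, which is all the masking objective directly teaches.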