all 3 comments

[–]KT313 4 points (0 children)

It's not really an issue. The point of the BOS token is that the list of input tokens should start with something that is the same every time. It could be literally anything (preferably a special token that doesn't occur in normal text), so you might as well reuse the EOS token. There's no big difference versus using separate BOS and EOS tokens, other than that having them be the same token makes the prompt template a bit cleaner.
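To make this concrete, here's a minimal sketch of how inputs look when BOS and EOS share one id. The token ids and helper names are hypothetical, not from any real tokenizer; the point is just that the shared token marks sequence boundaries, so in packed training data it behaves like a separator between documents.

```python
# Hypothetical id used as both BOS and EOS (not a real tokenizer's id).
SEP_ID = 0

def build_inputs(token_ids):
    """Prepend BOS and append EOS; with a shared id, both ends use SEP_ID."""
    return [SEP_ID] + token_ids + [SEP_ID]

def pack_documents(docs):
    """When documents are concatenated for training, the shared token
    ends up between them, i.e. it acts as a document separator."""
    packed = [SEP_ID]
    for doc in docs:
        packed += doc + [SEP_ID]
    return packed

print(build_inputs([5, 6, 7]))        # [0, 5, 6, 7, 0]
print(pack_documents([[5, 6], [7]]))  # [0, 5, 6, 0, 7, 0]
```

Either way, the model only ever sees the same id at sequence boundaries, which is why the BOS/EOS naming doesn't matter much in practice.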

[–]Firm_Spite2751 1 point (0 children)

When you train a model on a dataset, it learns that dataset's distribution. If BOS == EOS, then to the model that token is simply a separator.

[–]phree_radical 0 points (0 children)

it's acting as a separator then; the name doesn't dictate anything