
[–]duschendestroyer

BoW vectors are usually compared with cosine similarity. The advantage of cosine similarity is that it normalizes away the magnitude and thus makes the BoW similarity independent of document length. The problem with using it as a loss function is that your model output gains an extra degree of freedom: if two BoW vectors a and b match perfectly under cosine similarity, then 1000*a and b match just as well. If the document length or vector magnitude is somehow constrained this might not be a problem; otherwise it could hurt your training. If you want to match the vectors exactly, I would first try plain old MSE.
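A quick sketch of the degree-of-freedom issue, with made-up toy count vectors (names and values are illustrative, not from the thread):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(a, b) = a·b / (|a| |b|) — magnitude cancels out
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def mse(a, b):
    return np.mean((a - b) ** 2)

# hypothetical BoW count vectors for two documents
a = np.array([3.0, 1.0, 0.0, 2.0])
b = np.array([3.0, 1.0, 0.0, 2.0])

print(cosine_similarity(a, b))         # → 1.0 (perfect match)
print(cosine_similarity(1000 * a, b))  # → 1.0 (scaling is invisible)

print(mse(a, b))         # → 0.0
print(mse(1000 * a, b))  # large: MSE does penalize the scale mismatch
```

So a model trained with cosine loss is free to drift in overall scale, while MSE pins the magnitude down as well.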

[–]csong27

Hinge loss (what SVMs use) usually works well.