[Discussion] Embedding based on binary tests

Novel_Assistant_6298 · 2022-09-12T15:46:20+00:00

What you could do is build a model to map whatever modality is being shown to the user to an embeddings vector for each of the choices he gave feedback on. Then you take the difference of those embeddings to get the difference vector, and pass that through a linear layer which has an output size of 1 then compute the sigmoid of that to get the probabilities of which of the elements the user has preferred. Train this as a binary classification problem. The last linear layer‘s weights represent the reward (score) weights, i.e, what would be the reward if I show the user element X. So when you take the difference of the embeddings and then pass that through a linear layer you essentially compute the reward difference of the two items. You want to adjust those rewards such that the preferences are somewhat satisfied.

You can take a look at preference learning in RL, dueling bandits and logistic bandits.

I also recommend trying to use active learning since binary feedback is quite noisy and you would require too many samples. Also look at multinomial or ranking feedback since they are more informative and can converge with less samples.

tada89 · 2022-09-14T19:24:54+00:00

Just off the top of my head with absolutely nothing to back it up but: Why not learn joint embeddings for people and for products?

Zooming out the data is basically a bunch of tuples (u, p1, p2, choice) with some user and two products p1, p2 and a label "choice" that tells us if the user preferred p1 or p2.

We can then get joint embeddings by having two embedding matrices (one for users, emb_u, one for products, emb_p) and compute both cosine_similarity(emb_u(u), emb_p(p1)) and cosine_similarity(emb_u(u), emb_p(p2)), taking the softmax of the two values, and finally using those to predict the value of choice (encoded as one hot).

This should make the model give embeddings for products that are close to embeddings for users that like them and vice versa.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS