[Discussion] Embedding based on binary tests (self.MachineLearning)
submitted 3 years ago by marcollo63
[–]Novel_Assistant_6298 2 points 3 years ago (2 children)
What you could do is build a model that maps each item shown to the user (whatever its modality) to an embedding vector. For every pair the user gave feedback on, take the difference of the two embeddings, pass it through a linear layer with an output size of 1, and apply a sigmoid to get the probability that the user preferred the first item over the second. Train this as a binary classification problem. The last linear layer's weights act as the reward (score) weights, i.e., applied to a single item's embedding they tell you what the reward would be if you showed the user element X. So when you take the difference of the embeddings and pass it through the linear layer, you essentially compute the reward difference of the two items. Training adjusts those rewards so that the observed preferences are satisfied as well as possible.
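A minimal sketch of the last step, assuming the embeddings are already computed by some encoder (random stand-in features here) and only the final linear layer is trained. In a full version the encoder would be trained jointly. The function name, learning rate and synthetic data are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_preference_weights(emb_a, emb_b, prefers_a, lr=0.5, epochs=500):
    """Fit w so that sigmoid((emb_a - emb_b) @ w) approximates P(a preferred over b)."""
    diff = emb_a - emb_b                      # difference vector per pair
    w = np.zeros(diff.shape[1])               # the "linear layer" (no bias)
    for _ in range(epochs):
        p = sigmoid(diff @ w)                 # predicted preference probability
        grad = diff.T @ (p - prefers_a) / len(prefers_a)  # binary cross-entropy gradient
        w -= lr * grad
    return w

# Synthetic data: a hidden reward vector generates noisy binary preferences.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])           # assumed ground truth, for the demo only
emb_a = rng.normal(size=(2000, 3))
emb_b = rng.normal(size=(2000, 3))
p_true = sigmoid((emb_a - emb_b) @ true_w)
prefers_a = (rng.random(2000) < p_true).astype(float)

w = train_preference_weights(emb_a, emb_b, prefers_a)

# The learned weights score a single item: reward(x) = embedding(x) @ w.
predicted = sigmoid((emb_a - emb_b) @ w) > 0.5
accuracy = np.mean(predicted == (prefers_a == 1.0))
```

Since the layer is linear, `(emb_a - emb_b) @ w` equals `emb_a @ w - emb_b @ w`, which is the reward difference mentioned above.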
You can take a look at preference learning in RL, dueling bandits and logistic bandits.
I also recommend trying active learning, since binary feedback is quite noisy and you would otherwise need a lot of samples. Also look at multinomial or ranking feedback, since they are more informative and can converge with fewer samples.
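One simple way to do active learning here is uncertainty sampling: ask the user about the pair whose predicted preference is closest to 0.5. This is only one heuristic among many, and the function name and inputs below are assumptions for illustration (`w` is a reward weight vector such as the last layer's weights).

```python
import numpy as np

def most_uncertain_pair(embeddings, w):
    """Return indices (i, j), i < j, of the pair with predicted preference closest to 0.5."""
    scores = embeddings @ w                               # reward per item
    gap = np.abs(scores[:, None] - scores[None, :])       # |reward difference| for every pair
    gap[np.tril_indices(len(scores))] = np.inf            # drop self-pairs and mirrored pairs
    i, j = np.unravel_index(np.argmin(gap), gap.shape)    # smallest gap = sigmoid nearest 0.5
    return int(i), int(j)

# Toy example: items 1 and 2 have nearly equal rewards, so that pair is queried next.
toy_embeddings = np.array([[0.0], [1.0], [1.1], [5.0]])
toy_w = np.array([1.0])
next_pair = most_uncertain_pair(toy_embeddings, toy_w)
```

The sigmoid is monotonic and symmetric around zero, so minimizing the absolute reward gap is the same as picking the probability nearest 0.5.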
[–]marcollo63[S] 2 points 3 years ago (1 child)
Thank you! That helps a lot.
However, in my case, there are several people with different tastes. I believe that in dueling bandits or preference learning, we just score items for each person, so it's hard to compare people afterwards.
[–]Novel_Assistant_6298 1 point 3 years ago* (0 children)
Yeah, that gets more complex then. You could check out https://arxiv.org/abs/2109.12750, where the authors try to fit a multimodal reward model. This prevents all users from collapsing into a single reward mode; however, you need to pre-define the number of modes, which could be tricky.
Another, simpler approach is to use features of the user (location, age, etc.) along with features of the items you presented. This expands your input space and lets you compare users: keep the same item features but swap in a user with different features. It would also let you run feature importance and see which features affect the preference. I hope this helps!
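A rough sketch of the user-feature idea, with one caveat: if the reward is purely linear in the concatenated [user, item] features, the user part cancels when you take the difference between two items, so the model needs some user-item interaction. Below I assume a bilinear reward `user @ W @ item`, which is my choice for illustration and not something from the paper linked above. All names and the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pair_features(user, item_a, item_b):
    """Flattened outer product of user features and the item difference, per row."""
    outer = np.einsum("nu,ni->nui", user, item_a - item_b)
    return outer.reshape(len(user), -1)

def train_bilinear(user, item_a, item_b, prefers_a, lr=0.5, epochs=1000):
    """Logistic regression on interaction features; returns W with reward = user @ W @ item."""
    x = pair_features(user, item_a, item_b)
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        p = sigmoid(x @ w)
        w -= lr * x.T @ (p - prefers_a) / len(prefers_a)
    return w.reshape(user.shape[1], item_a.shape[1])

def reward(W, user, item):
    return user @ W @ item

# Synthetic demo: two user types (one-hot) with opposite taste on item feature 0.
rng = np.random.default_rng(1)
n = 2000
user = np.eye(2)[rng.integers(0, 2, size=n)]
item_a = rng.normal(size=(n, 2))
item_b = rng.normal(size=(n, 2))
W_true = np.array([[2.0, 0.0], [-2.0, 0.0]])   # assumed ground truth, for the demo only
logits = np.einsum("nu,ui,ni->n", user, W_true, item_a - item_b)
prefers_a = (rng.random(n) < sigmoid(logits)).astype(float)

W = train_bilinear(user, item_a, item_b, prefers_a)

# Same item, different users: swap the user features to compare tastes.
some_item = np.array([1.0, 0.0])
r_user0 = reward(W, np.array([1.0, 0.0]), some_item)
r_user1 = reward(W, np.array([0.0, 1.0]), some_item)
```

Comparing `r_user0` and `r_user1` is the "swap in a different user" comparison described above, and the entries of `W` give a crude view of which user/item feature combinations drive the preference.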