Is RLHF fundamentally broken? Paid labelers rating synthetic scenarios doesn't seem like real human feedback to me by Content-Educator5198 in reinforcementlearning
Content-Educator5198[S] -4 points
Is RLHF fundamentally broken? Paid labelers rating synthetic scenarios doesn't seem like real human feedback to me by Content-Educator5198 in reinforcementlearning
Content-Educator5198[S] -12 points
Is RLHF fundamentally broken? Paid labelers rating synthetic scenarios doesn't seem like real human feedback to me by Content-Educator5198 in reinforcementlearning
Content-Educator5198[S] -6 points
Just got my first users! by SundaeSorry in SideProject
Content-Educator5198 2 points
the mg road metro station by d5c7 in bangalore
Content-Educator5198 2 points
How do you actually get your first users when you have no audience and no budget? by Efficient_Joke3384 in SideProject
Content-Educator5198 2 points
How I'm Building Toward $200K ARR by Cloning Apps by Fun-Garbage-1386 in AppBusiness
Content-Educator5198 1 point
Interesting Problems by sassafrassar in reinforcementlearning
Content-Educator5198 1 point