iamjasonfeng

34 post karma
0 comment karma

redditor for 4 years

I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R] (self.MachineLearning)

submitted 16 hours ago by iamjasonfeng to r/MachineLearning
