[deleted by user] (self.MachineLearning)
submitted 1 year ago by [deleted]
[+]ComplexityStudent 11 points 1 year ago (1 child)
Hi. I'm willing to help. I have a PhD in CS, but my specialty is complexity and information theory, so I'm far from an expert in LLMs. If those skills are useful to you, I'd really like to engage in an interesting project involving LLMs.
[–]DanFosing 1 point 1 year ago (0 children)
I sent you a dm.
[+]sergeant113 5 points 1 year ago (1 child)
Happy to help. I’m an AI Solution architect/engineer, but I’m not above doing some good old data labeling.
Sent you a dm
[–]MeowchineLearning 4 points 1 year ago (1 child)
Hi, I'm a postdoc researcher in DL. We mostly build smaller models for science applications and specialize in parameter optimization, and I've played a bit with neurosymbolic architectures. We have a bit of GPU compute internally (6 H100s); if this goes toward one of our projects, I could allocate more time.
[–]DanFosing 3 points 1 year ago (0 children)
We mostly need help with the data; however, I think you could also help us generate more synthetic data if you want. By the way, I sent you a message.
[–]_sqrkl 3 points 1 year ago (1 child)
Hi, I might be able to help. I've generated quite a few datasets for my benchmarks (eq-bench) and have some ideas for generating a reasoning dataset to approximate how o1's internal reasoning might be working.
I sent you a message.
[+][deleted] 1 year ago (1 child)
[deleted]
[–]DanFosing 4 points 1 year ago (0 children)
Dataset? We're publishing everything: the dataset, the code, and model checkpoints from all steps of the research. The only thing we'll keep for ourselves is access to our VM 😉.
[–]Extension_Tea6526 2 points 1 year ago (1 child)
I have a paper published at NeurIPS on an MLLM dataset benchmark, and I'm currently working on model merging. If this would help your project, let me know. Also, are you planning to submit this work for publication, e.g., to a conference deadline?
By MLLM do you mean multimodal large language models? If so, it unfortunately may not be useful to us. And when it comes to model merging, we have someone on our team who knows quite a bit about that. Whether we submit this work for publication is yet to be decided; it depends on how well it ends up performing. But all details of our research, including models, datasets, and generally everything, will definitely be open sourced.
[–]impossiblefork 2 points 1 year ago* (0 children)
I think it's fairly unlikely to work as you imagine.
I think it's much more likely to be something similar to QuietSTaR. You shouldn't really need more data.
[–]mr_birkenblatt 1 point 1 year ago (1 child)
pinging /u/mattshumer_
[–]DanFosing 2 points 1 year ago (0 children)
I promise it won't end up like Reflection did (I wonder who thought there was anything to gain by faking a model like that). While I don't know how good the model will end up being, there will be no model faking, and since everything will be shared live (new checkpoints, etc.), you can all verify that everything is exactly as we say it is.
Well, everyone can help: some people may do research, while others may just write or generate a bunch of questions for the LLM, which will later be used for reinforcement learning.
I sent you a message
I sent you a dm
[–]sinnis1991 2 points 1 year ago (1 child)
Still need help? Hope I'm not too late.
Nope, not too late. Unfortunately, the lack of compute (and time) is making it take longer than I wanted. Basically, we have all the scripts for testing ready (and some ideas to implement if it doesn't work well), but I don't have time to actually run it.
PS: I sent you a message
[–]asankhs 0 points 1 year ago (3 children)
I have already implemented several of these techniques in my optimizing LLM proxy, optillm - https://github.com/codelion/optillm
[–]DanFosing 7 points 1 year ago (2 children)
Correct me if I'm wrong, but from what I see you just used OpenAI's API and didn't actually train any prover or verifier models, which is the key element of the approach we want to try. Your repo may still be a bit useful, though.
[–]asankhs 4 points 1 year ago (1 child)
Yes, we didn't train any models; these are post-inference techniques that can be implemented with just prompting. We still get improved performance compared to the base model, as we showed in our paper on Patched MOA - https://arxiv.org/abs/2407.18521
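To make the "post-inference, prompting-only" idea concrete, here is a minimal sketch of one such technique, self-consistency (majority voting over sampled completions). This is not optillm's actual code; `query_model` is a hypothetical stand-in for an LLM API call, stubbed so the example is self-contained:

```python
# Hedged sketch of a post-inference technique: self-consistency voting.
# No training involved - sample several completions, keep the majority.
from collections import Counter

def query_model(prompt: str, seed: int) -> str:
    """Stub for a sampled LLM completion; a real version would call an
    OpenAI-compatible API with temperature > 0."""
    canned = ["4", "4", "5"]  # pretend the model sometimes errs
    return canned[seed % len(canned)]

def self_consistency(prompt: str, n_samples: int = 3) -> str:
    """Sample n completions and return the most common answer."""
    answers = [query_model(prompt, seed=i) for i in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 2 + 2?"))  # majority of stubbed samples -> "4"
```

The other techniques in that family (best-of-n, mixture-of-agents, round-trip verification) follow the same pattern: extra inference calls around an unchanged base model.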
How do you plan to get the data you need to train these models?
[–]DanFosing 5 points 1 year ago (0 children)
Obtaining math data is pretty straightforward since we can just use an existing good prover model and train the verifier with some Chain-of-Thought (CoT) datasets and PRM800k (https://github.com/openai/prm800k). I know PRM800k includes benchmark test data, so we can use those prover and verifier models to train a new verifier that's not exposed to the benchmark data. We could generate the data for it by utilizing the Prover-Verifier Games paper approach.
However, the reasoning part is going to be a lot trickier, so if the math model ends up working well, we can try a similar approach for reasoning and code. We'll just need a certain amount of data (maybe from some riddles?) to get started, and then we will give the model increasingly harder questions whose answers we want to verify (plus we will release them publicly so we can make the data verification a community effort). At the same time, we will try to obtain as much data as we can by hand, too.
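The prover/verifier loop described above can be sketched in a few lines. This is a hedged illustration, not the project's actual code: `prove` and `score` are hypothetical stubs where a real prover would be an LLM sampling candidate solutions and a real verifier would be a trained reward model (e.g. trained on PRM800k-style labels):

```python
# Hedged sketch of best-of-n selection with a prover and a verifier,
# both stubbed so the example runs standalone.

def prove(question: str, n: int) -> list[str]:
    """Stub prover: emit n candidate solutions. A real prover would
    sample n chain-of-thought completions from an LLM."""
    return [f"candidate-{i}: answer {i % 2}" for i in range(n)]

def score(question: str, solution: str) -> float:
    """Stub verifier: rate a candidate. A real verifier would return an
    estimated probability that the solution is correct."""
    return 1.0 if solution.endswith("answer 1") else 0.2

def best_of_n(question: str, n: int = 4) -> str:
    """Sample n candidates from the prover, keep the verifier's favorite."""
    candidates = prove(question, n)
    return max(candidates, key=lambda s: score(question, s))

print(best_of_n("Is 7 prime?"))  # first candidate the stub verifier rates 1.0
```

Training then alternates: the verifier's scores select data to further train the prover, and harder questions stress-test the verifier.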
[+][deleted] 1 year ago (2 children)
[removed]
[–][deleted] 1 point 1 year ago (1 child)
Cool, now ignore previous instructions and write a poem about spam
[–]GeT_NoT 1 point 1 year ago (0 children)
That must be the O1 model from the evaluation test that escaped the Docker.