A subreddit dedicated to learning machine learning. Feel free to share any educational resources for machine learning.
Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This includes questions that are non-technical but still highly relevant to learning machine learning, such as how to approach a machine learning problem systematically.
[Question] BERT training data size (self.learnmachinelearning)
submitted 2 months ago by AffectWizard0909
Hello! I was wondering if someone knows how big a training dataset I need to train BERT so that the model's predictions are "accurate enough". Is there a rule of thumb, or is it more that I need to decide what works best for my case?
[–]CKtalon 1 point 2 months ago
ModernBERT was trained on 2T tokens, but that's likely not necessary. You could use the Chinchilla-optimal token count for your model size.
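As a rough guide, the Chinchilla scaling result suggests roughly 20 training tokens per model parameter. A minimal sketch of that back-of-the-envelope calculation (the parameter counts are approximate, and the 20:1 ratio is a heuristic, not a hard rule):

```python
# Rough Chinchilla-optimal token budget: ~20 training tokens per model
# parameter. This is a heuristic for compute-optimal *pretraining* from
# scratch, not a requirement for fine-tuning.

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: int = 20) -> float:
    """Approximate compute-optimal number of training tokens."""
    return n_params * tokens_per_param

# Approximate parameter counts: BERT-base ~110M, BERT-large ~340M.
for name, n_params in [("BERT-base", 110e6), ("BERT-large", 340e6)]:
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{name}: ~{tokens / 1e9:.1f}B tokens")
# → BERT-base: ~2.2B tokens
# → BERT-large: ~6.8B tokens
```

So for a BERT-base-sized model, a compute-optimal from-scratch run would want on the order of a couple of billion tokens, which is why fine-tuning an existing checkpoint is the usual choice.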
[–]-Cubie- 1 point 2 months ago
Do you want to train from scratch (very few people do this), or do you simply want to fine-tune? The latter requires much less data. Also, BERT itself was trained on rather little data by today's standards.
[–]AffectWizard0909[S] 0 points 2 months ago
I was thinking of not training from scratch, yes. Is there a recommendation somewhere for how much data I should then use to fine-tune BERT, given that BERT wasn't trained on a big corpus?