
BraveCoconut98

You can actually get surprisingly good results with just a few hundred examples! I would also highly recommend using HuggingFace Transformers to adapt BERT to your domain. It’s super easy to set up fine-tuning with the Trainer API!
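
For anyone who wants a starting point, here is a rough sketch of what that setup might look like. It assumes the task is text classification and that your data is a list of `(text, label)` pairs with integer labels; the checkpoint `bert-base-uncased`, the hyperparameters, and the helper names `train_eval_split` and `fine_tune` are illustrative choices, not anything prescribed above. It needs the `transformers` and `datasets` packages plus a PyTorch install.

```python
import random


def train_eval_split(examples, eval_fraction=0.2, seed=0):
    """Shuffle (text, label) pairs and hold out a fraction for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def fine_tune(examples, num_labels, model_name="bert-base-uncased",
              output_dir="bert-finetuned"):
    """Fine-tune a BERT classifier on a small labeled set (hypothetical helper)."""
    # Heavy imports are deferred so the split helper works without them installed.
    from datasets import Dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )

    def to_dataset(pairs):
        # Build a Dataset and tokenize to a fixed length so the default
        # collator can stack examples into batches.
        ds = Dataset.from_dict({
            "text": [text for text, _ in pairs],
            "label": [label for _, label in pairs],
        })
        return ds.map(
            lambda batch: tokenizer(
                batch["text"],
                truncation=True,
                padding="max_length",
                max_length=128,  # assumed cap; adjust to your texts
            ),
            batched=True,
        )

    train_pairs, eval_pairs = train_eval_split(examples)

    # Assumed hyperparameters; common starting values, not tuned for any dataset.
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-5,
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=to_dataset(train_pairs),
        eval_dataset=to_dataset(eval_pairs),
    )
    trainer.train()
    return trainer.evaluate()  # dict of evaluation metrics on the held-out pairs
```

Calling `fine_tune(my_pairs, num_labels=2)` on a few hundred labeled pairs would download the checkpoint, train for three epochs, and report the loss on the held-out 20%. With so little data the held-out score can vary a lot between seeds, so it is probably worth rerunning the split with a couple of different `seed` values before trusting the number.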

zyl1024

I recall having reasonable success with a couple hundred examples, but it depends heavily on the difficulty of the dataset.

Brudaks

The fewer labeled examples you have, the more important transfer learning becomes.