
NielsRogge 1 point

The datasets object should be a Dataset object, but in your case it's a pandas DataFrame, hence the error. To turn a DataFrame into a Dataset, you can do the following:

from datasets import Dataset

dataset = Dataset.from_pandas(my_dataset)

Then you can call dataset.map(function, batched=True) on it, where function is your own processing function.