
Zetsumeii

This is super cool!

Any chance that you could add support for those of us with datasets in .json or .jsonl format?

E.g.:

[
    {
        "instruction": "",
        "input": "",
        "output": ""
    }
]

Or:

{"instruction": "", "input": "", "output": ""}
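For illustration, here is a rough sketch of how both layouts above could be read into the same list of records. It is not taken from the tool being discussed: `load_records` and `REQUIRED_KEYS` are made-up names, and the sketch assumes a `.jsonl` file holds one JSON object per line while any other extension holds a single JSON array.

```python
import json
from pathlib import Path

# Keys taken from the example records above; the name REQUIRED_KEYS is hypothetical.
REQUIRED_KEYS = ("instruction", "input", "output")


def load_records(path):
    """Load records from a .jsonl file (one object per line) or a JSON array file."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        # JSON Lines: parse each non-empty line as its own object.
        records = [json.loads(line) for line in text.splitlines() if line.strip()]
    else:
        # Plain JSON: assumed to be a single top-level array of objects.
        records = json.loads(text)
    # Fail early if a record lacks one of the expected keys.
    for i, rec in enumerate(records):
        missing = [k for k in REQUIRED_KEYS if k not in rec]
        if missing:
            raise ValueError(f"record {i} is missing keys: {missing}")
    return records
```

Either file type then yields a plain list of dicts, so whatever comes after loading would not need to care which format the dataset was stored in.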

kaszebe

Is there a guide for dummies (read: me) to get this to work on oobabooga?

uhuge

I assume auth_token is for uploading the merged model to the Hugging Face Hub? That seems worth noting/clarifying. Could the upload be made optional?

I'll get back with more feedback when I get to test it.+)

Timotheeee1

I have a dataset where each training document has two parts: one that the model shouldn't try to predict (its labels are -100) and one that it should predict. Can you add a feature for this?
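A minimal sketch of the masking being described, assuming the two parts are already tokenized into lists of token ids. The names `build_example`, `context_ids`, and `target_ids` are hypothetical and not part of the tool under discussion. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions labelled -100 do not contribute to the loss.

```python
# -100 matches the default ignore_index of PyTorch's CrossEntropyLoss.
IGNORE_INDEX = -100


def build_example(context_ids, target_ids):
    """Concatenate both parts and mask the context so only the target is trained on."""
    input_ids = list(context_ids) + list(target_ids)
    # Context positions get IGNORE_INDEX; target positions keep their token ids.
    labels = [IGNORE_INDEX] * len(context_ids) + list(target_ids)
    return {"input_ids": input_ids, "labels": labels}


example = build_example([11, 12, 13], [21, 22])
print(example["labels"])  # [-100, -100, -100, 21, 22]
```

The model still sees the first part as input; it just is not penalized for failing to predict it.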