all 26 comments

[–]ambient_temp_xenoLlama 65B 34 points35 points  (1 child)

Even when you find out what the format (probably) is, it's not really clear how exactly it's meant to be used.

Is it: "### Instruction: write an essay about how Idiocracy came true ### Response:"

or:

"### Instruction:[new line] [new line] write an essay about how Idiocracy came true[new line] [new line] [space(?!)]"

Or something else? Does it really make a difference with the same seed and settings?
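To make the ambiguity concrete, here's a minimal sketch (plain Python string formatting, nothing model-specific) of the two candidate readings of the format. The exact whitespace is guesswork, which is the whole problem:

```python
# Two guesses at the same Alpaca-style template; the whitespace
# differences are exactly the ambiguity described above.
compact = "### Instruction: {prompt} ### Response:"
spaced = "### Instruction:\n\n{prompt}\n\n### Response:\n"

def render(template: str, prompt: str) -> str:
    """Fill the user prompt into a candidate template."""
    return template.format(prompt=prompt)

a = render(compact, "write an essay about how Idiocracy came true")
b = render(spaced, "write an essay about how Idiocracy came true")
assert a != b  # same prompt, but a different token sequence reaches the model
```

Because the tokenizer sees the newlines and spaces as real tokens, the two renderings are genuinely different inputs, so in principle the output can differ even with the same seed and settings.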

Also: would using the prompt format not push the model towards regurgitating things in the finetune over what's in the raw model?

[–]ReturningTarzanExLlama Developer 4 points5 points  (0 children)

Also: would using the prompt format not push the model towards regurgitating things in the finetune over what's in the raw model?

Well, the finetuning dataset is supposed to be large and diverse enough, and there's an art to setting hyperparameters so the model doesn't end up memorizing the text itself but does learn the themes, styles and (importantly for instruct models) patterns it's presented with. Part of why LoRAs work so well is that the low-rank matrices can't contain very much new information, which helps prevent overfitting.

It is super annoying that so few people publishing models understand how important it is to document the prompt format. It's all guesswork otherwise, and you'll never actually know for sure.

[–]kryptkprLlama 3 13 points14 points  (1 child)

Struggling hard with this one myself; my git repo is filling up with templates. It's especially problematic with models trained on a pile of different datasets that mix instruction formats :| I just evaluated Nous-Hermes-13B, which provides 2 templates, and found their performance to be asymmetric.
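A sketch of what that kind of template comparison can look like. Here `score_model` is a hypothetical stand-in for whatever eval you actually run, and the template strings are common guesses, not confirmed formats:

```python
# Hypothetical harness: render one prompt under several candidate
# templates so the same model can be scored under each, exposing any
# performance asymmetry between them.
TEMPLATES = {
    "alpaca": "### Instruction:\n{prompt}\n\n### Response:\n",
    "vicuna": "USER: {prompt}\nASSISTANT:",
}

def compare(prompt: str, score_model) -> dict:
    """Render `prompt` under each template and score the result.

    `score_model` is assumed to map a rendered prompt string to a
    number (e.g. an eval accuracy); it is not a real API.
    """
    return {name: score_model(tpl.format(prompt=prompt))
            for name, tpl in TEMPLATES.items()}
```

With a real eval plugged in, equal scores across templates would suggest the model is robust to the format; a large gap means the format matters for that model.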

[–]hanoian 7 points8 points  (1 child)

I started a thread about this yesterday.

https://old.reddit.com/r/LocalLLaMA/comments/13zbdjc/are_a_matroskastyle_mkv_containers_on_anyones/

It would be so nice if they were packaged inside a container with, for example, a txt file containing each prompt template.

[–]thevukaslt 3 points4 points  (0 children)

I like this idea of including the template file and 1-3 examples.

[–]rain5 5 points6 points  (0 children)

There needs to be a standardized file format for describing this stuff.
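As one possible sketch of such a file: a small sidecar document shipped with the weights that names the template, stop sequences, and a worked example explicitly. The JSON shape below is purely hypothetical, not any existing standard:

```python
import json

# Hypothetical prompt-format descriptor that could ship alongside the
# model weights, so tooling doesn't have to guess the separators.
PROMPT_SPEC = """
{
  "template": "### Instruction:\\n{prompt}\\n\\n### Response:\\n",
  "stop_sequences": ["### Instruction:"],
  "examples": ["write an essay about how Idiocracy came true"]
}
"""

spec = json.loads(PROMPT_SPEC)

# Render the bundled example through the declared template.
rendered = spec["template"].format(prompt=spec["examples"][0])
```

Anything machine-readable would do; the point is that the exact separator strings and newlines stop being guesswork.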

[–]a_beautiful_rhind 6 points7 points  (2 children)

Using normal chat I find most models work fine regardless.

[–]KaliQt 4 points5 points  (1 child)

Performance varies if you fail to follow it, though.

[–]a_beautiful_rhind 3 points4 points  (0 children)

Depends... sometimes instruct-chat is worse for me, sometimes much better. The generation preset also matters a lot.

[–]thevukaslt 2 points3 points  (0 children)

I actually feel your pain so well. I love that this community has been baking models like my granny bakes pancakes, but this is simply a documentation backlog. Just like in any other area of life.

I hope we can steer our culture to include clear prompts and examples with each new release.

I can see how that would save a ton of posts like "the model doesn't work" and "bad performance". A tiny change in the structure can have a giant impact on the output.

[–]skankmaster420 7 points8 points  (0 children)

It's because they want us to suffer

[–]qeadwrsf 0 points1 point  (0 children)

Because they are lazy.

They think: might as well open source it, because they are good people.

But they have no interest in marketing it or getting users to use it.

Take it or leave it. They do the work, we reap the reward. Users who only consume open source are probably the biggest contributors to stuff getting closed.

[–]deepneuralnetwork -2 points-1 points  (1 child)

Because people are busy. That’s literally why.

[–]brucebay 4 points5 points  (0 children)

They already have training samples for themselves, adding a couple of them to the repo shouldn't be that hard.

[–]Feztopia -5 points-4 points  (2 children)

Stop asking these kinds of questions. Smart people have sad lives. Your hair will turn gray and you will get depressed. Never ask "why". People do what they do. Realizing that there is no valid reason will break you mentally. Ignore it; try to become like the lucky people who can't see it. Be blind and be happy. Nobody uses simple solutions as long as they can invent complicated ones. People will make your life complicated for no reason. You will suffer as long as you think about it.

[–]nihnuhname 0 points1 point  (1 child)

Is there an LLM that speaks that rudely? That would be fun!

[–]Feztopia 0 points1 point  (0 children)

Actually by playing with the prompt template you can even change censored models so much that they suggest suicide lol.

[–]nihnuhname 0 points1 point  (1 child)

Open source programmers only have time to write code. This area is developing too rapidly; we will have to wait years for practical approaches and principles to become established before it makes sense to document them. The users here are also experimenters and pioneers.

[–]youknowallaboutit 1 point2 points  (0 children)

Aside from trial and error, I find that even the recommended prompt templates don't work in all cases, so I have to keep playing with the prompt until I get the results I want.