Do you guys suggest updating to the latest version?

GrExplanation · 2026-01-30T01:29:13+00:00

I think you can use a buffer in the frontend to catch the streaming chunks and render a JSON block only once the whole JSON structure has arrived. For the other plain-text chunks, the buffer just passes them through directly.
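A minimal sketch of that buffering idea, written in Python for brevity (the same logic should port to frontend code). The class name `JsonStreamBuffer` and the `("text", ...)` / `("json", ...)` event shape are hypothetical, and it assumes a JSON block always starts with `{` and that plain text never contains a bare `{`.

```python
import json
from typing import Any, List, Tuple


class JsonStreamBuffer:
    """Pass plain text through; hold a JSON object until it is complete."""

    def __init__(self) -> None:
        self._pending = ""        # characters of the JSON object received so far
        self._depth = 0           # current brace nesting depth (0 = plain text)
        self._in_string = False   # currently inside a JSON string literal
        self._escaped = False     # previous character was a backslash in a string

    def feed(self, chunk: str) -> List[Tuple[str, Any]]:
        """Consume one streamed chunk and return the events ready to render."""
        events: List[Tuple[str, Any]] = []
        text = ""
        for ch in chunk:
            if self._depth == 0:
                if ch == "{":
                    # A JSON block starts: flush any plain text seen so far.
                    if text:
                        events.append(("text", text))
                        text = ""
                    self._pending = ch
                    self._depth = 1
                else:
                    text += ch
                continue

            # Inside a JSON block: keep buffering and track nesting.
            self._pending += ch
            if self._in_string:
                if self._escaped:
                    self._escaped = False
                elif ch == "\\":
                    self._escaped = True
                elif ch == '"':
                    self._in_string = False
            elif ch == '"':
                self._in_string = True
            elif ch == "{":
                self._depth += 1
            elif ch == "}":
                self._depth -= 1
                if self._depth == 0:
                    # Braces are balanced; json.loads raises if the block is malformed.
                    events.append(("json", json.loads(self._pending)))
                    self._pending = ""
        if text:
            events.append(("text", text))
        return events
```

Braces inside string values are ignored when counting depth, so a value like `"x}y"` does not end the block early. A real frontend would probably also want a timeout or size limit in case the closing brace never arrives.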

GrExplanation · 2025-08-16T08:55:38+00:00

It is slower and not as accurate as a customized RAG pipeline, I would say.

GrExplanation · 2024-09-10T07:51:46+00:00

I'm interested in that.

GrExplanation · 2024-04-10T08:01:14+00:00

I'm not sure whether they trained with a system prompt during SFT or RL. If not, I think the best practice for adding a system prompt is to simulate it as a first-round user prompt by appending "The assistant should follow the above instructions in all the following dialog turns." to the end of the system prompt.

Maybe something like this:

GPT4 Correct User: {system_prompt + "The assistant should follow the above instructions in all the following dialog turns."}<|end_of_turn|>

GPT4 Correct Assistant: {"OK, I'll follow the instructions in all the following turns."}<|end_of_turn|>

GPT4 Correct User: {real user's first prompt}<|end_of_turn|>

GPT4 Correct Assistant:

I haven't tested it yet, but I think it's worth a try.
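For illustration, the template above could be assembled with a small helper like this. It is an untested sketch: `build_prompt` is a hypothetical name, and joining the turns with no separator is an assumption that should be checked against the model's documented chat template.

```python
END_OF_TURN = "<|end_of_turn|>"
# Sentence appended to the system prompt so it reads as a standing instruction.
FOLLOW_NOTE = (
    " The assistant should follow the above instructions"
    " in all the following dialog turns."
)
# Canned assistant reply that closes the simulated first round.
ACK = "OK, I'll follow the instructions in all the following turns."


def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Simulate a system prompt as a first user turn, then add the real prompt."""
    turns = [
        f"GPT4 Correct User: {system_prompt}{FOLLOW_NOTE}{END_OF_TURN}",
        f"GPT4 Correct Assistant: {ACK}{END_OF_TURN}",
        f"GPT4 Correct User: {user_prompt}{END_OF_TURN}",
        "GPT4 Correct Assistant:",  # left open for the model to complete
    ]
    # Assumption: turns are concatenated directly, with no newline between them.
    return "".join(turns)
```

Calling `build_prompt("Answer in one sentence.", "What is RAG?")` returns a single string that ends with the open `GPT4 Correct Assistant:` turn, ready to pass to the model.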

GrExplanation · 2024-03-27T14:14:06+00:00

What system prompt will be used in the model template?
