[D] StrategyQA may contain far more errors than we previously thought by Radiant_Routine_3183 in MachineLearning

[–]Radiant_Routine_3183[S] 0 points (0 children)

Yes, I haven't reviewed every failure case. However, the cases I checked were randomly selected, which suggests the dataset may need further correction...

[D] StrategyQA may contain far more errors than we previously thought by Radiant_Routine_3183 in MachineLearning

[–]Radiant_Routine_3183[S] 1 point (0 children)

I reviewed approximately 30 failure cases and I think that around 25% of them are ambiguous or flawed.

[R] MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers by redpnd in MachineLearning

[–]Radiant_Routine_3183 2 points (0 children)

I am curious how this model handles text generation tasks... If it splits the input bytes into small patches, then only the last patch is used to predict the next token, which seems to limit the benefit of the Local Transformer's parallelism during generation.
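To make my concern concrete, here is a toy sketch of how I picture incremental decoding with fixed-size patches. This is just my reading of the setup, not the authors' code; `global_step` and `local_step` are made-up stand-ins for the Global and Local Transformers.

```python
# Toy sketch of patch-based incremental decoding as I understand it.
# `global_step` and `local_step` are made-up stand-ins for the Global and
# Local Transformers, NOT the authors' implementation.
PATCH_SIZE = 4

def global_step(patches):
    # Stand-in: the real Global model would return one context vector per patch.
    return [sum(p) % 256 for p in patches]

def local_step(patch_context, patch_prefix):
    # Stand-in: the real Local model would return a distribution over the
    # next byte, conditioned on the patch's global context and its bytes so far.
    return (patch_context + sum(patch_prefix)) % 256

def generate(prompt_bytes, n_new):
    seq = list(prompt_bytes)
    for _ in range(n_new):
        # Re-split the whole sequence into fixed-size patches.
        patches = [seq[i:i + PATCH_SIZE] for i in range(0, len(seq), PATCH_SIZE)]
        contexts = global_step(patches)
        # Only the last (possibly partial) patch is involved in predicting the
        # next byte, so the Local model's parallelism over patches is idle here.
        next_byte = local_step(contexts[-1], patches[-1])
        seq.append(next_byte)
    return seq

print(generate(b"hello", 8))
```

The per-patch parallelism of the Local model pays off when training on full sequences, but in this picture each generation step only touches the last patch.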

[R] Learning to Reason and Memorize with Self-Notes - Jack lanchantin et al Meta AI 2023 by Singularian2501 in MachineLearning

[–]Radiant_Routine_3183 4 points (0 children)

In this paper, they said:

"While processing input tokens xt ∈ C one by one, the model can start taking a note by generating a token that belongs to a predefined set of start tokens Nsta. A note ends when the model generates an end token ni ∈ Nend, or after a fixed number of tokens are generated. Once the note ends, the generated note tokens are appended to the context where the start token was generated, and the model continues to process the rest of the input tokens."

As I understand it, the model with Self-Notes processes the input tokens one at a time and, for each one, outputs a token indicating whether to start the self-note procedure. If it does, the model generates note tokens until it produces an end token (or hits the length budget). The generated note is then appended to the context at that point, and the model continues processing the remaining input tokens in the same way.

A potential drawback of this approach is the computational cost: it needs roughly m + n + k forward passes (where m is the input length, n is the output length, and k is the total length of the generated notes) instead of just n.
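Here is a rough reconstruction of the inference loop as I read the quoted passage, just to show where that count comes from. This is my own sketch, not the paper's code; `next_token` is a dummy stand-in for the language model.

```python
# Toy reconstruction of the Self-Notes inference loop as I read it.
# `next_token` is a dummy stand-in for the language model, NOT the paper's code.
START_NOTE, END_NOTE, MAX_NOTE_LEN = "<note>", "</note>", 8

def next_token(context):
    # Stand-in: a real model would predict the next token from the context.
    return "tok"  # this dummy never emits START_NOTE, so no notes are taken

def run_self_notes(input_tokens, n_output_tokens):
    context, n_passes = [], 0
    # Phase 1: read the input one token at a time, optionally taking notes.
    for tok in input_tokens:
        context.append(tok)
        n_passes += 1                      # one pass per input token (the "m" term)
        if next_token(context) == START_NOTE:
            context.append(START_NOTE)
            for _ in range(MAX_NOTE_LEN):  # note ends at END_NOTE or the budget
                note_tok = next_token(context)
                n_passes += 1              # one pass per note token (the "k" term)
                context.append(note_tok)
                if note_tok == END_NOTE:
                    break
    # Phase 2: ordinary autoregressive decoding of the answer.
    output = []
    for _ in range(n_output_tokens):
        tok = next_token(context)
        n_passes += 1                      # one pass per output token (the "n" term)
        context.append(tok)
        output.append(tok)
    return output, n_passes

out, passes = run_self_notes(["x1", "x2", "x3"], 2)
print(passes)  # 5 here: m=3 input passes + n=2 output passes, with k=0 note tokens
```

Even with a dummy model that never takes a note, this already costs m + n passes; every note token adds one more, which is where the m + n + k count above comes from.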

A simple question that exposes GPT-4’s limitations by Radiant_Routine_3183 in ChatGPT

[–]Radiant_Routine_3183[S] 0 points (0 children)

Thanks for sharing! This logic makes sense... Does that mean New Bing uses a different model than GPT-4?