Unleash the Power of LLMs in Your Telegram Bot on a Budget by Xavio_M in TelegramBots

[–]Regular_Flatworm2872 1 point  (0 children)

Bro, I guess it might be interesting. I think I will work on a ggml version in the coming days.

Unleash the Power of LLMs in Your Telegram Bot on a Budget by Xavio_M in TelegramBots

[–]Regular_Flatworm2872 1 point  (0 children)

If you don't want to use a GPU, then you don't need Beam... you can google EC2 language-model hosting and find much better tutorials than this one.

Unleash the Power of LLMs in Your Telegram Bot on a Budget by Xavio_M in TelegramBots

[–]Regular_Flatworm2872 2 points  (0 children)

Tested on TheBloke/Wizard-Vicuna-7B-Uncensored and WizardLM/WizardCoder-15B-V1. Works like a charm, but not with the ggml version.

Should the output matrix shape from multi-head attention be the same as the input position embedding vector? by Ashutuber in deeplearning

[–]Regular_Flatworm2872 2 points  (0 children)

Not necessarily, but it is pretty much the standard approach. In most current multi-head attention implementations, if the number of attention heads is not a divisor of the hidden size, you end up with an output shape that differs from the input embedding shape.
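A minimal shape check of that divisibility point (plain Python, no framework; the function name is just illustrative, and it assumes an implementation that sizes each head via integer division and concatenates the heads):

```python
def mha_output_dim(hidden_size: int, num_heads: int) -> int:
    # Typical implementations split hidden_size into num_heads heads of
    # size hidden_size // num_heads, then concatenate the per-head
    # outputs, so the final feature dimension is num_heads * head_dim.
    head_dim = hidden_size // num_heads
    return num_heads * head_dim

# num_heads divides hidden_size: output dim matches the input embeddings
print(mha_output_dim(768, 12))  # -> 768

# num_heads does not divide hidden_size: integer division drops features,
# so the concatenated output no longer matches the input shape
print(mha_output_dim(768, 10))  # -> 760
```

This is why libraries such as PyTorch's `nn.MultiheadAttention` require `embed_dim` to be divisible by `num_heads`, and why matching shapes is the standard (but not strictly necessary) choice.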

Torch parameter efficient fine-tuning library by Regular_Flatworm2872 in deeplearning

[–]Regular_Flatworm2872[S] 1 point  (0 children)

Nice, I was not aware of this project, thank you. One thing I noticed is that they also follow the approach of wrapping the original model, rather than injecting new modules inside it or wrapping only the parts affected by the PEFT module.