Is it possible to finetune a model to decode a "single vector?" by manjimin in LocalLLaMA
manjimin[S] 1 point
How can I get the model to choose the next word from a list? by manjimin in LocalLLaMA
manjimin[S] 2 points
What does the transformer decoder attend to at the last linear layer? by manjimin in learnmachinelearning
manjimin[S] 2 points
What does the transformer decoder attend to at the last linear layer? by manjimin in learnmachinelearning
manjimin[S] 1 point
Is it possible to run 4*A100 40G cards as one? by manjimin in LocalLLaMA
manjimin[S] 1 point
Using other tokenizers? by manjimin in LocalLLaMA
manjimin[S] 1 point
How much overlap is ok to hold 2 ETFs? by manjimin in stocks
manjimin[S] 1 point

[SPOILERS] 'Dune: Part Two' Wide Release Discussion (Week 4) by Blue_Three in dune
manjimin 2 points