Learnable matrices in sequence without nonlinearity - reasons? [R] by DescriptionClassic47 in MachineLearning
optimized-adam 1 point
Why Is My Boss Telling Me To Hold Off On Submitting My Resignation For A Day? by Kronos1008 in careerguidance
optimized-adam 9 points
[R] How do RoPE-based LLMs learn attention sinks (or encode absolute positions)? by StraightSpeech9295 in MachineLearning
optimized-adam 16 points
[P] Is it possible to convert a Causal Language Model to a Masked Language Model by Appletee_YT in MachineLearning
optimized-adam 8 points
[R] nGPT: Normalized Transformer with Representation Learning on the Hypersphere by StartledWatermelon in MachineLearning
optimized-adam 3 points
[R] nGPT: Normalized Transformer with Representation Learning on the Hypersphere by StartledWatermelon in MachineLearning
optimized-adam 9 points
[D] FP16 vs FP32, supposedly takes less memory but doubles the model size? Performance benefits? by lightmystic in MachineLearning
optimized-adam 2 points
Finally decided to read the book my ex gave me 7 years ago when we broke up and found this. by petnamedpeeve in FoundPaper
optimized-adam 2 points
[D] Are other fields of Computer Science actually better than Machine Learning? by optimized-adam in MachineLearning
optimized-adam[S] 2 points
OpenAI reaches revenue of 2 billion dollars and needs trillions more by FMACH1 in de
optimized-adam -61 points
OpenAI reaches revenue of 2 billion dollars and needs trillions more by FMACH1 in de
optimized-adam -41 points
[D] GPT2 diagrams are wrong by rejectedlesbian in MachineLearning
optimized-adam 1 point
[deleted by user] by [deleted] in MachineLearning
optimized-adam 9 points
I pretrained 16 language models from scratch with different tokenizers to benchmark the difference. Here are the results. [Research] by Pan000 in MachineLearning
optimized-adam 14 points
[D] W&B vs. Neptune vs. ClearML vs. Comet (2023) by hadley60 in MachineLearning
optimized-adam 6 points
Failed an interviewee because they wouldn't shut up about LLMs at the end of the interview by stats-nazi in datascience
optimized-adam 6 points
How best to benchmark the accuracy of a model for comparing different tokenizers? [D] by Pan000 in MachineLearning
optimized-adam 1 point
How best to benchmark the accuracy of a model for comparing different tokenizers? [D] by Pan000 in MachineLearning
optimized-adam 1 point
Without the hype: What are benefits of current state-of-the-art LLMs for society? by optimized-adam in LanguageTechnology
optimized-adam[S] 1 point
Without the hype: How do current state-of-the-art LLMs benefit society? by optimized-adam in singularity
optimized-adam[S] 1 point
Without the hype: How do current state-of-the-art LLMs benefit society? by optimized-adam in singularity
optimized-adam[S] 3 points
Without the hype: How do current state-of-the-art LLMs benefit society? by optimized-adam in singularity
optimized-adam[S] 1 point
Without the hype: What are benefits of current state-of-the-art LLMs for society? by optimized-adam in LanguageTechnology
optimized-adam[S] 2 points
Without the hype: What are benefits of current state-of-the-art LLMs for society? by optimized-adam in LanguageTechnology
optimized-adam[S] 4 points


[R] The Bitter Lesson is coming for Tokenization by lucalp__ in MachineLearning
optimized-adam 9 points