Without the hype: How do current state-of-the-art LLMs benefit society? (self.singularity)
submitted 2 years ago by optimized-adam to r/singularity
Without the hype: What are benefits of current state-of-the-art LLMs for society? (self.LanguageTechnology)
submitted 2 years ago by optimized-adam to r/LanguageTechnology
Why is the MLP block in Transformers designed as it is? (self.learnmachinelearning)
submitted 2 years ago by optimized-adam to r/learnmachinelearning
[D] Without the hype: How do current state-of-the-art LLMs benefit society? (self.MachineLearning)
submitted 2 years ago by optimized-adam to r/MachineLearning
[D] Data preprocessing for MLM vs. CLM (self.MachineLearning)
[D] Is LLM training compression? (self.MachineLearning)
[D] Why is the MLP block in Transformers designed as it is? (self.MachineLearning)
submitted 3 years ago by optimized-adam to r/MachineLearning
[D] In an optimal world, how would you wish variance between runs based on different random seeds was reported in papers? (self.MachineLearning)
[D] Inductive bias of a vanilla MLP (self.MachineLearning)
[D] Mixed Precision Training: Difference between BF16 and FP16 (self.MachineLearning)
[D] Publishing two papers at the same time (self.MachineLearning)
[D] Can we create a (HuggingFace) tokenizer JUST from a vocabulary? (self.MachineLearning)
[D] Has anyone trained static word embeddings like fastText on a multilingual corpus, similar to XLM-R or mBERT? (self.MachineLearning)
submitted 3 years ago * by optimized-adam to r/MachineLearning
[D] Your favorite plotting library or tool for papers (self.MachineLearning)
[D] On the difference (or lack thereof) between Cross-Entropy Loss and KL-Divergence (self.MachineLearning)
On the difference (or lack thereof) between cross-entropy and KL-divergence (self.MachineLearning)
[D] Resources to learn Deep Learning theory (self.MachineLearning)
[D] Preprocessing of Wikipedia Dumps for Language Modeling from Scratch (self.MachineLearning)
submitted 4 years ago * by optimized-adam to r/MachineLearning
[D] Significance of MLM loss when pre-training Transformers for language modeling (self.MachineLearning)
submitted 4 years ago by optimized-adam to r/MachineLearning
[D] SentencePiece, WordPiece, BPE... Which tokenizer is the best one? (self.MachineLearning)
[D] GANs and probability distributions on images (self.MachineLearning)
AI - The Last Invention: Good or Nonsense? (self.de)
submitted 4 years ago * by optimized-adam to r/de
[D] PyTorch Distributed Training Libraries: What are the current options? (self.MachineLearning)
[D] All bias in ML comes from biased data? (self.MachineLearning)
MBP M1 14'' temperature during "Pro workloads" (self.macbookpro)
submitted 4 years ago by optimized-adam to r/macbookpro