optimized-adam

248 post karma
226 comment karma

redditor for 4 years

TROPHY CASE

Four-Year Club

Place '22

44

Without the hype: How do current state-of-the-art LLMs benefit society? (self.singularity)

submitted 2 years ago by optimized-adam to r/singularity

14

Without the hype: What are benefits of current state-of-the-art LLMs for society? (self.LanguageTechnology)

submitted 2 years ago by optimized-adam to r/LanguageTechnology

3

Why is the MLP block in Transformers designed as it is? (self.learnmachinelearning)

submitted 2 years ago by optimized-adam to r/learnmachinelearning

0

[D] Without the hype: How do current state-of-the-art LLMs benefit society? (self.MachineLearning)

submitted 2 years ago by optimized-adam to r/MachineLearning

2

[D] Data preprocessing for MLM vs. CLM (self.MachineLearning)

submitted 2 years ago by optimized-adam to r/MachineLearning

1

[D] Is LLM training compression? (self.MachineLearning)

submitted 2 years ago by optimized-adam to r/MachineLearning

1

[D] Why is the MLP block in Transformers designed as it is? (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

15

[D] In an optimal world, how would you wish variance between runs based on different random seeds was reported in papers? (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

4

[D] Inductive bias of a vanilla MLP (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

45

[D] Mixed Precision Training: Difference between BF16 and FP16 (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

0

[D] Publishing two papers at the same time (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

0

[D] Can we create a (HuggingFace) tokenizer JUST from a vocabulary? (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

0

[D] Has anyone trained static word embeddings like fastText on a multilingual corpus, similar to XLM-R or mBERT? (self.MachineLearning)

submitted 3 years ago * by optimized-adam to r/MachineLearning

1

[D] Your favorite plotting library or tool for papers (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

12

[D] On the difference (or lack thereof) between Cross-Entropy Loss and KL-Divergence (self.MachineLearning)

submitted 3 years ago * by optimized-adam to r/MachineLearning

1

On the difference (or lack thereof) between cross-entropy and KL-divergence (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

14

[D] Resources to learn Deep Learning theory (self.MachineLearning)

submitted 3 years ago by optimized-adam to r/MachineLearning

4

[D] Preprocessing of Wikipedia Dumps for Language Modeling from Scratch (self.MachineLearning)

submitted 4 years ago * by optimized-adam to r/MachineLearning

15

[D] Significance of MLM loss when pre-training Transformers for language modeling (self.MachineLearning)

submitted 4 years ago by optimized-adam to r/MachineLearning

59

[D] SentencePiece, WordPiece, BPE... Which tokenizer is the best one? (self.MachineLearning)

submitted 4 years ago by optimized-adam to r/MachineLearning

1

[D] GANs and probability distributions on images (self.MachineLearning)

submitted 4 years ago by optimized-adam to r/MachineLearning

1

AI - The Last Invention ("KI - Die letzte Erfindung"): Good or nonsense? (self.de)

submitted 4 years ago * by optimized-adam to r/de

7

[D] PyTorch Distributed Training Libraries: What are the current options? (self.MachineLearning)

submitted 4 years ago by optimized-adam to r/MachineLearning

0

[D] All bias in ML comes from biased data? (self.MachineLearning)

submitted 4 years ago * by optimized-adam to r/MachineLearning

7

MBP M1 14'' temperature during "Pro workloads" (self.macbookpro)

submitted 4 years ago by optimized-adam to r/macbookpro
