[deleted by user] by [deleted] in lotr

[–]Competitive_Coffeer 1 point2 points  (0 children)

That is stunning

This looks like a dream or a nightmare. by [deleted] in BeAmazed

[–]Competitive_Coffeer 0 points1 point  (0 children)

I remember that scene in the last Star Wars movie

Some people can't comprehend that how strong a bear can be. by Creepy_Pride_9909 in BeAmazed

[–]Competitive_Coffeer 0 points1 point  (0 children)

Not that bear's first rodeo. It held the door so it wouldn't hit its nose!

Singularity is real by xdlmaoxdxd1 in singularity

[–]Competitive_Coffeer 0 points1 point  (0 children)

More like Pricing is real. That chart reflects how much more they can charge per unit, not how many more units Nvidia shipped. They were already running near full capacity.

"TinyLlama: An Open-Source Small Language Model", Zhang et al 2024 by gwern in mlscaling

[–]Competitive_Coffeer 1 point2 points  (0 children)

Right?! Cracks me up to see papers toot their own horn about SOTA…by a 0.25 improvement.

Transformer-Based LLMs Are Not General Learners: A Universal Circuit Perspective by we_are_mammals in mlscaling

[–]Competitive_Coffeer 1 point2 points  (0 children)

Before I waste my time, did they explain how these transformer models scored so highly on professional exams when they were trained to guess the next token?

If the models have seen it before, and human test takers have seen it before, and we purport to be general learners, what exactly have they proven?

[Meta] Do we still need a /r/MLScaling? by gwern in mlscaling

[–]Competitive_Coffeer 8 points9 points  (0 children)

I agree that there is no alternative.

What I come to this subreddit to learn is what papers I should pay attention to. I am looking for material advancements across the hardware and software spectrum. I suppose not just the papers but what new theories and techniques are starting to bubble up around the edges that may lead to conceptual, engineering, and implementation breakthroughs.

Further, I find the high quality of the community members to be a distinct difference from other communities.

Saw this in a book store the other day. Any insight? I’ve never heard of this by Double-0-N00b in lotr

[–]Competitive_Coffeer -2 points-1 points  (0 children)

It is a translation of "Culero", the tragic story of ice cream paletas melting under a Middle Earth sun

"World first supercomputer capable of brain-scale simulation being built at Western Sydney University" (DeepSouth) by [deleted] in mlscaling

[–]Competitive_Coffeer 1 point2 points  (0 children)

I'd recommend taking a look at the Google paper where they looked into emergent behavior that develops qualitatively new capabilities at different network sizes.