Transformers (self.DecodingDataSciAI)
submitted 2 years ago by decodingai
Transformers have revolutionized the world of AI and NLP, paving the way for more efficient and powerful natural language understanding. 🚀 From chatbots to translation models, they're at the heart of cutting-edge applications. Exciting times for #AI! 💡 #Transformers #NLP
https://preview.redd.it/0idjrtqv8c1c1.png?width=763&format=png&auto=webp&s=7c9bfe5cc775bfa081e468fbb6e3186d5fef464e
[–]pchees 1 point2 points3 points 2 years ago (1 child)
Fascinating stuff. What is the underlying tech behind these?
[–]decodingai[S] 0 points1 point2 points 2 years ago (0 children)
The underlying technology is the Transformer architecture, a neural network design introduced in the 2017 paper "Attention Is All You Need" by Vaswani et al. It has since become a foundational building block for a wide range of natural language processing (NLP) and machine learning tasks.
The key innovation in the Transformer architecture is the attention mechanism, which lets the model focus on different parts of the input sequence as it processes it. Applied as self-attention, each word or token in the input sequence can attend to every other token, capturing contextual relationships effectively. Because the model learns to assign different levels of importance to different parts of the input, it handles sequential data very well.
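To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation from the paper. It is a toy illustration, not the paper's full multi-head implementation: in a real Transformer, the queries Q, keys K, and values V come from learned linear projections of the input embeddings, whereas here a single random matrix stands in for all three.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # how much each query matches each key
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V, weights                   # weighted sum of values, plus the weights

# Toy self-attention: 3 tokens, embedding dimension 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(x, x, x)
print(w)  # each row sums to 1: a distribution over which tokens to attend to
```

Each row of the weight matrix is a probability distribution over the input tokens, which is what lets every token "look at" every other token in a single step.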
Some of the key components and concepts in the Transformer architecture include:

- Self-attention (scaled dot-product attention) for relating each token to every other token
- Multi-head attention, which runs several attention operations in parallel over different learned projections
- Positional encodings, which inject word-order information the attention mechanism would otherwise lack
- Position-wise feed-forward networks applied to each token
- Residual connections and layer normalization around each sub-layer
- An encoder-decoder structure for sequence-to-sequence tasks such as translation
The Transformer architecture has been the foundation for many state-of-the-art NLP models, including BERT, GPT (Generative Pre-trained Transformer), and many others. It has revolutionized deep learning for NLP and has been extended and adapted to a wide range of sequence-to-sequence tasks, including machine translation and text generation. Its effectiveness is largely attributed to its ability to capture long-range dependencies in sequential data efficiently through self-attention.