
Beginner’s Guide to Transformers: What Are They & How Do They Work?
- Frank
- May 25, 2023
- AI
- Transformers
This video from AssemblyAI delves into transformers, because there’s more than meets the eye. I could not resist. Transformers were introduced in 2017 with the paper Attention Is All You Need by Google researchers. Since their introduction, transformers have been widely adopted in the industry.
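For readers who want the one formula at the heart of that paper, here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. The function and variable names are illustrative, not taken from the video:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query/key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

# Toy self-attention: 3 tokens, head dimension 4, with Q = K = V.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(tokens, tokens, tokens).shape)  # (3, 4)
```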
Read More
Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)
Yannic Kilcher explains this paper, which promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and at its strengths and weaknesses.
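For intuition, here is a heavily simplified sketch of the recurrent-memory idea: the sequence is processed segment by segment, and a small block of memory tokens is carried from one segment to the next. The `run_transformer` stub is an assumption of this sketch; the real model is a trained fixed-context transformer, and the paper places memory tokens at both ends of each segment:

```python
import numpy as np

def run_transformer(tokens):
    # Stand-in for a trained fixed-context transformer (assumption);
    # it returns its inputs unchanged so the plumbing below runs.
    return tokens

def rmt_forward(sequence, num_memory=4, segment_len=512):
    """Process a long sequence in segments, carrying memory tokens across."""
    memory = np.zeros((num_memory, sequence.shape[-1]))  # learned in the real model
    for start in range(0, len(sequence), segment_len):
        segment = sequence[start:start + segment_len]
        inputs = np.concatenate([memory, segment], axis=0)  # prepend memory
        outputs = run_transformer(inputs)
        memory = outputs[:num_memory]  # updated memory flows to the next segment
    return memory

long_seq = np.random.default_rng(0).normal(size=(2048, 8))
print(rmt_forward(long_seq).shape)  # (4, 8)
```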
Read More
Illustrated Guide to Transformers Neural Network: A step by step explanation
Transformers are all the rage nowadays, but how do they work? This video demystifies the novel neural network architecture with a step-by-step explanation and illustrations of how transformers work. CORRECTIONS: The sine and cosine functions are actually applied to the embedding dimensions and time steps!
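That correction refers to sinusoidal positional encoding, where each (time step, embedding dimension) pair receives a sine or cosine value. A minimal sketch of the standard formulation, assuming an even model dimension:

```python
import numpy as np

def sinusoidal_positional_encoding(max_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] = cos(...)."""
    positions = np.arange(max_len)[:, None]          # time steps
    dims = np.arange(0, d_model, 2)[None, :]         # even embedding dimensions
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # sine on even dimensions
    pe[:, 1::2] = np.cos(angles)   # cosine on odd dimensions
    return pe

print(sinusoidal_positional_encoding(50, 16).shape)  # (50, 16)
```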
Read More
Transformers, explained: Understand the model behind GPT, BERT, and T5
- Frank
- June 18, 2022
- AutoML
- automl transformer
- Bert
- Dale Markowitz
- GDS: Yes
- Google Cloud
- how do transformers work
- Machine Learning
- Making with Machine Learning
- Making with ML
- ML
- pr_pr: Google Cloud
- purpose: Educate
- series: Making with Machine Learning
- transformer
- transformer model
- transformer models
- Transformers
- transformers explained
- transformers machine learning
- transformers ml
- type: DevByte+
- understanding transformers
Curious how an ML model could write a poem or an op-ed? Transformers can do it all. In this episode of Making with ML, Dale Markowitz explains what transformers are, how they work, and why they’re so impactful. Over the past few years, Transformers, a neural network architecture, have completely transformed state-of-the-art natural language […]
Read More
Geometric Deep Learning: The Erlangen Programme of ML
- Frank
- July 6, 2021
- AI
- artificial intelligence
- Cancer
- CNN
- Computer Graphics
- Computer Vision
- Convolutional Neural Networks
- Deep Learning
- drug design
- equivariance
- erlangen program
- geometric deep learning
- Geometry
- GNN
- graph learning
- graph neural networks
- group theory
- hyperfoods
- immunotherapy
- invariance
- Machine Learning
- manifold learning
- Neural Network
- positional encoding
- Proteins
- symmetry
- transformer
- Transformers
The ICLR 2021 keynote “Geometric Deep Learning: The Erlangen Programme of ML” by Michael Bronstein is presented below.
Read More
How I Read AI Papers
Yannic Kilcher retraces his first reading of Facebook AI’s DETR paper and explains his process of understanding it. OUTLINE: 0:00 – Introduction 1:25 – Title 4:10 – Authors 5:55 – Affiliation 7:40 – Abstract 13:50 – Pictures 20:30 – Introduction 22:00 – Related Work 24:00 – Model 30:00 – Experiments 41:50 – Conclusions & Abstract […]
Read More
GPT-3: Language Models are Few-Shot Learners
- Frank
- June 2, 2020
- AI
- artificial intelligence
- Arxiv
- attention
- autoregressive
- Bert
- boolq
- common crawl
- context
- corpus
- deep language
- Deep Learning
- explained
- Few Shot
- glue
- GPT-2
- gpt-3
- gpt2
- gpt3
- heads
- language model
- Machine Learning
- Math
- Microsoft
- mlm
- Natural Language Processing
- natural questions
- Neural Networks
- news
- NLP
- OpenAI
- Paper
- perplexity
- question answering
- sota
- strings
- superglue
- training data
- Transformers
- turing
- Wikipedia
- zero shot
How far can you go with ONLY language modeling? Can a large enough language model perform NLP tasks out of the box? OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything that has ever been built before, and the results are astounding. Yannic Kilcher […]
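“Out of the box” here means in-context (few-shot) learning: the task is demonstrated inside the prompt itself, with no gradient updates. A sketch of what such a prompt looks like; the translation pairs below are made up for illustration:

```python
# Few-shot prompting: demonstrations followed by the query, all in plain text.
demonstrations = [
    ("cheese", "fromage"),
    ("house", "maison"),
    ("cat", "chat"),
]
query = "dog"

prompt = "Translate English to French:\n"
prompt += "\n".join(f"{en} => {fr}" for en, fr in demonstrations)
prompt += f"\n{query} =>"

print(prompt)  # the model is asked to continue this text; no fine-tuning occurs
```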
Read More
AI Language Models & Transformers
Rob Miles on Language Models and Transformers: plausible text generation, how it works, and what’s next.
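On the “plausible text generation” point: generation usually works by repeatedly sampling from the model’s next-token distribution, often with a temperature knob that trades predictability for variety. A minimal sketch with a made-up vocabulary and logits:

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, seed=0):
    """Sample one token id from temperature-scaled logits (softmax sampling)."""
    scaled = np.asarray(logits) / temperature   # <1 sharpens, >1 flattens
    probs = np.exp(scaled - scaled.max())       # numerically stable softmax
    probs /= probs.sum()
    return np.random.default_rng(seed).choice(len(probs), p=probs)

# Toy vocabulary with fake next-token logits.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.5, 0.1]
print(vocab[sample_next_token(logits)])
```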
Read More