Transformers

AI Large Language Models Natural Language Processing

Beginner’s Guide to Transformers: What Are They & How Do They Work?

This video from AssemblyAI delves into transformers, because there’s more than meets the eye. I could not resist. Transformers were introduced in 2017 with the paper Attention Is All You Need by Google researchers. Since their introduction, transformers have been widely adopted in the industry.

AI Large Language Models Research

Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

Yannic Kilcher explains this paper that promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it: The Recurrent Memory Transformer, and what its strengths and weaknesses are.

Natural Language Processing Research

LLaMA: Open and Efficient Foundation Language Models (Paper Explained)

Large Language Models (LLMs) are all the rage right now. ChatGPT is the LLM everyone talks about, but there are others. With the attention (and money) that OpenAI is getting, expect more of them. LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer […]

AI Natural Language Processing Neural Networks

Illustrated Guide to Transformers Neural Network: A step by step explanation

Transformers are all the rage nowadays, but how do they work? This video demystifies the novel neural network architecture with a step-by-step explanation and illustrations of how transformers work. CORRECTIONS: The sine and cosine functions are actually applied to the embedding dimensions and time steps!

AI Natural Language Processing

Transformers, explained: Understand the model behind GPT, BERT, and T5

Curious how an ML model could write a poem or an op-ed? Transformers can do it all. In this episode of Making with ML, Dale Markowitz explains what transformers are, how they work, and why they’re so impactful. Over the past few years, Transformers, a neural network architecture, have completely transformed state-of-the-art natural language […]

AI

How I Read AI Papers

Yannic Kilcher retraces his first reading of Facebook AI’s DETR paper and explains his process of understanding it. OUTLINE: 0:00 – Introduction 1:25 – Title 4:10 – Authors 5:55 – Affiliation 7:40 – Abstract 13:50 – Pictures 20:30 – Introduction 22:00 – Related Work 24:00 – Model 30:00 – Experiments 41:50 – Conclusions & Abstract […]

AI Natural Language Processing

GPT-3: Language Models are Few-Shot Learners

How far can you go with ONLY language modeling? Can a large enough language model perform NLP tasks out of the box? OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything that has ever been built before, and the results are astounding. Yannic Kilcher […]

AI Generative AI Natural Language Processing

AI Language Models & Transformers

Rob Miles on language models and transformers: plausible text generation, how it works, and what’s next.
