transformer

AI, Generative AI, Large Language Models, Natural Language Processing, Neural Networks

How Neural Networks Learned to Talk | ChatGPT: A 30 Year History

This video from Art of the Problem explores the journey of language models, from their modest beginnings through the development of OpenAI’s GPT models, and hints at Q* / Google Gemini. Our journey takes us through the key moments in neural network research on next-word prediction. We delve into the early experiments with […]

Read More
Natural Language Processing

AI Language Models & Transformers

Plausible text generation has been around for a couple of years, but how does it work – and what’s next? Rob Miles on Language Models and Transformers.

Read More
AI, Natural Language Processing

Transformers, explained: Understand the model behind GPT, BERT, and T5

Curious how an ML model could write a poem or an op-ed? Transformers can do it all. In this episode of Making with ML, Dale Markowitz explains what transformers are, how they work, and why they’re so impactful. Over the past few years, transformers, a neural network architecture, have completely transformed state-of-the-art natural language […]
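
As a rough illustration of the mechanism the video introduces, here is a minimal sketch of scaled dot-product self-attention, the core operation inside a transformer layer. The dimensions, variable names, and NumPy implementation are illustrative assumptions, not code from the episode.

```python
import numpy as np

def self_attention(x, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token vectors.

    x: (seq_len, d_model) token embeddings
    W_q, W_k, W_v: (d_model, d_head) projection matrices (assumed shapes)
    """
    q = x @ W_q                                     # queries
    k = x @ W_k                                     # keys
    v = x @ W_v                                     # values
    scores = q @ k.T / np.sqrt(k.shape[-1])         # how strongly each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key dimension
    return weights @ v                              # each token becomes a weighted mix of value vectors

# Toy usage: 4 tokens, 8-dimensional embeddings, a single 8-dimensional head.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = self_attention(x, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)), rng.normal(size=(8, 8)))
print(out.shape)  # (4, 8)
```

Stacking this operation with feed-forward layers, residual connections, and positional information gives the full transformer architecture discussed in the video.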

Read More
AI, Natural Language Processing

AlphaCode Explained: AI Code Generation

AlphaCode is DeepMind’s new massive language model for generating code. It is similar to OpenAI Codex, except that the paper provides a bit more analysis. The field of NLP within AI and ML has exploded, with more papers appearing all the time. Hopefully this video can help you understand how AlphaCode works […]

Read More
AI, Research

GLOM: How to represent part-whole hierarchies in a neural network (Geoff Hinton’s Paper Explained)

Yannic Kilcher covers a paper in which Geoffrey Hinton describes GLOM, a computer vision model that combines transformers, neural fields, contrastive learning, capsule networks, denoising autoencoders, and RNNs. GLOM decomposes an image into a parse tree of objects and their parts. However, unlike previous systems, the parse tree is constructed dynamically and differently for each input, […]

Read More
AI, Natural Language Processing

Transformers for Image Recognition at Scale

Yannic Kilcher explains why transformers are ruining convolutions. This paper, under review at ICLR, shows that given enough data, a standard Transformer can outperform Convolutional Neural Networks in image recognition tasks, where CNNs have classically excelled. In the video, he explains the architecture of the Vision Transformer (ViT), the reason why it works […]
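
For a concrete sense of how a standard Transformer can consume an image, below is a hedged sketch of ViT’s patch-embedding idea: the image is cut into fixed-size patches, each patch is flattened and linearly projected, and the resulting sequence of patch tokens (plus a class token and position embeddings, omitted here) is fed to an ordinary transformer encoder. The patch size, dimensions, and function names are assumptions for illustration, not the paper’s code.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    H, W, C = image.shape
    patches = []
    for i in range(0, H, patch_size):
        for j in range(0, W, patch_size):
            patches.append(image[i:i + patch_size, j:j + patch_size, :].reshape(-1))
    return np.stack(patches)              # (num_patches, patch_size * patch_size * C)

# Toy usage: a 224x224 RGB image becomes 196 patch tokens, each projected to 768 dims.
rng = np.random.default_rng(0)
image = rng.normal(size=(224, 224, 3))
patches = patchify(image)                 # (196, 768)
W_embed = rng.normal(size=(patches.shape[1], 768))
tokens = patches @ W_embed                # the sequence handed to the transformer encoder
print(tokens.shape)                       # (196, 768)
```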

Read More
AI, Research

Explaining the Paper: Hopfield Networks is All You Need

Yannic Kilcher explains the paper “Hopfield Networks is All You Need.” Hopfield networks are one of the classic models of biological memory. This paper generalizes modern Hopfield Networks to continuous states and shows that the corresponding update rule is equal to the attention mechanism used in modern Transformers. It further analyzes a pre-trained BERT […]
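
To make the claimed equivalence concrete, here is a small sketch (with made-up dimensions and variable names) of one retrieval step in a continuous modern Hopfield network. The update has the same softmax form as transformer attention, with the current state acting as the query and the stored patterns acting as both keys and values.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hopfield_update(state, patterns, beta=8.0):
    """One retrieval step of a continuous modern Hopfield network.

    state:    (d,) current state / query vector
    patterns: (N, d) stored patterns, one per row
    The update softmax(beta * patterns @ state) @ patterns mirrors attention,
    with the stored patterns serving as both keys and values.
    """
    weights = softmax(beta * patterns @ state)  # attention weights over stored patterns
    return weights @ patterns                   # retrieve a weighted mix of stored patterns

# Toy usage: a noisy query is pulled toward the closest stored pattern.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(5, 16))
query = patterns[2] + 0.1 * rng.normal(size=16)
retrieved = hopfield_update(query, patterns)
print(np.argmax(patterns @ retrieved))          # typically 2: retrieval converges on pattern 2
```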

Read More
Generative AI, Natural Language Processing

14 Cool Apps Built on OpenAI’s GPT-3 API

Bakz T. Future shows off 14 cool applications built on top of OpenAI’s GPT-3 (Generative Pre-trained Transformer) API (currently in private beta).

Read More