Yannic Kilcher

Natural Language Processing

Exploring GPT-3

In this special edition of Machine Learning Street Talk, Dr. Tim Scarfe, Yannic Kilcher and Dr. Keith Duggar speak with Professor Gary Marcus, Dr. Walid Saba and Connor Leahy about GPT-3. We have all had significant time to experiment with GPT-3, and we show demos of it in use and discuss the considerations it raises. […]

Read More
AI

How I Read AI Papers

Yannic Kilcher retraces his first reading of Facebook AI’s DETR paper and explains his process for understanding it. OUTLINE: 0:00 – Introduction 1:25 – Title 4:10 – Authors 5:55 – Affiliation 7:40 – Abstract 13:50 – Pictures 20:30 – Introduction 22:00 – Related Work 24:00 – Model 30:00 – Experiments 41:50 – Conclusions & Abstract […]

Read More
Natural Language Processing

Language Models are Few-Shot Learners (OpenAI GPT-3)

On Machine Learning Street Talk, Tim Scarfe, Yannic Kilcher and Connor Shorten discuss their takeaways from OpenAI’s GPT-3 language model. OpenAI trained a 175-billion-parameter autoregressive language model, and the paper demonstrates how self-supervised language modelling at this scale can perform many downstream tasks without fine-tuning. Paper links: GPT-3: https://arxiv.org/abs/2005.14165 Content index: 00:00:00 Intro 00:00:54 ZeRO1+2 […]
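As a rough illustration of the few-shot setup discussed in the episode: the task demonstrations live entirely in the prompt, with no gradient updates, and the model simply continues the pattern. The sketch below shows one way such a prompt might be assembled; `build_few_shot_prompt` and `complete` are hypothetical names, and this is not code from the paper.

```python
# Minimal sketch of few-shot prompting: the task is specified purely by
# in-context demonstrations followed by an unanswered query.
def build_few_shot_prompt(demonstrations, query):
    """Concatenate input/output demonstrations, then leave the query open."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demonstrations]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

demos = [
    ("The movie was fantastic.", "positive"),
    ("I hated every minute of it.", "negative"),
]
prompt = build_few_shot_prompt(demos, "A dull, lifeless film.")
print(prompt)
# answer = complete(prompt)  # hypothetical autoregressive LM call;
#                            # the model is expected to emit "negative"
```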

Read More
AI Natural Language Processing

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Yannic Kilcher investigates BERT and the paper associated with it: https://arxiv.org/abs/1810.04805 Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As […]
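To see what “jointly conditioning on both left and right context” means in practice, the masked-token sketch below uses the Hugging Face transformers fill-mask pipeline (my choice of tooling, not something from the paper or video) to show BERT filling in a blank from the words on both sides of it:

```python
# Sketch of BERT's bidirectional masked-language-model objective, using the
# Hugging Face transformers library (an assumption; not part of the paper).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The prediction for [MASK] depends on context to its left ("The capital of
# France") and to its right ("is famous for the Eiffel Tower") at once.
for candidate in fill_mask(
    "The capital of France, [MASK], is famous for the Eiffel Tower."
):
    print(f"{candidate['token_str']:>10}  score={candidate['score']:.3f}")
```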

Read More