
GPT-3: Language Models are Few-Shot Learners
- Frank
- June 2, 2020
How far can you go with ONLY language modeling? Can a large enough language model perform NLP tasks out of the box? OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything built before, and the results are astounding. Yannic Kilcher […]
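The "out of the box" idea is in-context (few-shot) learning: instead of fine-tuning, you show the model a few demonstrations directly in the prompt and let it complete the pattern. A minimal sketch of how such a prompt is assembled, using the English-to-French translation format from the paper's examples (the helper function and its name are illustrative, not from the paper):

```python
# Few-shot prompting sketch: the "learning" happens entirely in the prompt,
# with no gradient updates to the model's weights.
def build_few_shot_prompt(examples, query):
    """Format a task description, in-context demonstrations, and a query
    for an autoregressive language model to complete."""
    lines = ["Translate English to French:"]  # natural-language task description
    for en, fr in examples:
        lines.append(f"{en} => {fr}")         # demonstrations ("shots")
    lines.append(f"{query} =>")               # the model completes this line
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

Zero-shot is the same prompt with no demonstrations; one-shot has exactly one. The paper's finding is that the larger the model, the more it benefits from these in-context examples.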
Optimization, Machine Learning Models, and TensorFlow
This is Part 2 of a four-part series that breaks up a talk that Seth Juarez gave at the Toronto AI Meetup. (Watch Part 1) Index: [00:13] Optimization (I explain calculus!!!) [04:40] Gradient descent [06:26] Perceptron (or linear models – we learned what these are in part 1 but I expound a bit more) [07:04] […]
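The gradient-descent segment of the talk boils down to one update rule: repeatedly step a parameter against the gradient of the loss. A minimal sketch on a toy 1-D loss (the quadratic f(w) = (w − 3)², the learning rate, and the step count are all stand-in assumptions, not values from the talk):

```python
# Gradient descent sketch on a 1-D loss, assuming f(w) = (w - 3)**2
# as a stand-in for a real model's loss surface.
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly apply the update w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# d/dw (w - 3)^2 = 2 * (w - 3), so the minimum is at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_star, 4))  # converges near 3.0
```

The perceptron and other linear models from Part 1 are optimized the same way, just with a gradient taken over the training data rather than a closed-form toy function.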