peer review

AI Natural Language Processing

Transformers for Image Recognition at Scale

Yannic Kilcher explains why transformers are ruining convolutions. This paper, under review at ICLR, shows that given enough data, a standard Transformer can outperform Convolutional Neural Networks in image recognition tasks, which are classically tasks where CNNs excel. In this Video, I explain the architecture of the Vision Transformer (ViT), the reason why it works […]

Read More