Paper Explained

AI, Large Language Models, Research

Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

Yannic Kilcher explains this paper, which promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and at its strengths and weaknesses.
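
To make the recurrence concrete, here is a minimal sketch of the segment-level memory mechanism, assuming PyTorch. The names (RMTSketch, MEM_TOKENS, SEG_LEN) are illustrative, and unlike the paper, which both prepends and appends memory tokens to each segment, this sketch only prepends them for brevity.

```python
# Minimal sketch of the Recurrent Memory Transformer's segment recurrence.
# Illustrative only; sizes and names are assumptions, not the paper's code.
import torch
import torch.nn as nn

D_MODEL, MEM_TOKENS, SEG_LEN = 64, 4, 16

class RMTSketch(nn.Module):
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Learned initial memory, prepended to the first segment.
        self.init_memory = nn.Parameter(torch.randn(1, MEM_TOKENS, D_MODEL))

    def forward(self, x):  # x: (batch, total_len, d_model)
        memory = self.init_memory.expand(x.size(0), -1, -1)
        outputs = []
        # Split the long input into fixed-size segments; the memory tokens'
        # output states carry information from one segment to the next,
        # so attention within a segment stays cheap regardless of total length.
        for segment in x.split(SEG_LEN, dim=1):
            h = self.encoder(torch.cat([memory, segment], dim=1))
            memory, out = h[:, :MEM_TOKENS], h[:, MEM_TOKENS:]
            outputs.append(out)
        return torch.cat(outputs, dim=1)

model = RMTSketch()
long_input = torch.randn(2, 8 * SEG_LEN, D_MODEL)  # 8 segments
print(model(long_input).shape)  # torch.Size([2, 128, 64])
```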

Generative AI

TransGAN: Two Transformers Can Make One Strong GAN

Generative Adversarial Networks (GANs) hold the state of the art when it comes to image generation. However, while the rest of computer vision is slowly being taken over by transformers or other attention-based architectures, all working GANs to date contain some form of convolutional layers. This paper changes that and builds TransGAN, the first GAN where both the generator and the discriminator are transformers.
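
As a rough illustration of that design, here is a minimal, convolution-free sketch in which both networks are transformers operating on patch tokens, assuming PyTorch. The sizes and names (Generator, Discriminator, PATCH_DIM) are assumptions for the example, not the paper's actual architecture.

```python
# Sketch of the TransGAN idea: generator and discriminator are both
# pure transformers over patch tokens; no convolutions anywhere.
import torch
import torch.nn as nn

D, PATCHES, PATCH_DIM = 64, 16, 3 * 8 * 8  # 4x4 grid of 8x8 RGB patches

def encoder(num_layers):
    layer = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=num_layers)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.expand = nn.Linear(128, PATCHES * D)  # latent -> patch tokens
        self.body = encoder(2)
        self.to_pixels = nn.Linear(D, PATCH_DIM)   # token -> pixel patch

    def forward(self, z):                           # z: (batch, 128)
        tokens = self.expand(z).view(-1, PATCHES, D)
        return self.to_pixels(self.body(tokens))    # (batch, 16, 192)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(PATCH_DIM, D)        # pixel patch -> token
        self.body = encoder(2)
        self.head = nn.Linear(D, 1)                 # real/fake score

    def forward(self, patches):                     # (batch, 16, 192)
        return self.head(self.body(self.embed(patches)).mean(dim=1))

G, D_net = Generator(), Discriminator()
fake = G(torch.randn(4, 128))
print(D_net(fake).shape)  # torch.Size([4, 1])
```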
