Yannic Kilcher explains how the AI community has gone through regular cycles of AI Springs: periods of rapid progress that gave rise to massive overconfidence, heavy funding, and overpromising, followed by those promises going unfulfilled and a slide into disillusionment and underfunding, known as AI Winters.
In this video he explores a paper that examines the reasons for these repeated periods of overconfidence and identifies four fallacies people commit when they see rapid progress in AI.
Here’s an interesting documentary (“Canada – The Rise of AI”, Ep. 11) on the “Canadian Silicon Valley.”
Silicon Valley may be home to some of the biggest tech giants in the world, but it’s being challenged like never before. Crazy tech geniuses have popped up all over the planet making things that will blow your mind. Author and journalist Ashlee Vance is on a quest to find the most innovative tech creations and meet the beautiful freaks behind them. Bloomberg Businessweek presents an exclusive premiere of the latest episode of Hello World, the tech-travel show hosted by journalist and best-selling author Ashlee Vance and watched by millions of people around the globe.
Yannic Kilcher covers a paper in which Geoffrey Hinton describes GLOM, a computer vision model that combines transformers, neural fields, contrastive learning, capsule networks, denoising autoencoders, and RNNs.
GLOM decomposes an image into a parse tree of objects and their parts. However, unlike previous systems, the parse tree is constructed dynamically and differently for each input, without changing the underlying neural network. This is done by a multi-step consensus algorithm that runs over different levels of abstraction at each location of an image simultaneously. GLOM is just an idea for now but suggests a radically new approach to AI visual scene understanding.
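The multi-step consensus idea can be illustrated with a toy sketch. Everything below is illustrative rather than taken from the paper: identity maps stand in for GLOM's learned bottom-up and top-down networks, and the update simply averages the four contributions (previous state, level below, level above, and an attention-weighted same-level average across locations):

```python
import numpy as np

rng = np.random.default_rng(0)
n_locations, n_levels, dim = 6, 3, 8

# One embedding per (location, level): each image location keeps a column of
# vectors representing part/whole hypotheses at different abstraction levels.
state = rng.normal(size=(n_locations, n_levels, dim))

def attention_avg(vecs):
    # Same-level consensus: each location attends to every location,
    # weighted by a softmax over scaled embedding dot products.
    scores = vecs @ vecs.T / np.sqrt(vecs.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ vecs

def spread(s):
    # Average cross-location standard deviation (a rough "disagreement" measure).
    return s.std(axis=0).mean()

initial_spread = spread(state)
for _ in range(10):
    new_state = state.copy()
    for lvl in range(n_levels):
        below = state[:, lvl - 1] if lvl > 0 else state[:, lvl]
        above = state[:, lvl + 1] if lvl < n_levels - 1 else state[:, lvl]
        # Consensus update: average the four contributions. In the real model,
        # `below` and `above` would pass through learned neural networks.
        new_state[:, lvl] = (state[:, lvl] + below + above
                             + attention_avg(state[:, lvl])) / 4
    state = new_state
final_spread = spread(state)
```

Iterating this update makes embeddings agree more and more across locations and levels, which is the mechanism by which GLOM's "islands of agreement" are meant to form a parse tree dynamically, per input, without changing any weights.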
Generative Adversarial Networks (GANs) hold the state-of-the-art when it comes to image generation.
However, while the rest of computer vision is slowly being taken over by transformers and other attention-based architectures, all working GANs to date contain some form of convolutional layer. This paper changes that and builds TransGAN, the first GAN in which both the generator and the discriminator are transformers. The discriminator is adapted from ViT (“An Image Is Worth 16×16 Words”), and the generator uses PixelShuffle to up-sample the generated image. Three tricks make training work: data augmentation with DiffAug, an auxiliary super-resolution task, and a localized initialization of self-attention.
Their largest model reaches competitive performance with the best convolutional GANs on CIFAR10, STL-10, and CelebA.
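The PixelShuffle up-sampling step is simple enough to show concretely. A minimal NumPy sketch, assuming the same channel layout as `torch.nn.PixelShuffle` (input `(C·r², H, W)`, output `(C, H·r, W·r)`):

```python
import numpy as np

def pixel_shuffle(x, r):
    # Rearrange (C*r*r, H, W) -> (C, H*r, W*r): each group of r*r channels
    # becomes an r x r spatial block, trading channel depth for resolution.
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

x = np.arange(16, dtype=float).reshape(4, 2, 2)  # 4 channels of 2x2
y = pixel_shuffle(x, 2)
print(y.shape)  # (1, 4, 4)
```

Because it only rearranges values produced by the preceding layer, PixelShuffle lets a transformer generator grow spatial resolution without any transposed convolutions.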
Yannic Kilcher explains the paper “Every Model Learned by Gradient Descent Is Approximately a Kernel Machine.”
Deep neural networks are often said to discover useful representations of the data. However, this paper challenges that prevailing view and suggests that, rather than representing the data, deep neural networks store superpositions of the training data in their weights and act as kernel machines at inference time. This is a theoretical paper with a main theorem and an understandable proof, and the result leads to many interesting implications for the field.
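For a linear model the kernel-machine claim is exact rather than approximate, which makes it a good sanity check. The sketch below (illustrative, not from the paper) trains f(x) = w·x by full-batch gradient descent from zero initialization and verifies that the learned predictor equals a weighted sum of dot-product kernel evaluations against the training points:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))   # training inputs
y = rng.normal(size=20)        # training targets

lr, steps = 0.01, 200
w = np.zeros(5)
alpha = np.zeros(20)           # accumulated per-example coefficients

for _ in range(steps):
    residual = y - X @ w       # per-example error
    # GD step on squared loss: w += lr * sum_i residual_i * x_i.
    # Every update adds a linear combination of training inputs to w,
    # so w always stays in the span of the training points.
    w += lr * X.T @ residual
    alpha += lr * residual     # record those combination coefficients

x_test = rng.normal(size=5)
direct = w @ x_test                 # the trained model's prediction
kernel = alpha @ (X @ x_test)       # sum_i alpha_i * K(x_i, x_test), K = dot product
print(np.isclose(direct, kernel))   # True
```

For deep networks the paper's theorem is the analogous statement with the dot product replaced by a "path kernel" integrated along the gradient-descent trajectory, and the equality holds only approximately.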