Introduction to Deep Learning

Natural Language Processing Research

LLaMA: Open and Efficient Foundation Language Models (Paper Explained)

Large Language Models (LLMs) are all the rage right now. ChatGPT is the LLM everyone talks about, but there are others. With the attention (and money) that OpenAI is getting, expect more of them. LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer […]

Read More
AI Mathematics

This is a game changer! (AlphaTensor by DeepMind explained)

Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. […]
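The classic example of beating N^3 is Strassen's 1969 algorithm, which multiplies 2x2 (block) matrices with 7 multiplications instead of 8; applied recursively to blocks, this yields roughly O(N^2.81) work. Here is a minimal sketch of Strassen's scheme (note: this is the classic construction, not one of the new algorithms AlphaTensor discovered):

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications (Strassen, 1969)
    instead of the naive 8. Applied recursively to matrix blocks, this
    gives an O(N^2.81) matrix-multiplication algorithm."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.allclose(strassen_2x2(A, B), A @ B)  # matches the naive product
```

AlphaTensor's contribution is finding many such low-multiplication decompositions automatically, including some better than anything previously known for certain matrix sizes.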

Read More
AI Hardware Research

How to make your CPU as fast as a GPU – Advances in Sparsity w/ Nir Shavit

Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in […]
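For a concrete taste of the idea (a minimal sketch, not Neural Magic's actual engine): magnitude pruning zeroes out the smallest weights, and a sparse storage format then lets the CPU skip those zeros entirely during the forward pass.

```python
import numpy as np
from scipy.sparse import csr_matrix

def magnitude_prune(W, sparsity=0.9):
    """Zero out the smallest-magnitude weights, keeping the
    largest (1 - sparsity) fraction."""
    k = int(W.size * sparsity)
    threshold = np.partition(np.abs(W).ravel(), k)[k]
    return np.where(np.abs(W) >= threshold, W, 0.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(1024, 1024))
W_sparse = csr_matrix(magnitude_prune(W, sparsity=0.9))  # CSR stores only nonzeros
x = rng.normal(size=1024)
y = W_sparse @ x  # ~10% of the multiply-accumulates of the dense product
```

At 90% sparsity, the sparse matrix-vector product touches roughly a tenth of the weights; clever cache-aware scheduling of exactly this kind of computation is what makes CPU inference competitive.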

Read More
Generative AI

[ML News] Stable Diffusion Takes Over! (Open Source AI Art)

Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this — especially artists!

Read More
AI, Generative AI

Parti – Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Parti is a new autoregressive text-to-image model that shows just how much scale can achieve. This model’s outputs are crisp, accurate, and realistic, and it can combine arbitrary styles and concepts and fulfill even challenging requests. Yannic explains the research paper. Timestamps: 0:00 – Introduction 2:40 – Example Outputs 6:00 – Model Architecture 17:15 – Datasets (incl. […]
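"Autoregressive" here means the image is produced as a sequence of discrete tokens, sampled one at a time exactly like words from a language model. A rough sketch of that sampling loop, where `decoder`, `text_embedding`, and `vqgan_decode` are hypothetical stand-ins for Parti's actual modules (Parti tokenizes images with a ViT-VQGAN and conditions a transformer decoder on the encoded prompt):

```python
import torch

def generate_image_tokens(decoder, text_embedding, num_tokens=1024, temperature=1.0):
    """Sample discrete image tokens one at a time, each conditioned on the
    text prompt and on all previously generated tokens (autoregression)."""
    tokens = torch.empty(0, dtype=torch.long)
    for _ in range(num_tokens):
        logits = decoder(text_embedding, tokens)          # logits over the token vocabulary
        probs = torch.softmax(logits / temperature, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        tokens = torch.cat([tokens, next_token])
    return tokens  # a detokenizer such as vqgan_decode(tokens) maps these back to pixels
```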

Read More
AI Research

Why AI is Harder Than We Think

Yannic Kilcher explains how the AI community has gone through regular cycles of AI Springs, in which rapid progress gave rise to massive overconfidence, high funding, and overpromising, followed by those promises going unfulfilled and the field sliding into periods of disillusionment and underfunding called AI Winters. In this video he explores a paper which examines the reasons for […]

Read More
AI Research

GLOM: How to represent part-whole hierarchies in a neural network (Geoff Hinton’s Paper Explained)

Yannic Kilcher covers a paper where Geoffrey Hinton describes GLOM, a Computer Vision model that combines transformers, neural fields, contrastive learning, capsule networks, denoising autoencoders and RNNs. GLOM decomposes an image into a parse tree of objects and their parts. However, unlike previous systems, the parse tree is constructed dynamically and differently for each input, […]
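The heart of GLOM is its update rule: every image location keeps an embedding at each level of the part-whole hierarchy, and each embedding is repeatedly nudged toward agreement with its neighbors. A rough sketch of one such step, assuming identity stand-ins for the paper's learned bottom-up and top-down networks:

```python
import numpy as np

def glom_step(levels, bottom_up, top_down, alpha=0.25):
    """One sketch of GLOM's update: the embedding at each (location, level)
    becomes a weighted average of four signals: its previous state, a
    bottom-up prediction from the level below, a top-down prediction from
    the level above, and an attention-weighted mean of the same level at
    other locations. The attention term pulls similar columns together,
    forming "islands" of agreement that act as parse-tree nodes.

    levels: array of shape (num_locations, num_levels, dim)
    bottom_up, top_down: per-level prediction functions (neural nets in
    the paper; identity stand-ins suffice to run this sketch).
    """
    L, K, D = levels.shape
    new = levels.copy()
    for k in range(K):
        below = bottom_up(levels[:, k - 1]) if k > 0 else levels[:, k]
        above = top_down(levels[:, k + 1]) if k < K - 1 else levels[:, k]
        # Same-level attention: locations with similar embeddings pull together.
        sim = levels[:, k] @ levels[:, k].T                      # (L, L) similarities
        attn = np.exp(sim - sim.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)
        consensus = attn @ levels[:, k]
        new[:, k] = alpha * (levels[:, k] + below + above + consensus)
    return new

# Example: 16 locations, 4 hierarchy levels, 8-dim embeddings.
x = np.random.default_rng(0).normal(size=(16, 4, 8))
x = glom_step(x, bottom_up=lambda v: v, top_down=lambda v: v)
```

Iterating this step is what lets the parse tree emerge dynamically per input, rather than being hard-wired into the architecture.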

Read More
AI Robotics

Efficient Computing for Deep Learning, Robotics, and AI

Lex Fridman shared this lecture by Vivienne Sze in January 2020 as part of the MIT Deep Learning Lecture Series.
Website: https://deeplearning.mit.edu
Slides: http://bit.ly/2Rm7Gi1
Playlist: http://bit.ly/deep-learning-playlist
LECTURE LINKS:
Twitter: https://twitter.com/eems_mit
YouTube: https://www.youtube.com/channel/UC8cviSAQrtD8IpzXdE6dyug
MIT professional course: http://bit.ly/36ncGam
NeurIPS 2019 tutorial: http://bit.ly/2RhVleO
Tutorial and survey paper: https://arxiv.org/abs/1703.09039
Book coming out in Spring 2020!
OUTLINE: 0:00 – Introduction […]

Read More