#### Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)

This video is from Yannic Kilcher. Retention is an alternative to Attention in Transformers that can be written both in a parallel and in a recurrent form. This means the architecture achieves training parallelism while maintaining low-cost inference. The paper's experiments look very promising. Paper: https://arxiv.org/abs/2307.08621

#### Are Retentive Networks A Successor to Transformer for Large Language Models?

Retention is an alternative to Attention in Transformers that can be written both in a parallel and in a recurrent form. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising. Yannic Kilcher elaborates.
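The parallel/recurrent duality can be sketched in a few lines of numpy: the parallel form computes all positions at once with a causal decay mask, while the recurrent form carries a single state matrix forward. This is a minimal single-head sketch (the decay value and dimensions are arbitrary illustrations, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4          # sequence length, head dimension (illustrative)
gamma = 0.9          # per-head decay factor (illustrative value)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# Parallel form: O = (Q K^T * D) V, where D[n, m] = gamma^(n-m) for n >= m, else 0.
n, m = np.arange(T)[:, None], np.arange(T)[None, :]
D = np.where(n >= m, gamma ** (n - m), 0.0)
O_parallel = (Q @ K.T * D) @ V

# Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n,  o_n = q_n S_n.
S = np.zeros((d, d))
O_recurrent = np.zeros((T, d))
for t in range(T):
    S = gamma * S + np.outer(K[t], V[t])   # constant-size state update
    O_recurrent[t] = Q[t] @ S

assert np.allclose(O_parallel, O_recurrent)
```

Both forms produce identical outputs; the parallel one is used for training, while the recurrent one gives O(1)-per-token inference since the state `S` has fixed size regardless of sequence length.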

#### Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

Yannic Kilcher explains this paper, which promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and at its strengths and weaknesses.

#### OpenAssistant RELEASED! The world’s best open-source Chat AI!

- Frank
- April 19, 2023
- AI
- Alpaca
- artificial intelligence
- Arxiv
- Deep Learning
- deep learning tutorial
- dolly
- explained
- huggingface
- laion
- laion chatgpt
- LLaMA
- Machine Learning
- Neural Networks
- open source AI
- open source chatgpt
- open source gpt
- open source gpt 4
- open source intelligence
- Paper
- pythia
- vicuna
- what is deep learning
- yannic chatgpt

This video is from Yannic Kilcher.

#### GPT-4 is here! What we know so far (Full Analysis)

Yannic Kilcher provides this in-depth analysis of GPT-4.

#### ChatGPT: This AI has a JAILBREAK?!

- Frank
- December 9, 2022
- AI
- AI news
- artificial intelligence
- Arxiv
- chat GPT
- chatGPT
- chatgpt jailbreak
- Deep Learning
- deep learning tutorial
- explained
- gpt 3 chatbot
- gpt-3 chatbot
- gpt-4
- Machine Learning
- ml news
- mlnews
- Neural Networks
- openai chat gpt
- openai chatbot
- openai chatbot gpt
- Paper
- what is deep learning

Yannic explores ChatGPT and discovers that it has a jailbreak?! ChatGPT, OpenAI’s newest model, is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm!

#### This is a game changer! (AlphaTensor by DeepMind explained)

- Frank
- October 19, 2022
- AI
- ai matrix multiplication
- alpha tensor
- alpha tensor explained
- alpha zero
- alphatensor explained
- AlphaZero
- alphazero math
- artificial intelligence
- Arxiv
- Deep Learning
- deep learning tutorial
- Deep Mind
- DeepMind
- deepmind alphatensor
- deepmind math
- explained
- google deep mind
- google deepmind
- introduction to deep learning
- Machine Learning
- matrix multiplication
- matrix multiplication reinforcement learning
- Neural Networks
- Paper
- what is deep learning

Matrix multiplication is the most-used mathematical operation in all of science and engineering, so speeding it up has massive consequences, and over the years the operation has become more and more optimized. A fascinating discovery showed that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. […]
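The classic illustration of beating N^3 is Strassen's 1969 algorithm, which multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8 (AlphaTensor searches for decompositions of this kind automatically). A minimal sketch of the 2x2 base case:

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    (Strassen, 1969) instead of the naive 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    # Recombine the 7 products with additions only.
    return np.array([[p5 + p4 - p2 + p6, p1 + p2],
                     [p3 + p4,           p1 + p5 - p3 - p7]])

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.allclose(strassen_2x2(A, B), A @ B)
```

Applied recursively to block matrices, this yields an O(N^log2(7)) ≈ O(N^2.807) algorithm; AlphaTensor finds even smaller multiplication counts for some matrix sizes.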

#### How to make your CPU as fast as a GPU – Advances in Sparsity w/ Nir Shavit

Sparsity is awesome, but only recently has it become possible to run sparse models efficiently. Neural Magic does exactly this, using a plain CPU. No specialized hardware is needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in […]
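To make the pruning idea concrete, here is a minimal sketch of unstructured magnitude pruning, the simplest technique in this family: zero out the smallest-magnitude weights of a layer. This is an illustrative example, not Neural Magic's actual algorithm, and the weight matrix and sparsity level are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))  # hypothetical dense layer weights

def magnitude_prune(W, sparsity=0.9):
    """Zero out the fraction `sparsity` of smallest-magnitude weights
    (unstructured magnitude pruning)."""
    k = int(W.size * sparsity)
    threshold = np.partition(np.abs(W).ravel(), k)[k]
    return np.where(np.abs(W) < threshold, 0.0, W)

W_sparse = magnitude_prune(W, 0.9)
print(f"fraction of zero weights: {np.mean(W_sparse == 0):.2f}")
```

A sparse forward pass then only touches the surviving weights, which is what lets a CPU skip roughly 90% of the multiply-adds here, provided the kernel exploits the sparsity pattern.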
