Deep Networks Are Kernel Machines

Yannic Kilcher explains the paper “Every Model Learned by Gradient Descent Is Approximately a Kernel Machine.”

Deep Neural Networks are often said to discover useful representations of the data. However, this paper challenges this prevailing view and suggest that rather than representing the data, deep neural networks store superpositions of the training data in their weights and act as kernel machines at inference time. This is a theoretical paper with a main theorem and an understandable proof and the result leads to many interesting implications for the field.


#DataScientist, #DataEngineer, Blogger, Vlogger, Podcaster at . Back @Microsoft to help customers leverage #AI Opinions mine. #武當派 fan. I blog to help you become a better data scientist/ML engineer Opinions are mine. All mine.