Using an Embedding Matrix on Tabular Data in R

Using an Embedding Matrix on Tabular Data in R

Here’s an interesting article on how to represent a categorical feature, with 100’s of levels, in a model in R.

In this post, we will discuss using an embedding matrix as an alternative to using one-hot encoded categorical features for in modeling. We usually find references to embedding matrices in natural language processing applications but they may also be used on tabular data. An embedding matrix replaces the spares one-hot encoded matrix with an array of vectors where each vector represents some level of the feature. Using an embedding matrix can greatly reduce the memory needed to handle the categorical features.

Frank

#DataScientist, #DataEngineer, Blogger, Vlogger, Podcaster at http://DataDriven.tv . Back @Microsoft to help customers leverage #AI Opinions mine. #武當派 fan. I blog to help you become a better data scientist/ML engineer Opinions are mine. All mine.