A Subset-Based Strategy for Faster AutoML

Teddy Lazebnik, CTO@DataClue, recently presented this talk at PyData Tel Aviv.

Automated machine learning (AutoML) frameworks have become important tools in the data scientist's arsenal. However, when the dataset is large, overall AutoML running times grow increasingly long. In this lecture, we present an AutoML optimization strategy that tackles the data size rather than the configuration space: it wraps existing AutoML tools and, instead of executing them directly on the entire dataset, finds a small yet representative data subset to work with.
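To make the wrapping idea concrete, here is a minimal sketch in Python. It is not the method from the talk: the abstract does not say how the representative subset is found, so this example swaps in plain stratified random sampling as a stand-in. The names `stratified_subset`, `run_on_subset`, and the `automl_fit` callable are hypothetical and stand for whatever AutoML tool is being wrapped.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset that keeps the class proportions.

    Assumption: stratified random sampling is used here only as a simple
    stand-in for "small yet representative"; the talk's actual selection
    procedure is not described in the abstract.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)

    chosen = []
    for idxs in by_label.values():
        # Keep at least one example per class so no class disappears.
        k = max(1, round(len(idxs) * fraction))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)


def run_on_subset(automl_fit, rows, labels, fraction=0.1, seed=0):
    """Wrap a hypothetical AutoML entry point and feed it only the subset.

    `automl_fit` is any callable taking (rows, labels); it is a placeholder,
    not the API of a specific AutoML library.
    """
    idx = stratified_subset(labels, fraction, seed)
    return automl_fit([rows[i] for i in idx], [labels[i] for i in idx])
```

The wrapper leaves the AutoML tool untouched and only changes what data it sees, so the same pattern could in principle sit in front of any framework that accepts rows and labels. How well the result transfers to the full dataset depends on how representative the subset really is, which is the part this sketch simplifies.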

