Microsoft Research posted this video of Sumit Gulwani, who founded the PROSE research and engineering team at Microsoft. This team develops programming-by-example (PBE) APIs and ships them through multiple Microsoft products.

PBE is a new frontier in AI wherein the computer programs itself: the user provides input-output examples and the computer synthesizes the intended script. This is significant because 99% of computer users do not know how to program. Even for programmers, it can provide a 10-100x productivity increase in many task domains.

A killer application of PBE is data cleaning and preparation, since data scientists often spend up to 80% of their time wrangling data into a form suitable for learning models or drawing insights. In this video, Sumit illustrates how a data cleaning task that took Python programmers an average of 30 minutes to finish can be performed in 30 seconds by non-programmers using the PBE paradigm. In particular, PBE can help ingest a file into tabular format, split a column to extract constituent sub-fields, derive new columns, and suggest form entries.
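
To make the input-output idea concrete, here is a minimal sketch of example-driven synthesis for the "split a column to extract a sub-field" case. It is a toy brute-force search over a tiny made-up language of programs, not the PROSE API and not the algorithm PROSE uses. The names `DELIMITERS`, `INDICES`, `CASES`, `run`, and `synthesize` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy language: split on a delimiter, pick one field by index,
# then optionally change its case. Real PBE systems search far richer spaces.
DELIMITERS = [",", " ", "-", "@", "."]
INDICES = [0, 1, 2, -1]
CASES = {"same": lambda s: s, "upper": str.upper, "lower": str.lower}


def run(program, text):
    """Apply a (delimiter, index, case) program; return None if no such field exists."""
    delimiter, index, case = program
    fields = text.split(delimiter)
    try:
        return CASES[case](fields[index])
    except IndexError:
        return None


def synthesize(examples):
    """Return the first program that reproduces every (input, output) example."""
    for program in product(DELIMITERS, INDICES, CASES):
        if all(run(program, inp) == out for inp, out in examples):
            return program
    return None  # nothing in this small language fits the examples


# Two user-supplied examples are enough to pin down "text before the @".
examples = [("ada@example.com", "ada"), ("grace@example.org", "grace")]
program = synthesize(examples)
print(program)
print(run(program, "alan@example.net"))
```

The user never writes the split logic; they only supply examples, and the search finds a program that agrees with all of them and can then be applied to new rows. Exhaustive enumeration is only workable here because the program space is tiny.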


In this talk, Bin Yu, professor at UC Berkeley, discusses the intertwining importance and connections of three principles of data science: predictability, stability, and computability.

The three principles are demonstrated in the context of two neuroscience projects and through analytical connections. In particular, the first project adds stability to predictive models used to reconstruct movies from fMRI brain signals, which makes those models more interpretable.

The second project employs predictive transfer learning and stable (manifold) deep dream images to characterize the difficult V4 neurons in the primate visual cortex. The results lend support, to a certain extent, to the resemblance of Convolutional Neural Networks (CNNs) to the primate brain.