Traditional deep learning frameworks such as TensorFlow and PyTorch support training a single deep neural network (DNN) model, which involves iteratively computing the model's weights.

Designing a DNN model for a task remains an experimental science and is typically a practice of deep learning model exploration. Retrofitting such exploratory training into the single-model training process supported by current deep learning frameworks is unintuitive, cumbersome, and inefficient.

In this webinar, Microsoft Research Asia Senior Researcher Quanlu Zhang and Principal Program Manager Scarlett Li will analyze these challenges within the context of Neural Architecture Search (NAS).

Computational tools, with their power in big data processing and complex pattern modeling, may play an important role in helping us push the boundaries of our medical knowledge and what can be done for public health.

During the past few years, Microsoft Research Asia (MSRA) has been actively exploring opportunities to leverage artificial intelligence and computing technologies in biomedical sciences and public health. In this talk, we will share the progress and vision for computational biomedical sciences and public health across MSRA and Microsoft.

Microsoft Research highlights the research topic of knowledge distillation.

More accurate machine learning models often demand more computation and memory at test time, making them difficult to deploy on CPU- or memory-constrained devices. Knowledge distillation alleviates this burden by training a less expensive student model to mimic the expensive teacher model while maintaining most of the original accuracy.
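The student-mimics-teacher idea described above is commonly implemented as a blended loss: cross-entropy on the hard labels plus a KL-divergence term that pulls the student toward the teacher's temperature-softened predictions. A minimal NumPy sketch of that standard (Hinton-style) loss follows; the function names, the temperature `T=4.0`, and the mixing weight `alpha=0.5` are illustrative choices, not values from the talk.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer probabilities.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend cross-entropy on hard labels with KL divergence to the
    teacher's softened predictions (classic knowledge distillation)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 so its gradient magnitude
    # stays comparable to the hard-label term as T grows.
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                             - np.log(p_student + 1e-12)), axis=-1)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return np.mean(alpha * hard + (1 - alpha) * (T ** 2) * kl)
```

A student whose logits match the teacher's pays no distillation penalty, only the usual cross-entropy on the labels.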

To explain and enhance this phenomenon, we cast knowledge distillation as a semiparametric inference problem with the optimal student model as the target, the unknown Bayes class probabilities as nuisance, and the teacher probabilities as a plug-in nuisance estimate.

Microsoft Research discusses the general architecture of speech enhancement pipelines for the needs of hands-free telecommunication and distant speech recognition.

The talk will discuss both classical approaches using statistical signal processing and deep learning using neural networks. It will be illustrated with real-life examples from the speech enhancement audio pipelines in Kinect, HoloLens, and Teams.
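As a flavor of the classical statistical-signal-processing side, one of the oldest enhancement techniques is spectral subtraction: estimate the noise magnitude spectrum, subtract it from each frame's spectrum, and resynthesize with the noisy phase. The sketch below is a generic single-frame illustration of that technique, not code from any Microsoft pipeline; the spectral floor parameter is an illustrative choice.

```python
import numpy as np

def spectral_subtraction(frame, noise_mag, floor=0.05):
    """One frame of classic spectral subtraction: subtract an estimated
    noise magnitude spectrum, keep the noisy phase, resynthesize."""
    windowed = frame * np.hanning(len(frame))
    spec = np.fft.rfft(windowed)
    mag, phase = np.abs(spec), np.angle(spec)
    # The spectral floor keeps a fraction of the original magnitude,
    # limiting the "musical noise" artifacts of hard subtraction.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
```

In a real pipeline this runs frame-by-frame over an STFT with overlap-add, and the noise spectrum is tracked adaptively during speech pauses.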

Microsoft Research recently had Warren Powell speak on Sequential Decision Analytics: A Unified Framework.

Warren Powell is Professor Emeritus at Princeton University and Chief Analytics Officer of Optimal Dynamics. He’s also Founder and Director of Castle Labs at Princeton, which manages over 70 grants and contracts with government agencies and leading companies, developing models and algorithms for freight logistics, energy systems, and other industries.

He’s created a new field called sequential decision analytics which he covers in this talk and in his new book: Reinforcement Learning and Stochastic Optimization: A unified framework for sequential decisions.

Microsoft Research shares this talk on how the optimization of many deep learning hyperparameters can be formulated as a bilevel optimization problem.

While most black-box and gradient-based approaches require many independent training runs, we aim to adapt hyperparameters online as the network trains. The main challenge is to approximate the response Jacobian, which captures how the minimum of the inner objective changes as the hyperparameters are perturbed. To do this, we introduce the self-tuning network (STN), which fits a hypernetwork to approximate the best response function in the vicinity of the current hyperparameters. Differentiating through the hypernetwork lets us efficiently approximate the gradient of the validation loss with respect to the hyperparameters. We train the hypernetwork and hyperparameters jointly. Empirically, we can find hyperparameter settings competitive with Bayesian Optimization in a single run of training, and in some cases find hyperparameter schedules that outperform any fixed hyperparameter value.
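The core mechanics, an inner objective minimized for fixed hyperparameters, a response Jacobian describing how that minimizer shifts as hyperparameters move, and a validation gradient taken through the fitted best-response, can be illustrated in a toy setting. The sketch below is not the STN architecture itself: it uses ridge regression, where the inner minimizer is available in closed form, and stands in a local linear best-response model for the hypernetwork. All names and constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_tr = rng.normal(size=(40, 3)); y_tr = X_tr @ w_true + 0.1 * rng.normal(size=40)
X_val = rng.normal(size=(20, 3)); y_val = X_val @ w_true + 0.1 * rng.normal(size=20)

def best_response(lam):
    # Exact minimizer of the inner (training) objective for ridge regression.
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(3), X_tr.T @ y_tr)

def fit_local_response(lam, eps=0.05):
    """Stand-in for the hypernetwork: a linear model w(lam) = a + b*lam
    fit in the vicinity of the current hyperparameter."""
    a = best_response(lam)
    # b approximates the response Jacobian dw*/dlam.
    b = (best_response(lam + eps) - best_response(lam - eps)) / (2 * eps)
    return a, b

lam = 1.0
for _ in range(50):
    a, b = fit_local_response(lam)
    # Validation gradient through the best-response model (chain rule via b).
    r = X_val @ a - y_val
    grad = 2 * (X_val @ b) @ r / len(y_val)
    lam = max(lam - 0.1 * grad, 1e-6)
```

The STN replaces the closed-form inner solve with a hypernetwork trained jointly with the network weights, but the chain-rule structure of the hyperparameter update is the same.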

Sustaining growth in storage and computational needs is increasingly challenging thanks to those pesky laws of physics.

For over a decade, exponentially more information has been produced year after year while data storage solutions are pressed to keep up. Soon, current solutions will be unable to keep pace with the volume of new information in need of storage. Computing is on a similar trajectory, with new needs emerging in search and other domains that require more efficient systems. Innovative methods are necessary to ensure the ability to address future demands, and DNA provides an opportunity at the molecular level for ultra-dense, durable, and sustainable solutions in these areas.

In this webinar, join Microsoft researcher Karin Strauss in exploring the role of biotechnology and synthetic DNA in reaching this goal. Although we have yet to achieve scalable, general-purpose molecular computation, there are areas of IT in which a molecular approach shows growing promise. These areas include storage as well as computation.

Learn how molecules, specifically synthetic DNA, can store digital data and perform certain types of special-purpose computation.
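At its simplest, storing digital data in DNA means mapping bits onto the four nucleotides, two bits per base. The toy codec below shows that mapping only; practical schemes (such as those used in the Microsoft/UW work) additionally add error-correcting codes and avoid problematic sequences like long homopolymer runs, none of which this sketch attempts.

```python
# Two bits per nucleotide: a naive, illustrative encoding.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {v: k for k, v in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Encode bytes as a DNA strand string, 4 bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Recover the original bytes from a strand produced by encode()."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

One byte becomes four bases, so even this naive density is roughly 2 bits per nucleotide before any redundancy is added.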