Research

AI Research

Three Explorations on Pre-Training: an Analysis, an Approach, and an Architecture

In this talk, Xinlei Chen of Facebook AI Research covers three recent explorations on pre-training. First is an analysis of object/attribute detection pre-training, which produces the bottom-up attention features used extensively in vision and language research. The main finding is that plain grid features can work equally well without object proposals, while being significantly faster. […]

Read More
Computer Vision Natural Language Processing Research

Tightly Connecting Vision and Language

Remarkable progress has been made at the intersection of vision and language. While showing great promise, current vision and language models may only weakly “connect” the two modalities and often fail in the wild. In this talk, Google’s Soravit Changpinyo will present recent efforts aiming to bridge this gap along two dimensions: informativeness and controllability. […]

Read More
AI Research

DeepMind’s AI Plays Catch and So Much More!

Two Minute Papers examines the paper “Open-Ended Learning Leads to Generally Capable Agents.”

Read More
AI Interesting Research

PonderNet: Learning to Ponder

Humans don’t spend the same amount of mental effort on all problems. Instead, we respond quickly to easy tasks and take our time to deliberate on hard ones. DeepMind’s PonderNet attempts to achieve the same by dynamically deciding how many computation steps to allocate to any single input sample. This is done via a […]

Read More
AI Research

New AI Research Work Fixes Choppy Videos

Can AI be used to improve the frame rate of video footage? Recent research points to an emphatic yes. Two Minute Papers explores the paper “Time Lens: Event-based Video Frame Interpolation.”

Read More
AI Interesting Research

AlphaFold and the 50-year challenge to solve protein folding

Arxiv Insights explores what AlphaFold means for medicine, pharmacology, and science.

Read More
Computer Vision Research

Recent Advances in Image Captioning and Image-Text Retrieval

Take a look at recent advances in Image Captioning, Image-Text Retrieval and Visual Question Answering using Scene Graph Parsing.

Read More
AI Research

Introducing Retiarii: A deep learning exploratory-training framework on NNI

Traditional deep learning frameworks such as TensorFlow and PyTorch support training on a single deep neural network (DNN) model, which involves computing the weights iteratively for the DNN model. Designing a DNN model for a task remains an experimental science and is typically a practice of deep learning model exploration. Retrofitting such exploratory-training into the […]

Read More
Computer Vision Research Virtual Reality

Synthetic Data with Digital Humans

Microsoft Research posted this video in which Erroll Wood and Tadas Baltrusaitis discuss how synthetic data drives work on understanding human faces and hands, including how it powers Fully Articulated Hand Tracking on HoloLens 2.

Read More
AI Research

SOLOIST: Building Task Bots at Scale

In this video tutorial by Microsoft Research, researchers demonstrate how a pretrained grounded text generation (GTG) model can be finetuned and adapted to a specific task using Conversation Learner, a pretraining, finetuning, and machine teaching framework for building task bots at scale.

Read More