OptFormer: Towards Universal Hyperparameter Optimization with Transformers

Posted by Yutian Chen, Staff Research Scientist, DeepMind, and Xingyou (Richard) Song, Research Scientist, Google Research, Brain Team

One of the most important aspects in machine learning is hyperparameter optimization, as finding the right hyperparameters for a machine learning task can make or break a model’s performance. Internally, we regularly use Google Vizier as the …

Towards Helpful Robots: Grounding Language in Robotic Affordances

Posted by Brian Ichter and Karol Hausman, Research Scientists, Google Research, Brain Team

Over the last several years, we have seen significant progress in applying machine learning to robotics. However, robotic systems today are capable of executing only very short, hard-coded commands, such as “Pick up an apple,” because they tend to perform best with …

Rax: Composable Learning-to-Rank Using JAX

Posted by Rolf Jagerman and Honglei Zhuang, Software Engineers, Google Research

Ranking is a core problem across a variety of domains, such as search engines, recommendation systems, or question answering. As such, researchers often utilize learning-to-rank (LTR), a set of supervised machine learning techniques that optimize for the utility of an entire list of items …

Efficient Video-Text Learning with Iterative Co-tokenization

Posted by AJ Piergiovanni and Anelia Angelova, Research Scientists, Google Research, Brain Team

Video is a ubiquitous source of media content that touches on many aspects of people’s day-to-day lives. Increasingly, real-world video applications, such as video captioning, video content analysis, and video question-answering (VideoQA), rely on models that can connect video content with text …

Introducing the Google Universal Image Embedding Challenge

Posted by Bingyi Cao, Software Engineer, Google Research, and Mário Lipovský, Software Engineer, Google Lens

Computer vision models see daily application for a wide variety of tasks, ranging from object recognition to image-based 3D object reconstruction. One challenging type of computer vision problem is instance-level recognition (ILR): given an image of an object, the …

Building Efficient Multiple Visual Domain Models with Multi-path Neural Architecture Search

Posted by Qifei Wang, Senior Software Engineer, and Feng Yang, Senior Staff Software Engineer, Google Research

Deep learning models for visual tasks (e.g., image classification) are usually trained end-to-end with data from a single visual domain (e.g., natural images or computer-generated images). Typically, an application that completes visual tasks for multiple domains would need …

Efficient Sequence Modeling for On-Device ML

Posted by Arun Kandoor, Software Engineer, Google Research

The increasing demand for machine learning (ML) model inference on-device (for mobile devices, tablets, etc.) is driven by the rise of compute-intensive applications, the need to keep certain data on device for privacy and security reasons, and the desire to provide services when a network connection may …

Enhancing Backpropagation via Local Loss Optimization

Posted by Ehsan Amid, Research Scientist, and Rohan Anil, Principal Engineer, Google Research, Brain Team

While model design and training data are key ingredients in a deep neural network’s (DNN’s) success, less often discussed is the specific optimization method used for updating the model parameters (weights). Training DNNs involves minimizing a loss function that measures the …

Look and Talk: Natural Conversations with Google Assistant

Posted by Tuan Anh Nguyen, Staff Software Engineer, Google Assistant, and Sourish Chaudhuri, Staff Software Engineer, Google Research

In natural conversations, we don’t say people’s names every time we speak to each other. Instead, we rely on contextual signaling mechanisms to initiate conversations, and eye contact is often all it takes. Google Assistant, now available …

ML-Enhanced Code Completion Improves Developer Productivity

Posted by Maxim Tabachnyk, Staff Software Engineer, and Stoyan Nikolov, Senior Engineering Manager, Google Research

The increasing complexity of code poses a key challenge to productivity in software engineering. Code completion has been an essential tool that has helped mitigate this complexity in integrated development environments (IDEs). Conventionally, code completion suggestions are implemented with rule-based …