Evaluating speech synthesis in many languages with SQuId

Posted by Thibault Sellam, Research Scientist, Google

Previously, we presented the 1,000 languages initiative and the Universal Speech Model with the goal of making speech and language technologies available to billions of users around the world. Part of this commitment involves developing high-quality speech synthesis technologies, which build upon projects such as VDTTS and AudioLM, …

Accelerate PyTorch with DeepSpeed to train large language models with Intel Habana Gaudi-based DL1 EC2 instances

Training large language models (LLMs) with billions of parameters can be challenging. In addition to designing the model architecture, researchers need to set up state-of-the-art training techniques for distributed training like mixed precision support, gradient accumulation, and checkpointing. With large models, the training setup is even more challenging because the available memory in a single …

Document AI: Understanding invoices to passports and beyond

Editor’s note: In this post, I’ll be showing some amazing ways Document AI can help you extract meaning from your documents – keep reading, or jump directly into a tutorial using the Cloud Console! Documents are a crucial part of most businesses, used to store and communicate important information. The variety is vast: invoices, contracts, …

Taking AI to School: A Conversation With MIT’s Anant Agarwal

In the latest episode of NVIDIA’s AI Podcast, Anant Agarwal, founder of edX and chief platform officer at 2U, shared his vision for the future of online education and how AI is revolutionizing the learning experience. Agarwal, a strong advocate for massive open online courses, or MOOCs, discussed the importance of accessibility and quality in …

Flash Sale Ending Today! Don’t Miss Out on AI & Chatbot Certified Workshops!

Today is your last chance to grab exclusive discounts and secure your spot in this epic event! Don’t let it slip away! Why should you join us? Here’s the scoop: Supercharge Your Skills: Join our workshops and level up your AI game! Learn the latest techniques and tools straight from the experts. Stay ahead of the pack, …

Efficient Multimodal Neural Networks for Trigger-less Voice Assistants

The adoption of multimodal interactions by Voice Assistants (VAs) is growing rapidly to enhance human-computer interactions. Smartwatches have now incorporated trigger-less methods of invoking VAs, such as Raise To Speak (RTS), where the user raises their watch and speaks to VAs without an explicit trigger. Current state-of-the-art RTS systems rely on heuristics and engineered Finite …

Visual captions: Using large language models to augment video conferences with dynamic visuals

Posted by Ruofei Du, Research Scientist, and Alex Olwal, Senior Staff Research Scientist, Google Augmented Reality

Recent advances in video conferencing have significantly improved remote video communication through features like live captioning and noise cancellation. However, there are various situations where dynamic visual augmentation would be useful to better convey complex and nuanced information. For …

Build high-performance ML models using PyTorch 2.0 on AWS – Part 1

PyTorch is a machine learning (ML) framework that is widely used by AWS customers for a variety of applications, such as computer vision, natural language processing, content creation, and more. With the recent PyTorch 2.0 release, AWS customers can now do the same things as they could with PyTorch 1.x, but faster and at scale with …

Climate Cardinals: Bridging the climate information gap with AI-powered translations

Editor’s note: On a trip to visit family in Iran, Sophia Kianni made an alarming observation. Despite facing the disproportionate impacts of climate change, and with temperatures in the Middle East rising twice as fast as the global average, her relatives knew almost nothing about the world’s environmental challenges. When she realized that scientific literature …

Fish-Farming Startup Casts AI to Make Aquaculture More Efficient, Sustainable

As a marine biology student, Josef Melchner always dreamed of spending his days cruising the oceans to find dolphins, whales and fish — but also “wanted to do something practical, something that would benefit the world,” he said. When it came time to choose a career, he dove head first into aquaculture. He’s now CEO …