Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models

Speech foundation models, such as HuBERT and its variants, are pre-trained on large amounts of unlabeled speech data and then used for a range of downstream tasks. These models use a masked prediction objective, where the model learns to predict information about masked input segments from the unmasked context. The choice of prediction targets in …

How GoDaddy built a category generation system at scale with batch inference for Amazon Bedrock

This post was co-written with Vishal Singh, Data Engineering Leader at the Data & Analytics team of GoDaddy. Generative AI solutions have the potential to transform businesses by boosting productivity and improving customer experiences, and using large language models (LLMs) in these solutions has become increasingly popular. However, inference of LLMs as single model invocations or …

10 months to innovation: Definity’s leap to data agility with BigQuery and Vertex AI

At Definity, a leading Canadian P&C insurer with a history spanning over 150 years, we have a long tradition of innovating to help our customers and communities adapt and thrive. To stay ahead in our rapidly evolving industry, we knew a unified data foundation was key to realizing the business and customer experience opportunities offered …

Exploring creative possibilities: A visual guide to Amazon Nova Canvas

Compelling AI-generated images start with well-crafted prompts. In this follow-up to our Amazon Nova Canvas Prompt Engineering Guide, we showcase a curated gallery of visuals generated by Nova Canvas—categorized by real-world use cases—from marketing and product visualization to concept art and design exploration. Each image is paired with the prompt and parameters that generated it, …

An Efficient and Streaming Audio Visual Active Speaker Detection System

This paper delves into the challenging task of Active Speaker Detection (ASD), where the system needs to determine in real-time whether a person is speaking or not in a series of video frames. While previous works have made significant strides in improving network architectures and learning effective representations for ASD, a critical gap exists in …

Benchmarking Amazon Nova and GPT-4o models with FloTorch

Based on the original post by Dr. Hemant Joshi, CTO, FloTorch.ai. A recent evaluation conducted by FloTorch compared the performance of Amazon Nova models with OpenAI’s GPT-4o. Amazon Nova is a new generation of state-of-the-art foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance. The Amazon Nova family of models includes Amazon Nova Micro, Amazon …

How Google Cloud measures its climate impact through Life Cycle Assessment (LCA)

As AI creates opportunities for business growth and societal benefits, we’re working to reduce its carbon intensity through efforts like optimizing software, improving hardware efficiency, and supporting our operations with carbon-free energy. At Google, we’re committed to understanding the entirety of our environmental impact so we can apply the best, boldest, and most holistic solutions. …

Transforming financial analysis with CreditAI on Amazon Bedrock: Octus’s journey with AWS

Investment professionals face the mounting challenge of processing vast amounts of data to make timely, informed decisions. The traditional approach of manually sifting through countless research documents, industry reports, and financial statements is not only time-consuming but can also lead to missed opportunities and incomplete analysis. This challenge is particularly acute in credit markets, where …