Advanced audio dialog and generation with Gemini 2.5
Gemini 2.5 has new capabilities in AI-powered audio dialog and generation.
Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the best strategy to decide which cross-lingual data to include is unclear. Prior research often focuses on a small set of languages from a few language families or a single task. It is still …
We’ve witnessed remarkable advances in model capabilities as generative AI companies have invested in developing their offerings. Language models such as Anthropic’s Claude Opus 4 & Sonnet 4 and Amazon Nova on Amazon Bedrock can reason, write, and generate responses with increasing sophistication. But even as these models grow more powerful, they can only work …
Read more “Unlocking the power of Model Context Protocol (MCP) on AWS”
Many organizations in regulated industries and the public sector that want to start using generative AI face significant challenges in adopting cloud-based AI solutions due to stringent regulatory mandates, sovereignty requirements, the need for low-latency processing, and the sheer scale of their on-premises data. Together, these can all present institutional blockers to AI adoption, and …
Read more “Emulating the air-gapped experience: GDC Sandbox is now generally available”
*Equal Contributors Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR inaccurately transcribes verbatim speech. To improve on current methods for reading error annotation, we propose a novel end-to-end architecture that incorporates the …
Read more “Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection”
These days, it is increasingly common for companies to adopt an AI-first strategy to stay competitive and become more efficient. As generative AI adoption grows, the technology’s ability to solve problems is also improving (one example is the use case of generating a comprehensive market report). One way to simplify the growing complexity of problems to be solved …
Read more “Build GraphRAG applications using Amazon Bedrock Knowledge Bases”
Gemini 2.5 Pro continues to be loved by developers as the best model for coding, and 2.5 Flash is getting even better with a new update. We’re bringing new capabilities to our models, including Deep Think, an experimental enhanced reasoning mode for 2.5 Pro.
Amazon SageMaker Projects empower data scientists to self-serve Amazon Web Services (AWS) tooling and infrastructure to organize all entities of the machine learning (ML) lifecycle, and further enable organizations to standardize and constrain the resources available to their data science teams in pre-packaged templates. For AWS customers using Terraform to define and manage their infrastructure-as-code (IaC), …
Read more “Deploy Amazon SageMaker Projects with Terraform Cloud”
Welcome to the second Cloud CISO Perspectives for May 2025. Today, Enrique Alvarez, public sector advisor, Office of the CISO, explores how government agencies can use AI to improve threat detection — and save money at the same time. As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google …
We’re extending Gemini to become a world model that can make plans and imagine new experiences by simulating aspects of the world.