Unlock organizational wisdom using voice-driven knowledge capture with Amazon Transcribe and Amazon Bedrock

Preserving and taking advantage of institutional knowledge is critical for organizational success and adaptability. This collective wisdom, comprising insights and experiences accumulated by employees over time, often exists as tacit knowledge passed down informally. Formalizing and documenting this invaluable resource can help organizations maintain institutional memory, drive innovation, enhance decision-making processes, and accelerate onboarding for …

Powerful infrastructure innovations for your AI-first future

The rise of generative AI has ushered in an era of unprecedented innovation, demanding increasingly complex and more powerful AI models. These advanced models necessitate high-performance infrastructure capable of efficiently scaling AI training, tuning, and inferencing workloads while optimizing for both system performance and cost effectiveness. Google Cloud has been pioneering AI infrastructure for over …

Ultra-low power neuromorphic hardware shows promise for energy-efficient AI computation

A team including researchers from Seoul National University College of Engineering has developed neuromorphic hardware capable of performing artificial intelligence (AI) computations with ultra-low power consumption. The research, published in the journal Nature Nanotechnology, addresses fundamental issues in existing intelligent semiconductor materials and devices while demonstrating potential for array-level technology.

Speculative Streaming: Fast LLM Inference Without Auxiliary Models

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) workshop at NeurIPS 2024. Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model. While effective, in application-specific settings, it often involves fine-tuning both draft and target …

Empower your generative AI application with a comprehensive custom observability solution

Recently, we’ve been witnessing the rapid development and evolution of generative AI applications, with observability and evaluation emerging as critical aspects for developers, data scientists, and stakeholders. Observability refers to the ability to understand the internal state and behavior of a system by analyzing its outputs, logs, and metrics. Evaluation, on the other hand, involves …

Gemini models are coming to GitHub Copilot

Today, we’re announcing that GitHub will make Gemini models – starting with Gemini 1.5 Pro – available to developers on its platform for the first time through a new partnership with Google Cloud. Developers value flexibility and control in choosing the best model suited to their needs — and this partnership shows that the next …