
Arize, Vertex AI API: Evaluation workflows to accelerate generative app development and AI ROI

In the rapidly evolving landscape of artificial intelligence, enterprise AI engineering teams must constantly seek cutting-edge solutions to drive innovation, enhance productivity, and maintain a competitive edge. By combining an AI observability and evaluation platform like Arize AI with the advanced capabilities of Google’s suite of AI tools, enterprises looking to push the boundaries of …

Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs

Translating text that contains entity names is a challenging task, as culture-related references can vary significantly across languages. These variations may also be caused by transcreation, an adaptation process that entails more than transliteration and word-for-word translation. In this paper, we address the problem of cross-cultural translation on two fronts: (i) we introduce XC-Translate, the …
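
To make the idea concrete, here is a minimal, illustrative sketch of retrieval-augmented, entity-aware translation: a tiny in-memory mapping stands in for a multilingual knowledge graph, and the retrieved target-language entity names are injected into a translation prompt. The data, function names, and prompt format are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of retrieval-augmented, entity-aware translation.
# The in-memory dict is a stand-in for a multilingual knowledge graph
# (e.g., a Wikidata-style resource); a real system would query such a KG
# and send the resulting prompt to an actual MT model or LLM.

# Stand-in knowledge graph: entity ID -> per-language canonical names.
KG = {
    "The_Beatles": {"en": "The Beatles", "it": "I Beatles", "ja": "ビートルズ"},
    "Snow_White": {"en": "Snow White", "it": "Biancaneve", "de": "Schneewittchen"},
}

def retrieve_entity_names(entities, target_lang):
    """Look up target-language names for entities mentioned in the source text."""
    hints = {}
    for ent in entities:
        names = KG.get(ent, {})
        if target_lang in names:
            hints[names.get("en", ent)] = names[target_lang]
    return hints

def build_translation_prompt(source_text, entities, target_lang):
    """Inject KG-derived entity names into the prompt given to the translator."""
    hints = retrieve_entity_names(entities, target_lang)
    hint_lines = "\n".join(f"- '{src}' should be rendered as '{tgt}'"
                           for src, tgt in hints.items())
    return (
        f"Translate the following text into {target_lang}.\n"
        f"Use these culture-specific entity names:\n{hint_lines}\n\n"
        f"Text: {source_text}"
    )

if __name__ == "__main__":
    prompt = build_translation_prompt(
        "Snow White is a classic fairy tale.", ["Snow_White"], "de"
    )
    print(prompt)  # This prompt would then be sent to a translation model.
```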


Unlock organizational wisdom using voice-driven knowledge capture with Amazon Transcribe and Amazon Bedrock

Preserving and taking advantage of institutional knowledge is critical for organizational success and adaptability. This collective wisdom, comprising insights and experiences accumulated by employees over time, often exists as tacit knowledge passed down informally. Formalizing and documenting this invaluable resource can help organizations maintain institutional memory, drive innovation, enhance decision-making processes, and accelerate onboarding for …
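
As a rough illustration of the building blocks such a workflow relies on, the sketch below starts an Amazon Transcribe job for a recorded conversation and then asks a model on Amazon Bedrock to distill the transcript into documented knowledge. Bucket names, the job name, and the model ID are placeholders, and the post's actual architecture may differ.

```python
# A minimal sketch (not the post's full solution) of the two AWS calls involved:
# transcribe an interview recording, then summarize the transcript with Bedrock.
import json
import boto3

transcribe = boto3.client("transcribe")
bedrock = boto3.client("bedrock-runtime")

# 1) Kick off an asynchronous transcription job for an audio file in S3.
transcribe.start_transcription_job(
    TranscriptionJobName="knowledge-capture-demo",                # placeholder
    Media={"MediaFileUri": "s3://example-bucket/interview.mp3"},  # placeholder
    MediaFormat="mp3",
    LanguageCode="en-US",
    OutputBucketName="example-transcripts-bucket",                # placeholder
)

# 2) Once the transcript is available, distill it with a Bedrock model.
transcript_text = "…full transcript text loaded from the output bucket…"
response = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",            # placeholder model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize the key institutional knowledge in this "
                             f"interview transcript:\n\n{transcript_text}"}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```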


Powerful infrastructure innovations for your AI-first future

The rise of generative AI has ushered in an era of unprecedented innovation, demanding increasingly complex and powerful AI models. These advanced models necessitate high-performance infrastructure capable of efficiently scaling AI training, tuning, and inferencing workloads while optimizing for both system performance and cost effectiveness. Google Cloud has been pioneering AI infrastructure for over …

Speculative Streaming: Fast LLM Inference Without Auxiliary Models

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) workshop at NeurIPS 2024. Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model. While effective, in application-specific settings, it often involves fine-tuning both draft and target …
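
For context, the sketch below shows the classical, greedy form of speculative decoding that relies on an auxiliary draft model, which is exactly the dependency Speculative Streaming aims to remove. The two model functions are toy stand-ins, and a real implementation would verify all drafted positions in a single forward pass of the target model rather than one call per token.

```python
# Toy, greedy sketch of classical speculative decoding with an auxiliary draft model.
# `draft_next` and `target_next` are illustrative stand-ins, not real models.

def draft_next(tokens):
    """Cheap draft model: stand-in that follows a simple pattern."""
    return (tokens[-1] + 1) % 50

def target_next(tokens):
    """Expensive target model: stand-in that occasionally disagrees with the draft."""
    return (tokens[-1] + 1) % 50 if tokens[-1] % 7 else 0

def speculative_step(tokens, k=4):
    """Draft k tokens, then keep the longest prefix the target model agrees with."""
    draft, ctx = [], list(tokens)
    for _ in range(k):                      # draft model proposes k tokens
        t = draft_next(ctx)
        draft.append(t)
        ctx.append(t)

    accepted, ctx = [], list(tokens)
    for t in draft:                         # target verifies the proposals
        if target_next(ctx) == t:           # greedy acceptance rule
            accepted.append(t)
            ctx.append(t)
        else:
            break
    accepted.append(target_next(ctx))       # target supplies the next token itself
    return tokens + accepted

if __name__ == "__main__":
    seq = [1]
    for _ in range(5):
        seq = speculative_step(seq)
    print(seq)
```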


Empower your generative AI application with a comprehensive custom observability solution

Recently, we’ve been witnessing the rapid development and evolution of generative AI applications, with observability and evaluation emerging as critical aspects for developers, data scientists, and stakeholders. Observability refers to the ability to understand the internal state and behavior of a system by analyzing its outputs, logs, and metrics. Evaluation, on the other hand, involves …
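
As a minimal illustration of the observability side, the sketch below wraps a model call in a decorator that records inputs, outputs, latency, and errors as structured log records. The function and field names are illustrative, not any particular product's API.

```python
# Minimal tracing sketch: emit one structured log record per LLM call.
import functools
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("genai-observability")

def traced(fn):
    """Wrap a callable and log its inputs, output, status, and latency."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {"name": fn.__name__, "input": {"args": args, "kwargs": kwargs}}
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            record["output"] = result
            record["status"] = "ok"
            return result
        except Exception as exc:
            record["status"] = "error"
            record["error"] = repr(exc)
            raise
        finally:
            record["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
            logger.info(json.dumps(record, default=str))
    return wrapper

@traced
def generate_answer(question: str) -> str:
    """Stand-in for a real model call."""
    return f"(model answer to: {question})"

if __name__ == "__main__":
    generate_answer("What is observability?")
```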


Gemini models are coming to GitHub Copilot

Today, we’re announcing that GitHub will make Gemini models – starting with Gemini 1.5 Pro – available to developers on its platform for the first time through a new partnership with Google Cloud. Developers value flexibility and control in choosing the best model suited to their needs — and this partnership shows that the next …

Promoting Cross-Modal Representations to Improve Multimodal Foundation Models for Physiological Signals

Many healthcare applications are inherently multimodal, involving several physiological signals. As sensors for these signals become more common, improving machine learning methods for multimodal healthcare data is crucial. Pretraining foundation models is a promising avenue for success. However, methods for developing foundation models in healthcare are still in the early stages of exploration, and it is unclear which …


Import data from Google Cloud Platform BigQuery for no-code machine learning with Amazon SageMaker Canvas

In the modern, cloud-centric business landscape, data is often scattered across numerous clouds and on-site systems. This fragmentation can complicate efforts by organizations to consolidate and analyze data for their machine learning (ML) initiatives. This post presents an architectural approach to extract data from different cloud environments, such as Google Cloud Platform (GCP) BigQuery, without …
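
As a simplified sketch of one such extraction path, the snippet below queries BigQuery with the Google Cloud client library, writes the result to CSV, and uploads it to Amazon S3, where SageMaker Canvas can import it. Project, dataset, bucket, and file names are placeholders, and the post's actual architecture may differ.

```python
# Minimal sketch: move a BigQuery query result into S3 for import by Canvas.
import boto3
from google.cloud import bigquery

# 1) Pull the data out of BigQuery (assumes GCP credentials are configured).
bq_client = bigquery.Client(project="example-gcp-project")           # placeholder
df = bq_client.query(
    "SELECT * FROM `example-gcp-project.sales.orders` LIMIT 10000"   # placeholder
).to_dataframe()

# 2) Write it locally and upload to an S3 bucket that Canvas can read.
local_path = "orders.csv"
df.to_csv(local_path, index=False)

s3 = boto3.client("s3")
s3.upload_file(local_path, "example-canvas-bucket", "imports/orders.csv")  # placeholder
# In SageMaker Canvas, the dataset can then be imported from that S3 location.
```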