Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

With Vertex AI Model Garden, Google Cloud strives to deliver highly efficient and cost-optimized ML workflow recipes. Currently, it offers a selection of more than 150 first-party, open, and third-party foundation models. Last year, we introduced vLLM, the popular open-source LLM serving stack, on GPUs in Vertex Model Garden. Since then, we have witnessed …

Introduction to AutoML: Automating Machine Learning Workflows

AutoML is a tool designed for both technical and non-technical users. It simplifies the process of training machine learning models: you provide the dataset, and it returns the best-performing model it finds for your use case. You don’t have to code for long hours or …

Amazon SageMaker inference launches faster auto scaling for generative AI models

Today, we are excited to announce a new capability in Amazon SageMaker inference that can help you reduce the time it takes for your generative artificial intelligence (AI) models to scale automatically. You can now use sub-minute metrics and significantly reduce overall scaling latency for generative AI models. With this enhancement, you can improve the …

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

Leveraging enterprise data for generative AI and large language models (LLMs) presents significant challenges related to data silos, quality inconsistencies, privacy and security concerns, compliance with data regulations, capturing domain-specific knowledge, and mitigating inherent biases. Organizations must navigate the complexities of consolidating fragmented data sources, ensuring data integrity, and addressing ethical considerations. Techniques like retrieval …