Reduce inference time for BERT models using neural architecture search and SageMaker Automated Model Tuning

In this post, we demonstrate how to use neural architecture search (NAS) based structural pruning to compress a fine-tuned BERT model to improve model performance and reduce inference times. Pre-trained language models (PLMs) are undergoing rapid commercial and enterprise adoption in the areas of productivity tools, customer service, search and recommendations, business process automation, and …

Figuring out microservices running on your GKE cluster with help from Duet AI

If you’ve joined a new team recently like I have, you’ve probably had a lot of questions. The answers to those questions may or may not be easy to find, and might rely heavily on the generosity and spare time of your teammates. Let’s say you’re a DevRel engineer, working with Google Kubernetes …

Unlocking the power of chatbots: Key benefits for businesses and customers

Chatbots can help your customers and potential clients find or input information quickly by instantly responding to requests that use audio input, text input or a combination of both, eliminating the need for human intervention or manual research. Chatbots are everywhere, providing customer care support and assisting employees who use smart speakers at home, SMS, …

Introducing ASPIRE for selective prediction in LLMs

Posted by Jiefeng Chen, Student Researcher, and Jinsung Yoon, Research Scientist, Cloud AI Team

In the fast-evolving landscape of artificial intelligence, large language models (LLMs) have revolutionized the way we interact with machines, pushing the boundaries of natural language understanding and generation to unprecedented heights. Yet, the leap into high-stakes decision-making applications remains a chasm …

AlloyDB AI powers gen AI applications with seamless Vertex AI integration

At Next ’23, we launched AlloyDB AI, an integrated set of capabilities built into AlloyDB for building generative AI applications. One of those capabilities allows you to call a Vertex AI model directly from the database using SQL. AlloyDB is a fully managed PostgreSQL-compatible database that offers superior performance, availability and scale. In our performance …