Mixed-input matrix multiplication performance optimizations

Posted by Manish Gupta, Staff Software Engineer, Google Research. AI-driven technologies are weaving themselves into the fabric of our daily routines, with the potential to enhance our access to knowledge and boost our overall productivity. The backbone of these applications lies in large language models (LLMs). LLMs are memory-intensive and typically require specialized hardware accelerators …

Co-ML: Collaborative Machine Learning Model Building for Developing Dataset Design Practices

Machine learning (ML) models are fundamentally shaped by data, and building inclusive ML systems requires significant considerations around how to design representative datasets. Yet, few novice-oriented ML modeling tools are designed to foster hands-on learning of dataset design practices, including how to design for data diversity and inspect for data quality. To this end, we …

How does data deduplication work?

Recent years have witnessed an explosion in the number of self-storage units. These large warehouse units have sprung up nationally as a booming industry for one reason: the average person now has more possessions than they know what to do with. The same basic situation also plagues the world of IT. We’re in the midst …

Benchmark and optimize endpoint deployment in Amazon SageMaker JumpStart 

When deploying a large language model (LLM), machine learning (ML) practitioners typically care about two measurements for model serving performance: latency, defined by the time it takes to generate a single token, and throughput, defined by the number of tokens generated per second. Although a single request to the deployed endpoint would exhibit a throughput …

User-Centered Machine Learning

A New Paradigm for Computer Vision Workflows. Massive investments are being made across the DoD to develop and field Machine Learning (ML) based capabilities for intelligence and operational data sources. In particular, the application of Computer Vision (CV) models on top of overhead aerial imagery has become a key focus area for teams looking to …

Networks unchained: the shift toward intent-based autonomous operations

The telecommunications industry, a cornerstone of global connectivity, has been going through a technological renaissance for some time, driven by innovations such as 5G, IoT, cloud computing and AI. As a result, networks have become increasingly hard to manage. There is a need for automation to handle routine tasks, monitor network health and respond to issues in …

Architect defense-in-depth security for generative AI applications using the OWASP Top 10 for LLMs

Generative artificial intelligence (AI) applications built around large language models (LLMs) have demonstrated the potential to create and accelerate economic value for businesses. Examples of applications include conversational search, customer support agent assistance, customer support analytics, self-service virtual assistants, chatbots, rich media generation, content moderation, coding companions to accelerate secure, high-performance software development, deeper insights …