Enhancing data security and compliance in the XaaS Era 

Recent research from IDC found that 85% of CEOs who were surveyed cited digital capabilities as strategic differentiators that are crucial to accelerating revenue growth. However, IT decision makers remain concerned about the risks associated with their digital infrastructure and the impact they might have on business outcomes, with data breaches and security concerns being …

How LotteON built a personalized recommendation system using Amazon SageMaker and MLOps

This post is co-written with HyeKyung Yang, Jieun Lim, and SeungBum Shim from LotteON. LotteON aims to be a platform that not only sells products, but also provides a personalized recommendation experience tailored to your preferred lifestyle. LotteON operates various specialty stores, including fashion, beauty, luxury, and kids, and strives to provide a personalized shopping …

To tune or not to tune? A guide to leveraging your data with LLMs

Customers tell us they see great potential in using large language models (LLMs) with their data to improve customer experiences, automate internal processes, access and find information, and create new content, to name just a few of the emerging generative AI use cases. There are many ways to leverage your data, so in this blog, …

AI on Air: Exploring GPT-4o

This week, OpenAI announced the release of GPT-4o, the latest iteration of its language model with new capabilities across multiple modalities. The “o” in GPT-4o stands for “omni,” highlighting its enhanced ability to reason in real time across audio, vision, and text. This makes it especially useful for …

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

Large language model (LLM) inference has two phases: the prompt (or prefill) phase, which outputs the first token, and the extension (or decoding) phase, which generates subsequent tokens. In this work, we propose an efficient parallelization scheme, KV-Runahead, to accelerate the prompt phase. The key observation is that the extension phase generates tokens …

The power of remote engine execution for ETL/ELT data pipelines

Business leaders risk compromising their competitive edge if they do not proactively implement generative AI (gen AI). However, businesses scaling AI face entry barriers. Organizations require reliable data for robust AI models and accurate insights, yet the current technology landscape presents unparalleled data quality challenges. According to International Data Corporation (IDC), stored data is set to increase by …

Build a serverless exam generator application from your own lecture content using Amazon Bedrock

Crafting new questions for exams and quizzes can be tedious and time-consuming for educators. The time required varies based on factors like subject matter, question types, experience level, and class level. Multiple-choice questions require substantial time to generate quality distractors and ensure a single unambiguous answer, and composing effective true-false questions demands careful effort to …

Announcing general availability of Ray on Vertex AI

Developers and engineers face several major challenges when scaling AI/ML workloads. One challenge is getting access to the AI infrastructure they need. AI/ML workloads require a significant amount of computational resources, such as CPUs and GPUs. Developers need to have sufficient resources to run their workloads. Another challenge is handling the diverse patterns and programming …

Needle-Moving AI Research Trains Surgical Robots in Simulation

A collaboration between NVIDIA and academic researchers is prepping robots for surgery. ORBIT-Surgical, developed by researchers from the University of Toronto, UC Berkeley, ETH Zurich, Georgia Tech and NVIDIA, is a simulation framework to train robots that could augment the skills of surgical teams while reducing surgeons’ cognitive load. It supports more than …