FLAIR: Federated Learning Annotated Image Repository

Cross-device federated learning is an emerging machine learning (ML) paradigm where a large population of devices collectively train an ML model while the data remains on the devices. This research field has a unique set of practical challenges, and to systematically make advances, new datasets curated to be compatible with this paradigm are needed. Existing …

Two-Layer Bandit Optimization for Recommendations

Online commercial app marketplaces efficiently serve millions of apps to billions of users. Bandit optimization algorithms are used to ensure that the recommendations are relevant and converge to the best-performing content over time. However, directly applying bandits to real-world systems, where the catalog of items is dynamic and continuously refreshed, is …

Ontology: Finding meaning in data (Palantir RFx Blog Series, #1)

A functional data ecosystem must incorporate notions of Ontology in order to be scalable and sustainable. Editor’s note: This is the first post in the Palantir RFx Blog Series, which breaks down some of the key pillars of a data ecosystem using language commonly found in formal solicitations such as RFIs and RFPs. Each post …

Evaluating Software (Palantir RFx Blog Series, #0)

This series tackles the rarely simple and often messy solicitation process. We explore how organizations can better evaluate digital transformation software. Welcome to the RFx Blog Series, which explores the question: how should commercial organizations evaluate digital transformation software? In this series, we use language commonly found in formal solicitations, including specific questions and functional …

Trustworthy AI helps provide equitable preventative care for diabetics

There are over 30 million people in America who have diabetes, and people with diabetes need to remain vigilant about their health. They need the extra attention and resources provided by their healthcare systems because, unfortunately, around 38% to 40% of people with diabetes end up visiting the ER due to complications. Healthcare organizations – …

Create high-quality data for ML models with Amazon SageMaker Ground Truth

Machine learning (ML) has improved business across industries in recent years—from the recommendation system on your Prime Video account, to document summarization and efficient search with Alexa’s voice assistance. However, the question remains of how to incorporate this technology into your business. Unlike traditional rule-based methods, ML automatically infers patterns from data so as to …

Automate your time series forecasting in Snowflake using Amazon Forecast

This post is a joint collaboration with Andries Engelbrecht and James Sun of Snowflake, Inc. The cloud computing revolution has enabled businesses to capture and retain corporate and organizational data without capacity planning or data retention constraints. Now, with diverse and vast reserves of longitudinal data, companies are increasingly able to find novel and impactful …

Achieve four times higher ML inference throughput at three times lower cost per inference with Amazon EC2 G5 instances for NLP and CV PyTorch models

Amazon Elastic Compute Cloud (Amazon EC2) G5 instances are the first and only instances in the cloud to feature NVIDIA A10G Tensor Core GPUs, which you can use for a wide range of graphics-intensive and machine learning (ML) use cases. With G5 instances, ML customers get high performance and a cost-efficient infrastructure to train and …

Building reusable Machine Learning workflows with Pipeline Templates

One of the best ways to share, reuse, and scale your ML workflows is to run them as pipelines. To maximize their value, it’s important to build these pipelines in such a way that you can easily reproduce runs that produce similar results, as described in the paper “Hidden Technical Debt in Machine Learning Systems”. …