FLAIR: Federated Learning Annotated Image Repository
Cross-device federated learning is an emerging machine learning (ML) paradigm where a large population of devices collectively train an ML model while the data remains on the devices. This research field has a unique set of practical challenges, and to systematically make advances, new datasets curated to be compatible with this paradigm are needed. Existing …
Two-Layer Bandit Optimization for Recommendations
Online commercial app marketplaces efficiently serve millions of apps to billions of users. Bandit optimization algorithms are used to ensure that recommendations are relevant and converge to the best-performing content over time. However, directly applying bandits to real-world systems, where the catalog of items is dynamic and continuously refreshed, is …
Ontology: Finding meaning in data (Palantir RFx Blog Series, #1)
A functional data ecosystem must incorporate notions of Ontology in order to be scalable and sustainable. Editor’s note: This is the first post in the Palantir RFx Blog Series, which breaks down some of the key pillars of a data ecosystem using language commonly found in formal solicitations such as RFIs and RFPs. Each post …
Evaluating Software (Palantir RFx Blog Series, #0)
This series tackles the rarely simple and often messy solicitation process. We explore how organizations can better evaluate digital transformation software. Welcome to the RFx Blog Series, which explores the question: how should commercial organizations evaluate digital transformation software? In this series, we use language commonly found in formal solicitations, including specific questions and functional …
Trustworthy AI helps provide equitable preventative care for diabetics
There are over 30 million people in America who have diabetes, and people with diabetes need to remain vigilant about their health. They need the extra attention and resources provided by their healthcare systems because, unfortunately, around 38% to 40% of people with diabetes end up visiting the ER due to complications. Healthcare organizations – …
Create high-quality data for ML models with Amazon SageMaker Ground Truth
Machine learning (ML) has improved business across industries in recent years—from the recommendation system on your Prime Video account, to document summarization and efficient search with Alexa’s voice assistance. However, the question remains of how to incorporate this technology into your business. Unlike traditional rule-based methods, ML automatically infers patterns from data so as to …
Automate your time series forecasting in Snowflake using Amazon Forecast
This post is a joint collaboration with Andries Engelbrecht and James Sun of Snowflake, Inc. The cloud computing revolution has enabled businesses to capture and retain corporate and organizational data without capacity planning or data retention constraints. Now, with diverse and vast reserves of longitudinal data, companies are increasingly able to find novel and impactful …
Achieve four times higher ML inference throughput at three times lower cost per inference with Amazon EC2 G5 instances for NLP and CV PyTorch models
Amazon Elastic Compute Cloud (Amazon EC2) G5 instances are the first and only instances in the cloud to feature NVIDIA A10G Tensor Core GPUs, which you can use for a wide range of graphics-intensive and machine learning (ML) use cases. With G5 instances, ML customers get high performance and a cost-efficient infrastructure to train and …
Building reusable Machine Learning workflows with Pipeline Templates
One of the best ways to share, reuse, and scale your ML workflows is to run them as pipelines. To maximize their value, it’s important to build these pipelines in such a way that you can easily reproduce runs that produce similar results, as described in the paper “Hidden Technical Debt in Machine Learning Systems”. …