Redacting PII data in Dialogflow CX with Google Cloud Data Loss Prevention (DLP)

Contact centers today handle all types of sensitive information including Personally Identifiable Information (PII), Protected Health Information (PHI), Payment Card Industry (PCI) data, and other confidential information (CI) as part of their day-to-day operations. This information can make its way into call recordings, call logs, agent notes, and application logs. It may also be used …

Subspace Recovery from Heterogeneous Data with Non-isotropic Noise

*= Equal Contributions. Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. Our data come from users with user contributing …

Characterizing Emergent Phenomena in Large Language Models

Posted by Jason Wei and Yi Tay, Research Scientists, Google Research, Brain Team. The field of natural language processing (NLP) has been revolutionized by language models trained on large amounts of text data. Scaling up the size of language models often leads to improved performance and sample efficiency on a range of downstream NLP tasks. …

How Prodege saved $1.5 million in annual human review costs using low-code computer vision AI

This post was co-authored by Arun Gupta, the Director of Business Intelligence at Prodege, LLC. Prodege is a data-driven marketing and consumer insights platform comprised of consumer brands—Swagbucks, MyPoints, Tada, ySense, InboxDollars, InboxPounds, DailyRewards, PollFish, and Upromise—along with a complementary suite of business solutions for marketers and researchers. Prodege has 120 million users and has …

Identifying and avoiding common data issues while building no code ML models with Amazon SageMaker Canvas

Business analysts work with data and like to analyze, explore, and understand data to achieve effective business outcomes. To address business problems, they often rely on machine learning (ML) practitioners such as data scientists to assist with techniques such as utilizing ML to build models using existing data and generate predictions. However, it isn’t always …

MAEEG: Masked Auto-encoder for EEG Representation Learning

This paper was accepted at the Workshop on Learning from Time Series for Health at NeurIPS 2022. Decoding information from bio-signals such as EEG using machine learning has been a challenge due to small datasets and the difficulty of obtaining labels. We propose a reconstruction-based self-supervised learning model, the masked auto-encoder for EEG (MAEEG), for …

Seeing through hardware counters: a journey to threefold performance increase

By Vadim Filanovsky and Harshad Sane. In one of our previous blog posts, A Microscope on Microservices, we outlined three broad domains of observability (or “levels of magnification,” as we referred to them): Fleet-wide, Microservice, and Instance. We described the tools and techniques we use to gain insight within each domain. There is, however, a class of problems …

Approaches to long-term planning with IBM Planning Analytics

In our collective rush to react to ever-changing marketplace dynamics and shifts in the economy, it’s easy to focus on short-term plans and neglect long-term planning. Today’s leaders need several plans: short-term, medium-term, and long-term. Different plans for different needs. How do these plans differ? A short-term plan is designed …

Multi-layered Mapping of Brain Tissue via Segmentation Guided Contrastive Learning

Posted by Peter H. Li, Research Scientist, and Sven Dorkenwald, Student Researcher, Connectomics at Google. Mapping the wiring and firing activity of the human brain is fundamental to deciphering how we think (how we sense the world, learn, decide, remember, and create), as well as what issues can arise in brain disease or …

Brain tumor segmentation at scale using AWS Inferentia

Medical imaging is an important tool for the diagnosis and localization of disease. Over the past decade, collections of medical images have grown rapidly, and open repositories such as The Cancer Imaging Archive and Imaging Data Commons have democratized access to this vast imaging data. Computational tools such as machine learning (ML) and artificial intelligence …