forecasting misuse og 2

Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk

OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to investigate how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop bringing together 30 disinformation researchers, machine learning experts, and policy analysts, and culminated in a co-authored report building on …

How to unlock a scientific approach to change management with powerful data insights

Without a doubt, there is exponential growth in the access to and volume of process data we all, as individuals, have at our fingertips. Coupled with a current climate that is proving to be increasingly ambiguous and complex, there is a huge opportunity to leverage data insights to drive a more robust, evidence-based methodology to …

ML 12185 NLP Pipelinev2 1

Enriching real-time news streams with the Refinitiv Data Library, AWS services, and Amazon SageMaker

This post is co-authored by Marios Skevofylakas, Jason Ramchandani and Haykaz Aramyan from Refinitiv, An LSEG Business. Financial service providers often need to identify relevant news, analyze it, extract insights, and take actions in real time, like trading specific instruments (such as commodities, shares, funds) based on additional information or context of the news item. …

image1 vfOtKz5.max 1000x1000 1

Reading and storing data for custom model training on Vertex AI

Before you can train ML models in the cloud, you need to get your data to the cloud.  But when it comes to storing data on Google Cloud there are a lot of different options. Not to mention the different ways you can read in data when designing input pipelines for custom models. Should you …

Figure 1 NPS database table definitions

How to use Netezza Performance Server query data in Amazon Simple Storage Service (S3)

In this example, we will demonstrate using current data within a Netezza Performance Server as a Service (NPSaaS) table combined with historical data in Parquet files to determine if flight delays have increased in 2022 due to the impact of the COVID-19 pandemic on the airline travel industry. This demonstration illustrates how Netezza Performance Server …

ML 10461 result

Best practices for load testing Amazon SageMaker real-time inference endpoints

Amazon SageMaker is a fully managed machine learning (ML) service. With SageMaker, data scientists and developers can quickly and easily build and train ML models, and then directly deploy them into a production-ready hosted environment. It provides an integrated Jupyter authoring notebook instance for easy access to your data sources for exploration and analysis, so …

img0.max 1000x1000 1

Sparse Features Support in BigQuery

Introduction Most machine learning models require the input features to be in numerical format and if the features are in categorial format, pre-processing steps such as one-hot encoding are needed to convert them into numerical format. Converting a large number of categorical values may lead to creating sparse features, a set of features that contains …

ml 12274 img1 new

Get smarter search results with the Amazon Kendra Intelligent Ranking and OpenSearch plugin

If you’ve had the opportunity to build a search application for unstructured data (i.e., wiki, informational web sites, self-service help pages, internal documentation, etc.) using open source or commercial-off-the-shelf search engines, then you’re probably familiar with the inherent accuracy challenges involved in getting relevant search results. The intended meaning of both query and document can …

ml 10578 image001

Model hosting patterns in Amazon SageMaker, Part 1: Common design patterns for building ML applications on Amazon SageMaker

Machine learning (ML) applications are complex to deploy and often require the ability to hyper-scale, and have ultra-low latency requirements and stringent cost budgets. Use cases such as fraud detection, product recommendations, and traffic prediction are examples where milliseconds matter and are critical for business success. Strict service level agreements (SLAs) need to be met, …

Processing W2 & Payslips is now even simpler with Document AI

Documents like payslips and W2s are crucial to processes such as employment and income verification for mortgage loans, personal loans, personal finance, and benefits processing. Unfortunately, efficiently extracting data from these documents at scale can be challenging and time-consuming, with many organizations relying on manual examination of documents or automated approaches that don’t adequately capture …