This post is co-written with Aruna Abeyakoon and Denisse Colin from Light and Wonder (L&W).
Headquartered in Las Vegas, Light & Wonder, Inc. is the leading cross-platform global game company that provides gambling products and services. Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), to stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally when LnW Connect reaches its full potential. Over 500 machine events are monitored in near-real time to give a full picture of machine conditions and their operating environments. Utilizing data streamed through LnW Connect, L&W aims to create better gaming experience for their end-users as well as bring more value to their casino customers.
Light & Wonder teamed up with the Amazon ML Solutions Lab to use events data streamed from LnW Connect to enable machine learning (ML)-powered predictive maintenance for slot machines. Predictive maintenance is a common ML use case for businesses with physical equipment or machinery assets. With predictive maintenance, L&W can get advanced warning of machine breakdowns and proactively dispatch a service team to inspect the issue. This will reduce machine downtime and avoid significant revenue loss for casinos. With no remote diagnostic system in place, issue resolution by the Light & Wonder service team on the casino floor can be costly and inefficient, while severely degrading the customer gaming experience.
The nature of the project is highly exploratory—this is the first attempt at predictive maintenance in the gaming industry. The Amazon ML Solutions Lab and L&W team embarked on an end-to-end journey from formulating the ML problem and defining the evaluation metrics, to delivering a high-quality solution. The final ML model combines CNN and Transformer, which are the state-of-the-art neural network architectures for modeling sequential machine log data. The post presents a detailed description of this journey, and we hope you will enjoy it as much as we do!
In this post, we discuss the following:
In this section, we discuss the issues that necessitated this solution.
Slot machine environments are highly regulated and are deployed in an air-gapped environment. In LnW Connect, an encryption process was designed to provide a secure and reliable mechanism for the data to be brought into an AWS data lake for predictive modeling. The aggregated files are encrypted and the decryption key is only available in AWS Key Management Service (AWS KMS). A cellular-based private network into AWS is set up through which the files were uploaded into Amazon Simple Storage Service (Amazon S3).
LnW Connect streams a wide range of machine events, such as start of game, end of game, and more. The system collects over 500 different types of events. As shown in the following
, each event is recorded along with a timestamp of when it happened and the ID of the machine recording the event. LnW Connect also records when a machine enters a non-playable state, and it will be marked as a machine failure or breakdown if it doesn’t recover to a playable state within a sufficiently short time span.
Machine ID | Event Type ID | Timestamp |
---|---|---|
0 | E1 | 2022-01-01 00:17:24 |
0 | E3 | 2022-01-01 00:17:29 |
1000 | E4 | 2022-01-01 00:17:33 |
114 | E234 | 2022-01-01 00:17:34 |
222 | E100 | 2022-01-01 00:17:37 |
In addition to dynamic machine events, static metadata about each machine is also available. This includes information such as machine unique identifier, cabinet type, location, operating system, software version, game theme, and more, as shown in the following table. (All the names in the table are anonymized to protect customer information.)
Machine ID | Cabinet Type | OS | Location | Game Theme |
---|---|---|---|---|
276 | A | OS_Ver0 | AA Resort & Casino | StormMaiden |
167 | B | OS_Ver1 | BB Casino, Resort & Spa | UHMLIndia |
13 | C | OS_Ver0 | CC Casino & Hotel | TerrificTiger |
307 | D | OS_Ver0 | DD Casino Resort | NeptunesRealm |
70 | E | OS_Ver0 | EE Resort & Casino | RLPMealTicket |
We treat the predictive maintenance problem for slot machines as a binary classification problem. The ML model takes in the historical sequence of machine events and other metadata and predicts whether a machine will encounter a failure in a 6-hour future time window. If a machine will break down within 6 hours, it is deemed a high-priority machine for maintenance. Otherwise, it is low priority. The following figure gives examples of low-priority (top) and high-priority (bottom) samples. We use a fixed-length look-back time window to collect historical machine event data for prediction. Experiments show that longer look-back time windows improve model performance significantly (more details later in this post).
We faced a couple of challenges solving this problem:
In this section, we discuss our methods for data preparation and feature engineering.
Slot machine feeds are streams of unequally spaced time series events; for example, the number of events in a 3-hour window can range from tens to thousands. To handle this imbalance, we used event frequencies instead of the raw sequence data. A straightforward approach is aggregating the event frequency for the entire look-back window and feeding it into the model. However, when using this representation, the temporal information is lost, and the order of events is not preserved. We instead used temporal binning by dividing the time window into N equal sub-windows and calculating the event frequencies in each. The final features of a time window are the concatenation of all its sub-window features. Increasing the number of bins preserves more temporal information. The following figure illustrates temporal binning on a sample window.
First, the sample time window is split into two equal sub-windows (bins); we used only two bins here for simplicity for illustration. Then, the counts of the events E1, E2, E3, and E4 are calculated in each bin. Lastly, they are concatenated and used as features.
Along with the event frequency-based features, we used machine-specific features like software version, cabinet type, game theme, and game version. Additionally, we added features related to the timestamps to capture the seasonality, such as hour of the day and day of the week.
To extract data efficiently for training and testing, we utilize Amazon Athena and the AWS Glue Data Catalog. The events data is stored in Amazon S3 in Parquet format and partitioned according to day/month/hour. This facilitates efficient extraction of data samples within a specified time window. We use data from all machines in the latest month for testing and the rest of the data for training, which helps avoid potential data leakage.
In this section, we discuss our baseline model with AutoGluon and how we built a customized neural network with SageMaker automatic model tuning.
With any ML use case, it’s important to establish a baseline model to be used for comparison and iteration. We used AutoGluon to explore several classic ML algorithms. AutoGluon is easy-to-use AutoML tool that uses automatic data processing, hyperparameter tuning, and model ensemble. The best baseline was achieved with a weighted ensemble of gradient boosted decision tree models. The ease of use of AutoGluon helped us in the discovery stage to navigate quickly and efficiently through a wide range of possible data and ML modeling directions.
After experimenting with different neural networks architectures, we built a customized deep learning model for predictive maintenance. Our model surpassed the AutoGluon baseline model by 121% in recall at 80% precision. The final model ingests historical machine event sequence data, time features such as hour of the day, and static machine metadata. We utilize SageMaker automatic model tuning jobs to search for the best hyperparameters and model architectures.
The following figure shows the model architecture. We first normalize the binned event sequence data by average frequencies of each event in the training set to remove the overwhelming effect of high-frequency events (start of game, end of game, and so on). The embeddings for individual events are learnable, while the temporal feature embeddings (day of the week, hour of the day) are extracted using the package GluonTS. Then we concatenate the event sequence data with the temporal feature embeddings as the input to the model. The model consists of the following layers:
We use the cross-entropy loss with class weights as tunable hyperparameters to adjust for the class imbalance issue. In addition, the numbers of CNN and TRANS layers are crucial hyperparameters with the possible values of 0, which means specific layers may not always exist in the model architecture. This way, we have a unified framework where the model architectures are searched along with other usual hyperparameters.
We utilize SageMaker automatic model tuning, also known as hyperparameter optimization (HPO), to efficiently explore model variations and the large search space of all hyperparameters. Automatic model tuning receives the customized algorithm, training data, and hyperparameter search space configurations, and searches for best hyperparameters using different strategies such as Bayesian, Hyperband, and more with multiple GPU instances in parallel. After evaluating on a hold-out validation set, we obtained the best model architecture with two layers of CNN, one layer of TRANS with four heads, and an AGG layer.
We used the following hyperparameter ranges to search for the best model architecture:
To further improve model accuracy and reduce model variance, we trained the model with multiple independent random weight initializations, and aggregated the result with mean values as the final probability prediction. There is a trade-off between more computing resources and better model performance, and we observed that 5–10 should be a reasonable number in the current use case (results shown later in this post).
In this section, we present the model performance evaluation metrics and results.
Precision is very important for this predictive maintenance use case. Low precision means reporting more false maintenance calls, which drives costs up through unnecessary maintenance. Because average precision (AP) doesn’t fully align with the high precision objective, we introduced a new metric named average recall at high precisions (ARHP). ARHP is equal to the average of recalls at 60%, 70%, and 80% precision points. We also used precision at top K% (K=1, 10), AUPR, and AUROC as additional metrics.
The following table summarizes the results using the baseline and the customized neural network models, with 7/1/2022 as the train/test split point. Experiments show that increasing the window length and sample data size both improve the model performance, because they contain more historical information to help with the prediction. Regardless of the data settings, the neural network model outperforms AutoGluon in all metrics. For example, recall at the fixed 80% precision is increased by 121%, which enables you to quickly identify more malfunctioned machines if using the neural network model.
Model | Window length/Data size | AUROC | AUPR | ARHP | Recall@Prec0.6 | Recall@Prec0.7 | Recall@Prec0.8 | Prec@top1% | Prec@top10% |
---|---|---|---|---|---|---|---|---|---|
AutoGluon baseline | 12H/500k | 66.5 | 36.1 | 9.5 | 12.7 | 9.3 | 6.5 | 85 | 42 |
Neural Network | 12H/500k | 74.7 | 46.5 | 18.5 | 25 | 18.1 | 12.3 | 89 | 55 |
AutoGluon baseline | 48H/1mm | 70.2 | 44.9 | 18.8 | 26.5 | 18.4 | 11.5 | 92 | 55 |
Neural Network | 48H/1mm | 75.2 | 53.1 | 32.4 | 39.3 | 32.6 | 25.4 | 94 | 65 |
The following figures illustrate the effect of using ensembles to boost the neural network model performance. All the evaluation metrics shown on the x-axis are improved, with higher mean (more accurate) and lower variance (more stable). Each box-plot is from 12 repeated experiments, from no ensembles to 10 models in ensembles (x-axis). Similar trends persist in all metrics besides the Prec@top1% and Recall@Prec80% shown.
After factoring in the computational cost, we observe that using 5–10 models in ensembles is suitable for Light & Wonder datasets.
Our collaboration has resulted in the creation of a groundbreaking predictive maintenance solution for the gaming industry, as well as a reusable framework that could be utilized in a variety of predictive maintenance scenarios. The adoption of AWS technologies such as SageMaker automatic model tuning facilitates Light & Wonder to navigate new opportunities using near-real-time data streams. Light & Wonder is starting the deployment on AWS.
If you would like help accelerating the use of ML in your products and services, please contact the Amazon ML Solutions Lab program.
AI accelerationists have won as a consequence of the election, potentially sidelining those advocating for…
L'Oréal's first professional hair dryer combines infrared light, wind, and heat to drastically reduce your…
TL;DR A conversation with 4o about the potential demise of companies like Anthropic. As artificial…
Whether a company begins with a proof-of-concept or live deployment, they should start small, test…
Digital tools are not always superior. Here are some WIRED-tested agendas and notebooks to keep…
Machine learning (ML) models are built upon data.