This is a guest post co-authored by Nafi Ahmet Turgut, Mutlu Polatcan, Pınar Baki, Mehmet İkbal Özmen, Hasan Burak Yel, and Hamza Akyıldız from Getir.
Getir is the pioneer of ultrafast grocery delivery. The tech company has revolutionized last-mile delivery with its “groceries in minutes” delivery proposition. Getir was founded in 2015 and operates in Turkey, the UK, the Netherlands, Germany, France, Spain, Italy, Portugal, and the United States. Today, Getir is a conglomerate incorporating nine verticals under the same brand.
Predicting future demand is one of the most important insights for Getir and one of the biggest challenges we face. Getir relies heavily on accurate demand forecasts at a SKU level when making business decisions in a wide range of areas, including marketing, production, inventory, and finance. Accurate forecasts are necessary for supporting inventory holding and replenishment decisions. Having a clear and reliable picture of predicted demand for the next day or week allows us to adjust our strategy and increase our ability to meet sales and revenue goals.
Getir used Amazon Forecast, a fully managed service that uses machine learning (ML) algorithms to deliver highly accurate time series forecasts, to increase revenue by four percent and reduce waste cost by 50 percent. In this post, we describe how we used Forecast to achieve these benefits. We outline how we built an automated demand forecasting pipeline using Forecast and orchestrated by AWS Step Functions to predict daily demand for SKUs. This solution led to highly accurate forecasting for over 10,000 SKUs across all countries where we operate, and contributed significantly to our ability to develop high scalable internal supply chain processes.
Forecast automates much of the time-series forecasting process, enabling you to focus on preparing your datasets and interpreting your predictions.
Step Functions is a fully managed service that makes it easier to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function helps you scale more easily and change applications more quickly. Step Functions automatically triggers and tracks each step and retries when there are errors, so your application executes in order and as expected.
Six people from Getir’s data science team and infrastructure team worked together on this project. The project was completed in 3 months and deployed to production after 2 months of testing.
The following diagram shows the solution’s architecture.
The model pipeline is executed separately for each country. The architecture includes four Airflow cron jobs running on a defined schedule. The pipeline starts with feature creation which first creates the features and loads them to Amazon Redshift. Next, a feature processing job prepares daily features stored in Amazon Redshift and unloads the time series data to Amazon Simple Storage Service (Amazon S3). A second Airflow job is responsible for triggering the Forecast pipeline via Amazon EventBridge. The pipeline consists of Amazon Lambda functions, which create predictors and forecasts based on parameters stored in Amazon S3. Forecast reads data from Amazon S3, trains the model with hyperparameter optimization (HPO) to optimize model performance, and produces future predictions for product sales. Then the Step Functions “WaitInProgress” pipeline is triggered for each country, which enables parallel execution of a pipeline for each country.
Amazon Forecast has six built-in algorithms (ARIMA, ETS, NPTS, Prophet, DeepAR+, CNN-QR), which are clustered into two groups: statististical and deep/neural network. Among those algorithms, deep/neural networks are more suitable for e-commerce forecasting problems as they accept item metadata features, forward-looking features for campaign and marketing activities, and – most importantly – related time series features. Deep/neural network algorithms also perform very well on sparse data set and in cold-start (new item introduction) scenarios.
Overall, in our experimentations, we observed that deep/neural network models performed significantly better than the statistical models. We therefore focused our deep-dive testing on DeepAR+ and CNN-QR
One of the most important benefits of Amazon Forecast is scalability and accurate results for many product and country combinations. In our testing both DeepAR+ and CNN-QR algorithms brought success in capturing trends and seasonality, allowing us to obtain efficient results in products whose demand changes very frequently.
Deep AutoRegressive Plus (DeepAR+) is a supervised univariate forecasting algorithm based on recurrent neural networks (RNNs) created by Amazon Research. Its main advantages are that it is easily scalable, able to incorporate relevant co-variates into the data (such as related data and metadata), and able to forecast cold-start items. Instead of fitting separate models for each time series, it creates a global model from related time series to handle widely-varying scales through rescaling and velocity-based sampling. The RNN architecture incorporates binomial likelihood to produce probabilistic forecasting and is advocated to outperform traditional single-item forecasting methods (like Prophet) by the authors of DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks.
We ultimately selected the Amazon CNN-QR (Convolutional Neural Network – Quantile Regression) algorithm for our forecasting due to its high performance in the backtest process. CNN-QR is a proprietary ML algorithm developed by Amazon for forecasting scalar (one-dimensional) time series using causal Convolutional Neural Networks (CNNs).
As previously mentioned, CNN-QR can employ related time series and metadata about the items being forecasted. Metadata must include an entry for all unique items in the target time series, which in our case are the products whose demand we are forecasting. To improve accuracy, we used category and subcategory metadata, which helped the model understand the relationship between certain products, including complementary and substitutes. For example, for beverages, we provide an additional flag for snacks since the two categories are complementary to each other.
One significant advantage of CNN-QR is its ability to forecast without future related time series, which is important when you can’t provide related features for the forecast window. This capability, along with its forecast accuracy, meant that CNN-QR produced the best results with our data and use case.
Forecasts created through the system are written to separate S3 buckets after they are received on a country basis. Then, forecasts are written to Amazon Redshift based on SKU and country with daily jobs. We then carry out daily product stock planning based on our forecasts.
On an ongoing basis, we calculate mean absolute percentage error (MAPE) ratios with product-based data, and optimize model and feature ingestion processes.
In this post, we walked through an automated demand forecasting pipeline we built using Amazon Forecast and AWS Step Functions.
With Amazon Forecast we improved our country-specific MAPE by 10 percent. This has driven a four percent revenue increase, and decreased our waste costs by 50 percent. In addition, we achieved an 80 percent improvement in our training times in daily forecasts in terms of scalability. We are able to forecast over 10,000 SKUs daily in all the countries we serve.
For more information about how to get started building your own pipelines with Forecast, see Amazon Forecast resources. You can also visit AWS Step Functions to get more information about how to build automated processes and orchestrate and create ML pipelines. Happy forecasting, and start improving your business today!
AI accelerationists have won as a consequence of the election, potentially sidelining those advocating for…
L'Oréal's first professional hair dryer combines infrared light, wind, and heat to drastically reduce your…
TL;DR A conversation with 4o about the potential demise of companies like Anthropic. As artificial…
Whether a company begins with a proof-of-concept or live deployment, they should start small, test…
Digital tools are not always superior. Here are some WIRED-tested agendas and notebooks to keep…
Machine learning (ML) models are built upon data.