This post is cowritten with Thomas Voss and Bernhard Hersberger from Hapag-Lloyd.
Hapag-Lloyd is one of the world’s leading shipping companies with more than 308 modern vessels, 11.9 million TEUs (twenty-foot equivalent units) transported per year, and 16,700 motivated employees in more than 400 offices in 139 countries. They connect continents, businesses, and people through reliable container transportation services on the major trade routes across the globe.
In this post, we share how Hapag-Lloyd developed and implemented a machine learning (ML)-powered assistant that predicts vessel arrival and departure times, transforming their schedule planning. By using Amazon SageMaker AI and implementing robust MLOps practices, Hapag-Lloyd has enhanced its schedule reliability—a key performance indicator in the industry and a quality promise to their customers.
For Hapag-Lloyd, accurate vessel schedule predictions are crucial for maintaining schedule reliability, defined as the percentage of vessels arriving within 1 calendar day (earlier or later) of their estimated time of arrival (ETA), communicated around 3 to 4 weeks before arrival.
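To make the definition concrete, the following minimal sketch shows how this metric could be computed from arrival records; the DataFrame and its column names are hypothetical stand-ins, not Hapag-Lloyd's actual data model.

```python
import pandas as pd

# Illustrative schedule reliability computation: the share of arrivals within
# +/- 1 calendar day of the ETA communicated 3 to 4 weeks in advance.
# Column names are hypothetical.
def schedule_reliability(arrivals: pd.DataFrame) -> float:
    deviation = (arrivals["actual_arrival"].dt.normalize()
                 - arrivals["communicated_eta"].dt.normalize()).abs()
    on_time = deviation <= pd.Timedelta(days=1)
    return 100.0 * on_time.mean()  # percentage of on-time arrivals
```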
Prior to developing the new ML solution, Hapag-Lloyd relied on simple rule-based and statistical calculations, based on historical transit patterns for vessel schedule predictions. While this statistical method provided basic predictions, it couldn’t effectively account for real-time conditions such as port congestion, requiring significant manual intervention from operations teams.
Developing a new ML solution to replace the existing system presented several key challenges that Hapag-Lloyd's previous approach to schedule planning couldn't effectively address. A comprehensive solution was needed that could both handle the complexity of vessel schedule prediction and provide the infrastructure to sustain ML operations at global scale.
The Hapag-Lloyd network consists of over 308 vessels and many more partner vessels that continuously circumnavigate the globe on predefined service routes, resulting in more than 3,500 port arrivals per month. Each vessel operates on a fixed service line, making regular round trips between a sequence of ports. For instance, a vessel might repeatedly sail a route from Southampton to Le Havre, Rotterdam, Hamburg, New York, and Philadelphia before starting the cycle again. For each port arrival, an ETA must be provided multiple weeks in advance to arrange critical logistics, including berth windows at ports and onward transportation of containers by sea, land, or air.

The following table shows an example where a vessel travels from Southampton to New York through Le Havre, Rotterdam, and Hamburg. The vessel's time until arrival at the New York port can be calculated as the sum of the ocean-to-port time to Southampton and the respective berth times and port-to-port times for the intermediate ports called while sailing to New York. If this vessel encounters a delay in Rotterdam, the delay affects its arrival in Hamburg and cascades through the entire schedule, impacting arrivals in New York and beyond. This ripple effect can disrupt carefully planned transshipment connections and require extensive replanning of downstream operations.
| Port | Terminal call | Scheduled arrival | Scheduled departure |
|------|---------------|-------------------|---------------------|
| SOUTHAMPTON | 1 | 2025-07-29 07:00 | 2025-07-29 21:00 |
| LE HAVRE | 2 | 2025-07-30 16:00 | 2025-07-31 16:00 |
| ROTTERDAM | 3 | 2025-08-03 18:00 | 2025-08-05 03:00 |
| HAMBURG | 4 | 2025-08-07 07:00 | 2025-08-08 07:00 |
| NEW YORK | 5 | 2025-08-18 13:00 | 2025-08-21 13:00 |
| PHILADELPHIA | 6 | 2025-08-22 06:00 | 2025-08-24 16:30 |
| SOUTHAMPTON | 7 | 2025-09-01 08:00 | 2025-09-02 20:00 |
When a vessel departs Rotterdam with a delay, new ETAs must be calculated for the remaining ports. For Hamburg, we only need to estimate the remaining sailing time from the vessel’s current position. However, for subsequent ports like New York, the prediction requires multiple components: the remaining sailing time to Hamburg, the duration of port operations in Hamburg, and the sailing time from Hamburg to New York.
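To make this composition concrete, here is a minimal sketch of how such a multi-leg ETA could be assembled from the individual predictions; the model objects and feature dictionaries are hypothetical placeholders, not the actual production interface.

```python
from datetime import datetime, timedelta

# Hypothetical composition of a downstream ETA from three prediction
# components: remaining sailing time to the next port, berth time there,
# and the sailing time of the following port-to-port leg.
def predict_downstream_eta(o2p_model, berth_model, p2p_model,
                           features: dict, now: datetime) -> datetime:
    hours_to_hamburg = o2p_model.predict(features)  # ocean-to-port leg
    berth_hours = berth_model.predict({**features, "port": "HAMBURG"})
    hours_to_new_york = p2p_model.predict(
        {**features, "from_port": "HAMBURG", "to_port": "NEW YORK"}
    )
    total_hours = hours_to_hamburg + berth_hours + hours_to_new_york
    return now + timedelta(hours=float(total_hours))
```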
As input to the vessel ETA prediction, we process two data sources: vessel schedule data and Automatic Identification System (AIS) vessel position data.
These data sources are combined to create training datasets for the ML models. We carefully consider the timing of available data through temporal splitting to avoid data leakage. Data leakage occurs when using information that wouldn’t be available at prediction time in the real world. For example, when training a model to predict arrival time in Hamburg for a vessel currently in Rotterdam, we can’t use actual transit times that were only known after the vessel reached Hamburg.
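A minimal sketch of such a temporal split is shown below; the column names are hypothetical. The key point is that the train/validation boundary is a point in time rather than a random shuffle, and training rows must have fully observed targets before that boundary.

```python
import pandas as pd

def temporal_split(df: pd.DataFrame, cutoff: str):
    """Split feature rows at a point in time to avoid data leakage."""
    cutoff_ts = pd.Timestamp(cutoff)
    # Train only on voyages whose actual arrival was observed before the cutoff
    train = df[df["actual_arrival"] < cutoff_ts]
    # Validate only on predictions made at or after the cutoff
    valid = df[df["prediction_time"] >= cutoff_ts]
    return train, valid

# Example: train on voyages completed before May 2025, validate on later data
# train_df, valid_df = temporal_split(features_df, "2025-05-01")
```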
A vessel's journey can be divided into different legs, which led us to develop a multi-step solution using specialized ML models for each leg, orchestrated as hierarchical models to retrieve the overall ETA:

- Ocean-to-port (O2P) model: predicts the remaining sailing time from the vessel's current position at sea to its next port of call
- Port-to-port (P2P) model: predicts the sailing time between two consecutive ports
- Berth model: predicts the duration of port operations (berth time) at each port call
- Combined model: combines the outputs of the three base models to produce the overall ETA for downstream ports
All four models are trained using the XGBoost algorithm built into SageMaker, chosen for its ability to handle complex relationships in tabular data and its robust performance with mixed numerical and categorical features. Each model has a dedicated training pipeline in SageMaker Pipelines, handling data preprocessing steps and model training. The following diagram shows the data processing pipeline, which generates the input datasets for ML training.
As an example, this diagram shows the training pipeline of the Berth model. The steps in the SageMaker training pipelines of the Berth, P2P, O2P, and Combined models are identical. Therefore, the training pipeline is implemented once as a blueprint and reused across the other models, enabling a fast implementation turnaround.
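A simplified sketch of such a blueprint function is shown below, using the SageMaker Python SDK; the bucket paths, role, instance types, and hyperparameters are placeholders rather than the production configuration.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

def build_training_pipeline(model_name: str, train_s3_uri: str, role: str) -> Pipeline:
    """Blueprint: build an identical XGBoost training pipeline for any leg model."""
    session = sagemaker.Session()
    # Built-in SageMaker XGBoost container
    image_uri = sagemaker.image_uris.retrieve(
        framework="xgboost", region=session.boto_region_name, version="1.7-1"
    )
    estimator = Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type="ml.m5.2xlarge",
        output_path=f"s3://my-bucket/{model_name}/artifacts",  # placeholder bucket
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(objective="reg:squarederror", num_round=300)
    step_train = TrainingStep(
        name=f"Train-{model_name}",
        estimator=estimator,
        inputs={"train": TrainingInput(train_s3_uri, content_type="text/csv")},
    )
    return Pipeline(name=f"{model_name}-training", steps=[step_train],
                    sagemaker_session=session)

# The same blueprint instantiated for each leg model, for example:
# berth_pipeline = build_training_pipeline("berth", "s3://my-bucket/berth/train", role)
```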
Because the Combined model depends on outputs from the other three specialized models, we use AWS Step Functions to orchestrate the SageMaker pipelines for training. This helps ensure that the individual models are updated in the correct sequence and maintains prediction consistency across the system. The orchestration of the training pipelines is shown in the following pipeline architecture.
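As an illustration, this orchestration could be expressed in Amazon States Language along the following lines, with the three base-model pipelines in a Parallel state followed by the Combined model pipeline. The pipeline names and role ARN are placeholders; note also that startPipelineExecution is asynchronous, so a production workflow would additionally wait for each pipeline execution to complete.

```python
import json
import boto3

# Hypothetical state machine: train O2P, P2P, and Berth pipelines in
# parallel, then start the Combined model pipeline once all succeed.
definition = {
    "StartAt": "TrainBaseModels",
    "States": {
        "TrainBaseModels": {
            "Type": "Parallel",
            "Branches": [
                {
                    "StartAt": f"Train{name}",
                    "States": {
                        f"Train{name}": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::aws-sdk:sagemaker:startPipelineExecution",
                            "Parameters": {"PipelineName": f"{name.lower()}-training"},
                            "End": True,
                        }
                    },
                }
                for name in ["O2P", "P2P", "Berth"]
            ],
            "Next": "TrainCombined",
        },
        "TrainCombined": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:sagemaker:startPipelineExecution",
            "Parameters": {"PipelineName": "combined-training"},
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="eta-model-training",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsSageMakerRole",  # placeholder
)
```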
The individual workflow begins with a data processing pipeline that prepares the input data (vessel schedules, AIS data, port congestion, and port performance metrics) and splits it into dedicated datasets. This feeds into three parallel SageMaker training pipelines for our base models (O2P, P2P, and Berth), each following a standardized process of feature encoding, hyperparameter optimization, model evaluation, and registration using SageMaker Processing jobs, hyperparameter tuning jobs, and SageMaker Model Registry. After training, each base model runs a SageMaker batch transform job to generate predictions that serve as input features for the Combined model training. The performance of the latest Combined model version is tested on the last 3 months of data with known ETAs, and performance metrics (R² and mean absolute error (MAE)) are computed. If the model's MAE exceeds a set threshold, the entire training process fails and the model version is automatically discarded, preventing the deployment of models that don't meet the minimum performance bar.
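The evaluation gate can be sketched as follows; the threshold value is illustrative, not the production setting. Raising an exception fails the pipeline step, so an underperforming model version is never registered for deployment.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

MAE_THRESHOLD_HOURS = 24.0  # illustrative threshold, not the production value

def evaluate_or_fail(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    if mae > MAE_THRESHOLD_HOURS:
        # Failing here stops the training workflow and discards the model version
        raise ValueError(f"MAE {mae:.1f}h exceeds threshold {MAE_THRESHOLD_HOURS}h")
    return {"mae_hours": float(mae), "r2": float(r2)}
```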
All four models are versioned and stored as separate model package groups in the SageMaker Model Registry, enabling systematic version control and deployment. This orchestrated approach helps ensure that our models are trained in the correct sequence using parallel processing, resulting in an efficient and maintainable training process.

The hierarchical model approach also helps maintain a degree of explainability comparable to the previous statistical and rule-based solution, avoiding ML black box behavior. For example, it becomes possible to highlight unusually long berthing time predictions when discussing prediction results with business experts. This transparency helps build trust, which in turn increases acceptance within the company.
The inference infrastructure implements a hybrid approach combining batch processing with real-time API capabilities as shown in Figure 5. Because most data sources update daily and require extensive preprocessing, the core predictions are generated through nightly batch inference runs. These pre-computed predictions are complemented by a real-time API that implements business logic for schedule changes and ETA updates.
This architecture enables millisecond-range response times to on-demand requests while achieving 99.5% availability (a maximum of approximately 3.5 hours of downtime per month).
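The nightly batch inference described above could be implemented with a SageMaker batch transform job, sketched here with placeholder model and S3 names:

```python
from sagemaker.transformer import Transformer

# Sketch of a nightly batch inference job; "combined-eta-model" and the
# S3 paths are placeholders, not the actual production resources.
transformer = Transformer(
    model_name="combined-eta-model",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/eta-predictions/nightly",
)
transformer.transform(
    data="s3://my-bucket/eta-features/nightly",
    content_type="text/csv",
    split_type="Line",
)
```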
Hapag-Lloyd's ML-powered vessel scheduling assistant outperforms the previous solution in both accuracy and response time. Typical API response times are on the order of hundreds of milliseconds, helping to ensure a real-time user experience and outperforming the previous solution by more than 80%. Low response times are crucial because, in addition to fully automated schedule updates, business experts need to work with the schedule assistant interactively. In terms of accuracy, the MAE of the ML-powered ETA predictions is approximately 12% lower than that of the previous solution, which translates into climbing two positions on average in the international ranking of schedule reliability. Schedule reliability is one of the key performance metrics in liner shipping, so this is a significant improvement within the industry.
To learn more about architecting and governing ML workloads at scale on AWS, see the AWS blog post Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker and the accompanying AWS workshop AWS Multi-Account Data & ML Governance Workshop.
We acknowledge the significant and valuable work of Michal Papaj and Piotr Zielinski from Hapag-Lloyd in the data science and data engineering areas of the project.
Thomas Voss works at Hapag-Lloyd as a data scientist. With his background in academia and logistics, he takes pride in leveraging data science expertise to drive business innovation and growth through the practical design and modeling of AI solutions.
Bernhard Hersberger works as a data scientist at Hapag-Lloyd, where he heads the AI Hub team in Hamburg. He is enthusiastic about integrating AI solutions across the company, taking comprehensive responsibility from identifying business issues to deploying and scaling AI solutions worldwide.
Gabija Pasiunaite was a Machine Learning Engineer at AWS Professional Services based in Zurich. She specialized in building scalable ML and data solutions for AWS Enterprise customers, combining expertise in data engineering, ML automation, and cloud infrastructure. Gabija has contributed to the AWS MLOps Framework used by AWS customers globally. Outside work, Gabija enjoys exploring new destinations and staying active through hiking, skiing, and running.
Jean-Michel Lourier is a Senior Data Scientist within AWS Professional Services. He leads teams implementing data-driven applications side by side with AWS customers to generate business value out of their data. He's passionate about diving into tech and learning about AI, machine learning, and their business applications. He is also an enthusiastic cyclist.
Mousam Majhi is a Senior ProServe Cloud Architect focusing on Data & AI within AWS Professional Services. He works with Manufacturing and Travel, Transportation & Logistics customers in DACH to achieve their business outcomes by leveraging data and AI powered solutions. Outside of work, Mousam enjoys hiking in the Bavarian Alps.