It’s no secret that large-scale upheavals in the global aviation industry, including the catastrophic impact of the pandemic, have sent airline companies reeling over the past few years. Despite the worldwide chaos, UAE national airline Etihad has managed to use data science insights to generate productivity gains and cost savings.
Based in Abu Dhabi and in operation since 2003, Etihad has in recent years used a data lake and a unified set of AI-driven analytics tools to optimise staffing, the handling of passengers, and responses to customer inquiries.
“Our digital transformation has allowed us to be more streamlined, more agile, and more efficient. In reviewing our positioning as a mid-sized carrier, our governance and way of thinking has had to change,” says Dr Reem Alaya Lebhar, director of Strategy, Management & Portfolio Governance at Etihad.
Etihad began its data science journey with the Cloudera Data Platform and moved its data to the cloud to set up a data lake. They were, however, using multiple vendor technologies to support the data lake, which led to inefficiencies in the way they analysed their data. A change was needed.
“Etihad is on a digital transformation journey. Our data strategy supports our vision of harnessing all of the data that is available across the organisation, breaking down the silos to enhance every business process that we have,” says Martin Hammer, head of Enterprise Data Management at Etihad.
Unifying analytics on a data science platform
Etihad made a decision to unify their data modeling and analytics, choosing Dataiku’s end-to-end machine learning platform to do so.
“Etihad were collecting data, but what they needed was to be able to make insights from this data,” says Siddhartha Bhatia, regional vice president, Middle East and Turkey, at Dataiku. “They wanted to standardize everything, break those silos, into something very standardized.”
As a global airline, Etihad’s custodians of data operate out of different countries. As a server and browser-based application, Dataiku allowed remote and distributed teams to work collaboratively across different time zones and departments.
The low-code visualization tools embedded in Dataiku allowed business heads to work closely with data scientists. The platform also gave the company an opportunity to upskill analysts, notes Talal Mufti, data science manager at Etihad.
Etihad wanted to deploy, schedule and automate their data models very rapidly. They also wanted to be able to demonstrate cost reductions.
Etihad identified a large number of use cases that were short term in nature, which they further developed to evaluate which would provide the biggest hit first.
As a first step, Etihad prioritized use cases based on where there was a maximum benefit, and which could be done in the earlier stages of rolling out the Dataiku platform.
Financial benefits and cost savings became a big driver in many of the use cases shortlisted by Etihad. While the adoption and roll-out of the analytics platform predates COVID, the pandemic did have an impact on the project at a later stage.
Predicting passenger arrivals
One of the use cases was how to predict passenger arrivals, so that Etihad could more efficiently deploy ground staff at airports to handle the flights.
Flight operations require a large number of support staff, some of them permanent and onsite while others are contracted based on requirements. These can include check-in staff and baggage handlers. The rationale for the model was that it is not always clear when operational and support staff are needed. The forecasting window was 14 days, in continuous 30-minute intervals, right up to four hours before each flight.
Using the Dataiku platform, Etihad built a forecasting system to model and predict passenger arrivals. The benefit was that airport managers were able to make better decisions on ground staffing, what staff they needed and when. And with external suppliers this resulted in better contractual negotiations.
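To make the 30-minute intervals concrete, here is a minimal Python sketch of how expected passenger arrivals might be added up per interval across several departures. It is not Etihad's or Dataiku's model. It assumes one reading of the description, namely that passengers for a flight show up over the four hours before departure, and the names `ARRIVAL_PROFILE`, `arrival_bins`, and `expected_arrivals`, along with the share values, are hypothetical. In a real system the shares would come from a trained forecasting model rather than a fixed list.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN = timedelta(minutes=30)      # interval length mentioned in the article
LOOKBACK = timedelta(hours=4)    # assumed arrival window before departure

# Hypothetical share of a flight's passengers arriving in each of the eight
# 30-minute bins, earliest bin first. Illustrative numbers, not Etihad data.
ARRIVAL_PROFILE = [0.02, 0.05, 0.13, 0.25, 0.25, 0.18, 0.09, 0.03]


def arrival_bins(departure):
    """Return the start times of the 30-minute bins before a departure."""
    first = departure - LOOKBACK
    return [first + i * BIN for i in range(int(LOOKBACK / BIN))]


def expected_arrivals(flights):
    """Sum expected arrivals per bin over (departure, booked_passengers) pairs."""
    totals = defaultdict(float)
    for departure, booked in flights:
        for start, share in zip(arrival_bins(departure), ARRIVAL_PROFILE):
            totals[start] += booked * share
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    flights = [
        (datetime(2024, 1, 1, 12, 0), 100),
        (datetime(2024, 1, 1, 13, 0), 200),
    ]
    for start, passengers in expected_arrivals(flights).items():
        print(start.strftime("%H:%M"), round(passengers, 1))
```

Summing per interval rather than per flight is what lets overlapping departures show up as a single staffing peak, which is the figure an airport manager would plan against.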
Another use case that was taken up by the Dataiku team was managing and responding to incoming inquiry emails. The Etihad CRM system was receiving and logging incoming email queries. The challenge was to categorize, forward, and respond in the shortest possible time to these emails. These emails needed to reach the right person through automated categorization.
“The problem was, how do you route those emails efficiently to make sure that they are dealt with, by the correct people and that responses are getting back to the people who are asking the questions as rapidly as possible,” Dataiku’s Bhatia says.
Using NLP to optimize customer response times
What Dataiku built was an email classification system that could look at what was being asked and use NLP (natural language processing) to classify the emails. Using these classifications, the CRM system would then make sure each email was routed to the correct person to be dealt with.
Natural language processing gives computer systems the ability to understand and make decisions from either spoken words or text. NLP algorithms can automatically summarize the main points in a document or email. They can also classify text into categories, organise information, and handle email routing and spam filtering.
Inside Dataiku, the natural language processing model would pick up the emails, do some intelligent analysis on them, and then categorize them according to the particular issue, and create automatic cases within the CRM system.
Incoming emails would be fired at a suitable API within Dataiku. The API would connect with the natural language processing model and process the email, yielding the classification and the call to action within the CRM system.
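The flow described here, from incoming email to classification to CRM action, can be sketched in a few lines of Python. This is an illustration only, not the system Dataiku built: a tiny naive Bayes text classifier stands in for the production NLP model, and the class `NaiveBayesRouter`, the function `handle_email`, the training examples, and the queue names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRouter:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled emails and CRM queues, for illustration only.
TRAINING = [
    ("my bag did not arrive on the carousel", "baggage"),
    ("lost suitcase after my flight", "baggage"),
    ("please refund my cancelled ticket", "refunds"),
    ("I want my money back for the cancelled booking", "refunds"),
    ("how do I change the date of my booking", "booking_changes"),
    ("need to change my flight to next week", "booking_changes"),
]
QUEUES = {
    "baggage": "baggage-services",
    "refunds": "refunds-team",
    "booking_changes": "reservations",
}


def handle_email(router, body):
    """Stand-in for the API step: classify, then describe the CRM case."""
    category = router.classify(body)
    return {"category": category, "queue": QUEUES[category]}


if __name__ == "__main__":
    router = NaiveBayesRouter().fit(TRAINING)
    print(handle_email(router, "My suitcase is lost"))
```

A production classifier would be trained on far more examples and would likely return a confidence score as well, so that uncertain emails can be sent to a person for manual triage.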
“Dataiku has helped develop use cases across the organization that are expected to result in significant cost savings over the next five years,” says Etihad’s Mufti.
Solving data modelling problems
One of the later-stage challenges of data science is data drift. This is when, over a period of time, the incoming data begins to deviate from the original data that was used to build the model in the first place. As a result, the model, which was trained on the original data, is no longer valid.
“So, your predictive capacity and the predictive power of your model, is no longer as efficient as it should have been,” says Dataiku’s Bhatia. Dataiku can take the model back into development, rebuild and retrain it, and put it back into production.
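One common way to spot drift of this kind is to compare the distribution of a feature in the original training data with its distribution in recent data. The Python sketch below uses the Population Stability Index (PSI) for that comparison. It is a generic illustration, not a description of how Dataiku detects drift; the function `psi` is hypothetical, and the 0.25 threshold is a widely used rule of thumb rather than a figure from Etihad or Dataiku.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width over the reference range; values outside that range
    are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Rule-of-thumb threshold (an assumption): above this, consider retraining.
DRIFT_THRESHOLD = 0.25

if __name__ == "__main__":
    training_values = list(range(100))
    recent_values = [v + 50 for v in training_values]  # shifted distribution
    score = psi(training_values, recent_values)
    print("PSI:", round(score, 2), "retrain:", score > DRIFT_THRESHOLD)
```

A check like this would typically run on a schedule, with a score above the threshold triggering the rebuild-and-retrain cycle described above.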
The initial use cases for the Dataiku platform have generated significant cost savings for Etihad, which has built confidence in the continued use of data science through these challenging, post-pandemic recovery times, company officials say.
“Dataiku is one of the critical components of our enterprise data platform that gives our data science community all the tools that they need in one place, and facilitates collaboration across different groups of stakeholders,” Etihad’s Hammer says.
Moving forward, Etihad plans to continue to use the data-modelling platform to solve operational bottlenecks and deliver process efficiencies in a variety of use cases.