An optimization approach towards air traffic forecasting: A case study of air traffic in Changi airport

South-East Asia is considered one of the regions with the fastest-growing air traffic in the world. Congested air traffic and weather conditions have thus become major factors in air traffic management. In this paper, a model for air traffic forecasting is developed that can forecast country, city-pair and airport-pair air traffic. Results show that passenger forecasting for Singapore depends not only on that country but on neighbouring countries as well. The paper predicts that passenger movements in Changi airport will increase up to 81.65 million people by 2023, which is 31.23 % more than in 2017. Also, the number of passenger aircraft between the Singapore and Jakarta city-pair will increase up to 34702 by 2023, which is 26.6 % more than in 2017. In addition, the number of passenger aircraft between the Changi airport and KLIA airport-pair will be between 31698 and 40311 by 2023.


Introduction
Air traffic forecasting is an essential part of project planning in aviation. Project success depends heavily on forecast accuracy, which is why the aviation industry needs accurate air traffic forecasts.
In general, air traffic forecasting models are divided into two categories: macroscopic models and microscopic models [4]. Macroscopic models aim to forecast air traffic in a whole region, country or city. Microscopic models, on the other hand, aim to forecast air traffic between city pairs or regions only. Depending on the explanatory variables used, forecasting models can also be divided into those based on geo-economic factors and those based on service-related factors. Geo-economic factors are location-based and economic factors such as GDP; these are usually outside the airline's control. Service-related factors depend on the airline, such as quality of service and airfare. In this paper, a macroscopic model with geo-economic factors has been implemented, using the microscopic method of forecasting as a base. The model shows the relationship between city-pair air traffic and country air traffic.
The future of air traffic is unknown, but an estimate of it is essential for planning current projects. Efficient air traffic flow management (ATFM) relies on accurate air traffic forecasting [5]. Akça also shows that, by analyzing future passenger demand, air traffic network performance and the accessibility of waypoints can be evaluated [2]. Air traffic forecasting has been an important topic since 1970 [3]. Initially, it was done using linear or quadratic models, which were not able to predict accurately. More recently, a correlation between GDP and air traffic was identified. Using this finding, researchers developed econometric models that use GDP as the main parameter for estimating air traffic [22]. Past research has shown that there is a need for liberalization of air traffic and the adoption of an open-sky concept [9]. These measures might help to increase competitiveness and the efficiency of airspace use. In addition, Hazledine suggests that forecasting the air traffic of passenger flights on city-pair routes depends on many factors [11].

Literature review
Many different models have been proposed in the forecasting literature. Some researchers used a neural network approach [21], and others used support vector machines [25] to determine air traffic. Wenzel et al. [23] proposed a model that forecasts air traffic movements while taking airport capacity into account. The model consists of 5 steps, but each step relies on many assumptions. Gravio et al. [7] used time series to forecast the safety performance of air traffic by estimating the number of occurrences of each event. However, it is very difficult to obtain an accurate database for this purpose.
Meanwhile, Jenatabadi and Ismail [12] studied the impact of economic conditions on airline performance, using a triangular model and considering internal operations. They addressed the three most common gaps in estimating performance by using a latent variable instead of a measurement variable. Their research concluded that the economic status of a country plays an important role in planning, decision making and strategy making.
Air traffic has been forecast in many ways [19,18]. Air traffic forecasting based on time series models with aggregated and disaggregated approaches has been studied in Spain [14]. Hazledine [11] used a gravity model of trade to forecast air traffic; however, treating passengers as trade between countries did not produce promising results. Some researchers proposed forecasting the air traffic congestion status, rather than the traffic itself, to evaluate the air traffic situation [26], using fuzzy C-means and a support vector machine. Similarly, Athina states that planners forecast air traffic at Madrid-Barajas airport using the GDP of Spain only [17]. However, a single GDP input might be insufficient and can lead to unnecessary deviation. These points motivated the development of the proposed model. There is also a probabilistic method proposed by researchers at NASA [15], which estimates the probability distribution function of aircraft passing a particular point. However, this function is estimated from each individual aircraft's probability of passing that point, so if a large number of aircraft is considered, the resulting cumulative error can also be large.
Many researchers and organizations have shown that there is a correlation between GDP and air traffic [16]. ICAO uses worldwide GDP to estimate world traffic [1]. However, such a model cannot accurately represent the air traffic of a specific airspace or country. That is why this paper proposes an econometric model combined with optimization to increase the accuracy of air traffic forecasting.

Methodology
In order to forecast air traffic accurately, historical data were obtained and past forecasting models were investigated. Past models range from linear models to models that use GDP data to forecast air traffic. However, in the literature reviewed, optimization has not been used for air traffic forecasting, especially in Singapore. Optimization helps to find a country-specific model that forecasts air traffic accurately.
One way to examine airport congestion is to determine the number of aircraft movements or passenger movements in the airport. Similarly, airspace congestion can be examined by determining the number of aircraft movements in the airspace. A mathematical model was developed to calculate these quantities. The general formulas of this model, Formulas (1) to (4), use the following quantities. $N_{AM,j}$ is the number of aircraft movements in airport $j$. $N_{AM,jk}$ is the number of aircraft movements between $j$ and $k$. $N_{PF,jk}$ is the number of passengers flying between $j$ and $k$. $LF$ is the load factor [1], the ratio of passenger-kilometres travelled to seat-kilometres available. $AS$ is the aircraft size, the ratio of total seats offered to the total number of aircraft. $CF_{jk}$ is the number of cargo flights between $j$ and $k$. $e_j$ and $e_k$ are the exports of goods (millions of USD) of countries $j$ and $k$ respectively. $A$, $B$, $C$, $D$, $r$, $l$ and $m$ are constants to be found by the optimization program.
Formula (1) calculates the number of aircraft movements in airport $j$: it equals the sum of the numbers of aircraft movements between $j$ and every other country $k$. Formula (2) calculates the number of aircraft movements between two countries: it equals the number of passengers flying between those two countries divided by the product of load factor and aircraft size [1], plus the cargo flights between them. As shown in [1], air traffic depends on the GDP of the countries. Formula (3) calculates the number of passengers flying between two countries. It is based on the fact that traffic between two countries is related to their GDP, so a general GDP-dependent formula is proposed whose coefficients are found by the optimization program. Similarly, when $j = k$, the number of domestic flights can be calculated from Formula (2): the number of domestic aircraft movements in $j$ equals the number of passengers on domestic flights divided by the product of load factor and aircraft size, plus the cargo flights within country $j$. Cargo flight movements between $j$ and $k$ are found by Formula (4). When $j = k$, Formula (4) gives the number of internal cargo flights.
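The verbal descriptions of Formulas (1) and (2) are specific enough to be written out. The following is a reconstruction from those descriptions only, with $N_{AM,j}$ denoting aircraft movements in airport $j$, $N_{AM,jk}$ aircraft movements between $j$ and $k$, $N_{PF,jk}$ passengers between $j$ and $k$, $LF$ the load factor, $AS$ the aircraft size and $CF_{jk}$ the cargo flights; the exact typeset form may differ. The functional forms of Formulas (3) and (4) are not recoverable from the text and are therefore not shown.

```latex
\begin{align}
N_{AM,j}  &= \sum_{k} N_{AM,jk} \tag{1} \\
N_{AM,jk} &= \frac{N_{PF,jk}}{LF \cdot AS} + CF_{jk} \tag{2}
\end{align}
```

Dividing passengers by $LF \cdot AS$ converts a passenger count into an aircraft count, since $LF \cdot AS$ is the average number of occupied seats per aircraft.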
The traffic on specific air routes can be found with the help of these formulas. Assume that airport $j$ has $w$ air routes, numbered from 1 to $w$, and let the number of aircraft movements in airport $j$ using air route $a$ be $N_{AM,aj}$. Then $\sum_{a=1}^{w} N_{AM,aj} = N_{AM,j}$. Assume that the ratio of $N_{AM,j}$ to $N_{AM,aj}$ is $N_{AM,j} / N_{AM,aj} = p_1 + p_2 t$, where $p_1$ and $p_2$ are constants. Then the number of flights on air route $a$ of airport $j$ is $N_{AM,aj} = N_{AM,j} / (p_1 + p_2 t)$, where $N_{AM,j}$ is found by Formula (1) and $p_1$ and $p_2$ are found from historical data. These general formulas can be applied to any country. Using Formula (3), the number of passenger movements to airport $j$ can be found. Owing to the limited data available, Formula (1) can be simplified: instead of using the GDP data of all other countries, some flight movements can be expressed in simplified form.
In order to apply this model to Changi airport, the number of countries whose GDP is used in the model is restricted to 8: Singapore, Indonesia, Malaysia, Philippines, Australia, China, Thailand and India. Singapore does not have domestic flights, so domestic flights are not considered. Formula (8) shows how the number of passenger movements in Changi airport is calculated.
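The route-level relation above (airport total divided by route total equals p1 + p2 t) can be illustrated with a minimal Python sketch. It assumes that t is a year index and that p1 and p2 are obtained by an ordinary least-squares line fit; the text only says they come from historical data. The function names and all numbers are hypothetical.

```python
import numpy as np

def fit_route_ratio(t, airport_movements, route_movements):
    """Fit N_AM_j / N_AM_aj = p1 + p2 * t with an ordinary least-squares line."""
    t = np.asarray(t, dtype=float)
    ratio = (np.asarray(airport_movements, dtype=float)
             / np.asarray(route_movements, dtype=float))
    p2, p1 = np.polyfit(t, ratio, deg=1)  # polyfit returns the slope first
    return p1, p2

def route_movements_at(airport_total, t, p1, p2):
    """Route traffic implied by the ratio model: N_AM_aj = N_AM_j / (p1 + p2 * t)."""
    return airport_total / (p1 + p2 * t)

# Made-up history constructed so that the ratio is exactly 4 + 0.5 * t.
t_hist = [0, 1, 2, 3]
airport_hist = [1000.0, 1100.0, 1200.0, 1300.0]
route_hist = [a / (4 + 0.5 * t) for a, t in zip(airport_hist, t_hist)]

p1, p2 = fit_route_ratio(t_hist, airport_hist, route_hist)
route_forecast = route_movements_at(1400.0, 4, p1, p2)
```

Using a year index starting at 0 rather than calendar years keeps the line fit well conditioned.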
Here $x$, $y$, $B$, $l$ and $m$ are variables found by the optimization program such that the sum of squared errors is minimized. The optimization program used for this calculation is CPLEX. GDP values of Singapore and its neighbouring countries are given in [8]. These values were used in Formulas (1) to (8). The model was then tested on 2010-2015 data, followed by forecasting of Singapore air traffic from 2018 to 2023. In addition, data on aircraft movements and air freight movements in Singapore are given in [24].
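The fit-then-forecast workflow can be sketched in Python under two loudly stated assumptions. First, the exact form of Formula (8) is not reproduced in the text, so the sketch uses a hypothetical stand-in, P(t) = sum of B_i * GDP_i(t) + l + m * t, and omits the variables x and y. Second, NumPy's least-squares solver is swapped in for CPLEX. All numbers are synthetic.

```python
import numpy as np

def fit_gdp_model(gdp, t, passengers):
    """Least-squares fit of the assumed form P(t) = sum_i B_i * GDP_i(t) + l + m * t.

    gdp has shape (n_years, n_countries); t holds one year index per row.
    Returns (B, l, m).
    """
    gdp = np.asarray(gdp, dtype=float)
    t = np.asarray(t, dtype=float)
    y = np.asarray(passengers, dtype=float)
    # Design matrix: one column per country GDP, then intercept, then trend.
    X = np.column_stack([gdp, np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    n = gdp.shape[1]
    return coef[:n], coef[n], coef[n + 1]

def predict(gdp, t, B, l, m):
    """Evaluate the assumed model for given GDP values and year indices."""
    return np.asarray(gdp, dtype=float) @ B + l + m * np.asarray(t, dtype=float)

# Synthetic, made-up numbers (not data from the paper).
t_hist = np.arange(6)
gdp_hist = np.array([[1.0, 2.0], [1.2, 2.1], [1.1, 2.5],
                     [1.5, 2.4], [1.4, 3.0], [1.9, 2.8]])
passengers_hist = gdp_hist @ np.array([3.0, 2.0]) + 10.0 + 0.5 * t_hist

B, l, m = fit_gdp_model(gdp_hist, t_hist, passengers_hist)
forecast = predict([[2.0, 3.0]], [6], B, l, m)  # uses a forecast GDP row for year 6
```

Forecasting then amounts to evaluating the fitted model on projected GDP values, so forecast quality is bounded by the quality of those projections.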

Modelling
First, it must be determined whether the proposed methodology produces valid results. To do so, air traffic data from 1999 to 2009 were used to fit the model, and data from 2010 to 2015 were used to validate it. Eight models are proposed, which consider different numbers of inputs. The first model uses Formula (8), where the GDPs of Singapore, Malaysia, Indonesia, Philippines, Australia, China, Thailand and India are used to find the number of passenger movements in Singapore. The second model is a simplified version of the first, in which the least impactful country has been eliminated. Each subsequent model is likewise a simplified version of the previous one, with the least significant country eliminated. The least significant country for forecasting Singapore air traffic is determined from the coefficient of each contributor (its B value), as shown in Table 3. The smaller the absolute value of the coefficient, the less significant the contributor.
Figure 1 shows the structure of the forecasting model proposed in this paper. First, all the countries with high traffic flow need to be determined. The number of countries can be from 5 to 10. For example, if the country to be forecast is Singapore, the countries with the busiest routes might be Malaysia, Indonesia, Thailand, Philippines, China, Australia and India. After that, past traffic data and GDP data have to be collected.
Second, the model coefficients are determined using CPLEX. Each country in each model has its own B coefficient. Using these coefficients, the optimal model can be determined. After that, future air traffic can be forecast using forecast GDP values. These values can be obtained from different sources, for example the World Bank website. More details are given in the results and discussion section.
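The elimination procedure (fit, drop the country with the smallest absolute B value, refit, repeat until only the linear part remains) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the model form B_i * GDP_i + l + m * t is a hypothetical stand-in for Formula (8), NumPy least squares replaces CPLEX, and the country labels "A", "B", "C" and all data are made up.

```python
import numpy as np

def fit_B(gdp, t, y):
    """Return the GDP coefficients of a least-squares fit with intercept and trend."""
    X = np.column_stack([gdp, np.ones(len(t)), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[:gdp.shape[1]]

def elimination_order(gdp, names, t, y):
    """Repeatedly drop the country whose B has the smallest absolute value."""
    gdp = np.asarray(gdp, dtype=float)
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = list(range(gdp.shape[1]))  # indices of countries still in the model
    dropped = []
    while cols:
        B = fit_B(gdp[:, cols], t, y)
        weakest = int(np.argmin(np.abs(B)))
        dropped.append(names[cols[weakest]])
        del cols[weakest]
    return dropped  # once every country is dropped, only the linear model is left

# Made-up inputs: country "B" contributes almost nothing, "A" contributes most.
t = np.arange(8)
gdp = np.array([
    [1, 2, 1, 2, 1, 2, 1, 2],
    [1, 1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 1, 1, 2, 2, 1],
], dtype=float).T
y = gdp @ np.array([5.0, 0.01, 2.0]) + 10.0 + 0.5 * t
order = elimination_order(gdp, ["A", "B", "C"], t, y)
```

Comparing raw B values, as the text does, is only meaningful when the GDP inputs share comparable units and scale; whether any rescaling was applied is not stated.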

Case study 1: Forecasting passenger movements in Changi airport
First, the simulation was run using data from 1999 to 2009. The average error, maximum error and RMS (root mean square) error of the forecast for the years 2010 to 2015 were found to be 2.84 %, 7.82 % and 3.79 % respectively. These errors (below 10 %) are considered acceptable for forecasting. Hence, the proposed methodology was applied to the prediction of passenger movements in Singapore. Passenger movements in Changi airport were forecast using 8 different models. The number of variables was decreased one at a time by eliminating the least important variable. In Figure 2, model 1 is a model with all GDP data. Model 2 excludes the GDP of China. Model 3 excludes the GDP of China and India. Model 4 excludes the GDP of China, India and Indonesia. Model 5 excludes the GDP of China, India, Indonesia and Australia. Model 6 is a model with GDP data from Singapore, Malaysia and Thailand plus the linear model. Model 7 is a model with GDP data from Singapore and Thailand plus the linear model. Model 8 is a linear model. The coefficients (B values, x, y, l, m) of the model are computed by the optimization program. Historical data from older years usually have less importance than data from recent years. Therefore, the objective function uses the prioritized least sum of squared errors instead of the least sum of squared errors.
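A prioritized least sum of squared errors can be read as weighted least squares in which recent years carry larger weights. The weighting scheme is not given in the text, so the sketch below assumes weights that grow linearly with recency; the function names and data are hypothetical.

```python
import numpy as np

def recency_weights(n_years):
    """Assumed scheme: weights grow linearly from the oldest to the newest year."""
    w = np.arange(1, n_years + 1, dtype=float)
    return w / w.sum()

def prioritized_fit(X, y, w):
    """Minimize sum_k w_k * (y_k - X_k @ coef)^2 by rescaling rows with sqrt(w_k)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def prioritized_sse(X, y, w, coef):
    """Value of the weighted objective for a given coefficient vector."""
    residuals = y - X @ coef
    return float(np.sum(w * residuals ** 2))

# Made-up series: the oldest observation is an outlier, the rest follow y = t.
t = np.arange(5, dtype=float)
y = np.array([5.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(t), t])
w = recency_weights(len(t))

coef_plain = prioritized_fit(X, y, np.ones_like(t))  # ordinary least squares
coef_prior = prioritized_fit(X, y, w)                # recent years prioritized
err_plain = abs(y[-1] - X[-1] @ coef_plain)
err_prior = abs(y[-1] - X[-1] @ coef_prior)
```

In this toy case the prioritized fit tracks the most recent year more closely than the unweighted fit, which is the behaviour the prioritized objective is meant to produce.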
After running the simulation with all data, $B_{China}$ has the smallest absolute value among all B values. Therefore, the GDP of China was dropped for the second simulation. In the third simulation, the GDP of India was dropped because $B_{India}$ has the smallest absolute value among the B values in the second simulation. Similarly, the GDP of the country with the smallest absolute B value is dropped in each subsequent simulation until the model becomes linear.
Each B value represents the level of significance of a particular country. The smallest absolute B value for model 1 is $B_{China}$, which means that the GDP of China does not affect Singapore air traffic as much as the GDP of the other countries considered in the case study. The actual passenger movements in Changi airport were then compared with the values from the 8 models for the years 1998 to 2017. Figure 2 shows the passenger movement values for Changi airport from 1998 to 2017 for the 8 models. Table 1 shows the average, maximum and RMS values of the errors and the values of the prioritized objective function for all models. Model 1 has the smallest value of the objective function, which shows that more GDP inputs give a more accurate prediction of air traffic. However, from model 3 to model 1, the error percentages of passenger movements do not decrease significantly. RMS errors from model 1 to model 7 increase up to 5.19 %, which is still 2.37 times less than the RMS error of the linear model. Model 3 can forecast passenger movements in Changi airport with an accuracy similar to that of models 1 and 2, with an RMS error below 5 % and an average error below 3 %. In order to forecast passenger movements in Changi airport, GDP data of 8 countries are needed. The GDP predictions from the World Bank were used for air traffic forecasting [10].
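The three error measures reported for each model can be computed as in the short sketch below. The paper does not define them formally, so the sketch assumes that each error is an absolute percentage error relative to the actual value; the numbers are made up.

```python
import numpy as np

def error_summary(actual, predicted):
    """Average, maximum and RMS of the absolute percentage errors (assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct = 100.0 * np.abs(predicted - actual) / actual
    return {
        "average": float(pct.mean()),
        "maximum": float(pct.max()),
        "rms": float(np.sqrt(np.mean(pct ** 2))),
    }

# Made-up values: the percentage errors are 3 % and 4 %.
summary = error_summary([100.0, 200.0], [103.0, 192.0])
```

Because squaring emphasizes large deviations, the RMS error is never smaller than the average error, which matches the ordering of the reported values.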
Passenger movements were forecast using the 8 models mentioned above. The forecast of passenger movements in Changi airport from 2018 to 2023 is shown in Figure 4, which indicates that passenger movements in Changi airport will increase up to 81.65 million people by 2023, 31.23 % more than in 2017. The paper proposes that models 1 to 3 can be used to forecast passenger movements in Changi airport accurately.
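As a quick consistency check on these figures, the 2023 forecast and the stated growth rate together imply a 2017 baseline. The value below is derived here from the two numbers in the text and is not quoted from the paper's data.

```latex
\[
P_{2017} \approx \frac{81.65}{1 + 0.3123} = \frac{81.65}{1.3123} \approx 62.22 \text{ million passengers}
\]
```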
Figure 5 shows a graphical representation of ASEAN traffic on 25th October 2013. The simulation uses traffic data from ASEAN countries, which were collected and consolidated, and was run using SAAM software. The airspace border of each ASEAN country is highlighted in black. Figure 6 shows a graphical representation of ASEAN air traffic in 2021. Comparing Figure 6 with Figure 5, a significant increase in the number of orange and red boxes can be observed, suggesting that congested areas are expected to increase between Singapore, Kuala Lumpur, Bangkok and Jakarta. The number of flights between Singapore and China is also expected to grow significantly.
The proposed model is also applied to forecast city-pair and airport-pair air traffic. Here the model is used to forecast the number of non-stop passenger aircraft between the Changi airport and KLIA airport-pair and the Singapore-Jakarta city-pair. Formula (7) with GDP data of Singapore, China, Australia, India, Indonesia, Malaysia, Philippines and Thailand is used for forecasting the number of passenger aircraft. The coefficients are calculated by the optimization program. In order to give more importance to minimizing the errors of recent years than those of older years, the objective function uses the prioritized least sum of squared errors instead of the least sum of squared errors.

Case study 2: Forecasting the number of passenger aircraft between the Singapore-Jakarta city-pair
The number of passenger aircraft between Changi airport and Jakarta International airport is forecast using 8 different models. The values from the 8 models are compared with actual values (from Innovata) from 2004 to 2017. The number of passenger aircraft between Changi airport and Jakarta International airport from 2004 to 2017 for the 8 models is shown in Figure 7. The error percentages of the 8 models over the same period are shown in Figure 8. It can be observed that the error percentage of each model has a decreasing trend from 2004 to 2017. Error percentages in 2004 for model 1 to model 8 are up to 8.88 %, whereas in 2017 error percentages for model 1 to model 7 are up to 1.64 %.
Table 2 shows the average, maximum and RMS values of the errors and the value of the objective function for all models mentioned above. Model 1 has the smallest value of the objective function, whereas model 8 has the highest. This shows that more GDP inputs give a more accurate prediction of air traffic. However, from model 4 to model 1, the error percentages of passenger aircraft do not decrease significantly, indicating that these models have a similar level of accuracy. Model 4 can forecast the number of passenger aircraft between Changi airport and Jakarta International airport with an accuracy similar to that of models 1 to 3, with an RMS error below 2.40 % and an average error below 1.70 %. RMS errors from model 1 to model 7 increase up to 2.36 %, which is still 3 times less than the RMS error of the linear model. Forecasting of the number of passenger aircraft is done using 8 models. Figure 9 shows the forecast of the number of passenger aircraft between Changi airport and Jakarta International airport from 2018 to 2023: the number of passenger aircraft will increase up to 34702 by 2023, 26.6 % more than in 2017. The paper proposes that models 1 to 4 can be used to forecast the number of passenger aircraft accurately.
For the Changi airport and KLIA airport-pair, forecasting of the number of passenger aircraft is done using 4 models. Figure 12 shows the forecast of the number of passenger aircraft between Changi airport and KLIA from 2018 to 2023. According to models 1, 2 and 3, the number of passenger aircraft will be between 31698 and 40311 by 2023.

Advantages and limitations of the model
1. Compared with the model proposed by Viktor Surian [20], the model described here takes into account important factors such as the economic well-being of surrounding countries. Also, Viktor Surian [20] considers only 4 variables in his model, whereas this research initially considers 11 variables.
2. The model can be used to forecast the air traffic of any airport. A specific model can be obtained for a specific airport using an optimization toolbox, which finds the values of the coefficients of the general formula from the data provided.
3. The model gives the user the flexibility to trade off forecasting accuracy against model complexity. Multiple models can be obtained for a specific airport depending on the number of variables, and the user can choose the model that best suits their requirements.
4. By using optimization to find the coefficients of the model, the best-fit model to the historical data is found. This gives the least possible fitting error, and the optimal coefficients give a model with more accurate forecasting.
5. The accuracy of the forecast GDP values of countries is unknown. Air traffic forecasting for any country is directly related to the GDP of that country. That is why air traffic forecasting accuracy depends on the accuracy of the GDP forecasts for that country and for nearby countries. In the case studies, forecast GDP values are taken from the World Bank website.
6. Forecasting cannot be certain. Unexpected events could happen in the future, such as a global crisis, a virus outbreak or a war. It is very difficult to estimate the effects of such events on air traffic, which is why they are not discussed in this paper.

Conclusion
In this research, a model for air traffic forecasting has been developed. Case studies were performed, and the model was validated on the basis of low average and RMS errors. The model was able to forecast country, city-pair and airport-pair air traffic. Results show that passenger forecasting for Singapore depends not only on that country but on neighbouring countries as well. The research predicts that passenger movements in Changi airport will increase up to 81.65 million people by 2023, which is 31.23 % more than in 2017. Also, the number of passenger aircraft between the Singapore and Jakarta city-pair will increase up to 34702 by 2023, which is 26.6 % more than in 2017. In addition, the number of passenger aircraft between the Changi airport and KLIA airport-pair will be between 31698 and 40311 by 2023. It can be concluded that the accuracy of the forecasting improves as the number of neighbouring countries involved increases.

Figure 1. The structure of the forecasting model

Figure 2. Passenger movement values for 1998 to 2017 using 8 different models (millions of people).

Figure 3. The error percentages of 8 models for 1998 to 2017 (in %)

Figure 7. The number of passenger aircraft between Changi airport and Jakarta International airport

Figure 9. Forecasting of the number of passenger aircraft between Changi airport and Jakarta International airport using 8 models

Figure 11. Error percentages of the number of passenger aircraft between Changi airport and KLIA

Figure 12. Forecasting of the number of passenger aircraft

Table 1. Average, maximum and RMS errors for 8 models. The models have numbers of variables ranging from 11 to 2. A dash (-) indicates that a value is not applicable. The country with the smallest absolute B value is dropped in each subsequent simulation until the model becomes linear. For example, model 1 is a model with all GDP data. Since the smallest absolute value among these 7 coefficients is $B_{China}$, the GDP of China is not considered in model 2. Model 2 is a model with all GDP data excluding the GDP of China. In model 3, India has been dropped because $B_{India}$ has the smallest absolute value among the B values in model 2. Model 3 is a model with all GDP data excluding the GDP of China and India. Model 4 is a model with all GDP data excluding the GDP of China, India and Indonesia. Model 5 is a model with all GDP data excluding the GDP of China, India, Indonesia and Australia. Model 6 is a model with GDP data of Singapore, Malaysia and Thailand plus the linear model. Model 7 is a model with GDP data of Singapore and Thailand plus the linear model. Model 8 is a linear model.

Table 2. Average, maximum and RMS errors for different models

Table 3. Average, maximum and RMS errors for different models

Table 3 shows the average, maximum and RMS values of the errors and the value of the objective function for all models mentioned above. Model 1 has the smallest value of the objective function, which is 45 % lower than that of model 2.