Hierarchical Forecasting of the Zimbabwe International Tourist Arrivals

The objectives of the paper is to: (1) adopt the hierarchical forecasting methods in modelling and forecasting international tourist arrivals in Zimbabwe; and (2) coming up with Zimbabwe international tourist arrivals Prediction Intervals (PIs) in Quantile Regression Averaging (QRA) to hierarchical tourism forecasts. The unavailability of statistical models for Zimbabwe international tourist arrivals that cater for disaggregated tourism data and account for uncertainty due to parameter estimation methods, has resulted in poor marketing strategies, infrastructure and policies targeting wrong tourism groups. Furthermore, the country is failing to attract significant Foreign Direct Investment for particular tourist arrivals. Zimbabwe’s monthly international tourist arrivals data from January 2002 to December 2018 was used. The data set was disaggregated according to the purpose of the visit. Three hierarchical forecasting approaches, namely top-down, bottom-up and optimal combination approaches were applied to the data. The results showed the superiority of the bottom-up approach over both the top-down and optimal combination approaches. Forecasts indicate a general increase in aggregate series. The combined methods provide a new insight into modelling tourist arrivals. The approach is useful to the government, tourism stakeholders, and investors among others, for decision-making, resource mobilisation and allocation. The Zimbabwe Tourism Authority (ZTA) could adopt the forecasting techniques to produce informative and precise tourism forecasts. The data set used is before the COVID-19 pandemic and the models indicate what could happen outside the pandemic. During the pandemic the country was under lockdown with no tourist arrivals to report on. The models are useful for planning purposes beyond the COVID-19 pandemic.


Introduction
The tourism sector is an important economic sector of every economy. It helps alleviate poverty and promotes economic growth ( [1]). Therefore the sector needs to be properly managed. Accurate tourism demand forecasts lead to proper tourism management as they help in decision-making and planning. Tourism managers, the government and investors are interested in tourism purpose of visit as well as forecasts at the national level, provincial level and each tourist attraction centre. Accurate international tourism demand forecasts are beneficial to investors in particular, in determining how much to invest and in which tourism sector. The government needs accurate forecasts for resource allocation and policy implementation. The tourism managers need accurate forecasts in policy formulation and amendment. Accurate tourism forecasts can be generated through the use of statistical methods and other techniques. Statistical forecasting methods have better forecasting performance as compared to judgemental methods ( [2]). Pure time series forecasting methods deliver more accurate tourism forecasts as compared to econometric models ( [3]). However, the general agreement in terms of forecasting The arrival of international tourists in Zimbabwe is influenced by several factors, such as the purpose of visit (PoV), season and weather conditions. Therefore, hierarchical forecasting is appropriate for this type of data. Hierarchical forecasting will improve the decision-making processes, as particular characteristics and dynamics in the tourism data can be viewed and explained ([2]).
In this paper, the Zimbabwe international tourist data are categorised according to PoV that is: In-transit (I), Business (B), Education (E), Holiday (H) and Shopping (S). This implies total international arrivals can be disaggregated into these five groups, resulting in a one level hierarchy. This decomposition is chosen based on the availability of data, as highlighted by [6]. Political instability and a deteriorating economy can be attributed to tourist arrivals decline ( [7]. With hierarchical forecasting, one can identify the declining tourist arrivals category as well as other factors contributing to the decline, so that the necessary corrective measures can be taken. Prediction intervals (PIs) are vital in forecasting; they provide supplementary information that can be used in decision-making. According to [8], PIs allow better assessment of future uncertainty and help in planning. The purpose of this paper is to adopt both the hierarchical forecasting methods proposed by [3] and the Quantile Regression Averaging (QRA) PIs in forecasting international tourist arrivals in Zimbabwe, for both aggregated and disaggregated data by PoV. Hierarchical forecasting methods observe the grouped structure of time series data, and in so doing improve forecasting accuracy ( [9]). PIs take into account uncertainty from the parameter estimation method as well as the noise in the input data and they improve prediction accuracy ( [10]).
To the best of our knowledge, the application of the QRA PIs to hierarchical forecasting tourism forecasts has not been considered on African tourism, particularly Zimbabwe, where international tourist arrivals are disaggregated by PoV. Using the best hierarchical forecast approach estimates to come up with the QRA PIs in the tourism industry provides new insight into modelling tourist arrivals in Africa and Zimbabwe in particular.
The rest of the paper is organised as follows: Section 1 presented the introduction, Section 2 reviews the literature. The modelling and individual series prediction approaches are presented in Section 3, empirical results and discussions are in Section 4. Section 5 concludes the paper.

Literature Review
In the tourism industry, tourism forecasting plays a significant role in decision-making. Because of its importance, several and frequently used models namely: Autoregressive (AR), Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), Autoregressive Distributed Lag Model (ADLM), Error Correction T. MAKONI, D. CHIKOBVU AND C. SIGAUKE 139 Model (ECM), Artificial Neural Networks (ANN), Vector Autoregressive (VAR), and the Na?ve model among others, have been developed. These models offer various advantages over others.
Several casual methods (see: [11]; [12]; [13]; [14]; [15]; [16]) and non-casual tourism studies for Africa exists (see: [17]; [18]). Casual methods analyse the causal relationships between the tourism demand variable and its influencing factors ( [19]). In casual methods, data gathering and analysing is cost and time consuming ( [20]) and the approach is not especially accurate in forecasting tourism demand ( [21]), hence are not employed in this paper. Non-casual methods (univariate time series) such as the Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA) and Exponential Smoothing (ES) are less time consuming and costly than regression-based or econometric models ( [22]). Univariate time series are incorporated in hierarchical forecasting techniques.
In Zimbabwe, [23], [24], [25], [26] and [27] are among those researchers who focused tourism determinants. The current study differs in that it applies non-casual methods because of their ability to capture tourism dynamics, quantifies future uncertainty and generates more informative forecasts. [28] focused at specific tourist destination sites (the Victoria Falls Rainforest, Zimbabwe) and not at country level as considered in this paper. Tourism forecasts for national level are more important for policy formulation.
The current paper employ the hierarchical forecasting technique, Quantile Regression Averaging (QRA) and PIs which capture tourism dynamics and quantify the uncertainities in future values due to parameter estimation unlike the hierarchical forecasting model developed by [29] which only captures tourism seasonality. QRA PIs provide supplementary information useful for effective marketing strategies and decision-making ( [30]; [8]). [31], employed a SARIMA(0, 0, 0)(1, 1, 0) 12 model in forecasting Bhutan's tourism demand. [32], modelled tourist demand for Macedonia and an ARIMA(1, 1, 1) model was found as an appropriate model. The model projected an increase in tourist arrivals. In Sri Lanka, [33] fitted a SARIMA model using monthly data. The HEGY test was used to identify seasonality. The SARIMA(1, 0, 16)(36, 0, 24) 12 model with a seasonal component was found to be a better model for the data.
[34] fitted a SARIMA model to Taiwan's outbound tourists. They projected an increase in outbound tourists to Japan, Hong Kong and the USA, which are the major tourist destinations for Taiwanese. [35] used a SARIMA model in estimating South Africa's tourism demand from its major overseas tourism markets namely: the USA, Germany, France, Great Britain and the Netherlands. The SARIMA(1, 0, 1)(0, 1, 1) 12 model and SARIMA(1, 1, 1)(0, 1, 1) 12 model fitted well to tourist arrivals from France and Netherlands, respectively. Tourist arrivals from the USA, Germany and Great Britain were found to follow a SARIMA(0, 1, 1)(0, 1, 1) 12 model. SARIMA models have stood the test of time since their existence from the 1970s to date.
According to a Denmark research done by [36], a Time-Varying Parameter (TVP) model fitted well to the tourism demand of Denmark. The results were achieved after the evaluation of the forecasting performance of both univariate and econometric models. Exponential Smoothing (ES), ARIMA, and ANN models were used by [37] in modelling Hong Kong's tourism demand. The ANN models were found to be the best forecasting approach for Hong Kong's tourism demand. [4] compared forecasting performance of ECM, AR, VAR and ARIMA models while projecting UK's outbound tourism demand. Research findings indicated the superiority of the ECM over other models.
Much has been achieved in modelling tourism demand using current techniques, but the use of hierarchical forecasting techniques has received little attention. Credit is given to [2] in hierarchical forecasting. They proposed the approach while forecasting tourist arrivals in Australia. The major advantage of the hierarchical forecasting approach is that it generates more accurate and coherent disaggregated forecasts than other corresponding 140 HIERARCHICAL FORECASTING OF THE ZIMBABWE INTERNATIONAL TOURIST ARRIVALS forecasting approaches applied to the same data set.
In Australia, [2] disaggregated the tourism data by states, zones and regions and used hierarchical forecasting methods. [2] used two different top-down, two mixed bottom-up and the optimal combination approaches in forecasting Australian domestic tourism demand. They noted that the optimal combination and the top-down approaches based on forecast of proportions performed better in their paper and this was based on the accuracy measures used.
The new top-down and bottom-up approaches proposed by [2], together with the optimal combination approach, can capture various characteristics of tourism data. The optimal combination approach makes use of a regression model to come up with unbiased combination forecasts as well as minimum variances ( [2]). Results from their study were used for decision-making in Australia and they indicated areas that needed special attention.
When forecasting these types of hierarchical series, the aggregated forecasts should exactly correspond to disaggregated estimates ( [38]). However, according to [9], estimates of higher-level may not automatically tally with forecasts from lower levels. The incoherence of the series is applicable to tourism data as base forecasts of PoV (using disaggregated series) may not tally with national level forecasts (using aggregated series). Therefore, reconciling international tourist arrivals using hierarchical forecasting methods proposed by [2] is suggested as a solution to the problem articulated. The bottom-up, top-down (average historical proportions, the proportion of historical averages and forecast proportions) and optimal combination approaches developed by [3] are adopted. These approaches are capable of summing up the point forecasts of PoV, so that they are identical to the point forecasts of the aggregated data.
Different views exist in determining the best approach in hierarchical time series analysis. [39] concluded that no method could outperform the other and further indicated that performance varies according to the situation at hand in hierarchical models. [40] noted no difference in terms of performance between the top-down and bottom-up approaches during forecasting the aggregate demand in production planning. During an investigation of aggregated variable time series forecast strategies with specific sub-aggregated time series statistical correlation, [41] concluded the top-down approach outperforming other approaches. Similar findings were also obtained by [42], while applying hierarchical forecasting methods in the production planning and inventory control cases. Their top-down approach produced better results than the results produced by the bottom-up approach.
Conversely, [43], indicated that the bottom-up approach performed better and also showed that it was the best method when forecasts are statistically generated. Whilst forecasting items in a product line using 15000 aggregated series, [44] noted that the bottom-up approach produced more accurate forecasts than the top-down approach. They compared predictions from the bottom-up and top-down approaches. Another study by [45] indicated the bottom-up approach as better than the top-down approach while estimating future total-entity earnings of 26 diversified companies using segment and consolidated revenue and profitability data. However, [2], [3], [39] and [5] all noted that: for disaggregated series, the bottom level series are volatile, hence tricky to model and the bottom-up approach can be outdone by other methods.
PIs based on the Quantile Regression Averaging (QRA) technique are more accurate than those from the best performing individual model ( [46]). PIs perform well when the sample size considered is large ( [47]). The sample size considered in this study allows for the construction of PIs.
According to [48], PIs generate a range of future values; thus, the interval width and coverage rate can be used to evaluate the accuracy of PIs. A good PI model is one with a narrow width and coverage rate closer to the nominal coverage rate ( [49]; [50]). [51] developed an instrument useful for PI forecasting assessment called Winkler score. During combining tourism interval forecasts for tourist arrivals in Hong Kong, [49] used Winkler scores to measure interval forecasting performance.

Methods
In this section, statistical methods in the form of the hierarchical forecasting approach, QRA and PIs and other methods used in this paper will be discussed.

Hierarchical structure
According to [52], a grouped time series is a collection of time series that can be grouped together in a number of non-hierarchical ways, while a hierarchical time series is a collection of several time series are linked together in a hierarchical structure. Tourist arrivals by state and PoV are an example of grouped time series, while tourist arrivals by state and region represent a hierarchical time series. The hierarchical structure adapted to the Zimbabwe international tourist arrivals is the total series and PoV series as shown in Figure 1. This is a Level 1 hierarchy. Completely aggregated series is named total arrivals' and is under Level 0, whereas Level 1 is constituted by disaggregation series (decomposed by PoV).
In hierarchical time series, individual forecast at each level will make up forecast of the upper level when summed. For grouped time series, each group's forecasts will be equivalent to the forecasts of individual series constituting that group. Both hierarchical and grouped time series are capable of producing coherent forecasts. Hierarchical time series is less complex as compared to grouped time series with disaggregation constraints. For this study, the authors considered grouped time series, since total international tourist arrivals are being disaggregated according to PoV. Hierarchical forecasting approach deals with both grouped and hierarchical time series.
Adopting the notation of [52], Y t represents an m-vector of all tourism observations at time t, B t becomes an n-vector (n = 5) of disaggregated tourism observations at the bottom level (level 1) and Y (X,t) being the vector of observation on series X at time t. From these, we have Equation (1), Equation (1) can be reduced to Equation (2), where matrix S m×m k (S 6×5 ) is the summing matrix that stores the structure of the hierarchy (m=6 is the total number of series in the hierarchy and m k = 5 is the number of bottom level series). There are several methods used in hierarchical time series modelling, some of which are discussed in the following sections.

The bottom-up method
With this approach, lowest level forecasts are produced, which are later on aggregated to the higher level through the use of the summing matrix ( [2]). The dynamics of specific series will be captured since information is not lost with this approach but fitting a forecasting model becomes difficult if the bottom level series is unstructured and noisy ( [2]). More bottom level series imply more series to be forecasted at the lower levels.

The top-down method
According to [2], top-level forecasts are produced and are disaggregated to the lower levels of the hierarchy through the use of proportions. The approach works well in the presence of low count data ( [2]). With this method, it is easy to come up with single forecasting models and reliable forecasts for aggregated levels. [53], highlighted three methods (average historical proportions, proportions of historical averages and forecasted proportions) of coming up with the proportions.

Average historical proportions
For each series at the lower level of the hierarchy, proportions are found, Under this method, the individual proportion p i is mirroring the average of the historical proportions of the bottom level series (level 1 series) ( Y (X,t) ) from t = 1, ..., 204, relative to the total aggregate Y t ), which is the total international tourist arrivals.

Proportions of historical averages
The formula is given by: Under this method, the individual proportion p i uses the average historical value of the bottom level series (level 1 series) ( Y (X,t) ) from t = 1, ..., 204, relative to the total aggregate (Y t ), which is the total international tourist arrivals.

Forecasted proportions
This method makes use of h-step ahead base forecasts that can be denoted byŶ n (h). Independent lowest level forecasts for all the series in the hierarchy are produced. Proportions of individual base forecasts at each level against the cumulative base forecasts at that level are computed. The general formula for M level hierarchy is given by: where a (a = 1, 2, ..., m k ) denotes a particular bottom level series,Ŷ l a,t represents the base forecasts of the series corresponding to the node which is l levels above a whereas the sum of base forecasts below the series that are l levels above node a and are directly linked with that series. As for the bottom level series H in Figure (1), we have: whereŜ Total,t =Ŝ Total arrivals,t andŜ H,t =ŷ (H,t) +ŷ (B,t) +ŷ (I,t) +ŷ (E,t) +ŷ (S,t) . [54] introduced this approach that makes use of a regression model for the purpose of coming up with optimally combined and reconciled forecasts. With this method, interactions and relationships between series at every level in the hierarchy, as well as special tourism events and seasonality, are considered. The h-step-ahead base forecasts in a hierarchy are represented by:Ŷ

Optimal combination approach
where is the vector of the unknown means of base forecasts of bottom level M,Ŷ n (h) and ε h is the error term with a mean of zero and a covariance matrix ∑ h . Equation (7) then reduces to: Due to the unavailability of information and complexity of computing ∑ h , [54] assumed that the error is predicted using the forecast error of the bottom level, hence ε h ≈ Sε (M,h) . With this assumption, the error terms will meet the aggregation constraints as the data in the hierarchy. The unbiased estimator for β h becomes: and revised forecasts will be given by:Ȳ The revised forecasts are unbiased because SPS = S since P = (S ′ S) −1 S ′ .

Individual series prediction method
There are various models (Exponential Smoothing (ES), Random Walk (RW) and ARIMA, etc.) that are used in the prediction of individual series at various levels and hierarchies. The ES and ARIMA were adopted by [55] while [2], adopted the ES alone. In this paper, the modelling and forecasting of individual hierarchical series at all the two levels is done using the ES methods through a default R software algorithm incorporated in forecast package by [54]. According to [56], this is the commonly used methods when it comes to generating individual forecasts and in time series analysis. They further acknowledged the inexpensiveness of ES and their ability to produce accurate forecasts in various fields.
According to [57], the best three ES methods are the simple exponential smoothing, Holt-Winters' seasonal method and the Holt's linear trend method. The simple exponential smoothing (SES) is most appropriate to non-seasonal and de-trended time series data ( [58]); [56] and [49]). [59] indicated that the Holt-Winters smoothing techniques are more applicable to seasonal business data. Tourism data usually exhibit seasonality and trend hence the individual series cannot be predicted using the SES. As a way of dealing with the tourism trends and seasonal components, the paper will use the Holt-Winters' methods.

Forecast combination and prediction Intervals Quantile Regression Averaging (QRA)
The forecasts are combined using Quantile Regression Averaging (QRA) ( [46]; [60]. [61] noted that combining or averaging forecasts of two or more models improves accuracy while also reducing the variance of forecasting errors. In this paper we will also use QRA to compute the PIs. The QRA is given as: whereŷ ti represents the forecasts from method i,ŷ QRA t,τ denotes the combined forecasts and ε t,τ is the error term. The parameters of Equation (12) are estimated by minimizing the loss function for a particular τ th quantile,

Performance measures
The Winkler score is used to measure the accuracy of PIs. Coverage rate and interval widths are useful in evaluating performance of PIs, but this study will make use of Winkler score. Winkler's score accounts for merits of both coverage rates and interval widths ( [49]). The Winkler score measures both the interval width and the coverage rate. According to [49], a smaller Winkler score shows better performance of PIs. Given that y t denotes future predicted values, Winkler score can be expressed as: where y t denotes exact future values, p is the PI percentage, w t denotes the width of the interval, L t and U t are the lower and upper limits, respectively.
The Mean Absolute Percentage Error (MAPE) is used to measure accuracy of hierarchical forecasting approaches. The MAPE can be expressed as: whereŶ t and Y t are the forecasted and actual values, respectively, and n represents total observations.

Exploratory Data Analysis
Monthly Zimbabwe international tourist arrivals data provided by both the Zimbabwe Tourism Authority (ZTA) and the Zimbabwe National Statistics Agency (ZIMSTAT) from January 2002 to December 2018 are used. The data set used was before the COVID-19 period. The models derived from the data indicate what could happen outside of the pandemic period. The data are disaggregated according to PoV to come up with a hierarchy with Level 0 and Level 1. The descriptive statistics of the tourist arrivals are summarised in Table 1  [62] concluded that tourist arrival pattern can be recognised using box-plots. The box-plots for the monthly international tourist arrivals are illustrated in Figure 7 (Annexure 1).
The box-plots in Figure 7 (Annexure 1) show summaries of international tourist arrivals for each category. The box-plots indicate that the international tourist arrivals to Zimbabwe are dominated by those who come for holiday, followed by in-transit, shopping, business and lastly education. All these international tourist arrivals may visit various tourist attractions in various provinces in the country.

Zimbabwe international tourist arrivals.
International tourist arrivals, according to PoV are shown in Figure 2.
From Level 1, the blue line represents holiday (H), the red line represents in-transit (I), the purple line represents shopping (S), the light green represents business (B) and the green line represents education (E). A mixed pattern of international tourist arrivals is shown at both levels, with high numbers experienced between years 2000 and 2008, then a slight increase around 2016. At Level 1, it is noted that most international tourists are for holiday purposes, followed by in-transit, then shopping, and finally business and education visitors. In-transit tourist arrivals are high, probably due to the fact that Zimbabwe connects several Southern African countries (South Africa, Mozambique, Zambia, Botswana and Namibia), hence a lot of foreign tourists pass through Zimbabwe on their way to various neighbouring countries. [2] used ES based on innovations state space model in forecasting the Austrian tourism demand. The same method is used in this study in modelling and forecasting all Level 0 and Level 1 tourism series. The methods include additive models that result in better forecast accuracy, according to [2]. training the models. Tourism observations from October 2017 to March 2018 are used for validation. One to sixstep-ahead tourism forecasts are done using the bottom-up, top-down (average historical proportions, the proportion of historical averages and forecast proportions) and the optimal combination approaches. These approaches reconcile and optimally combine the international tourist forecasts. The one to six-step-ahead tourism forecasts are done by increasing the sample size by one month (observation) and re-doing the forecasts. Table 2 summarises the results.

Forecasting accuracy of the models
The last column labelled 'Average MAPE' in Table 2 indicates the MAPE value for each approach across the entire forecasted horizons (1, 2, 3, 4, 5 and 6). The best performing method is the one with a lower MAPE value and is identified by bold entries in Table 2. For both the upper level series (total tourist arrivals) and lower level series (purpose of visit), it is noted that the bottom-up approach is the best performing method according to the average MAPE values displayed in Table 2. The method is capable of generating more accurate tourism forecasts.
The bottom-up approach is the best performing for Level 0 and Level 1 series, as exhibited in Table 2, since the average MAPE values are lower. The bottom-up approach outperforms the other approaches when forecasting total arrivals as well as educational and shopping tourists, as supported by the MAPE values. However, the optimal combination approach seems to be the best for holiday tourists. The top-down approach based on average historical proportions and forecasted proportions are the best for modelling and forecasting business and in-transit international tourists respectively. [3] showed that any top-down method introduces bias into the reconciled forecasts at each disaggregation level, even if the base forecasts are unbiased. To overcome this short-coming, we are going to combine the forecasts using QRA. It is concluded that the bottom-up approach gives the overall best fit for both hierarchies based on the lower average MAPE values associated with the approach. This may be due to the nature of tourism data and the way it is disaggregated, that is according to PoV. Furthermore, the short-term forecasts used for evaluation purposes may have favoured these approaches. This is consistent with part of the results of [2] study, when they concluded that the bottom-up approach forecasts outperformed top-down forecasts based on forecasted proportions. Furthermore, [43] noted the good performance of the bottom-up approach over the top-down approach.

International tourist forecasts
The bottom-up approach is applied to predict future 60-months out-of-sample Zimbabwean international tourist arrivals across all levels of the hierarchy. Tables 4 and 5 (Annexure 3 and 4) summarises the forecasts produced using the bottom-up approach, while Figure 3 is the graphical representation of the forecasts and the original series data.
From Figure 3, the solid line(s) represent(s) the historical international arrivals tourism data, while the dashed line(s) represent(s) the projected international arrivals tourism forecasts. Of course these forecasts did not anticipate the Covid-19 pandemic. However, the forecasts give the potential for each grouping beyond the pandemic. The colours represent Level 1 forecasts, the blue line represents holiday (H), red line represents in-transit (I), the purple line represents shopping (S), light green represents business (B) and green line represents education (E).
According to the bottom-up approach, international tourist arrivals are seasonal, as high peaks are observed towards the end of every year. The plot for aggregated series, holiday tourists and in-transit also indicated seasonality with a slightly trend. Holiday tourists have the highest frequency, probably due to the wonderful geographical nature and warm temperatures in Zimbabwe. Harare is the sunshine city of the world climate wise. Investors can invest in the accommodation and transport sectors, targeting these holiday makers and in-transit tourists. The government could channel more resources to the accommodation and transport sectors, so as to accommodate all international tourist arrivals. Tourism managers could also keep on marketing the available tourist attraction centres and be creative to introduce new cultural and other entertaining activities when the COVID-19 pandemic is over or manageable. The plot also indicated that the overall educational tourism flow in Zimbabwe will be low and stable over time, probably due to the deteriorating economy and industry, as well as poor educational facilities. Unavailability and relatively high costs of some basic goods in shops may be the reasons for the relatively constant number of shopping tourists. The education sector may introduce some role model scholarships, with the help of the government, as a way of luring more educational tourists. Zimbabwe has a good reputation in education in the region, but has not harvested any fruits in the form of large numbers of educational tourists, like Australia. Graduates will be ambassadors in their respective countries. Business visitor arrivals were increasing slightly, probably due to the on-going campaign by the president of Zimbabwe, telling the world that Zimbabwe is open for business. There is a need for better policies from the government so that the country will receive more business tourist arrivals and this will improve the country's foreign currency reserves. Improving the political environment in Zimbabwe will certainly help in this endeavour. However the COVID-19 pandemic needs to be brought under control first.

Forecast combination results and prediction intervals
Forecasts from the bottom-up approach are used in the QRA, particularly the aggregated series denoted by BUtotal, since the rest were not giving good prediction intervals. Plots of the forecasts, density, normal Q-Q plot and the box plot are done and displayed in Figure 4.  Figure 4 shows some variations in international tourist arrivals and the forecasts are a little bit diverting from a normal distribution, according to the density and normal Q-Q plot. The forecasts (πt) are smoothed using the penalised cubic smoothing spline function. The function is given by: where α is the smoothing parameter, f (t) is a smoothing spline function at time t obtained from a noisy observation denoted by y t and f ′′ (t) is the second derivative of the smoothing spline function at time t. According to [63], the Generalised Cross Validation (GCV) method gives an optimal regularisation parameter based on leaveone-out cross validation procedure. The α value based on the GCV (α = 1:49995) was not used, as it gave a straight line. It was then selected based on the trial and error method until we got a fairly good fit as shown in Figure 5 (α = 0:049995).
The fitted values were then extracted and used as the independent variable in the quantile regression model. The quantile(s) to be estimated is denoted by τ, BUtotal is regressed against BUfitted values for the τ = 0 : 5 for median forecasts and τ = 0 : 025 and τ = 0 : 975 for the lower and upper 95% PIs. 99% PIs were also constructed using a similar procedure. The performance of 99% and 95% PIs were evaluated using Winkler score and the results are in Table 3.
From the results in Table 3, Winkler score values for a 99% PIs are lower than that of 95% PIs for all forecasting periods. This implies 99% PIs outperform the 95% PIs. Because of this, the 99% PI is considered as the best.  For a full probability distribution of the forecast, we find quantiles in the range τ ∈ (0; 1). We further regressed BUtotal against BUfitted values for the τ = 0.01 for median forecasts and τ = 0.01 and τ = 0.99 for the lower and upper 99% prediction intervals. Figure 6 displays forecasts from these values. Figure 6 indicates forecasts with small deviations from the PIs. The obtained operational forecasts from the QRA are important to decision-makers in the tourism sector.

Conclusion
The hierarchical forecasting methodology is applied to Zimbabwe's international tourist arrivals. The bottom-up, optimal combination and three versions of the top-down approaches are considered. The bottom-up approach is considered the best performing approach for Zimbabwe's international tourist arrivals according to the forecasting performance evaluation method. These results are consistent with what was concluded by [64], when their bottom-up approach outperformed other approaches. Our aggregate Zimbabwean international tourist arrival forecasts exhibited a mixed pattern with seasonality over the 60 months forecasted.
Forecasts using disaggregated tourism series by the PoV (holiday and in-transit) indicated a slight increase, while forecasts for educational tourists indicated a low and stable decrease. Educational infrastructure and resources need to be improved so that the educational sector attracts more foreigners. Zimbabwe has the potential to boost its education tourism in the region of Southern Africa after South Africa. The government, through the Reserve Bank of Zimbabwe (RBZ), could alleviate foreign currency shortages experianced by shops and manufacturing companies so that they are able to restock and produce more goods. This will bring in the much-needed foreign currency into the country, through shopping tourist, thus creating a virtuous cycle.
Since business tourists are few, the president of Zimbabwe and other interested parties have been campaigning and advertising that Zimbabwe is open for business. If a success, this will lead to foreign direct investments and improve the economy and livelihoods of Zimbabweans. Further, business policies need to be amended so that they are favourable to the international investors in the tourism sector. Holiday tourists are the only group with high figures on the hierarchy, motivated mainly by the geographical nature of tourist attraction centres in the country (Victoria Fall Rainforest, Chinhoyi Caves, Great Zimbabwe Monuments, etc.). Disaggregated series allowed us to recognise the nature of international tourists received in Zimbabwe. The hierarchical approach helped us in coming up with demand for specific categories and this helps in the identification of the needed attention and interventions in the particular categories. This is a wealth of information in preparation for the post Covid-19 period.
QRA techniques are applied to the aggregated forecasts from the bottom-up approach. Under the QRA, smoothing techniques are applied to the forecasts and points forecasts with PIs are generated. The aggregated QRA forecasts using 99% PIs indicated a good fit as a small variance is observed. The QRA forecasts can be used to guide business people and other investors. Hence the ZTA could adopt the QRA approach in coming up with improved tourism forecasts with prediction intervals, for investors and government to be guided accordingly. PIs are helpful in planning purposes. The mentioned approaches apply outside the COVID-19 period, assuming the pandemic is eventually contained.