Economic dispatch of electrical power in South Africa: An application to the Northern Cape province

Power utility companies rely on demand forecasts to plan and operate the electricity system. This study presents an application of linear quantile regression, non-linear quantile regression, and additive quantile regression models for forecasting extreme electricity demand at the peak hours 18:00, 19:00, 20:00 and 21:00, using Northern Cape data for the period 01 January 2000 to 31 March 2014. The variables were selected using the least absolute shrinkage and selection operator. Additive quantile regression models were found to be the best-fitting models for hours 18:00 and 19:00, whereas linear quantile regression models were found to be the best-fitting models for hours 20:00 and 21:00. Out-of-sample forecasts for seven days (01 to 07 April 2014) were used to solve the unit commitment problem using mixed-integer programming. The unit commitment results showed that using all the generating units (hydroelectric, wind power, concentrated solar power and solar photovoltaic) is less costly. This study's main contribution is the development of models for forecasting hourly extreme peak electricity demand. These results could be useful to system operators in the energy sector who have to minimise cost when scheduling and dispatching electricity during peak hours, when the grid is constrained by peak load demand.


Background
Economic dispatch is essential in power system operation and is defined as the power planning operation with minimum operating costs [1]. The purpose of economic dispatch is to provide optimal power generation at a minimum cost of operation. It also supports important aspects of power system operation, such as meeting load demand at minimum cost by scheduling the committed generating units, reducing emissions, maintaining system stability, and satisfying security restrictions [2]. Electricity load forecasting is important for economic dispatch because it provides estimates of future electricity production and consumption [3], which help electricity utilities to maintain the balance of demand and supply [4]. Electricity load forecasting is also important for production planning and trading on the electricity markets. It has various applications, such as energy purchasing and generation, load switching, contract evaluation and infrastructure development. Load forecasting helps electric utility management in planning the distribution of electricity [5,6,7]. Electricity load forecasting faces rising challenges due to innovative technologies such as smart grids, electric cars, and renewable energy production.

Peak electricity load is the highest load over a given period. Peak electricity load forecasting is essential because it helps to ensure that enough supply is available. Underforecasting the peak electricity load results in insufficient capacity to meet load demand, and hence in blackouts. Power blackouts are a problem because they disrupt the operation of the economy. Forecasting the extreme peak electricity load addresses the underprediction of peak load demand. Accurate peak electricity load forecasting is therefore very important, as it provides forecasts that help to prevent system failure and power blackouts [9].

An overview of the literature on load forecasting
Electricity load forecasting has received much attention from industrialists and academics in recent years [8]. Various techniques for forecasting electricity loads have been developed, and they are classified into statistical and artificial intelligence techniques [10]. The type of electricity load forecasting can vary depending on the result desired. It is spatial if it mainly relates to studying future patterns in a specific region, country, or state. It is temporal if it concerns hourly, daily, monthly or yearly forecasts. Electricity load forecasts are classified according to their time horizons: very short-term load forecasts (VSTLF), short-term load forecasts (STLF), which range from one hour to one week, medium-term load forecasts (MTLF), which range from a week to a year, and long-term load forecasts (LTLF), which extend beyond a year [10,11,12]. Reference [9] used a quantile regression model to estimate the 1.00 quantile of the distribution in order to prevent system failure and eliminate power blackouts. In other words, the idea was to provide a model that would avoid underprediction. The estimates from the quantile regression model were compared with the actual demand to investigate the ability of the model to avoid power blackouts (that is, to avoid underprediction). The results showed that the prediction of the upper limit for the daily peak demand is accurate when using the 1.00 quantile from the 0.99 to 0.97 quantiles of the distribution [9].
Utility sectors require accurate peak electricity load forecasting, as climate change, technological development, and energy policies contribute to rising peak load. Decision-makers and power generation companies rely on accurate energy demand forecasting for policy creation and power generation planning [13]. Accurate forecasting of peak electricity demand is essential in the electricity sector for planning capacity expansion and medium-term risk assessment [14]. Peak electricity load forecasting and modelling using South African data are discussed in the literature, see [14,15,16]. Reference [15] used an additive model that allows non-linear and nonparametric terms to forecast daily winter peak electricity demand in South Africa. The study showed that peak electricity demand in South Africa is more sensitive to cold temperatures than to hot temperatures. Modelling extreme peak electricity load is useful for quantifying the amount of electricity that can be shifted from the grid to off-peak periods [16]. Reference [16] modelled extreme daily increases in peak electricity demand in South Africa, focusing on tail quantiles of the distribution of daily peak electricity demand. Reference [14] studied the forecasting of peak electricity demand using South African data, focusing on an application of partially linear additive quantile regression models for modelling and forecasting peak electricity demand.
Additive quantile regression (AQR) models have been used in South Africa to forecast short-term hourly load. Forecasts from four developed models were combined based on the pinball loss and quantile regression averaging (QRA). The study found that the AQR model with interactions produced more accurate forecasts than the QRA model [17]. Factors such as wind speed, solar irradiance, temperature, cloud cover, and seasonal variations significantly affect electricity demand, resulting in uncertain and unpredictable electricity demand patterns. It is therefore important to create a robust, intelligent, and adaptive forecast model that accommodates the factors affecting power demand in order to achieve higher forecast accuracy [18].
Unit commitment (UC) is an optimisation problem that helps in managing generating units (deciding when to switch them on or off) under various restrictions and environments [19]. UC seeks to operate the power system at minimal production cost while meeting reserve requirements [20,21]. A reasonable solution to the optimisation problem is very important because it provides operational planners with the optimal number of required generators. According to [22], there are different methods for determining a reasonable solution to the optimisation problem. Optimisation techniques used in solving the UC problem include Lagrangian relaxation (LR), mixed-integer linear programming (MILP), tabu search (TS), dynamic programming (DP) and stochastic programming (SP), see [19]. However, reference [23] revealed that many authors used mixed-integer programming (MIP) and LR methods. Reference [23] developed an approach that integrates short-term load forecasting with UC to minimise production costs for one thermal plant in the Kutahya region of Turkey. Applying the LR method, [23] provides solutions to a UC problem using forecasts from two developed models for electricity demand. Accurate load forecasts are essential because they are among the inputs used in solving the UC problem.

Motivation and contributions
This study intends to address the economic dispatch of electricity through the following contributions:
• Forecasting extremely high quantiles of the demand distribution, which improves the accuracy of forecasts,
• Using seven days of out-of-sample forecasts to solve the unit commitment problem,
• Improving the economic dispatch of electricity through the inclusion of renewable energy sources in the unit commitment problem,
• Providing results which help electricity utilities to operate at minimal cost.

Linear quantile regression model
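Quantile regression (QR), introduced by [24], provides a modelling approach for predicting conditional quantiles of the response variable. Linear quantile regression assumes a linear relationship between the response variable and a vector of explanatory variables when estimating the quantiles of the cumulative distribution function of the response variable [25]. Following [25], linear quantile regression is given by:

$$y_{t,h,\tau} = \mathbf{X}_t^{T}\boldsymbol{\beta}_{\tau} + \xi_{t,\tau}, \tag{1}$$

where $y_{t,h,\tau}$ is electricity demand on day $t = 1, \ldots, n$ at hour $h$, $h$ = 18:00, 19:00, 20:00, 21:00, at quantile $\tau$, $\mathbf{X}_t$ is the vector of explanatory variables, $\boldsymbol{\beta}_{\tau}$ denotes a vector of parameters and $\xi_{t,\tau}$ is a random error term. From [24], quantiles are estimated by applying asymmetric weights to the absolute errors. The quantile loss function is represented by:

$$\rho_{\tau}(u) = u\,\bigl(\tau - I(u < 0)\bigr), \tag{2}$$

where $\tau$ is the quantile probability level and $I(\cdot)$ is the indicator function. The $\tau$th quantile is estimated by the quantile regression method, with the parameter vector $\hat{\boldsymbol{\beta}}_{\tau}$ obtained from the minimisation problem given by:

$$\hat{\boldsymbol{\beta}}_{\tau} = \arg\min_{\boldsymbol{\beta}} \sum_{t=1}^{n} \rho_{\tau}\bigl(y_{t,h,\tau} - \mathbf{X}_t^{T}\boldsymbol{\beta}\bigr). \tag{3}$$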
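The asymmetric weighting in the quantile loss can be illustrated with a small sketch. It assumes the standard check-loss form $\rho_{\tau}(u) = u(\tau - I(u < 0))$ and uses made-up demand values (not the Northern Cape data). It shows that the constant forecast minimising the total loss is a sample quantile, and that under-prediction is penalised more heavily than over-prediction when $\tau$ is large. The value $\tau = 0.85$ is chosen only to keep the toy example readable.

```python
def pinball_loss(y, q, tau):
    """Check (pinball) loss rho_tau(y - q) for one observation."""
    u = y - q
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(ys, q, tau):
    """Sum of pinball losses when the same value q is forecast for every y."""
    return sum(pinball_loss(y, q, tau) for y in ys)


# Toy demand values (hypothetical, for illustration only).
demand = [610.0, 640.0, 655.0, 700.0, 720.0, 760.0, 790.0, 805.0, 850.0, 880.0]
tau = 0.85

# A minimiser of the total pinball loss is attained at one of the data
# points, so it is enough to search over the observed values.
best_q = min(demand, key=lambda q: total_loss(demand, q, tau))

# A forecast 10 units too low is weighted by tau, while a forecast
# 10 units too high is weighted by (1 - tau).
under = pinball_loss(800.0, 790.0, tau)
over = pinball_loss(800.0, 810.0, tau)
```

Here `best_q` is the ninth-smallest observation, which is the sample 0.85 quantile of the ten values. As $\tau$ approaches 1 the penalty for under-prediction dominates, which is why a very high quantile guards against underforecasting peak demand.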

Non-linear quantile regression model
Non-linear quantile regression assumes a non-linear relationship between the response variable and a vector of explanatory variables when predicting the quantiles of the distribution. In non-linear quantile regression modelling, a non-linear mapping function $q_{\tau}$ maps the vector of explanatory variables $\mathbf{X}_t$ into a higher-dimensional feature space [26]. A non-linear quantile regression model is defined by [26,27,28]:

$$y_{t,h,\tau} = \boldsymbol{\beta}_{\tau}^{T} q_{\tau}(\mathbf{X}_t) + \xi_{t,\tau}, \tag{4}$$

where $y_{t,h,\tau}$ is electricity demand on day $t = 1, \ldots, n$ at hour $h$, $h$ = 18:00, 19:00, 20:00, 21:00, at quantile $\tau$, $\boldsymbol{\beta}_{\tau}$ denotes a vector of parameters, $q_{\tau}$ is the mapping function and $\xi_{t,\tau}$ denotes a random error term. Equation (4) can be estimated by:

$$\hat{\boldsymbol{\beta}}_{\tau} = \arg\min_{\boldsymbol{\beta}} \sum_{t=1}^{n} \rho_{\tau}\bigl(y_{t,h,\tau} - \boldsymbol{\beta}^{T} q_{\tau}(\mathbf{X}_t)\bigr), \tag{5}$$

where $\rho_{\tau}$ is the quantile loss function defined for the linear quantile regression model.

Additive quantile regression model
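An additive quantile regression (AQR) model is a hybrid model that integrates generalised additive models (GAM) and QR models. AQR models were initially used by [29] to estimate short-term load demand and were later extended by [30,31]. The AQR model is written as [29,30,31]:

$$y_{t,h,\tau} = \sum_{i=1}^{p} s_{i,h,\tau}(x_{ti}) + \xi_{t,h,\tau}, \tag{6}$$

where $y_{t,h,\tau}$ is electricity demand on day $t = 1, \ldots, n$ at hour $h$, $h$ = 18:00, 19:00, 20:00, 21:00, at quantile $\tau$, $s_{i,h,\tau}$ denote the smooth functions and $\xi_{t,h,\tau}$ is the error term. The smooth function, $s$, is given by:

$$s_{i}(x_{ti}) = \sum_{j=1}^{k_i} \beta_{ij}\, b_{ij}(x_{ti}), \tag{7}$$

where $\beta_{ij}$ represents the $j$th unknown coefficient (parameter) of the $i$th smooth function, $b_{ij}(x_{ti})$ are known spline basis functions and $k_i$ is the number of basis functions. The parameter estimates of Equation (6) are obtained by minimising the function given by:

$$\sum_{t=1}^{n} \rho_{\tau}\Bigl(y_{t,h,\tau} - \sum_{i=1}^{p} s_{i,h,\tau}(x_{ti})\Bigr), \tag{8}$$

where $\rho_{\tau}$ represents the pinball loss function. This study uses the penalised pinball loss function discussed in [31]. Let $\mu(\mathbf{x}_t) = \mathbf{x}_t^{T}\boldsymbol{\beta}$, with $\mathbf{x}_t$ denoting the $t$th row of the $n \times d$ design matrix $\mathbf{X}$. The penalised pinball loss function is then defined as [31]:

$$V(\boldsymbol{\beta}, \boldsymbol{\gamma}, \sigma) = \sum_{t=1}^{n} \frac{1}{\sigma}\,\rho_{\tau}\{y_t - \mu(\mathbf{x}_t)\} + \frac{1}{2}\sum_{i=1}^{p} \gamma_i\, \boldsymbol{\beta}^{T}\mathbf{S}_i\boldsymbol{\beta}, \tag{9}$$

where $\boldsymbol{\gamma}$ denotes a vector of smoothing parameters, i.e. $\boldsymbol{\gamma} = \{\gamma_1, \ldots, \gamma_p\}$, and the $\mathbf{S}_i$ are positive semi-definite matrices which penalise the wiggliness of $\mu(\mathbf{x})$ [31]. The term $1/\sigma$ represents the learning rate, which determines the relative weight of the loss and the penalty. For more details see [31].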
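To make the role of the smoothing parameters and the learning rate concrete, the following sketch evaluates a penalised pinball loss of the form $\sum_t \frac{1}{\sigma}\rho_{\tau}\{y_t - \mu(\mathbf{x}_t)\} + \frac{1}{2}\sum_i \gamma_i \boldsymbol{\beta}^{T}\mathbf{S}_i\boldsymbol{\beta}$. This form is assumed here from the description of [31]. The function names, the tiny design matrix and the penalty matrix are hypothetical and chosen only so the arithmetic can be followed by hand.

```python
def pinball(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quad_form(beta, S):
    """beta^T S beta for a square matrix S given as nested lists."""
    k = len(beta)
    return sum(beta[r] * S[r][c] * beta[c] for r in range(k) for c in range(k))


def penalised_pinball(y, X, beta, tau, gammas, penalties, sigma):
    """Scaled pinball loss plus a quadratic wiggliness penalty."""
    mu = [sum(x * b for x, b in zip(row, beta)) for row in X]
    fit = sum(pinball(yt - mt, tau) for yt, mt in zip(y, mu)) / sigma
    rough = 0.5 * sum(g * quad_form(beta, S) for g, S in zip(gammas, penalties))
    return fit + rough


# Toy inputs (hypothetical): two observations, two coefficients.
y = [3.0, 5.0]
X = [[1.0, 1.0], [1.0, 2.0]]
beta = [1.0, 1.0]
S1 = [[0.0, 0.0], [0.0, 1.0]]  # penalises only the second coefficient

value = penalised_pinball(y, X, beta, tau=0.5, gammas=[2.0],
                          penalties=[S1], sigma=0.5)
```

In this example the fitted values are 2 and 3, the scaled loss term is 3.0 and the penalty term is 1.0. A larger $\gamma_i$ pushes the fit towards smoother functions, while a smaller $\sigma$ gives the loss term more weight relative to the penalty.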

Error measures for probabilistic forecasting and evaluation of methods
Probabilistic forecasts from the proposed models are evaluated and compared using scoring rules. A scoring rule assigns a penalty score, represented by $S(y, F)$, to the probabilistic forecast, where $y$ denotes the observation used for forecast assessment and $F$ represents the forecast distribution [32]. A smaller score indicates a better forecast. Three error measures are used in this study: the continuous ranked probability score (CRPS), the Dawid-Sebastiani score (DSS) and the pinball loss function (PLF).

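Continuous ranked probability score
The CRPS measures the distance between the predicted and the observed cumulative distribution functions of scalar variables [32]. The CRPS is defined by:

$$\mathrm{CRPS}(F, y) = 2\int_{0}^{1} QS_{\tau}\bigl(F^{-1}(\tau), y\bigr)\, d\tau,$$

where $F$ represents the forecast distribution and $QS_{\tau}$ denotes the quantile score defined by:

$$QS_{\tau}(q, y) = \bigl(I\{y < q\} - \tau\bigr)(q - y),$$

where $I$ denotes an indicator function.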
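As a rough numerical illustration, the CRPS can be approximated by averaging quantile scores over a grid of probability levels. The sketch below assumes the quantile-score representation $\mathrm{CRPS}(F, y) = 2\int_0^1 QS_{\tau}(F^{-1}(\tau), y)\,d\tau$ with $QS_{\tau}(q, y) = (I\{y < q\} - \tau)(q - y)$. The uniform forecast distribution on [700, 900] and the observation of 800 are hypothetical.

```python
def quantile_score(q, y, tau):
    """QS_tau(q, y) = (I{y < q} - tau) * (q - y)."""
    return ((1.0 if y < q else 0.0) - tau) * (q - y)


def crps_from_quantiles(quantile_fn, y, n_levels=1000):
    """Midpoint-rule approximation of 2 * integral_0^1 QS_tau(F^{-1}(tau), y) d tau."""
    taus = [(k + 0.5) / n_levels for k in range(n_levels)]
    total = sum(quantile_score(quantile_fn(t), y, t) for t in taus)
    return 2.0 * total / n_levels


def uniform_quantile(tau):
    """Quantile function of a hypothetical uniform forecast on [700, 900]."""
    return 700.0 + 200.0 * tau


crps = crps_from_quantiles(uniform_quantile, y=800.0)
```

For a uniform forecast on an interval of width 200 with the observation at its centre, the exact CRPS is 200/12, so `crps` should be close to 16.67. A narrower forecast distribution centred on the observation would give a smaller score.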

Dawid-Sebastiani score
The drawback of the CRPS is that it is difficult to compute for complex forecast distributions. The DSS is an alternative that overcomes this drawback, because it requires only the first two moments of the forecast distribution and is therefore easy to compute. It is given by [33]:

$$\mathrm{DSS}(F, y) = \left(\frac{y - \mu_F}{\sigma_F}\right)^{2} + 2\log \sigma_F,$$

where $F$ denotes the forecast distribution, with the mean and variance given by $\mu_F$ and $\sigma_F^{2}$, respectively.
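A short sketch shows how the DSS trades off accuracy against sharpness. It assumes the usual form $\mathrm{DSS}(F, y) = ((y - \mu_F)/\sigma_F)^2 + 2\log\sigma_F$. The observation and the two forecast distributions are hypothetical.

```python
import math


def dss(y, mean, sd):
    """Dawid-Sebastiani score from the forecast mean and standard deviation."""
    return ((y - mean) / sd) ** 2 + 2.0 * math.log(sd)


# Two hypothetical forecasts of the same observation with the same mean:
# one sharp (sd = 20) and one diffuse (sd = 40).
observed = 800.0
sharp = dss(observed, mean=780.0, sd=20.0)
diffuse = dss(observed, mean=780.0, sd=40.0)
```

Both forecasts miss the observation by 20 units, but the sharper forecast receives the smaller score because the log term penalises the extra spread of the diffuse one.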

Pinball loss function
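The PLF is relatively easy to use and is defined by:

$$L(q_{\tau}, y_t) = \begin{cases} \tau\,(y_t - q_{\tau}), & y_t \ge q_{\tau}, \\ (1 - \tau)\,(q_{\tau} - y_t), & y_t < q_{\tau}, \end{cases}$$

where $q_{\tau}$ represents the quantile forecast and $y_t$ is the observed value of electricity demand.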

Unit commitment
Unit commitment (UC) minimises the total cost of the generating units within a specific time or interval [21]. According to [34], the methods and algorithms used to determine the optimal solution of UC problems are classified into deterministic, heuristic, and hybrid approaches. This study uses mixed-integer linear programming (MILP), which falls under the deterministic approach. MILP is a special class of linear programming [21,35] in which the decision variables comprise both integer and continuous variables [34]. This study aims to demonstrate how forecasts can be used to solve the UC problem. Let $P_{Gi}^{ht}$ be the power output of unit $i$ serving the Northern Cape system load at hour $h$ ($h$ = 18:00, 19:00, 20:00, 21:00) on day $t$, $t = 1, \ldots, n$; let $P_{Gi\min}$ be the lower limit of the unit power output and $P_{Gi\max}$ the upper limit of the unit power output; let $x_i^{ht}$ be the 0-1 variable (this study assumes that during the peak period all units are up, i.e. $x_i^{ht} = 1$ for all units); let $F_{si}$ be the start-up cost of unit $i$ at hour $h$ (this study assumes that the start-up cost is zero); let $P_R^{ht}$ be the power reserve at hour $h$ on day $t$; and let $F_i$ be the average production cost of unit $i$ (cost/MW). This study uses the fuel cost to represent the average production cost per megawatt.
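Under the two simplifying assumptions stated above (all units on during the peak period and zero start-up cost), and with a linear cost per megawatt, the commitment decision drops out and what remains is a linear dispatch problem. The sketch below is an illustration under those assumptions, not the study's Lingo model. It further assumes a single demand-balance constraint (total output equals forecast demand) and ignores reserve. The unit names reuse the technologies mentioned in this study, but the costs and limits are made up.

```python
def dispatch(units, demand):
    """Least-cost dispatch with all units committed and zero start-up cost.

    units: list of (name, cost_per_mw, p_min, p_max) tuples.
    Returns (output per unit, total cost); raises ValueError if infeasible.
    """
    total_min = sum(u[2] for u in units)
    total_max = sum(u[3] for u in units)
    if not total_min <= demand <= total_max:
        raise ValueError("demand is outside the range the units can supply")

    # Every committed unit must run at least at its lower limit.
    output = {name: p_min for name, _, p_min, _ in units}
    remaining = demand - total_min

    # Fill the remaining demand in order of increasing cost (merit order),
    # which is optimal for linear costs with simple lower and upper bounds.
    for name, _, p_min, p_max in sorted(units, key=lambda u: u[1]):
        extra = min(remaining, p_max - p_min)
        output[name] += extra
        remaining -= extra

    cost = {name: c for name, c, _, _ in units}
    total_cost = sum(cost[name] * p for name, p in output.items())
    return output, total_cost


# Hypothetical units: (name, cost per MW, minimum MW, maximum MW).
toy_units = [
    ("hydro", 5.0, 10.0, 100.0),
    ("wind", 2.0, 0.0, 80.0),
    ("csp", 8.0, 20.0, 150.0),
    ("pv", 3.0, 0.0, 60.0),
]
plan, total_cost = dispatch(toy_units, demand=250.0)
```

In this toy case the cheapest units (wind, then PV) run at their upper limits, hydro covers the remainder, and CSP stays at its lower limit. Once start-up costs, on/off decisions or reserve requirements are reintroduced, the problem is no longer solvable by this merit-order rule and a MILP solver is needed.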
The objective function to minimise generating unit cost over a specific time is given by [21,35]:

$$\min \sum_{t=1}^{n} \sum_{h} \sum_{i=1}^{m} \left(F_i P_{Gi}^{ht} + F_{si}\right) x_i^{ht}$$

Generator power output limits:

$$x_i^{ht} P_{Gi\min} \le P_{Gi}^{ht} \le x_i^{ht} P_{Gi\max}, \quad h = 18, 19, 20, 21, \quad t = 1, \ldots, n, \quad i = 1, \ldots, m.$$

In this study $m = 36$.

The temperature data were aggregated to get each station's maximum, minimum, and average daily temperature. The maximum, minimum, and average daily temperatures for both stations were then combined to get the average of the maximum (AveMaxT), minimum (AveMinT), and average daily temperature (AveTem) of the province that they represent. A penalised cubic smoothing spline was fitted to the response variable to determine the non-linear trend (noltrend18, noltrend19, noltrend20, and noltrend21). The Daytype variable denotes the day of the week, coded as 1 for Monday, 2 for Tuesday, and so on up to 7 for Sunday.
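The construction of the temperature and day-type covariates can be sketched as follows. The sketch assumes that each station supplies a list of within-day temperature readings (the recording frequency is not stated here) and that the provincial value is the simple mean over the two stations. The function names and the readings are hypothetical.

```python
from datetime import date


def daily_summary(readings):
    """Maximum, minimum and mean of one station's readings for one day."""
    return max(readings), min(readings), sum(readings) / len(readings)


def provincial_covariates(station_a, station_b):
    """Average the two stations' daily summaries into provincial covariates."""
    max_a, min_a, avg_a = daily_summary(station_a)
    max_b, min_b, avg_b = daily_summary(station_b)
    return {
        "AveMaxT": (max_a + max_b) / 2,
        "AveMinT": (min_a + min_b) / 2,
        "AveTem": (avg_a + avg_b) / 2,
    }


def daytype(day):
    """Day-of-week code: 1 for Monday through 7 for Sunday."""
    return day.isoweekday()


# Hypothetical readings for one day at two stations.
covariates = provincial_covariates([10.0, 20.0, 30.0], [12.0, 18.0, 24.0])
first_forecast_day = daytype(date(2014, 4, 1))
```

The coding returned by `date.isoweekday()` matches the Daytype convention described above, so 01 April 2014 (a Tuesday) is coded as 2.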

Exploratory data analysis
The summary statistics of electricity demand at hours 18:00, 19:00, 20:00, and 21:00 for the sampling period 01 January 2000 to 31 March 2014 are presented in Table 1. The maximum demand is 880, 905, 884, and 855 for the four hours, respectively. The mean and the median are not equal, which suggests that demand at each of the four hours is not normally distributed. The skewness and kurtosis also indicate that the distributions are non-normal. The Northern Cape electricity demand time series and density plots at 18:00, 19:00, 20:00, and 21:00 are shown in Figure 1. The left panels of Figure 1 present the time series plots, whereas the right panels show density plots of the demand for the respective hours. The seasonal patterns in the left panels of Figure 1 show that, each year, electricity demand in the Northern Cape is higher in winter and lower in summer. The densities in the right panels of Figure 1 indicate that the distributions are not normal, which agrees with the skewness and kurtosis reported in Table 1. Figure 2 shows box and whisker plots of electricity demand for the four hours. Figure 3 shows the plots of electricity demand for the four hours superimposed with a non-linear trend.

Forecasting results
Forecasting electricity demand when covariates are given
The data cover the period 01 January 2000 to 31 March 2014, giving a sample size of n = 5204 observations. The data are split into two parts. The data for the period 01 January 2000 to 25 May 2011 are used for training, which is 80% of the data (n1 = 4163), and the data for the period 26 May 2011 to 31 March 2014 are used for testing, which is 20% of the data (n2 = 1041). The models considered are linear quantile regression (LQR), non-linear quantile regression (NLQR), and additive quantile regression (AQR). Based on the pinball loss presented in Table 2, the best-fitting model is AQR for hours 18:00 and 19:00, and LQR for hours 20:00 and 21:00. The models were fitted for τ = 0.9999 using the R package "qgam" for AQR and the R package "quantreg" for LQR and NLQR.

The best-fitting models were used to forecast the electricity demand after variable importance assessment and variable selection using the least absolute shrinkage and selection operator (Lasso). Figure 4 plots the actual demand and the forecasts from the developed models for hours 18:00, 19:00, 20:00, and 21:00. Figure 4 shows that the forecasts from each model follow the actual demand data closely at a high quantile.
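The sample size and the chronological 80/20 split reported above can be checked with a few lines of date arithmetic. The sketch assumes one observation per calendar day for each hour and a split that keeps the time order (the first 80% of days for training).

```python
from datetime import date, timedelta

start = date(2000, 1, 1)
end = date(2014, 3, 31)

# One observation per calendar day, both end points included.
n = (end - start).days + 1

# Chronological split: first 80% of days for training, the rest for testing.
n_train = round(0.8 * n)
n_test = n - n_train

train_end = start + timedelta(days=n_train - 1)
test_start = train_end + timedelta(days=1)
```

With these dates the computation gives 5204 observations in total, 4163 for training and 1041 for testing, with the training period ending on 25 May 2011 and the test period starting on 26 May 2011.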
Forecasting electricity demand when covariates are not given
The predictor variables Daytype, Month, and Trend are used to forecast the unknown variables: AveMaxT, AveMinT, AveTem, noltrend18, noltrend19, noltrend20, and noltrend21. The forecasted predictor variables are then used to forecast electricity demand at hours 18:00, 19:00, 20:00, and 21:00, as shown in Figure 5.

Table 3 shows the base load demand stations, i.e. CSP and PV. This study used fuel cost to represent the average production cost, given in column 2 of Table 3. Columns 3 and 4 give the minimum and maximum production levels in megawatts, respectively. Data for the peaking stations, i.e. hydroelectric and wind power, are given in Table 4, which has the same layout as Table 3. The fuel cost data are from [36]. The out-of-sample forecasts for the first seven days of April 2014 obtained using the models for hours 18

Conclusion
Using Northern Cape data, this study applied LQR, NLQR, and AQR models to forecast peak hourly electricity demand at an extremely high quantile (τ = 0.9999). Lasso was used for variable selection. The AQR models were found to be the best-fitting models for hours 18:00 and 19:00, whereas LQR models were the best-fitting models for hours 20:00 and 21:00. The best-fitting models were used to forecast the demand for each hour. Operational forecasts were also produced for each of the four hours. The out-of-sample forecasts for the four hours, given in Table 5, were then used as inputs in solving the UC problem. Lingo version 18 was used to solve the optimisation models. The UC results showed that using all the hydroelectric, wind power, CSP and PV generating units is less costly. These were all selected as part of the optimal solution. This study's main contribution is the development of models for forecasting hourly extreme peak electricity demand. These results could be useful to system operators in the energy sector who have to minimise cost when scheduling and dispatching electricity during peak hours, when the grid is constrained by peak load demand.