Volatility Modelling of the BRICS Stock Markets

Volatility modelling is a key factor in equity markets for risk and portfolio management. This paper focuses on the use of univariate generalized autoregressive conditional heteroscedasticity (GARCH) models for modelling the volatility of the BRICS (Brazil, Russia, India, China and South Africa) stock markets. The study extends the literature by conducting the volatility modelling under seven error distributions: the normal, skewed-normal, Student's t, skewed-Student's t, generalized error distribution (GED), skewed-GED and generalized hyperbolic (GHYP) distributions. Using an ARMA(1, 1)-GARCH(1, 1) model, the volatilities of the Brazilian Bovespa and the Russian IMOEX markets are both well characterized by a heavy-tailed Student's t distribution, while the Indian NIFTY market's volatility is best characterized by the GHYP distribution. The Chinese SHCOMP and South African JALSH markets' volatilities are best described by the skew-GED and skew-Student's t distributions, respectively. The study further observed that the persistence of volatility in the BRICS markets does not follow the same hierarchical pattern under the error distributions, except under the skew-Student's t and GHYP distributions, where the pattern is the same. Under these two distributions, volatility persistence is, in descending order, highest in the Chinese market, followed by the South African, Russian, Indian and Brazilian markets. However, under each of the five non-Gaussian error distributions, the Chinese market is the most volatile, while the Brazilian market is the least volatile.


Introduction
The British economist Jim O'Neill conceptualized the term BRICS as an acronym for five key emerging regional economies: Brazil, Russia, India, China and South Africa. Prior to the inclusion of South Africa in 2010, the first four nations were called "BRIC". The term BRIC came to light in a Goldman Sachs paper [39] in which long-term economic growth rates were projected for the four nations until 2050 [29]. The BRICS nations account for approximately 40% of the world's population, produce approximately 20% of global output [21], and hold an estimated US$4.7tn in joint foreign reserves [31]. The BRICS economies' Gross Domestic Product (GDP) amounted to $13.2tn in 2011, slightly above the Eurozone's $13.1tn and not far below the $15.1tn of the U.S. [10].
In 2015, the BRICS proposed plans for economic, investment and trade cooperation until 2020, with a view to creating a unified mega market with a land-air-sea connection network [38]. Presently, the BRICS Development Bank, now known as the "New Development Bank" (NDB), and the Contingent Reserve Arrangement (CRA) are [...] to 2014 and observed that volatility clustering prevails in the markets. The authors also showed that the decay of the volatility clustering was exponential for all markets except Brazil. Salisu and Gupta [34] analysed the response of the BRICS stock markets' volatility to oil shocks using the Generalised Autoregressive Conditional Heteroskedasticity variant of the Mixed Data Sampling (GARCH-MIDAS) model. Their findings revealed a heterogeneous response of the volatility of the BRICS stock markets to negative and positive oil shocks.
With some exceptions (e.g. [20]; [40]; [22]; [1]), it can be observed from the literature that little research has been carried out on modelling the volatility of the collective BRICS stock returns series. It is even more important to emphasize the fact that with this little research work done, none has modelled the markets with the collective seven error distributions we have used in this study. Hence, to the best of the authors' knowledge, this study is possibly the first attempt to formally model and analyse the BRICS stock markets' volatility with potential contribution to the literature using these seven error distributions together.
This paper focuses on modelling volatility of the BRICS stock returns using GARCH models under the assumption of the stated seven error distributions. The models and error distributions are discussed in Section 2. Empirical results are presented and discussed in Section 3, while Section 4 draws conclusions.

ARCH model
The ARCH model was first proposed by Engle [11] for modelling the changing variance of a time series, motivated by the non-constant conditional variance of a financial asset's return series ($r_t$) given past returns. For the conditional variance, the ARCH(1) model is specified as:
$$\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2,$$
where $\alpha$ and $\omega$ are unknown parameters and $\varepsilon_{t-1}^2$ is a lagged squared innovation term. The positivity conditions $\omega \ge 0$ and $\alpha \ge 0$ are necessary to ensure that $\sigma_t^2 \ge 0$. Another necessary condition, which ensures stationarity, is that $\alpha < 1$. This model is generalized to an ARCH(v) model as:
$$\sigma_t^2 = \omega + \sum_{j=1}^{v} \alpha_j\,\varepsilon_{t-j}^2,$$
where $\alpha_j$ are parameters with j = 1, ..., v.

Parsimonious parametrization of the GARCH model
Engle [11] introduced a volatility process with time-varying conditional variance: the ARCH process. In practice, however, empirical evidence shows that a high ARCH order (a large number of lags) has to be selected for ARCH modelling. A high ARCH order implies the estimation of many parameters, which usually leads to tedious calculations. To reduce this computational burden, Bollerslev [6] extended the ARCH model of Engle by including past conditional variances, proposing the generalised ARCH (GARCH) model as a natural solution to the challenge posed by high ARCH orders. The GARCH model is based on an infinite ARCH specification, and it drastically reduces the number of estimated parameters from an infinite number to just a few. Thus, a GARCH specification often leads to a more parsimonious representation of the conditional variance process and provides added flexibility over the linear ARCH model when parametrizing the conditional variance. It gives parsimonious models that are easy to estimate and, even in its simplest form, has proven remarkably successful in forecasting conditional variances [12].
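The link between a GARCH model and an infinite-order ARCH model can be made explicit by repeated substitution. The following is a sketch for the GARCH(1, 1) case only, assuming $0 \le \beta < 1$ and writing the conditional variance as $\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2$ (the symbols $\omega$, $\alpha$, $\beta$ follow the usual GARCH(1, 1) notation):

```latex
\begin{aligned}
\sigma_t^2 &= \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2 \\
           &= \omega + \alpha\,\varepsilon_{t-1}^2
              + \beta\left(\omega + \alpha\,\varepsilon_{t-2}^2 + \beta\,\sigma_{t-2}^2\right) \\
           &= \dots \\
           &= \frac{\omega}{1-\beta} + \alpha \sum_{j=0}^{\infty} \beta^{\,j}\,\varepsilon_{t-1-j}^2 .
\end{aligned}
```

Three parameters therefore reproduce an ARCH($\infty$) model whose lag weights $\alpha\beta^{j}$ decay geometrically, which is the source of the parsimony described above.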

The GARCH model
The magnitudes of the BRICS markets' return volatility are modelled using the GARCH model developed by Bollerslev [6]. The conditional variance of the GARCH model is stated as a linear function of its own lags, and the model is usually specified by its conditional mean and conditional variance equations. The simplest specification is the GARCH(1, 1) model, with the mean equation defined as:
$$r_t = \mu_t + \varepsilon_t,$$
where $r_t$ denotes the return series, $\varepsilon_t$ is the unpredictable part of the return, known as the residual or innovation, and $\mu_t$ denotes the mean function, usually expressed as an ARMA(p, q) process, i.e.,
$$\mu_t = \phi_0 + \sum_{i=1}^{p} \phi_i\, r_{t-i} + \sum_{i=1}^{q} \theta_i\, \varepsilon_{t-i},$$
where $\phi_0$ is a constant, and $\theta_i$ and $\phi_i$ are parameters with i = 1, ..., q and i = 1, ..., p respectively. In this study, the residual ($\varepsilon_t$) of the return is modelled assuming the following distributions: normal, Student's t, generalized error distribution (with their skew variants) and the generalized hyperbolic (GHYP) distribution.
The variance equation of the GARCH(k, v) model can be stated as
$$\varepsilon_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \sum_{j=1}^{v} \alpha_j\, \varepsilon_{t-j}^2 + \sum_{i=1}^{k} \beta_i\, \sigma_{t-i}^2,$$
i.e. the conditional variance $\sigma_t^2$ of the GARCH(k, v) model is a linear function of the past conditional variances and past squared innovations. The parameter $\omega$ represents the intercept and $z_t$ denotes the standardized residuals. The standardized residuals are random variables with mean 0 and variance 1 [37], and they are assumed to be independent and identically distributed (i.i.d.) [28].
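The recursion in the variance equation can be illustrated with a short simulation. This is a minimal sketch for the GARCH(1, 1) case with Gaussian standardized residuals; the function name `simulate_garch11` and the parameter values in the usage note are hypothetical and are not estimates from this study.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n innovations eps_t = sigma_t * z_t with z_t ~ i.i.d. N(0, 1).

    Returns the innovations and the conditional variances sigma_t^2.
    Assumes a covariance-stationary model (alpha + beta < 1).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = omega / (1.0 - alpha - beta)
    eps, sig2 = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)           # standardized residual z_t
        e = math.sqrt(sigma2) * z         # innovation eps_t = sigma_t * z_t
        eps.append(e)
        sig2.append(sigma2)
        # Variance equation: sigma_{t+1}^2 = omega + alpha * eps_t^2 + beta * sigma_t^2
        sigma2 = omega + alpha * e * e + beta * sigma2
    return eps, sig2
```

For example, `simulate_garch11(1000, 0.05, 0.10, 0.85)` produces a series in which large innovations are followed by elevated conditional variances, i.e. volatility clustering.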

Conditional distributions
This section describes the seven main univariate error distributions used in this study for modelling the markets' volatilities, and they include the normal, skewed-normal, Student's t, skewed-Student's t, GED, skewed-GED and the GHYP distribution.

The normal distribution
The normal distribution is characterized entirely by its mean and variance (its first two moments). It is a symmetric and uni-modal (i.e., single-peaked) distribution with zero skewness and zero excess kurtosis. A random variable X is normally distributed with mean $\mu$ and variance $\sigma^2$ if its density is [16]:
$$f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$
When the residuals $\varepsilon$ are standardized by $\sigma$ (following a mean filtration process), i.e. $z = \varepsilon/\sigma$, this produces the standard normal density expressed as
$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right).$$

The skewed-normal distribution
The skewed-normal (SN) is a parametric class of probability distributions with a shape parameter $\xi$ that regulates the skewness. Hence, it is an extension of the normal distribution that allows for a continuous variation from normality to non-normality [2]. Let $\phi(\cdot)$ and $\Phi(\cdot)$ be the standard normal density function and distribution function respectively, for a random variable X with $\xi$ as the shape parameter [9,2]. The probability density function (pdf) of the skewed-normal distribution is then given as (see [3])
$$f(x;\xi) = 2\,\phi(x)\,\Phi(\xi x). \tag{8}$$
The density function in equation (8) can be stated explicitly as [2]
$$f(x;\xi) = \frac{1}{\pi}\, e^{-x^2/2} \int_{-\infty}^{\xi x} e^{-t^2/2}\, dt,$$
where $X \sim SN(0, 1, \xi)$ for $-\infty < x < \infty$ and any given $\xi \in \mathbb{R}$. When $\xi = 0$, the SN distribution reduces to the normal distribution, and it becomes the half-normal distribution as $\xi \to \pm\infty$. The skewness of the distribution increases as $\xi$ increases in absolute value. The square of a random variable that follows an SN distribution is a Chi-square variable with one degree of freedom ($X^2 \sim \chi^2_1$), irrespective of the value of $\xi$ [3]. For practical numerical work, the location and scale parameters can be incorporated through the linear transformation $Y = \mu + \sigma X$, where $\mu$ and $\sigma$ are the location and scale parameters respectively. The result follows the skewed-normal distribution $Y \sim SN(\mu, \sigma^2, \xi)$, with $\sigma > 0$ [9].
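An alternative representation of the skewed-normal distribution that is useful for financial (return) modelling was presented by Pourahmadi [32]. In this representation, $Y \sim SN(\mu, \sigma^2, \xi)$ is written as a weighted average of a half-normal and a standard normal variable:
$$Y = \mu + \sigma\left(\delta\,|Z_1| + \sqrt{1-\delta^2}\; Z_2\right), \qquad \delta = \frac{\xi}{\sqrt{1+\xi^2}}, \tag{10}$$
where $Z_1$ and $Z_2$ are independent standard normal variables and $-1 < \delta < 1$. When equation (10) is interpreted in the language of financial economics, the return Y is driven (with the inclusion of the location parameter $\mu$) by a Gaussian element $Z_2$ modulated by $\sigma\sqrt{1-\delta^2}$ and a half-Gaussian element $|Z_1|$ modulated by $\sigma\delta$ [9]. The skewness of the distribution becomes more pronounced to the left (right) the closer the value of $\delta$ is to $-1$ ($+1$). The impact of $\delta$ can then be highlighted by the mean, variance, skewness and kurtosis of Y as follows:
$$E(Y) = \mu + \sigma\delta\sqrt{\frac{2}{\pi}}, \qquad Var(Y) = \sigma^2\left(1 - \frac{2\delta^2}{\pi}\right),$$
$$Skew(Y) = \frac{4-\pi}{2}\,\frac{\left(\delta\sqrt{2/\pi}\right)^3}{\left(1 - 2\delta^2/\pi\right)^{3/2}}, \qquad Kurt(Y) = 3 + 2(\pi-3)\,\frac{\left(\delta\sqrt{2/\pi}\right)^4}{\left(1 - 2\delta^2/\pi\right)^{2}}.$$
The variance is a quadratic function of $\delta$, whereas the mean is a linear increasing function of the skewness parameter $\delta$. Unlike the normal distribution, whose skewness is zero, the skewness of the skewed-normal distribution ranges from approximately $-1$ to $1$, so the distribution can be calibrated to skewed data. However, the range of possible skewness values is still comparatively limited [9].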
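The dependence of the first four moments on the shape parameter can be checked numerically. The sketch below evaluates the standard closed-form moments of the skewed-normal distribution, assuming the usual relation δ = ξ/√(1 + ξ²); the function name `skew_normal_moments` is hypothetical.

```python
import math


def skew_normal_moments(mu, sigma, xi):
    """Mean, variance, skewness and excess kurtosis of Y ~ SN(mu, sigma^2, xi)."""
    delta = xi / math.sqrt(1.0 + xi * xi)      # skewness parameter in (-1, 1)
    b = delta * math.sqrt(2.0 / math.pi)       # mean of the standardized variable
    mean = mu + sigma * b
    var = sigma ** 2 * (1.0 - b * b)
    skew = (4.0 - math.pi) / 2.0 * b ** 3 / (1.0 - b * b) ** 1.5
    excess_kurt = 2.0 * (math.pi - 3.0) * b ** 4 / (1.0 - b * b) ** 2
    return mean, var, skew, excess_kurt
```

With ξ = 0 the function returns the moments of the normal distribution, and as ξ grows the skewness approaches its upper bound of roughly 0.995, which illustrates the limited skewness range noted above.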

The Student's t distribution
Like the normal distribution, the Student's t distribution is uni-modal, symmetric and bell-shaped, but with thicker (heavier) tails than the normal distribution. As an alternative to the normal distribution, the Student's t distribution can be used to fit the standardized innovations (residuals). It is wholly described by a shape parameter $\xi$, but its 3-parameter representation is used for standardization as follows:
$$f(x;\mu,\sigma,\xi) = \frac{\Gamma\left(\frac{\xi+1}{2}\right)}{\sqrt{\sigma\xi\pi}\;\Gamma\left(\frac{\xi}{2}\right)} \left(1 + \frac{(x-\mu)^2}{\sigma\xi}\right)^{-\frac{\xi+1}{2}}, \tag{15}$$
where $\Gamma$ is the Gamma function, while $\xi$, $\mu$, and $\sigma$ are the shape, location and scale parameters respectively. The location parameter $\mu$ is the mean (and mode) of the t distribution, while the variance is stated as [16]
$$Var(X) = \frac{\sigma\xi}{\xi-2}.$$
It is required that Var(X) = 1 for standardization purposes, hence
$$\sigma = \frac{\xi-2}{\xi}.$$
With $\sigma = (\xi-2)/\xi$ substituted into equation (15), the standardized Student's t density for the standardized residual z becomes
$$f(z;\xi) = \frac{\Gamma\left(\frac{\xi+1}{2}\right)}{\sqrt{(\xi-2)\pi}\;\Gamma\left(\frac{\xi}{2}\right)} \left(1 + \frac{z^2}{\xi-2}\right)^{-\frac{\xi+1}{2}}, \qquad \xi > 2.$$
The Student's t distribution has zero skewness and excess kurtosis equal to $6/(\xi-4)$ for $\xi > 4$.
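The effect of the standardization can be verified numerically. The following is a minimal sketch of the unit-variance Student's t density, written with log-Gamma functions for numerical stability; the function names are hypothetical.

```python
import math


def std_t_pdf(z, xi):
    """Density of the Student's t distribution rescaled to unit variance (xi > 2)."""
    if xi <= 2:
        raise ValueError("the variance exists only for xi > 2")
    log_c = (math.lgamma((xi + 1.0) / 2.0) - math.lgamma(xi / 2.0)
             - 0.5 * math.log((xi - 2.0) * math.pi))
    return math.exp(log_c - (xi + 1.0) / 2.0 * math.log1p(z * z / (xi - 2.0)))


def t_excess_kurtosis(xi):
    """Excess kurtosis 6 / (xi - 4), defined for xi > 4."""
    if xi <= 4:
        raise ValueError("the kurtosis exists only for xi > 4")
    return 6.0 / (xi - 4.0)
```

Integrating `z * z * std_t_pdf(z, xi)` over a wide grid gives a value close to one, and for large ξ the density approaches the standard normal density.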

The skewed-Student's t distribution
The skewed version of the Student's t distribution, simply termed the "skewed-Student's t" distribution, was introduced by Branco and Dey [7], and further developed by Azzalini and Capitanio [4]. This distribution allows both kurtosis and skewness to be regulated, which is why it is particularly valuable for modelling capital market data [9]. The standardized skewed-Student's t variable can be stated as
$$X = \frac{Z}{\sqrt{W/v}},$$
where Z represents an $SN(0, 1, \xi)$ variable that is independent of $W \sim \chi^2_v$; if an $N(0, 1)$ variable is used for Z instead, the result is the Student's t distribution [9]. The parameter v denotes the degrees of freedom.
The random variable Y has a skew-t distribution, i.e. $Y \sim ST(\mu, \sigma^2, \xi, v)$, with parameters $(\mu, \sigma^2, \xi, v)$, following the linear transformation $Y = \mu + \sigma X$. The mean and variance of the distribution can be calculated as follows (a more complex presentation of the kurtosis and skewness can be seen in Azzalini and Capitanio [4]):
$$E(Y) = \mu + \sigma w \delta, \tag{20}$$
$$Var(Y) = \sigma^2\left(\frac{v}{v-2} - w^2\delta^2\right), \tag{21}$$
where
$$w = \sqrt{\frac{v}{\pi}}\;\frac{\Gamma\left(\frac{v-1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)},$$
and $\delta$ is a skewness parameter. The variance is a quadratic function of $\delta$, while the mean is a linear increasing function of $\delta$. Compared with the skew-normal distribution, the skewed-Student's t distribution is known to take more extreme values for both skewness and kurtosis [9].

The generalized error distribution
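The generalized error distribution (GED) is a symmetric 3-parameter distribution that belongs to the family of exponential distributions, with conditional density stated as
$$f(x;\mu,\sigma,\xi) = \frac{\xi\, \exp\left(-\frac{1}{2}\left|\frac{x-\mu}{\sigma}\right|^{\xi}\right)}{2^{1+1/\xi}\,\sigma\,\Gamma(1/\xi)},$$
where $\mu$, $\xi$, and $\sigma$ are the location, shape and scale parameters respectively. The GED is also a uni-modal distribution in which the location parameter $\mu$ is the mean of the distribution. The kurtosis (Kur) and variance (Var) are stated as:
$$Kur = \frac{\Gamma(5/\xi)\,\Gamma(1/\xi)}{\Gamma(3/\xi)^2}, \qquad Var = \sigma^2\, 2^{2/\xi}\,\frac{\Gamma(3/\xi)}{\Gamma(1/\xi)}.$$
The tails of the density become heavier as $\xi$ decreases, and the density approaches the uniform distribution in the limit $\xi \to \infty$. When $\xi = 1$ and $\xi = 2$, the distribution becomes the Laplace and the normal distribution respectively. A unit standard deviation in equation (25) can be obtained by rescaling the density during standardization:
$$Var = \sigma^2\, 2^{2/\xi}\,\frac{\Gamma(3/\xi)}{\Gamma(1/\xi)} = 1 \;\;\Rightarrow\;\; \sigma = \sqrt{2^{-2/\xi}\,\frac{\Gamma(1/\xi)}{\Gamma(3/\xi)}}. \tag{25}$$
When this is substituted into the scaled density of z, it becomes
$$f(z;\xi) = \frac{\xi\, \exp\left(-\frac{1}{2}\left|\sqrt{2^{2/\xi}\,\Gamma(3/\xi)/\Gamma(1/\xi)}\;\, z\right|^{\xi}\right)}{\sqrt{2^{-2/\xi}\,\Gamma(1/\xi)/\Gamma(3/\xi)}\;\, 2^{1+1/\xi}\,\Gamma(1/\xi)}.$$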
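The special cases of the GED can be confirmed with a few lines of code. This is a minimal sketch of the unit-variance GED density and its kurtosis, using the standard closed forms; the function names are hypothetical.

```python
import math


def std_ged_pdf(z, xi):
    """Density of the GED rescaled to unit variance, with shape xi > 0."""
    if xi <= 0:
        raise ValueError("shape parameter xi must be positive")
    # Scale that gives unit variance: sqrt(2^(-2/xi) * Gamma(1/xi) / Gamma(3/xi)).
    scale = math.sqrt(2.0 ** (-2.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi))
    kernel = math.exp(-0.5 * abs(z / scale) ** xi)
    return xi * kernel / (scale * 2.0 ** (1.0 + 1.0 / xi) * math.gamma(1.0 / xi))


def ged_kurtosis(xi):
    """Kurtosis Gamma(5/xi) * Gamma(1/xi) / Gamma(3/xi)^2."""
    return math.gamma(5.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi) ** 2
```

Setting ξ = 2 recovers the standard normal density (kurtosis 3), while ξ = 1 gives the unit-variance Laplace density (kurtosis 6), so smaller ξ corresponds to heavier tails.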

The skewed-GED distribution
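The skewed variant of the generalized error distribution (GED) is called the "skewed-GED (SGED)" distribution. The standardized SGED distribution for a random variable X has a density function expressed as [24]
$$f(x;\xi,\delta) = C \exp\left(-\frac{|x+m|^{\xi}}{\left[1+\delta\,\mathrm{sign}(x+m)\right]^{\xi}\theta^{\xi}}\right),$$
where
$$C = \frac{\xi}{2\theta\,\Gamma(1/\xi)}, \qquad \theta = \Gamma(1/\xi)^{1/2}\,\Gamma(3/\xi)^{-1/2}\,S(\delta)^{-1}, \qquad m = 2\delta A\, S(\delta)^{-1},$$
$$S(\delta) = \sqrt{1 + 3\delta^2 - 4A^2\delta^2}, \qquad A = \Gamma(2/\xi)\,\Gamma(1/\xi)^{-1/2}\,\Gamma(3/\xi)^{-1/2}.$$
The fat tails and height of the density function are governed by the shape parameter $\xi$ with the constraint $\xi > 0$, whereas $\delta$ denotes the density's skewness parameter with $-1 < \delta < 1$. Here sign represents the sign function. The SGED distribution takes the form of the standard normal distribution when $\xi = 2$ and $\delta = 0$. The density function skews to the left (right) with negative (positive) skewness [24].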

The generalised hyperbolic distribution
The generalised hyperbolic (GHYP) distribution is a normal variance-mean mixture in which the generalized inverse Gaussian (GIG) distribution is the mixing distribution. It is a continuous probability distribution introduced by Ole Barndorff-Nielsen [5], with the probability density function (pdf) expressed in terms of the modified Bessel function of the second kind, denoted by $K_\lambda$ [30]. Furthermore, its semi-heavy tails make it relevant and regularly used in risk management and the modelling of financial market data.
Standardization and estimation of the density requires estimating two location- and scale-invariant parameters (denoted by υ, ℵ), which jointly describe the shape and skewness. This is followed by a series of transformation steps to scale and translate the two parameters (υ, ℵ) into the parametrization (ς, ϑ, , ı), which yields standard formulae for the likelihood function [16]. For details on the derivation of the standardized generalized hyperbolic distribution, see Ghalanos [16].

Model selection measures
Researchers usually build representative statistical models to test hypotheses and theories, and then ascertain how well each model fits the collected data. To determine the theory that best describes the observed data, the candidate models are compared using an objective method. The model ranked highest by the stipulated method is chosen as the best model, which indicates that the theory can be represented by that model [8]. Model selection methods include likelihood ratio tests (LRT), the coefficient of determination ($R^2$), the adjusted coefficient of determination ($R^2_{adj}$), generalized cross-validation (GCV) and information criteria, among others.

Common information criteria, as defined in equations (28) to (31), include the AIC (Akaike information criterion), BIC (Bayesian information criterion), HQIC (Hannan-Quinn information criterion) and SIC (Shibata information criterion) (see [16]). An information criterion is a function of the value of the log-likelihood (the goodness-of-fit component) and the number of model parameters (the model complexity component). Model selection using information criteria, underpinned by the information-theoretic framework, is more flexible than methods grounded in null hypothesis testing [18].
As an advantage, information criteria do not require the competing models to be related to each other in any particular way, and the candidate models do not need to be nested. Moreover, information criteria can accommodate the simultaneous comparison of any number of models, i.e. model comparison is not limited to two at a time, nor is it compulsory to test models incrementally (like Model 1 versus Model 2, Model 2 versus Model 3, etc.). Finally, because information criteria are not based on the framework of null hypothesis testing, no model needs to be designated as a null model for comparison purposes [8]. For these reasons, the goodness of fit of the seven error distributions under the GARCH models is assessed using these information criteria. In general, the smaller the information criterion (equivalently, the larger the log-likelihood for a given number of parameters), the better the model's fit.
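$$AIC = \frac{-2l}{n} + \frac{2p}{n}, \tag{28}$$
$$BIC = \frac{-2l}{n} + \frac{p\,\log_e(n)}{n}, \tag{29}$$
$$HQIC = \frac{-2l}{n} + \frac{2p\,\log_e(\log_e(n))}{n}, \tag{30}$$
$$SIC = \frac{-2l}{n} + \log_e\left(\frac{n+2p}{n}\right), \tag{31}$$
where n denotes the sample size, l is the maximized log-likelihood, i.e. the log-likelihood $L(\Theta)$ evaluated at the maximum likelihood estimate of the unknown parameter vector $\Theta$, $\log_e$ represents the natural logarithm, and p denotes the number of estimated parameters [25].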
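The four criteria are simple functions of the log-likelihood, the sample size and the parameter count. The sketch below uses the per-observation scaling reported by the rugarch package; the function name `information_criteria` and the log-likelihood values in the usage note are hypothetical and are not results from this study.

```python
import math


def information_criteria(loglik, n, p):
    """AIC, BIC, HQIC and SIC scaled by the sample size n.

    loglik: maximized log-likelihood, n: sample size, p: number of parameters.
    """
    base = -2.0 * loglik / n
    return {
        "AIC": base + 2.0 * p / n,
        "BIC": base + p * math.log(n) / n,
        "HQIC": base + 2.0 * p * math.log(math.log(n)) / n,
        "SIC": base + math.log((n + 2.0 * p) / n),
    }


def best_by(criterion, candidates):
    """Return the name of the candidate with the smallest value of the criterion.

    candidates maps a model name to a (loglik, n, p) tuple.
    """
    return min(candidates, key=lambda name: information_criteria(*candidates[name])[criterion])
```

For instance, with the made-up inputs `{"norm": (-3050.0, 2126, 5), "std": (-3000.0, 2126, 6)}`, `best_by("AIC", ...)` selects `"std"`, because the gain in log-likelihood outweighs the penalty for the extra shape parameter.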

Data description
The raw price data used for this study are the daily closing equity indices of the Brazilian, Russian, Indian, Chinese and South African stock markets. The data were obtained from Thomson Reuters Datastream and cover the period 5th January 2010 to 6th August 2018, with 2126 observations. That is, the data for each of the BRICS indices are recorded for 260 days per year, which is 5 trading days in a week.
The BRICS markets' indices are the IBOV (or Bovespa) index of the São Paulo stock exchange in Brazil, the IMOEX (Moscow Exchange) index of Russia, the NIFTY (or NIFTY 50) index of the National Stock Exchange of India, the SHCOMP (Shanghai Stock Exchange Composite) index of China, and the JALSH (JSE Africa All Share) index of South Africa. Modelling of the volatility of the BRICS markets' returns under each of the selected error distributions, and the selection of the best fitting model (error distribution) to describe each market, is done using the R package "rugarch" developed by Ghalanos [16].

Missing values
Since the missing values were single days each, they were adjusted by using the average of the closing prices on the days immediately before and after each missing day (see [26]).
The daily closing prices of financial time series such as stock market indices usually exhibit non-stationarity, and empirical studies on such series have shown that an approximation to stationarity can be obtained via daily log-returns. That is, logarithms of ratios of successive realizations are taken, which gives a reasonable transformation to stationarity (from price to return). For convenience of presentation, the generated return series is further re-scaled by multiplying by 100 as follows
$$r_t = 100 \times \ln\left(\frac{P_t}{P_{t-1}}\right),$$
where t is the time period in days, $P_t$ signifies the closing stock price index at time t, and $P_{t-1}$ is the previous day's closing price index. The natural logarithm is denoted by ln, while $r_t$ represents the current return.
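The two data preparation steps described here, filling isolated missing prices and computing percentage log-returns, can be sketched as follows. The function names and the representation of a missing price as `None` are assumptions made for illustration.

```python
import math


def fill_single_gaps(prices):
    """Replace each isolated missing price (None) by the mean of its two neighbours.

    Runs of two or more consecutive missing values are left untouched.
    """
    filled = list(prices)
    for i in range(1, len(prices) - 1):
        if prices[i] is None and prices[i - 1] is not None and prices[i + 1] is not None:
            filled[i] = (prices[i - 1] + prices[i + 1]) / 2.0
    return filled


def log_returns(prices):
    """Percentage log-returns r_t = 100 * ln(P_t / P_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
```

A series of n prices yields n - 1 returns, and the returns sum to 100 times the log of the ratio of the last price to the first.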

Exploratory data analysis
The first requirement in the analysis of any statistical dataset is the exploratory data analysis (EDA) which gives first hand examination of the content of the dataset with regards to detecting outliers and anomalies in the data. Exploratory analysis helps to determine whether the data satisfy basic distributional assumptions and it can suggest useful normalizing transformation. Figures 1, 2, 3, 4 and 5 show visual inspection of the EDA of the BRICS equity markets' price and return series. For each of the markets, it can be observed that plots of the daily equity prices are not stationary as displayed in panels a, b, e and f. These panels show the non-stationary trend of the raw price series plot, the density plot, QQ (quantile-quantile) plot and the box plot. On the other hand, panels c, d, g and h display an approximate stationarity for the return series plot, the density plot, QQ plot and the box plot. Kernel density estimation is used to estimate the density [36]. The observations that are not aligned on the unit diagonal of the QQ plot at the extreme sides indicate extreme observations, hence the distribution is fat-tailed.

Descriptive statistics
The preliminary descriptive statistics for the BRICS equity prices and returns for the period under study are presented in Table 1. The table gives information on the sample mean, median, [...] The table also shows that the five markets' return series are leptokurtic, i.e. their kurtosis values are greater than 3. This indicates that extreme price changes took place more frequently during the sampled period [35]. Kurtosis relates to the tails of a distribution, where the value 3 is the kurtosis of a normal distribution. In addition, the table shows negative skewness for the five return series, implying that their distributions have long left tails. Negative skewness also denotes a higher probability of large decreases in equity returns for the period sampled. The BRIC daily prices (excluding the South African index), on the other hand, show positive skewness, which signifies long right-tailed distributions. The skewness further indicates that the markets' distributions are non-normal. The non-normality is supported by the significance of the Jarque-Bera (JB) test statistic at the 1% level for the prices and returns of the five indices.
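The skewness, kurtosis and Jarque-Bera statistic referred to above can be computed from the sample moments. This is a minimal sketch using the population (biased) moment estimators and the standard form JB = n/6 · (S² + (K − 3)²/4); the function name is hypothetical, and software packages may apply small-sample corrections that give slightly different values.

```python
def skew_kurt_jb(x):
    """Sample skewness S, kurtosis K and Jarque-Bera statistic of a series x."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    if m2 == 0:
        raise ValueError("the series has zero variance")
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                        # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb
```

Under the null hypothesis of normality the JB statistic is asymptotically chi-square distributed with two degrees of freedom, so large values indicate non-normality.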

Test for stationarity
Stationarity of the raw prices and returns was evaluated using the Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) unit root tests and the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) stationarity test. The tests were first implemented on the daily price indices, and it was found that the price indices are not stationary. This is in line with the visual inspections in Figures 1, 2, 3, 4 and 5.
Next, the raw price data were transformed to the returns by taking the first difference of logarithms of the price. After this, it is observed that in the ADF, PP and KPSS tests, the test statistics are less than the critical values at 1%, 5% and 10% levels of significance for the five BRICS markets as displayed in Table 2. These results indicate that the return series are stationary, since the null hypothesis of a unit root in the series is rejected under the ADF and PP tests, and the null hypothesis of stationarity cannot be rejected under the KPSS test at the three levels.

Test for serial correlation
Following stationarity of the data, it is necessary to test for the presence of short-range linear dependence, termed serial correlation or autocorrelation, in the residuals. Serial correlation may occur as a result of the relationship between a variable and its lagged version over various intervals of time. This test is carried out on both the price and return residuals using the Weighted Ljung-Box test [14]. The results are displayed under panels (A) and (B) in Table 3. For the price data, panel (A) of the table shows that all the p-values are below 0.05 at lags 1, 5 and 9 for the standardized residuals of the BRICS equity markets, which denotes strong autocorrelation in the price data. The price data were transformed to log-returns and several candidate ARMA(p, q) models were fitted to the five BRICS markets' return series. Among the candidate models, the ARMA(1, 1) model as stated in equation (33) was found to be the most suitable for removing linear dependency (autocorrelation) in the markets' return series.
Panel (B) of Table 3 shows the outcome of the "Weighted Ljung-Box" test that relates to the selected ARMA(1, 1) model for each of the BRICS markets' returns. The p-values (at lags 1, 5 and 9) are big (greater than 0.05), hence we fail to reject the null hypothesis of "no serial correlation" in the return residuals.
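The idea behind the serial correlation test can be illustrated with the classical (unweighted) Ljung-Box statistic, Q = n(n + 2) Σ ρ_k² / (n − k). Note that this is a simpler stand-in and not the Weighted Ljung-Box test [14] used in this study; the function name is hypothetical.

```python
def ljung_box_q(x, max_lag):
    """Classical Ljung-Box Q statistic using autocorrelations up to max_lag."""
    n = len(x)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must satisfy 0 < max_lag < len(x)")
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    if denom == 0:
        raise ValueError("the series has zero variance")
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(d[t] * d[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q
```

Under the null hypothesis of no serial correlation, Q is approximately chi-square distributed with `max_lag` degrees of freedom (reduced by the number of fitted ARMA parameters when applied to model residuals), so a large Q leads to rejection.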

Test for ARCH effects
Following the filtering of short-range dependence from the data series, it is next necessary to test for the presence of heteroscedasticity, or ARCH effects, in the data before fitting the GARCH model to the residuals of the return series. The presence of ARCH effects can be tested using the ARCH-LM test. Table 4 shows that the ARCH-LM test statistic in the five BRICS markets is highly significant, with all the p-values very small (lower than 0.05); hence, the null hypothesis of "no ARCH effect" in the residuals of the return series is strongly rejected. This result confirms the presence of ARCH effects in the residuals of the five indices, which indicates the existence of volatility clustering as a result of the time-varying variances of the return series. Based on this, a GARCH model can be fitted to remove the ARCH effects in the series. The modelling and filtration of volatility (heteroscedasticity effects) of the returns is implemented using candidate GARCH models under each of the selected error distributions.

Empirical outcomes of the ARMA-GARCH models
This section focuses on the volatility dynamics of the five BRICS markets to determine the magnitudes of volatility in each and arrange them from the highest to the lowest for investors' decision making. To begin with, several candidate ARMA(p, q)-GARCH(k, v) models were run to obtain a combined model that can best remove linear dependency and heteroscedasticity in the return series. From the candidate models, ARMA(1, 1) and GARCH (1,1)

Residual diagnostic test
After fitting the GARCH models to the returns, residual diagnostics were carried out with the use of the weighted ARCH LM tests to determine whether ARCH (heteroscedastic) effects have been filtered out of the residuals or not. The results from the "ARCH LM test statistic (5)" in Table 4 show that at lag order 5, under all the stated error distributions, the p-values are sufficiently large (above 5%) for the five BRICS markets. Based on this, we fail to reject the null hypothesis of "no ARCH effect" in the residuals, and hence conclude that the variance equations are well specified.
Hence, volatility of the South African market is best described by the ARMA(1, 1)-GARCH(1, 1) model under the skew-Student's t distribution.
In summary, it is concluded that using the ARMA(1, 1)-GARCH(1, 1) model, volatilities of the Brazilian Bovespa and the Russian IMOEX markets can both be well characterized (or described) by a heavy-tailed Student's t distribution [...]
Note: "*", "**" and "***" denote the 1%, 5% and 10% levels of significance respectively. ARCH LM test statistic (5) denotes ARCH effects up to the 5th order with p-values in parentheses, and the 5% level of significance is used in every case.
[...] models under the assumptions of three distributions: the normal, Student's t, and GED. That is, similar to our findings, the authors also concluded that the Student's t error distribution was the best to describe the volatility of the Brazilian and Russian stock markets. However, since we used a wider scope of error distributions, i.e. we used [...] of the standardized residuals confirm that each of the best error distributions (i.e., the Student's t, GHYP, skew-Student's t and skew-GED) is a better fit to the distributions of the BRICS markets' return residuals than a normal distribution.

Persistence of volatility
We measure the persistence of volatility in the BRICS markets using the sum of the coefficients (α, β) in the conditional variance equation of the GARCH process, which measures the speed of decay of shocks to volatility. This sum is referred to as the persistence of the GARCH model, and it indicates how fast large volatilities decline after a shock. When the sum is greater than one (α + β > 1), shocks to the conditional variance do not die out and the forecasts of volatility are explosive. If the sum of the coefficients equals one (i.e. α + β = 1), then the effect of shocks to volatility is felt forever, and the unconditional variance of the process is not determined by the model. Engle and Bollerslev [6] refer to this type of process as "Integrated-GARCH". Lastly, shocks to volatility display long persistence into the future if the sum (α + β) is less than but close to one, which is consistent with volatility clustering in the series. This occurs because the variance process reverts very slowly to its mean (normal) state, a property termed "mean reversion". The closer the sum (α + β) is to 1, the longer it takes volatility to revert to the mean state.
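The three cases above can be summarized numerically for a GARCH(1, 1) model. The sketch below computes the persistence, the half-life of a volatility shock, ln(0.5)/ln(α + β), and the unconditional variance ω/(1 − α − β); the function name and the parameter values in the usage note are hypothetical and are not estimates from this study.

```python
import math


def garch11_persistence(omega, alpha, beta):
    """Persistence, shock half-life (in periods) and unconditional variance."""
    persistence = alpha + beta
    if 0.0 < persistence < 1.0:
        # Number of periods for a shock to the variance to decay by half.
        half_life = math.log(0.5) / math.log(persistence)
        uncond_var = omega / (1.0 - persistence)
    elif persistence <= 0.0:
        # No dependence on the past: shocks vanish immediately.
        half_life = 0.0
        uncond_var = omega
    else:
        # Integrated or explosive case: shocks never decay, no finite variance.
        half_life = math.inf
        uncond_var = math.inf
    return {"persistence": persistence, "half_life": half_life, "uncond_var": uncond_var}
```

With the illustrative values ω = 0.05, α = 0.10 and β = 0.85, the persistence is 0.95 and the half-life is about 13.5 trading days, whereas a persistence of 0.99 would give a half-life of about 69 days.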

Conclusion: Volatility hierarchy of the BRICS equity markets
The pattern of volatility persistence in each of the BRICS markets can be determined under a non-Gaussian distribution such as the GED, Student's t or GHYP. Financial returns are known to exhibit fat tails (leptokurtosis), and the Gaussian (normal) distribution is not suitable for modelling such behaviour [28].
The values of volatility persistence, i.e. the sums of the coefficients (α + β), are close to one, at various magnitudes, under the Student's t, skew-Student's t, GED, skew-GED and GHYP distributions (see Tables 5, 6, 7, and 8). The volatility hierarchy of the BRICS equity markets is evaluated based on the relative size of the sum of the coefficients, where the highest sum gives the highest volatility persistence. The volatility persistence values obtained from summing the coefficients in the tables are summarized in Table 9. From this summary, it can be observed that the persistence of volatility in the BRICS markets does not follow the same hierarchical pattern under the error distributions, except under the skew-Student's t and GHYP distributions, where the pattern is the same. Under these two distributions, volatility persistence is, in descending order, highest in the Chinese market, followed by the South African market, then the Russian, Indian and Brazilian markets respectively (see Table 9).
For the Student's t distribution, the Chinese market has the highest volatility persistence, followed by the Russian, South African, Indian and Brazilian markets in that order. For the GED, the pattern of the volatility persistence in a descending hierarchy is the Chinese, Indian, South African, Russian and Brazilian markets respectively. Lastly, for the skew-GED, the descending hierarchical pattern is the Chinese, South African, Indian, Russian and Brazilian markets respectively. However, under each of the error distributions, the Chinese market is the most volatile, while the least volatile is the Brazilian market.
This result is consistent with the findings of Ijumba [20] who also modelled the BRICS volatility, and observed the presence of volatility persistence in all the BRICS markets' stock returns using VAR and GARCH(1, 1) models. The study by the author also showed that Chinese market is the most volatile among the bloc but his conclusion on South African market being the least volatile market contradicts our findings of Brazil as the least volatile. The contradiction may be attributed to the scope and frequency of the data used for the two studies. The author [20] used sampled weekly data from 2000 to 2012, while our study used daily closing data from 2010 to 2018, and it should be noted that South Africa only joined the BRICS bloc in 2010. Hence, the author only had access to two years weekly data for the South African JSE whereas our study utilized eight years of daily closing South African JALSH data with wider coverage and much more market activities.
Based on the above findings, the main potential contribution of this paper to the growing literature on the volatility of the BRICS stock returns is twofold. First, this study provides a more in-depth and robust account of the modelling and analysis of the volatility of the BRICS stock returns by using the seven specified error distributions together, which as far as we know has not been done before. Second, as opposed to a major study carried out by Ijumba [20] for all five BRICS markets using weekly data that covered only 2 years of the markets' activities from 2010 to 2012, our study used a wider coverage of activities from 2010 to 2018 based on daily closing data, which can potentially give more accurate results.