Generalized Self-Similar First Order Autoregressive Generator (GSFO-ARG) for Internet Traffic

Internet traffic data, such as the number of transmitted packets and the time spent on the transmission of Internet Protocol (IP) packets, have been shown to exhibit the long-memory property, often referred to as self-similarity. Simulating this type of dataset is an important aspect of delay-avoidance planning, especially when trying to mimic the real-life processing of packets on the Internet. Most existing procedures assume that the process follows a Gaussian distribution, and thus long-memory processes such as Fractional Brownian Motion (FBM) and Fractional Gaussian Noise (FGN), among others, are used. These approaches often result in estimation errors arising from the use of an inappropriate distribution. However, it has been established that the distributions of Internet processes are heavy-tailed. Therefore, in this paper, a new method capable of generating heavy-tailed self-similar traffic is proposed based on the first-order autoregressive AR(1) process. The proposed method is compared with some existing methods at varying values of the self-similar index and sample sizes. The imposed self-similarity indices were estimated using the range/standard deviation (R/S) statistic. Performance analysis was carried out using absolute percentage errors. The results show that the proposed method has a lower average error than the competing methods.


Introduction
Scientific experimentation has gained a powerful tool with the advent of computers. Nowadays, their immense processing capabilities are used to perform complex simulations of data in various real-world situations. In a few seconds, the (almost) tireless machines are capable of simulating and testing a panoply of conditions that a person would take a lifetime to reproduce. Supported by the law of large numbers, their outputs are often considered as decision aids or as scientific arguments, provided the codified simulation and the resulting model are guaranteed to be free of errors [9].
The term self-similarity was first described by Hurst [8] in his reports about the Nile River. It is now known to be embedded in many processes relating to natural and artificial events [9], and its popularity may be traced to the findings that unfolded the so-called self-similar nature of Internet traffic. This has brought attention to the fact that the behaviour of traffic at network aggregation points should be understood by considering its self-similar and non-memoryless properties. Quantification of self-similarity is usually done by estimating the Hurst parameter described in [22]; the simulation of sequences with the aforementioned property constitutes a rather challenging task, mostly because of the retrospective nature of its definition.
The rest of the paper is organized as follows: Section 2 gives preliminaries of self-similar processes. Section 3 presents the main results of this paper. Section 4 presents numerical examples from simulations. Section 5 presents the discussion of results, while the conclusion and future work are presented in Section 6.

Development of self-similar process model
In this section, a brief consideration of some terminologies related to Internet traffic and particularly heavy-tailed distribution is presented.

Self-similarity
A self-similar object has been described as exactly or approximately similar to a part of itself [16,17,18]. Self-similarity in Internet traffic occurs when packets of the same burst length arrive at the same time or when packets burst at the same inter-arrival period on the server [14,16,18]. Simply speaking, an Internet traffic process that exhibits self-similarity is indistinguishable from its scaled versions, which are obtained by averaging the original process over different observation time scales. The mathematical description of self-similarity is given by [1,16].
Equation (1) describes an incremental process X_j (j = 1, 2, ...) whose average in non-overlapping blocks of size m is another process X_j^(m) (j = 1, 2, ...):

X_j^(m) = (1/m) (X_{(j-1)m+1} + ... + X_{jm}),   var(X_j^(m)) = var(X_j) m^{2H-2}.   (1)

Here, the process X_j is said to be self-similar; m is the scale parameter, and H (0.5 < H ≤ 1) is the Hurst parameter, which measures the burstiness of an Internet traffic process. Obviously, when H = 1, the processes X_j^(m) and X_j have the same distribution without any decay, since var(X_j^(m)) = var(X_j).
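The block-averaging in equation (1) can be checked numerically. The sketch below (illustrative function names; standard-normal noise, for which H = 0.5) aggregates a series in non-overlapping blocks and verifies that var(X^(m)) ≈ var(X) m^{2H-2} = 1/m:

```python
import random
import statistics

def aggregate(x, m):
    """Average a series in non-overlapping blocks of size m (the X^(m) process)."""
    n = len(x) // m
    return [sum(x[j * m:(j + 1) * m]) / m for j in range(n)]

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(100_000)]  # iid noise: H = 0.5

for m in (10, 100):
    v = statistics.variance(aggregate(x, m))
    # For H = 0.5, var(X^(m)) = var(X) * m^(2H-2) = 1/m, so v * m should be near 1
    print(m, round(v * m, 2))
```

For an LRD process with H > 0.5 the product v * m^{2-2H} would instead stay near var(X), which is the signature the estimators in later sections exploit.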

Long-Range Dependence (LRD)
A self-similar process can also possess the property of long-range dependence [25]. Long-Range Dependence (LRD) describes the memory effect whereby the current value of a stochastic process depends strongly on its past values, and it is characterized by the autocorrelation function at various time points, usually called lags. Given the Hurst parameter H, 0 < H < 1, H ≠ 0.5, the autocorrelation function r(k) at lag k is

r(k) = (1/2) [(k+1)^{2H} − 2k^{2H} + (k−1)^{2H}].   (2)

For values 0.5 < H < 1, the autocorrelation function r(k) decays hyperbolically, behaving like ck^{2H−2} as k increases, which means that the autocorrelation function is not summable [6]. This is the opposite of short-range dependence (SRD), where the autocorrelation function decays exponentially and the sum in equation (2) is finite. Short- and long-range dependence are both related to the value of the Hurst parameter H of the self-similar process [14,26,16,17,18,11,20,19,21]. When H lies in the interval 0 < H < 0.5, the self-similar process is said to be SRD, and if H lies in the interval 0.5 < H < 1, the process is said to have LRD. The classical approaches that capture the behaviour of Internet traffic are reviewed in the subsections that follow.
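The hyperbolic decay of r(k) can be verified directly. A minimal sketch (illustrative names): evaluate the autocorrelation at increasing lags and compare it with the asymptotic form H(2H−1)k^{2H−2}:

```python
def r(k, H):
    """Autocorrelation of an exactly second-order self-similar process at lag k."""
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + (k - 1) ** (2 * H))

H = 0.8
for k in (10, 100, 1000):
    # LRD: r(k) behaves like H*(2H-1)*k^(2H-2) for large k (hyperbolic decay)
    approx = H * (2 * H - 1) * k ** (2 * H - 2)
    print(k, r(k, H), approx)
```

Note that r(k, 0.5) is exactly 0 for every k ≥ 1, recovering the uncorrelated (memoryless) case.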

Fractional Brownian Motion and Fractional Gaussian Noise
The Fractional Brownian Motion (FBM) described by [12] is a continuous-time Gaussian stochastic process defined, for t > 0 (and similarly for t < 0), as

B_H(t) = (1/Γ(H + 1/2)) { ∫_{−∞}^{0} [(t − s)^{H−1/2} − (−s)^{H−1/2}] dB(s) + ∫_{0}^{t} (t − s)^{H−1/2} dB(s) },   (3)

where Γ(α) = ∫_0^∞ x^{α−1} exp(−x) dx and 0 < H < 1. The FBM B_H(t) has the covariance function

cov(B_H(s), B_H(t)) = (1/2) (|s|^{2H} + |t|^{2H} − |t − s|^{2H}).   (4)

The Hurst exponent describes the raggedness of the resulting motion, with a higher value leading to a smoother motion. The value of H determines what kind of process the FBM is. For clarity, it was reported in [12] that: i. if H = 1/2, the process is, in fact, a Brownian motion or Wiener process; ii. if H > 1/2, the increments of the process are positively correlated; iii. if H < 1/2, the increments of the process are negatively correlated.
The increment process X = {X_k : k = 0, 1, ...} of Fractional Brownian Motion is known as Fractional Gaussian Noise (FGN), defined by

X_k = B_H(k + 1) − B_H(k).   (5)

It is clear that X_k has a standard normal distribution for every k, but that there is (in general) no independence. To be precise, the corresponding auto-covariance function γ(·) is of the form

γ(k) = (1/2) (|k − 1|^{2H} − 2|k|^{2H} + |k + 1|^{2H})   (6)

for k ∈ Z. If H = 1/2, all the covariances are 0 (except, of course, for k = 0). Since Fractional Gaussian Noise is a Gaussian process, this implies independence. This agrees with the properties of ordinary Brownian Motion, which has independent increments.

Fractional Autoregressive Integrated Moving Average Process
Another widely used process with long-range dependence is the Fractional Autoregressive Integrated Moving Average (F-ARIMA) process. The parameters of this model control long-range dependence as well as the short-term behaviour. The F-ARIMA model is based on the ARMA model [5]. An ARMA(p, q) process X = {X_k : k = 0, 1, ...} is a short-memory process that is the solution of

ϕ(L) X_k = θ(L) ϵ_k,   (7)

where ϕ and θ are polynomials of order p and q respectively, and ϵ is a white-noise process, i.e., the ϵ_k are i.i.d. standard normal random variables. The lag operator L is defined as L X_k = X_{k−1}. A generalization of this model is the ARIMA(p, d, q) process for d = 0, 1, ..., defined by the property that (1 − L)^d X_k is an ARMA(p, q) process. As implied by its name, the fractional ARIMA model admits a fractional value for the parameter d. For this, we have to understand how (1 − L)^d X_k is defined for fractional d. The fractional difference is computed using the binomial expansion in (8):

(1 − L)^d = Σ_{n=0}^{∞} (d choose n) (−L)^n,  where (d choose n) = Γ(d + 1) / (Γ(n + 1) Γ(d − n + 1)).   (8)

Since the case d > 1/2 can be reduced to the case −1/2 < d ≤ 1/2 by taking appropriate differences, the latter case is particularly interesting. X is a stationary process for −1/2 < d < 1/2. Long-range dependence occurs for 0 < d < 1/2, implying that the process is also asymptotically second-order self-similar in this case. The corresponding Hurst parameter is H = 1/2 + d [5]. Apart from the standard processes presented above, there are some other interesting hybrid approaches for simulating self-similar processes that are based on either the FBM or the FGN. Some of these methods are presented in the subsections that follow.
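The binomial expansion in (8) can be computed with a simple recurrence: the coefficient of L^n is obtained from that of L^{n−1} by multiplying by (n − 1 − d)/n. A minimal sketch (illustrative function names; the expansion is truncated at n_terms):

```python
def frac_diff_coeffs(d, n_terms):
    """Coefficients pi_n of (1 - L)^d = sum_n pi_n L^n, via the recurrence
    pi_0 = 1, pi_n = pi_{n-1} * (n - 1 - d) / n (binomial expansion, eq. (8))."""
    c = [1.0]
    for n in range(1, n_terms):
        c.append(c[-1] * (n - 1 - d) / n)
    return c

def frac_diff(x, d, n_terms=50):
    """Apply the truncated fractional difference (1 - L)^d to a series."""
    c = frac_diff_coeffs(d, n_terms)
    return [sum(c[n] * x[k - n] for n in range(min(k + 1, n_terms)))
            for k in range(len(x))]

print(frac_diff_coeffs(0.3, 4))  # starts 1, -0.3, -0.105, ...
```

For integer d the recurrence recovers the usual differencing: with d = 1 the coefficients are 1, −1, 0, 0, ..., i.e., first differences.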

Paxson method
Paxson [15] proposed a rather intuitive method for simulating the Fractional Gaussian Noise process. Paxson studied the output of the FGN process by statistically testing whether the resulting samples satisfy the desired properties. The approach was found to be deficient in justification, as it is unclear why the obtained sample should be close to Gaussian. In the Paxson method, the approximate FGN sample is the Fourier transform of a complex sequence whose k-th element has squared modulus proportional to f(t_k) R_k, where f(·) is the spectral density of FGN, R_1, R_2, R_3, ..., R_k are independent exponentially distributed random variables with mean 1 for k ≥ 1, the second half of the sequence is obtained from the first half by complex conjugation (*), and t_k equals 2πk/N.

Hosking Method
The Hosking method [9] (also known as the Durbin or Levinson method) is an algorithm for simulating a general stationary Gaussian process; here, the focus is on the simulation of Fractional Gaussian Noise X_0, X_1, .... The method generates X_{n+1} given X_n, ..., X_0 recursively. It does not use specific properties of Fractional Brownian Motion or Fractional Gaussian Noise and is thus applicable to any stationary Gaussian process.
The key advantage of the approach is that the distribution of X_{n+1} given the past can be computed explicitly.
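A minimal sketch of this recursion (illustrative function names; the conditional mean and variance are updated with the Durbin-Levinson recurrence, assuming the standard FGN autocovariance):

```python
import math
import random

def fgn_autocov(k, H):
    """Autocovariance of standard Fractional Gaussian Noise at lag k."""
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def fgn_hosking(n, H, rng):
    """Hosking method: draw X_t from its exact conditional Gaussian
    distribution given X_{t-1}, ..., X_0 (Durbin-Levinson recursion)."""
    g = [fgn_autocov(k, H) for k in range(n)]
    x = [rng.gauss(0.0, 1.0)]
    phi = []          # one-step prediction coefficients (phi[i] multiplies x[t-1-i])
    sigma2 = 1.0      # conditional (innovation) variance
    for t in range(1, n):
        kappa = (g[t] - sum(phi[i] * g[t - 1 - i] for i in range(t - 1))) / sigma2
        phi = [phi[i] - kappa * phi[t - 2 - i] for i in range(t - 1)] + [kappa]
        sigma2 *= 1.0 - kappa * kappa
        mu = sum(phi[i] * x[t - 1 - i] for i in range(t))
        x.append(mu + math.sqrt(sigma2) * rng.gauss(0.0, 1.0))
    return x

sample = fgn_hosking(512, 0.8, random.Random(7))
```

A useful sanity check: for H = 0.5 every autocovariance at lag k ≥ 1 vanishes, all prediction coefficients stay at zero, and the output reduces to i.i.d. standard normals.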

Cholesky method
The Cholesky method [5] was developed using the Cholesky decomposition of the covariance matrix of Gaussian processes. The approach involves decomposing the covariance matrix Γ(n) such that it can be written as L(n)L(n)′, where L(n) is an (n + 1) × (n + 1) lower triangular matrix. Denoting element (i, j) of L(n) by l_{ij} for i, j = 0, 1, ..., n, L(n) is said to be lower triangular if l_{ij} = 0 for j > i. It can be proven that such a decomposition exists when Γ(n) is a symmetric positive definite matrix. Unlike the Hosking method, the Cholesky method can also be applied to non-stationary Gaussian processes. The various self-similar generators discussed so far can be evaluated and assessed for their efficiency by estimating the value of the Hurst index H imposed at the simulation step. The rescaled range (R/S) statistic is one of the popular methods for estimating the H value of a self-similar process. The following subsection presents the procedure.
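The decomposition can be sketched directly (illustrative function names; a textbook Cholesky factorization applied to the FGN covariance matrix, which is symmetric positive definite):

```python
import math
import random

def fgn_cov_matrix(n, H):
    """Covariance matrix Gamma(n-1) of standard FGN, built from its autocovariance."""
    g = lambda k: 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H)
                         + abs(k - 1) ** (2 * H))
    return [[g(abs(i - j)) for j in range(n)] for i in range(n)]

def cholesky(G):
    """Lower-triangular L with L L' = G, for G symmetric positive definite."""
    n = len(G)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(G[i][i] - s) if i == j else (G[i][j] - s) / L[j][j]
    return L

def fgn_cholesky(n, H, rng):
    """Sample X = L z with z i.i.d. standard normal, so cov(X) = L L' = Gamma."""
    L = cholesky(fgn_cov_matrix(n, H))
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
```

The O(n³) factorization cost is the method's main drawback relative to the Hosking recursion, which is O(n²).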

R/S analysis
The R/S statistic discussed by Rose [24] is particularly attractive because of its relative robustness against changes in the marginal distribution, even for long-tailed or skewed distributions. Given an empirical time series (X_k : k = 1, ..., N) of length N, the whole series is subdivided into K non-overlapping blocks. The next step is to compute the rescaled adjusted range R(t_i, d)/S(t_i, d) for several values of d, where t_i = ⌊N/K⌋(i − 1) + 1 are the starting points of the blocks that satisfy (t_i − 1) + d ≤ N. Here, R(t_i, d) is defined as

R(t_i, d) = max{W(t_i, 1), ..., W(t_i, d)} − min{W(t_i, 1), ..., W(t_i, d)},

where

W(t_i, k) = Σ_{j=1}^{k} X_{t_i + j − 1} − (k/d) Σ_{j=1}^{d} X_{t_i + j − 1},  k = 1, ..., d,

and S(t_i, d) is the sample standard deviation of X_{t_i}, ..., X_{t_i + d − 1}. For a long-range dependent series the expected value of R/S scales like d^H, so H is estimated as the slope of a regression of log(R(t_i, d)/S(t_i, d)) against log d.
Stat., Optim. Inf. Comput. Vol. 8, December 2020
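The procedure can be sketched as follows (illustrative function names; the R/S statistic of each block is averaged per block size d, and H is the slope of the log-log regression):

```python
import math
import random

def rs_statistic(x):
    """Rescaled adjusted range R/S of one block x_1, ..., x_d."""
    d = len(x)
    mean = sum(x) / d
    w, cum = [], 0.0
    for v in x:
        cum += v - mean          # partial sums of the mean-adjusted series
        w.append(cum)
    r = max(w) - min(w)          # adjusted range R
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / d)  # standard deviation S
    return r / s

def hurst_rs(x, block_sizes):
    """Slope of log(R/S) against log(d), averaging over non-overlapping blocks."""
    pts = []
    for d in block_sizes:
        vals = [rs_statistic(x[i:i + d]) for i in range(0, len(x) - d + 1, d)]
        pts.append((math.log(d), math.log(sum(vals) / len(vals))))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return (sum((p[0] - mx) * (p[1] - my) for p in pts)
            / sum((p[0] - mx) ** 2 for p in pts))

random.seed(5)
iid = [random.gauss(0.0, 1.0) for _ in range(4096)]
print(round(hurst_rs(iid, [16, 64, 256, 1024]), 2))  # near 0.5 for iid noise
```

The well-known small-sample bias of R/S means the iid estimate typically lands slightly above 0.5; this is why the paper averages over many replications.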

The Development of GSFO-ARG Generator
Theorem 3.1: For any short-memory AR(1) process X_k with underlying distribution D and autocorrelation function ρ_k = γ_k/γ_0, the transformed process X′_k defined over X_k is a self-similar process with the same distribution D and autocorrelation function ρ′_k = (c k_0^{2H−2})^{k/k_0}, where k_0 is an optimal fractional index parameter, H is the Hurst index, and c = exp(k_0 log H − (2H − 2) log k_0).

Proof:
As earlier defined, an ARMA(p, q) process X = {X_k : k = 0, 1, ...} is a short-memory process that is the solution of equation (13) [2]:
ϕ(L) X_k = θ(L) ϵ_k,   (13)

with ϕ, θ, and the lag operator L as defined in subsection 2.4. Equation (13) can then be re-written as

X_k = ϕ(L)^{−1} θ(L) ϵ_k.

Accordingly, following Cryer and Chan [3], an AR(1) process is

X_k = ϕ X_{k−1} + ϵ_k.   (15)

The corresponding autocovariance property of the process is

γ_k = ϕ^k γ_0,   (16)

and var(X_k) = γ_0, which then implies from (16) that

ρ_k = γ_k / γ_0 = ϕ^k.

Thus, if |ϕ| < 1, the magnitude of the autocorrelation function decreases exponentially as the lag k increases. This implies that an AR(1) process is a short-memory process. Our approach is to make the simple AR(1) process a long-memory process by equating the established autocorrelation of a self-similar process to that of the AR(1) process and solving the resulting expression. The derivation follows from

ϕ^k = c k^{2H−2},

so that

ϕ = (c k^{2H−2})^{1/k}.   (21)

The problem here is to obtain the value of ϕ that will ensure a specific self-similar index H given c and k. To obtain a c that will ensure a specific H, we impose ϕ = H, so that H^k = c k^{2H−2} and hence

c = exp(k log H − (2H − 2) log k).   (23)

After substituting (23) in (21), we have

ϕ = [exp(k log H − (2H − 2) log k) · k^{2H−2}]^{1/k},

and finally,

ϕ = (c k^{2H−2})^{1/k}, with c as given in (23).   (25)

Equation (25) shows that if k is chosen appropriately, the autocorrelation will decay slowly, since the autocorrelation of an AR(1) process is ϕ^k. The next step in the simulation process is to obtain the value of k that will also ensure a specific H, keeping in mind that 0 < ϕ < 1 is required to achieve a stationary AR(1) process as well as the positive autocorrelation property of a self-similar sequence. For k ≤ H and ϕ > 1, the AR(1) process will not be stationary. If k → ∞, then ϕ → 1, and thus k lies in the interval H < k < ∞. Therefore, to obtain an exact self-similar process with a specific H, k should be increased gradually until the target H is achieved. By direct substitution of (25) into (15), our proposed GSFO-ARG generator X′_k can be obtained using

X′_k = (c k_0^{2H−2})^{1/k_0} X′_{k−1} + ϵ_k,   (26)

where k_0 is the optimal k that will ensure a specific H. The distribution of ϵ_k determines the distribution of X′_k; thus the GSFO-ARG generator applies to any distribution.
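The generator admits a very compact sketch (illustrative function names; standard-normal innovations are used here for checking, although the derivation permits any innovation distribution, including heavy-tailed ones):

```python
import math
import random

def gsfo_arg(n, H, k0, rng, innovation=None):
    """Sketch of a GSFO-ARG-style generator:
    X'_k = phi * X'_{k-1} + eps_k with phi = (c * k0^(2H-2))^(1/k0) and
    c = exp(k0*log(H) - (2H-2)*log(k0)). The innovation distribution is
    arbitrary (heavy-tailed permitted); standard normal is the default."""
    innovation = innovation or (lambda: rng.gauss(0.0, 1.0))
    c = math.exp(k0 * math.log(H) - (2 * H - 2) * math.log(k0))
    phi = (c * k0 ** (2 * H - 2)) ** (1.0 / k0)
    x = [innovation()]
    for _ in range(n - 1):
        x.append(phi * x[-1] + innovation())
    return x, phi

series, phi = gsfo_arg(20_000, 0.8, 5, random.Random(11))
```

Note the algebraic consequence of the choice of c: c k_0^{2H−2} equals H^{k_0}, so the coefficient phi reduces exactly to H, consistent with the substitution ϕ = H made in the derivation.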
The corresponding autocorrelation property is

ρ′_k = (c k_0^{2H−2})^{k/k_0}.   (27)

Equation (27) shows that the autocorrelation of the sequence X′_k decays hyperbolically and eventually yields the expected autocorrelation of a self-similar process as k_0 → k.

Numerical Examples and Results
We present the results of five existing self-similar generators compared with the proposed GSFO-ARG method at different values of the Hurst parameter H. Since values of H in the interval 0.5 < H < 1 are the ones desirable for a self-similar process with LRD, which is the focus of this study, we chose H = 0.6, 0.7, 0.8 and 0.9 for the implementation of all the self-similar generators discussed in this study. Also, the optimal value of k_0 desirable for each of the chosen Hurst parameters H was determined and used in the proposed GSFO-ARG method to generate its own set of stochastic sequences.
Furthermore, the relative performance of each generator is assessed using the percentage error formula

PE = ((Ĥ − H)/H) × 100.   (28)

In some instances, especially when the value of the Hurst parameter H is underestimated by a generator, the absolute percentage error can be used, given by

APE = (|Ĥ − H|/H) × 100.   (29)

In equations (28) and (29), Ĥ represents the estimate of the Hurst parameter H provided by each generator. The results in Table 1 are the average estimates of the Hurst parameters based on the sequences generated by each of the existing self-similar generators and by our proposed generator at sample size n = 2^8, averaged over 1000 iterations. Table 1 also reports the percentage error (PE) and absolute percentage error (APE) of the estimates computed for all six self-similar generators. The APE is used to assess the efficiency of the generators of the self-similar process; based on this assessment, the method with the least APE is the most efficient among all the self-similar generators considered. Similarly, the results of all six self-similar generators at sample size n = 2^10 over 1000 iterations are provided in Table 2. A similar pattern of results was observed in both Table 1 and Table 2 for all six generators. Tables 1 and 2 also show that the proposed GSFO-ARG method has the least absolute percentage errors and is thus adjudged the most efficient of the six methods considered. To give a clear overview of how close the estimated Hurst parameters Ĥ provided by the six self-similar generators, including the proposed GSFO-ARG method, are to their true values H, we plotted H against Ĥ in Figures 1a and 1b using the results in Tables 1 and 2, respectively. Each figure includes the line H = Ĥ, which is the reference line against which the line provided by each of the generators is compared. By this, the generator whose line is closest to the reference line is adjudged the best of the six generators.
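Assuming the standard definitions PE = (Ĥ − H)/H × 100 and APE = |Ĥ − H|/H × 100, the two error measures are straightforward to compute (illustrative function names):

```python
def percentage_error(h_hat, h):
    """Signed percentage error of an estimated Hurst index."""
    return (h_hat - h) / h * 100.0

def absolute_percentage_error(h_hat, h):
    """Absolute percentage error; used to rank the generators."""
    return abs(h_hat - h) / h * 100.0

# e.g., a generator returning H-hat = 0.76 when H = 0.8 underestimates by 5%
print(round(percentage_error(0.76, 0.8), 2))
print(round(absolute_percentage_error(0.76, 0.8), 2))
```

The sign of PE tells whether a generator over- or underestimates H, while APE is what the comparison tables rank on.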
Finally, Figure 2 provides the line graphs of the percentage absolute errors of all six self-similar generators at the chosen Hurst parameter values, as reported in Table 2 at sample size n = 2^10. The method whose graph has points closest to the horizontal axis is adjudged the most efficient among the competing ones. In summary, the efficiency of five existing self-similar Internet packet generators was examined, and an alternative method, GSFO-ARG(1), for generating such self-similar sequences was proposed for efficiency gain. The efficiency of the five existing methods and the proposed method was assessed using the APE. The various results presented in Tables 1 and 2 showed that the method providing the best estimates of the Hurst index H is the proposed GSFO-ARG method, shortened to the AR(1) generator because its development originated from the first-order autoregressive process.

Conclusion
It can generally be concluded that whenever interest centres on examining the self-similar processes of Internet traffic, the proposed GSFO-ARG(1) method should be employed, irrespective of whether the self-similar process is Gaussian or non-Gaussian.
In future studies, it might be worthwhile to consider the impacts of different operating systems and other properties of users' computer systems employed for generating sequences of Internet traffic data. This is desirable in order to determine the influence of different system configurations on the properties of the self-similar processes as established here and elsewhere [16,17,18].