Informational Energy and Entropy Applied to Testing Exponentiality

The exponential distribution is widely used in reliability and life-testing analysis. In this paper, two tests of fit for the exponential distribution, based on informational energy and entropy, are constructed. Consistency and other properties of the tests are proved. Using a simulation study, critical values of the proposed tests are obtained, and power values of the tests are then computed and compared against various alternatives. Finally, we apply the tests to the times between failures of secondary reactor pumps and to the waiting times between fatal plane accidents in the USA from 1983 to 1998.


Introduction
Suppose that the random variable X has distribution function F with density function f. The informational energy ε(f) of the random variable is defined as

ε(f) = ∫ f^2(x) dx.

Onicescu (1966) justified the name informational energy and its connection to information theory by analogy with energy in classical mechanics. Rao (1973) obtained distributions describing equilibrium states in statistical mechanics based on the informational energy. The informational energy has been used in many statistical problems; see Theodorescu (1977), Onicescu and Stefanescu (1979), Pardo and Taneja (1991) and the references therein. In non-parametric statistics, an estimator of informational energy is a useful tool. Pardo (2003) introduced such an estimator as follows. He noted that ε(f) can be expressed as

ε(f) = ∫_0^1 { d F^{-1}(p) / dp }^{-1} dp.
Then he constructed its estimator by replacing the distribution function F by the empirical distribution function F_n, and using a difference operator instead of the differential operator; the derivative of F^{-1}(p) is then estimated by a function of the order statistics. Assuming that X_1, ..., X_n is a random sample, the estimator proposed by Pardo (2003) is

ε_mn = (1/n) Σ_{i=1}^{n} { n (X_(i+m) − X_(i−m)) / (2m) }^{-1},

where the window size m is a positive integer with m ≤ n/2, X_(1) ≤ X_(2) ≤ ··· ≤ X_(n) are the order statistics of the sample, and X_(i) = X_(1) for i < 1 and X_(i) = X_(n) for i > n. Consistency of ε_mn is also proved by Pardo (2003). Moreover, Pardo (2003) showed that among all distributions that possess a density function f with support (0, 1), the informational energy ε(f) is minimized by the uniform distribution, and based on this property he constructed a test of fit for the uniform distribution; large values of ε_mn indicate that the sample is from a non-uniform distribution. The exponential distribution plays a central role in reliability and life-testing analysis; therefore, constructing a goodness-of-fit test for this distribution is useful in practice. In this article, we apply the informational energy to introduce a powerful goodness-of-fit test for the exponential distribution. The properties of the test are then stated and its power is compared with that of an existing test.
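As a concrete illustration, Pardo's estimator ε_mn can be computed as follows. This is a minimal sketch using only the standard library; the function name is ours, and the boundary handling follows the convention stated above:

```python
import random

def informational_energy(sample, m):
    """Pardo's (2003) spacing estimator of eps(f) = integral of f(x)^2 dx."""
    n = len(sample)
    if not 1 <= m <= n // 2:
        raise ValueError("window size m must satisfy 1 <= m <= n/2")
    x = sorted(sample)
    # Convention: X_(i) = X_(1) for i < 1 and X_(i) = X_(n) for i > n.
    order = lambda i: x[min(max(i, 1), n) - 1]
    return sum(2 * m / (n * (order(i + m) - order(i - m)))
               for i in range(1, n + 1)) / n

# For Uniform(0, 1) data, eps(f) = 1, the minimum over densities on (0, 1),
# so the estimate should be close to 1 for a large uniform sample.
random.seed(0)
u = [random.random() for _ in range(2000)]
est = informational_energy(u, m=20)
```

For a density concentrated away from uniformity, the estimate exceeds 1, which is exactly the behaviour the uniformity test exploits.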
In Section 2, we introduce two tests of fit for exponentiality based on informational energy and entropy, respectively. Consistency and other properties of the tests are established. In Section 3, we obtain critical values, compute the power of the tests against a wide variety of alternatives, and show that the test based on informational energy has good performance. Finally, we analyze two real data sets to illustrate the tests.

Test construction
In this section, we explain two methods for testing exponentiality.

Testing exponentiality based on informational energy
Suppose X_1, ..., X_n is a random sample from a continuous probability distribution F with density f on a non-negative support and with mean µ < ∞. We are interested in testing the hypothesis

H_0 : f(x) = λ e^{−λx}, x ≥ 0,

against the general alternative

H_1 : f(x) ≠ λ e^{−λx}, for some x ≥ 0,

where λ = 1/µ is unspecified.
Without any loss of generality, by the probability integral transformation U = F_0(X), where F_0(x) = 1 − e^{−λx} is the distribution function under the null hypothesis, we can reduce the above goodness-of-fit problem to testing uniformity on the unit interval. Therefore, if U_i = F_0(X_i), i = 1, 2, ..., n, is the transformed sample, the hypothesis becomes

H_0 : the U_i are uniform on (0, 1) versus H_1 : the U_i are not uniform on (0, 1).

Hence, the test of exponentiality is converted into a test of uniformity.
Here, we apply the test introduced by Pardo (2003) for testing uniformity of the transformed sample, where the unknown parameter λ is estimated by λ̂ = 1/X̄, i.e., U_i = 1 − exp(−X_i/X̄), i = 1, ..., n.
Consequently, the proposed test statistic can be stated as

T_mn = (1/n) Σ_{i=1}^{n} { n (U_(i+m) − U_(i−m)) / (2m) }^{-1},

where U_(1) ≤ ··· ≤ U_(n) are the order statistics of the transformed sample, with U_(i) = U_(1) for i < 1 and U_(i) = U_(n) for i > n, and H_0 is rejected for large values of T_mn. It is obvious that the test statistic is invariant with respect to scale transformations, since the U_i depend on the data only through the ratios X_i/X̄.
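A direct implementation of T_mn can be sketched as follows (the function name is ours; λ is estimated by 1/X̄ as above). The usage lines also check scale invariance numerically:

```python
import math
import random

def t_mn(sample, m):
    """Informational-energy statistic for exponentiality: apply the
    probability integral transform U_i = 1 - exp(-X_i / Xbar), then
    Pardo's uniformity statistic to the transformed order statistics."""
    n = len(sample)
    xbar = sum(sample) / n
    u = sorted(1.0 - math.exp(-x / xbar) for x in sample)
    # Convention: U_(i) = U_(1) for i < 1 and U_(i) = U_(n) for i > n.
    order = lambda i: u[min(max(i, 1), n) - 1]
    return sum(2 * m / (n * (order(i + m) - order(i - m)))
               for i in range(1, n + 1)) / n

# Scale invariance: rescaling the data leaves T_mn (essentially) unchanged.
random.seed(1)
data = [random.expovariate(1.0) for _ in range(50)]
t1 = t_mn(data, m=5)
t2 = t_mn([10.0 * x for x in data], m=5)
```

The two values agree up to floating-point rounding, reflecting that U_i depends only on X_i/X̄.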

Remark 1
When the parameter of the distribution is specified as λ = λ_0, the test statistic is defined in the same way with U_i = 1 − exp(−λ_0 X_i). Similar to the arguments in Pardo (2003), the following theorems are stated and proved.

Theorem 1
Let X_1, ..., X_n be a random sample. Then T_mn ≥ 1.

Proof
We know that the geometric mean does not exceed the arithmetic mean; therefore

T_mn = (1/n) Σ_{i=1}^{n} 2m / { n (U_(i+m) − U_(i−m)) } ≥ [ Π_{i=1}^{n} 2m / { n (U_(i+m) − U_(i−m)) } ]^{1/n}.

On the other hand, applying the same inequality to the normalized spacings and noting that Σ_{i=1}^{n} (U_(i+m) − U_(i−m)) ≤ 2m, since the U_(i) lie in (0, 1), we have

Π_{i=1}^{n} n (U_(i+m) − U_(i−m)) / (2m) ≤ [ (1/n) Σ_{i=1}^{n} n (U_(i+m) − U_(i−m)) / (2m) ]^{n} ≤ 1.

Therefore T_mn ≥ 1.

Theorem 2
Let X_1, ..., X_n be a random sample from the exponential distribution. If m, n → ∞ with m = o(n) and m ≠ 1, then T_mn converges to 1 in probability.

Proof
Under H_0 the transformed sample is uniform on (0, 1), so U_(j) has a beta distribution with parameters j and n − j + 1, and for the interior indices m < i ≤ n − m the spacing U_(i+m) − U_(i−m) has a beta distribution with parameters 2m and n − 2m + 1. Hence

E[ 2m / { n (U_(i+m) − U_(i−m)) } ] = (2m/n) · n/(2m − 1) = 2m/(2m − 1),

while the 2m boundary terms contribute o(1) when m = o(n). It follows that E(T_mn) = 2m/(2m − 1) + o(1) → 1 as m → ∞. The condition m ≠ 1 ensures that the second moment of the summands exists, and Var(T_mn) → 0; therefore, by Chebyshev's inequality,

T_mn → 1 in probability.
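The convergence in Theorem 2 is easy to observe empirically. The following sketch (our own illustration, not part of the proof) averages T_mn over repeated exponential samples for growing n with m = o(n):

```python
import math
import random

def t_mn(sample, m):
    """Informational-energy exponentiality statistic (as defined above)."""
    n = len(sample)
    xbar = sum(sample) / n
    u = sorted(1.0 - math.exp(-x / xbar) for x in sample)
    order = lambda i: u[min(max(i, 1), n) - 1]
    return sum(2 * m / (n * (order(i + m) - order(i - m)))
               for i in range(1, n + 1)) / n

random.seed(2)
avgs = []
for n, m in [(50, 5), (200, 10), (1000, 30)]:
    reps = 200
    avgs.append(sum(t_mn([random.expovariate(1.0) for _ in range(n)], m)
                    for _ in range(reps)) / reps)
# avgs should decrease toward 1 as n grows, consistent with Theorem 2.
```

By Theorem 1 every replication yields a value of at least 1, so the averages approach 1 from above.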

Testing exponentiality based on entropy
The entropy H(f) of a continuous random variable X with density function f was defined by Shannon (1948) as

H(f) = − ∫ f(x) log f(x) dx.

Let X_1, ..., X_n be a random sample of size n, and let X_(1) ≤ X_(2) ≤ ··· ≤ X_(n) denote the order statistics of the sample.

Many researchers have studied non-parametric estimation of H(f). Vasicek (1976) first introduced an estimator of entropy:

HV_mn = (1/n) Σ_{i=1}^{n} log{ n (X_(i+m) − X_(i−m)) / (2m) },

where the window size m is a positive integer smaller than n/2, and X_(i) = X_(1) for i < 1 and X_(i) = X_(n) for i > n. He proved the consistency of HV_mn for the population entropy H(f). Gokhale (1983) proposed a test statistic for the exponential distribution based on entropy. Then Ebrahimi et al. (1992) obtained a test statistic using Kullback-Leibler information for the exponential distribution. Also, Alizadeh Noughabi and Arghami (2013) showed that the tests based on entropy and Kullback-Leibler information are equivalent. We explain the exponentiality test based on entropy as follows.
It is known that if X is a non-negative random variable with given mean E(X) = λ^{-1}, then

H(f) ≤ 1 − log λ,

and among all non-negative random variables with this mean the exponential distribution maximizes the entropy. Therefore, Gokhale (1983) proposed the following test statistic:

TV_mn = exp(HV_mn) / X̄,

where HV_mn is Vasicek's entropy estimator and X̄ is the sample mean. Under the null hypothesis TV_mn converges to e, and we reject the null hypothesis for small values of TV_mn.
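The entropy-based procedure can be sketched in the same style (a minimal illustration of the formulation above; the function names are ours):

```python
import math
import random

def vasicek_entropy(sample, m):
    """Vasicek's (1976) spacing estimator of H(f) = -integral f log f."""
    n = len(sample)
    x = sorted(sample)
    # Convention: X_(i) = X_(1) for i < 1 and X_(i) = X_(n) for i > n.
    order = lambda i: x[min(max(i, 1), n) - 1]
    return sum(math.log(n * (order(i + m) - order(i - m)) / (2 * m))
               for i in range(1, n + 1)) / n

def tv_mn(sample, m):
    """Entropy-based statistic exp(HV_mn) / Xbar; under exponentiality it
    is close to e, and small values lead to rejection."""
    xbar = sum(sample) / len(sample)
    return math.exp(vasicek_entropy(sample, m)) / xbar

# For exponential data the statistic should lie near e (about 2.718),
# slightly below it for finite n because HV_mn is biased downward.
random.seed(3)
tv = tv_mn([random.expovariate(2.0) for _ in range(500)], m=10)
```

Like T_mn, this statistic is scale invariant: rescaling X shifts HV_mn by the log of the scale, which the division by X̄ cancels.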

Simulation study
For small to moderate sample sizes, the critical values of the test based on informational energy are obtained using 30,000 replications for each sample size n. Table 1 presents the critical values of the T_mn statistic for various sample sizes at significance level α = 0.05. Quantiles of TV_mn are reported in Gokhale (1983), so we do not reproduce them here.
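The Monte Carlo scheme for the critical values can be sketched as follows. This is illustrative only: we use far fewer replications than the 30,000 behind Table 1, so the resulting number is a rough estimate rather than a tabulated entry:

```python
import math
import random

def t_mn(sample, m):
    """Informational-energy exponentiality statistic (as defined above)."""
    n = len(sample)
    xbar = sum(sample) / n
    u = sorted(1.0 - math.exp(-x / xbar) for x in sample)
    order = lambda i: u[min(max(i, 1), n) - 1]
    return sum(2 * m / (n * (order(i + m) - order(i - m)))
               for i in range(1, n + 1)) / n

def critical_value(n, m, alpha=0.05, reps=2000):
    """Estimate the upper-alpha quantile of T_mn under the exponential
    null; H0 is rejected when the observed T_mn exceeds this value."""
    stats = sorted(t_mn([random.expovariate(1.0) for _ in range(n)], m)
                   for _ in range(reps))
    return stats[int((1 - alpha) * reps) - 1]

random.seed(4)
cv = critical_value(n=20, m=10)
```

Because T_mn is scale invariant, simulating from the unit exponential suffices: the null distribution of the statistic does not depend on λ.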
To compare the power values of the considered tests, we select the same three families of alternatives listed in Ebrahimi et al. (1992) and their choices of parameters:

(a) the Weibull distribution with density function f(x) = βλ^β x^{β−1} exp{−(λx)^β};
(b) the gamma distribution with density function f(x) = λ^β x^{β−1} exp(−λx) / Γ(β);
(c) the log-normal distribution with density function f(x) = (xσ√(2π))^{-1} exp{−(log x − v)^2 / (2σ^2)}.

We chose the parameters so that E(X) = 1, i.e., λ = Γ(1 + 1/β) for the Weibull, λ = β for the gamma, and v = −σ^2/2 for the log-normal family of distributions.
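The mean-one parameterizations can be checked numerically. In this quick sketch the particular values β = 2 and σ = 0.5 are arbitrary choices for illustration; note that the standard library's samplers take scale parameters, i.e., the reciprocals of the rates λ above:

```python
import math
import random

random.seed(5)
N = 100_000

# Weibull with lam = Gamma(1 + 1/beta): random.weibullvariate takes the
# scale 1/lam, so the mean is Gamma(1 + 1/beta) / lam = 1.
beta = 2.0
lam = math.gamma(1 + 1 / beta)
weibull_mean = sum(random.weibullvariate(1 / lam, beta) for _ in range(N)) / N

# Gamma with shape beta and rate lam = beta (scale 1/beta): mean beta/lam = 1.
gamma_mean = sum(random.gammavariate(beta, 1 / beta) for _ in range(N)) / N

# Log-normal with v = -sigma^2 / 2: mean exp(v + sigma^2 / 2) = 1.
sigma = 0.5
lognorm_mean = sum(random.lognormvariate(-sigma**2 / 2, sigma)
                   for _ in range(N)) / N
```

All three sample means land within Monte Carlo error of 1, confirming the stated parameter choices.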
We compare the power values of the informational energy based test with those of the entropy based test, for samples of sizes 10 and 20. Under each alternative, we generated 20,000 samples of each size and computed the test statistics (T_mn, TV_mn). The power of each test was estimated by the relative frequency of samples for which the test statistic fell in the critical region. Table 2 presents the estimated powers at significance levels α = 0.01 and α = 0.05. The power values of the entropy test are based on the window sizes reported in Ebrahimi et al. (1992), which give the maximum power for this test. For the proposed test, the maximum power was typically attained by choosing m = 5 for n = 10 and m = 10 for n = 20. Generally, the optimal choice of m increases with n.
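The power computation can be sketched as follows: an illustration with a reduced replication count, shown for the informational energy statistic under a Weibull alternative with β = 2 and mean 1. The resulting number is a rough estimate, not a Table 2 entry:

```python
import math
import random

def t_mn(sample, m):
    """Informational-energy exponentiality statistic (as defined above)."""
    n = len(sample)
    xbar = sum(sample) / n
    u = sorted(1.0 - math.exp(-x / xbar) for x in sample)
    order = lambda i: u[min(max(i, 1), n) - 1]
    return sum(2 * m / (n * (order(i + m) - order(i - m)))
               for i in range(1, n + 1)) / n

random.seed(6)
n, m, reps, alpha = 20, 10, 2000, 0.05

# Step 1: critical value under the exponential null.
null_stats = sorted(t_mn([random.expovariate(1.0) for _ in range(n)], m)
                    for _ in range(reps))
cv = null_stats[int((1 - alpha) * reps) - 1]

# Step 2: rejection frequency under the Weibull(beta = 2) alternative,
# scaled so that E(X) = 1 (scale parameter 1 / Gamma(1.5)).
scale = 1 / math.gamma(1.5)
power = sum(t_mn([random.weibullvariate(scale, 2.0) for _ in range(n)], m) > cv
            for _ in range(reps)) / reps
```

The same two-step pattern (simulate the null quantile, then count rejections under the alternative) yields every entry of a power table like Table 2.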
From Table 2, it is seen that the tests differ in power, and the comparison indicates the superiority of the procedure based on informational energy over the entropy test. For small sample sizes the tests achieve similar power, while for larger sample sizes the informational energy test is the more powerful, and the difference between the power values of T_mn and TV_mn becomes substantial. In the following examples, we apply the tests to two real data sets: the times between failures of secondary reactor pumps reported by Suprawhardana and Sangadji (1999), and the inter-occurrence times of fatal accidents suffered by scheduled large planes in the USA from 1983 to 1998. Histograms of the considered data sets are presented in Figure 1.

Example 1
The proposed tests are applied to the inter-occurrence times of the fatal accidents. After computing the values of the test statistics, we conclude that the distribution of the inter-occurrence times of fatal accidents on scheduled large planes in the USA (1983–1998) does not differ significantly from the exponential. Therefore, these inter-occurrence times may be regarded as exponentially distributed.

Conclusion
In this paper, we proposed two tests for exponentiality based on the estimated informational energy and entropy, respectively. Consistency and other properties of the test statistics were presented. Then, we obtained the critical values of the proposed test and computed the power values of the considered tests using Monte Carlo simulation for different sample sizes against various alternatives. We observed that the test based on informational energy performs very well compared with the test based on entropy for Weibull, gamma, and log-normal alternatives. Moreover, the relative superiority of the proposed test over the entropy test increases with the sample size.