On the Beta Exponential Pareto Distribution

In this article we propose and study the so-called beta exponential Pareto (BEP) distribution. Several lifetime distributions such as the beta Weibull, beta exponential, beta Rayleigh, generalized Weibull, Weibull among others are embedded in the proposed distribution. Various mathematical properties of the BEP distribution are presented. We also discuss the parameter estimation methods and simulation issues. The importance and flexibility of the proposed model is illustrated by means of real data analysis.


Introduction
The Pareto distribution is well known in the literature for its capability of modeling heavy-tailed data. Burroughs and Tebbens [7] discussed applications of the Pareto distribution in modeling earthquakes, forest fire areas and oil and gas field sizes. Newman [22] provided many other quantities measured in the physical, biological, and social systems of various kinds, where the Pareto distribution has been found to be useful to model data. For detailed review of Pareto distribution and related topics, readers are referred to Arnold [3] and the references therein. The Pareto distribution has a scale parameter which acts as a threshold value of the observations. This means Pareto distribution could be used only if it takes positive values greater than the threshold parameter. This restriction has limited the usefulness of the Pareto distribution. To add flexibility to the Pareto distribution various generalizations have appeared in the literature. For example, Alzaatreh et al. [5] proposed the Weibull-Pareto distribution, Bourguignon et al. [8] introduced the Kumaraswamy-Pareto distribution, Alzaatreh et al. [6] introduced the Gamma-Pareto distribution, Akinsete et al. [2] introduced the beta-Pareto distribution, Zea et al. [25] studied the beta exponentiated Pareto distribution, Mahmoudi [19] proposed the beta generalized Pareto distribution, Elbatal [13] studied the Kumaraswamy exponentiated Pareto distribution and Tahir et al. [24] studied a new Weibull-Pareto distribution.
In this paper we define and study the so-called beta exponential Pareto (BEP) distribution which is obtained by using the genesis of the Pareto distribution, beta distribution and exponential distribution. Although BEP will have five parameters, the main advantage of this formulation is that the support of the proposed distribution is (0, ∞) which adds versatility and applicability to model data under study. The rest of the paper is unfolded as follows. In the rest of Section 1, we provide the key ingredients of BEP distribution. In Section 2, we define the BEP distribution and discuss some of its sub-models. Section 3 discusses some structural and mathematical properties 418 ON THE BETA EXPONENTIAL PARETO DISTRIBUTION of the BEP distribution. The elements of reliability analysis are discussed in Section 4. Parameter estimation procedures using method of maximum likelihood are presented in Section 5. In Section 6, we discuss the simulation issues related to parameter estimation. Applications to model real world data are discussed in Section 7. Section 8 provides some concluding remarks. Our calculations make use of the following special functions: the gamma function defined by the incomplete beta function defined by where a > 0, b > 0 and 0 < y < 1; and the Gaussian hypergeometric function defined by 2 where (c) k is the ascending factorial or Pochhammer symbol defined by (assuming that (c) 0 = 1)

Review on Key Ingredients
In order to make this work more self contained we will briefly provide the key ingredients which will be used to construct the BEP distribution.
1.1.1. Beta generated distribution Eugene, Lee and Famoye [14] used the beta distribution as a generator to develop the so-called a family of beta-generated (BG) distributions based on the following formulation. Let G(x) be the cumulative distribution function (cdf) of a random variable X. Then the cdf of the beta-G random variable is given by where a > 0 and b > 0 are shape parameters. Note that I y (a, b) = By(a,b) B(a,b) is the incomplete beta function ratio, B y (a, b) is the incomplete beta function, and B(a, b) = Γ(a)Γ(b) Γ(a+b) is the beta function and Γ(.) is the gamma function as defined in Section 1.
The probability density function (pdf) of the Beta-G distribution has the form This class of generalized distribution has received considerable attention over the last few years and several classical distributions have been generalized using this formulation.

Exponential distribution
A random variable X is said to have exponential distribution if its pdf and cdf are given by , where x > 0 and α > 0 is the shape parameter.

Pareto distribution
A random variable X is said to have Pareto distribution if its pdf and cdf are given by , θ > 0 is the shape parameter and λ > 0 is the (threshold) scale parameter.

Exponential Pareto distribution
Al-Kadim and Boshi [4] introduced a new distribution that is dependent on both the exponential distribution and Pareto distribution, called exponential Pareto (EP) distribution. The cdf of exponential Pareto distribution is given by The corresponding pdf and hazard rate function (hrf) of EP distribution are, respectively, given by

The Beta Exponential Pareto Distribution
In this section we study the beta exponential Pareto distribution and the sub-models of this distribution. Now inserting (3) and (4) in (1) and (2), we have the cdf (F BEP ) and pdf (f BEP (x)) of beta exponential Pareto (BEP) distribution specified, respectively, by and

Special cases of beta exponential Pareto distribution
The BEP distribution is very flexible as it approaches to several different distributions when its parameters are chosen appropriately. If X is a random variable with cdf (5), then we have the following distributions as special case of BEP distribution.

Mixture representation
In this section we derive the series representations of the cdf and the pdf of the BEP distribution which will be useful to study its mathematical and statistical characteristics. As we shall see both pdf and cdf of BEP distribution can be expressed in terms of the exponential Pareto distribution. By using the power series expansion of ( for any real b. Therefore,for any b > 0, real non-integer, we write (5) as Using the incomplete beta function and the hypergeometric confluent function, the cdf of BEP can be expressed as below (for details see Cordeiro and Nadarajah [12] ): Hence, (e) k . z k k! is the Gaussian hypergeometric function with (c) k , the ascending factorial, defined by

422
ON THE BETA EXPONENTIAL PARETO DISTRIBUTION Also using the power series expansion Equation (6) becomes and g(x; θ, λ, α(j + 1)) is the pdf of exponential Pareto distribution with parameters θ, λ, and α(j + 1). Thus, the BEP density function can be expressed as an infinite linear combination of exponential Pareto densities. Thus, some of its mathematical properties can be obtained directly from those properties of the exponential Pareto distribution.

Statistical Properties
In this Section we study some statistical properties of the BEP distribution, specifically quantile function, moments, incomplete moment and moment generating function.

Quantile function
The quantile function for a probability distribution has many uses in both the theory and applications. It can be used to generate random numbers from an arbitrary distribution. On inverting (5), we have the BEP quantile function given by where I −1 q (a, b) is the inverse of the incomplete beta function with parameters a and b. The inverse incomplete beta function I −1 q (a, b) can be expressed in terms of beta functions as shown in Zea et al. [25] as below where q 1 = 1 and the remaining coefficients satisfy the following recursive relation We use the inverse transformation method to generate random numbers from the beta transmuted Pareto distribution as F (x) = u, where u ∼ U (0, 1). Solving the expression F (x) = u gives where is the inverse of the incomplete beta function. Further, we can use (7) to obtain the median, as well as octiles and then the measure of Bowley's skewness and Moors kurtosis. These measures are quartile alternatives to the traditional skewness and kurtosis, and are more robust estimation of these measures as described in Kenney and Keeping [16].
• Bowley skewness (B sk ): The Bowley skewness is defined using the quartiles as The Moors kurtosis is defined using the octiles as  The quantiles are also useful to study the quantile spread (QS) function of a random variable which describes how the probability mass is placed symmetrically about its median and hence can be used to formalize concepts such as peakedness and tail weight traditionally associated with kurtosis. The QS is defined as

ON THE BETA EXPONENTIAL PARETO DISTRIBUTION
The QS function for BEP is given by The QS function as a function of p for BEP distribution is displayed in Figure 4.

Ordinary and incomplete moments
In this subsection, we derive the expressions for the ordinary and incomplete moments of BEP. The moments of different orders are helpful to study different characteristics of a distribution and its usefulness to model real life data. Lemma 3.1, Lemma 3.2 and Lemma 3.3 summarize the moment properties of BEP distribution.
Lemma 3.1: If X has BEP distribution then the k th moment of X, k = 1, 2, .... has the following form: Proof: Let X be a random variable with density function (6). The k th ordinary moment of the BEP distribution is given by ) .
This completes the proof. Furthermore, the k th order moments can be expressed as ) .
In Figure 5, we display the effect of the choice of parameters a and b on the mean and the variance of BEP by choosing λ = 2, θ = 0.5 and α = 2.5. Similarly, the effect of the choice of the parameters a and b on the skewness and the kurtosis is displayed in Figure 6 for λ = 2, θ = 0.5 and α = 2.5. Lemma 3.2: If X has BEP distribution, then the moment generating function M X (t) has the following form Proof: The moment generating function M X (t) can be written as M . On using (9) the MGF of BEP distribution is given by which completes the proof.
Furthermore, using (10) the MGF of BEP distribution is given by ) .

Lemma 3.3:
If X has BEP distribution then the conditional moments for X is given by Proof: Using the definition of conditional moment we have is the upper incomplete gamma function which is also known as incomplete gamma function of second kind.

Mean deviation
Let X be a BEP random variable with mean µ = E(X) and median M . Note that we can find the mean µ by substituting r = 1 in equation (9). The mean deviation from the mean (µ) and the mean deviation from the median (M ) can be expressed as where F (.) is the cdf of the BEP distribution and J(t) = ∫ t 0 xf (x)dx. On proceeding similar to the proof of Lemma 3.3, we can compute J(t) as

Reliability Analysis
The reliability function, also known as the survival function, of a probability distribution is the characteristic of an explanatory variable that maps a set of events, usually associated with mortality or failure of some system onto time. It is the probability that the system will survive beyond a specified time. random variable is its hazard rate function which is also known as instantaneous failure rate of a random variable X which is an important quantity characterizing life phenomenon. The hazard function h(x) is defined as where F (.) and f (.) are, respectively, the cdf and pdf of the given distribution. Using Equations (5) and (6), the hazard rate function of BEP distribution can be expressed as The flexibility of BEP distribution to model reliability data is illustrated by varying shape of the hazard rate function in Figure 7. Note that the hazard rate possesses several different shapes by virtue of the choice of the parameters. In particular, Figure 7 illustrates the following shapes of the hazard rate function: constant, decreasing, increasing, upside-down bathtub shape, bathtub shape etc.  Since many standard distributions are embedded in BEP as its special case, the shape of the hazard rate function can be easily specified by choosing appropriate values of the parameters. For example, if we choose α = 1, the resulting distribution reduces to beta Weibull distribution. Therefore, based on the results of Glaser [15] as described in Lee et al. [18] BEP has • constant failure rate when a = α = θ = 1, • decreasing failure rate when α = 1, aθ ≤ 1 and θ ≤ 1, • increasing failure rate when α = 1, aθ ≥ 1 and θ ≥ 1, • a bathtub failure rate when α = 1, aθ < 1 and θ > 1, • upside down bathtub (or unimodal) failure rate when α = 1, aθ > 1 and θ < 1.

Parameter Estimation
In order to carry out statistical inference, estimation of the parameters plays the vital role. Several methods for parameter estimation have been proposed in the literature but the maximum likelihood estimation (MLE) method is commonly used as the MLEs are usually unbiased and have minimum variance. These estimators can be used to construct confidence intervals and hypothesis testing procedures. Here, we outline the method to estimate the parameters of BEP distribution for complete sample (no censoring). Let x 1 , x 2 , x 3 , · · · , x n be a random sample of size n from the BEP distribution given by (6). Let φ = (α, θ, λ, a, b) be 5 × 1 vector of parameters. Then the log-likelihood function for φ is given by ℓ = n ln(θ) + n ln(α) − nθ ln(λ) + n ln (Γ(a + b)) − n ln (Γ(a)) − n ln (Γ(b)) Equation (11) can be maximized directly by using the R (optim function), SAS (PROC NLMIXED), Ox program (MaxBFGS sub-routine) and MATH-CAD program or by solving the nonlinear likelihood equations obtained by differentiating (11). The score vector components, say U (φ) = ∂ℓ ∂φ = ( ∂ℓ ∂α , ∂ℓ ∂θ , ∂ℓ ∂λ , ∂ℓ ∂a , ∂ℓ ∂b ) , of BEP are given by where ψ(.) is the digamma function, i.e. ψ(x) = d dx (ln Γ(x)). The MLEs of φ, say φ , is obtained by solving the nonlinear system of equations U (φ) = 0. Usually, it is more efficient to obtain the MLEs by maximizing ℓ directly. We can use the optim function in R [23] software for direct numerical maximization of ℓ. optim is based on a quasi-Newton algorithm. Throughout this article we will use Nelder-Mead method using AdequacyModel library of the R-package written by Marinho et al. [20] to estimate the parameters. The initial values for numerical maximization can be determined by the method of moments; that 430 ON THE BETA EXPONENTIAL PARETO DISTRIBUTION is, by equating the first five moments of the BEP distribution with the corresponding empirical moments. The simultaneous roots of these five equations will be determined by the routine multiroot in the R software. The optim routine always converged when the method of moments estimates are used as initial values.
For interval estimation of the parameters, we obtain the 5 × 5 observed information matrix J(φ) = { ∂ 2 ℓ ∂r ∂s } (for r, s = α, θ, λ, a, b), whose elements can be computed numerically. Under standard regularity conditions when n → ∞, the distribution of φ can be approximated by a multivariate normal N 5 distribution to construct approximate confidence intervals for the parameters, where J ( φ) is the total observed information matrix evaluated at φ. An (1 − γ)100% asymptotic confidence interval for each parameter φ r (for r = 1, 2, · · · , 5) is given by where J rr denotes the (r, r)th element of J ( φ) −1 .

Simulation
In this Section, we shall investigate the stability of the MLE estimates of BEP distribution with different sample size (n) through a Monte-Carlo study. The simulation procedure as outlined below was performed in R (Statistical software) [23] using AdequacyModel library.
1. Simulate a random sample of size n from the BEP distribution with parameters a, b, α, λ, θ using the inversion method using Equation (8). 2. Set initial values for the parameters (a 0 , b 0 , α 0 , λ 0 , θ 0 ). 3. Compute the mle of the parameters of the BEP distribution. 4. Repeat steps 1-3 for 1000 (N) times. 5. Compute the mean, standard deviation (standard error), and mean square error (MSE) of the 1000 estimates of each parameter. 6. Repeat steps 1-5 with different sample sizes.
Similarly, we have performed the simulation for another combination of the parameters by choosing a = 1, b = 1, λ = 1.5, θ = 4 and α = 1. Table 2 provides the mean, the root mean square errors (RMSEs) and the mean absolute errors (MAEs) from 1000 repetitions. It can be observed that as sample size increases the root mean square error and mean absolute error decreases and the estimates are stable to the true value of the parameters. The effect of sample size on the bias of each parameter for both simulated data is displayed in Figure 8.

Applications
In this Section we illustrate the usefulness of BEP distribution to model real data sets. We consider two different data sets that have been studied by several authors.

The flood peaks of the Wheaton river
The data are the exceedances of flood peaks (in m 3 /s) of the Wheaton River near Carcross in Yukon Territory, Canada. The data consists of 72 exceedances for the years 1958-1984. These data have been analyzed by many authors including Choulakian and Stephens [11], Nadarajah [21], Akinsete et al. [2], and Chhetri et al. [9,10], among others. A summary of the descriptive statistics of the data set is given in Table 3 below. The fitting of BEP distribution is compared with the results from few commonly used distributions in the literature generated from Pareto distribution. In particular we compare the BEP with exponential Pareto (EP), the Kumaraswamy transmuted Pareto distribution (KwTP), the beta transmuted Pareto (BTP), Kumaraswamy Pareto distribution (KwP), the beta Pareto (BP), the transmuted Pareto distribution (TP), the exponentiated Pareto distribution (ExP) and the Pareto distribution (P). Using the method of maximum likelihood procedure we estimate the distribution parameters. The goodness of fit measures, including the log-likelihood function (−ℓ), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Consistent Akaike Information Criterion (CAIC), and Hannan-Quinn Information Criteria (HQIC) are obtained for the fitted models. As mentioned earlier, all required computations are carried out using the AdequacyModel script of R-package by Marinho et al. [20]. Table 4 lists the values the MLEs and their standard errors. Note that except for EP and BEP distribution the estimated value of the threshold parameter λ is the smallest value of the data which is 0.1 but for EP and BEP there is no such constraint so the value of λ is estimated. It should also be noted that for transmuted distributions (TP, BTP and KwTP) the transmuted parameter is denoted by θ. The values of − ℓ, AIC, CAIC, HQIC, BIC are given in Table 5. Observe that the BEP distribution has the lowest value of − ℓ. Note that KWTP and BTP and BEP has equal number of parameters and BEP yields significantly better model than BTP and KwTP whereas EP has smaller value of AIC, BIC, CAIC and HQIC due to fewer number of parameters. One can employ the likelihood ratio test statistic to check the superiority of the BEP distribution over the other distributions (as shown in next example, Section 7.2). The plots comparing the BEP distribution with other distributions are given in Figure 9.  Table 5 and the plots in Figure 9 all indicate that the BEP distribution is superior and fits the data more adequately than any other distributions listed in the Table 3.

Failure times of communication receiver
The data set given by Lawless [17] represents the repair times (in hours) for 46 failures of an airborne communications receiver. This data has been analyzed by several authors to fit different reliability models including the transmuted size-biased exponential distribution by Ahmad and Ahmad [1]. We apply the BEP to fit this data and compare the results with its sub-model Exponential Pareto (EP). We provide in Table 6 the estimated parameters and their standard errors for BEP and EP models using the subject data. The values of the test statistics including the value of AIC, Anderson-Darling (A * ) and Cramér-von Mises (W * ), Kolmogorov-Smirnov (KS) and p-value of KS test are provided in Table 7.
whereφ andφ 0 are the MLEs under H a and H 0 , respectively. The statistic ω is asymptotically distributed as χ 2 k , where k is the length of the parameter vector of interest. Note that for the subject data ω = 6.3882 with p-value = 0.0410. Therefore at 0.05 level of significance, the BEP is superior to EP for the subject data. Plots comparing the exact BEP distribution with EP distribution for repair time data is given in Figure 10 and Figure 11. It is evident that the BEP fits better than EP distribution for this data set. It should be noted that BEP also fits better than any of the distributions discussed in Ahmad and Ahmad [1].

Concluding Remarks
In this article we have studied the beta exponential Pareto (BEP) distribution. This is a generalization of the exponential Pareto distribution. Several lifetime distributions such as the beta Weibull, beta exponential, beta Rayleigh, generalized Weibull, Weibull among others are embedded in the proposed distribution. Various mathematical properties of the BEP distribution are presented. We also discuss the parameter estimation methods and simulation issues. The importance and flexibility of the proposed model is illustrated by means of analyzing two sets of real life data.