The Topp-Leone odd log-logistic Gumbel Distribution: Properties and Applications

In this article, the Topp-Leone odd log-logistic Gumbel (TLOLL-Gumbel) family of distributions is studied. This family contains very flexible skewed density functions. We study many aspects of the new model, such as the hazard rate function, asymptotics, useful expansions, moments, generating function, Rényi entropy and order statistics. We discuss maximum likelihood estimation of the model parameters. Further, the flexibility of the proposed family is illustrated by means of three real data sets.


Introduction
In many real-world examples, the classical distributions do not provide a good fit to real data. Several continuous univariate distributions have been extensively used in the literature for modeling data in many areas such as economics, engineering, biological studies and environmental sciences. However, applied areas such as finance, lifetime analysis and insurance clearly require extended forms of these distributions. So, several classes of distributions have been constructed by extending common families of continuous distributions. These generalized distributions provide more flexibility by adding one (or more) parameters to a baseline model. The main aim of the paper is to propose a new family of distributions, built on the Topp and Leone distribution, that can have a bathtub-shaped hazard rate and can be used for lifetime modeling.
Recently, several properties of the Topp and Leone distribution have been studied by several authors. We mention some of them: moments, Nadarajah and Kotz [20]; reliability measures and stochastic orderings, Ghitany et al. [9]; distributions of sums, products and ratios, Zhou et al. [31]; behavior of kurtosis, Kotz and Seier [17]; record values, Zghoul [30]; moments of order statistics, Genc [7]; stress-strength modeling, Genc [8]; and Bayesian estimation under trimmed samples, Sindhu [27].
The Topp and Leone distribution is a one-parameter distribution with the cumulative distribution function (cdf) specified by
\[ F_{TL}(x) = x^{b} (2 - x)^{b}, \]
where 0 < x < 1 and b > 0. Having only one parameter and its domain restricted to (0, 1), the Topp and Leone distribution is not very flexible. Consider starting from a parent continuous distribution function G(x). A natural way of generating families of distributions on some other support from a simple starting parent distribution with pdf g(x) = dG(x)/dx is to compose a distribution on (0, 1) with G(x). Here the Topp and Leone cdf is composed with the odd log-logistic generator, whose cdf is
\[ F_{OLL}(x) = \frac{G(x)^{a}}{G(x)^{a} + \bar{G}(x)^{a}}, \]
where a > 0, G(x) is a baseline cdf and Ḡ(x) = 1 − G(x).
From an arbitrary parent cdf G(x), the cdf F(x) of the TLOLL-G family is defined by
\[ F(x) = \left\{ 1 - \left[ \frac{\bar{G}(x;\xi)^{a}}{G(x;\xi)^{a} + \bar{G}(x;\xi)^{a}} \right]^{2} \right\}^{b}, \tag{1} \]
where a > 0 and b > 0 are shape parameters and ξ is the vector of parameters of the baseline G distribution. The probability density function (pdf) corresponding to (1) is given by
\[ f(x) = \frac{2ab\, g(x;\xi)\, G(x;\xi)^{a-1}\, \bar{G}(x;\xi)^{2a-1}}{\left[ G(x;\xi)^{a} + \bar{G}(x;\xi)^{a} \right]^{3}} \left\{ 1 - \left[ \frac{\bar{G}(x;\xi)^{a}}{G(x;\xi)^{a} + \bar{G}(x;\xi)^{a}} \right]^{2} \right\}^{b-1}. \tag{2} \]
We denote by X ∼ TLOLL-G(a, b, ξ) a random variable having density function (2). The TLOLL-Gumbel model is obtained by inserting the Gumbel distribution as the parent model in the Topp-Leone odd log-logistic family. Furthermore, the basic motivations for using the TLOLL-Gumbel family in practice are the following:
• to produce distributions with flexible skewness and kurtosis;
• to construct heavy-tailed distributions for modeling real data;
• to generate distributions with symmetric, left-skewed, right-skewed or reversed-J shape;
• to provide consistently better fits than other generated models under the same underlying distribution.
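Simulating a TLOLL-G random variable is straightforward. If U has a uniform distribution on the unit interval (0, 1), the solution of the equation
\[ x = G^{-1}\left( \frac{\left[ 1 - (1 - U^{1/b})^{1/2} \right]^{1/a}}{\left[ 1 - (1 - U^{1/b})^{1/2} \right]^{1/a} + (1 - U^{1/b})^{1/(2a)}} \right) \]
has the density function (2).
The Gumbel distribution is also known as the extreme value distribution of type I. Some of its recent application areas in engineering include flood frequency analysis, network engineering, nuclear engineering, offshore engineering, risk-based engineering, space engineering, software reliability engineering, structural engineering, and wind engineering. In this section we introduce the new distribution. The Gumbel distribution has pdf
\[ g(x) = \frac{1}{\sigma} \exp\left( -z - e^{-z} \right), \qquad z = \frac{x - \mu}{\sigma}, \]
and its cdf is
\[ G(x) = \exp\left( -e^{-z} \right), \]
where −∞ < x < ∞, µ is a location parameter and σ > 0 is a scale parameter. The cdf of the TLOLL-Gumbel distribution is
\[ F(x) = \left\{ 1 - \left[ \frac{\bar{G}(x)^{a}}{G(x)^{a} + \bar{G}(x)^{a}} \right]^{2} \right\}^{b}, \qquad G(x) = \exp\left( -e^{-z} \right), \quad \bar{G}(x) = 1 - G(x), \]
where a > 0 and b > 0. The pdf of the TLOLL-Gumbel distribution is
\[ f(x) = \frac{2ab}{\sigma}\, \frac{\exp\left( -z - a e^{-z} \right) \bar{G}(x)^{2a-1}}{\left[ G(x)^{a} + \bar{G}(x)^{a} \right]^{3}} \left\{ 1 - \left[ \frac{\bar{G}(x)^{a}}{G(x)^{a} + \bar{G}(x)^{a}} \right]^{2} \right\}^{b-1}. \]
A special case of the proposed family is the following. Taking a = 1 in the TLOLL-Gumbel family, we obtain the TL-Gumbel distribution with pdf
\[ f(x) = \frac{2b}{\sigma} \exp\left( -z - e^{-z} \right) \bar{G}(x) \left\{ 1 - \bar{G}(x)^{2} \right\}^{b-1}. \]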
Some shapes of the TLOLL-Gumbel pdf are displayed in Figure 1, which plots the pdf for selected parameter values. The density takes different skewed shapes, including mildly and highly skewed ones (positive and negative). The TLOLL-Gumbel family thus contains very flexible skewed density functions (unimodal and bimodal) that are useful in fitting real data sets (see Section 6).
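The density, distribution function and inverse-transform sampler described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the TLOLL-G form F(x) = {1 − [Ḡ^a/(G^a + Ḡ^a)]^2}^b with the Gumbel baseline G(x) = exp(−e^{−z}), z = (x − µ)/σ, and all function names are hypothetical. The terms are evaluated on the log scale so that the tails do not overflow or divide by zero.

```python
import math
import random


def _log_add(u, v):
    """Return log(exp(u) + exp(v)) without overflow."""
    m = max(u, v)
    return m + math.log(math.exp(u - m) + math.exp(v - m))


def _log_terms(z, a):
    """Logs of G, 1 - G, G^a + (1 - G)^a and 1 - w^2 for the Gumbel baseline."""
    e = math.exp(-z)                       # G = exp(-e)
    log_G = -e
    log_Gbar = math.log(-math.expm1(-e))   # log(1 - G), stable when e is small
    log_den = _log_add(a * log_G, a * log_Gbar)
    # 1 - w^2 = G^a (G^a + 2 Gbar^a) / (G^a + Gbar^a)^2, with w = Gbar^a / (G^a + Gbar^a)
    log_core = (a * log_G + _log_add(a * log_G, math.log(2.0) + a * log_Gbar)
                - 2.0 * log_den)
    return log_G, log_Gbar, log_den, log_core


def tloll_gumbel_cdf(x, a, b, mu, sigma):
    z = (x - mu) / sigma
    if z < -700.0:                         # exp(-z) would overflow; F is 0 here
        return 0.0
    if z > 700.0:
        return 1.0
    log_core = _log_terms(z, a)[3]
    return min(1.0, math.exp(b * log_core))


def tloll_gumbel_pdf(x, a, b, mu, sigma):
    z = (x - mu) / sigma
    if abs(z) > 700.0:                     # density is numerically zero this far out
        return 0.0
    log_G, log_Gbar, log_den, log_core = _log_terms(z, a)
    log_g = -z - math.exp(-z) - math.log(sigma)   # log of the Gumbel pdf
    log_f = (math.log(2.0 * a * b) + log_g + (a - 1.0) * log_G
             + (2.0 * a - 1.0) * log_Gbar - 3.0 * log_den + (b - 1.0) * log_core)
    return math.exp(log_f)


def tloll_gumbel_quantile(u, a, b, mu, sigma):
    """Solve F(x) = u for 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    t = math.log(u) / b                    # t = log(1 - w^2)
    w = math.sqrt(-math.expm1(t))
    log_ratio = (math.log(w) + math.log1p(w) - t) / a   # log(Gbar / G)
    if log_ratio < -30.0:
        log_neg_log_G = log_ratio          # log(log1p(e^r)) is close to r for very negative r
    else:
        log_neg_log_G = math.log(max(log_ratio, 0.0)
                                 + math.log1p(math.exp(-abs(log_ratio))))
    return mu - sigma * log_neg_log_G      # x = mu - sigma * log(-log G)


def tloll_gumbel_sample(n, a, b, mu, sigma, seed=None):
    """Draw n values by inverse-transform sampling."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:                        # random() can return exactly 0.0
            draws.append(tloll_gumbel_quantile(u, a, b, mu, sigma))
    return draws
```

For example, `tloll_gumbel_sample(100, 2.0, 0.5, 2.0, 1.0, seed=1)` draws a sample at the parameter values used later in the simulation study. Extreme shape parameters may still need extra numerical safeguards.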

Some mathematical properties
In this section, we obtain some statistical properties such as asymptotics, useful expansions, moments, generating function, Rényi entropy and order statistics, following Brito et al. [5] and Andrade et al. [1]. In addition, plots of the skewness, kurtosis and entropy are presented.

Corollary 1
The asymptotics of F(x), f(x) and τ(x) as x → −∞ are given by

Corollary 2
The asymptotics of F(x), f(x) and τ(x) as x → ∞ are given by

Useful expansions
First, the generalized binomial expansion (5) holds for −1 < u < 1 and any real non-integer a > 0. Second, using the generalized binomial expansion (5), we can rewrite (4). Third, we can expand G(x)^{ja} = exp(−ja e^{−z}) in a series. Fourth, we can write a further expansion whose coefficients satisfy c_{0,j} = a_{0,j}/b_{0,j} and, for k ≥ 1, a recursive relation. Then, the cdf of the TLOLL-Gumbel distribution can be expressed as the series (7), written in terms of Π_k(x) = exp(−k e^{−z}). The corresponding TLOLL-Gumbel density function (8) follows by differentiating equation (7), where π_{k+1}(x) denotes the density function associated with Π_{k+1}(x).
Table 1 compares the power series expansion in equation (8) with the density f(x) for several parameter values and several values of x. As shown in Table 1, the accuracy of the series approximation to the density increases with the value of the parameter b. This is because more terms are included in the combination (6) and, as a result, accuracy increases.

Moments
It is hardly necessary to emphasize the importance of calculating the moments of a random variable in statistical analysis, particularly in applied work. Some key features of a distribution, such as skewness and kurtosis, can be studied through its moments. The nth moment of X can be determined from (7). Using the binomial expansion for (µ − σ log(z))^n, E(X^n) can be expressed as in (9). A result by Nadarajah (2006) gives (10). By combining (9) and (10), the nth moment of X is obtained. The measures of skewness and kurtosis of the TLOLL-Gumbel distribution can be obtained as
\[ \text{Skewness}(X) = \frac{E(X^{3}) - 3E(X)E(X^{2}) + 2E^{3}(X)}{\mathrm{Var}^{3/2}(X)} \]
and
\[ \text{Kurtosis}(X) = \frac{E(X^{4}) - 4E(X)E(X^{3}) + 6E^{2}(X)E(X^{2}) - 3E^{4}(X)}{\mathrm{Var}^{2}(X)}, \]
respectively. Plots of the skewness and kurtosis of the TLOLL-Gumbel distribution are displayed in Figure 2. According to this figure, skewness and kurtosis decrease as σ increases, while they do not change as µ increases. With respect to a and b, no fixed pattern emerges.
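As a numerical cross-check on series-based moment formulas, the raw moments, skewness and kurtosis can also be approximated by direct quadrature of the density. The sketch below is an illustration under stated assumptions, not the paper's method: it assumes the TLOLL-G density f = 2ab g G^{a−1} Ḡ^{2a−1} [G^a + Ḡ^a]^{−3} {1 − [Ḡ^a/(G^a + Ḡ^a)]^2}^{b−1} with a Gumbel baseline, uses a plain composite Simpson rule on a truncated range, and all function names and the default range and grid size are hypothetical choices.

```python
import math


def _log_add(u, v):
    """Return log(exp(u) + exp(v)) without overflow."""
    m = max(u, v)
    return m + math.log(math.exp(u - m) + math.exp(v - m))


def tloll_gumbel_pdf(x, a, b, mu, sigma):
    """Assumed TLOLL-Gumbel density, evaluated on the log scale for stability."""
    z = (x - mu) / sigma
    if abs(z) > 700.0:
        return 0.0
    e = math.exp(-z)                        # Gumbel cdf is G = exp(-e)
    log_G = -e
    log_Gbar = math.log(-math.expm1(-e))
    log_den = _log_add(a * log_G, a * log_Gbar)
    log_core = (a * log_G + _log_add(a * log_G, math.log(2.0) + a * log_Gbar)
                - 2.0 * log_den)            # log(1 - w^2)
    log_g = -z - e - math.log(sigma)
    return math.exp(math.log(2.0 * a * b) + log_g + (a - 1.0) * log_G
                    + (2.0 * a - 1.0) * log_Gbar - 3.0 * log_den
                    + (b - 1.0) * log_core)


def simpson(func, lo, hi, n=8000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0


def raw_moment(order, a, b, mu, sigma, half_width=60.0):
    """Approximate E(X^order) on [mu - half_width*sigma, mu + half_width*sigma]."""
    lo, hi = mu - half_width * sigma, mu + half_width * sigma
    return simpson(lambda x: x ** order * tloll_gumbel_pdf(x, a, b, mu, sigma), lo, hi)


def skewness_kurtosis(a, b, mu, sigma):
    """Skewness and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(k, a, b, mu, sigma) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return skew, kurt
```

Calling `skewness_kurtosis(2.0, 0.5, 2.0, 1.0)` returns the pair for one parameter choice. The truncation range is adequate for moderate shape parameters; very small values of a give heavier right tails and may need a wider range.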

Generating Function
The moment generating function (mgf) of X is M(t) = E(e^{tX}). Its explicit expression is obtained using a result by Cordeiro et al. [6].

Rényi entropy
The Rényi entropy of a random variable X represents a measure of variation of the uncertainty. The Rényi entropy is defined by
\[ I_{R}(\gamma) = \frac{1}{1 - \gamma} \log \int_{-\infty}^{\infty} f(x)^{\gamma}\, dx, \qquad \gamma > 0, \ \gamma \neq 1. \]
For the TLOLL-Gumbel distribution, the integral is reduced by setting u = exp(−z). In Figure 3 one can see some curves of the entropy function of the TLOLL-Gumbel distribution for some parameter values.
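The entropy curves can be reproduced approximately by integrating f(x)^γ numerically. The following sketch assumes the standard definition I_R(γ) = (1 − γ)^{−1} log ∫ f(x)^γ dx and the TLOLL-G density with a Gumbel baseline. The symbol γ, the function names and the integration range are assumptions made for illustration.

```python
import math


def _log_add(u, v):
    """Return log(exp(u) + exp(v)) without overflow."""
    m = max(u, v)
    return m + math.log(math.exp(u - m) + math.exp(v - m))


def tloll_gumbel_logpdf(x, a, b, mu, sigma):
    """Log of the assumed TLOLL-Gumbel density."""
    z = (x - mu) / sigma
    if abs(z) > 700.0:
        return -math.inf
    e = math.exp(-z)                        # Gumbel cdf is G = exp(-e)
    log_G = -e
    log_Gbar = math.log(-math.expm1(-e))
    log_den = _log_add(a * log_G, a * log_Gbar)
    log_core = (a * log_G + _log_add(a * log_G, math.log(2.0) + a * log_Gbar)
                - 2.0 * log_den)            # log(1 - w^2)
    log_g = -z - e - math.log(sigma)
    return (math.log(2.0 * a * b) + log_g + (a - 1.0) * log_G
            + (2.0 * a - 1.0) * log_Gbar - 3.0 * log_den + (b - 1.0) * log_core)


def simpson(func, lo, hi, n=8000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0


def renyi_entropy(gamma, a, b, mu, sigma, half_width=60.0):
    """Approximate Renyi entropy of order gamma (gamma > 0, gamma != 1)."""
    if gamma <= 0.0 or gamma == 1.0:
        raise ValueError("gamma must be positive and different from 1")
    lo, hi = mu - half_width * sigma, mu + half_width * sigma
    # f(x)^gamma is computed as exp(gamma * log f(x)) to avoid underflow in f itself
    integral = simpson(
        lambda x: math.exp(gamma * tloll_gumbel_logpdf(x, a, b, mu, sigma)), lo, hi)
    return math.log(integral) / (1.0 - gamma)
```

Because µ and σ enter the density only through z = (x − µ)/σ, the entropy computed this way does not depend on µ and increases by log c when σ is multiplied by c, which gives a convenient sanity check on the quadrature.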

Order statistics
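Order statistics make their appearance in many areas of statistical theory and practice. Suppose X_1, ..., X_n is a random sample from any TLOLL-Gumbel distribution. Let X_{i:n} denote the ith order statistic. The pdf of X_{i:n} can be expressed as
\[ f_{i:n}(x) = \frac{1}{B(i, n-i+1)} \sum_{j=0}^{n-i} (-1)^{j} \binom{n-i}{j} f(x) F(x)^{j+i-1}. \]
We use the result 0.314 of Gradshteyn and Ryzhik [11] for a power series raised to a positive integer n (for n ≥ 1),
\[ \left( \sum_{i=0}^{\infty} a_i u^{i} \right)^{n} = \sum_{i=0}^{\infty} c_{n,i} u^{i}, \]
where the coefficients c_{n,i} (for i = 1, 2, ...) are determined from the recurrence equation (with c_{n,0} = a_0^n)
\[ c_{n,i} = (i a_0)^{-1} \sum_{m=1}^{i} \left[ m(n+1) - i \right] a_m c_{n,i-m}. \]
We can demonstrate that the density function of the ith order statistic of any TLOLL-Gumbel distribution can be expressed in series form, where c_r is given by (7) and the quantities f_{j+i−1,k} can be determined given that f_{j+i−1,0} = b_0^{j+i−1} and, recursively,
\[ f_{j+i-1,k} = (k b_0)^{-1} \sum_{m=1}^{k} \left[ m(j+i) - k \right] b_m f_{j+i-1,k-m}. \]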
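The recurrence for a power series raised to a positive integer power is easy to implement. The sketch below assumes the usual statement of the result, c_{n,0} = a_0^n and c_{n,i} = (i a_0)^{-1} Σ_{m=1}^{i} [m(n + 1) − i] a_m c_{n,i−m}; the function name and the truncation argument `terms` are hypothetical.

```python
def power_series_power(a, n, terms):
    """Coefficients c_{n,0}, ..., c_{n,terms-1} of (sum_i a[i] u^i)^n.

    `a` holds the leading coefficients a_0, a_1, ...; coefficients beyond
    len(a) are treated as zero. Requires a[0] != 0 and integer n >= 1.
    """
    if a[0] == 0:
        raise ValueError("the recurrence requires a_0 != 0")
    c = [float(a[0]) ** n]                   # c_{n,0} = a_0^n
    for i in range(1, terms):
        s = 0.0
        for m in range(1, i + 1):
            a_m = a[m] if m < len(a) else 0.0
            s += (m * (n + 1) - i) * a_m * c[i - m]
        c.append(s / (i * a[0]))
    return c
```

As a quick check, `power_series_power([1, 1], 3, 5)` gives the binomial coefficients of (1 + u)^3 followed by a zero, and `power_series_power([1, 2, 3], 2, 5)` reproduces the expansion of (1 + 2u + 3u^2)^2. The same routine applied with n = j + i − 1 yields the quantities f_{j+i−1,k} from the coefficients b_m.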

Inference
Several approaches for parameter estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy desirable properties and can be used for constructing confidence intervals and test statistics. The normal approximation for these estimators in large sample theory is easily handled either analytically or numerically. So, we consider the estimation of the unknown parameters from complete samples only by maximum likelihood. Let Θ be the (p × 1) parameter vector and Θ̂ its MLE. Under standard regularity conditions, when n → ∞, the distribution of Θ̂ − Θ can be approximated by a multivariate normal N_p(0, J(Θ̂)^{-1}) distribution, which is used to construct approximate confidence intervals for the parameters. Here, J(Θ̂) is the total observed information matrix evaluated at Θ̂.

Likelihood of the TLOLL-Gumbel family
Let x_1, ..., x_n be the observed values from the TLOLL-Gumbel distribution with parameters a, b, µ and σ. Let Θ = (a, b, µ, σ)⊤ be the 4 × 1 parameter vector. Then, the log-likelihood function for the vector of parameters Θ, say L(Θ), is given by
\[ L(\Theta) = n \log(2ab) + \sum_{i=1}^{n} \log g(x_i) + (a-1) \sum_{i=1}^{n} \log G(x_i) + (2a-1) \sum_{i=1}^{n} \log \bar{G}(x_i) - 3 \sum_{i=1}^{n} \log\left[ G(x_i)^{a} + \bar{G}(x_i)^{a} \right] + (b-1) \sum_{i=1}^{n} \log\left\{ 1 - \left[ \frac{\bar{G}(x_i)^{a}}{G(x_i)^{a} + \bar{G}(x_i)^{a}} \right]^{2} \right\}, \tag{11} \]
where, to simplify, g(x) = σ^{-1} exp(−z − e^{−z}) and G(x) = exp(−e^{−z}) denote the Gumbel density and distribution functions, with z = (x − µ)/σ and Ḡ(x) = 1 − G(x). The log-likelihood function (11) can be maximized either directly, by using SAS (PROC NLMIXED) or the Ox program (sub-routine MaxBFGS), or by solving the nonlinear likelihood equations obtained by differentiating (11) with respect to each parameter, which gives the score components U_a, U_b, U_µ and U_σ. Setting U_a = U_b = U_µ = U_σ = 0 and solving these equations simultaneously yields the MLE Θ̂ = (â, b̂, µ̂, σ̂)⊤ of Θ = (a, b, µ, σ)⊤. These equations cannot be solved analytically, and statistical software can be used to solve them numerically using iterative methods such as Newton-Raphson type algorithms. For interval estimation of (a, b, µ, σ) and hypothesis tests on these parameters, we obtain the observed information matrix, since the expected information matrix is very complicated and requires numerical integration. The 4 × 4 observed information matrix is
\[ J(\Theta) = -\frac{\partial^{2} L(\Theta)}{\partial \Theta\, \partial \Theta^{\top}}, \]
whose elements are given in Appendix B. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, the asymptotic distribution of (Θ̂ − Θ) is N_4(0, I(Θ)^{-1}), where I(Θ) is the expected information matrix. The multivariate normal N_4(0, J(Θ̂)^{-1}) distribution, where I(Θ) is replaced by J(Θ̂), i.e., the observed information matrix evaluated at Θ̂, can be used to construct approximate confidence intervals for the individual parameters.
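For readers who prefer a general-purpose optimizer to SAS or Ox, the log-likelihood can be coded directly. The sketch below is a minimal illustration rather than the authors' implementation: it assumes the TLOLL-G density with a Gumbel baseline, evaluates every term on the log scale, and returns minus infinity outside the parameter space so that a maximizer rejects such points. The function names are hypothetical.

```python
import math


def _log_add(u, v):
    """Return log(exp(u) + exp(v)) without overflow."""
    m = max(u, v)
    return m + math.log(math.exp(u - m) + math.exp(v - m))


def tloll_gumbel_logpdf(x, a, b, mu, sigma):
    """Log of the assumed TLOLL-Gumbel density at a single point x."""
    z = (x - mu) / sigma
    if abs(z) > 700.0:                      # density is numerically zero this far out
        return -math.inf
    e = math.exp(-z)                        # Gumbel cdf is G = exp(-e)
    log_G = -e
    log_Gbar = math.log(-math.expm1(-e))    # log(1 - G)
    log_den = _log_add(a * log_G, a * log_Gbar)
    # log{1 - [Gbar^a / (G^a + Gbar^a)]^2}, written without cancellation
    log_core = (a * log_G + _log_add(a * log_G, math.log(2.0) + a * log_Gbar)
                - 2.0 * log_den)
    log_g = -z - e - math.log(sigma)        # log of the Gumbel pdf
    return (math.log(2.0 * a * b) + log_g + (a - 1.0) * log_G
            + (2.0 * a - 1.0) * log_Gbar - 3.0 * log_den + (b - 1.0) * log_core)


def log_likelihood(theta, data):
    """Log-likelihood of theta = (a, b, mu, sigma) for the observations in data."""
    a, b, mu, sigma = theta
    if a <= 0.0 or b <= 0.0 or sigma <= 0.0:
        return -math.inf                    # outside the parameter space
    return sum(tloll_gumbel_logpdf(x, a, b, mu, sigma) for x in data)
```

The negative of `log_likelihood` can be passed to any derivative-free or quasi-Newton minimizer. Standard errors would then come from a numerical Hessian at the optimum, in line with the observed-information approach described above.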

Simulation study
In this section, the performance of the maximum likelihood estimators of the parameters of the proposed density is assessed by simulation, with (a, b, µ, σ) = (2, 0.5, 2, 1). Similar to Karamikabir et al. [15], we perform the following simulation study: we generate samples of size n from the TLOLL-Gumbel distribution, compute the MLEs for each sample, and compute the bias and mean squared error (MSE) of each estimator, for θ = a, b, µ, σ.
We repeat these steps r = 1000 times for each n = 15, 16, ..., 90 and compute Bias_θ(n) and MSE_θ(n). Figure 4 shows how the four biases and mean squared errors vary with respect to n for (a, b, µ, σ) = (2, 0.5, 2, 1). As expected, the biases and mean squared errors of the estimates decrease toward zero as n increases (in particular for n > 50). The reported observations are for only one choice of (a, b, µ, σ), but the results are similar for a wide range of other choices.
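The Monte Carlo loop behind such bias and MSE curves can be sketched generically. The code below assumes the usual definitions, Bias = r^{-1} Σ (θ̂_i − θ) and MSE = r^{-1} Σ (θ̂_i − θ)^2 over r replications. The function names are hypothetical, and the demonstration uses a simple moment estimator of the Gumbel location parameter as a stand-in, not the TLOLL-Gumbel MLE.

```python
import math
import random


def monte_carlo_bias_mse(simulate, estimate, true_value, n, r=1000, seed=0):
    """Bias and MSE of `estimate` over r simulated samples of size n.

    simulate(n, rng) must return a sample; estimate(sample) must return a number.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(r):
        sample = simulate(n, rng)
        errors.append(estimate(sample) - true_value)
    bias = sum(errors) / r
    mse = sum(err * err for err in errors) / r
    return bias, mse


EULER_GAMMA = 0.5772156649015329     # Euler-Mascheroni constant


def simulate_gumbel(n, rng, mu=2.0, sigma=1.0):
    """Gumbel(mu, sigma) draws by inversion: x = mu - sigma * log(-log U)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:                   # random() can return exactly 0.0
            out.append(mu - sigma * math.log(-math.log(u)))
    return out


def moment_estimate_mu(sample, sigma=1.0):
    """Stand-in estimator: the Gumbel mean is mu + EULER_GAMMA * sigma."""
    return sum(sample) / len(sample) - EULER_GAMMA * sigma
```

Calling `monte_carlo_bias_mse(simulate_gumbel, moment_estimate_mu, 2.0, n=50, r=200)` returns one (bias, MSE) pair. Looping over n and replacing the stand-in estimator with a numerical MLE of (a, b, µ, σ) would trace out curves of the kind shown in Figure 4.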

Empirical illustration
In this section we illustrate the flexibility of the TLOLL-Gumbel model using three data sets.

The Survival times data
This sub-section concerns the 72 survival times of guinea pigs injected with different doses of tubercle bacilli, studied by Kundu et al. [18] and Leiva et al. [19]. It is known that guinea pigs have high susceptibility to human tuberculosis, which is why they were used in this study. Here, we are primarily concerned with the animals in the same cage that were under the same regimen. The regimen number is the common logarithm of the dose administered. Tables 2 and 3 report, respectively, a summary of the fitted information criteria and the MLEs for these data under different models. Models are sorted from the lowest to the highest value of AIC. The TLOLL-Gumbel is selected as the best model by all criteria. The histogram of the survival times data and the plots of the fitted pdfs are displayed in Figure 5.

The strengths of 1.5 cm glass fibers data
This sub-section concerns 63 strengths of 1.5 cm glass fibers, measured at the National Physical Laboratory, England. The data are obtained from Smith and Naylor [28] and were also analysed by Barreto-Souza et al. [3]. The sample consists of experimental data on the strength of glass fibres of two lengths. Similar to the previous sub-section, Tables 4 and 5 report, respectively, a summary of the fitted information criteria and the MLEs for these data under different models. The TLOLL-Gumbel is selected as the best model by all criteria. The histogram of the strengths of 1.5 cm glass fibers data and the plots of the fitted pdfs are displayed in Figure 8.

The logarithm of the ratio of received light data
This sub-section concerns the logarithm of the ratio of received light (logratio) in the LIDAR (light detection and ranging) data set presented by Ruppert et al. [26], which includes 221 observations. The LIDAR technique uses the reflection of laser-emitted light to detect chemical compounds in the atmosphere. It has proven to be an efficient tool for monitoring the distribution of several atmospheric pollutants of importance. Similar to the previous sub-section, Tables 6 and 7 report, respectively, a summary of the fitted information criteria and the MLEs for these data under different models. The TLOLL-Gumbel is selected as the best model by all criteria. The histogram of the logarithm of the ratio of received light data and the plots of the fitted pdfs are displayed in Figure 7.

Conclusions
In this article, we introduce the Topp-Leone odd log-logistic Gumbel distribution. Several of its properties, including explicit expansions, moments, entropy, order statistics and maximum likelihood estimation, are provided. Three applications to real data demonstrate the usefulness of the TLOLL-Gumbel family. These applications show that the TLOLL-Gumbel distribution is able to fit skewed (left or right) and heavy-tailed data owing to its flexibility. We can conclude that the TLOLL-Gumbel distribution is a very suitable model for the data considered, and it was always among the best models. The tables and figures illustrate the value of the new distribution for the analysis of real data. Therefore, we suggest the TLOLL-Gumbel distribution for use in real applications.

Acknowledgement
The authors wish to thank the Editor for the thorough reading.
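
Appendix A: Two useful power series
We present two power series required for the algebraic developments in Section 2. First, expanding z^λ in Taylor series about z = 1, we can write
\[ z^{\lambda} = \sum_{k=0}^{\infty} \frac{(\lambda)_k}{k!} (z - 1)^{k} = \sum_{i=0}^{\infty} f_i z^{i}, \tag{12} \]
where
\[ f_i = f_i(\lambda) = \sum_{k=i}^{\infty} \frac{(-1)^{k-i}}{k!} \binom{k}{i} (\lambda)_k \]
and (λ)_k = λ(λ − 1)...(λ − k + 1) denotes the descending factorial. Second, we obtain an expansion for [G(x)^a + Ḡ(x)^a]^c. We first expand G(x)^a + Ḡ(x)^a as a power series in G(x); then, using (12), we expand its cth power; finally, we obtain a power series in G(x) for [G(x)^a + Ḡ(x)^a]^c.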