The Zografos-Balakrishnan Odd Log-Logistic Generalized Half-Normal Distribution with Mathematical Properties and Simulations

In this paper, a new class of distributions called the Zografos-Balakrishnan odd log-logistic generalized half-normal (ZBOLL-GHN) family with four parameters is introduced and studied. Useful representations and some mathematical properties of the new family, including moments, the quantile function and the moment generating function, are investigated. The maximum likelihood equations for estimating the parameters based on real data are given. Different methods are used to estimate its parameters: maximum likelihood, least squares, weighted least squares, Cramér-von Mises, Anderson-Darling and right-tailed Anderson-Darling. We assess the performance of the maximum likelihood estimators in terms of biases and mean squared errors by means of a simulation study. Finally, the usefulness of the family and the fitting capability of this model are illustrated by means of two real data sets.


Introduction
The statistics literature is filled with hundreds of continuous univariate distributions; see Johnson et al. [19,20]. Recent developments have focused on defining new families by adding shape parameters to control skewness, kurtosis and tail weights, thus providing great flexibility in modeling skewed data in practice. These include the two-piece approach introduced by Hansen [18] and the generators pioneered by Eugene et al. [14], Cordeiro and de Castro [7], Alexander et al. [2] and Cordeiro et al. [9]. Many subsequent articles apply these techniques to induce skewness into well-known symmetric distributions such as the symmetric Student t. For a review, see Aas and Haff [1].
Note that the ZBOLL-G family is easily simulated by inverting (2) as follows: if $V$ has the gamma$(\beta, 1)$ distribution and $G^{-1}(V) = Q_G(V)$, where $Q_G$ is the baseline quantile function, then the solution of the nonlinear equation has density (3). The parameters α and β have a clear interpretation.
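The inversion scheme described above can be sketched in a few lines of Python. Since equation (2) is not reproduced in this text, the sketch assumes the usual construction in which the Zografos-Balakrishnan generator is applied to the odd log-logistic transform, so that the nonlinear equation reads $G(x)^{\alpha}/[G(x)^{\alpha} + \bar G(x)^{\alpha}] = 1 - e^{-V}$, and it assumes the generalized half-normal baseline $G(x) = 2\Phi((x/\theta)^{\lambda}) - 1$. The function names are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the assumed GHN baseline


def ghn_quantile_from_survival(s, theta, lam):
    """Return x with 1 - G(x) = s for the assumed GHN baseline
    G(x) = 2*Phi((x/theta)**lam) - 1, x > 0, where 0 < s <= 1."""
    # 1 - G(x) = 2*Phi(-z) with z = (x/theta)**lam, hence z = -Phi^{-1}(s/2).
    z = -_STD_NORMAL.inv_cdf(s / 2.0)
    return theta * max(z, 0.0) ** (1.0 / lam)


def zboll_ghn_from_gamma(v, alpha, theta, lam):
    """Map a gamma(beta, 1) draw v to x by solving the ASSUMED equation
    G(x)^alpha / (G(x)^alpha + (1 - G(x))^alpha) = 1 - exp(-v)."""
    p = -math.expm1(-v)            # 1 - exp(-v)
    a = p ** (1.0 / alpha)         # proportional to G(x)
    b = math.exp(-v / alpha)       # (1 - p)**(1/alpha), proportional to 1 - G(x)
    s = max(b / (a + b), 1e-300)   # baseline survival; floor guards against underflow
    return ghn_quantile_from_survival(s, theta, lam)


def rzboll_ghn(n, alpha, beta, theta, lam, rng=random):
    """Draw n variates: V ~ gamma(beta, 1), then invert as above."""
    return [zboll_ghn_from_gamma(rng.gammavariate(beta, 1.0), alpha, theta, lam)
            for _ in range(n)]
```

For example, `rzboll_ghn(5, alpha=0.5, beta=2.0, theta=1.0, lam=2.0, rng=random.Random(1))` returns five positive draws. Working with the baseline survival probability rather than $G(x)$ itself is a design choice that keeps the right tail numerically stable.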
Henceforth, a random variable X with density function (3) is denoted by X ∼ ZBOLL-G(α, β, τ). The ZBOLL-G family has the same parameters as the parent G plus the parameters α and β. For α = β = 1, it reduces to the baseline G distribution. For α = 1, we obtain the gamma-G family (Nadarajah [22]) and for β = 1, we have the odd log-logistic-G family (Gleaton and Lynch [16]). Gleaton and Lynch [16] introduced a new family of distributions called the generalized log-logistic-G (GLL, for short) family. The cdf and pdf of this family, for any baseline cdf G, are given by the following:

Statistical properties
In this section some statistical properties of the proposed distribution are investigated.

Moments
, then we can write as follows: where,

Proof
We start from the expression for $E(X^n)$ given by equation (16). By setting $u = (x/\theta)^{\lambda}$ and writing the cdf of the GHN distribution in terms of the error function, the nth moment of X can be obtained as follows: where, on inserting the power series for the error function, we obtain the following: Further, for the special case when $r + n/\lambda$ is even, the integral $I(n/\lambda, r)$ can be expressed in terms of the Lauricella function of type A, defined as follows: where $(a)_k = a(a+1)\cdots(a+k-1)$ is the ascending factorial (with the convention that $(a)_0 = 1$). Numerical routines for the direct computation of the Lauricella function of type A are available; see Exton [15] and Mathematica (Trott [27]). Hence, $E(X^n)$ can be expressed in terms of Lauricella functions of type A as follows:

Quantile function
The regularized gamma function is defined by The inverse regularized gamma function $Q^{-1}(\beta, u)$ admits a power series expansion given by the following: By using the Taylor expansion and the generalized binomial expansion, we obtain Then, the following is obtained from the last equation: Further, we can write as follows:

We use an equation by Gradshteyn and Ryzhik [17] for a power series raised to a positive integer j as follows: Here, for j ≥ 0, $c_{j,0} = a_0^j$, and the coefficients $c_{j,i}$ (for i = 1, 2, . . .) are determined from the recurrence equation So, the coefficient $c_{j,i}$ follows from $c_{j,0}, \ldots, c_{j,i-1}$ and then from $a_0, \ldots, a_i$. Based on equations (19) and (20), we can rewrite (18) as follows: where, for k ≥ 0, the coefficients $v_{k,i}$ (for i = 1, 2, . . .) are determined from the recurrence equation Then, the quantile function (qf) of X reduces to

In general, even when $Q_G(u)$ does not have a closed-form expression, this function can usually be expressed in terms of a power series where the coefficients $s_i$ are suitably chosen real numbers. For several important distributions, such as the normal, Student t, gamma and beta distributions, $Q_G(u)$ does not have a closed-form expression but it can be expanded as in equation (22). The qf of the GHN distribution can be obtained as follows: Using equation (22) and the generalized binomial expansion, we can write as follows: and is the falling factorial. Further, by using equations (19) and (20), we obtain the following: We can write the qf of the GHN distribution as the power series By combining equations (21) and (24) and using equations (19) and (20), we obtain the following: Hence, equation (25) reveals that the qf of the ZBOLL-GHN distribution can be expressed as a power series. For practical purposes, we can adopt ten terms in this power series.
Let W(·) be any integrable function on the positive real line. We can write as follows: Equations (25) and (26) are the main results of this section. From them, we can obtain various ZBOLL-GHN mathematical properties using integrals over (0, 1), which are usually simpler than those based on the left-hand integral.
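The repeated use of the Gradshteyn and Ryzhik result in this section can be illustrated numerically. The sketch below assumes the standard form of that result, namely $(\sum_{i \ge 0} a_i u^i)^j = \sum_{i \ge 0} c_{j,i} u^i$ with $c_{j,0} = a_0^j$ and $c_{j,i} = (i a_0)^{-1} \sum_{m=1}^{i} [m(j+1) - i]\, a_m c_{j,i-m}$; the displayed recurrence is missing from this text, so this form is an assumption, and the function name is hypothetical.

```python
def power_series_power(a, j, n_terms):
    """Return the first n_terms coefficients of (sum_i a[i] * u**i) ** j.

    Uses the (assumed) recurrence
        c[0] = a[0] ** j,
        c[i] = (1 / (i * a[0])) * sum_{m=1..i} (m*(j+1) - i) * a[m] * c[i-m],
    which requires a[0] != 0. Coefficients beyond len(a) are treated as zero.
    """
    if a[0] == 0:
        raise ValueError("the recurrence requires a[0] != 0")
    c = [a[0] ** j]
    for i in range(1, n_terms):
        total = 0.0
        for m in range(1, i + 1):
            a_m = a[m] if m < len(a) else 0.0
            total += (m * (j + 1) - i) * a_m * c[i - m]
        c.append(total / (i * a[0]))
    return c
```

As a quick check, `power_series_power([1, 1], 2, 4)` gives the coefficients of $(1 + u)^2$, that is 1, 2, 1, 0. Truncating at `n_terms = 10` corresponds to the ten-term rule of thumb mentioned in the text.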

Moment generating function
The moment generating function (mgf) M(t) of a random variable X provides the basis of an alternative route to analytical results compared with working directly with the pdf and cdf of X.
where v r,0 = e r 0 and for n ≥ 1, v r,

Proof
We can write from equations (19) and (20) Clearly, the kth moment of X follows from the last equation as µ

Maximum likelihood estimation
In this section, we determine the maximum likelihood estimates (MLEs) of the model parameters of the new family from complete samples only. Let $x_1, \ldots, x_n$ be observed values from the ZBOLL-GHN family with parameters α, β, λ and θ. Let Θ = (α, β, λ, θ)⊤ be the 4 × 1 parameter vector. The total log-likelihood function for Θ is given by The log-likelihood function can be maximized either directly by using SAS (PROC NLMIXED) or by solving the nonlinear likelihood equations obtained by differentiating (28). The components of the score function $U_n(\Theta) = (\partial\ell_n/\partial\alpha, \partial\ell_n/\partial\beta, \partial\ell_n/\partial\lambda, \partial\ell_n/\partial\theta)^{\top}$ are given by the following: and ψ(·) is the digamma function. The MLE $\hat\Theta$ of Θ is obtained by solving the nonlinear likelihood equations $U_{\alpha}(\Theta) = 0$, $U_{\beta}(\Theta) = 0$, $U_{\lambda}(\Theta) = 0$ and $U_{\theta}(\Theta) = 0$. These equations cannot be solved analytically, but statistical software can be used to solve them numerically. We can use iterative techniques such as a Newton-Raphson type algorithm to obtain the estimate $\hat\Theta$. We employ the numerical procedure NLMIXED in SAS.
For interval estimation of (α, β, λ, θ) and hypothesis tests on these parameters, we obtain the observed information matrix, since the expected information matrix is very complicated and requires numerical integration. The elements of the 4 × 4 observed information matrix J(Θ) are given in Appendix B.
Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, the asymptotic distribution of $(\hat\Theta - \Theta)$ is $N_4(0, I(\Theta)^{-1})$, where I(Θ) is the expected information matrix. The multivariate normal $N_4(0, J(\hat\Theta)^{-1})$ distribution, where I(Θ) is replaced by $J(\hat\Theta)$, i.e., the observed information matrix evaluated at $\hat\Theta$, can be used to construct approximate confidence intervals for the individual parameters.
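The approximate confidence intervals mentioned here take the familiar Wald form, estimate ± z × standard error, where the squared standard errors are the diagonal entries of the inverse observed information matrix. A minimal sketch follows, assuming those diagonal entries have already been computed elsewhere; the function name and the numbers in the usage note are hypothetical.

```python
import math
from statistics import NormalDist


def wald_intervals(estimates, variances, level=0.95):
    """Approximate confidence intervals estimate +/- z * sqrt(variance).

    `variances` is assumed to hold the diagonal of the inverse observed
    information matrix evaluated at the MLE, in the same order as `estimates`.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for 95%
    return [(est - z * math.sqrt(var), est + z * math.sqrt(var))
            for est, var in zip(estimates, variances)]
```

For instance, `wald_intervals([1.0], [0.04])` gives an interval of roughly (0.608, 1.392). For parameters restricted to the positive half-line, the lower limit can fall below zero in small samples, which is a known limitation of this normal approximation.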
We can compute the maximum values of the unrestricted and restricted log-likelihoods to obtain likelihood ratio (LR) statistics for testing some special models of the proposed family. Tests of hypotheses of the type $H_0: \psi = \psi_0$ versus $H_1: \psi \neq \psi_0$, where ψ is a subset of the parameters in Θ, can be performed through LR statistics in the usual way.
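As an illustration of the LR procedure, the statistic is $w = 2\{\ell(\hat\Theta) - \ell(\tilde\Theta)\}$, where $\tilde\Theta$ denotes the restricted estimate, and it is compared with a chi-square distribution whose degrees of freedom equal the number of restrictions. The sketch below covers only one and two restrictions, which is enough for testing α = 1, β = 1 or α = β = 1, because the chi-square tail has a closed form in those cases. The function name and the log-likelihood values in the usage note are hypothetical.

```python
import math


def lr_test(loglik_full, loglik_restricted, df):
    """Likelihood ratio statistic and its asymptotic chi-square p-value.

    Only df = 1 and df = 2 are handled, using the closed forms
    P(chi2_1 > w) = erfc(sqrt(w / 2)) and P(chi2_2 > w) = exp(-w / 2).
    """
    w = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(w / 2.0))
    elif df == 2:
        p_value = math.exp(-w / 2.0)
    else:
        raise NotImplementedError("only df = 1 or df = 2 in this sketch")
    return w, p_value
```

With made-up maximized log-likelihoods, `lr_test(-100.0, -101.92, df=1)` returns a statistic of about 3.84 and a p-value close to 0.05.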

The Maximum Likelihood Estimator
In this section, the maximum likelihood estimators of the parameters of density function (10) are assessed by simulation. The density function is shown in Figure 3, where one can see three different cases of the ZBOLL-GHN density function. To verify the validity of the maximum likelihood estimators, the bias and the mean squared error (MSE) of the MLEs are used. For example, for (θ, λ, α, β) = (1, 2, 0.5, 2), samples of size n = 20, 21, . . . , 110 have been simulated r = 1000 times. To maximize the likelihood numerically, the optim function (in the stats package) with the Nelder-Mead method in R has been used. If Θ = (θ, λ, α, β), then for any simulation with sample size n and i = 1, 2, . . . , r, the maximum likelihood estimates are obtained as $\hat\Theta_i = (\hat\theta_i, \hat\lambda_i, \hat\alpha_i, \hat\beta_i)$. To examine the performance of the MLEs for the ZBOLL-GHN distribution, we perform the following simulation study:
1. Generate r samples of size n from equation (10).
2. Compute the MLEs for the r samples, say $(\hat\theta_i, \hat\lambda_i, \hat\alpha_i, \hat\beta_i)$ for i = 1, 2, . . . , r.
3. Compute the standard errors of the MLEs for the r samples, say $(s_{\hat\theta_i}, s_{\hat\lambda_i}, s_{\hat\alpha_i}, s_{\hat\beta_i})$ for i = 1, 2, . . . , r.
4. Compute the biases and mean squared errors of the MLEs, given by
$$\mathrm{Bias}_{\theta}(n) = \frac{1}{r}\sum_{i=1}^{r}\left(\hat\theta_i - \theta\right), \qquad \mathrm{MSE}_{\theta}(n) = \frac{1}{r}\sum_{i=1}^{r}\left(\hat\theta_i - \theta\right)^{2},$$
written here for θ and defined analogously for λ, α and β.

We repeat these steps for r = 1000 and n = 5, 6, . . . , n* (n* differs from case to case) with different values of (α, β, λ, θ), thus computing $\mathrm{Bias}_{\theta}(n)$ and $\mathrm{MSE}_{\theta}(n)$. Figures 4 and 5 show how the four biases and the four mean squared errors, respectively, vary with n. As expected, the biases and MSEs of the estimated parameters converge to zero as n grows.
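The Monte Carlo loop behind these bias and MSE curves can be sketched generically. The code below is not the authors' R script: it takes the sampler and the estimator as arguments (both hypothetical callables), so that any data generator and any fitting routine can be plugged in, and it returns the averages of the estimation errors and of their squares over r replications.

```python
import random


def monte_carlo_bias_mse(draw_sample, estimate, true_value, n, r, seed=0):
    """Estimate bias and MSE of an estimator from r simulated samples of size n.

    draw_sample(n, rng) must return a list of n observations, and
    estimate(sample) must return a single number (one parameter at a time).
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(r):
        sample = draw_sample(n, rng)
        errors.append(estimate(sample) - true_value)
    bias = sum(errors) / r                   # average estimation error
    mse = sum(e * e for e in errors) / r     # average squared estimation error
    return bias, mse
```

As a stand-in that has nothing to do with the ZBOLL-GHN model, one can pass `lambda n, rng: [rng.expovariate(2.0) for _ in range(n)]` as the sampler and `lambda s: len(s) / sum(s)` as the estimator of the exponential rate, and observe that the MSE shrinks as n grows, which is the pattern the text reports for Figures 4 and 5.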

The other estimation methods
There are several approaches to estimating the parameters of a distribution, each with its own characteristic features and benefits. In this subsection, five of those methods are briefly introduced and are numerically investigated in the simulation study for (θ, λ, α, β) = (1, 2, 0.5, 2). A useful summary of these methods can be found in Dey et al. [13]. Here, {t_i; i = 1, 2, ..., n} are the associated order statistics and F is the distribution function of the ZBOLL-GHN model.

Least squares and weighted least squares estimators
The least squares estimators (LSE) and weighted least squares estimators (WLSE) were introduced by Swain et al. [26]. The LSEs and WLSEs are obtained by minimizing the following functions:

Cramér-von Mises estimators
The Cramér-von Mises estimators (CME) [5] are obtained by minimizing the following function:

Anderson-Darling and right-tailed Anderson-Darling
The Anderson-Darling estimators (ADE) and right-tailed Anderson-Darling estimators (RTADE) were introduced by Anderson and Darling [4]. The ADEs and RTADEs are obtained by minimizing the following functions:

To explore the estimators introduced above, we consider the model used in subsection 6.1 and investigate the MSE of those estimators for different sample sizes. For instance, for (θ, λ, α, β) = (1, 2, 0.5, 2), we have simulated r = 1000 samples for each sample size n = 50, 55, 60, . . . , 550 and then calculated the MSE using the formula given in subsection 6.1. To obtain the values of the estimators, we have used the optim function with the Nelder-Mead method in R. The results of these simulations are shown in Figure 6. As is clear from the MSE plots for two parameters, the MSEs of all methods approach zero as the sample size increases, which supports the validity of these estimation methods and of the numerical calculations for the parameters of the ZBOLL-GHN distribution.
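The five minimum-distance criteria of this section can be written compactly once the fitted cdf has been evaluated at the order statistics. The displayed objective functions are missing from this text, so the sketch below assumes the standard forms summarized in the literature on these estimators (for example, the weights $(n+1)^2(n+2)/[i(n-i+1)]$ for WLSE). Each function takes the list $u_i = F(t_i)$, sorted in the order of the $t_i$, so the code is independent of how F itself is computed. The function names are hypothetical.

```python
import math


def ls_objective(u):
    """Least squares: sum of (F(t_i) - i/(n+1))^2."""
    n = len(u)
    return sum((ui - i / (n + 1)) ** 2 for i, ui in enumerate(u, start=1))


def wls_objective(u):
    """Weighted least squares with weights (n+1)^2 (n+2) / (i (n-i+1))."""
    n = len(u)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) * (ui - i / (n + 1)) ** 2
               for i, ui in enumerate(u, start=1))


def cvm_objective(u):
    """Cramer-von Mises: 1/(12n) + sum of (F(t_i) - (2i-1)/(2n))^2."""
    n = len(u)
    return 1.0 / (12 * n) + sum((ui - (2 * i - 1) / (2 * n)) ** 2
                                for i, ui in enumerate(u, start=1))


def ad_objective(u):
    """Anderson-Darling: -n - (1/n) sum (2i-1)[log F(t_i) + log(1 - F(t_{n+1-i}))]."""
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


def rtad_objective(u):
    """Right-tailed AD: n/2 - 2 sum F(t_i) - (1/n) sum (2i-1) log(1 - F(t_{n+1-i}))."""
    n = len(u)
    s = sum((2 * i - 1) * math.log(1.0 - u[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(u) - s / n
```

In use, each objective would be wrapped in a function of (θ, λ, α, β) that evaluates the model cdf at the sorted data and is then passed to a derivative-free optimizer such as Nelder-Mead, mirroring the optim call described in the text.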

The survival times data
The second data set, analyzed by Kundu et al. [21] and Cordeiro et al. [8], corresponds to 72 survival times of guinea pigs injected with different doses of tubercle bacilli. The data set is the following:

Tables 1-4 summarize the information criteria and the estimated MLEs, respectively, for the two data sets under the different fitted models. Models are sorted from the lowest to the highest value of AIC. The ZBOLL-GHN model is selected as the best model by most criteria. Note that the p-value for the ZBOLL-GHN model is also higher than those of all other

Conclusions
In this paper, we introduce a new Zografos-Balakrishnan odd log-logistic generalized half-normal (ZBOLL-GHN) family of distributions. Some of its properties are derived. Different methods are used to estimate its parameters. The maximum likelihood estimators are assessed with data simulated from the proposed model. We conclude from the bias and MSE plots that, as the sample size increases, the biases and MSEs of all methods approach zero, which supports the validity of these estimation methods. The ZBOLL-GHN distribution is applied to two real data sets, and its flexibility is assessed by comparing it with other distributions. These applications show that it can fit skewed (left or right) and heavy-tailed data. The tables and figures illustrate that the new model provides consistently better fits than the competing models for these data sets.

Appendix A: Three useful power series
We present three power series required for the algebraic developments in Sections 3 and 4.2. First, for b > 0 real non-integer and −1 < u < 1, we have the binomial expansion where the binomial coefficient is defined for any real b. Second, expanding $z^{\lambda}$ in a Taylor series, we can write and $(\lambda)_k = \lambda(\lambda - 1)\cdots(\lambda - k + 1)$ denotes the descending factorial.
Third, we obtain an expansion for $[G(x)^a + \bar{G}(x)^a]^c$. We can write from equations (29) and (30) and $a_j(a)$ is defined by (30). Then, using (30), we have Finally, using again equations (31) and (32), we have

Appendix B
The elements of the observed information matrix J(θ) for the parameters (α, β, τ ) are given by the following:
