A Study on A New Type 1 Half-Logistic Family of Distributions and Its Applications

In this paper, we propose a new class of continuous distributions with two extra shape parameters, called the new type I half-logistic-G family of distributions. Some of its important properties, including ordinary moments, quantiles, moment generating function and mean deviations, are obtained. The model parameters are estimated by the maximum likelihood method, and the performance of the estimators is assessed by means of a Monte Carlo simulation study. A new location-scale regression model based on the new type I half-logistic-Weibull distribution is then introduced. Applications of the proposed family are demonstrated in fields such as survival analysis and univariate data fitting. Empirical results show that the proposed models provide better fits than other well-known classes of distributions for the data sets considered.


Introduction
The limitations of the classical statistical distributions in data modelling have led statistical researchers to introduce new, more flexible distributions. These new distributions are usually built from the classical ones and offer greater flexibility than their baselines. Important examples that are flexible enough to fit a wide range of data sets include the Marshall-Olkin generated (MO-G) family by Marshall and Olkin [16], odd log-logistic-G by Gleaton and Lynch [12], Kumaraswamy-G (Kw-G) by Cordeiro and de Castro [7], McDonald-G (Mc-G) by Alexander et al. [1], Weibull-G by Bourguignon et al. [6], exponentiated half-logistic by Cordeiro et al. [10], transformed-transformer (T-X) by Alzaatreh et al. [2], Lomax generator by Cordeiro et al. [14], Kumaraswamy Marshall-Olkin family by Alizadeh et al. [3], beta Marshall-Olkin family by Alizadeh et al. [4], and type I half-logistic family by Cordeiro et al. [11].
In this paper, using the T-X idea proposed by Alzaatreh et al. [2] and the odd log-logistic-G family of Gleaton and Lynch [12], we introduce a new family of distributions. The cumulative distribution function (cdf) of this new family is given by equation (1), and the corresponding probability density function (pdf) and hazard rate function (hrf) are given by equations (2) and (3), respectively. Since the half-logistic distribution is used as the generator, the family is named the new type I half-logistic-G family of distributions (NT1HL-G for short). For α = 1, the NT1HL-G family reduces to the type I half-logistic family of Cordeiro et al. [11]. If X ∼ NT1HL-G, then it is easy to show that $Y = G(X;\xi)^{\alpha} / [G(X;\xi)^{\alpha} + \bar{G}(X;\xi)^{\alpha}]$ has a half-logistic distribution with parameter λ.
Proposing a new distribution requires a good motivation and a physical interpretation. We propose the new family to decrease the modelling error for data sets of particular interest, such as those that are bimodal and left skewed or bimodal and right skewed. The NT1HL-G family provides a new possibility to model these kinds of data sets. Moreover, by adding two extra shape parameters, we extend well-known distributions and give them more flexibility, such as left skewness and heavy tails.
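For reference, one form of (1)-(3) that is consistent with the construction described above is sketched below. It is an assumption rather than a quotation: it takes the type I half-logistic-G cdf of [11], $[1-\bar{G}^{\lambda}]/[1+\bar{G}^{\lambda}]$, and replaces the baseline cdf by the odd log-logistic transform $G^{\alpha}/(G^{\alpha}+\bar{G}^{\alpha})$ of [12]. Here $G = G(x;\xi)$, $\bar{G} = 1 - G$ and $g = g(x;\xi)$ denote the baseline cdf, survival function and pdf.

```latex
\begin{align}
F(x;\alpha,\lambda,\xi)
  &= \frac{\left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda}-\bar{G}^{\alpha\lambda}}
          {\left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda}+\bar{G}^{\alpha\lambda}}, \tag{1}\\[4pt]
f(x;\alpha,\lambda,\xi)
  &= \frac{2\,\alpha\,\lambda\, g\, G^{\alpha-1}\,\bar{G}^{\alpha\lambda-1}
           \left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda-1}}
          {\left\{\left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda}+\bar{G}^{\alpha\lambda}\right\}^{2}}, \tag{2}\\[4pt]
h(x;\alpha,\lambda,\xi)
  &= \frac{\alpha\,\lambda\, g\, G^{\alpha-1}\left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda-1}}
          {\bar{G}\left\{\left[G^{\alpha}+\bar{G}^{\alpha}\right]^{\lambda}+\bar{G}^{\alpha\lambda}\right\}}. \tag{3}
\end{align}
```

Under this form, setting α = 1 gives $G^{\alpha}+\bar{G}^{\alpha} = 1$, so (1) collapses to $[1-\bar{G}^{\lambda}]/[1+\bar{G}^{\lambda}]$, in agreement with the stated reduction to the family of [11]; (3) follows from (2) because $1-F = 2\bar{G}^{\alpha\lambda}/\{[G^{\alpha}+\bar{G}^{\alpha}]^{\lambda}+\bar{G}^{\alpha\lambda}\}$.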
Proof: First note that Using the generalized binomial expansion, for any x ∈ ℝ. Taking A = 1 implies λ_1 = λ_2. Since both series are equal,
The plan of this paper is as follows. Section 1 introduces the NT1HL-G family. In Section 2, some special cases of the NT1HL-G family are studied. The main properties of the proposed family are given in Section 3. In Section 4, the maximum likelihood method is discussed to estimate the model parameters. A log-new type I half-logistic-Weibull (LNT1HL-W) regression model is proposed in Section 5. In Section 6, a simulation study is presented to show the performance of the estimators of the proposed family. Two real data sets are employed to illustrate the methodology, and the results are given in Section 7. Section 8 concludes.

Some special cases of NT1HL-G
In this section, we provide two special models of the NT1HL-G family which illustrate the flexibility of the new family; they correspond to the baseline Weibull (W) and Normal (N) distributions, respectively.

The NT1HL-W distribution
Let G(x) and g(x) denote the cdf and pdf of the W distribution with scale b > 0 and shape a > 0 parameters. Inserting these functions in the general form of the family gives the pdf of the NT1HL-W model (for x > 0) in equation (4). Figure 1 displays the pdf and hrf shapes of the NT1HL-W distribution for some parameter values. These figures show that the pdf of the NT1HL-W distribution can be left skewed, right skewed or nearly symmetric. Additionally, the NT1HL-W hrf can be increasing, decreasing, bathtub shaped or unimodal. This variety of hrf shapes reveals the flexibility of the NT1HL-W distribution in modelling different lifetime data sets.
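The NT1HL-W cdf and pdf can be evaluated numerically along the following lines. This is a minimal sketch, not the paper's code: it assumes the Weibull cdf in the form 1 − exp{−(x/b)^a}, and it assumes the NT1HL-G cdf obtained by applying the type I half-logistic-G cdf of [11] to the odd log-logistic transform G^α/(G^α + Ḡ^α). The function names are hypothetical.

```python
import math


def weibull_cdf(x, a, b):
    """Baseline Weibull cdf with shape a and scale b (assumed parameterization)."""
    return 1.0 - math.exp(-((x / b) ** a))


def weibull_pdf(x, a, b):
    """Baseline Weibull pdf matching weibull_cdf."""
    return (a / b) * (x / b) ** (a - 1.0) * math.exp(-((x / b) ** a))


def nt1hl_cdf(G, alpha, lam):
    """Assumed NT1HL-G cdf, written as a function of the baseline cdf value G."""
    Gb = 1.0 - G
    s = G ** alpha + Gb ** alpha
    return (s ** lam - Gb ** (alpha * lam)) / (s ** lam + Gb ** (alpha * lam))


def nt1hl_pdf(G, g, alpha, lam):
    """Derivative of nt1hl_cdf with respect to x, given baseline cdf G and pdf g."""
    Gb = 1.0 - G
    s = G ** alpha + Gb ** alpha
    d = s ** lam + Gb ** (alpha * lam)
    num = 2.0 * alpha * lam * g * G ** (alpha - 1.0) * Gb ** (alpha * lam - 1.0)
    return num * s ** (lam - 1.0) / d ** 2


def nt1hl_w_cdf(x, alpha, lam, a, b):
    return nt1hl_cdf(weibull_cdf(x, a, b), alpha, lam)


def nt1hl_w_pdf(x, alpha, lam, a, b):
    return nt1hl_pdf(weibull_cdf(x, a, b), weibull_pdf(x, a, b), alpha, lam)
```

Writing the family as a function of the baseline value G keeps the same two helpers usable for any other baseline, such as the normal case of the next subsection.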

The NT1HL-N distribution
In the last two decades, several researchers have generalized the normal distribution to obtain unimodal and bimodal skew-symmetric normal distributions by different procedures; see, for example, Arellano-Valle et al. [5] and Rasekhi et al. [23]. The NT1HL-N distribution suggests another way of creating a flexible (unimodal or bimodal) skew-symmetric normal distribution. The pdf of the NT1HL-N distribution is given by equation (5), where ϕ(·) and Φ(·) are the pdf and cdf of the standard normal distribution and $z = (x-\mu)/\sigma$. Based on Figure 2, the pdf (5) includes right and left skew-symmetric unimodal and bimodal shapes.

Quantile function
The quantile function (qf) is the key tool for generating random variates from any continuous probability distribution and therefore takes an important place in probability theory. The qf is the solution of F(x) = u for x, where u ∈ (0, 1); random variates are obtained by taking u distributed as U(0, 1). The qf of the NT1HL-G family, Q(·), is expressed through Q_G(·), the qf of the baseline distribution. We assess the effects of the parameters α and λ on the skewness and kurtosis of the NT1HL-G family for the W baseline with scale parameter 2 and shape parameter 9. To do this, we use Bowley's skewness and Moors's kurtosis measures, given, respectively, by $B = [Q(3/4) - 2Q(1/2) + Q(1/4)] / [Q(3/4) - Q(1/4)]$ and $M = [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]$. The results are displayed in Figure 3. As seen from these figures, skewness and kurtosis decrease when the parameters α and λ increase.
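The inversion and the two quantile-based shape measures can be sketched as follows. The closed-form inverse used here is an assumption: it is obtained by solving F(x) = u for the assumed cdf F = (1 − V^λ)/(1 + V^λ) with V = Ḡ^α/(G^α + Ḡ^α), and the Weibull baseline is taken as 1 − exp{−(x/b)^a}. Function names are hypothetical.

```python
import math
import random


def nt1hl_w_quantile(u, alpha, lam, a, b):
    """Solve F(x) = u for the assumed NT1HL-W cdf."""
    v = ((1.0 - u) / (1.0 + u)) ** (1.0 / lam)    # V = Gbar^alpha / (G^alpha + Gbar^alpha)
    num = (1.0 - v) ** (1.0 / alpha)
    p = num / (num + v ** (1.0 / alpha))          # baseline cdf value G(x)
    return b * (-math.log(1.0 - p)) ** (1.0 / a)  # Weibull quantile Q_G(p)


def nt1hl_w_cdf(x, alpha, lam, a, b):
    """Assumed NT1HL-W cdf, used here only to check the inversion."""
    G = 1.0 - math.exp(-((x / b) ** a))
    Gb = 1.0 - G
    s = G ** alpha + Gb ** alpha
    return (s ** lam - Gb ** (alpha * lam)) / (s ** lam + Gb ** (alpha * lam))


def bowley_skewness(q):
    """Bowley's skewness from a quantile function q(u)."""
    return (q(0.75) - 2.0 * q(0.5) + q(0.25)) / (q(0.75) - q(0.25))


def moors_kurtosis(q):
    """Moors's kurtosis from a quantile function q(u)."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))


def sample(n, alpha, lam, a, b, seed=0):
    """Inverse-transform sampling: apply the qf to U(0, 1) draws."""
    rng = random.Random(seed)
    return [nt1hl_w_quantile(rng.random(), alpha, lam, a, b) for _ in range(n)]
```

For the baseline used in the text (scale 2, shape 9), the measures for a given pair (α, λ) would be obtained with `q = lambda u: nt1hl_w_quantile(u, alpha, lam, 9.0, 2.0)` followed by `bowley_skewness(q)` and `moors_kurtosis(q)`.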

Expansions for NT1HL-G
Most of the statistical properties of the NT1HL-G distribution can be obtained by using the exponentiated-G ("Exp-G") distribution. The cdf and pdf of the Exp-G distribution with power parameter a > 0 are given, respectively, by $H_a(x) = G(x)^{a}$ and $h_a(x) = a\,g(x)\,G(x)^{a-1}$, where g(x) and G(x) are the pdf and cdf of the baseline distribution, respectively. Here, we obtain an expansion for F(x) in order to derive the statistical properties of the NT1HL-G distribution. We use the power series for $u^{\lambda}$, where λ > 0 and 0 < u < 1, together with the generalized binomial expansion (for α > 0). Using the ratio of two power series, F(x) can be written as in (9), where $c_0 = a_0/b_0$ and the coefficients $c_k$ (for k ≥ 1) are determined from a recurrence equation. The pdf of X follows by differentiating (9), which gives (10), where $h_{k+1}(x)$ is the Exp-G density function with power parameter (k + 1). From (10), we conclude that the pdf of the NT1HL-G distribution can be expressed as a linear combination of Exp-G densities. Therefore, the statistical properties of the NT1HL-G distribution can be obtained using this relation. The important properties of the Exp-G density have been analyzed by many researchers, such as Mudholkar and Srivastava [17], Mudholkar et al. [18], Nadarajah [19,20] and Gupta and Kundu [13].
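The ratio-of-power-series step can be illustrated with the standard recurrence for dividing two series: if $\sum_k a_k u^k / \sum_k b_k u^k = \sum_k c_k u^k$ with $b_0 \neq 0$, then $c_0 = a_0/b_0$ and $c_k = b_0^{-1}\,(a_k - \sum_{r=1}^{k} b_r c_{k-r})$ for k ≥ 1. The sketch below implements only this generic recurrence; the specific coefficients $a_k$ and $b_k$ of the NT1HL-G expansion are not reproduced here, and the function name is hypothetical.

```python
def divide_power_series(a, b):
    """Coefficients c_0..c_{n-1} of (sum a_k u^k) / (sum b_k u^k), truncated to n terms.

    a and b are lists of leading coefficients; b[0] must be non-zero.
    """
    n = min(len(a), len(b))
    if n == 0 or b[0] == 0:
        raise ValueError("need at least one term and a non-zero b[0]")
    c = []
    for k in range(n):
        # c_k = (a_k - sum_{r=1}^{k} b_r c_{k-r}) / b_0
        acc = a[k] - sum(b[r] * c[k - r] for r in range(1, k + 1))
        c.append(acc / b[0])
    return c
```

As a quick check, dividing 1 by 1 − u returns the geometric series coefficients 1, 1, 1, ...; multiplying the result back by the denominator should recover the numerator term by term.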

Moments
Let $Y_k$ be a random variable having the Exp-G density function with power parameter k + 1, $h_{k+1}(x)$. Using (10), the nth raw moment of X ∼ NT1HL-G is expressed as a linear combination of the moments $E(Y_k^{n})$. The expressions given in Nadarajah and Kotz [21] can be used to obtain $E(X^{n})$. Additionally, using (11), the raw moments of the NT1HL-G distribution can be rewritten in terms of the baseline quantile function through the quantity τ(n, k). This quantity was obtained for several baseline distributions, such as the beta, gamma and Weibull, by Cordeiro and Nadarajah [9], and their results can be used to obtain the raw moments of the NT1HL-G distribution.
Incomplete moments are useful tools that are widely used in measuring inequality, since the Lorenz and Bonferroni curves are defined through them. The nth incomplete moment of X is $m_n(y) = \int_{-\infty}^{y} x^{n} f(x)\,dx$, which can be expanded term by term using (10).
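As a numerical cross-check of such series expressions, raw and incomplete moments can also be computed directly from the quantile function, using $E(X^{n}) = \int_0^1 Q(u)^{n}\,du$ and $m_n(y) = \int_0^{F(y)} Q(u)^{n}\,du$. The sketch below does this with a midpoint rule for the NT1HL-W case. It is not the series approach of the text; the quantile formula is the one implied by the assumed cdf F = (1 − V^λ)/(1 + V^λ), V = Ḡ^α/(G^α + Ḡ^α), with a Weibull baseline, and all names are hypothetical.

```python
import math


def nt1hl_w_quantile(u, alpha, lam, a, b):
    """Quantile function of the assumed NT1HL-W cdf (Weibull shape a, scale b)."""
    v = ((1.0 - u) / (1.0 + u)) ** (1.0 / lam)
    num = (1.0 - v) ** (1.0 / alpha)
    p = num / (num + v ** (1.0 / alpha))
    return b * (-math.log(1.0 - p)) ** (1.0 / a)


def midpoint_integral(func, lo, hi, steps=20000):
    """Composite midpoint rule; it never evaluates func at the endpoints."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


def raw_moment(n, q):
    """E(X^n) as the integral of q(u)^n over (0, 1)."""
    return midpoint_integral(lambda u: q(u) ** n, 0.0, 1.0)


def incomplete_moment(n, p, q):
    """m_n(y) written in terms of p = F(y): integral of q(u)^n over (0, p)."""
    return midpoint_integral(lambda u: q(u) ** n, 0.0, p)
```

With `q = lambda u: nt1hl_w_quantile(u, 2.0, 0.5, 9.0, 2.0)`, the mean and variance follow as `raw_moment(1, q)` and `raw_moment(2, q) - raw_moment(1, q) ** 2`.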

Generating function
Now, we derive the moment generating function (mgf) of the NT1HL-G distribution based on the expansion given in (10). Let $M_X(t) = E(e^{tX})$ denote the mgf of the NT1HL-G distribution; by (10), it is a linear combination of the mgfs of Exp-G distributions. Alternatively, the mgf can be derived from the quantile function of the baseline distribution.

Mean deviations
The mean deviations about the mean $\mu'_1 = E(X)$ and about the median M are given, respectively, by $\delta_1(X) = E|X - \mu'_1|$ and $\delta_2(X) = E|X - M|$. Both can be written in terms of the first incomplete moment $m_1(z)$, and we provide two ways to obtain $\delta_1$ and $\delta_2$. In the first, the required equation for $m_1(z)$ is derived from (10) and given in (17); using (17), the mean deviations $\delta_1(X)$ and $\delta_2(X)$ follow. The other way is to set u = G(x) in (10) to obtain a general formula for $m_1(z)$ in terms of the baseline quantile function, which can be easily calculated for most baseline distributions.
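The link between the mean deviations and the first incomplete moment is the standard identity sketched below, stated here as a reminder rather than as the paper's displayed equations. The symbols $\mu'_1 = E(X)$ (mean), $M$ (median, so that $F(M) = 1/2$) and $m_1(z) = \int_{-\infty}^{z} x f(x)\,dx$ are notational assumptions.

```latex
\begin{align*}
\delta_1(X) &= \int_{-\infty}^{\infty} |x-\mu'_1|\, f(x)\,dx
  = \int_{-\infty}^{\mu'_1} (\mu'_1 - x) f(x)\,dx + \int_{\mu'_1}^{\infty} (x-\mu'_1) f(x)\,dx \\
 &= \mu'_1 F(\mu'_1) - m_1(\mu'_1) + \bigl[\mu'_1 - m_1(\mu'_1)\bigr] - \mu'_1\bigl[1 - F(\mu'_1)\bigr]
  = 2\mu'_1 F(\mu'_1) - 2 m_1(\mu'_1), \\[4pt]
\delta_2(X) &= \int_{-\infty}^{\infty} |x-M|\, f(x)\,dx
  = 2 M F(M) - M + \mu'_1 - 2 m_1(M)
  = \mu'_1 - 2 m_1(M).
\end{align*}
```

Only $F$ and $m_1$ are needed, so any expression for $m_1(z)$, whether from the Exp-G expansion or from the baseline quantile function, gives both mean deviations.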

Estimation
In this section, we discuss maximum likelihood estimation (MLE) and inference for the NT1HL-G distribution. Let $x_1, \ldots, x_n$ be a random sample from X ∼ NT1HL-G, where λ and α are the shape parameters and ξ is the parameter vector of the baseline distribution. The log-likelihood for the parameters of the NT1HL-G distribution given the data set $x_1, \ldots, x_n$ is given in (19). The components of the score vector, $U(\theta) = \partial \ell_n / \partial \theta = (U_\alpha, U_\lambda, U_\xi)^{T}$, involve $g^{(\xi)}(x_i;\xi) = \partial g(x_i;\xi)/\partial \xi$ and $G^{(\xi)}(x_i;\xi) = \partial G(x_i;\xi)/\partial \xi$. Setting the score components to zero and solving them simultaneously gives the MLEs of the unknown parameters of the NT1HL-G distribution. Alternatively, (19) can be maximized directly with an iterative optimization algorithm started from an initial parameter vector. Here, we obtain the MLEs of the parameters of the NT1HL-G distribution by using the optim function of the R software.
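Direct maximization of the log-likelihood can be sketched as follows for the NT1HL-W case. Several things here are assumptions and should not be read as the paper's procedure: the log-likelihood is the sum of log-densities under the assumed pdf $2\alpha\lambda\, g\, G^{\alpha-1}\bar{G}^{\alpha\lambda-1}[G^{\alpha}+\bar{G}^{\alpha}]^{\lambda-1} / \{[G^{\alpha}+\bar{G}^{\alpha}]^{\lambda}+\bar{G}^{\alpha\lambda}\}^{2}$ with Weibull baseline 1 − exp{−(x/b)^a}; a simple derivative-free compass search on the log-parameters stands in for R's optim; and all function names are hypothetical.

```python
import math
import random


def quantile(u, alpha, lam, a, b):
    """Inverse of the assumed NT1HL-W cdf (used only to simulate data)."""
    v = ((1.0 - u) / (1.0 + u)) ** (1.0 / lam)
    num = (1.0 - v) ** (1.0 / alpha)
    p = num / (num + v ** (1.0 / alpha))
    return b * (-math.log(1.0 - p)) ** (1.0 / a)


def loglik(theta, data):
    """Log-likelihood of theta = (alpha, lam, a, b); -inf outside the valid region."""
    alpha, lam, a, b = theta
    if min(theta) <= 0.0:
        return -math.inf
    total = 0.0
    try:
        for x in data:
            t = (x / b) ** a               # so that log Gbar(x) = -t
            G = -math.expm1(-t)            # baseline cdf, accurate for small t
            Gb = math.exp(-t)              # baseline survival function
            log_g = math.log(a / b) + (a - 1.0) * math.log(x / b) - t
            s = G ** alpha + Gb ** alpha
            d = s ** lam + Gb ** (alpha * lam)
            total += (math.log(2.0 * alpha * lam) + log_g
                      + (alpha - 1.0) * math.log(G) - (alpha * lam - 1.0) * t
                      + (lam - 1.0) * math.log(s) - 2.0 * math.log(d))
    except (ValueError, OverflowError, ZeroDivisionError):
        return -math.inf
    return total


def fit(data, start, step=0.5, tol=1e-4, max_iter=500):
    """Compass search on log-parameters; accepts a move only if loglik increases."""
    x = [math.log(v) for v in start]
    f = lambda z: loglik([math.exp(v) for v in z], data)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                ft = f(trial)
                if ft > fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0                    # shrink the step when no direction helps
            if step < tol:
                break
    return [math.exp(v) for v in x], fx
```

A compass search only guarantees a non-decreasing log-likelihood from the starting point, not a global maximum, so in practice several starting vectors would be tried, as with any iterative optimizer.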

The log-new type 1 half-logistic-Weibull (LNT1HL-W) regression model
Survival regression models are widely used to analyze the lifetimes of devices as functions of explanatory variables. In the literature, researchers have proposed several location-scale regression models for the W or Burr-XII cases of G-class distributions (see Cordeiro et al. [8]). Following the results of Cordeiro et al. [8], we introduce a new log-location-scale regression model based on the NT1HL-W distribution. Let X be a random variable having the density in (4) and consider the transformation Y = log(X) with the re-parametrizations a = 1/σ and b = exp(µ). The density of Y is then given by (21), where y ∈ ℝ, µ ∈ ℝ, σ > 0, α > 0 and λ > 0. Hereafter, a random variable with the density in (21) is denoted by Y ∼ LNT1HL-W(α, λ, σ, µ). The survival function corresponding to (21) is denoted by S(y), and the corresponding hrf is easily obtained as h(y) = f(y)/S(y). Let Z = (Y − µ)/σ be the standardized random variable; its density follows from (21) by this transformation.
Now, using the density in (21), we propose a new location-scale regression model in which the response variable follows the LNT1HL-W density and $v_i^{T} = (v_{i1}, \ldots, v_{ip})$ is the vector of explanatory variables. Consider the regression model $y_i = v_i^{T}\beta + \sigma z_i$, i = 1, ..., n, where $y_i$ has the density in (21). The unknown regression parameter vector is $\beta = (\beta_1, \ldots, \beta_p)^{T}$, and the scale parameter is σ > 0. The parameters α > 0 and λ > 0 are unknown shape parameters. The parameter $\mu_i = v_i^{T}\beta$ represents the location of $y_i$. The LNT1HL-W regression model contains the LHL-W regression model as a submodel.

• Log-half-logistic-Weibull (LHL-W) regression model
For α = 1, the survival function of the LNT1HL-W model reduces to that of the LHL-W regression model.
Now, we obtain the unknown parameters of the LNT1HL-W regression model by the MLE method. First, we define the required notation. Assume that $y_1, y_2, \ldots, y_n$ is a random sample from the LNT1HL-W distribution. The response variable is defined as $y_i = \min\{\log(x_i), \log(c_i)\}$, where $\log(x_i)$ is the log-lifetime and $\log(c_i)$ is the log-censoring time. We also assume that $\log(x_i)$ and $\log(c_i)$ are independent. We define two sets, F and C: F is the set of observations that are log-lifetimes and C is the set of observations that are log-censoring times. Under this specification, the general form of the log-likelihood for location-scale regression models is $\ell(\tau) = \sum_{i \in F} \log f(y_i) + \sum_{i \in C} \log S(y_i)$ (26), where the unknown parameter vector is $\tau = (\alpha, \lambda, \sigma, \beta^{T})^{T}$ for the LNT1HL-W regression model. Using (26), the log-likelihood function of the LNT1HL-W regression model is given in (27), where $u_i = \exp(z_i)$, $z_i = (y_i - v_i^{T}\beta)/\sigma$, r is the number of failures and c is the number of censored observations. The MLE of the unknown parameter vector τ is obtained by direct maximization of (27) with the optim function of the R software.
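The censored log-likelihood can be evaluated as in the sketch below. The density and survival function used are assumptions derived from the assumed NT1HL-W form: with $u = \exp\{(y-\mu)/\sigma\}$, $\bar{G} = e^{-u}$, $G = 1 - e^{-u}$, $s = G^{\alpha} + \bar{G}^{\alpha}$ and $d = s^{\lambda} + \bar{G}^{\alpha\lambda}$, they are $S(y) = 2 e^{-\alpha\lambda u}/d$ and $f(y) = (2\alpha\lambda/\sigma)\, u\, e^{-\alpha\lambda u}\, G^{\alpha-1} s^{\lambda-1}/d^{2}$. The censoring indicator δ (1 = observed lifetime, 0 = censored) and all function names are hypothetical.

```python
import math


def log_pdf_and_log_surv(y, mu, sigma, alpha, lam):
    """Return (log f(y), log S(y)) for the assumed LNT1HL-W model."""
    z = (y - mu) / sigma
    u = math.exp(z)
    G = -math.expm1(-u)                  # 1 - exp(-u)
    Gb = math.exp(-u)
    s = G ** alpha + Gb ** alpha
    d = s ** lam + Gb ** (alpha * lam)
    log_surv = math.log(2.0) - alpha * lam * u - math.log(d)
    log_pdf = (math.log(2.0 * alpha * lam / sigma) + z - alpha * lam * u
               + (alpha - 1.0) * math.log(G) + (lam - 1.0) * math.log(s)
               - 2.0 * math.log(d))
    return log_pdf, log_surv


def censored_loglik(beta, sigma, alpha, lam, y, delta, V):
    """Sum of log f over observed lifetimes plus log S over censored observations."""
    total = 0.0
    for yi, di, vi in zip(y, delta, V):
        mu_i = sum(bj * vij for bj, vij in zip(beta, vi))  # location mu_i = v_i^T beta
        log_pdf, log_surv = log_pdf_and_log_surv(yi, mu_i, sigma, alpha, lam)
        total += log_pdf if di == 1 else log_surv
    return total
```

Here `V` is a list of covariate rows, with a leading 1 in each row if the model has an intercept; the function could be handed to any general-purpose optimizer after negation.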

Simulation Study
Here, we study the asymptotic properties of the MLEs of the NT1HL-W parameters by means of a simulation study. The number of replications is N = 1,000, and samples of sizes n = 50, 100, 200 and 500 are generated from the NT1HL-W distribution by the inverse transform method. The simulation results are evaluated based on the estimated biases, mean square errors (MSEs), average lengths (ALs) and coverage probabilities (CPs). The required formulas for these measures can be found in Cordeiro et al. [8]. For each generated sample, we obtain the MLEs of the parameters, $(\hat{\alpha}_i, \hat{\lambda}_i, \hat{a}_i, \hat{b}_i)$, and the corresponding standard errors, $(s_{\hat{\alpha}_i}, s_{\hat{\lambda}_i}, s_{\hat{a}_i}, s_{\hat{b}_i})$, where i = 1, 2, ..., N. The biases, MSEs, ALs and CPs are calculated from these estimates and standard errors.
Two parameter vectors are used: θ = (α = 2, λ = 0.5, a = 2, b = 0.5) and θ = (α = 2, λ = 0.5, a = 0.5, b = 2). The simulation results are summarized in Table 1. They show that the estimated biases and MSEs are near zero, as expected. Also, the CPs are near 0.95, and the ALs decrease as the sample size increases. These results suggest that the MLEs of the parameters of the NT1HL-W distribution are asymptotically unbiased and consistent.
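The four summary measures can be computed from the replicated estimates as in the sketch below. The formulas are the usual ones and are an assumption here, since the text defers their definition to [8]: bias and MSE are averages of the estimation error and its square, CP is the proportion of 95% Wald intervals $\hat{\theta}_i \pm 1.96\, s_{\hat{\theta}_i}$ that contain the true value, and AL is the average length of those intervals. The function name is hypothetical.

```python
def summarize_replications(estimates, std_errors, true_value, z=1.959964):
    """Bias, MSE, average length and coverage probability for one parameter.

    estimates and std_errors hold one MLE and one standard error per replication.
    """
    n = len(estimates)
    bias = sum(e - true_value for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    avg_length = 2.0 * z * sum(std_errors) / n
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s < true_value < e + z * s)
    return {"bias": bias, "mse": mse, "al": avg_length, "cp": covered / n}
```

Calling the function once per parameter and per sample size gives the entries of a table laid out like Table 1.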

Applications
In this section, we use one real data set to compare the fits of the NT1HL-G family of distributions with those of other well-known distributions. For this data set, the parameters are estimated by maximum likelihood (Section 4) via the optim function in R. Summary statistics of this data set are presented in Table 2. The MLEs, −ℓ, standard errors, and Cramér-von Mises and Anderson-Darling statistics are given in Table 3. Lower values of these criteria indicate a better fit to the data set. Finally, Figure 4 shows the histogram of this real data set together with the fitted density functions for a visual comparison.

Otis IQ Scores
The first data set is given by Roberts [15]), skew generalized normal (Arellano-Valle et al. [5]) and the submodel of NT1HL-N with α = 1, named the type I half-logistic-normal (Cordeiro et al. [11]), models. Table 3 illustrates that the NT1HL-N model gives a better fit to these data than the rival models. Figure 4 displays the estimated pdfs of the fitted models on the histogram of the data set. Figure 5 displays the profile likelihood functions of the NT1HL-N distribution. These figures reveal that the estimated parameters are the maximizers of the log-likelihood function of the NT1HL-N distribution.
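The two goodness-of-fit statistics can be computed from the fitted cdf evaluated at the data, as in the sketch below. It uses the plain textbook forms $W^{2} = \frac{1}{12n} + \sum_{i=1}^{n}\bigl[u_{(i)} - \frac{2i-1}{2n}\bigr]^{2}$ and $A^{2} = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\bigl[\log u_{(i)} + \log(1-u_{(n+1-i)})\bigr]$, where $u_{(i)}$ are the ordered fitted cdf values. Whether the paper applies a small-sample correction to these statistics is not stated, so the uncorrected versions are an assumption; function names are hypothetical.

```python
import math


def cramer_von_mises(u):
    """Uncorrected Cramer-von Mises statistic from fitted cdf values u_i in (0, 1)."""
    u = sorted(u)
    n = len(u)
    return 1.0 / (12.0 * n) + sum((u[i] - (2 * i + 1) / (2.0 * n)) ** 2
                                  for i in range(n))


def anderson_darling(u):
    """Uncorrected Anderson-Darling statistic from fitted cdf values u_i in (0, 1)."""
    u = sorted(u)
    n = len(u)
    total = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
                for i in range(n))
    return -n - total / n


def normal_cdf(x, mu, sigma):
    """Standard helper: cdf of N(mu, sigma^2), shown as an example of a fitted cdf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
```

For a fitted model with cdf `F`, the inputs would be `[F(x) for x in data]`; smaller values of both statistics indicate a closer agreement between the fitted and empirical distributions.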

Censored lifetime data application for LNT1HL-W regression model
Here, the importance of the LNT1HL-W regression model is illustrated by means of real data modelling. The data set consists of data on a production relay and on a proposed design change. The number of observations is 35. The data set can be found in Nelson [22]. The LNT1HL-W regression model is fitted to this data set. The variables are $y_i$: observed thousands of cycles; $cens_i$: censoring indicator (0 = censoring, 1 = lifetime observed); and $x_{i1}$: production (16 amps, 26 amps, 28 amps). The research question is to explore the effect of the production level on the thousands of cycles. A regression model with $x_{i1}$ as the explanatory variable is fitted, where $y_i$ has the density in (21). The MLEs of the parameters of the fitted regression models and the AIC and BIC values are given in Table 4. As seen from these results, the LNT1HL-W regression model has lower AIC and BIC values than the LHL-W regression model. Moreover, the regression parameter $\beta_1$ is statistically significant at the 5% level. This means that there is a significant difference between the production levels in terms of the thousands of cycles.
The LR statistic is used to compare the LNT1HL-W and LHL-W regression models, and the results are reported in Table 5. The results in this table, especially the p-values, indicate that the LNT1HL-W model provides a better fit to these data than the LHL-W regression model. Figure 6(a) displays the empirical survival functions and the estimated survival functions of the LNT1HL-W regression model. This figure reveals that the LNT1HL-W regression model fits the current data well. We also conclude from this figure that there is no significant difference between the survival functions of the 26 and 28 amps levels. Figure 6(b) displays the estimated hrfs of the LNT1HL-W regression model.

Residual analysis of LNT1HL-W regression model
After fitting the model to the data set, residual analysis should be performed to check for departures from the error assumption. To do this, we use two types of residuals: the martingale and modified deviance residuals, introduced by Therneau et al. [25]. The martingale residuals for the LNT1HL-W regression model are expressed in terms of $u_i$, defined in Section 5. Since the martingale residuals are not symmetrically distributed about zero, Therneau et al. [25] introduced the modified deviance residual, which is defined in terms of the martingale residual $r_{Mi}$. The quantile-quantile (QQ) plot of the modified deviance residuals is displayed in Figure 7. As seen from this figure, none of the observations can be considered a possible outlier. So, we conclude that the LNT1HL-W regression model provides a reasonable fit to the current data.
Figure 7. The QQ plot of the modified deviance residuals for the LNT1HL-W regression model.
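The model-comparison quantities used for the two nested regression models can be computed from the maximized log-likelihoods as sketched below, using the standard definitions AIC = −2ℓ + 2k, BIC = −2ℓ + k log n and LR = 2(ℓ_full − ℓ_restricted). Because the LHL-W model is the LNT1HL-W model with α = 1, the LR statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail equals erfc(√(LR/2)). The numbers in the usage note are hypothetical and are not the values of Tables 4 and 5.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion from a maximized log-likelihood."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion from a maximized log-likelihood."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def lr_test_one_df(loglik_full, loglik_restricted):
    """LR statistic and its p-value under a chi-square law with 1 degree of freedom."""
    stat = 2.0 * (loglik_full - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value
```

For example, with hypothetical log-likelihoods of −100.0 for the full model (5 parameters) and −103.0 for the restricted model, `lr_test_one_df(-100.0, -103.0)` returns an LR statistic of 6.0, and `aic(-100.0, 5)` and `bic(-100.0, 5, 35)` give the criteria for the full model with n = 35 observations.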
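The two residuals can be sketched as follows, assuming their usual definitions for censored parametric regression models: the martingale residual is $r_{Mi} = \delta_i + \log S(y_i)$, where $\delta_i$ is the censoring indicator (1 = lifetime observed, 0 = censored) and $S$ is the fitted survival function, and the modified deviance residual is $r_{Di} = \mathrm{sign}(r_{Mi})\,\{-2[r_{Mi} + \delta_i \log(\delta_i - r_{Mi})]\}^{1/2}$. The fitted log-survival values are taken as inputs, so no particular form of the LNT1HL-W survival function is assumed here; function names are hypothetical.

```python
import math


def martingale_residual(delta, log_surv):
    """r_M = delta + log S(y), with delta = 1 for an observed lifetime, 0 if censored."""
    return delta + log_surv


def modified_deviance_residual(delta, log_surv):
    """r_D = sign(r_M) * sqrt(-2 * [r_M + delta * log(delta - r_M)])."""
    r_m = martingale_residual(delta, log_surv)
    # For censored observations the delta * log(...) term vanishes.
    inner = r_m + (math.log(delta - r_m) if delta == 1 else 0.0)
    return math.copysign(math.sqrt(max(-2.0 * inner, 0.0)), r_m)
```

The martingale residual lies in (−∞, 1], which is why it is skewed; the deviance transformation spreads the values more symmetrically around zero, making a normal QQ plot a sensible diagnostic.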

Conclusions
This paper proposes a new family of distributions, called the new type I half-logistic-G (NT1HL-G) family. The statistical properties of the NT1HL-G family are studied comprehensively. The maximum likelihood estimation method is considered to obtain the unknown model parameters. A simulation study is conducted to evaluate the finite-sample performance of the estimation method. Two real data sets are analyzed to demonstrate the importance and flexibility of the NT1HL-G family against well-known competing models. The log-linear regression model of the proposed family is introduced and discussed using a real data application. Empirical results show that the proposed family provides new opportunities to model data in many application fields.

Appendix: Three useful power series
We present three power series required for the algebraic developments in Sections 2 and 3. First, for a > 0 real non-integer, we have the binomial expansion (31), where the binomial coefficient is defined for any real a. Second, expanding $z^{\lambda}$ in a Taylor series, we obtain (32), where $(\lambda)_k = \lambda(\lambda-1)\cdots(\lambda-k+1)$ denotes the descending factorial. Third, we obtain an expansion for $[G(x)^{\alpha} + \bar{G}(x)^{\alpha}]^{c}$. From equations (32) and (31), $G(x)^{\alpha} + \bar{G}(x)^{\alpha}$ can be written as a power series in G(x) with coefficients $t_j(\alpha) = f_j(\alpha) + (-1)^{j}\binom{\alpha}{j}$ for j ≥ 0, where $f_j(\alpha)$ is as defined in (32). Then, using (32) again, we obtain (33) and (34), and finally, using equations (33) and (34), the required expansion follows.