A New Family of Continuous Distributions: Properties, Copulas and Real Life Data Modeling

A new family of distributions called the Kumaraswamy Rayleigh family is defined and studied. Some of its relevant statistical properties are derived. Many new bivariate type G families using the Farlie-Gumbel-Morgenstern copula, modified Farlie-Gumbel-Morgenstern copula, Clayton copula and Renyi's entropy copula are derived. The method of maximum likelihood estimation is used. Some special models based on the log-logistic, exponential, Weibull, Rayleigh, Pareto type II, Burr type X and Lindley distributions are presented and studied. Three dimensional skewness and kurtosis plots are presented. A graphical assessment is performed. Two real life applications illustrating the flexibility, potentiality and importance of the new family are presented.


Introduction and motivation
Recently, there has been growing interest in developing more flexible families of distributions by extending the classical cumulative distribution functions (CDFs). Many generalized families of distributions have been defined and studied for modeling different lifetime data in many applied areas such as insurance, engineering, economics, environmental sciences, medical sciences, biological studies and finance. Several G classes of continuous probability distributions have thus been constructed by expanding the common families of distributions. These generalized distributions give more flexibility to the baseline family. Well-known continuous probability distributions such as the Weibull, Burr type X, gamma, normal, beta, Burr XII, Kumaraswamy, Log-Logistic, Topp-Leone and Lindley are widely used because of their simple forms. Recently, many statisticians have focused on more complex and flexible continuous probability distributions, increasing the applicability of these well-known models by adding one or more shape parameters. Well-known families of distributions include: Marshall and Olkin [42] (Marshall and Olkin family), Zografos and Balakrishnan [63] (gamma family), Cordeiro and de Castro [13] (Kumaraswamy family), Yousof et al. [57] (Burr type X family), Cordeiro et al. [12] (Burr type XII family), Merovci et al. [43] (exponentiated transmuted family), Aryal and Yousof [8] (exponentiated generalized-G Poisson family), Brito et al. [10] (Topp-Leone odd log-logistic family), Korkmaz et al. [33] (generalized odd Weibull generated family), Korkmaz et al. [35] (exponential Lindley odd log-logistic family), Korkmaz et al. [36] (Marshall-Olkin generalized-G Poisson family), Nascimento et al. [46] (Nadarajah-Haghighi family), Merovci et al. [44] (Poisson Topp Leone family), Karamikabir et al. [32] (Weibull Topp-Leone generated family), Korkmaz et al. [34] (Hjorth family), Alizadeh et al. [4] (flexible Weibull generated family), Alizadeh et al. [5] (transmuted odd log-logistic family) and El-Morshedy et al. [16] (Poisson generalized exponential family).
Consider a baseline CDF $G_\Psi(z)$ with parameter vector $\Psi$, where $\Psi = (\Psi_k) = (\Psi_1, \Psi_2, \ldots)$. Then, due to Yousof et al. [57], the survival function (SF) of the R-G family of distributions is defined in (1), where $\overline{G}_\Psi(z) = 1 - G_\Psi(z)$ is the SF of the baseline model. In this paper, we define and study a new family of distributions by adding two extra shape parameters to (1) to provide more flexibility to the new generated family.
Using the Kumaraswamy-G (K-G) family (Cordeiro and de Castro [13]), we construct a new family called the Kumaraswamy Rayleigh-G (KR-G) family. For an arbitrary baseline CDF $H_{\sigma,\Psi}(z)$, the K-G family is defined by the CDF $1 - [1 - H_{\sigma,\Psi}(z)^{\zeta}]^{\gamma}\,|_{(V=\zeta,\gamma,\sigma,\Psi)}$. Following Cordeiro and de Castro [13], the SF of the KR-G family can be expressed as in (2). The probability density function (PDF) corresponding to (2) can be derived by differentiation, where $g_\Psi(z)$ refers to the baseline PDF with parameter vector $\Psi$. We are motivated to define and study the KR-G family for the following reasons:

Finally, expanding $D_{\kappa_1,\Psi}(z)$ using (6), equation (7) can be expressed in terms of exponentiated G (ExG) densities, where $\pi_{\kappa^\bullet}(z) = \kappa^\bullet g_\Psi(z)\, G_\Psi(z)^{\kappa^\bullet - 1}$ denotes the ExG PDF with power parameter $\kappa^\bullet$. Similarly, the CDF of the KR-G family can be expressed in terms of ExG CDFs. Henceforward, we set the scale parameter $\sigma = 1$ to obtain a simpler family with fewer parameters.

Quantile function
Quantile functions are used in theoretical aspects, statistical applications and Monte Carlo methods. Monte Carlo simulations employ quantile functions to produce simulated random variables for classical and new continuous distributions. The KR quantile function, say $z = Q(u)$, can be obtained by inverting (2), where $q_{u^*,\gamma,\zeta}$ denotes a logarithmic term in $u$, $\gamma$ and $\zeta$. We can easily generate $z$ by taking $u$ as a uniform random variable on (0, 1).
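The inverse-transform step described above can be sketched in Python. This is a minimal illustration rather than the paper's quantile formula: it assumes the generic Kumaraswamy-G form $F(z) = 1 - [1 - H(z)^{\zeta}]^{\gamma}$ and, as a stand-in for $H$, uses a unit exponential CDF instead of the R-G CDF. All function names are hypothetical.

```python
import math
import random

def kg_cdf(z, zeta, gamma, base_cdf):
    """Generic Kumaraswamy-G CDF: F(z) = 1 - (1 - H(z)^zeta)^gamma."""
    return 1.0 - (1.0 - base_cdf(z) ** zeta) ** gamma

def kg_quantile(u, zeta, gamma, base_quantile):
    """Invert F(z) = u: first solve for p = H(z), then apply H^{-1}."""
    p = (1.0 - (1.0 - u) ** (1.0 / gamma)) ** (1.0 / zeta)
    return base_quantile(p)

# Stand-in baseline (an assumption): unit exponential, not the R-G CDF.
def exp_cdf(z):
    return -math.expm1(-z)

def exp_quantile(p):
    return -math.log1p(-p)

def kg_sample(n, zeta, gamma, base_quantile, seed=0):
    """Draw n variates by feeding uniform(0, 1) draws through the quantile."""
    rng = random.Random(seed)
    return [kg_quantile(rng.random(), zeta, gamma, base_quantile)
            for _ in range(n)]

sample = kg_sample(1000, zeta=2.0, gamma=3.0, base_quantile=exp_quantile)
```

Replacing `exp_cdf` and `exp_quantile` with the CDF and quantile of the actual baseline would give the corresponding sampler, provided the family has the Kumaraswamy-G form assumed here.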

Moments
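Let $Z_{\kappa^\bullet}$ be a random variable having the ExG distribution with density $\pi_{\kappa^\bullet}(z)\,|_{\kappa^\bullet = 2\kappa_1+\kappa_2+2}$ and power parameter $\kappa^\bullet$. The $r$th moment of the KR-G family can be obtained from (11) as a linear combination of the moments $E(Z^r_{\kappa^\bullet})$, where $E(Z^r_{\kappa^\bullet})$ can be calculated numerically in terms of the baseline quantile function, i.e., $E(Z^r_{\kappa^\bullet}) = \kappa^\bullet \int_0^1 Q_G(u)^r\, u^{\kappa^\bullet - 1}\, du$.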
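As a hedged illustration of the numerical route mentioned above, the following Python sketch approximates $E(Z^r_{\kappa^\bullet}) = \kappa^\bullet \int_0^1 Q_G(u)^r u^{\kappa^\bullet-1}\,du$ with a midpoint rule. The unit exponential baseline quantile and the function names are assumptions made for the example; they are not taken from the paper.

```python
import math

def exg_moment(r, kappa, base_quantile, n=100_000):
    """Midpoint-rule approximation of kappa * int_0^1 Q_G(u)^r u^(kappa-1) du."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h  # midpoints avoid the endpoints u = 0 and u = 1
        total += base_quantile(u) ** r * u ** (kappa - 1.0)
    return kappa * h * total

# Assumed baseline for illustration: unit exponential, Q_G(u) = -log(1 - u).
def exp_quantile(u):
    return -math.log1p(-u)

mean_k2 = exg_moment(1, 2.0, exp_quantile)
```

For this stand-in baseline with power parameter 2, the exact mean is 1 + 1/2 = 1.5 (the expected maximum of two independent unit exponentials), which gives a simple check on the quadrature.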

Incomplete moments
The $s$th incomplete moment of $Z$ is the integral of $z^s$ against the KR-G density over $(-\infty, t]$. Using (11), the $s$th incomplete moment of the KR-G family is a linear combination of the ExG incomplete moments $m_{s,\kappa^\bullet}(t)$, which can be calculated numerically using software such as Matlab, R or Mathematica.

Moment generating function
Now we introduce two formulae for the moment generating function (MGF). The first formula expresses $M_Z(t)$ in terms of $M_{\kappa^\bullet}(t)$, the MGF of $Z_{\kappa^\bullet}$. Consequently, $M_Z(t)$ can easily be determined from the ExG generating function. The second formula can be calculated numerically from the baseline quantile function, i.e., $Q_G(u) = G^{-1}(u)$. For the KRPTII model

Copulas
We derive some new bivariate type KR (Biv-KR) models using the FGM copula, modified FGM copula, Clayton copula and Renyi's entropy copula. The multivariate KR (MvKR) type is also presented. Future work may be devoted to studying these new models.

Biv-KR type via FGM copula
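Consider the joint CDF of the FGM family, $C_\delta(u, w) = uw(1 + \delta\,\overline{u}\,\overline{w})$ with $\overline{u} = 1 - u$ and $\overline{w} = 1 - w$, where $\delta \in [-1, 1]$ is a dependence parameter. For every $u, w \in (0, 1)$, $C_\delta(u, 0) = C_\delta(0, w) = 0$, which is "grounded minimum", and $C_\delta(u, 1) = u$ and $C_\delta(1, w) = w$, which is "grounded maximum" (see Gumbel [27] and Gumbel [28]). A copula is continuous in $u$ and $w$; actually, it satisfies the stronger Lipschitz condition $|C_\delta(u_2, w_2) - C_\delta(u_1, w_1)| \le |u_2 - u_1| + |w_2 - w_1|$. The joint PDF can then be derived from $c_\delta(u, w) = 1 + \delta(1 - 2u)(1 - 2w)$.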
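The FGM construction can be sketched in a few lines of Python. The snippet uses the standard FGM copula $C_\delta(u,w) = uw[1 + \delta(1-u)(1-w)]$ and its density $1 + \delta(1-2u)(1-2w)$; the function names and the way marginal CDFs are passed in are illustrative assumptions, and any pair of marginal CDFs (for example two KR-G CDFs) could be supplied.

```python
def fgm_cdf(u, w, delta):
    """FGM copula C(u, w) = u*w*(1 + delta*(1-u)*(1-w)), with |delta| <= 1."""
    return u * w * (1.0 + delta * (1.0 - u) * (1.0 - w))

def fgm_pdf(u, w, delta):
    """Copula density: mixed partial derivative of fgm_cdf in u and w."""
    return 1.0 + delta * (1.0 - 2.0 * u) * (1.0 - 2.0 * w)

def bivariate_cdf(x, y, delta, cdf_x, cdf_y):
    """Joint CDF obtained by plugging two marginal CDFs into the copula."""
    return fgm_cdf(cdf_x(x), cdf_y(y), delta)

# Boundary behaviour: grounded at 0 and uniform margins at 1.
grounded = fgm_cdf(0.3, 0.0, 0.5)
margin = fgm_cdf(0.3, 1.0, 0.5)
```

Since $|(1-2u)(1-2w)| \le 1$ on the unit square, the density stays non-negative exactly when $|\delta| \le 1$, which is why the dependence parameter is restricted to that range.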

Biv-KR type via modified FGM copula
According to Rodriguez-Lallena and Ubeda-Flores [52], the modified version of the bivariate FGM copula is defined as

Biv-KR-FGM (Type-I) model
Here, we consider the following functional form for both ϑ (u) and ω (w).

Biv-KR-FGM (Type-II) model
Let ϑ(u) and ω(w) be two functional forms that satisfy all the conditions stated earlier. The corresponding Biv-KR-FGM (Type-II) model can be derived from

Biv-KR-FGM (Type-III) model
Let ϑ(u) and ω(w) be functional forms which satisfy all the conditions stated earlier. In this case, one can also derive a closed form expression for the associated CDF of the Biv-KR-FGM (Type-III) model from $\Upsilon(u, w)$.

Biv-KR-FGM (Type-IV) model
According to Ghosh and Ray [26] the CDF of the Biv-KR-FGM (Type-IV) model can be derived from Then,

Biv-KR type via Renyi's entropy copula
Consider the theorem of Pougaza and Djafari [47]. Then, the associated Biv-KR model will be

MvKR extension via Clayton copula
The m-dimensional MvKR extension of the above can be derived from the Clayton copula. Then, the MvKR extension can be expressed as
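For orientation, the standard m-dimensional Clayton copula is $C(u_1, \ldots, u_m) = \left(\sum_{i=1}^{m} u_i^{-\delta} - m + 1\right)^{-1/\delta}$ with $\delta > 0$. The Python sketch below evaluates it; the parameter symbol `delta` and the function name are assumptions for the example and may differ from the paper's notation. Plugging marginal CDF values into `us` yields the multivariate extension.

```python
def clayton_cdf(us, delta):
    """m-dimensional Clayton copula for delta > 0.

    us: sequence of values in [0, 1], e.g. marginal CDFs evaluated at a point.
    """
    if delta <= 0.0:
        raise ValueError("this sketch assumes delta > 0")
    if any(u == 0.0 for u in us):
        return 0.0  # grounded: the copula vanishes if any argument is 0
    m = len(us)
    s = sum(u ** (-delta) for u in us) - m + 1.0
    return s ** (-1.0 / delta)

# Setting all but one argument to 1 recovers the remaining margin.
one_margin = clayton_cdf([0.4, 1.0, 1.0], delta=2.0)
```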

Maximum likelihood estimation
The MLEs enjoy desirable properties and can be used when constructing confidence intervals and regions and also in test statistics. We determine the maximum likelihood estimates (MLEs) of the parameters of the KR-G family of distributions from complete samples only. Let z 1 , z 2 , ..., z n be a random sample of size n from the KR-G family.
The log-likelihood function for $V$ is the sum of the log-densities of the observations. The components of the score function are $U_n(V) = (U_n(\zeta), U_n(\gamma), U_n(\Psi))$. Setting $U_n(\zeta) = U_n(\gamma) = U_n(\Psi) = 0$ and solving the resulting nonlinear system simultaneously yields the maximum likelihood estimate (MLE) of $V$, say $\widehat{V}$. These equations cannot be solved analytically, so statistical software is used to solve them numerically.
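To make the numerical step concrete, here is a minimal Python sketch under strong simplifying assumptions: the family is taken in the generic Kumaraswamy-G form with density $\zeta\gamma\,h(z)H(z)^{\zeta-1}[1-H(z)^{\zeta}]^{\gamma-1}$, and the baseline $H$ is a fully known unit exponential (a stand-in, not the R-G CDF), so only $\zeta$ and $\gamma$ are estimated. For fixed $\zeta$ the score equation for $\gamma$ has the closed-form root $\hat\gamma(\zeta) = -n/\sum_i \log(1 - H(z_i)^{\zeta})$, so the sketch maximizes the profile log-likelihood in $\zeta$ by golden-section search. All names are hypothetical.

```python
import math
import random

def exp_cdf(z):
    return -math.expm1(-z)  # assumed known baseline H(z) = 1 - exp(-z)

def kg_sample(n, zeta, gamma, seed=1):
    """Inverse-transform sample from the generic K-G family with H = exp_cdf."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = max(rng.random(), 1e-12)  # guard against u == 0
        p = (1.0 - (1.0 - u) ** (1.0 / gamma)) ** (1.0 / zeta)
        out.append(-math.log1p(-p))
    return out

def profile_loglik(zeta, hs, sum_log_h):
    """Log-likelihood in zeta with gamma profiled out (baseline term dropped)."""
    n = len(hs)
    s = sum(math.log1p(-h ** zeta) for h in hs)  # sum of log(1 - H^zeta) < 0
    gamma_hat = -n / s
    ll = (n * math.log(zeta) + n * math.log(gamma_hat)
          + (zeta - 1.0) * sum_log_h + (gamma_hat - 1.0) * s)
    return ll, gamma_hat

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def fit_kg(zs, base_cdf, lo=0.05, hi=20.0):
    """Return (zeta_hat, gamma_hat) for data zs given a known baseline CDF."""
    hs = [base_cdf(z) for z in zs]
    sum_log_h = sum(math.log(h) for h in hs)
    zeta_hat = golden_max(lambda zt: profile_loglik(zt, hs, sum_log_h)[0], lo, hi)
    return zeta_hat, profile_loglik(zeta_hat, hs, sum_log_h)[1]

zs = kg_sample(2000, zeta=2.0, gamma=3.0)
zeta_hat, gamma_hat = fit_kg(zs, exp_cdf)
```

With unknown baseline parameters $\Psi$ the profile trick no longer reduces the problem to one dimension, and a general-purpose optimizer over all of $V$ would be needed instead.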

Special models
This section presents some special KR models based on the Log-Logistic (LL), Exponential (E), Weibull (W), Rayleigh (R), Pareto type-II (PTII), Burr X (BrX) and Lindley (Li) distributions. Table 1 below presents some new submodels based on the new KR-G family. Figure 1 gives PDF and HRF plots of the Kumaraswamy Rayleigh Weibull (KRW) model. Figure 2 gives PDF and HRF plots of the Kumaraswamy Rayleigh Pareto type-II (KRPTII) model. Based on Figure 1 (right panel), the PDF of the KRW can be "symmetric" and "right skewed" with many useful shapes. Based on Figure 1 (left panel), the HRF of the KRW can be "increasing", "bathtub", "J-shape", "decreasing", "decreasing-constant" and "increasing-constant". Based on Figure 2 (right panel), the PDF of the KRPTII can be "symmetric" and "heavy tailed right skewed" with many useful shapes. Based on Figure 2 (left panel), the HRF of the KRPTII can be "increasing", "bathtub", "decreasing", "constant" and "J-shape". Figures 3 and 5 give the three dimensional skewness plots for the KRW and KRPTII models respectively. Figures 4 and 6 provide the three dimensional kurtosis plots for the KRW and KRPTII models respectively. For the KRPTII model, the resulting expressions involve the complete beta function and the incomplete beta function.

Simulations
To assess the finite sample behavior of the MLEs, we consider and apply the following algorithm:
1. Use the quantile function to generate 1000 samples of size n from the KRPTII distribution;
2. Calculate the MLEs for the 1000 samples;
4. Calculate the biases and mean squared errors (MSEs) for each of the parameters ζ, γ and b.
We repeated these steps for n = 50, 60, ..., 500 with ζ = 1, 2, ..., 100, γ = 1, 2, ..., 100, b = 1, 2, ..., 100, thereby computing the biases and mean squared errors MSE(n) for ζ, γ, b and n = 50, 60, ..., 500. Figure 7 (left panels) shows how the biases vary with respect to n. Figure 7 (right panels) shows how the MSEs vary with respect to n. From Figure 7 (left panels), the biases for each parameter decrease to zero as n → ∞. From Figure 7 (right panels), the MSEs decrease to zero as n → ∞. Based on this assessment, the maximum likelihood method performs well and can be used in estimating the model parameters.
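The structure of this simulation study can be sketched in Python. To keep the example short and self-contained, the KRPTII model is replaced by a stand-in (an exponential distribution with rate 2, whose MLE is available in closed form); the replication count of 1000 follows the text, while the function name and the chosen sample sizes are assumptions.

```python
import random

def simulate_bias_mse(true_rate, n, n_rep=1000, seed=0):
    """Monte Carlo bias and MSE of the MLE for a stand-in exponential model."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        # Step 1: generate one sample of size n.
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        # Step 2: compute the MLE (closed form here: n / sum of observations).
        estimates.append(n / sum(sample))
    # Final step: average error and average squared error over replications.
    bias = sum(e - true_rate for e in estimates) / n_rep
    mse = sum((e - true_rate) ** 2 for e in estimates) / n_rep
    return bias, mse

# Repeat for several sample sizes to see how bias and MSE change with n.
results = {n: simulate_bias_mse(2.0, n) for n in (50, 200, 500)}
```

For the actual KRPTII study, the sampling line would use the model's quantile function and the MLE line would call a numerical optimizer; the bias and MSE computations stay the same.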

Applications and comparing models
In this section, we present applications to two real data sets to illustrate the importance and flexibility of the KRPTII model. We compare the fit of the KRPTII with some well-known competitive models (see Table 2).
Other relevant models can be used in the comparison; see Gad et al. [21], Tahir et al. [56] and Yousof et al. [58] for more details.
Figure 9. Box plot, Q-Q plot, TTT plot and KDE for service times data.
Data set I (84 Aircraft Windshield): Failure times: The first real data set represents the failure times of 84 aircraft windshields given in Murthy et al. [45]. The data are: 0.0400, 3 [15], Gad et al. [21], Altun et al. [7], Refaie ([48], [49], [50], [51]), Yadav et al. [55], Mansour et al. [41] and Ibrahim and Yousof [19].
To explore outliers, box plots are given in Figures 8(a) and 9(a); based on these, no outliers were found. To check normality, Quantile-Quantile (Q-Q) plots are sketched in Figures 8(b) and 9(b); based on these, the data are approximately normal. To explore the shape of the HRF for the real data, the total time on test (TTT) plot (Aarset [1]) is provided (see Figures 8(c) and 9(c)); based on these, the HRF is "increasing monotonically" for the two data sets. To explore the initial shape of the real data nonparametrically, kernel density estimation (KDE) is provided in Figures 8(d) and 9(d). Figures 10 and 11 give the estimated Kaplan-Meier survival (EKMS) plot, estimated PDF (EPDF), Probability-Probability (P-P) plot and estimated HRF (EHRF) for data sets I and II respectively.
For data set I, the analysis results are listed in Tables 3 and 4. Table 3 gives the MLEs and standard errors (SEs) for failure times data. Table 4 gives the −l and goodness-of-fit statistics for failure times data. For data set II, the analysis results are listed in Tables 5 and 6. Table 5 gives the MLEs and SEs for service times data. Table 6 gives the −l and goodness-of-fit statistics for the service times data. Based on Tables 4 and 6, we note that the KRPTII model gives the lowest values of C 1 , C 2 , C 3 , C 4 , C 5 and C 6 among all fitted models. Hence, it could be chosen as the best model under these criteria.
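The TTT plot used above to read off the HRF shape is based on the scaled total time on test transform, $T(r/n) = \left[\sum_{i=1}^{r} y_{(i)} + (n-r)\,y_{(r)}\right] / \sum_{i=1}^{n} y_{(i)}$, plotted against $r/n$. The Python sketch below computes these points; the four-value input is hypothetical toy data chosen for illustration and is not either of the paper's data sets.

```python
def scaled_ttt(data):
    """Return the points (r/n, T(r/n)) of the scaled TTT transform."""
    y = sorted(data)
    n = len(y)
    total = sum(y)
    points = []
    running = 0.0
    for r, value in enumerate(y, start=1):
        running += value  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * value) / total))
    return points

# Hypothetical toy data (not the windshield or service-time data).
ttt_points = scaled_ttt([1.0, 2.0, 3.0, 4.0])
```

A curve lying above the diagonal and concave, as for this toy input, is the pattern associated with an increasing HRF; a convex curve below the diagonal indicates a decreasing HRF.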

Conclusions
Following Cordeiro and de Castro (2011) and Yousof et al. (2016), a new family of distributions called the Kumaraswamy Rayleigh family is defined and studied. Some of its statistical properties, including the quantile function, moments and incomplete moments, are derived. Many new bivariate type G families using the Farlie-Gumbel-Morgenstern copula, modified Farlie-Gumbel-Morgenstern copula, Clayton copula and Renyi's entropy copula are derived. The method of maximum likelihood estimation is used. Some special models based on the Log-Logistic, Exponential, Weibull, Rayleigh, Pareto type-II, Burr X and Lindley distributions are presented and studied. A graphical assessment is performed. Based on this assessment, the maximum likelihood method performs well and can be used in estimating the model parameters. Two real life applications illustrating the flexibility, potentiality and importance of the new family are presented. As potential future work, we can apply many new goodness-of-fit (GOF) tests for right censored distributional validation, such as the Nikulin-Rao-Robson and Bagdonavicius-Nikulin goodness-of-fit tests. Following [17], we can also convert the Kumaraswamy Rayleigh family to a new discrete G family.
Table 4. −l and goodness-of-fit statistics for failure times data.
Table 6. −l and goodness-of-fit statistics for the service times data.
Figure 11. EKMS plot, P-P plot, EPDF plot and EHRF for data set II.