Testing the Number of Components in a Birnbaum-Saunders Mixture Model under a Random Censoring Scheme

This paper deals with testing the number of components in a Birnbaum-Saunders mixture model under randomly right censored data. We focus on two methods, one based on the modified likelihood ratio test and the other on the shortcut method of the bootstrap test. Based on extensive Monte Carlo simulation studies, we evaluate and compare the performance of the proposed tests through their size and power. Moreover, a power analysis is provided as guidance for researchers to examine the factors that affect the power of the proposed tests in detecting the correct number of components in a Birnbaum-Saunders mixture model. Finally, an example of aircraft windshield data is used to illustrate the testing procedure.


Introduction
The Birnbaum-Saunders (BS) distribution is a two-parameter, right-skewed, unimodal distribution that can be considered an alternative to the normal distribution for modeling skewed data, since it is derived from the normal distribution through a monotone transformation. The distribution is named after Birnbaum and Saunders, who introduced it to model the fatigue life of metals subject to cyclic stress. Hence it is one of the most frequently used distributions for modeling failure times of fatiguing materials, as a potential alternative to widely used distributions such as the log-normal, Weibull, gamma and inverse Gaussian. The BS distribution, with its generalizations, has been used to solve problems in areas such as biology, business, economics, engineering, industry, reliability, and the environmental and medical sciences, among many others. A comprehensive review of the BS distribution, including its genesis, various interpretations, statistical features and subsequent developments, can be found in [3] and [23].

A nonnegative random variable Y is BS distributed with parameters α, β > 0, denoted as BS(α, β), if its cumulative distribution function (cdf) and the corresponding probability density function (pdf) are given by

F(y; α, β) = Φ(υ), y > 0, (1)

and

f(y; α, β) = (1 / (2αβ√(2π))) [(β/y)^{1/2} + (β/y)^{3/2}] exp{−(1/(2α²))(y/β + β/y − 2)}, y > 0, (2)

respectively, where Φ(·) is the standard normal cumulative function, υ = α^{−1} ρ(y/β), ρ(z) = z^{1/2} − z^{−1/2}, α is a shape parameter and β is a scale parameter.

Finite mixture models are an effective tool for the analysis of heterogeneous data in many fields of applied science, such as agriculture, bioinformatics, physics, cell biology, economics, industrial engineering, genetics, geology, machine learning, medicine and the social sciences, among many others. Extensive bibliographies on finite mixtures can be found in books such as [37] and [30].
Finite mixtures of BS distributions are widely used to model heterogeneous populations in which the subpopulations have skewed distributions. [2] studied the characteristics and estimation procedures of three different two-component mixture models based on the BS and length-biased BS distributions. [21] suggested a two-component mixture of two bivariate BS distributions and studied its properties and parameter estimation using the EM algorithm. [4] proposed a g-component mixture of BS distributions for modeling multimodal populations as an extension of the two-component mixture of BS distributions introduced by [2]. They discussed identifiability and proposed the k-bumps initialization algorithm within the EM algorithm for estimating the parameters of the mixture of BS distributions. They also implemented bootstrap procedures for testing hypotheses about the number of components g in a mixture of BS distributions using real data. [18] generalized the identifiability proof suggested by [4] to a g-component mixture of BS distributions; they also addressed parameter estimation and homogeneity testing for the finite mixture of BS distributions based on randomly censored data using the EM algorithm and the EM test, respectively. The cdf and pdf of a finite mixture of BS distributions are defined as

F(t; θ) = ∑_{j=1}^g π_j F(t; α_j, β_j), (3)

and

f(t; θ) = ∑_{j=1}^g π_j f(t; α_j, β_j), (4)

where θ = (π_j, α_j, β_j), α_j > 0, β_j > 0, the π_j are the mixing weights with π_j ≥ 0 and ∑_{j=1}^g π_j = 1, and F(t; α_j, β_j) and f(t; α_j, β_j), the cdf and pdf of the jth component, are given by (1) and (2), respectively, j = 1, ..., g.
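As a concrete illustration of the component and mixture definitions above, the following sketch evaluates the BS cdf and pdf of (1) and (2) and the corresponding mixture quantities. The paper's own computations were done in R; this is an illustrative Python sketch using SciPy, and all function names are ours.

```python
import numpy as np
from scipy.stats import norm

def bs_pdf(y, alpha, beta):
    """Density of BS(alpha, beta), y > 0: phi(v) times dv/dy, with
    v = (sqrt(y/beta) - sqrt(beta/y)) / alpha."""
    v = (np.sqrt(y / beta) - np.sqrt(beta / y)) / alpha
    jac = ((beta / y) ** 0.5 + (beta / y) ** 1.5) / (2.0 * alpha * beta)
    return norm.pdf(v) * jac

def bs_cdf(y, alpha, beta):
    """Distribution function of BS(alpha, beta): Phi(v)."""
    v = (np.sqrt(y / beta) - np.sqrt(beta / y)) / alpha
    return norm.cdf(v)

def mix_pdf(y, pi, alpha, beta):
    """g-component BS mixture density: sum_j pi_j * f(y; alpha_j, beta_j)."""
    y = np.asarray(y, dtype=float)
    return sum(p * bs_pdf(y, a, b) for p, a, b in zip(pi, alpha, beta))

def mix_cdf(y, pi, alpha, beta):
    """g-component BS mixture cdf: sum_j pi_j * F(y; alpha_j, beta_j)."""
    y = np.asarray(y, dtype=float)
    return sum(p * bs_cdf(y, a, b) for p, a, b in zip(pi, alpha, beta))
```

Note that at y = β the argument υ vanishes, so each component cdf equals 0.5 at its own scale parameter, a convenient sanity check.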
An important issue in finite mixture modeling is deciding whether the number of components g in the mixture, which indicates the number of distinct groups, describes the data adequately. Different ways of assessing the number of components g in mixture models have been described by several authors, including [30], [24] and [31]. One of the major approaches is hypothesis testing for the smallest number of components g that is compatible with the number of distinct groups in the data. The classical choice in the hypothesis testing approach is the likelihood ratio test (LRT), but in the context of mixture models the asymptotic null distribution of the LRT is not a chi-squared distribution, owing to the violation of standard regularity conditions. [16], [9], [27] and [1] showed that the asymptotic null distribution of the LRT involves a Gaussian process, which is very complex to work with in data analysis.
For a simpler asymptotic distribution, [7], [10] and [11] modified the LRT by adding to the log-likelihood function a penalty term based on the mixing proportions, which partially restores the regularity conditions. The asymptotic null distribution of the modified likelihood ratio test (MLRT) is a mixture of chi-squared distributions when testing homogeneity for finite mixture models with a single parameter in the components. [39] showed that the general asymptotic null distribution of the MLRT for testing homogeneity of mixture models with constraints on the parameters is the chi-bar-squared distribution. [12] and [26] introduced the EM test as a development of the MLRT to overcome undesirable restrictions on the components. Various papers have focused on the development of the EM test for testing the number of components g of finite normal mixtures (see [25], [13], [32], [15], [21], [35], [14], [8] and [6]). Although the EM test is effective and has a simple asymptotic distribution under univariate normal mixture models for the homogeneity test, it remains less effective for testing the number of components in a mixture model with multidimensional parameters ([24] and [8]).
An alternative approach for testing the number of components g in mixture models is to obtain an empirical distribution of the test statistic via Monte Carlo simulation when the usual chi-squared asymptotic null distribution of the test statistic is not applicable. [36] used this approach to simulate the LRT statistic for testing homogeneity in a mixture of two inverse Weibull distributions. Recently, [18] constructed the empirical distribution of the EM test statistic to compute the size and power for testing homogeneity in the mixture of BS distributions. A popular method in this approach is the bootstrap likelihood ratio test (BLRT) proposed by [28]. [19] and [33] extended the work of [28] by using the BLRT for hypothesis testing in mixture models, focusing on p-value computation rather than power computation. [20] and [38] presented two methods of power computation for the BLRT to assess the number of classes in latent models via Monte Carlo simulation: the first bases the power computation on the proportion of bootstrap p-values, which is inapplicable in most cases, while the second is a computationally faster alternative called the shortcut method. Recently, [40] proposed a BLRT for determining the optimal number of components in a Weibull mixture model for grouped data.
The mixture of BS distributions is important in many applications. However, little research has been done on determining its number of components, apart from the work of [4] mentioned above, which was confined to complete sampling, and the work of [18], which addressed testing homogeneity (number of components g = 2) using the EM test. Hence the objective of this paper is to assess the number of components g in the mixture of BS distributions for randomly right censored data, with complete data treated as a special case. Censoring is an inevitable feature of reliability and life testing. It may occur naturally, when the objects under study are lost from the test before failure, or it may be incorporated into the design of a study to save time and cost. Types of censoring correspond to different ways of terminating the life test: Type I censoring, Type II censoring and random censoring arise when the test ends at a fixed time, after a fixed number of failures, or randomly according to certain criteria, respectively; see [22].
Thus this paper contributes to the present literature on the mixture of BS distributions by proposing the MLRT and the shortcut method of the bootstrap test for testing the number of components g in a mixture of BS distributions under a random right censoring scheme. We evaluate and compare the performance of the proposed tests through their size and power. Furthermore, a power analysis is conducted to determine the factors that affect the power of the proposed tests.
The rest of this paper is structured as follows. In Section 2, we introduce the MLRT for testing the number of components g in a mixture of BS distributions for randomly right censored data and describe the algorithms used for the size and power computation. Section 3 reports the simulation results, and a real data set is analyzed in Section 4. A conclusion can be found in Section 5.

Hypothesis testing procedure
In this section, we introduce the hypothesis testing approach for testing the number of components g in a finite mixture of BS distributions for randomly right censored data using the MLRT. After that, we present how to compute the empirical size and power of the MLRT and shortcut method of the bootstrap test by Monte Carlo simulation.

Modified likelihood ratio test procedure
Let T_1, T_2, ..., T_n be a random sample of observations with cdf (3) and pdf (4), and let C_1, C_2, ..., C_n be random censoring times, independent of the T_i's, i = 1, 2, ..., n, drawn from a suitable distribution. In this case, for each of the n individuals we observe the random pair (t_i, δ_i), i = 1, 2, ..., n, where

t_i = min(T_i, C_i) and δ_i = I(T_i ≤ C_i), (5)

so that δ_i = 1 if the ith observation is an observed failure time and δ_i = 0 if it is censored. The likelihood function under the random right censoring scheme, as given in [22], can be written as

L_n(θ) = ∏_{i=1}^n {f(t_i; θ)}^{δ_i} {1 − F(t_i; θ)}^{1−δ_i}. (6)
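The observation scheme (t_i, δ_i) above is easy to mimic in simulation. The sketch below draws BS lifetimes via the well-known normal representation T = β(αZ/2 + √((αZ/2)² + 1))², Z ~ N(0, 1), and pairs them with uniform censoring times, the censoring distribution suggested later for the paper's simulations; the Python function names and the Uniform(0, c_max) bound are our choices.

```python
import numpy as np

def rbs(n, alpha, beta, rng):
    """Sample n values from BS(alpha, beta) via the normal representation:
    if Z ~ N(0,1), then T = beta * (alpha*Z/2 + sqrt((alpha*Z/2)**2 + 1))**2."""
    z = rng.standard_normal(n)
    w = alpha * z / 2.0
    return beta * (w + np.sqrt(w**2 + 1.0)) ** 2

def censored_sample(n, alpha, beta, c_max, rng):
    """Observe (t_i, delta_i) under random right censoring:
    t_i = min(T_i, C_i), delta_i = 1{T_i <= C_i}, with C_i ~ Uniform(0, c_max)."""
    t_true = rbs(n, alpha, beta, rng)
    c = rng.uniform(0.0, c_max, size=n)
    t = np.minimum(t_true, c)
    delta = (t_true <= c).astype(int)
    return t, delta
```

The normal representation follows from inverting υ = α⁻¹(√(T/β) − √(β/T)), which is a quadratic in √(T/β).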

The modified log-likelihood function is

ℓ̃_n(θ) = ℓ_n(θ) + p(π), (7)

where ℓ_n(θ) is the natural logarithm of (6), called the ordinary log-likelihood function, and p(π) is a penalty function used to overcome the boundary problem of the LRT ([10]) by imposing a soft condition on the mixing weights: p(π) attains its maximum at π = 0.5 and p(π) → −∞ as π → 0 or 1. The MLRT statistic is defined as

M_n = 2{ℓ̃_n(θ̂_a) − ℓ̃_n(θ̂_0)}, (8)

where θ̂_0 and θ̂_a are the modified maximum likelihood estimates under the null and alternative hypotheses, respectively, obtained by maximizing (7). The MLRT rejects the null hypothesis if M_n is large; specifically, it rejects when the calculated value of the MLRT statistic for the observed data exceeds the critical value at a given significance level α. We are interested in testing the hypotheses

H_0 : g = g_0 vs. H_a : g = g_0 + 1.

To estimate the parameter values under H_0 and H_a, we apply the EM algorithm introduced by [17], an iterative method for finding maximum likelihood estimates when the data can be viewed as incomplete or can be treated in a similar form, as with finite mixture models ([29] and [30]). To create the incomplete-data structure of the mixture model, define the unobserved indicators

z_ij = 1 if the ith observation comes from the jth component and z_ij = 0 otherwise, i = 1, ..., n, j = 1, ..., g. (9)

Therefore, the modified log-likelihood for the complete data (T, Z) is given by

ℓ̃_n^c(θ) = ∑_{i=1}^n ∑_{j=1}^g z_ij [log π_j + δ_i log f(t_i; α_j, β_j) + (1 − δ_i) log{1 − F(t_i; α_j, β_j)}] + p(π). (10)

Each EM iteration consists of two steps: the expectation step (E-step) and the maximization step (M-step). Starting from a randomly chosen initial value θ^(0), a sequence of estimates θ^(l), l ≥ 0, is constructed by alternating between the E-step and the M-step until the change in the log-likelihood between successive iterations is sufficiently small.
In the case of the mixture of BS distributions with θ = (π_j, α_j, β_j), j = 1, 2, ..., g, the E- and M-steps at the (l + 1)th iteration can be written as follows.

E-step: compute the conditional expectations of the component indicators given the observed data and the current estimate θ^(l),

z_ij^(l) = π_j^(l) {f(t_i; α_j^(l), β_j^(l))}^{δ_i} {1 − F(t_i; α_j^(l), β_j^(l))}^{1−δ_i} / ∑_{k=1}^g π_k^(l) {f(t_i; α_k^(l), β_k^(l))}^{δ_i} {1 − F(t_i; α_k^(l), β_k^(l))}^{1−δ_i},

so that the expected complete-data modified log-likelihood can be written as a function of θ plus a constant A independent of θ, with penalty p(π) = C_g ∑_{j=1}^g log(2π_j), where C_g is a positive constant suggested by [10].

Stat., Optim. Inf. Comput. Vol. 9, March 2021. W. EL-SHARKAWY AND M. ISMAIL

M-step: maximizing (10) with respect to α_j, β_j and π_j, j = 1, ..., g, yields the updated mixing weights

π_j^(l+1) = (∑_{i=1}^n z_ij^(l) + C_g) / (n + g C_g),

together with the score equations (11) and (12) for the component parameters. We obtain α_j^(l+1) and β_j^(l+1) by setting (11) and (12) to zero and solving them for α_j and β_j, j = 1, 2, ..., g.
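A simplified, runnable version of the E- and M-steps can be sketched as follows for complete data (δ_i = 1 for all i). Two hedges apply: the weighted component updates use a generic numerical optimizer rather than the paper's score equations (11) and (12), and the initialization is a crude quantile split; only the penalized mixing-weight update π_j = (∑_i z_ij + C_g)/(n + gC_g) follows directly from the penalty p(π) = C_g ∑_j log(2π_j). The paper's code was in R; this Python sketch is ours.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def bs_logpdf(y, alpha, beta):
    """Log-density of BS(alpha, beta)."""
    v = (np.sqrt(y / beta) - np.sqrt(beta / y)) / alpha
    jac = ((beta / y) ** 0.5 + (beta / y) ** 1.5) / (2.0 * alpha * beta)
    return norm.logpdf(v) + np.log(jac)

def em_bs_mixture(t, g=2, C_g=1.0, max_iter=100, tol=1e-6):
    """Penalized EM for a g-component BS mixture on COMPLETE data (a
    simplification of the paper's censored-data algorithm)."""
    t = np.asarray(t, dtype=float)
    n = len(t)
    # crude initialization: contiguous quantile blocks of the sorted sample
    blocks = np.array_split(np.sort(t), g)
    pi = np.full(g, 1.0 / g)
    alpha = np.full(g, 0.5)
    beta = np.array([np.median(b) for b in blocks])
    ll_old = -np.inf
    for _ in range(max_iter):
        # E-step: responsibilities z_ij, computed stably on the log scale
        logp = np.array([np.log(pi[j]) + bs_logpdf(t, alpha[j], beta[j])
                         for j in range(g)])                  # shape (g, n)
        m = logp.max(axis=0)
        z = np.exp(logp - m)
        z /= z.sum(axis=0)
        # penalized observed-data log-likelihood at the current iterate
        ll = (m + np.log(np.exp(logp - m).sum(axis=0))).sum() \
             + C_g * np.log(2.0 * pi).sum()
        # M-step: penalized mixing weights, then weighted component fits
        nj = z.sum(axis=1)
        pi = (nj + C_g) / (n + g * C_g)
        for j in range(g):
            def neg_wll(par, w=z[j]):
                a, b = par
                if a <= 0 or b <= 0:
                    return np.inf
                return -(w * bs_logpdf(t, a, b)).sum()
            res = minimize(neg_wll, x0=[alpha[j], beta[j]], method="Nelder-Mead")
            alpha[j], beta[j] = res.x
        if abs(ll - ll_old) < tol:
            break
        ll_old = ll
    return pi, alpha, beta, ll
```

Because the complete-data log-likelihood separates over components given the responsibilities, maximizing each weighted component log-likelihood individually is an exact M-step here.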

Size and power of the test
The size of the test, or significance level, is the probability of rejecting the null hypothesis when it is true; it is also called the Type I error rate. It is an important input to hypothesis testing, which controls the critical value and the power of the test and therefore has a significant effect on the inferential result. The power of the test, the probability that the test correctly rejects the null hypothesis when it is false, is an important aspect of a research study and helps in drawing conclusions about the factors that affect achieving a desired power level. To compute the size and power of the MLRT and the shortcut method of the bootstrap test, we first construct the empirical distribution of the MLRT statistic under the null hypothesis to estimate the critical values, and then construct the empirical distribution of the MLRT statistic under the null and alternative hypotheses to determine the size and power based on the estimated critical values. In short, two algorithms are needed for the size and power computation by simulation. The first, the "CV Algorithm", computes the estimated critical value Ĉ_α; the second, the "Size & Power Algorithm", computes the size and power using Ĉ_α. The MLRT and the shortcut method of the bootstrap test differ in some steps of the CV Algorithm, but the Size & Power Algorithm is the same for both. The CV Algorithm used in the MLRT generates a large number of samples under the null model, computes the MLRT statistic M_n on each, and takes the empirical (1 − α) quantile of these statistics as Ĉ_α. The empirical size and power are then computed as

size = (1/N) ∑_{r=1}^N I(M_n^{(r)} > Ĉ_α) (13)

and

power = (1/N) ∑_{r=1}^N I(M_n^{*(r)} > Ĉ_α), (14)

respectively, where M_n^{(r)} and M_n^{*(r)} denote the statistics computed from the rth of N samples generated under the null and alternative models, respectively, and I(·) is an indicator function.
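In generic form, the two algorithms can be sketched as below: a CV routine that takes the empirical (1 − α) quantile of the statistic simulated under H_0, and a size-and-power routine that computes the rejection rates (13) and (14) against the estimated critical value. The sketch is deliberately model-agnostic (the statistic and data generators are passed in as functions) and is illustrated with a toy Gaussian mean test rather than the BS mixture MLRT; all names are ours.

```python
import numpy as np

def critical_value(stat_fn, simulate_null, B=2000, alpha_level=0.05, rng=None):
    """CV Algorithm: simulate B data sets under H0, compute the test
    statistic on each, and return the empirical (1 - alpha) quantile."""
    rng = rng or np.random.default_rng(0)
    stats = np.array([stat_fn(simulate_null(rng)) for _ in range(B)])
    return np.quantile(stats, 1.0 - alpha_level)

def size_and_power(stat_fn, simulate_null, simulate_alt, c_hat, N=2000, rng=None):
    """Size & Power Algorithm: rejection rates I(stat > c_hat) under H0
    (empirical size, eq. 13) and under Ha (empirical power, eq. 14)."""
    rng = rng or np.random.default_rng(1)
    size = np.mean([stat_fn(simulate_null(rng)) > c_hat for _ in range(N)])
    power = np.mean([stat_fn(simulate_alt(rng)) > c_hat for _ in range(N)])
    return size, power
```

A toy usage with a one-sided Gaussian mean test: the statistic √n·x̄ has critical value near 1.645 at α = 5%, an easy check that the machinery is wired correctly.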

Simulation studies
In this section, we conduct extensive simulation studies to examine the performance of the MLRT and the shortcut method of the bootstrap test in controlling the Type I error rate, and to compare the power of the proposed tests while identifying the factors that affect it, for assessing the number of components in a mixture of BS distributions based on randomly right censored data. The simulations are performed using a program developed in R version 3.3.1.
To compute the size (13) and power (14), as illustrated in the Size & Power Algorithm of the previous section, for the hypotheses H_0 : g = 1 vs. H_a : g = 2 and H_0 : g = 2 vs. H_a : g = 3, we consider the sets of parameter combinations under the null and alternative models shown in Table 1. For each model, we set nominal significance levels α = 10%, 5%, 2.5%, 1%, censoring proportions p = 0%, 10%, 30%, 50% and sample sizes n = 25, 50, 100, 250. The proportion p = 0% corresponds to complete data, whereas the other values of p correspond to censored data. The number of simulation replicates is 10,000 for complete data and 1,000 for censored data. The suggested distribution for generating the random censoring times is the uniform distribution.
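One practical detail left implicit here is how to choose the upper bound of the uniform censoring distribution so that the expected censoring proportion matches a target p. Since P(T > C) = (1/θ)∫_0^θ S(c) dc for C ~ Uniform(0, θ), where S is the lifetime survival function, the bound can be found by root finding. This calibration step is our own assumption about how such targets could be met, sketched below.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def censoring_bound(surv, p_target, upper=100.0):
    """Find theta such that C ~ Uniform(0, theta) censors a fraction
    p_target of observations on average:
        p(theta) = P(T > C) = (1/theta) * int_0^theta S(c) dc,
    where surv(c) = S(c) is the lifetime survival function. p(theta)
    decreases monotonically in theta, so a sign-changing bracket for
    brentq exists whenever p_target is attainable below `upper`."""
    def p_of(theta):
        val, _ = quad(surv, 0.0, theta)
        return val / theta
    return brentq(lambda th: p_of(th) - p_target, 1e-6, upper)
```

For an exponential lifetime, S(c) = e^{−c}, the equation (1 − e^{−θ})/θ = p has a closed check, which makes a convenient unit test.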
The EM algorithm is used in our simulation to find the modified maximum likelihood estimates. It has the well-known drawback of getting trapped at local maxima; to avoid this, we used several random starting values and picked the solution with the highest likelihood, increasing the chance of reaching the global maximum.
In hypothesis testing, it is desirable that the empirical size of the proposed tests deviate little from the nominal significance levels and that the tests have high power. The estimated critical values used for the size and power computation are not presented, to save space. In Tables 2-5, we present the empirical size of the MLRT (C_g = 1) and the shortcut method. Overall, the empirical sizes are reasonably close to the nominal significance levels, although for small sample sizes and very high censoring levels the deviations are somewhat larger. Hence, we conclude that the performance of the MLRT statistic is reliable.
We are concerned with several factors that may influence the power of the test. The first is the Mahalanobis distance ∆ = |µ_1 − µ_2|/σ, which measures the separation between the two components of the mixture model, where µ_1 and µ_2 are the means of the two BS components in H_a and σ is the common standard deviation for the mixture of BS distributions. The other factors are the mixing proportion π, the censoring proportion p, the penalty constant C_g in the MLRT and the sample size n. In Table 1, models M1, M2, M3 and M6, M7, M8 are used to study the effect of ∆ on the power with H_a : g = 2 and H_a : g = 3, respectively.

Table 1. Parameter settings in the null and alternative models for the mixture of BS distributions with g_0 = 1 and g_0 = 2.

The results showing the effect of both ∆ and n on the power are presented in Tables 6 and 10 for the case of complete data. From Table 6, where the power results are reported for H_a : g = 2, we observe that the power increases with ∆; in other words, the proposed tests cannot detect the true model when ∆ is small. We also observe that the power increases with the sample size. In all methods, when ∆ = 1.1, the power is low for n = 25, 50, 100 and moderately high for n = 250. When ∆ = 2.5, the power is low for n = 25, 50, moderately high for n = 100 and high for n = 250. When ∆ = 4.4, the power is high for n = 25, 50 and equals one for n = 100, 250. That is, as ∆ decreases, the sample size needed to detect the true model increases.

In Table 10, the power results are reported for H_a : g = 3 based on complete data. As in Table 6, the power increases with ∆ and with the sample size. In all methods, when ∆ = 1.1, the power remains low even for n as large as 250, whereas when ∆ = 2.5, the power becomes moderately large at n = 250. When ∆ = 4.4, the power is moderately large even for a value as small as n = 25, approaches one for n = 50, 100 and equals one at n = 250. Comparing Tables 6 and 10, we observe that the power values in Table 6 are higher than those in Table 10. That is, when adjacent components are separated by the same ∆, it is harder to detect a mixture model with more components. The same conclusion was reached in [40].
Models M2, M4 and M5 in Table 1 are used to study the effect of the mixing proportion on the power for the case H_a : g = 2 based on complete data. The results are reported in Table 7. The power of the tests for models with equal mixing proportions is higher than for models with unequal mixing proportions. Moreover, the power for π = 0.1 is considerably larger than for π = 0.9.
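For reference, ∆ can be computed directly from the BS moments E[T] = β(1 + α²/2) and Var[T] = (αβ)²(1 + 5α²/4). In the sketch below we take σ to be the standard deviation of the two-component mixture, which is our reading of the paper's "common standard deviation"; the function names are ours.

```python
import numpy as np

def bs_mean_var(alpha, beta):
    """First two moments of BS(alpha, beta):
    E[T] = beta*(1 + alpha**2/2), Var[T] = (alpha*beta)**2 * (1 + 5*alpha**2/4)."""
    mean = beta * (1.0 + alpha**2 / 2.0)
    var = (alpha * beta) ** 2 * (1.0 + 5.0 * alpha**2 / 4.0)
    return mean, var

def separation(pi, alpha, beta):
    """Delta = |mu_1 - mu_2| / sigma for a two-component BS mixture,
    with sigma the mixture standard deviation computed from the
    law of total variance."""
    (m1, v1), (m2, v2) = bs_mean_var(alpha[0], beta[0]), bs_mean_var(alpha[1], beta[1])
    mix_mean = pi[0] * m1 + pi[1] * m2
    mix_var = pi[0] * (v1 + m1**2) + pi[1] * (v2 + m2**2) - mix_mean**2
    return abs(m1 - m2) / np.sqrt(mix_var)
```

Identical components give ∆ = 0, and widening the gap between the β's (at fixed α) increases ∆, matching the qualitative behavior of models M1-M3.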
We also study the effect of C_g on the power for C_g = 0, 1. For C_g = 0, the statistic M_n given in (8) reduces to the ordinary likelihood ratio statistic, whereas for C_g = 1 it is the modified likelihood ratio statistic. In Tables 6-10, we present the empirical power of the LRT (C_g = 0), the MLRT (C_g = 1) and the shortcut method for testing the number of components when H_a : g = 2 and H_a : g = 3. As these tables show, the power of the MLRT is slightly higher than that of the LRT, so the value of C_g does affect the power. The power of the shortcut method is higher than that of the MLRT, but its drawback is a higher computation time.
The results in Tables 8-9 show the effect of the censoring proportion p on the power of the proposed tests for H_a : g = 2. For fixed ∆ and sample size, the power decreases as the censoring proportion increases; for fixed ∆ and censoring proportion, the power increases with the sample size; and for fixed sample size and censoring proportion, the power increases with ∆. In all methods, when ∆ = 1.1, the power is moderately large for n = 250 and p = 10%, but for p = 30%, 50% it does not become moderately large even at n = 250. When ∆ = 4.4 and p = 10%, 30%, the power is high for n = 25, almost one for n = 50, 100 and equal to one for n = 250. The same behavior occurs when ∆ = 4.4 and p = 50%, except that the power is only moderately large at n = 25. Consequently, to detect the true model under random censoring, the sample size must increase as ∆ decreases and as the censoring proportion increases.

Application
We consider a real data set on failures of aircraft windshields given by [5] and analyzed previously by [34]. The data set contains 153 observations, classified as 88 failure times and 65 censored times. [34] proposed a twofold Weibull mixture model to fit the data. We instead suggest a mixture of BS distributions for these data.

Conclusion
In this paper, we addressed the problem of testing the number of components in a mixture of BS distributions under a random right censoring scheme. The MLRT and the shortcut method of the bootstrap test were discussed in the hypothesis testing procedure, and both were evaluated and compared. Extensive Monte Carlo simulation studies were performed to compute the empirical size and power of the tests. The simulation results show that the empirical size is quite close to the nominal significance level in most cases, indicating that the MLRT statistic performs well. They also show that the power obtained with the shortcut method is greater than that obtained with the MLRT. Moreover, the results help identify the factors that affect the power. The overall findings on the factors affecting the power in the case of H_a : g = 2 are compatible with the previous study of [18], and there is a strong indication of the influence of the Mahalanobis distance ∆ on the power. The sample size also has a direct and clear influence on the power, though its effect is smaller than that of ∆. For complete data and a low value of ∆, a large sample size is needed to obtain a power greater than 0.5, whereas if a high proportion of observations is censored, a much larger sample size is required to achieve the same power at the same value of ∆. However, the increase in power achieved by using complete data instead of censored data should be balanced against the time and cost of the experiment, which are greatly reduced when random censoring is used instead of complete sampling.