A flexible ranked set sampling scheme: Statistical analysis on scale parameter

A flexible ranked set sampling scheme that includes several existing sampling methods is proposed. This scheme may be used to reduce the ranking error and the cost of sampling. Based on the data obtained from this scheme, maximum likelihood estimation and the Fisher information are studied for the scale family of distributions. The existence and uniqueness of the maximum likelihood estimator of the scale parameter are investigated for the exponential and normal distributions. Moreover, the optimal scheme is derived via simulation and numerical computations.


Introduction
Ranked set sampling (RSS) was introduced by [14] as a method for unbiased selective sampling. In this scheme, k independent sets, each containing k units, are randomly selected from the population of interest. Next, the units of each set are visually sorted in ascending order and only the jth (j = 1, 2, ..., k) judgment order statistic of the jth set is measured. When the judgment ranking is accurate, the ranking is said to be perfect and the selected units are a sample of k independent order statistics. This scheme may be repeated over several cycles to attain a data set of reasonable size. [23] proved that the mean of RSS is an unbiased estimator of the population mean and that its variance is always smaller than that of simple random sampling (SRS). This method can reduce the cost of sampling and increase the accuracy of the results.
The same results were obtained by [12] by considering errors in judgment ranking. Some results on parametric RSS were presented by [22], [15], and [8]. The RSS scheme has been developed into various procedures. In this paper, we consider three existing types of ranked sampling schemes, which are briefly explained as follows:
• Extreme ranked set sampling (ERSS): This procedure was introduced by [19] for estimating the population mean. It is similar to RSS, but only the minimum or maximum of each set is measured, and it is assumed that they can be detected visually. Therefore, the judgment ranking error in ERSS is less than that of RSS. The procedure for obtaining such a sample depends on whether k is even or odd. When k is even, the lowest ranked units of the first k/2 sets and the largest ranked units of the last k/2 sets are measured. When k is odd, the minima of the first [k/2] sets and the maxima of the last [k/2] sets are selected, where [a] is the integer part of a; moreover, the median of the ((k + 1)/2)th set is measured.
Let f(·; θ) and F(·; θ) denote the pdf and cdf of the underlying population, respectively, where θ is a vector of parameters. In the following, the data sets and likelihood functions for the different sampling schemes with m cycles are presented.
• Let us denote the data set of RSS by $X^{(m)}_{RSS} = \{X_{1(1:k)1}, X_{2(2:k)1}, \ldots, X_{k(k:k)1}, \ldots, X_{1(1:k)m}, X_{2(2:k)m}, \ldots, X_{k(k:k)m}\}$, where $X_{j(j:k)i}$ is the jth order statistic in a random sample of size k in the jth set of the ith cycle, for i = 1, ..., m and j = 1, ..., k. The corresponding likelihood function is given by
$$L_{RSS}(\theta) = \prod_{i=1}^{m} \prod_{j=1}^{k} \frac{k!}{(j-1)!\,(k-j)!}\, f(x_{j(j:k)i}; \theta)\, [F(x_{j(j:k)i}; \theta)]^{j-1}\, [\bar{F}(x_{j(j:k)i}; \theta)]^{k-j},$$
where $\bar{F}(\cdot) = 1 - F(\cdot)$ and $x_{j(j:k)i}$ is the observed value of $X_{j(j:k)i}$. For more details, see, for example, [22] and [6]. Throughout this paper, the product $\prod_{i=1}^{0}$ is defined to be 1.
• When the set size k is even, the data set of an ERSS and the appropriate likelihood function are given in [3]. See also [1] and [9].
• Moreover, the likelihood function of the MRSS data set is expressed through $f_{j:k}(\cdot)$, the pdf of the jth order statistic in a random sample of size k from the pdf f(·).
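As previously mentioned, a FRSS data set is obtained when all the above-mentioned sampling schemes are performed together with different numbers of cycles. Therefore, the associated data set, denoted by $X_{FRSS}$, consists of the RSS, ERSS, MERSS type I, MERSS type II and MRSS data sets with $m_1$, $m_2$, $m_3$, $m_4$ and $m_5$ cycles, respectively. The sample size of a FRSS is equal to Mk, where $M = \sum_{i=1}^{5} m_i$. In practice, we can select a permutation of $(m_1, m_2, m_3, m_4, m_5)$ that reduces the ranking error or the cost of sampling. Here, $(m_1, \ldots, m_5)$ is called the FRSS scheme, and its optimality is studied in comparison with the SRS method in terms of maximum RE, taking the costs of sampling into account. Since the various data sets are obtained independently, the likelihood function of the FRSS scheme is the product of the likelihood functions of the five component data sets,
$$L_{FRSS}(\theta) = L_{RSS}(\theta)\, L_{ERSS}(\theta)\, L_{MERSS\,I}(\theta)\, L_{MERSS\,II}(\theta)\, L_{MRSS}(\theta). \qquad (1)$$
The MLE and FI under the FRSS scheme are studied in the next sections.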
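The sampling procedures described above can be sketched in code. The following is a minimal illustration under perfect ranking, covering only the RSS and ERSS components whose descriptions are given in this section. The function names (`rss_cycle`, `erss_cycle`, `frss_sample`) and the `draw` callback are hypothetical choices made for this sketch and are not part of the original scheme definition.

```python
import random


def rss_cycle(draw, k, rng):
    """One RSS cycle under perfect ranking.

    For j = 1..k, draw a fresh set of k units and keep its j-th smallest value.
    `draw(rng)` is an assumed callback returning one unit from the population.
    """
    sample = []
    for j in range(k):
        ranked = sorted(draw(rng) for _ in range(k))
        sample.append(ranked[j])
    return sample


def erss_cycle(draw, k, rng):
    """One ERSS cycle under perfect ranking.

    Minima of the first floor(k/2) sets, maxima of the last floor(k/2) sets,
    and, when k is odd, the median of the middle set.
    """
    half = k // 2
    sample = [min(draw(rng) for _ in range(k)) for _ in range(half)]
    if k % 2 == 1:
        ranked = sorted(draw(rng) for _ in range(k))
        sample.append(ranked[half])  # median of an odd-sized set
    sample.extend(max(draw(rng) for _ in range(k)) for _ in range(half))
    return sample


def frss_sample(draw, k, m_rss, m_erss, rng):
    """Pool m_rss RSS cycles and m_erss ERSS cycles (two FRSS components only)."""
    data = []
    for _ in range(m_rss):
        data.extend(rss_cycle(draw, k, rng))
    for _ in range(m_erss):
        data.extend(erss_cycle(draw, k, rng))
    return data
```

Each cycle measures k units, so pooling `m_rss + m_erss` cycles yields `(m_rss + m_erss) * k` measured values, in line with the sample size Mk stated above.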

MLE of scale family based on FRSS
We recall that, for a given pdf f(x; σ), σ is a scale parameter when $f(x; \sigma) = \frac{1}{\sigma} f_1\!\left(\frac{x}{\sigma}\right)$, where σ > 0 and $f_1(\cdot)$ is a pdf that does not depend on σ. Then, f(x; σ) is said to belong to a scale family. For more details, see, for example, Rohatgi and Ehsanes Saleh (2015, p. 196). There are many papers on the scale family, as well as on location and location-scale families of distributions, based on various RSS procedures. The work by [22] was one of the first papers on estimating the location and scale parameters of distributions using RSS. [21] compared the MLEs of location and scale parameters based on RSS, MRSS and ERSS. The MLE of the scale parameter and its properties were studied by [9] on the basis of the MERSS scheme. Here, we investigate the existence and uniqueness of the MLE in a scale family based on the FRSS procedure proposed in the previous section. In this case, using (1), the log-likelihood function of σ, denoted by l(σ), is given by (2). Now, if the MLE of σ exists, it satisfies the equation ∂l(σ)/∂σ = 0. In general, the solution of this equation does not have an explicit form. Nevertheless, we will show the existence and uniqueness of the MLEs of the scale parameters for the exponential and normal distributions.

Exponential distribution
Let the underlying distribution be exponential with mean λ. The exponential distribution is a well-known member of the scale family and is used in fields such as reliability studies and lifetime data analysis. In the following, the existence and uniqueness of the MLE of the scale parameter are investigated based on the FRSS scheme. Using (2), the likelihood equation, referred to as (3), is obtained.

Note that the left-hand side of equation (3) is a continuous function of λ. On the other hand, taking h(λ) = ∂l(λ)/∂λ, the function is positive as λ → 0 and $\lim_{\lambda \to \infty} h(\lambda) < 0$. Therefore, the likelihood equation in (3) has at least one solution. Furthermore, the derivative of the left-hand side of (3) is a negative expression for all λ. Hence, the solution of equation (3), and consequently the MLE of λ, is unique.
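Because the likelihood equation has no explicit solution in general, the MLE has to be found numerically. The sketch below does this for the RSS component only, under perfect ranking, using the order-statistic likelihood of the exponential distribution with mean λ. The golden-section search, the search bracket around the sample mean, and all function names are assumptions of this illustration, not the authors' procedure; the search relies on the uniqueness result stated above.

```python
import math
import random


def rss_exp_loglik(lam, data, k):
    """Log-likelihood (up to a constant) of RSS data from Exp(mean=lam).

    `data` is a list of (x, j) pairs: x is the measured value and j its rank
    within a set of size k.
    """
    total = 0.0
    for x, j in data:
        z = x / lam
        total += -math.log(lam) - (k - j + 1) * z
        if j > 1:
            total += (j - 1) * math.log(-math.expm1(-z))  # log(1 - exp(-z))
    return total


def maximize_golden(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:  # maximizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def rss_exp_mle(data, k):
    """Numerical MLE of lam, searching over log(lam) near the sample mean."""
    xbar = sum(x for x, _ in data) / len(data)
    centre = math.log(xbar)  # assumed bracket: exp(-5)*xbar .. exp(5)*xbar
    t_hat = maximize_golden(
        lambda t: rss_exp_loglik(math.exp(t), data, k), centre - 5.0, centre + 5.0
    )
    return math.exp(t_hat)


def simulate_rss_exp(lam, k, m, rng):
    """m cycles of perfectly ranked RSS from Exp(mean=lam), as (x, j) pairs."""
    data = []
    for _ in range(m):
        for j in range(1, k + 1):
            ranked = sorted(rng.expovariate(1.0 / lam) for _ in range(k))
            data.append((ranked[j - 1], j))
    return data
```

With k = 1 every observation has rank 1 and the log-likelihood reduces to the SRS case, so the numerical maximizer should agree with the sample mean; this gives a simple sanity check of the routine.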
In the next subsection, we investigate the existence and uniqueness of the MLE of the scale parameter of the normal distribution using $X_{FRSS}$.

Normal distribution
Suppose that a FRSS data set is collected from the normal distribution with mean zero and variance σ². Using (2), the MLE of σ is obtained by solving the likelihood equation (4), in which ϕ(·) and Φ(·) stand for the pdf and cdf of the standard normal distribution. Note that the left-hand side of (4) is a continuous function of σ, and converges to ∞ and to −Mk, a negative value, as σ tends to 0 and ∞, respectively. Hence, Eq. (4) has at least one solution. To investigate the uniqueness of the MLE, the second derivative of the log-likelihood function is decomposed into the terms given in (5), (6), (7) and (8). The expression on the right-hand side of (5) is exactly the same as the one on the left-hand side of (4), and so it is equal to zero at any root of Eq. (4). On the other hand, it is easily seen that $D_1$ in (6) is always negative. Also, by an inequality that holds for all x (see, for example, [9]), the expressions $D_2$ and $D_3$ in (7) and (8), respectively, are negative. Hence, the second derivative of the log-likelihood function is negative at every root of Eq. (4), so every root is a local maximum. Since two local maxima of a smooth function would have to be separated by a stationary point that is not a local maximum, this implies that the MLE of σ is unique.

Fisher information in FRSS
In this section, the FI contained in a FRSS data set is studied. It is easy to show that, under regularity conditions (see [11, p. 68]), the amount of FI in a FRSS data set about a vector of parameters θ equals the sum of the FI contained in its independently obtained component data sets. Let F(·; θ) be the cdf of the underlying population. [7] showed that the FI matrix contained in a RSS data set is given by (10), where $I_{SRS}(\cdot)$ stands for the FI matrix in a SRS. [15] compared the FI about the dependence parameter using RSS with that of SRS. [9] investigated the FI in the MERSS procedure for a scale family of distributions. Although there is no closed form for the FI in general, it can be obtained for the special case of the exponential distribution. [5, p. 166] Let $X_1, \ldots, X_k$ be a random sample from an exponential distribution with mean λ. Then, the FI contained in the jth order statistic, denoted by $I_{j:k}(\lambda)$, is available in explicit form. The FI in the various schemes can be obtained using Remark 1. Of course, the FI in the RSS procedure may also be computed using (10).

Remark 1
Now, using Remark 1, the FI in the ERSS, MERSS type I, MERSS type II and MRSS schemes can be obtained, and consequently so can the FI contained in $X_{FRSS}$. The following result follows directly from the asymptotic properties of the MLE.

Remark 2
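Let $\hat{\lambda}_{FRSS}$ be the MLE of λ based on the FRSS scheme with M cycles. Then the asymptotic distribution of $\hat{\lambda}_{FRSS}$ is given by $\sqrt{n}\,(\hat{\lambda}_{FRSS} - \lambda) \xrightarrow{d} N\!\left(0, I_{FRSS}^{-1}(\lambda)\right)$ as n → ∞, where n = Mk is the size of the measured data.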
Since the first and second derivatives of the log-likelihood function for the normal distribution have complicated functional forms, a closed form for the FI cannot be derived in this case. So, numerical methods are needed to compute it.
The numerical results are presented in Table 1.
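One simple numerical route to the FI in the normal case is Monte Carlo: simulate the jth order statistic, evaluate the score with respect to σ, and average its square. The sketch below does this for a single order statistic from N(0, σ²). Monte Carlo is an assumption of this illustration (the text does not state which numerical method was used), and the function names are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
SQRT_2PI = math.sqrt(2.0 * math.pi)


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / SQRT_2PI


def std_normal_cdf(z):
    # erfc keeps accuracy in the lower tail
    return 0.5 * math.erfc(-z / SQRT2)


def score_sigma(x, j, k, sigma):
    """d/d(sigma) of log f_{j:k}(x; sigma) for the j-th of k N(0, sigma^2) draws."""
    z = x / sigma
    s = -1.0 + z * z  # from the -log(sigma) and log phi(z) terms
    if j > 1:
        s -= (j - 1) * z * std_normal_pdf(z) / std_normal_cdf(z)
    if j < k:
        s += (k - j) * z * std_normal_pdf(z) / std_normal_cdf(-z)
    return s / sigma


def fisher_info_mc(j, k, sigma, reps, rng):
    """Monte Carlo estimate of the FI about sigma in the j-th order statistic."""
    total = 0.0
    for _ in range(reps):
        x = sorted(rng.gauss(0.0, sigma) for _ in range(k))[j - 1]
        total += score_sigma(x, j, k, sigma) ** 2
    return total / reps
```

For k = 1 the order statistic is a single observation, and the estimate should be close to the known value 2/σ² for one N(0, σ²) observation, which serves as a check. Summing `fisher_info_mc(j, k, ...)` over j = 1, ..., k gives an estimate of the FI in one RSS cycle under perfect ranking.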

Optimal FRSS scheme
In this section, the optimal scheme $m = (m_1, m_2, m_3, m_4, m_5)$ is determined by defining an RE criterion. To this end, we use two separate criteria: the efficiency of the estimators and the costs of sampling. The idea of [18] is used in this section. Following the philosophy of the RSS scheme, there are different costs: the cost of sampling one unit ($c_i$), the cost of quantifying the variable of interest for one unit ($c_q$), the cost of one pairwise comparison in the RSS and MRSS schemes ($c_{r1}$), the cost of one pairwise comparison in the ERSS scheme ($c_{r2}$) and the cost of one pairwise comparison in the MERSS type I and MERSS type II schemes ($c_{r3}$). From the philosophy of the schemes considered in this paper, $c_i$ is less than $c_q$ and $c_{r3} \le c_{r2} \le c_{r1}$. The total cost of the FRSS scheme can then be computed from these unit costs, where $g_1(k)$ stands for the number of pairwise comparisons needed for judgment ranking in both the RSS and MRSS schemes, and $g_2(k) \approx k - 1$ represents the number of pairwise comparisons needed for judgment ranking in the ERSS, MERSS type I and MERSS type II schemes. A total cost is likewise assigned to SRS. Now, suppose that $T_{FRSS}$ and $T_{SRS}$ are two estimators based on the FRSS and SRS schemes, respectively. Then, the RE of $T_{FRSS}$ with respect to $T_{SRS}$ is defined in (17), where $MSE(T) = E(T - \theta)^2$ stands for the mean squared error of the estimator T of the parameter θ. For more details, see also [24]. In the sequel, we compare the REs of the MLEs of the scale parameters of the exponential and normal distributions based on the FRSS and SRS schemes with the same size n = Mk.

Exponential distribution
Based on a simple random sample from the exponential distribution with mean λ, it can be easily shown that the MLE of λ is the sample mean, that is, $\hat{\lambda}_{SRS} = \bar{X}$, and that $\hat{\lambda}_{SRS}$ is an unbiased estimator with variance $\lambda^2/n$. On the other hand, using Remark 2, for large values of n we have approximately $Var(\hat{\lambda}_{FRSS}) = (n\, I_{FRSS}(\lambda))^{-1}$. Therefore, one can obtain $RE(\hat{\lambda}_{FRSS}, \hat{\lambda}_{SRS})$ using (17). To this end, several different costs have been considered. The REs are computed for M = 6 and k = 4, 6. Table 2 presents only the 10 permutations with the largest REs and the 10 permutations with the smallest REs.
From Table 2, the following outcomes can be deduced:
• The estimator $\hat{\lambda}_{FRSS}$ is more efficient than $\hat{\lambda}_{SRS}$ for many permutations of $m = (m_1, m_2, m_3, m_4, m_5)$.
• The schemes (0, 0, 5, 0, 1), (0, 0, 6, 0, 0) and (0, 0, 4, 0, 2) are common in Table 2 among the 10 permutations with the largest REs. So, one of them may be considered as the optimal FRSS scheme for estimating the mean of the exponential distribution.
• Using the entries of Table 2, the REs of different schemes for estimating the mean of the exponential distribution may also be compared with each other. For example, when k = 4, the RE of MERSS type I, the case m = (0, 0, 6, 0, 0), with respect to MERSS type II, the case m = (0, 0, 0, 6, 0), is 1.8466/0.8773 = 2.1049, which implies that MERSS type I is more efficient than MERSS type II.
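The search over schemes described in this subsection can be sketched as an exhaustive enumeration: list every $(m_1, \ldots, m_5)$ of non-negative integers summing to M, score each one, and keep the best and worst entries. In the sketch below, `re_fn` is a hypothetical user-supplied function returning the cost-adjusted RE of a scheme; its definition depends on (17) and on the chosen costs, so it is deliberately left as a parameter rather than implemented here.

```python
from itertools import product


def frss_schemes(M, parts=5):
    """All tuples (m_1, ..., m_parts) of non-negative integers with sum M."""
    return [m for m in product(range(M + 1), repeat=parts) if sum(m) == M]


def rank_schemes(schemes, re_fn, top=10):
    """Return the `top` schemes with the largest and the smallest re_fn values.

    `re_fn(m)` is an assumed callback mapping a scheme to its relative efficiency.
    """
    ranked = sorted(schemes, key=re_fn, reverse=True)
    return ranked[:top], ranked[-top:]
```

For M = 6 there are 210 candidate schemes, so evaluating all of them is cheap whenever `re_fn` itself is inexpensive, as in the exponential case where the asymptotic variance is available.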

Normal distribution
Suppose that the underlying distribution is N(0, σ²). It can be easily shown that the MLE of σ² based on the SRS method, denoted by $\hat{\sigma}^2_{SRS}$, has variance $2\sigma^4/n$. On the other hand, as mentioned in subsection 3.1, the MLE of σ² on the basis of the FRSS scheme, i.e. $\hat{\sigma}^2_{FRSS}$, does not have an explicit form. So, numerical simulations are required to study the behaviour of its MSE. The simulation algorithm is performed with $2 \times 10^5$ repetitions for the standard normal distribution. Moreover, to determine the values of $RE(\hat{\sigma}^2_{FRSS}, \hat{\sigma}^2_{SRS})$, several different costs have been considered. Values of $RE(\hat{\sigma}^2_{FRSS}, \hat{\sigma}^2_{SRS})$ are presented in Table 3 for some selected permutations of m. From Table 3, it is observed that:
• The estimator $\hat{\sigma}^2_{FRSS}$ is more efficient than $\hat{\sigma}^2_{SRS}$ for many permutations of m.
• The optimal scheme is a combination of the ERSS, MERSS type I and MERSS type II schemes. More precisely, when k = 4 the optimal schemes (0, 1, 5, 0, 0), (0, 1, 3, 2, 0) and (0, 4, 2, 0, 0) are common for different values of the costs in Table 3, and when k = 6 the common optimal schemes are (0, 1, 3, 2, 0) and (0, 0, 6, 0, 0).
• When k = 6, the RE of ERSS, m = (0, 6, 0, 0, 0), with respect to MERSS type I, m = (0, 0, 6, 0, 0), for estimating σ² in the normal distribution is 4.5179/4.4431 = 1.0168. The REs of other schemes may be derived similarly.
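The SRS side of this comparison can be checked by a small simulation: estimate the MSE of $\hat{\sigma}^2_{SRS}$ by Monte Carlo and compare it with the stated variance $2\sigma^4/n$. The sketch below is an illustration only; the function names are hypothetical and the number of repetitions is an assumption chosen for speed, far below the $2 \times 10^5$ used in the text.

```python
import random


def sigma2_mle_srs(sample):
    """MLE of sigma^2 for N(0, sigma^2) data: the mean of the squared values."""
    return sum(x * x for x in sample) / len(sample)


def mc_mse_sigma2_srs(n, sigma, reps, rng):
    """Monte Carlo MSE of the SRS-based MLE of sigma^2 for sample size n."""
    true_value = sigma ** 2
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += (sigma2_mle_srs(sample) - true_value) ** 2
    return total / reps


if __name__ == "__main__":
    rng = random.Random(2024)
    n = 6 * 4  # n = M k with M = 6 and k = 4
    print("simulated MSE:", mc_mse_sigma2_srs(n, 1.0, 20000, rng))
    print("theoretical  :", 2.0 / n)
```

The same loop structure applies to the FRSS estimator once a numerical maximizer of the FRSS likelihood is substituted for `sigma2_mle_srs`; dividing the two simulated MSEs, with the cost adjustment of (17), would then give the RE values of the kind reported in Table 3.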

Conclusions
In this paper, a FRSS scheme including ordinary RSS, ERSS, MERSS and MRSS was introduced and the associated likelihood function was derived. The MLE was considered in the scale family of distributions based on the FRSS. The existence and uniqueness of the MLEs of the scale parameters were investigated for the exponential and normal distributions. Moreover, the amount of FI about the parameter of interest was studied, and some results were presented in detail for the case of the exponential distribution. Different FRSS schemes were compared. To obtain the optimal FRSS scheme, a criterion was defined based on both efficiency and cost considerations. It was deduced that combining some existing sampling schemes increases the RE. Moreover, the optimal scheme depends on the underlying distribution and the parameter of interest. The proposed scheme can be extended to some other cases:
• Other versions of FRSS including more sampling schemes may be valuable to study.
• As previously mentioned, in the ERSS procedure the minima are recorded from one half of the sets and the maxima from the other half. If the maxima are observed in all sets, algebraic and numerical computations show that the amount of FI about the mean of the exponential distribution increases.