Shrinkage Difference-Based Liu Estimators In Semiparametric Linear Models

In this article, under a multicollinearity setting, we define difference-based Liu and non-Liu type shrinkage estimators, together with their positive parts, in the semiparametric linear model when the errors are dependent and some nonstochastic linear restrictions are imposed. We derive the biases and exact risk expressions of these estimators and obtain the region of optimality of each. Necessary and sufficient conditions for the superiority of the difference-based Liu estimator over its counterpart, and for the choice of the Liu parameter d, are also established. Finally, we illustrate the performance of these estimators with a simulation study.


Introduction
Semiparametric linear models have received considerable attention in statistics and econometrics. Semiparametric models are by design more flexible than standard linear regression models, since they combine both parametric and nonparametric components. In general, the semiparametric linear model is defined by

$$y = X\beta + f(t) + \epsilon, \qquad (1)$$

where $y = (y_1, \dots, y_n)'$, $X = (x_1, \dots, x_n)'$ is an $n \times p$ matrix, $f(t) = (f(t_1), \dots, f(t_n))'$ and $\epsilon = (\epsilon_1, \dots, \epsilon_n)'$. We assume that $\epsilon$ is a vector of disturbances distributed as multivariate normal, $N_n(0, \sigma^2 V)$, where $V$ is a known symmetric, positive definite matrix and $\sigma^2$ is an unknown parameter. It is assumed that $f(\cdot)$ is an unknown function, that the $t$'s have bounded support, say the unit interval, and that they have been reordered so that $t_1 \le t_2 \le \dots \le t_n$. Also, the first derivative of $f(\cdot)$ is bounded by a constant, say $L$. Most approaches for semiparametric linear models are based on different nonparametric regression procedures, and several methods have been proposed to estimate $\beta$ and $f(\cdot)$. Surveys regarding the estimation and application of model (1) can be found in the monograph of Hardle et al. [8] and in Muller [15].
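To fix ideas, here is a minimal Python sketch that simulates data from model (1). The smooth function f, the AR(1)-type covariance V, the dimensions and the coefficient values are illustrative assumptions, not the paper's settings.

```python
# A minimal simulation sketch of model (1): y = X beta + f(t) + eps,
# with eps ~ N_n(0, sigma^2 V).  All concrete choices below (f, V, n, p,
# beta, sigma) are toy assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 100, 4, 0.5

t = np.sort(rng.uniform(0.0, 1.0, size=n))   # ordered design points in [0, 1]
X = rng.normal(size=(n, p))                  # parametric design matrix
beta = np.array([1.0, 0.0, -1.0, 1.0])       # true linear coefficients

f = np.sin(2 * np.pi * t)                    # an unknown smooth function (toy choice)

# AR(1)-type dependence structure for the disturbances (illustrative V)
rho = 0.4
V = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
eps = rng.multivariate_normal(np.zeros(n), sigma**2 * V)

y = X @ beta + f + eps
```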
In linear regression analysis, one of the standard assumptions is that the explanatory variables are linearly independent. When this assumption is violated, the problem of multicollinearity enters the data. The presence of multicollinearity may produce estimates with wrong signs, or linear combinations of the parameters may lead to wide confidence intervals for the individual parameters. Alternative estimators, which are generally biased, have been proposed to combat multicollinearity. Among them, the ridge estimator of Hoerl and Kennard [9] and the Liu estimator (see Liu [11], [12]), which combined the Stein [22] estimator with the ridge estimator, have received a great deal of attention in the statistical literature. It is well known that incorporating prior information available in the form of restrictions provides better estimators than the ordinary estimator. However, when the prior information is doubtful, one may combine the restricted and ordinary estimators to obtain a better estimator, which leads to the preliminary test and Stein-type shrinkage estimators. Recently these estimation techniques have been extended to the semiparametric regression model (1). The approach of preliminary test estimation was pioneered in Judge and Bock [10], Benda [5], Saleh [20], Saleh and Kibria [21], and very recently in Yuzbasi et al. [27, 28] and Wu and Asar [25]. The main goal of shrinkage estimators is to develop the tools needed for computing the risk function of the regression coefficients in a semiparametric linear model based on the eigenvalues of the design matrix. Here we look for a new estimator of the shrinkage parameter by making use of the existing ones in the literature. Our method is based on the differencing technique (see Yatchew [26] and Brown [6]); compared with the estimators mentioned above, it avoids the choice of a bandwidth. We consider difference-based shrinkage Liu estimators in comparison with the difference-based restricted Liu estimator, and we give theoretical conditions that determine superiority among the estimation techniques in the mean squared error matrix sense. The paper is organized as follows. In Section 2, the model and the difference-based estimators are defined, and the difference-based shrinkage estimators are introduced. In Section 3, the bias and risk functions of the proposed estimators are obtained. In Section 4, the least/most values of the Liu parameter are identified for which the difference-based shrinkage Liu estimators dominate each other. Section 5 contains the simulation studies. Finally, some concluding remarks are stated in Section 6.

The model and difference-based shrinkage estimators
In this section, using the differencing technique, we remove the nonparametric component of the semiparametric linear model and then propose some shrinkage estimators of the linear parameter in this model. The difference-based technique has been used by various authors. Let $\wp_0, \dots, \wp_m$ be a difference sequence satisfying

$$\sum_{j=0}^{m} \wp_j = 0, \qquad \sum_{j=0}^{m} \wp_j^2 = 1.$$

Now we define the $(n-m) \times n$ differencing matrix $D$ to have first and last rows

$$(\wp_0, \dots, \wp_m, \mathbf{0}'_{n-m-1}) \quad \text{and} \quad (\mathbf{0}'_{n-m-1}, \wp_0, \dots, \wp_m),$$

where $\mathbf{0}_r$ is an $r$-vector of zeros. Applying the differencing matrix to (1), we have

$$Dy = DX\beta + Df(t) + D\epsilon, \quad \text{i.e.,} \quad y_D \approx X_D\beta + \epsilon_D, \qquad (2)$$

where $y_D = Dy$, $X_D = DX$ and $\epsilon_D = D\epsilon$. Since $f$ is an unknown function with bounded first derivative, $Df \approx 0$. To estimate $\beta$ in (2), we use the difference-based generalized least squares estimator (GLSE) given by

$$\hat\beta_G = C^{-1} X_D'(DVD')^{-1} y_D, \qquad C = X_D'(DVD')^{-1}X_D.$$

Our primary interest is to estimate the linear parameters when it is a priori suspected, but not certain, that $\beta$ may be restricted to the subspace $H\beta = h$, where $H$ is a $q \times p$ nonzero matrix with rank $q < p$ and $h$ is a $q \times 1$ vector. The difference-based generalized restricted estimator (GRE) of $\beta$ is given by

$$\hat\beta_{GR} = \hat\beta_G - C^{-1}H'(HC^{-1}H')^{-1}(H\hat\beta_G - h).$$

In fact, the coefficient vector $\beta$ can be regarded as a vector in $p$-dimensional space. If there is multicollinearity in $X$ (or, equivalently, $X_D$ is ill conditioned), the coefficients estimated by the least squares method may lie far from the actual parameter in some directions of this space, since the variance of $\hat\beta_{GR}$ depends on the matrix $C$. To overcome the multicollinearity, following Akdeniz et al. [1], Swamy and Mehta [23], Swamy et al. [24], Roozbeh and Arashi [18] and Zhang and Yang [30], we minimize the sum of squared residuals subject to a spherical restriction ($\beta^\top\beta \le \rho^2$) together with the linear restriction. By using the Lagrangian method on the resulting generalized optimization problem (4), we obtain the difference-based generalized restricted Liu estimator (GRLE)

$$\hat\beta_{GR}(d) = R_d\,\hat\beta_{GR}, \qquad R_d = (C + I_p)^{-1}(C + dI_p),$$

where $\hat\beta_G(d) = R_d\,\hat\beta_G$ is the difference-based generalized unrestricted Liu estimator (GULE) and $d \ge 0$ is the Liu parameter.
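Continuing the simulation sketch above, the following Python block illustrates the construction. The first-order difference sequence, the toy restriction (H, h) and the value of d are assumptions; the GLSE and GRE formulas are the standard generalized least squares forms reconstructed in the text, not a verbatim transcription of the paper's lost displays.

```python
# Sketch of the difference-based estimators (continues the earlier block:
# reuses n, p, y, X, V).  The difference sequence, restriction and d are
# illustrative assumptions.
import numpy as np

def differencing_matrix(n, d_seq):
    """(n - m) x n matrix whose rows are shifted copies of the sequence."""
    m = len(d_seq) - 1
    D = np.zeros((n - m, n))
    for i in range(n - m):
        D[i, i:i + m + 1] = d_seq
    return D

d_seq = np.array([1.0, -1.0]) / np.sqrt(2.0)     # sums to 0, squares sum to 1
D = differencing_matrix(n, d_seq)

yD, XD = D @ y, D @ X
VD = D @ V @ D.T                                 # covariance of differenced errors
VD_inv = np.linalg.inv(VD)

C = XD.T @ VD_inv @ XD                           # the matrix C of the paper
beta_G = np.linalg.solve(C, XD.T @ VD_inv @ yD)  # difference-based GLSE

# Toy restriction H beta = h, here a single linear constraint
H = np.array([[1.0, 1.0, 0.0, 0.0]])
h = np.array([1.0])
C_inv = np.linalg.inv(C)
M = C_inv @ H.T @ np.linalg.inv(H @ C_inv @ H.T)
beta_GR = beta_G - M @ (H @ beta_G - h)          # difference-based GRE

# Liu versions: premultiply by R_d = (C + I)^(-1)(C + d I)
d = 0.7
Rd = np.linalg.solve(C + np.eye(p), C + d * np.eye(p))
beta_G_d = Rd @ beta_G                           # GULE
beta_GR_d = Rd @ beta_GR                         # GRLE
```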
From Saleh [20], a likelihood ratio criterion $\mathcal{L}_n$ is available for testing the null hypothesis $H\beta = h$. Under the null hypothesis and normal theory, $\mathcal{L}_n$ follows a central F-distribution with $(q, n-p)$ degrees of freedom, while under the alternative it follows the non-central F-distribution with $(q, n-p)$ degrees of freedom and non-centrality parameter $\frac{1}{2}\Delta^*$. In many practical situations, along with the model, one may suspect that $\beta$ belongs to the subspace defined by $H\beta = h$. In such situations we combine the difference-based GULE and GRLE to obtain the difference-based preliminary test generalized restricted Liu estimator (PTGRLE):

$$\hat\beta^{PT}_{GR}(d) = \hat\beta_{GR}(d)\, I\!\left(\mathcal{L}_n \le F_{q,n-p}(\alpha)\right) + \hat\beta_G(d)\, I\!\left(\mathcal{L}_n > F_{q,n-p}(\alpha)\right),$$

where $F_{q,n-p}(\alpha)$ is the upper $\alpha$-level critical value of the central F-distribution and $I(\cdot)$ is the indicator function. The difference-based PTGRLE has the disadvantage that it depends on $\alpha$, the level of significance, and it yields the extreme results $\hat\beta_{GR}(d)$ or $\hat\beta_G(d)$ depending on the outcome of the test. Later we will discuss in detail the difference-based Stein-type generalized restricted Liu estimator (SGRLE), defined by

$$\hat\beta^{S}_{GR}(d) = \hat\beta_{GR}(d) + \left(1 - d\mathcal{L}_n^{-1}\right)\left(\hat\beta_G(d) - \hat\beta_{GR}(d)\right),$$

the Liu-type analogue of the difference-based Stein-type generalized restricted estimator (SGRE). Since the shrinkage factor $(1 - d\mathcal{L}_n^{-1})$ becomes negative for $\mathcal{L}_n < d$, and the difference-based SGRLE behaves erratically for small values of $n$, we define the difference-based positive-rule Stein-type generalized restricted Liu estimator (PRSGRLE) as

$$\hat\beta^{S+}_{GR}(d) = \hat\beta_{GR}(d) + \left(1 - d\mathcal{L}_n^{-1}\right) I\!\left(\mathcal{L}_n > d\right)\left(\hat\beta_G(d) - \hat\beta_{GR}(d)\right),$$

the Liu-type analogue of the positive-rule Stein-type generalized restricted estimator (PRSGRE). The main objective of this study is to examine the performance of the difference-based SGRLE and PRSGRLE. Recently, the shrinkage estimation technique has been considered in various models by several researchers; see, for instance, Arashi [2], Arashi et al. [3], Roozbeh [17], Roozbeh and Arashi [4] and Norouzirad and Arashi [13], among others.
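Continuing the sketch, the block below computes a test statistic and the three combined estimators. Since the exact display of $\mathcal{L}_n$ was lost in extraction, the usual GLS F-statistic for $H\beta = h$ is used here as a stand-in assumption, and the shrinkage factor $(1 - d\mathcal{L}_n^{-1})$ is applied between the Liu-type restricted and unrestricted estimators as reconstructed above.

```python
# Test statistic and shrinkage estimators (continues the earlier blocks:
# reuses yD, XD, VD_inv, C_inv, beta_G, beta_G_d, beta_GR_d, H, h, p, d).
# The F-type statistic below is an assumed stand-in for the paper's L_n.
import numpy as np
from scipy.stats import f as f_dist

q = H.shape[0]
resid = yD - XD @ beta_G
s2 = (resid @ VD_inv @ resid) / (len(yD) - p)   # error-variance estimate
diff = H @ beta_G - h
L_n = (diff @ np.linalg.solve(H @ C_inv @ H.T, diff)) / (q * s2)

alpha = 0.05
F_crit = f_dist.ppf(1 - alpha, q, len(yD) - p)  # the paper writes (q, n - p)

# Preliminary-test estimator: pick GRLE or GULE according to the test.
beta_PT = beta_GR_d if L_n <= F_crit else beta_G_d

# Stein-type estimator: smooth shrinkage between GRLE and GULE.
shrink = 1.0 - d / L_n
beta_S = beta_GR_d + shrink * (beta_G_d - beta_GR_d)

# Positive-rule version: truncate the factor at zero when L_n < d.
beta_S_plus = beta_GR_d + max(shrink, 0.0) * (beta_G_d - beta_GR_d)
```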

Bias and Risk functions
In this section, we provide expressions for the bias and the quadratic risk of the estimators $\hat\beta^{S}_{GR}(d)$ and $\hat\beta^{S+}_{GR}(d)$.

Biases of the estimators
Here we first present expressions for the biases of the difference-based SGRLE and the difference-based PRSGRLE.

Theorem 3.1. The biases of the SGRLE and PRSGRLE, respectively, are given by:

Risk of estimators
In this subsection we present the quadratic risk function. Suppose $\beta^*$ denotes an estimator of $\beta$; then, for a given non-singular matrix $Q$, the loss function is defined as

$$L(\beta^*, \beta) = (\beta^* - \beta)'\, Q\, (\beta^* - \beta),$$

and the corresponding risk function of the estimator $\beta^*$ is defined as

$$R(\beta^*, \beta) = E\, L(\beta^*, \beta) = \operatorname{tr}\!\left(Q\,M(\beta^*)\right),$$

where $M(\beta^*)$ is the mean squared error matrix of the estimator $\beta^*$.

Theorem 3.2. The risks of the SGRLE and PRSGRLE, respectively, are given by:
Proof. A key matrix in the argument is symmetric and idempotent of rank $q \le p$, so a suitable orthogonal decomposition exists; the blocks $A_{11}$ and $A_{22}$ are of order $q$ and $p - q$, respectively. Now define the random vector $w = (w_1', w_2')'$, where $w_1$ and $w_2$ are independent sub-vectors of order $q$ and $p - q$, respectively, with $w_1 \sim N_q(\eta_1, \sigma^2 I_q)$ and $w_2 \sim N_{p-q}(\eta_2, \sigma^2 I_{p-q})$. From this representation the required expectations follow. Therefore, by making use of [10] and (5), the corresponding terms can be written, and gathering all terms yields the risk of $\hat\beta^{S}_{GR}(d)$ given by (9); the final form follows from the identities obtained through (6) and (5).
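As a sanity check on the definitions above, the quadratic risk can be approximated by Monte Carlo averaging of the loss over replications. The helper below is a minimal sketch, with Q = I as the default weighting; the function name is illustrative.

```python
# Monte Carlo approximation of the quadratic risk
# R(beta*) = E[(beta* - beta)' Q (beta* - beta)] = tr(Q M(beta*)).
import numpy as np

def empirical_risk(estimates, beta_true, Q=None):
    """Average quadratic loss over replicated estimates (one per row)."""
    E = np.asarray(estimates) - beta_true
    if Q is None:                       # unweighted case, Q = I
        return np.mean(np.sum(E * E, axis=1))
    # row-wise quadratic forms E_i' Q E_i, then averaged
    return np.mean(np.einsum('ij,jk,ik->i', E, Q, E))
```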

Comparison of the estimators
In this section, we compare the underlying estimators. We compare the shrinkage-type estimators of $\beta$ on the basis of the risk criterion, as a function of the departure parameter; the comparisons require studying the derivatives of the Liu shrinkage risks with respect to $d$, and this procedure is adopted throughout the section. Since $C$ is a positive definite matrix, we can find an orthogonal matrix $\Gamma$ such that $C = \Gamma\Lambda\Gamma'$, where $\Lambda = \Gamma' C \Gamma = \mathrm{Diag}(\lambda_1, \lambda_2, \dots, \lambda_p)$ and $\lambda_1, \dots, \lambda_p > 0$ are the eigenvalues of $C$. It is easy to see that the eigenvalues of $R_d = (C + I_p)^{-1}(C + dI_p)$ and $C_d = C + I_p$ are $(\lambda_i + d)/(\lambda_i + 1)$ and $\lambda_i + 1$, $i = 1, \dots, p$, respectively. With this background, the identities needed below are obtained.

Let us now compare the performance of the Liu shrinkage estimators with their usual counterparts. Comparison results concerning the GULE, GRLE and PTGRLE are well known in the literature, so we focus on the SGRLE and PRSGRLE in the sequel. The superiority of the SGRLE and PRSGRLE over the other proposed estimators in semiparametric linear models is established in the following theorems.

Theorem 4.1.i. There always exists a positive $d \in (0, d^*_1)$, respectively $d \in (0, d^*_2)$, such that the SGRLE has smaller risk value than the SGRE under $H_0: H\beta = h$ and $H_A: H\beta \ne h$, respectively.

Theorem 4.1.ii. A sufficient condition for the SGRLE to have risk value less than or equal to that of the GULE is that $d \in (0, d^*_3)$, where $d^*_3$ is given by (14).

Theorem 4.1.iii. A sufficient condition for the SGRLE to have smaller risk value than the GRLE is that $d \in (0, d^*_4)$.

Theorem 4.1.iv. A sufficient condition for the SGRLE to have smaller risk value than the PTGRLE is that $d \in (0, d^*_5)$.

Theorem 4.1.v. There always exists a positive $d \in (0, d^*_6)$, respectively $d \in (0, d^*_7)$, such that the PRSGRLE has smaller risk value than the PRSGRE under $H_0: H\beta = h$ and $H_A: H\beta \ne h$, respectively.

Theorem 4.1.vi. The PRSGRLE has smaller risk value than the SGRLE for all positive $d$.

Theorem 4.1.vii. A sufficient condition for the PRSGRLE to have risk value less than or equal to that of the GULE is that $d \in (0, d^*_3)$, where $d^*_3$ is given by (14).

Theorem 4.1.viii. A sufficient condition for the PRSGRLE to have smaller risk value than the GRLE is that $d \in (0, d^*_8)$.

Theorem 4.1.ix. A sufficient condition for the PRSGRLE to have smaller risk value than the PTGRLE is that $d \in (0, d^*_9)$.
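The spectral facts just stated are easy to verify numerically. The block below, reusing C, d and p from the earlier sketch, is a quick check rather than part of the method.

```python
# Numerical check: if C = Gamma Lambda Gamma', then
# R_d = (C + I)^(-1)(C + d I) has eigenvalues (lambda_i + d)/(lambda_i + 1)
# and C_d = C + I has eigenvalues lambda_i + 1.
import numpy as np

lam, Gamma = np.linalg.eigh(C)                      # C from the earlier sketch
Rd = np.linalg.solve(C + np.eye(p), C + d * np.eye(p))
Rd_eig = np.sort(np.linalg.eigvals(Rd).real)
assert np.allclose(Rd_eig, np.sort((lam + d) / (lam + 1.0)))
assert np.allclose(np.linalg.eigvalsh(C + np.eye(p)), np.sort(lam + 1.0))
```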

Proof
We prove only the first and last theorems; the proofs of the others are similar and require only straightforward calculation.
Proof of Theorem 4.1.i. Using Eqs. (11)-(13) under $H_0: H\beta = h$, the risk expression is written down and differentiated with respect to $d$; a sufficient condition for the derivative (16) to be negative is that $0 < d < d^*_1$. Now, under $H_A: H\beta \ne h$, the risk expression involves $\delta_i$, the $i$-th element of $\delta$; differentiating with respect to $d$, a sufficient condition for (17) to be negative is that $0 < d < d^*_2$. Thus, $\hat\beta^{S}_{GR}(d)$ has risk value less than that of the SGRE for $0 < d < d^*_2$.

Proof of Theorem 4.1.ix. By making use of the risk difference between the PRSGRLE and the PTGRLE, we find that the difference is non-positive ($\le 0$) whenever $0 < d < d^*_9$.

Simulation Study
In this section, we examine the risk and bias performance of the proposed estimators numerically. To achieve different degrees of collinearity, following McDonald [14] and Gibbons [7], the explanatory variables were generated using the device

$$x_{ij} = (1 - \gamma^2)^{1/2} z_{ij} + \gamma z_{i,p+1}, \qquad i = 1, \dots, n,\; j = 1, \dots, p,$$

where the $z_{ij}$ are independent standard normal pseudo-random numbers and $\gamma$ is specified so that the correlation between any two explanatory variables is given by $\gamma^2$.
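A minimal Python sketch of this collinearity device follows; the helper name and dimensions are illustrative assumptions.

```python
# Collinear design generation: pairwise correlation between any two
# columns of X is gamma^2, each column has unit variance.
import numpy as np

def collinear_design(n, p, gamma, rng):
    """McDonald-Galarneau-type device: shared component induces correlation."""
    Z = rng.normal(size=(n, p + 1))
    return np.sqrt(1.0 - gamma**2) * Z[:, :p] + gamma * Z[:, [p]]

rng = np.random.default_rng(1)
X_sim = collinear_design(100, 4, 0.95, rng)   # gamma = 0.95, as in Tables 5-6
print(np.corrcoef(X_sim, rowvar=False).round(2))
```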
The parameter vector $\beta$, the matrix $H$ and the vector $h$ of the restriction $H\beta = h$ are chosen, respectively, as $\beta = (1, 0, -1, 1)'$, a conformable restriction matrix $H$, and $h = (0, 0, 0)'$.


Figure 1. The diagrams of $R(\cdot)$ and $\Delta_d$ versus $d$ for different values of $\gamma$.

Figure 2.


Table 2. Evaluation of the PRSGRLE at different $d$ values in model (20) with $\gamma = 0.75$.

Table 5. Evaluation of the SGRLE at different $d$ values in model (20) with $\gamma = 0.95$.

Table 6. Evaluation of the PRSGRLE at different $d$ values in model (20) with $\gamma = 0.95$.