A simulation study of semiparametric estimation in copula models based on minimum Alpha-Divergence

The purpose of this paper is to introduce two semiparametric methods for estimating the copula parameter. These methods are based on the minimum Alpha-Divergence between a nonparametric estimate of the copula density, obtained by the local likelihood probit transformation method, and the true copula density function. A Monte Carlo study is performed to measure the performance of these methods based on the Hellinger distance and the Neyman divergence as special cases of the Alpha-Divergence. Simulation results are compared to Maximum Pseudo-Likelihood (MPL) estimation, a conventional estimation method, in well-known bivariate copula models. The results show that the proposed Minimum Pseudo Hellinger Distance estimator performs well for small sample sizes and weak dependence. The parameter estimation methods are applied to a real data set in Hydrology.


Introduction
Copulas describe the dependence between the components of a random vector. Unlike marginal and joint distributions, which are directly observable, the copula of a random vector is a hidden dependence structure that connects the joint distribution with its margins. The copula parameter captures the inherent dependence between the marginal variables, and it can be estimated by either parametric or semiparametric methods. Maximum likelihood estimation (MLE), which can be used to estimate the parameters of any type of model, is the most effective method. It can also be applied to copulas, but the problem becomes complicated as the number of parameters and the dimension of the copula increase, because the parameters of the margins and of the copula are estimated simultaneously. As a consequence, MLE is highly affected by misspecification of the marginal distributions.
A rather straightforward alternative, at the cost of some efficiency, is the method of inference functions for margins (IFM), put forward by Joe [15]. As with MLE, the margins matter in this method, because the parameter estimates depend on the choice of the marginal distributions. In the IFM method, the parameters are estimated in two stages: first the parameters of the margins are estimated, and then the parameters of the copula are estimated given the values from the first stage. Genest et al. [12] introduced a semiparametric method, known as maximum pseudo-likelihood (MPL) estimation, that is similar to MLE. The only difference between this method and MLE is that the data are first converted to pseudo observations. The consistency and asymptotic normality of this estimator are established in their paper, where it is also shown that the method is efficient for the independence copula. The results of an extensive simulation study by Kim et al. [16] show that the ML and IFM methods are non-robust against misspecification of the marginal distributions.

By Sklar's theorem, the joint distribution function $F$ of a pair $(X, Y)$ of continuous random variables can be written as
\[ F(x, y) = C\big(F_X(x), F_Y(y)\big), \tag{1} \]
where $F_X$ and $F_Y$ are the marginal distributions of $X$ and $Y$, respectively. A bivariate copula function $C$ is the cumulative distribution function of a random vector $(U, V)$, defined on the unit square $[0, 1]^2$, with uniform marginal distributions, where $U = F_X(X)$ and $V = F_Y(Y)$. The authors shall write $C(u, v; \theta)$ for a family of copulas indexed by the parameter $\theta$. If $C(u, v; \theta)$ is an absolutely continuous copula distribution on $[0, 1]^2$, then its density function is
\[ c(u, v; \theta) = \frac{\partial^2 C(u, v; \theta)}{\partial u\, \partial v}. \]
As a result, the relationship between the copula density function $c$ and the joint density function $f$ of $(X, Y)$ according to equation (1) can be represented as
\[ f(x, y) = c\big(F_X(x), F_Y(y); \theta\big)\, f_X(x)\, f_Y(y), \tag{2} \]
where $f_X$ and $f_Y$ are the marginal density functions of $X$ and $Y$, respectively. Table 1 presents summary information on some well-known bivariate copulas, such as their parameter spaces and Kendall's tau ($\tau$).
In this table, the Clayton, Gumbel, and Frank copulas belong to the class of Archimedean copulas, and the Gaussian and T copulas belong to the class of Elliptical copulas. The copula-based Kendall's tau for continuous variables $X$ and $Y$ with copula $C$ is given by
\[ \tau = 4 \int_0^1 \!\! \int_0^1 C(u, v)\, dC(u, v) - 1. \]
Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be a random sample of size $n$ from a pair $(X, Y)$. The empirical copula, initially introduced by [8], is defined as
\[ C_n(u, v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\big(\tilde U_i \le u,\ \tilde V_i \le v\big), \tag{3} \]
where $\tilde U_i = n \hat F_X(x_i)/(n + 1)$ and $\tilde V_i = n \hat F_Y(y_i)/(n + 1)$, for $i = 1, \dots, n$, are the pseudo observations, and $\hat F_X$ and $\hat F_Y$ are the empirical cumulative distribution functions of the observations $X_i$ and $Y_i$, respectively.
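The pseudo observations and the empirical copula can be illustrated with a minimal sketch. It assumes that there are no ties in the data, so that n times the empirical distribution function at an observation equals its rank; the function names and the toy data are hypothetical and serve only as an illustration.

```python
def pseudo_observations(x):
    """Rescaled ranks R_i / (n + 1), i.e. n * F_hat(x_i) / (n + 1); assumes no ties."""
    n = len(x)
    ranks = [0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda k: x[k]), start=1):
        ranks[i] = rank
    return [r / (n + 1) for r in ranks]


def empirical_copula(u_tilde, v_tilde, u, v):
    """Share of pseudo observations falling in the rectangle [0, u] x [0, v]."""
    n = len(u_tilde)
    return sum(1 for a, b in zip(u_tilde, v_tilde) if a <= u and b <= v) / n


# Toy data (hypothetical values, for illustration only).
x = [2.3, 0.7, 1.9, 4.1]
y = [10.0, 12.5, 11.0, 15.2]
u_t = pseudo_observations(x)   # ranks 3, 1, 2, 4 divided by 5
v_t = pseudo_observations(y)   # ranks 1, 3, 2, 4 divided by 5
c_n = empirical_copula(u_t, v_t, 0.5, 0.5)
```

Dividing by n + 1 rather than n keeps every pseudo observation strictly inside the unit square, which matters later when copula densities that are unbounded at the corners are evaluated.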

Semiparametric maximum likelihood estimation
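In view of (2), the log-likelihood function takes the form
\[ L(\theta) = \sum_{i=1}^{n} \log\Big\{ c\big(F_X(x_i), F_Y(y_i); \theta\big)\, f_X(x_i)\, f_Y(y_i) \Big\}. \]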
Hence the MLE of $\theta$, which we denote by $\hat\theta_{ML}$, is the global maximizer of $L(\theta)$, and $\sqrt{n}(\hat\theta_{ML} - \theta)$ converges to a Gaussian distribution with mean zero, where $\theta$ is the true value. Since we assume that the model is correctly specified and hence $L(\theta)$ is the correct log-likelihood, it follows that the MLE enjoys some optimality properties and hence is the preferred first option. If the model is not correctly specified, so that $L(\theta)$ is not the correct log-likelihood, then the maximizer of $L(\theta)$ is not the MLE and hence it may lose its preferred status.
In the MPL method, the marginal distributions have unknown functional forms and are estimated nonparametrically by their empirical distribution functions. Then $\theta$ is estimated by the maximizer of the pseudo log-likelihood,
\[ \hat\theta_{MPL} = \arg\max_{\theta} \sum_{i=1}^{n} \log c\big(\tilde U_i, \tilde V_i; \theta\big), \tag{4} \]
where $(\tilde U_i, \tilde V_i)$, $i = 1, \dots, n$, are the pseudo observations. The authors shall refer to (4) as the maximum pseudo-likelihood (MPL) estimator of $\theta$. [12] and [29] showed that $\hat\theta_{MPL}$ is a consistent estimator. This nonlinear optimization problem can be solved numerically, for example in the statistical programming language R or in Mathematica.

Notes to Table 1: † $\Phi^{-1}$ is the inverse of the standardized univariate Gaussian distribution and $\Phi_2$ is the standardized bivariate Gaussian distribution with correlation parameter $\theta$. ‡ $t_\nu^{-1}$ is the inverse of the standardized univariate Student's t distribution with $\nu$ degrees of freedom and $t_{2,\nu}$ is the standardized bivariate Student's t distribution with correlation coefficient $\theta$ and $\nu$ degrees of freedom.
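A minimal sketch of MPL estimation for the Clayton copula is given below. It is not the code used for the study (which relied on R); it assumes the standard Clayton density with theta > 0, draws a toy sample by conditional inversion, and maximizes the pseudo log-likelihood with a golden-section search, which presumes that the objective is unimodal on the search interval. All function names, the seed, and the search interval are hypothetical choices.

```python
import math
import random


def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        w = 1.0 - rng.random()
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def rescaled_ranks(x):
    """Pseudo observations R_i / (n + 1); assumes no ties."""
    n = len(x)
    ranks = [0] * n
    for r, i in enumerate(sorted(range(n), key=lambda k: x[k]), start=1):
        ranks[i] = r
    return [r / (n + 1) for r in ranks]


def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density, theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (math.log1p(theta) - (1.0 + theta) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(s))


def mpl_clayton(u_tilde, v_tilde, lo=0.01, hi=20.0, tol=1e-6):
    """Maximize the pseudo log-likelihood by golden-section search on [lo, hi]."""
    def neg_loglik(theta):
        return -sum(clayton_log_density(u, v, theta) for u, v in zip(u_tilde, v_tilde))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = neg_loglik(c), neg_loglik(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = neg_loglik(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = neg_loglik(d)
    return (a + b) / 2.0


rng = random.Random(1)
sample = sample_clayton(500, 2.0, rng)
u_tilde = rescaled_ranks([p[0] for p in sample])
v_tilde = rescaled_ranks([p[1] for p in sample])
theta_hat = mpl_clayton(u_tilde, v_tilde)
```

Only the ranks of the data enter the objective, so the estimate is unchanged under any strictly increasing transformation of the margins.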

Local likelihood probit transformation estimation
The transformation method was introduced to kernel copula density estimation by [5]. The simple idea is to transform the data so that they are supported on the full $\mathbb{R}^2$ (instead of the unit square). On this transformed domain, standard kernel techniques can be used to estimate the density. An adequate back-transformation then yields an estimate of the copula density. The inverse of the standard Gaussian CDF is most commonly used for the transformation, since kernel estimators are known to do well for Gaussian random variables.
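Suppose that $(U_i, V_i)$, $i = 1, \dots, n$, are independent and identically distributed observations from the bivariate copula $C$, and that the purpose is to estimate the corresponding copula density function. Denote by $\Phi$ the standard Gaussian distribution function and by $\phi$ its first-order derivative. Then $(S_i, T_i) = (\Phi^{-1}(U_i), \Phi^{-1}(V_i))$ is a random vector with Gaussian margins and copula $C$. According to (2), the corresponding density function can be written as $f(s, t) = c(\Phi(s), \Phi(t))\,\phi(s)\,\phi(t)$. Thus, an estimate of the copula density function is given by
\[ \hat c(u, v) = \frac{\hat f_n\big(\Phi^{-1}(u), \Phi^{-1}(v)\big)}{\phi\big(\Phi^{-1}(u)\big)\, \phi\big(\Phi^{-1}(v)\big)}. \tag{5} \]
However, the $(U_i, V_i)$ are unavailable, and one has to use the pseudo-transformed sample $(\hat S_i, \hat T_i) = (\Phi^{-1}(\tilde U_i), \Phi^{-1}(\tilde V_i))$ instead. As a first natural idea, the standard kernel density estimator can be considered for $\hat f_n$ in (5):
\[ \hat f_n(s, t) = \frac{1}{n\, |H_{ST}|^{1/2}} \sum_{i=1}^{n} K\Big( H_{ST}^{-1/2} \big(s - \hat S_i,\ t - \hat T_i\big)^{\top} \Big), \]
where $K : \mathbb{R}^2 \to \mathbb{R}$ is a kernel function and $H_{ST}$ is a bandwidth matrix.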
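The plain probit-transformation kernel estimator can be sketched as follows. The sketch assumes a product Gaussian kernel with a single bandwidth h, that is, a bandwidth matrix equal to h squared times the identity; this choice, the function name, and the toy lattice of pseudo observations are assumptions made for illustration and are not taken from the paper.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian: provides cdf, pdf and inv_cdf


def probit_kernel_copula_density(u_tilde, v_tilde, h):
    """Return c_hat(u, v): kernel density on the probit scale, back-transformed."""
    s_hat = [STD.inv_cdf(u) for u in u_tilde]
    t_hat = [STD.inv_cdf(v) for v in v_tilde]
    n = len(s_hat)

    def c_hat(u, v):
        s, t = STD.inv_cdf(u), STD.inv_cdf(v)
        # Kernel estimate of the density of the transformed sample at (s, t).
        f_hat = sum(STD.pdf((s - si) / h) * STD.pdf((t - ti) / h)
                    for si, ti in zip(s_hat, t_hat)) / (n * h * h)
        # Back-transformation to the unit square.
        return f_hat / (STD.pdf(s) * STD.pdf(t))

    return c_hat


# Toy input: a regular lattice in the unit square, mimicking the independence copula.
m = 15
grid = [k / (m + 1) for k in range(1, m + 1)]
u_toy = [a for a in grid for _ in grid]
v_toy = [b for _ in grid for b in grid]
c_hat = probit_kernel_copula_density(u_toy, v_toy, h=0.4)
value_at_centre = c_hat(0.5, 0.5)  # independence has density 1; smoothing pulls this below 1
```

Because the smoothing happens on the whole plane, no kernel mass is lost outside the unit square, which is the motivation for the transformation in the first place.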
This kernel estimator has asymptotic problems at the edges of the distribution support. To remedy this problem, the local likelihood probit transformation (LLPT) method was recently suggested by [11]. Instead of applying the standard kernel estimator, they locally fit a polynomial to the log-density of the transformed sample. The advantages of estimating $f(s, t)$ by local likelihood methods instead of raw kernel density estimation are discussed in detail in [10]. This method fixes the boundary issues in a natural way and is able to cope with unbounded copula densities. The notation is similar to that used in [11]. Recently, a comprehensive simulation study in [22] has shown that the LLPT method for copula density estimation yields very good results.
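Around $(s, t) \in \mathbb{R}^2$ and for $(s', t')$ close to $(s, t)$, the local log-quadratic likelihood approximation of $\log f$ from the pseudo-transformed sample is defined as
\[ \log f(s', t') \approx a_{2,0}(s, t) + a_{2,1}(s, t)(s' - s) + a_{2,2}(s, t)(t' - t) + a_{2,3}(s, t)(s' - s)^2 + a_{2,4}(s, t)(t' - t)^2 + a_{2,5}(s, t)(s' - s)(t' - t) \equiv P_{a_2}(s' - s, t' - t). \]
The vector $a_2(s, t) \equiv (a_{2,0}(s, t), \dots, a_{2,5}(s, t))$ is then estimated by solving a weighted maximum likelihood problem,
\[ \hat a_2(s, t) = \arg\max_{a_2} \Big\{ \sum_{i=1}^{n} K_{H_{ST}}\big(s - \hat S_i,\ t - \hat T_i\big)\, P_{a_2}\big(\hat S_i - s,\ \hat T_i - t\big) - n \int\!\!\int_{\mathbb{R}^2} K_{H_{ST}}\big(s - s',\ t - t'\big) \exp\big\{ P_{a_2}(s' - s, t' - t) \big\}\, ds'\, dt' \Big\}, \]
where $K_{H_{ST}}(\cdot) = |H_{ST}|^{-1/2} K(H_{ST}^{-1/2}\, \cdot)$. Therefore, the estimate of $f(s, t)$ is $\tilde f_p(s, t) = \exp\{\hat a_{2,0}(s, t)\}$, and thus the LLPT estimator of a copula density is
\[ \hat c_n^{(LLPT)}(u, v) = \frac{\tilde f_p\big(\Phi^{-1}(u), \Phi^{-1}(v)\big)}{\phi\big(\Phi^{-1}(u)\big)\, \phi\big(\Phi^{-1}(v)\big)}. \tag{6} \]
When the underlying density is supported on $[0, 1]^2$, the performance of the kernel estimator depends on the choice of the kernel function and of the bandwidth (smoothing parameter). For the bandwidth choice, a practical approach is to minimize the AMISE on the level of the transformed data. In this article, the bandwidth choice is based on the nearest-neighbor method (see [11], Section 4).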

Semiparametric Alpha-Divergence estimation
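Initially, Chernoff [7] proposed the Alpha-Divergence, which is a generalization of the KL divergence. For some Alpha-Divergence investigations see, e.g., [1,4,24]. The Alpha-Divergence measure can be derived from the Csiszár f-divergence with generator
\[ \varphi_\alpha(t) = \frac{t^{\alpha} - \alpha t + \alpha - 1}{\alpha(\alpha - 1)}, \qquad t \ge 0,\ \alpha \ne 0, 1. \]
The Alpha-Divergence (AD) between two probability density functions $f_1$ and $f_2$ of a continuous random variable can be defined as
\[ AD(f_1 \| f_2) = \int f_2(x)\, \varphi_\alpha\!\left( \frac{f_1(x)}{f_2(x)} \right) dx = \frac{1}{\alpha(\alpha - 1)} \int \Big[ f_1^{\alpha}(x)\, f_2^{1-\alpha}(x) - \alpha f_1(x) + (\alpha - 1) f_2(x) \Big]\, dx. \tag{7} \]
The AD is non-negative, and it equals zero if and only if $f_1 = f_2$ almost everywhere. If $\alpha \to 1$, the Kullback-Leibler divergence (KLD) is obtained from equation (7). The Kullback-Leibler (KL) divergence between two densities $f_1$ and $f_2$, introduced by Kullback and Leibler [17], is given by
\[ KL(f_1 \| f_2) = \int f_1(x) \log \frac{f_1(x)}{f_2(x)}\, dx. \]
Two other special cases of the Alpha-Divergence, which will be used in practice, are the Hellinger distance and the Neyman divergence. The well-known Hellinger distance (HD) and the Neyman (Neyman chi-square) divergence (ND) are obtained from equation (7) for $\alpha = 0.5$ and $\alpha = 2$, respectively, as
\[ HD(f_1 \| f_2) = 2 \int \Big( \sqrt{f_1(x)} - \sqrt{f_2(x)} \Big)^2 dx, \qquad ND(f_1 \| f_2) = \frac{1}{2} \int \frac{\big( f_1(x) - f_2(x) \big)^2}{f_2(x)}\, dx. \]

It is well known that maximizing the likelihood is equivalent to minimizing the KL divergence. Let $c(u, v; \theta)$ be the true copula density function associated with the copula $C$. The MPL estimator is equivalent to the minimum pseudo KL divergence (MPKLD) estimator between the copula density estimate $\hat c(u, v)$ and the true copula density $c(u, v; \theta)$, which is given by
\[ \hat\theta_{MPKLD} = \arg\min_{\theta} KL(\hat c \| c) = \arg\max_{\theta} \int_0^1 \!\! \int_0^1 \hat c(u, v) \log c(u, v; \theta)\, du\, dv \approx \arg\max_{\theta} \frac{1}{n} \sum_{i=1}^{n} \log c\big(\tilde U_i, \tilde V_i; \theta\big). \tag{8} \]
The factor $1/n$ in equation (8) does not affect the attained arg max with respect to $\theta$, so the two approaches, MPL and MPKLD, give the same result. The minimum pseudo Alpha-Divergence (MPAD) estimator, based on the Alpha-Divergence between the copula density estimate $\hat c(u, v)$ and the true copula density $c(u, v; \theta)$, is defined as $\hat\theta_{MPAD} = \arg\min_{\theta} AD(\hat c \| c)$.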
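The special cases can be checked numerically. The sketch below assumes the common form of the Alpha-Divergence, namely the integral of f1^alpha f2^(1-alpha) - alpha f1 + (alpha - 1) f2 divided by alpha (alpha - 1), and compares it with twice the integrated squared difference of square roots (alpha = 0.5), with half the integral of (f1 - f2)^2 / f2 (alpha = 2), and with the KL divergence (alpha close to 1). The two unit-variance Gaussian densities, the integration range, and all names are hypothetical choices for illustration.

```python
import math


def normal_pdf(x, mu):
    """Density of a Gaussian with mean mu and unit variance."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


def integrate(g, lo=-12.0, hi=13.0, steps=25000):
    """Midpoint rule on [lo, hi]; the tails outside are negligible here."""
    dx = (hi - lo) / steps
    return dx * sum(g(lo + (k + 0.5) * dx) for k in range(steps))


def f1(x):
    return normal_pdf(x, 0.0)


def f2(x):
    return normal_pdf(x, 1.0)


def alpha_divergence(alpha):
    """Alpha-Divergence between f1 and f2 under the assumed standard form."""
    k = alpha * (alpha - 1.0)
    return integrate(lambda x: (f1(x) ** alpha * f2(x) ** (1.0 - alpha)
                                - alpha * f1(x) + (alpha - 1.0) * f2(x)) / k)


hd = integrate(lambda x: 2.0 * (math.sqrt(f1(x)) - math.sqrt(f2(x))) ** 2)
nd = integrate(lambda x: 0.5 * (f1(x) - f2(x)) ** 2 / f2(x))
kl = integrate(lambda x: f1(x) * math.log(f1(x) / f2(x)))

ad_half = alpha_divergence(0.5)      # should agree with hd
ad_two = alpha_divergence(2.0)       # should agree with nd
ad_near_one = alpha_divergence(0.999)  # should be close to kl
```

For these two densities the KL divergence equals one half of the squared mean difference, which gives an independent check on the integration.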

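The minimum pseudo Hellinger distance (MPHD) estimator is given by
\[ \hat\theta_{MPHD} = \arg\min_{\theta}\ 2 \int_0^1 \!\! \int_0^1 \Big( \sqrt{\hat c(u, v)} - \sqrt{c(u, v; \theta)} \Big)^2 du\, dv. \tag{9} \]
Similarly, the minimum pseudo Neyman divergence (MPND) estimator is defined as
\[ \hat\theta_{MPND} = \arg\min_{\theta}\ \frac{1}{2} \int_0^1 \!\! \int_0^1 \frac{\big( \hat c(u, v) - c(u, v; \theta) \big)^2}{c(u, v; \theta)}\, du\, dv. \tag{10} \]
In practice, instead of $\hat c$ in equations (9) and (10), the local likelihood probit transformation estimate of the copula density, $\hat c_n^{(LLPT)}$, obtained from equation (6), is used. Tsukahara [29] explores the asymptotic properties of minimum distance estimators based on copulas, following [3] closely in investigating these properties.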
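A minimal sketch of the minimum Hellinger distance step is given below for the Clayton family. It approximates the double integral by a midpoint rule on a regular grid and minimizes over theta with a coarse scan followed by a golden-section refinement. The grid size, the search interval, and the function names are assumptions; moreover, the usage example plugs in a known Clayton density as a stand-in for the nonparametric estimate (a sanity check only), whereas in practice the argument `c_hat` would be the LLPT estimate.

```python
import math


def clayton_density(u, v, theta):
    """Clayton copula density, theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-(1.0 + theta)) * s ** (-(2.0 + 1.0 / theta))


def mphd_estimate(c_hat, density, lo, hi, m=40, n_coarse=60, tol=1e-6):
    """Minimize the Hellinger distance between c_hat and density(., ., theta)."""
    pts = [((i + 0.5) / m, (j + 0.5) / m) for i in range(m) for j in range(m)]
    root_hat = [math.sqrt(c_hat(u, v)) for u, v in pts]  # evaluated only once

    def hd(theta):
        return 2.0 / (m * m) * sum(
            (r - math.sqrt(density(u, v, theta))) ** 2
            for r, (u, v) in zip(root_hat, pts))

    # Coarse scan to bracket the minimum.
    step = (hi - lo) / n_coarse
    best = min((lo + k * step for k in range(n_coarse + 1)), key=hd)
    a, b = max(lo, best - step), min(hi, best + step)

    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = hd(c), hd(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = hd(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = hd(d)
    return (a + b) / 2.0


# Sanity check: if c_hat is exactly a Clayton density with theta = 2,
# the minimizer should recover that value.
theta_hat = mphd_estimate(lambda u, v: clayton_density(u, v, 2.0),
                          clayton_density, lo=0.1, hi=10.0)
```

Evaluating the nonparametric estimate once on the grid keeps the cost of each objective evaluation proportional to the grid size only, which is relevant because the text notes that MPHD is slower than MPL.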

Simulation study
A simulation study was performed to compare the MPL estimator with the MPHD and MPND estimators, the special cases of the minimum Alpha-Divergence estimator described in Section 4. All computations were performed in R using the copula package and the kdecop function (kdecopula package). The aim of this simulation study is to compare the true parameter $\theta$ with the parameter estimate $\hat\theta$, under the assumption that the parametric form of the copula is correctly selected. This aim is accomplished by comparing the Bias, the mean square error (MSE), and the relative efficiency (rMSE) of the three approaches to copula parameter estimation. With $\hat\theta_j$ denoting the estimate from the $j$-th of $N$ Monte Carlo samples, the Bias and MSE are estimated by
\[ \mathrm{Bias}(\hat\theta) = \frac{1}{N} \sum_{j=1}^{N} \big( \hat\theta_j - \theta \big), \qquad \mathrm{MSE}(\hat\theta) = \frac{1}{N} \sum_{j=1}^{N} \big( \hat\theta_j - \theta \big)^2, \]
and rMSE is the estimated MSE of the MPL estimator relative to that of the MPHD and MPND estimators, expressed in percent.

The data are generated from three Archimedean copulas (Clayton, Gumbel, and Frank) and two Elliptical copulas (Gaussian and T, the latter with ν = 2 and ν = 10), presented in Table 1, with Kendall's tau equal to 0.1, 0.2, 0.4, 0.6, and 0.8. These copulas cover different dependence structures. The Gaussian and Frank copulas exhibit symmetric and weak tail dependence in both the lower and upper tails. The Clayton copula exhibits strong left tail dependence, and the Gumbel copula has strong right tail dependence. In the T copula with positive dependence and small degrees of freedom (ν < 10), tail dependence occurs in both the lower and upper tails, and as the degrees of freedom increase, dependence in the tail areas decreases (see [9]). Moreover, 1000 Monte Carlo samples of sizes n = 30, 75, and 150 are generated from each type of copula, and the three estimates are computed: MPL, MPHD, and MPND.
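The simulation design and its summary measures can be sketched as follows. The sketch assumes the standard closed-form relations between Kendall's tau and the copula parameter for the Clayton, Gumbel, Gaussian, and T families (the Frank copula is left out because its relation requires numerical inversion) and the usual Monte Carlo definitions of Bias and MSE. The function names and the toy estimates are hypothetical.

```python
import math


def theta_from_tau(family, tau):
    """Copula parameter matching a given Kendall's tau (standard relations)."""
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)
    if family in ("gaussian", "t"):
        return math.sin(math.pi * tau / 2.0)
    raise ValueError("no closed-form relation implemented for " + family)


def bias_and_mse(estimates, theta_true):
    """Monte Carlo Bias and MSE of a list of parameter estimates."""
    errors = [est - theta_true for est in estimates]
    bias = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return bias, mse


# Parameter values implied by the Kendall's tau levels used in the study.
taus = [0.1, 0.2, 0.4, 0.6, 0.8]
design = {fam: [theta_from_tau(fam, t) for t in taus]
          for fam in ("clayton", "gumbel", "gaussian")}

# Toy estimates from four hypothetical replications with true theta = 2.
bias, mse = bias_and_mse([1.8, 2.1, 2.4, 1.9], 2.0)
```

Fixing Kendall's tau rather than the parameter itself makes the dependence levels comparable across families whose parameters live on different scales.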

Results
Results of the simulation study are presented in Tables 2-7. Since the results for sample sizes greater than 150 were in line with our expectation that an increase in sample size improves the parameter estimates, they were omitted from the tables for brevity. These results also show that the MPL method outperforms MPHD and MPND for sample sizes greater than 150. The results for the T copula with 4 and 7 degrees of freedom were omitted as well, since they did not differ from those for the two other T copulas with 2 and 10 degrees of freedom. The results given in Tables 2-7 show that the estimated Bias and MSE for the Archimedean and Elliptical copulas decrease as the sample size increases, that is, the parameter estimates improve. For Archimedean copulas, the estimated Bias and MSE increase with increasing Kendall's tau. For Elliptical copulas, the estimated MSE decreases with increasing Kendall's tau, whereas the estimated Bias shows no clear trend. Furthermore, the results for the estimated MSE of the MPL estimator relative to the MPHD and MPND estimators (rMSE), in percent, for Archimedean and Elliptical copulas in Tables 6-7 show that rMSE increases with increasing sample size or Kendall's tau.
The results given in Tables 2-5 show that MPL yields the best results for large sample sizes (n ≥ 100) and high dependence (τ ≥ 0.5). For small sample sizes (n < 100) and weak dependence (τ < 0.5), minimum Hellinger distance estimation outperforms the MPL estimation method. Among the two new minimum distance estimators, the results show that $\hat\theta_{MPHD}$ is better than $\hat\theta_{MPND}$ in terms of MSE in all cases. This advantage of $\hat\theta_{MPHD}$ is clearer in Archimedean copulas than in Elliptical copulas. Thus, there is no evident reason why one would be inclined to use $\hat\theta_{MPND}$. In addition to these results, the estimated bias seems to be considerably higher for Archimedean copulas than for Elliptical copulas. In all tables, the biases of the MPL estimators are almost always lower than the biases of the MPHD and MPND estimators for large sample sizes (n > 100). Finally, it is necessary to note that although the MPHD method takes longer to compute than the MPL method, it gives accurate and acceptable results for small sample sizes and weak dependence.

Table 7. Estimated MSE of the MPL estimator relative to the MPHD and MPND estimators (rMSE), in percent, for Elliptical copulas.

Application in Hydrology
An application of the estimation methods to a data set in Hydrology is demonstrated in this section. [31] established a joint distribution function of drought intensity, duration, and severity by using Gaussian and Gumbel copulas. [27] used several meta-elliptical copulas in drought analysis and found that the meta-Gaussian and T copulas had a better fit. [18] investigated the drought events in the Weihe river basin and selected the Gaussian and T copulas to model the joint distribution of drought duration, severity, and peaks. Recently, a very comprehensive book on the application of copulas in Hydrology has been published by [6], and the concepts in this section are taken from this book. [19] proposed the concept of the standardized precipitation index (SPI), based on the long-term precipitation record for a specific period such as 1, 3, 6, or 12 months. [14] recommended the use of the SPI as a primary drought index because it is simple, spatially invariant in its interpretation, and probabilistic. Therefore, the SPI series is used in this article. Fitting the long-term precipitation record to a probability distribution is the first step in calculating the SPI series. Once the probability distribution is determined, the cumulative probability of the observed precipitation is computed and then transformed by the inverse of the standard Gaussian distribution function, which gives the SPI series. A drought event is thus defined as a continuous period in which the SPI is below 0.
The objective of this section is the estimation of the copula parameter between drought characteristics (events) based on the SPI, including drought duration, drought severity, and drought interval time. Drought characteristics are recognized as important factors in water resource planning and management. Drought duration ($D_d$) is defined as the number of consecutive intervals (months) where the SPI remains below the threshold value 0 (see [25]). Drought severity ($S_d$) is defined as the cumulative SPI value during a drought period, $S_d = \sum_{i=1}^{D_d} SPI_i$, where $SPI_i$ is the SPI value in the $i$-th month (see [21]). The drought interval time ($I_d$) is defined as the period elapsing from the initiation of a drought to the beginning of the next drought (see [28]).
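The three drought characteristics can be extracted from an SPI series with a short routine such as the following sketch. It follows the definitions in the text (duration as the number of consecutive months below the threshold, severity as the sum of the SPI over those months, interval as the time from one onset to the next). The function name, the dictionary layout, and the toy series are hypothetical, and the handling of a drought still in progress at the end of the record is an assumption.

```python
def drought_events(spi, threshold=0.0):
    """List drought events with start index, duration, severity and interval time."""
    runs = []       # (start, end) index pairs, end exclusive
    start = None
    for i, value in enumerate(spi):
        if value < threshold and start is None:
            start = i                      # a drought begins
        elif value >= threshold and start is not None:
            runs.append((start, i))        # the drought ended in month i - 1
            start = None
    if start is not None:
        runs.append((start, len(spi)))     # drought still running at the end of the record

    events = []
    for k, (a, b) in enumerate(runs):
        next_start = runs[k + 1][0] if k + 1 < len(runs) else None
        events.append({
            "start": a,
            "duration": b - a,
            "severity": sum(spi[a:b]),
            # Interval time: onset of this drought to onset of the next one.
            "interval": None if next_start is None else next_start - a,
        })
    return events


# Toy monthly SPI values (hypothetical, for illustration only).
spi_toy = [0.5, -0.3, -1.2, 0.4, 0.1, -0.6, 0.8, -0.2, -0.4, -0.1]
events = drought_events(spi_toy)
```

The last event has no interval time because no later onset is observed; such events would typically be dropped before fitting a copula to pairs involving the interval time.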
The monthly precipitation data of Mashhad station, located in Iran, from 1985 to 2017 (http://www.irimo.ir/eng/index.php) are used as an example to illustrate the proposed methodology. The monthly precipitation of Mashhad can be fitted by a gamma distribution. The monthly SPI series is then calculated; the resulting estimates are reported in Table 8. A goodness-of-fit testing procedure based on the parameter estimation methods is applied. In the large-scale Monte Carlo experiments carried out by Genest et al. [13], the CvM statistic
\[ S_n = \sum_{i=1}^{n} \Big\{ C_n\big(\tilde U_i, \tilde V_i\big) - C_{\hat\theta}\big(\tilde U_i, \tilde V_i\big) \Big\}^2 \]
gave the best results overall, where $C_n$ is the empirical copula defined in (3) and $C_{\hat\theta}$ is an estimator of $C$ under the hypothesis $H_0 : C \in \{C_\theta\}$. The estimators $\hat\theta$ of $\theta$ are those given in (4) and (9). An approximate P-value for $S_n$ can be obtained by means of a parametric bootstrap procedure, as described in [13].
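A minimal sketch of the CvM statistic is given below, assuming the rank-based form in which the squared difference between the empirical copula and the fitted copula is summed over the pseudo observations. The Gumbel distribution function is the standard one; the function names and the toy pseudo observations are hypothetical, and the parametric bootstrap needed for a P-value is not shown.

```python
import math


def gumbel_cdf(u, v, theta):
    """Gumbel copula C(u, v; theta), theta >= 1."""
    a = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(a ** (1.0 / theta)))


def cvm_statistic(u_tilde, v_tilde, copula_cdf):
    """Sum over the sample of (empirical copula - fitted copula)^2."""
    n = len(u_tilde)
    total = 0.0
    for u, v in zip(u_tilde, v_tilde):
        # Empirical copula evaluated at the i-th pseudo observation.
        c_n = sum(1 for a, b in zip(u_tilde, v_tilde) if a <= u and b <= v) / n
        total += (c_n - copula_cdf(u, v)) ** 2
    return total


# Toy pseudo observations (hypothetical) and a Gumbel copula with theta = 1.3047.
u_toy = [0.2, 0.4, 0.6, 0.8]
v_toy = [0.4, 0.2, 0.8, 0.6]
s_n = cvm_statistic(u_toy, v_toy, lambda u, v: gumbel_cdf(u, v, 1.3047))

# Deterministic check: perfectly comonotone ranks against the copula min(u, v).
s_comonotone = cvm_statistic(u_toy, u_toy, min)
```

Smaller values of the statistic indicate a fitted copula that is closer to the empirical copula at the observed points.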
One of the challenges that we face is the specification of a suitable copula. Since there are a large number of copulas, specifying one that suits a particular case in practice is not easy. Therefore, a reasonable strategy is to consider different copulas and evaluate their goodness of fit. To this end, the Archimedean and Elliptical copulas in Table 1 are considered; they have attracted considerable interest because of their flexibility and simplicity. The diagnostic checks of the dependence structure for the pairs $(S_d, I_d)$ and $(D_d, I_d)$ suggested that the Gumbel and Gaussian copulas fit well, and better than the others considered. The Gumbel and Gaussian copulas are fitted by the MPL and MPHD methods. The estimates and various relevant quantities are presented in Table 8.
The scatter plots for the empirical distributions of the pairs $(S_d, I_d)$ and $(D_d, I_d)$ are shown in Figure 1. This figure shows that the points tend to concentrate near (1, 1). Thus, the Gumbel copula, which has upper tail dependence, appears to be more appropriate for both pairs. In addition, according to the values of the Akaike Information Criterion (AIC) in Table 8, for both pairs $(S_d, I_d)$ and $(D_d, I_d)$ the Gumbel copula is more suitable than the Gaussian copula, because it has the smallest AIC value. The P-values and the values of the statistic $S_n$ can be used to compare the goodness of fit. These are given here just as a point of reference, and we recognize that they do not have the usual meaning of a P-value. For pair $(S_d, I_d)$, the largest P-value based on $S_n$ is 0.6418, obtained for the Gumbel copula with the parameter estimated by MPHD. For pair $(D_d, I_d)$, the largest P-value based on $S_n$ is 0.3390, again for the Gumbel copula with the parameter estimated by MPHD. The values of the copula parameter are difficult to interpret, but the corresponding values of Kendall's tau have more intuitive interpretations. By using the relations in Table 1, the values of Kendall's tau corresponding to the different estimates of $\theta$, $\tau(\hat\theta)$, are given in Table 8. Note that for pair $(S_d, I_d)$, the Gumbel copula fitted by the MPHD method has $\hat\theta_{MPHD} = 1.3047$ and $\tau(\hat\theta) = 0.2335$. The fact that $\tau(\hat\theta)$ is nearly identical to the nonparametric sample estimate, $\hat\tau_n = 0.2394$, implies that the MPHD approach handles this aspect of the dependence well. This provides additional support for the previous observation that the MPHD method performs well, and better than MPL. Overall, the results suggest that the Gumbel copula estimated by MPHD provides an acceptable fit for both pairs of drought variables.
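The Kendall's tau value quoted for pair (S_d, I_d) can be verified by hand, assuming the standard Gumbel relation between the parameter and Kendall's tau (taken here to be the relation listed in Table 1):

```latex
\[
\tau(\hat\theta_{MPHD}) = 1 - \frac{1}{\hat\theta_{MPHD}}
  = 1 - \frac{1}{1.3047} \approx 0.2335,
\qquad
\hat\tau_n - \tau(\hat\theta_{MPHD}) \approx 0.2394 - 0.2335 = 0.0059 .
\]
```

The model-implied and sample values of Kendall's tau thus differ by less than 0.01.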

Conclusion
In this paper, two methods of copula parameter estimation based on the Alpha-Divergence were presented for some bivariate Archimedean and Elliptical copulas. The minimum Kullback-Leibler divergence, Hellinger distance, and Neyman divergence, as special cases of the Alpha-Divergence based on pseudo observations, were used to estimate the copula parameter. The simulation results suggest that the minimum pseudo Hellinger distance estimation method performs well for small sample sizes (n < 100) and weak dependence (τ < 0.5) when compared with the MPL estimation method for Archimedean and Elliptical copulas. The simulation results also show that $\hat\theta_{MPHD}$ is almost always better than $\hat\theta_{MPND}$. The estimation methods were applied within a goodness-of-fit test based on the CvM distance for a data set in Hydrology, and the results show that the MPHD method is more accurate than the MPL method.