A Fast Algorithm for Using Semi-Parametric Random Effects Model for Analyzing Longitudinal Data

Mixed effects models are frequently used for analyzing longitudinal data. Normality assumption of random effects distrbution is a routine assumption for these models, violation of which leads to model misspecification and misleading parameter estimates. We propose a semi-parametric approach using gradient function for random effect estimation. In the approach, we relax the normality assumption for random effects by estimating the random effects distribution over a pre-specified grid. Unknown parameters of the marginal model are estimated using maximum likelihood method. Some simulation studies and analyzing of a real data set are performed for illustration of the proposed semi-parametric method.


Introduction
Longitudinal studies are frequently encountered in many areas such as biology, medical, epidemiology and social sciences.In these studies, individuals are followed over time and some repeated measurements for individuals are collected.
For analyzing longitudinal data, Gaussian mixed effects models [5] are commonly used.These models are flexible and widely applicable, in addition,

Notation and model specification
Let Y i , i = 1, 2, ..., n denote the vector of n i longitudinal measurements for the i th individual such that Y i = {y i (s ij ), j = 1, 2, ..., n i }, where y i (s ij ) = y ij denotes the longitudinal measurement for the i th subject at time s ij .We consider the following linear mixed effect model for describing the longitudinal outcomes: where Y i is the longitudinal vector of response variable for the i th subject.ε i = (ε i1 , ..., ε ini ) ′ is the vector of measurement errors, β = (β 1 , ..., β p ) ′ is a p-dimensional vector of longitudinal fixed-effect parameters.b i = (b i1 , ..., b iq ) ′ is a q-dimensional vector of random effects and is independent of ε i .X i = (x i (s i1 ), ..., x i (s ini )) ′ and Z i = (z i (s i1 ), ..., z i (s ini )) ′ , where x and z are pdimensional and q-dimensional vectors of explanatory variables, respectively.We assume that ε i ∼ N ni (0, Σ i ), where we assume that Σ i = σ 2 I ni , also let θ y = (β ′ , σ 2 ).

Details of the approach
In this paper, we do not make any parametric assumption for the random effects distribution, let b i be the random effects such that b i ∼ g(b i ) where g(.) belongs to the set of all distribution functions on the parameter space of b i .The marginal distribution of Y i is given by The nonparametric maximum likelihood estimate of G is discrete with finite support [4,8].In the following, the left side of equation ( 2) is called ℓ i (G) to ensure that the marginal distribution of Y i is obtained using G.
The algorithm includes three steps.The first step consists of considering some initial values and the other two steps are weights updating and parameter estimations, respectively.The later two steps are repeated iteratively until convergence are achieved.The details of the steps are as follows: Step 1: Initial values The algorithm needs an initial values for θ y , say θ 0 y , and an initial value for the weights π, say π 0 .
• A linear mixed effects model under normal distribution assumption for random effects is considered, the parameter estimates in this stage is used as initial values to have the vector θ 0 y .• The grid µ = {µ j1j2...jq } for j 1 , j 2 , ..., j q = 1, 2, ..., C is kept fixed through the algorithm and only the corresponding weights are updated.
Step 2: Weight updating Let k be the iteration number.For each point µ j1j2...jq , the directional derivative ..jq is evaluated.This is given by , where Ĝ(k) is the current estimate of G and G µj 1 j 2 ...jq is a degenerated distribution at point µ j1j2...jq .
After directional derivative evolution bi = (b i1 , ..., b iq ), the estimated value of random effect for i th individual, is defined as follows: and the corresponding weight of π j1j2...jq is estimated as the frequency of µ j1j2...jq in the estimated values of random effects.To ensure that the mean of the random effects is zero, model (1) can be written as Step 3: Let θ (k+1) = (β (k+1) , σ 2 (k+1) ) ′ is the vector of parameters of k + 1 th step, where they are estimated in this step.The marginal distribution is given by The model parameters θ y are estimated using quasi-Newton method.Note that π = {π j1j2...jq } and µ = {µ j1j2...jq } for j 1 , j 2 , ..., j q = 1, 2, ..., C in this step are considered as fixed parameters.
The two later steps are repeated iteratively until convergence.The algorithm has converged when Stat., Optim.Inf.Comput.Vol. 2, December 2014.
A FAST ALGORITHM USING SEMI-PARAMETRIC RANDOM EFFECTS MODEL 343

Simulation studies
For investigating the proposed method, we perform some simulation studies, 1) with univariate random effect as random intercept, 2) with bivariate random effect as random intercept and random slope.
In the first scenario, the random intercept is sampled from three different distributions: a normal distribution, a uniform distribution and two components normal mixture.Also in the second scenario, the random intercept and the random slope are sampled from three different bivariate distributions: bivariate normal distribution, bivariate uniform distribution and a two components bivariate normal mixture.The details of these simulation studies are given in the following.

Univariate random effects
The data are simulated from the following linear mixed effect model: where β 0 = 5, β 1 = 1 and β 2 = 1, N = 300, t i1 = 2, t i2 = 6.x i s are generated from a Bernoulli distribution with success probability 0.2 and ε ij ∼ N (0, σ 2 )where σ 2 = 1.The random intercept, b i , is generated from three different distributional assumption: i) a normal distribution with variance σ 2 b = 4, ii) a uniform distribution on [−4, 4] and iii) a mixture of two normal distributions, i.e., 0.5 N (−1, 1) + 0.5 N (1, 1).Results of this simulation study are presented in Table 1.This table contains estimated value of parameters, standard errors, relative biases and root of mean square errors.The two later criteria are defined as where θi is the estimated value of θ for the i th sample.In this simulation study, we consider a sequence with length 100 in interval [−4, 4].This table shows that in all scenarios the estimated values of parameters, standard deviations, relative biases and root of MSEs confirm the well performance of the proposed method.Also, figure 1 shows the estimated weights of randomly selected samples for all scenarios, this figure shows that the estimated weights are similar to density plot of the real value: the estimated weights for the normal random intercept (panel a) have unimodal and symmetric form, the estimated weights for the uniform distribution have approximately equal weights at all points of the region.Also, panel (c) shows a bimodal distribution.To ensure zero-mean random effect model, we also consider the following model: where μi is the estimated value of random intercept for i th individual and π and μ are the vector of estimated weights and the vector of grid points, respectively.One of the advantages of the proposed method is its ability for considering wide intervals with favorite grid points, but a grid can be chosen for the scaled random where Ŝb is an approximation of the random effects covariance matrix of b i (similar to that of Tsonaka et al., 2009).For comparison of the results of the proposed method and the Gaussian model, we also analyze the generated data set using the pure normal model.

Bivariate random effects
The data are generated from the following linear mixed effects model with random intercept and random slope: where β 0 = 5, β 1 = 1 and β 2 = 1, N = 300, t i1 = 2, t i2 = 6, x i s are generated from Bernoulli distribution with success probability 0.2 and ε ij ∼ N (0, σ 2 ) where σ 2 = 1.The vector of random effects b i = (b 0i , b 1i ) ′ is generated from three different distributional assumptions: i) a standardized bivariate normal distribution, ii) a bivariate uniform distribution on [−4, 4] 2 , iii) a mixture of two bivariate normal distribution with mean (−1, −1) ′ and (1, 1) ′ and an identity covariance matrix.The results of this simulation study are presented in Table 2.This table shows the well performance of proposed method.Also, Figure 2 Table II shows estimated values of the weights for grid points under different distribution assumptions for random intercept and random slope.

Application
We consider a longitudinal study on 467 HIV infected patients who had failed or were intolerant of zidovudine (AZT) therapy.The aim of the study was to compare the efficacy and safety of two alternative antiretroviral drugs, namely didanosine (ddI) and zalcitabine (ddC).Patients were randomly assigned to receive either ddI or ddC, and CD4 cell counts were recorded at study entry, when randomization took place, as well as at 2, 6, 12 and 18 months thereafter.
The distribution of CD4 cell counts is right skewed, so we analyze the square root of the CD4 cell counts.Figure 3 shows √ CD4 trajectories for fifty randomly selected individuals given each drug.Panels of this figure show a sharply increasing degree of missing data over time due to death, dropout, and missed clinic visits.In this figure the profiles of those individuals who remain and those  of individuals who do not remain are indicated using different colors.The missing values belong to people who leave the study for different reasons.This figure underlines that those who do not remain had smaller √ CD4 than others.More details about this data set can be found in [2].
The following longitudinal model with random intercept is: In this model, y ij is the square root of the j th CD4 count measurement on the i th individual in the trial, j : 1, 2, ..., 5 and i : 1, 2, ..., 467.Sex i is a gender indicator (0=female, 1=male).The other three explanatory variables are Drug i (0=ddC, 1= Also, the following longitudinal model with random intercept and random slope is used for analyzing the data set: We consider the proposed method with a grid defined on [−15, 15] l , l = 1, 2 and C = 450 (we begin with a grid for random effects in [−5, 5] l , l = 1, 2, the results show that a wider grid is needed, therefore we refit the model with wider grid).Also, a sensitivity analysis of the results with respect to different value of C for random effects is performed and the results are not sensitive with respect to this change.The results of these models using the proposed approach and normal mixed effects model are summarized in Table 3.In this table, models are indicated by notations SPA and GM, where SPA is used for the proposed semi-parametric approach and GM is the ordinary Gaussian model.The standard deviation of parameters in semi-parametric approach are calculated using Bootstrap approach [13].The results are compared using AIC (Akaike information criterion), BIC (Bayesian information criterion), HQC (Hannan-Quinn criterion).Let infection are significant predictors in the model, such that the increase in time reduces CD4 count.Also, previous opportunistic infection leads to decreasing CD4 count.Figures 3 and 4, respectively, show the estimated weights for the univariate and bivariate random effects for HIV data set.These figures present the weights of the grid points in the semi-parametric fitted models.

Conclusion
In this paper, we have developed a semi-parametric mixed effects model for analyzing longitudinal data.The approach uses the directional derivative of each individual for finding the random effects estimation under any distributional assumption and leads to reliable parameter estimates.
The main advantage of this approach is in the use of the directional derivative of the individual log-likelihood.The method uses the relative frequency of grid points as an estimate of their weights, therefore, the speed of the algorithm increases.The proposed approach can be easily extended to shared random effects and be used for joint modeling of longitudinal measurements and event time.In these models the association between two process is captured by latent random effects.

Figure 1 .
Figure 1.Estimated values of the weights for grid points under different distribution assumptions for random intercepts.(a) normal, (b) uniform, (c) mixture of two normals.

Figure 2 .
Figure 2.Estimated values of the weights for grid points under different distribution assumptions for random effects.(a) bivariate normal, (b) bivariate uniform, (c) mixture of two bivariate normals.
Figure 3. Profiles of √ CD4 measurements over time for fifty randomly selected individuals from each drug, bold red lines are mean profile for all observed individuals on each drug.

Figure 4 .
Figure 4.The estimated weights for the univariate random effects for HIV data set.

Figure 5 .
Figure 5.The estimated weights for the bivariate random effects for HIV data set.

Table I .
2, ..., N, j = 1, 2, ..., n i , Results of simulation studies for mixed model with univariate random effects . Results of simulation studies for mixed model with bivariate random effects

Table III .
Parameter estimates and standard deviations for the HIV data set.SPM: semiparametric mixed effect model, PM: parametric model.