On the linear combination of independent logistic random variables

In this work the exact distribution of the linear combination of p independent logistic random variables is studied. It is shown that the exact distribution may be represented as a shifted infinite sum of independent random variables distributed as the difference of two independent Generalized Integer Gamma distributions. In addition, two near-exact approximations are developed for this distribution. Numerical studies are conducted to assess the degree of precision and also the computational performance of these approximations. The developed methodology is used to derive near-exact approximations for the linear combination of independent generalized logistic random variables.


Introduction
The logistic distribution is an important distribution in statistics and is used in several areas of research, for example in logistic regression to model categorical variables [7] and in physics, survival analysis, growth models, medical diagnosis and public health [3,16,8,2,10,1], among others. This distribution has many similarities with the Normal distribution, but with heavier tails. Problems involving linear combinations of independent logistic random variables may arise naturally from the applications addressed in the above references when these are considered in a multivariate setting. Despite the importance of this distribution, as far as the author knows, there are few results available for the distribution of the linear combination of independent logistic random variables. The sum of logistic random variables is addressed in [9] for the case where the variables are independent and identically distributed. The probability density and cumulative distribution functions of the linear combination of n independent logistic random variables were obtained in [18] in terms of the H-function, which is difficult to use in practice. In [15], approximations are developed for the distribution of the sum of random variables with a generalized logistic distribution, also for the independent and identically distributed case. In [17], a random variable $Y$ is said to have a generalized logistic distribution if $Y = \log[X/(1 - X)]$ with $X \sim \mathrm{Beta}(p, q)$, and the moment generating function is given as
$$\frac{\Gamma(p + t)\,\Gamma(q - t)}{\Gamma(p)\,\Gamma(q)}\,.$$
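We will denote this fact by $Y \sim \mathrm{GLogistic}(p, q)$. It will be shown, in Section 5, that the results in this paper can also be used to address the distribution of the linear combination of independent generalized logistic random variables. We should also note that this distribution was already presented in Table A, page 155 of [14].

In this section we present two results on the exact distribution of the linear combination of independent logistic random variables. Theorem 2 is the basis for the near-exact approximations developed in Section 3.

Let $X_1, \dots, X_p$ be $p$ independent logistic random variables, with parameters $\mu_j \in \mathbb{R}$ (the set of real numbers) and $\sigma_j \in \mathbb{R}^+$ (the set of positive real numbers), that is, $X_j \sim \mathrm{Logist}(\mu_j, \sigma_j)$, for $j = 1, \dots, p$. It is known that the characteristic function of $X_j$ is given by [1]
$$\Phi_{X_j}(t) = e^{i\mu_j t}\, B(1 - \sigma_j it,\, 1 + \sigma_j it), \quad t \in \mathbb{R}, \tag{1}$$
where $B(\cdot, \cdot)$ denotes the usual Beta function. Therefore, the characteristic function of the linear combination of $p$ independent logistic random variables, $W = \sum_{j=1}^p \alpha_j X_j$, for $\alpha_j \in \mathbb{R}$, is [1]
$$\Phi_W(t) = \prod_{j=1}^{p} e^{i\mu_j\alpha_j t}\, B(1 - \sigma_j\alpha_j it,\, 1 + \sigma_j\alpha_j it). \tag{2}$$

Theorem 1
Let $X_1, \dots, X_p$ be $p$ independent logistic random variables, with parameters $\mu_j \in \mathbb{R}$ and $\sigma_j \in \mathbb{R}^+$. Then the characteristic function of $W = \sum_{j=1}^p \alpha_j X_j$ with $\alpha_j \in \mathbb{R}$ may be written as
$$\Phi_W(t) = \exp\Big\{ it \sum_{j=1}^{p} \mu_j\alpha_j \Big\} \prod_{n=0}^{\infty} \prod_{j=1}^{p} (n+1)\,(n+1-\sigma_j\alpha_j it)^{-1}\,(n+1)\,(n+1+\sigma_j\alpha_j it)^{-1}. \tag{3}$$

Proof: Since $B(1-z, 1+z) = \Gamma(1-z)\,\Gamma(1+z)/\Gamma(2)$ and $\Gamma(2) = 1$, we may write the expression of the characteristic function of $W$ in (2) as
$$\Phi_W(t) = \exp\Big\{ it \sum_{j=1}^{p} \mu_j\alpha_j \Big\} \prod_{j=1}^{p} \Gamma(1-\sigma_j\alpha_j it)\,\Gamma(1+\sigma_j\alpha_j it);$$
then, using the infinite product representation of the Gamma function [11, p. 9, expression (12)], from which
$$\Gamma(1+z)\,\Gamma(1-z) = \prod_{n=0}^{\infty} \frac{(n+1)^2}{(n+1+z)\,(n+1-z)}\,, \quad z \in \mathbb{C} \text{ (the set of complex numbers), } z \notin \mathbb{Z}\setminus\{0\},$$
we have
$$\Phi_W(t) = \exp\Big\{ it \sum_{j=1}^{p} \mu_j\alpha_j \Big\} \prod_{j=1}^{p} \prod_{n=0}^{\infty} \frac{(n+1)^2}{(n+1-\sigma_j\alpha_j it)\,(n+1+\sigma_j\alpha_j it)}\,,$$
which gives rise to the desired result. □

In the expression of the characteristic function of $W$ in (3) in Theorem 1, we may identify the following:
• the expression $\prod_{j=1}^{p} (n+1)\,(n+1-\sigma_j\alpha_j it)^{-1}$ corresponds to the characteristic function of the sum of independent Exponential random variables with parameter $n + 1$, each multiplied by $\sigma_j\alpha_j$, which corresponds to a Generalized Integer Gamma (GIG) distribution [4],
• the expression $\prod_{j=1}^{p} (n+1)\,(n+1+\sigma_j\alpha_j it)^{-1}$ is the characteristic function of the sum of independent Exponential random variables with parameter $n + 1$, each multiplied by $-\sigma_j\alpha_j$, which corresponds to a negative GIG distribution,
• thus the expression $\prod_{j=1}^{p} (n+1)\,(n+1-\sigma_j\alpha_j it)^{-1}\,(n+1)\,(n+1+\sigma_j\alpha_j it)^{-1}$ is the characteristic function of the difference of two independent GIG distributions, that is, of a DGIG distribution,
• finally, from the expression of the characteristic function of $W$ in (3) we may say that the exact distribution of $W$ may be represented as a shifted infinite sum of independent DGIG distributions.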
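As a quick numerical illustration of the infinite product in Theorem 1, the following sketch (not part of the original study) truncates the product after a finite number of factors for a single centred logistic variable, using the identity $\Gamma(1+ix)\,\Gamma(1-ix) = \pi x/\sinh(\pi x)$ for real $x$. The function names and the toy setting $\mu = 0$, $\sigma\alpha = 1$ are assumptions made only for this example.

```python
import math

def logistic_cf_exact(t, scale=1.0):
    """CF of a centred logistic variable with scale sigma*alpha:
    Gamma(1 - i*c*t) * Gamma(1 + i*c*t) = pi*c*t / sinh(pi*c*t)."""
    x = math.pi * scale * t
    return 1.0 if x == 0.0 else x / math.sinh(x)

def logistic_cf_truncated(t, n_terms, scale=1.0):
    """Product of the first n_terms factors (n+1)^2 / ((n+1)^2 + (c*t)^2), n = 0..n_terms-1."""
    z2 = (scale * t) ** 2
    log_prod = 0.0
    for n in range(n_terms):
        # log of (n+1)^2 / ((n+1)^2 + z2), written with log1p for accuracy
        log_prod -= math.log1p(z2 / float(n + 1) ** 2)
    return math.exp(log_prod)

if __name__ == "__main__":
    t = 1.0
    exact = logistic_cf_exact(t)
    for n_terms in (10, 100, 1000, 10000, 100000):
        approx = logistic_cf_truncated(t, n_terms)
        print(n_terms, approx, abs(approx - exact))
```

The error of the truncated product decays only slowly with the number of factors, which is in line with the remark that the representation in (3) is of limited practical use.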
Theorem 1 gives an interesting insight into the exact distribution of $W$; however, it is not useful in practice due to the infinite product in expression (3). We intend to overcome this problem by developing near-exact approximations for the distribution of $W$. In order to develop these approximations we have to consider a different representation of the exact characteristic function of $W$, which is given in the following theorem.

Theorem 2
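Let $X_1, \dots, X_p$ be $p$ independent logistic random variables, with parameters $\mu_j \in \mathbb{R}$ and $\sigma_j \in \mathbb{R}^+$. Then, for $\delta \in \mathbb{N}\setminus\{1\}$ ($\mathbb{N}$ denotes the set of positive integers), the characteristic function of $W = \sum_{j=1}^p \alpha_j X_j$ with $\alpha_j \in \mathbb{R}$ may be written as
$$\Phi_W(t) = \Phi_{W_1}(t)\,\Phi_{W_2}(t)$$
with
$$\Phi_{W_1}(t) = \prod_{j=1}^{p} \prod_{k=1}^{\delta-1} k\,(k-\sigma_j\alpha_j it)^{-1}\, k\,(k+\sigma_j\alpha_j it)^{-1} \tag{6}$$
and
$$\Phi_{W_2}(t) = \exp\Big\{ it \sum_{j=1}^{p} \mu_j\alpha_j \Big\} \prod_{j=1}^{p} \frac{\Gamma(\delta-\sigma_j\alpha_j it)\,\Gamma(\delta+\sigma_j\alpha_j it)}{\Gamma(\delta)^2}\,. \tag{7}$$

Proof: If in the expression of the characteristic function of $W$ in (2) we multiply and divide by $\Gamma(\delta)$, by $\Gamma(\delta - \sigma_j\alpha_j it)$ and by $\Gamma(\delta + \sigma_j\alpha_j it)$, for some $\delta \in \mathbb{N}$, we obtain
$$\Phi_W(t) = \exp\Big\{ it \sum_{j=1}^{p} \mu_j\alpha_j \Big\} \prod_{j=1}^{p} \frac{\Gamma(\delta)\,\Gamma(1-\sigma_j\alpha_j it)}{\Gamma(\delta-\sigma_j\alpha_j it)}\; \frac{\Gamma(\delta)\,\Gamma(1+\sigma_j\alpha_j it)}{\Gamma(\delta+\sigma_j\alpha_j it)}\; \frac{\Gamma(\delta-\sigma_j\alpha_j it)\,\Gamma(\delta+\sigma_j\alpha_j it)}{\Gamma(\delta)^2}\,.$$
Now, using the recurrence relation of the Gamma function [11, p. 8-9, expressions (6) and (7)], from which
$$\frac{\Gamma(\delta+z)}{\Gamma(1+z)} = \prod_{k=1}^{\delta-1} (k+z), \quad z \in \mathbb{C},$$
the result provided by this theorem follows after some simplifications. □

For simplicity, from now on we will only consider the case $\alpha_j > 0$ for $j = 1, \dots, p$. However, the general case can be addressed using the same procedure.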
From Theorem 2 we may conclude that
$$W \stackrel{d}{=} W_1 + W_2$$
with $W_1$ and $W_2$ independent random variables, where $W_1$ has a DGIG distribution and $W_2$ has the distribution of a shifted sum of independent random variables whose distribution corresponds to the difference of two independent LogGamma random variables or, equivalently, the distribution of a shifted sum of independent generalized logistic random variables, according to the definition provided in [17].

Addressing now in more detail the distribution of $W_1$: if some of the positive or negative Exponential distributions in (6) have the same parameter we can sum them, obtaining in this way Gamma distributions, so that equation (6) can be written as
$$\Phi_{W_1}(t) = \prod_{k=1}^{\ell^+} (\lambda_k^+)^{r_k^+}\,(\lambda_k^+ - it)^{-r_k^+}\; \prod_{k=1}^{\ell^-} (\lambda_k^-)^{r_k^-}\,(\lambda_k^- + it)^{-r_k^-},$$
where $r^+ = (r_1^+, \dots, r_{\ell^+}^+)$ and $\lambda^+ = (\lambda_1^+, \dots, \lambda_{\ell^+}^+)$ are, respectively, the shape and the rate parameters corresponding to the positive Exponential distributions, $r^- = (r_1^-, \dots, r_{\ell^-}^-)$ and $\lambda^- = (\lambda_1^-, \dots, \lambda_{\ell^-}^-)$ are, respectively, the shape and the rate parameters corresponding to the negative Exponential distributions, $\ell^+$ is the number of positive Exponential distributions with different rate parameters and $\ell^-$ is the number of negative Exponential distributions with different rate parameters. Clearly, in this case, we have $\ell^+ = \ell^-$. As already mentioned, the exact distribution of $W_1$ is a DGIG distribution which, using the notation in Appendix 1 of [13], is specified by the shape parameters $r^+$ and $r^-$, the rate parameters $\lambda^+$ and $\lambda^-$, and the numbers $\ell^+$ and $\ell^-$ of different rates (10).

In the next section we will show how it is possible to derive near-exact approximations using the result in Theorem 2.
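The grouping of Exponential distributions with equal parameters into Gamma distributions can be sketched as follows. This is only an illustration under stated assumptions: the function name `dgig_parameters` is hypothetical, the products $\sigma_j\alpha_j$ are taken to be positive and rational so that equal rates can be detected exactly, and the rate of each Exponential term is read from (6) as $k/(\sigma_j\alpha_j)$, $k = 1, \dots, \delta - 1$. For $\alpha_j > 0$ the same shapes and rates apply to the positive and the negative part.

```python
from collections import Counter
from fractions import Fraction

def dgig_parameters(scales, delta):
    """Group the rates k / (sigma_j * alpha_j), k = 1..delta-1, into Gamma shapes.

    scales: the products sigma_j * alpha_j (assumed positive and rational).
    Returns (shapes, rates) with the rates sorted in increasing order.
    """
    counts = Counter()
    for c in scales:
        c = Fraction(c)
        for k in range(1, delta):
            counts[Fraction(k) / c] += 1   # one Exponential term with this rate
    rates = sorted(counts)
    shapes = [counts[lam] for lam in rates]
    return shapes, rates

if __name__ == "__main__":
    # Toy setting (an assumption): sigma_j * alpha_j = 1 and 2, delta = 3
    shapes, rates = dgig_parameters([1, 2], 3)
    for r, lam in zip(shapes, rates):
        print("rate", lam, "shape", r)
```

In this toy setting the rate 1 appears twice (once for each variable), so the four Exponential terms collapse into three Gamma terms.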

Near-exact distributions for W
The idea behind the development of near-exact approximations is to approximate just a 'part' of the characteristic function of $W$, in the present work $\Phi_{W_2}$ in (7), by another characteristic function in such a way that the resulting characteristic function corresponds to a known and manageable distribution. Similarly to [13], we will consider the following two approaches.

First near-exact approximation
In the first approach we approximate the distribution of $W = W_1 + W_2$ by the distribution of $W_1 + E(W_2)$, which is the distribution of $W_1$ with a shift. It is easy to show that $E(W_2) = \sum_{j=1}^p \mu_j\alpha_j$; therefore the approach is completely defined, and the approximating distribution for the distribution of $W$ is a shifted DGIG (SDGIG) distribution. This result is referred to as Theorem 3 in what follows.
Please see Appendix A for the expression of the probability density function of the SDGIG distribution, which is used to build the densities in Figure 1.
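To get a feeling for how much of $W$ is left out when $W_2$ is replaced by its expected value, one can compare variances. The sketch below is an illustration added here, not a computation from the original study. It relies on two standard facts that are not stated in the text: an Exponential variable with rate $k$ has variance $1/k^2$, and a $\mathrm{GLogistic}(\delta, \delta)$ variable has variance $2\psi'(\delta)$, where for integer $\delta$ the trigamma function is $\psi'(\delta) = \pi^2/6 - \sum_{k=1}^{\delta-1} k^{-2}$. The function name and the example values of $\sigma_j\alpha_j$ are assumptions.

```python
import math

def variance_split(scales, delta):
    """Return (var_W1, var_W2) for c_j = sigma_j * alpha_j and integer delta >= 2."""
    partial = sum(1.0 / k**2 for k in range(1, delta))
    trigamma_delta = math.pi**2 / 6.0 - partial   # psi'(delta) for integer delta
    sum_c2 = sum(c * c for c in scales)
    var_w1 = 2.0 * sum_c2 * partial               # two Exponential terms per (j, k)
    var_w2 = 2.0 * sum_c2 * trigamma_delta        # part ignored by the first approximation
    return var_w1, var_w2

if __name__ == "__main__":
    scales = [1.0, 2.0]                           # hypothetical values of sigma_j * alpha_j
    for delta in (4, 10, 40):
        v1, v2 = variance_split(scales, delta)
        print(delta, v1, v2, v2 / (v1 + v2))
```

The two parts always add up to the variance of $W$, $\frac{\pi^2}{3}\sum_j (\sigma_j\alpha_j)^2$, and the fraction attributed to $W_2$ shrinks as $\delta$ grows, which is consistent with the role of $\delta$ as a precision parameter.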

Second near-exact approximation
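To develop the second near-exact distribution we approximate the distribution of $W_2$ in (7) with a shifted Gamma distribution, denoted by $W_2^\star \sim \mathrm{SGamma}(\rho, \lambda, \theta)$, where $\rho$ is the shape parameter, $\lambda$ is the rate parameter and $\theta$ is the shift parameter, and whose characteristic function is
$$\Phi_{W_2^\star}(t) = e^{i\theta t}\,\lambda^{\rho}\,(\lambda - it)^{-\rho}. \tag{11}$$
The parameters $\rho$, $\lambda$, and $\theta$ are determined by solving the system of equations
$$\frac{\partial^h}{\partial t^h}\Phi_{W_2^\star}(t)\Big|_{t=0} = \frac{\partial^h}{\partial t^h}\Phi_{W_2}(t)\Big|_{t=0}, \quad h = 1, 2, 4, \tag{12}$$
that is, by matching the first, second and fourth moments of $W_2^\star$ and $W_2$. The following theorem holds.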

Theorem 4
Let $X_1, \dots, X_p$ be $p$ independent logistic random variables, with parameters $\mu_j \in \mathbb{R}$ and $\sigma_j \in \mathbb{R}^+$. If we use as an asymptotic approximation of $\Phi_{W_2}$ in (7) the characteristic function $\Phi_{W_2^\star}$ in (11), we obtain as near-exact distribution for $W = \sum_{j=1}^p \alpha_j X_j$, with $\alpha_j \in \mathbb{R}^+$, the distribution of
$$W_1 + W_2^\star$$
with $W_1$ distributed as in (10) and $W_2^\star \sim \mathrm{SGamma}(\rho, \lambda, \theta)$, where $\rho$, $\lambda$, and $\theta$ are given as solutions of the system in (12).
In Appendix A we present the expression of the cumulative distribution function of $W_1 + W_2^\star$. Please see also [13] for more details.
Remark: We could have also considered a mixture of shifted Gamma distributions to approximate the distribution of $W_2$ instead of the single shifted Gamma distribution in (11). This would give rise to even more accurate approximations. However, these approximations would be more difficult to implement and more time consuming in computational terms. For these reasons we have decided to leave this approach out of the present work.
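The moment matching behind the shifted Gamma approximation can be sketched in closed form under some assumptions that should be read as such: the system is taken to equate the raw moments of order 1, 2 and 4, the rate $\lambda$ is taken to be positive, and the moments of $W_2$ are computed from the cumulants of the $\mathrm{GLogistic}(\delta, \delta)$ distribution, namely variance $2\psi'(\delta)$ and fourth cumulant $2\psi'''(\delta)$, with the integer-argument formulas $\psi'(\delta) = \pi^2/6 - \sum_{k<\delta} k^{-2}$ and $\psi'''(\delta) = \pi^4/15 - 6\sum_{k<\delta} k^{-4}$. All function names are hypothetical and this is not the implementation used in the original study.

```python
import math

def target_moments(scales, shift, delta):
    """Mean, variance and fourth cumulant of W_2 for c_j = sigma_j*alpha_j, integer delta."""
    s2 = sum(1.0 / k**2 for k in range(1, delta))
    s4 = sum(1.0 / k**4 for k in range(1, delta))
    trigamma = math.pi**2 / 6.0 - s2            # psi'(delta)
    psi3 = math.pi**4 / 15.0 - 6.0 * s4         # psi'''(delta)
    var = 2.0 * trigamma * sum(c**2 for c in scales)
    kappa4 = 2.0 * psi3 * sum(c**4 for c in scales)
    return shift, var, kappa4                   # W_2 is symmetric about its mean

def fit_shifted_gamma(mean, var, kappa4):
    """Solve for (rho, lambda, theta) matching raw moments 1, 2 and 4 (lambda > 0 assumed).

    With u = rho**-0.5 the fourth-moment equation reduces to
    6*var^2*u^2 + 8*mean*var^1.5*u - kappa4 = 0.
    """
    a = 6.0 * var**2
    b = 8.0 * mean * var**1.5
    disc = math.sqrt(b * b + 4.0 * a * kappa4)
    # positive root, written to avoid cancellation when b >= 0
    u = 2.0 * kappa4 / (b + disc) if b >= 0.0 else (-b + disc) / (2.0 * a)
    rho = 1.0 / (u * u)
    lam = math.sqrt(rho / var)
    theta = mean - rho / lam
    return rho, lam, theta

def fourth_raw_moment(mean, var, mu3, kappa4):
    """E[X^4] from the mean, variance, third central moment and fourth cumulant."""
    return kappa4 + 3.0 * var**2 + 4.0 * mean * mu3 + 6.0 * mean**2 * var + mean**4

if __name__ == "__main__":
    # Hypothetical setting: sigma_j*alpha_j = 1, 2; shift sum(mu_j*alpha_j) = 0.5; delta = 4
    m, v, k4 = target_moments([1.0, 2.0], 0.5, 4)
    print(fit_shifted_gamma(m, v, k4))
```

Matching the first two raw moments is equivalent to matching the mean and the variance, which is why only the fourth-moment equation has to be solved.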

Measuring the accuracy and computational performance of the approximations
All the calculations in this section, and also the implementation of the approximations developed in the previous section, were carried out in Mathematica 10.0. We should emphasize that these approximations are only possible due to the strong connection between the theoretical results and the computational power available today.

To illustrate the properties and qualities of these approximations we will consider the following scenarios. In Figure 1 we present: i) the smooth empirical density determined from a simulation of 5 000 000 values of $W$ (solid line), ii) the probability density functions of the near-exact approximations given in Theorem 3 (see Appendix A for details) for $\delta = 4$ (dotted line), 10 (dashed line) and 40 (dot-dashed line). We see from Figure 1 that, for the near-exact approximations developed in Theorem 3 and for $\delta = 40$, there is a fairly good agreement between the exact and the approximating distribution. Clearly, a closer fit can be reached by considering higher values of $\delta$.

[Figure 1: panels for Scenario I and Scenario II]
Stat., Optim. Inf. Comput., Vol. 6, September 2018. F.J. Marques.

To assess the precision of the approximations we use a measure of proximity between characteristic functions which is also a measure of proximity between cumulative distribution functions. The measure is defined as
$$\Delta = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left| \frac{\Phi_W(t) - \Phi_{app}(t)}{t} \right| dt, \tag{13}$$
where $\Phi_W$ represents the exact characteristic function of $W$ and $\Phi_{app}$ represents an approximate characteristic function for $\Phi_W$. This measure has already been used in several related studies; for further details please see [12,13]. In Table 1, we computed the values of the measure $\Delta$ between the exact characteristic function of $W$ in (2) and the approximating characteristic function corresponding to the distribution in Theorem 3, which is given by $\exp\{ it \sum_{j=1}^p \mu_j\alpha_j \}\,\Phi_{W_1}(t)$ with $\Phi_{W_1}$ in (6). In Table 2 we considered the same exact characteristic function of $W$ and the approximating characteristic function corresponding to the distribution in Theorem 4, which is given by
$$\Phi_{W_1}(t)\,\Phi_{W_2^\star}(t)$$
with $\Phi_{W_1}$ in (6) and $\Phi_{W_2^\star}$ in (11).
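A rough numerical version of the measure $\Delta$ for the first near-exact approximation can be sketched as follows. This is an illustration only and not the Mathematica implementation used for the tables. It uses the fact that the shift factor has modulus one and is common to both characteristic functions, so that only real-valued quantities remain, together with the identity $\Gamma(1+ix)\,\Gamma(1-ix) = \pi x/\sinh(\pi x)$. The function names, the truncation point `t_max`, the trapezoid rule and the example values of $\sigma_j\alpha_j$ are assumptions.

```python
import math

def cf_w_centred(t, scales):
    """Exact CF of W without the shift factor: prod_j pi*c_j*t / sinh(pi*c_j*t)."""
    value = 1.0
    for c in scales:
        x = math.pi * c * abs(t)
        if x > 700.0:            # sinh would overflow; the factor is numerically zero
            return 0.0
        value *= 1.0 if x == 0.0 else x / math.sinh(x)
    return value

def cf_w1(t, scales, delta):
    """DGIG part: prod_j prod_{k=1}^{delta-1} k^2 / (k^2 + (c_j*t)^2)."""
    value = 1.0
    for c in scales:
        z2 = (c * t) ** 2
        for k in range(1, delta):
            value *= k * k / (k * k + z2)
    return value

def delta_measure_first(scales, delta, t_max=50.0, n_steps=20000):
    """Trapezoid estimate of (1/(2*pi)) * integral of |Phi_W - Phi_app| / |t| over the real line.

    The integrand is even in t and tends to 0 as t -> 0, so the integral over
    (0, t_max] is doubled; the tail beyond t_max is neglected.
    """
    h = t_max / n_steps
    total = 0.0
    for i in range(1, n_steps + 1):
        t = i * h
        f = abs(cf_w_centred(t, scales) - cf_w1(t, scales, delta)) / t
        total += f if i < n_steps else 0.5 * f
    return 2.0 * h * total / (2.0 * math.pi)

if __name__ == "__main__":
    scales = [1.0, 2.0]          # hypothetical values of sigma_j * alpha_j
    for delta in (4, 10, 40):
        print(delta, delta_measure_first(scales, delta))
```

Under these assumptions the estimated values decrease as $\delta$ increases, which is the qualitative behaviour reported for the first near-exact approximation.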
Table 1. Values of $\Delta$ for the first type of near-exact approximations given by Theorem 3

From Tables 1 and 2 we may see that the second near-exact approximation, corresponding to the result in Theorem 4, tends to give smaller values of the measure $\Delta$ than the first near-exact approximation. Both approximations improve their precision when $\delta$ increases. Thus, the parameter $\delta$ can be used to control the quality of these approximations. To study the efficiency of these approximations in computational terms we have determined the empirical 0.90 and 0.95 quantiles from a simulated sample of size 5 000 000 and evaluated the cumulative distribution functions corresponding to the first and second near-exact approximations at these quantiles. We present, in Tables 3 and 4, the results and also the computing time, in seconds, for the approximating values of $P(W \le q)$ obtained using the first and second type of near-exact approximations given in Theorems 3 and 4 and denoted in the tables, respectively, by $F_{1,q}$ and $F_{2,q}$.
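The empirical quantiles mentioned above can be obtained by straightforward simulation. The sketch below is an illustration under stated assumptions and does not reproduce the original experiment: it uses a much smaller sample, hypothetical parameter values, and, since the closed-form cumulative distribution function of Appendix A is not reproduced here, it also simulates the first near-exact approximation $W_1 + E(W_2)$ instead of evaluating that function. The helper names are hypothetical.

```python
import math
import random

def sample_w(rng, mus, sigmas, alphas):
    """One draw of W = sum_j alpha_j X_j, X_j ~ Logist(mu_j, sigma_j), by inversion."""
    total = 0.0
    for mu, sigma, alpha in zip(mus, sigmas, alphas):
        u = rng.random()
        while u == 0.0:                      # avoid log(0)
            u = rng.random()
        total += alpha * (mu + sigma * math.log(u / (1.0 - u)))
    return total

def sample_first_approx(rng, mus, sigmas, alphas, delta):
    """One draw of W_1 + E(W_2): shifted sum of differences of scaled Exponentials."""
    total = sum(mu * alpha for mu, alpha in zip(mus, alphas))
    for sigma, alpha in zip(sigmas, alphas):
        c = sigma * alpha
        for k in range(1, delta):
            # Exponential terms with rate k, multiplied by +c and -c as in (6)
            total += c * (rng.expovariate(k) - rng.expovariate(k))
    return total

def empirical_quantile(values, prob):
    """Simple order-statistic estimate of the prob-quantile."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(prob * len(ordered)))]

if __name__ == "__main__":
    rng = random.Random(2018)
    mus, sigmas, alphas = (0.0, 0.0), (1.0, 1.0), (1.0, 1.0)   # hypothetical scenario
    n = 50000
    w = [sample_w(rng, mus, sigmas, alphas) for _ in range(n)]
    for delta in (4, 10):
        a = [sample_first_approx(rng, mus, sigmas, alphas, delta) for _ in range(n)]
        print(delta, empirical_quantile(w, 0.90), empirical_quantile(a, 0.90))
```

Comparing the two empirical quantiles gives only a rough impression of the accuracy of the first approximation; the evaluation of $F_{1,q}$ and $F_{2,q}$ reported in the tables is the more precise check.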
From Tables 3 and 4 we may observe, in the scenarios considered, that i) the second near-exact approximation, although more accurate, can be quite time consuming, and therefore we suggest the use of small values of $\delta$ for this approximation, ii) the first near-exact approximation is quite fast but not as accurate, iii) in both approximations the computing time increases with the value of $\delta$ and with the number of variables, iv) finally, for practical purposes, we suggest the use of the second near-exact approximation in cases where high accuracy is required and of the first near-exact approximation in cases where speed is preferred over precision. In Table 4 we only considered values $\delta \le 10$ because, for larger values of $\delta$, the computing time becomes very high and thus the approximation is of reduced interest in practical terms.

Table 3. Computing times, in seconds, for the approximating values of $P(W \le q)$ obtained using the first type of near-exact approximations given by Theorem 3, where $q$ is the 0.90 or 0.95 empirical quantile

Application of this procedure to the linear combination of independent generalized logistic distributions
The near-exact approximations developed in Section 3 were obtained by approximating $\Phi_{W_2}$ in (7). The expression $\exp\{ it \sum_{j=1}^p \mu_j\alpha_j \}$ in (7) corresponds to a shift in the distribution with characteristic function given by
$$\prod_{j=1}^{p} \frac{\Gamma(\delta-\sigma_j\alpha_j it)\,\Gamma(\delta+\sigma_j\alpha_j it)}{\Gamma(\delta)^2}\,,$$
which, according to the definition in [17], is the characteristic function of a linear combination of independent generalized logistic random variables. Thus, to derive near-exact approximations for the linear combination of independent logistic random variables we have approximated a shifted linear combination of independent generalized logistic random variables by a shifted Gamma distribution. It therefore seems appropriate to use the same procedure developed in Section 3 to develop near-exact approximations for the linear combination of generalized logistic distributions. We will only briefly explain the procedure since it is very similar to the one developed in Section 3.

Let us consider $Y_1, \dots, Y_p$ independent generalized logistic random variables, with parameters $p_j, q_j \in \mathbb{R}^+$, that is, $Y_j \sim \mathrm{GLogistic}(p_j, q_j)$, for $j = 1, \dots, p$. In [17] the authors give the moment generating function of $Y_j$, from which it is easy to derive the corresponding characteristic function. Thus, the characteristic function of $Y_j$ is given by
$$\Phi_{Y_j}(t) = \frac{\Gamma(p_j + it)\,\Gamma(q_j - it)}{\Gamma(p_j)\,\Gamma(q_j)}\,.$$
Therefore, the characteristic function of the linear combination of $p$ independent generalized logistic random variables, $Z = \sum_{j=1}^p \alpha_j Y_j$, for $\alpha_j \in \mathbb{R}$, is
$$\Phi_Z(t) = \prod_{j=1}^{p} \frac{\Gamma(p_j + \alpha_j it)\,\Gamma(q_j - \alpha_j it)}{\Gamma(p_j)\,\Gamma(q_j)}\,. \tag{15}$$
In order to develop near-exact approximations for the distribution of $Z$ we may represent the exact characteristic function of $Z$ as in the following theorem.

Theorem 5
Let $Y_1, \dots, Y_p$ be $p$ independent generalized logistic random variables, with parameters $p_j, q_j \in \mathbb{R}^+$. Then, for $\delta \in \mathbb{N}$, the characteristic function of $Z = \sum_{j=1}^p \alpha_j Y_j$ with $\alpha_j \in \mathbb{R}$ may be written as
$$\Phi_Z(t) = \Phi_{Z_1}(t)\,\Phi_{Z_2}(t)$$
with
$$\Phi_{Z_1}(t) = \prod_{j=1}^{p} \prod_{k=0}^{\delta-1} (p_j+k)\,(p_j+k+\alpha_j it)^{-1}\,(q_j+k)\,(q_j+k-\alpha_j it)^{-1} \tag{17}$$
and
$$\Phi_{Z_2}(t) = \prod_{j=1}^{p} \frac{\Gamma(p_j+\delta+\alpha_j it)}{\Gamma(p_j+\delta)}\; \frac{\Gamma(q_j+\delta-\alpha_j it)}{\Gamma(q_j+\delta)}\,. \tag{18}$$

Proof: The proof is similar to the one of Theorem 2. □

Similarly to what was done for $W_1$ in Section 3, but now for the distribution of $Z_1$: if some of the positive or negative Exponential distributions in (17) have the same parameter we can sum them, obtaining in this way Gamma distributions, so that
$$\Phi_{Z_1}(t) = \prod_{k=1}^{\ell^+} (\lambda_k^+)^{r_k^+}\,(\lambda_k^+ - it)^{-r_k^+}\; \prod_{k=1}^{\ell^-} (\lambda_k^-)^{r_k^-}\,(\lambda_k^- + it)^{-r_k^-},$$
where, again, $r^+ = (r_1^+, \dots, r_{\ell^+}^+)$ and $\lambda^+ = (\lambda_1^+, \dots, \lambda_{\ell^+}^+)$ are, respectively, the shape and rate parameters corresponding to the positive Exponential distributions, $r^- = (r_1^-, \dots, r_{\ell^-}^-)$ and $\lambda^- = (\lambda_1^-, \dots, \lambda_{\ell^-}^-)$ are, respectively, the shape and rate parameters corresponding to the negative Exponential distributions, $\ell^+$ is the number of positive Exponential distributions with different rate parameters and $\ell^-$ is the number of negative Exponential distributions with different rate parameters. In this case $\ell^+$ may not be equal to $\ell^-$.

Very succinctly, noticing that $Z = Z_1 + Z_2$ and following a procedure similar to the one used in Section 3 and stated in Theorems 3 and 4, we will consider the following approximations:

1. In the first approach, for the distribution of $Z = Z_1 + Z_2$ we consider the distribution of $Z_1 + E(Z_2)$. The resulting distribution is a shifted DGIG distribution with corresponding characteristic function given by
$$\exp\{ it\, E(Z_2) \}\,\Phi_{Z_1}(t). \tag{20}$$

2. For the second near-exact distribution it may happen that the solution of a system similar to the one in (12) provides a negative value for $\lambda$. In this case we will consider a negative shifted Gamma distribution for $Z_2^\star$ to approximate the distribution of $Z_2$. One should note that, in this case, the resulting approximating distribution of $Z_1 + Z_2^\star$ was also studied in [13]. Thus, the second near-exact approximation will be obtained by approximating the distribution of $Z_2$ in (18) with a positive or negative shifted Gamma distribution. For the positive case the characteristic function is given by
$$\Phi_{Z_2^\star}(t) = e^{i\theta t}\,\lambda^{\rho}\,(\lambda - it)^{-\rho}$$
and for the negative case by $\Phi_{Z_2^\star}(-t)$. This procedure was also adopted in [13] for the linear combination of independent Gumbel random variables. The parameters $\rho$, $\lambda$, and $\theta$ are determined by solving the system of equations in (12), but now replacing $\Phi_{W_2^\star}(t)$ by $\Phi_{Z_2^\star}(t)$ and $\Phi_{W_2}(t)$ by $\Phi_{Z_2}(t)$. Thus, the distribution of $Z = \sum_{j=1}^p \alpha_j Y_j$ will be approximated by the distribution of
$$Z_1 + Z_2^\star \quad \text{or} \quad Z_1 - Z_2^\star$$
with characteristic functions given, respectively, by
$$\Phi_{Z_1}(t)\,\Phi_{Z_2^\star}(t) \quad \text{and} \quad \Phi_{Z_1}(t)\,\Phi_{Z_2^\star}(-t). \tag{21}$$
Details on the distribution of $Z_1 + Z_2^\star$ and of $Z_1 - Z_2^\star$ can be found in Appendix 1 of [13].

To illustrate the quality of these approximations we will also use the measure $\Delta$ in (13). In this measure we will consider the exact characteristic function of $Z$ in (15) and the characteristic functions corresponding to the first near-exact approximation in (20) and to the second near-exact approximation in (21). In Tables 5-8 ahead we consider the following three scenarios:
- Scenario IV: $p_{IV} = (1, 2)$, $q_{IV} = (5, 6)$, and $\alpha_{IV} = (2, 3)$;
- Scenario V: $p_{V} = (1, 2, 3)$, $q_{V} = (1, 5, 3)$, and $\alpha_{V} = (4, 5, 6)$;

In Scenarios IV and V we considered for the distribution of $Z_2^\star$ a shifted Gamma distribution and in Scenario VI we used a negative shifted Gamma distribution for the approximating distribution. In Tables 5-8 we may observe, for both approximations, the same kind of behaviour already described in Tables 1-4 for the near-exact approximations developed in Section 3. The second near-exact approximation is again more precise but requires more computing power and time. Again, in this case we have only considered positive $\alpha_j$ ($j = 1, \dots, p$); however, the general case with $\alpha_j \in \mathbb{R}$ can be addressed in the same manner.

Further research must be done in order to analyse all the details of this second near-exact approximation; for example, it is important to find out in which cases we should use the positive or the negative shifted Gamma distribution. Another issue is which moments should be matched. We have chosen to match the first, second and fourth moments because, when we tried to match the first three moments, we found that the system in (12) had no solution in the case of the linear combination of independent logistic random variables. However, for the linear combination of independent generalized logistic distributions it is possible to use the first three moments. For both near-exact approximations, when the number of variables and/or $\delta$ increases the computing time also increases; this trend is more pronounced for the second near-exact approximation. As a final remark, we emphasize that the approximations developed in this work are only possible due to the combination of theory and computational techniques, and can only be implemented, for practical purposes, because of the computing power available today.

Figure 1. Plots of the smooth empirical density of $W$ (solid line) and of the near-exact approximations in Theorem 3, whose density functions are given in Appendix A, for $\delta = 4$ (dotted line), 10 (dashed line) and 40 (dot-dashed line). In these plots $f(\cdot)$ stands for the density functions and $y$ is the running value.