Big Data/Analytics Zone is brought to you in partnership with:

Arthur Charpentier, ENSAE, PhD in Mathematics (KU Leuven), Fellow of the French Institute of Actuaries, professor at UQàM in Actuarial Science. Former professor-assistant at ENSAE Paritech, associate professor at Ecole Polytechnique and professor assistant in economics at Université de Rennes 1. Arthur is a DZone MVB and is not an employee of DZone and has posted 156 posts at DZone. You can read more from them at their website. View Full User Profile

Bounding Sums of Random Variables: Part 2

09.30.2012
| 2289 views |
  • submit to reddit

It's possible to go further -- much further -- on bounding sums of random variables (mentioned in the previous post). For instance, if everything has been defined, in that previous post, on distributions on , it is possible to extend bounds of distributions on . Especially if we deal with quantiles. Everything we've seen remain valid. Consider for instance two  distributions. Using the previous code, it is possible to compute bounds for the quantiles of the sum of two Gaussian variates. And one has to remember that those bounds are sharp.

> Finv=function(u) qnorm(u,0,1)
> Ginv=function(u) qnorm(u,0,1)
> n=1000
> Qinf=Qsup=rep(NA,n-1)
> for(i in 1:(n-1)){
+ J=0:i
+ Qinf[i]=max(Finv(J/n)+Ginv((i-J)/n))
+ J=(i-1):(n-1)
+ Qsup[i]=min(Finv((J+1)/n)+Ginv((i-1-J+n)/n))
+ }

Actually, it is possible to compare here with two simple cases: the independent case, where the sum has a  distribution, and the comonotonic case where the sum has a  distribution.

>  lines(x,qnorm(x,sd=sqrt(2)),col="blue",lty=2)
>  lines(x,qnorm(x,sd=2),col="blue",lwd=2)

On the graph below, the comonotonic case (usually considered as the worst case scenario) is the plain blue line (with here an animation to illustrate the convergence of the numerical algorithm)

Below that (strong) blue line, then risks are sub-additive for the Value-at-Risk, i.e.

but above, risks are super-additive for the Value-at-RIsk. i.e.

(since for comonotonic variates, the quantile of the sum is the sum of quantiles). It is possible to visualize those two cases above, in green the area where risks are super-additive, while the yellow area is where risks are sub-additive.

Recall that with a Gaussian random vector, with correlation http://latex.codecogs.com/gif.latex?r then the quantile is the quantile of a random variable centered, with variance http://latex.codecogs.com/gif.latex?2(1+r). Thus, on the graph below, we can visualize case that can be obtained with this Gaussian copula. Here the yellow area can be obtained with a Gaussian copulas, the upper and the lower bounds being respectively the comonotonic and the countermononic cases.

http://freakonometrics.blog.free.fr/public/perso6/sum-norm-G-bounds2.gif

But the green area can also be obtained when we sum two Gaussian variables ! We just have to go outside the Gaussian world, and consider another copula.

Another point is that, in the previous post, http://latex.codecogs.com/gif.latex?C^- was the lower Fréchet-Hoeffding bound on the set of copulas. But all the previous results remain valid if http://latex.codecogs.com/gif.latex?C^- is a lower bound on the set of copulas of interest. Especially

http://latex.codecogs.com/gif.latex?\tau_{C^-,L}(F,G)\leq%20\sigma_{C,L}(F,G)\leq\rho_{C^-,L}(F,G)

for all http://latex.codecogs.com/gif.latex?C such that http://latex.codecogs.com/gif.latex?C\geq%20C^-. For instance, if we assume that the copula should have positive dependence, i.e. http://latex.codecogs.com/gif.latex?C\geq%20C^\perp, then

http://latex.codecogs.com/gif.latex?\tau_{C^\perp,L}(F,G)\leq%20\sigma_{C,L}(F,G)\leq\rho_{C^\perp,L}(F,G)

Which means we should have sharper bounds. Numerically, it is possible to compute those sharper bounds for quantiles. The lower bound becomes

http://latex.codecogs.com/gif.latex?\sup_{u\in[x,1]}\left\{F^{-1}(u)+G^{-1}\left(\frac{x}{u}\right)\right\}

while the upper bound is

http://latex.codecogs.com/gif.latex?\sup_{u\in[0,x]}\left\{F^{-1}(u)+G^{-1}\left(\frac{x-u}{1-u}\right)\right\}

Again, one can easily compute those quantities on a grid of the unit interval,

> Qinfind=Qsupind=rep(NA,n-1)
> for(i in 1:(n-1)){
+  J=1:(i)
+  Qinfind[i]=max(Finv(J/n)+Ginv((i-J)/n/(1-J/n)))
+  J=(i):(n-1)
+  Qsupind[i]=min(Finv(J/n)+Ginv(i/J))
+ }

We get the graph below (the blue area is here to illustrate how sharper those bounds get with the assumption that we do have positive dependence, this area been attained only with copulas exhibiting non-positive dependence)

For high quantiles, the upper bound is rather close to the one we had before, since worst case are probably obtained when we do have positive correlation. But it will strongly impact the lower bound. For instance, it becomes now impossible to have a negative quantile, when the probability exceeds 75% if we do have positive dependence...

> Qinfind[u==.75]
[1] 0

 

Published at DZone with permission of Arthur Charpentier, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)