Big Data/Analytics Zone is brought to you in partnership with:

John Cook is an applied mathematician working in Houston, Texas. His career has been a blend of research, software development, consulting, and management. John is a DZone MVB and is not an employee of DZone and has posted 173 posts at DZone. You can read more from them at their website. View Full User Profile

Beta Inequalities in R

  • submit to reddit

Someone asked me yesterday for R code to compute the probability P(X > Y + δ) where X and Y are independent beta random variables. I’m posting the solution here in case it benefits anyone else.

For an example of why you might want to compute this probability, see A Bayesian view of Amazon resellers.

Let X be a Beta(a, b) random variable and Y be a Beta(c, d) random variable. Denote PDFs by f and CDFs by F. Then the probability we need is

P(X > Y + \delta) &=& \int_\delta^1 \int_0^{x-\delta} f_X(x) \, f_Y(y)\, dy\,dx \\ &=& \int_\delta^1 f_X(x)\, F_Y(x-\delta) \, dx

If you just need to compute this probability a few times, here is a desktop application to compute random inequalities.

But if you need to do this computation repeated inside R code, you could use the following.

beta.ineq <- function(a, b, c, d, delta)
    integrand <- function(x) { dbeta(x, a, b)*pbeta(x-delta, c, d) }
    integrate(integrand, delta, 1, rel.tol=1e-4)$value

The code is as good or as bad as R’s integrate function. It’s probably accurate enough as long as none of the parameters a, b, c, or d are near zero. When one or more of these parameters is small, the integral is harder to compute numerically.

There is no error checking in the code above. A more robust version would verify that all parameters are positive and that delta is less than 1.

Here’s the solution to the corresponding problem for gamma random variables, provided delta is zero: A support one-liner.

And here is a series of blog posts on random inequalities.

Analytical results
Numerical results
Cauchy distributions
Beta distributions
Gamma distributions
Three or more random variables
Folded normals




Published at DZone with permission of John Cook, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)