Beyond simple addition: The nuances of V(X+Y) and covariance
Why adding only individual variances can lead to underestimation
In statistics and probability, what is the variance of the sum of two random variables?
Expressing my question in mathematical notation, what is
V(X + Y)?
It is very tempting to conclude that
V(X + Y) = V(X) + V(Y),
but this is true ONLY if the random variables X and Y are uncorrelated. (An even stronger condition is independence, which is what many practitioners think about in their real-life work.)
If X and Y are positively correlated, then simply adding V(X) and V(Y) will underestimate the variance of their sum, because it ignores the covariance between X and Y. This can lead to disastrous consequences in all types of applications, such as making business investments, performing medical treatments, or building public infrastructure.
If X and Y are correlated, then
V(X + Y) = V(X) + V(Y) + 2Cov(X, Y).
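A short derivation shows where the covariance term in the variance of a sum comes from. The symbols μ_X = E[X] and μ_Y = E[Y] are notation assumed here for the two means; they do not appear in the original text. Expanding the definition of variance and using linearity of expectation gives:

```latex
\begin{align*}
V(X + Y) &= E\big[\big((X - \mu_X) + (Y - \mu_Y)\big)^2\big] \\
         &= E\big[(X - \mu_X)^2\big] + E\big[(Y - \mu_Y)^2\big]
            + 2\,E\big[(X - \mu_X)(Y - \mu_Y)\big] \\
         &= V(X) + V(Y) + 2\,\mathrm{Cov}(X, Y).
\end{align*}
```

The cross term of the squared sum is exactly twice the covariance, so it vanishes only when X and Y are uncorrelated.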
Notice the extra term: the covariance multiplied by 2. It is also possible for V(X) + V(Y) to overestimate V(X + Y), because the covariance can be negative.
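Both directions of the error can be checked numerically. The sketch below is a minimal illustration, not a definitive implementation: the helper name `variance_gap`, the seed, the sample size, and the coefficients 0.8 and 0.6 are all assumptions chosen for the example. It builds one Y that is positively correlated with X and one that is negatively correlated, then compares V(X + Y) with the naive sum V(X) + V(Y).

```python
import numpy as np

def variance_gap(x, y):
    """Return (V(X+Y), V(X)+V(Y), 2*Cov(X,Y)) estimated from samples."""
    var_sum = np.var(x + y, ddof=1)                # sample variance of the sum
    naive = np.var(x, ddof=1) + np.var(y, ddof=1)  # sum of individual variances
    two_cov = 2 * np.cov(x, y, ddof=1)[0, 1]       # twice the sample covariance
    return var_sum, naive, two_cov

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
x = rng.normal(size=10_000)
noise = rng.normal(size=10_000)

y_pos = 0.8 * x + 0.6 * noise   # positively correlated with x
y_neg = -0.8 * x + 0.6 * noise  # negatively correlated with x

for label, y in [("positive", y_pos), ("negative", y_neg)]:
    var_sum, naive, two_cov = variance_gap(x, y)
    print(f"{label}: V(X+Y)={var_sum:.3f}  V(X)+V(Y)={naive:.3f}  2Cov={two_cov:.3f}")
```

With positive correlation the naive sum falls short of V(X + Y), and with negative correlation it overshoots. In both cases the gap equals 2Cov(X, Y) up to floating-point error, because the identity also holds exactly for sample variances and covariances computed with the same `ddof`.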
More generally, if scalar constants a and b multiply the random variables, then the variance of the sum is
V(aX + bY) = a²V(X) + b²V(Y) + 2ab Cov(X, Y).
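The weighted case can be sketched in a few lines. Here a and b denote the scalar constants, and the function name `var_linear_combo` and all numeric inputs are hypothetical values picked for illustration. The result is cross-checked against the equivalent quadratic form wᵀΣw, where w = (a, b) and Σ is the 2×2 covariance matrix of X and Y.

```python
import numpy as np

def var_linear_combo(a, b, var_x, var_y, cov_xy):
    """V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab Cov(X, Y)."""
    return a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

# Hypothetical inputs chosen for illustration only.
a, b = 2.0, -1.0
var_x, var_y, cov_xy = 4.0, 9.0, 3.0

formula = var_linear_combo(a, b, var_x, var_y, cov_xy)

# Cross-check: the same quantity written as the quadratic form w^T Sigma w.
w = np.array([a, b])
sigma = np.array([[var_x, cov_xy],
                  [cov_xy, var_y]])
quadratic_form = w @ sigma @ w

print(formula, quadratic_form)
```

Both expressions evaluate to 13 for these inputs: 16 + 9 − 12. The sign of the cross term depends on the product ab as well as on the covariance, so a negative weight can turn a positive covariance into a variance reduction.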
Forgetting the covariance is a common mistake for students who are first learning mathematical statistics, so make sure to remember that last term, 2Cov(X, Y)!