One of the most common tasks in statistical computing is computation of sample variance. This would seem to be straightforward; there are a number of algebraically equivalent ways of representing the sum of squares \(S\), such as
\[
S = \sum_{k=1}^n ( x_k - \bar{x})^2
\]
or
\[
S = \sum_{k=1}^n x_k^2 + \frac{1}{n}\bar{x}^2
\]
and the sample variance is simply \(S/(n-1)\).

What is straightforward algebraically, however, is sometimes not so straightforward in the floating-point arithmetic used by computers. Computers cannot represent numbers to infinite precision, and arithmetic operations can affect the precision of floating-point numbers in unexpected ways.

7 hours ago