janmr blog

Equality of Floating-Point Numbers

With floating-point numbers, exact, bit-for-bit equality is almost never what you want. The result of most floating-point operations, such as addition, multiplication and trigonometric functions, cannot be represented exactly due to the limited precision of floating-point numbers. Furthermore, in most practical situations we are only interested in a result that is "close enough", not one that is correct in every available digit.

Say you want to compare two floating-point numbers $u$ and $v$ and consider the error $|u-v|$. It is natural to compare this error to some bound which is relative to the size of the numbers,

$$|u-v| \leq \epsilon_{\text{rel}} \cdot \max(|u|, |v|).$$

Using $\max(|u|, |v|)$ ensures that this relation is symmetric in $u$ and $v$. This is a nice property to have, as it would be unfortunate if $u$ could be close to $v$ while $v$ was not close to $u$. We could also use $\min(|u|, |v|)$, which would result in a stronger requirement, or $\tfrac{1}{2}(|u|+|v|)$, which would lead to a behaviour between the $\min$ and $\max$ expressions.
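To make the role of the scaling term concrete, here is a minimal Python sketch. The function names `close_rel` and `close_to_second` and the sample values are made up for illustration. It first shows how scaling by only one of the arguments makes the answer depend on the argument order, and then compares the three symmetric choices.

```python
def close_rel(u, v, eps_rel, scale="max"):
    """Relative closeness test; `scale` picks how the size of u and v is measured."""
    au, av = abs(u), abs(v)
    sizes = {
        "max": max(au, av),
        "min": min(au, av),
        "mean": 0.5 * (au + av),
    }
    return abs(u - v) <= eps_rel * sizes[scale]


def close_to_second(u, v, eps_rel):
    """Asymmetric variant, for contrast only: scales by |v| alone."""
    return abs(u - v) <= eps_rel * abs(v)


u, v = 100.0, 110.0  # |u - v| = 10

# Scaling by a single argument gives different answers when u and v are swapped.
print(close_to_second(u, v, 0.095), close_to_second(v, u, 0.095))

# The symmetric scales, ordered from the weakest to the strongest requirement.
for eps_rel in (0.093, 0.096):
    print(eps_rel, [close_rel(u, v, eps_rel, s) for s in ("max", "mean", "min")])
```

With these values, a tolerance of 0.093 is accepted only by the $\max$ variant, while 0.096 is accepted by the $\max$ and mean variants but still rejected by the $\min$ variant.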

In the inequality above, the quantity $\epsilon_{\text{rel}}$ controls how close the numbers must be to be considered approximately equal. Using $\epsilon_{\text{rel}}=10^{-k}$ means that roughly the $k$ most significant decimal digits are correct. For example, $|3 \cdot 10^8 - 299792458| \leq \epsilon_{\text{rel}} \cdot 3 \cdot 10^8$ is true for $\epsilon_{\text{rel}}=10^{-3}$ but not for $\epsilon_{\text{rel}}=10^{-4}$. Also, $|\pi-3.14159| \leq \epsilon_{\text{rel}} \cdot \pi$ is true for $\epsilon_{\text{rel}}=10^{-6}$ but not for $\epsilon_{\text{rel}}=10^{-7}$.
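The two numeric examples can be checked directly. The helper name `relative_close` below is just an illustrative choice; the test itself is the relative inequality from above.

```python
import math


def relative_close(u, v, eps_rel):
    """The relative test: |u - v| <= eps_rel * max(|u|, |v|)."""
    return abs(u - v) <= eps_rel * max(abs(u), abs(v))


speed_of_light = 299792458.0

# |3e8 - 299792458| = 207542, which is below 3e5 but above 3e4.
print(relative_close(3e8, speed_of_light, 1e-3))  # True
print(relative_close(3e8, speed_of_light, 1e-4))  # False

# |pi - 3.14159| is about 2.65e-6, which is below 3.14e-6 but above 3.14e-7.
print(relative_close(math.pi, 3.14159, 1e-6))  # True
print(relative_close(math.pi, 3.14159, 1e-7))  # False
```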

This way of checking closeness breaks down, however, when comparing numbers to zero or to numbers close to zero. For instance, is $10^{-8}$ close to zero? Since $(10^{-8} - 0)/10^{-8} = 1$, we see that it would require a relative tolerance of at least $1$ for the two to be viewed as approximately equal according to the test above. In such cases it makes sense to look at the absolute error instead,

$$|u-v| \leq \epsilon_{\text{abs}}.$$
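A small sketch of this breakdown, with the hypothetical helper names `rel_close` and `abs_close` and an arbitrarily chosen absolute tolerance of $10^{-6}$:

```python
def rel_close(u, v, eps_rel):
    """Relative test: |u - v| <= eps_rel * max(|u|, |v|)."""
    return abs(u - v) <= eps_rel * max(abs(u), abs(v))


def abs_close(u, v, eps_abs):
    """Absolute test: |u - v| <= eps_abs."""
    return abs(u - v) <= eps_abs


tiny = 1e-8

# Compared to zero, the relative error is always 1, so any eps_rel < 1 fails.
print(rel_close(tiny, 0.0, 1e-3))   # False
print(rel_close(tiny, 0.0, 0.999))  # False
print(rel_close(tiny, 0.0, 1.0))    # True, but such a tolerance is useless in practice

# The absolute test only looks at the size of the difference.
print(abs_close(tiny, 0.0, 1e-6))   # True
```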

Combining these two inequalities we get that $u$ and $v$ are approximately equal, $u \sim v$, when

$$|u-v| \leq \max\Big( \epsilon_{\text{rel}} \cdot \max(|u|, |v|), \epsilon_{\text{abs}} \Big).$$

This is the test suggested for approximate equality in a Python Enhancement Proposal from 2015 (PEP 485). It is implemented as `isclose` in the `math` module (CPython implementation).
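As a rough sketch, the combined inequality can be written out by hand and compared with `math.isclose`. The name `approx_equal` and its default tolerances are illustrative choices. The two early returns are not part of the inequality; they are added here because the bare formula would otherwise report an infinity as close to any finite number, whereas the `math.isclose` documentation states that infinities are only close to themselves.

```python
import math


def approx_equal(u, v, eps_rel=1e-9, eps_abs=0.0):
    """Sketch of the combined relative/absolute closeness test."""
    if u == v:
        return True  # covers exact matches, including equal infinities
    if math.isinf(u) or math.isinf(v):
        return False  # an infinity is not close to anything but itself
    return abs(u - v) <= max(eps_rel * max(abs(u), abs(v)), eps_abs)


pairs = [
    (1.0, 1.0 + 1e-10),
    (1e-8, 0.0),
    (3e8, 299792458.0),
    (math.inf, math.inf),
    (math.inf, 1.0),
    (math.nan, math.nan),
]

# Compare the sketch against the standard library for a few tolerance settings.
for eps_rel, eps_abs in [(1e-9, 0.0), (1e-3, 0.0), (1e-9, 1e-6)]:
    for u, v in pairs:
        mine = approx_equal(u, v, eps_rel, eps_abs)
        theirs = math.isclose(u, v, rel_tol=eps_rel, abs_tol=eps_abs)
        print(u, v, eps_rel, eps_abs, mine, theirs)
```

In everyday code there is little reason to prefer the hand-written version over `math.isclose`; the sketch is only meant to show that the inequality above is all there is to it, apart from the special cases.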

Some rules of thumb for choosing $\epsilon_{\text{rel}}$ and $\epsilon_{\text{abs}}$:

  • Use $\epsilon_{\text{rel}}=10^{-k}$ when you want (roughly) $k$ correct decimal digits.
  • Let $\epsilon_{\text{abs}}$ determine when a number is considered (close to) zero, $|u| \leq \epsilon_{\text{abs}} \Leftrightarrow u \sim 0$. Use $\epsilon_{\text{abs}}=0$ if you don't need to consider numbers close to zero.
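Applied to `math.isclose`, these rules of thumb might look as follows. The helper `rel_tol_for_digits` and the absolute tolerance $10^{-12}$ are assumptions made for the example, not recommendations.

```python
import math


def rel_tol_for_digits(k):
    """Relative tolerance for roughly k correct decimal digits."""
    return 10.0 ** -k


# 3.1416 agrees with pi to about 4 significant digits, but not to 6.
print(math.isclose(math.pi, 3.1416, rel_tol=rel_tol_for_digits(4)))  # True
print(math.isclose(math.pi, 3.1416, rel_tol=rel_tol_for_digits(6)))  # False

# A rounding residue that should be zero mathematically, but is not in floating point.
residue = 0.1 + 0.2 - 0.3

# With eps_abs = 0 (the default abs_tol) nothing nonzero is close to zero.
print(math.isclose(residue, 0.0, rel_tol=1e-9))                  # False
# A positive eps_abs decides what counts as "zero".
print(math.isclose(residue, 0.0, rel_tol=1e-9, abs_tol=1e-12))   # True
```

What a sensible absolute tolerance is depends entirely on the scale of the quantities involved, which is why the default `abs_tol` is zero.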

Some extra resources to check out:

Feel free to leave any question, correction or comment in this Mastodon thread.