Precision and Reproducibility in Computing

Run-to-run reproducibility in computing is often taken for granted. However, software running on modern computer architectures, particularly when coupled with highly optimized performance libraries, is often guaranteed to produce reproducible results only up to a certain precision; beyond that, results can and do vary from run to run. Reproducibility is bound up with the precision of floating point types and the rounding that results, with operation reordering, with memory layout and use, and with how real numbers are represented internally in a computer’s registers.

This issue of reproducibility arises for NMath users when writing and running unit tests, which is why it is important to compare floating point numbers in tests only up to their designed precision, at an absolute maximum. In the IEEE 754 floating point representation, to which virtually all modern computers adhere, the single precision float type uses 32 bits (4 bytes) and offers 24 bits of precision, or about 7 decimal digits, while the double precision double type requires 64 bits (8 bytes) and offers 53 bits of precision, or about 15 decimal digits. Few algorithms can achieve significant results to the 15th decimal place because of rounding, cancellation in subtraction, and other sources of precision loss. NMath’s numerical results are tested, at a maximum, to the 14th decimal place.
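
As a quick illustration of these digit limits (a minimal sketch, not NMath code), printing 1/3 at full precision shows a float diverging from the true value after about 7 significant digits, and a double after about 16:

      float f = 1.0f / 3.0f;
      double d = 1.0 / 3.0;
      Console.WriteLine( f.ToString( "G9" ) );    // 0.333333343         -- roughly 7 meaningful digits
      Console.WriteLine( d.ToString( "G17" ) );   // 0.33333333333333331 -- roughly 16 meaningful digits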

A Precision Example

As an example, what does the following code output?

      double x = .050000000000000003;
      double y = .050000000000000000;
      if ( x == y )
        Console.WriteLine( "x is y" );
      else
        Console.WriteLine( "x is not y" );

I get “x is y”, even though the two literals are clearly different numbers. The difference lies beyond the precision of the double type, so both literals round to exactly the same double value.

Due to these limits of representation and the resulting rounding, the numerical results of some operations can be affected by reordering or regrouping that would be harmless in exact arithmetic. For example, with floating point types (a + b) + c may not equal a + (b + c), and a*x + a*z may not equal a*(x + z). This can be difficult to test with modern optimizing compilers, because the code you write and the code that actually runs can be organized very differently; the two are mathematically equivalent, but not necessarily numerically equivalent.
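
The grouping effect is easy to demonstrate with three ordinary literals. The following sketch (illustrative code, not NMath) sums 0.1, 0.2, and 0.3 in two different groupings and gets two different answers:

      double a = 0.1, b = 0.2, c = 0.3;
      double leftToRight = ( a + b ) + c;   // 0.6000000000000001
      double rightToLeft = a + ( b + c );   // 0.6
      Console.WriteLine( leftToRight == rightToLeft );   // False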

So reproducibility is limited not only by precision itself but also by dynamic reordering of floating point operations, run-time processor dispatching, data-array alignment, and variation in the number of threads, among other factors. These issues can create run-to-run differences in the least significant digits. Two runs, same code, two answers. This is by design and is not an issue of correctness. Subtle changes in the memory layout of the program’s data, differences in register loading and operation order, and differences in threading, often caused by unrelated processes running on the same machine, produce these run-to-run differences.
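
Another process cannot be summoned on demand to perturb a run, but the mechanism is easy to simulate (a sketch, not NMath code): summing the same data in two different orders, much as two different thread schedules effectively do, can change the last digits of the total.

      var rng = new Random( 42 );
      double[] data = new double[1000000];
      for ( int i = 0; i < data.Length; i++ )
        data[i] = rng.NextDouble() * 1e6;

      double forward = 0.0;
      double backward = 0.0;
      for ( int i = 0; i < data.Length; i++ ) forward += data[i];
      for ( int i = data.Length - 1; i >= 0; i-- ) backward += data[i];

      Console.WriteLine( forward == backward );   // typically False
      Console.WriteLine( forward - backward );    // typically a tiny, nonzero difference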

Managing Reproducibility

Most importantly, test a code’s numerical results only to the precision that can be expected given the algorithm, the input data, and, finally, the limits of floating point arithmetic. To do this in unit tests, compare floating point numbers carefully, only to a fixed number of digits. The code snippet below compares two double numbers and returns true only if the numbers match to a specified number of digits.

// DOUBLE_EPSILON is a small tolerance constant defined elsewhere in the test class.
private static bool EqualToNumDigits( double expected, double actual, int numDigits )
{
  // Use a relative difference when the larger magnitude exceeds 1, an absolute difference otherwise.
  double max = System.Math.Abs( expected ) > System.Math.Abs( actual ) ? System.Math.Abs( expected ) : System.Math.Abs( actual );
  double diff = System.Math.Abs( expected - actual );
  double relDiff = max > 1.0 ? diff / max : diff;
  if ( relDiff <= DOUBLE_EPSILON )
  {
    return true;
  }

  // Estimate the number of leading decimal digits on which the two values agree.
  int numDigitsAgree = (int) ( -System.Math.Floor( System.Math.Log10( relDiff ) ) - 1 );
  return numDigitsAgree >= numDigits;
}

This type of comparison should be used throughout unit testing code. The full code listing, which we use for our internal testing, is provided at the end of this article.
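
For example, an NUnit-style test might use the helper like this (a hypothetical sketch; the test name, the ComputeDotProduct method, and the expected value are illustrative and not part of NMath):

      [Test]
      public void DotProductMatchesToTwelveDigits()
      {
        double expected = 3.141592653589793;   // hypothetical independently computed value
        double actual = ComputeDotProduct();   // hypothetical method under test
        // Require agreement to 12 digits, the precision we trust for this calculation.
        Assert.IsTrue( EqualToNumDigits( expected, actual, 12 ) );
      }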

If it is essential to enforce bit-for-bit run-to-run reproducibility to the limits of precision, NMath provides a flag in its configuration class to guarantee it. However, this flag should be set for unit testing only, because there can be a significant performance cost. In general, expect a 10% to 20% reduction in performance, with some common operations degrading far more than that. For example, some matrix multiplications will take twice as long with this flag set.

Note that the number of threads used by Intel’s MKL library ( on which NMath depends ) must also be fixed before setting the reproducibility flag.

int numThreads = 2;  // This must be fixed for reproducibility.
NMathConfiguration.SetMKLNumThreads( numThreads );
NMathConfiguration.Reproducibility = true;

Once set, this reproducibility configuration for NMath cannot be unset later in the program. Note that both the number of threads and the reproducibility flag may also be set in the AppConfig or in environment variables. See the NMath User Guide for instructions on how to do this.

Paul


Full double Comparison Code

// DOUBLE_EPSILON is a small tolerance constant defined elsewhere in the test class.
private static bool EqualToNumDigits( double expected, double actual, int numDigits )
{
  // Two NaN values are considered equal; a single NaN never matches.
  bool xNaN = double.IsNaN( expected );
  bool yNaN = double.IsNaN( actual );
  if ( xNaN && yNaN )
  {
    return true;
  }
  if ( xNaN || yNaN )
  {
    return false;
  }
  if ( numDigits <= 0 )
  {
    throw new ArgumentException( "numDigits is not positive in TestCase::EqualToNumDigits." );
  }

  // Use a relative difference when the larger magnitude exceeds 1, an absolute difference otherwise.
  double max = System.Math.Abs( expected ) > System.Math.Abs( actual ) ? System.Math.Abs( expected ) : System.Math.Abs( actual );
  double diff = System.Math.Abs( expected - actual );
  double relDiff = max > 1.0 ? diff / max : diff;
  if ( relDiff <= DOUBLE_EPSILON )
  {
    return true;
  }

  // Estimate the number of leading decimal digits on which the two values agree.
  int numDigitsAgree = (int) ( -System.Math.Floor( System.Math.Log10( relDiff ) ) - 1 );
  //// Console.WriteLine( "expected = {0}, actual = {1}, rel diff = {2}, diff = {3}, num digits = {4}", expected, actual, relDiff, diff, numDigitsAgree );
  return numDigitsAgree >= numDigits;
}
