Fun with NaN

HIMEM.SYS

2019-10-03 12:00:00 +0000

OK, sorry for the (bad) pun, but naming is hard. In this post I just want to dump some things I recently did while trying to figure out where, in a couple of dozen lines of double arithmetic, a Double.NaN occurred.

First attempt: tinker with the FPU exception settings

At first I thought it would be kind of cool if the debugger just hit a breakpoint whenever a floating-point value comes out as NaN. Unfortunately, there is no such setting (at least not for managed code). However, there is another thing that looks promising:

If the FPU encounters certain kinds of results (NaN, overflow, underflow, etc.), it raises a hardware exception. Back in the day those would surface as real “exceptions” (or whatever the mechanism of choice of the respective language / platform was). In most modern environments, however, this no longer happens (the exceptions are disabled) and such conditions are translated into other constructs (like a Double.NaN in the CLR).
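To make the default behavior concrete, here is a minimal sketch of a few operations that IEEE 754 classifies as invalid. With the exceptions disabled, each one silently yields NaN instead of throwing. The class and method names (NaNDemo, InvalidOperations) are made up for illustration.

```csharp
using System;

public static class NaNDemo
{
    // Hypothetical helper: evaluates a few "invalid" operations.
    // With the default FPU settings none of them throws.
    public static double[] InvalidOperations()
    {
        double zero = 0.0;
        double inf = double.PositiveInfinity;
        return new[]
        {
            zero / zero,      // 0 / 0
            inf - inf,        // infinity minus infinity
            zero * inf,       // 0 times infinity
            Math.Sqrt(-1.0)   // square root of a negative number
        };
    }

    public static void Main()
    {
        foreach (double d in InvalidOperations())
        {
            // Every result is NaN; no exception was raised along the way.
            Console.WriteLine("Double.IsNaN: " + double.IsNaN(d));
        }
    }
}
```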

One can tinker with these FPU exception settings by P/Invoking _controlfp_s() or one of its variants. However, there are some things to consider:

1.) The occurrence of NaN is converted into an exception (but into different ones by the 32-bit vs. the 64-bit CLR).
2.) The behavior after an exception has been thrown differs between the 32-bit (x86) and the 64-bit (x64) CLR.
3.) Double.NaN is used quite often inside the BCL itself and the value can occur in lots of places (for example in Hashtable, Dictionary, etc. when calculating the load factor, inside WPF when dealing with coordinates, etc.). This can lead to an annoying number of false positives you’d have to deal with.

Let’s look at an example. The following is just a simple .NET Core 3.1 console application (the full framework CLR behaves the same):

using System;
using System.Runtime.InteropServices;

namespace ConsoleApp1
{
    class Program
    {
        [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
        private static extern uint _controlfp_s(out uint currentWord, uint a, uint b);

        [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
        private static extern uint _clearfp();

        private const uint _MCW_EM = 0x0008001f; // From float.h

        [Flags]
        public enum FloatingPointExceptionMask
        {
            _EM_INVALID = 0x00000010,
            _EM_DENOMRAL = 0x00080000,
            _EM_ZERODIVIDE = 0x00000008,
            _EM_OVERFLOW = 0x00000004,
            _EM_UNDERFLOW = 0x00000002,
            _EM_INEXACT = 0x00000001
        }

        [CLSCompliant(false)]
        public static uint DebuggingSetThrowOnFPError(bool enable)
        {
            const FloatingPointExceptionMask all = FloatingPointExceptionMask._EM_INVALID |
                                      FloatingPointExceptionMask._EM_DENOMRAL |
                                      FloatingPointExceptionMask._EM_ZERODIVIDE |
                                      FloatingPointExceptionMask._EM_OVERFLOW |
                                      FloatingPointExceptionMask._EM_UNDERFLOW |
                                      FloatingPointExceptionMask._EM_INEXACT;
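            // In the control word a set bit masks (disables) an exception, so
            // enabling the exceptions means requesting that the mask bits be cleared.
            return _controlfp_s(out _, (uint)(enable ? ~all : all), _MCW_EM);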
        }

        [CLSCompliant(false)]
        public static FloatingPointExceptionMask DebuggingGetThrowOnFPError()
        {
            uint res = _controlfp_s(out var current, 0, 0);
            if (res != 0)
            {
                Console.WriteLine("Could not get FP status {0}.", res);
                return 0;
            }

            return (FloatingPointExceptionMask)(current & _MCW_EM);
        }

        static void DoNaN()
        {
            double x = 0.0;
            double y = x / 0.0;
            Console.WriteLine("Double.IsNan: " + Double.IsNaN(y));
        }

        static void Main(string[] args)
        {
            Console.WriteLine("BEFORE 1:" + DebuggingGetThrowOnFPError());
            DoNaN();
            Console.WriteLine(DebuggingSetThrowOnFPError(true));

            Console.WriteLine("BEFORE 2: " + DebuggingGetThrowOnFPError());
            try
            {
                DoNaN();
            }
            catch (ArithmeticException ex)
            {
                Console.WriteLine("GOT AN ArithmeticException: " + ex.Message);
            }
            catch (SEHException ex)
            {
                Console.WriteLine("GOT AN SEHException: " + ex.Message);
            }

            // On x86 this will reset the FPU state to what the CLR expects it to be
            // (i.e. all exceptions are disabled.) 
            try { throw new Exception("STOP"); } catch { }

            Console.WriteLine("BEFORE 3: " + DebuggingGetThrowOnFPError());
            try
            {
                DoNaN();
            }
            catch (ArithmeticException ex)
            {
                Console.WriteLine("GOT AN ArithmeticException: " + ex.Message);
            }
            catch (SEHException ex)
            {
                Console.WriteLine("GOT AN SEHException: " + ex.Message);
            }

            Console.WriteLine(DebuggingSetThrowOnFPError(false));

            Console.WriteLine("BEFORE 4: " + DebuggingGetThrowOnFPError());
            DoNaN();
        }
    }
}

The output for an X64 build is:

BEFORE 1:_EM_INEXACT, _EM_UNDERFLOW, _EM_OVERFLOW, _EM_ZERODIVIDE, _EM_INVALID, _EM_DENOMRAL
Double.IsNan: True
0
BEFORE 2: _EM_DENOMRAL
GOT AN ArithmeticException: Overflow or underflow in the arithmetic operation.
BEFORE 3: _EM_DENOMRAL
GOT AN ArithmeticException: Overflow or underflow in the arithmetic operation.
0
BEFORE 4: _EM_INEXACT, _EM_UNDERFLOW, _EM_OVERFLOW, _EM_ZERODIVIDE, _EM_INVALID, _EM_DENOMRAL
Double.IsNan: True

As you can see, the 64-bit CLR converts the NaN into a System.ArithmeticException (albeit with a slightly confusing, generic message about “overflow or underflow”, which is not the FPU condition in this case). Note also that the “dummy” throw/catch has no influence on the FPU settings we forced using _controlfp_s.

The X86 build outputs:

BEFORE 1:_EM_INEXACT, _EM_UNDERFLOW, _EM_OVERFLOW, _EM_ZERODIVIDE, _EM_INVALID, _EM_DENOMRAL
Double.IsNan: True
0
BEFORE 2: _EM_DENOMRAL
GOT AN SEHException: External component has thrown an exception.
BEFORE 3: _EM_INEXACT, _EM_UNDERFLOW, _EM_OVERFLOW, _EM_ZERODIVIDE, _EM_INVALID, _EM_DENOMRAL
GOT AN SEHException: External component has thrown an exception.
0
BEFORE 4: _EM_INEXACT, _EM_UNDERFLOW, _EM_OVERFLOW, _EM_ZERODIVIDE, _EM_INVALID, _EM_DENOMRAL
Double.IsNan: True

Note that the kind of exception thrown is a System.Runtime.InteropServices.SEHException, which gives the whole thing a much naughtier feeling, as it is typically an indicator of a much more serious issue. Also note that “BEFORE 3” outputs the default FPU settings again; that is, the dummy throw/catch actually resets those settings inside the CLR (see this comment: https://stackoverflow.com/a/25206025/21567 and the respective source code: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/vm/excep.cpp#L7857).

This alone makes the whole practice much more unreliable for hunting down issues than in the 64-bit build.

All in all, the whole idea seems to be more of an academic exercise and probably not reliable enough to track down bugs. YMMV, of course.

Second attempt: watch FPU register while debugging

When the FPU encounters a NaN, the status of some registers will change. In general, you should watch the MXCSR register. To do so, open the “Registers” window from “Debug / Windows” and make sure you select “SSE” from its context menu.

When starting and hitting a breakpoint, the Registers window should look something like this:

XMM0 = 0000000000000000-000000008C9F9CE8 
[...]
XMM15 = 0000000000000000-0000000000000000 
MXCSR = 00001FA0 

Note the initial value of MXCSR, 0x00001FA0. Now single-step through the region of code you suspect of producing the NaN value in question and watch this register. When you hit the respective statement, its value changes to 0x00001FA1 (bit 0, the invalid-operation flag, is now set).

The MXCSR register is the place to look only for .NET Core applications (32- and 64-bit) and 64-bit .NET full framework applications. For 32-bit full .NET Framework applications, the JIT compiler in use is not RyuJIT but the legacy JIT. That one doesn’t use SSE instructions for floating point, but still the x87 FPU stack (see https://github.com/dotnet/roslyn/issues/7333#issuecomment-560197038); thus one needs to look at the STAT register instead (initial value 0x4020, NaN value 0x4021; select “Floating Point” from the Registers window context menu to display the register set of the x87 FPU stack).
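Both MXCSR and the x87 status word keep their six exception flags in bits 0 to 5, in the same order, so the before/after values can be decoded the same way. The following is a small sketch of such a decoder; the names FpuFlags, Decode and NewlyRaised are invented here and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class FpuFlags
{
    // Exception flags in bits 0..5 of both MXCSR and the x87 status word.
    private static readonly string[] Names =
    {
        "Invalid", "Denormal", "ZeroDivide", "Overflow", "Underflow", "Precision"
    };

    // Returns the names of the exception flags set in the low six bits.
    public static List<string> Decode(uint register)
    {
        var set = new List<string>();
        for (int bit = 0; bit < Names.Length; bit++)
        {
            if ((register & (1u << bit)) != 0)
                set.Add(Names[bit]);
        }
        return set;
    }

    // Returns the flags that are set in 'after' but were not set in 'before'.
    public static List<string> NewlyRaised(uint before, uint after)
    {
        return Decode(after & ~before);
    }

    public static void Main()
    {
        // MXCSR values from the Registers window: prints "Invalid"
        Console.WriteLine(string.Join(", ", NewlyRaised(0x1FA0, 0x1FA1)));
        // x87 STAT values: prints "Invalid"
        Console.WriteLine(string.Join(", ", NewlyRaised(0x4020, 0x4021)));
    }
}
```

In both cases the only newly raised flag is the invalid-operation flag. The precision (inexact) flag was already set in the initial values.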

This really does work reliably, but requires you to be able to narrow down the location at least to some extent.
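One way to do the narrowing, sketched here under the assumption that you can temporarily edit the code in question, is to sprinkle a tiny guard between the suspicious statements and let it break into the debugger at the first NaN it sees. NaNGuard and Check are hypothetical names, not an existing API.

```csharp
using System;
using System.Diagnostics;

public static class NaNGuard
{
    // Hypothetical helper: reports the value and breaks into an attached
    // debugger if it is NaN. Returns true when a NaN was found.
    public static bool Check(double value, string label)
    {
        if (!double.IsNaN(value))
            return false;

        Console.WriteLine("NaN detected at: " + label);
        if (Debugger.IsAttached)
            Debugger.Break();
        return true;
    }

    public static void Main()
    {
        double a = 4.0;
        double b = Math.Sqrt(a) - 2.0;  // 0.0
        Check(b, "after sqrt");         // not NaN, nothing happens
        double c = b / b;               // 0.0 / 0.0 is NaN
        Check(c, "after division");     // reports the NaN
    }
}
```

Once the guard has pointed at a small enough region, single-stepping with the Registers window open becomes practical.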