Floating point to integer conversion

This post serves mainly as a personal reminder and came up due to a question on stackoverflow.com. The question was:

How exactly does C++ casting between numeric types work?

In my code, I do something like the following:

double a = 3.0;
uint64_t b = static_cast<uint64_t>(a);
double c = static_cast<double>(b);

Interestingly, this works as I would expect (a == c), as long as a is positive, but if a is negative, c ends up as an arbitrarily large positive number. (It must be wrapping somewhere or something.)

My questions are:

  1. Why does this happen?
  2. Why doesn’t this code break strict aliasing rules?

Note: double and uint64_t are the same size on my system.

The question about strict aliasing was easy enough to answer: it doesn’t apply here. I was also certain that floating-point to unsigned conversions of out-of-range values are undefined, or at least I thought I was certain.
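As a rough sketch of why aliasing is not involved: static_cast converts the value and produces a new object, whereas aliasing concerns reading an existing object through an lvalue of another type. The helper names below (convert_value, copy_bits, demo_value_vs_bits) are made up for illustration, and the bit copy assumes that double and uint64_t have the same size, as in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: yields a new uint64_t holding the truncated numeric value.
// No double object is read through a uint64_t lvalue, so aliasing is not involved.
inline std::uint64_t convert_value(double d) {
    return static_cast<std::uint64_t>(d);
}

// Bit copy: yields the object representation of the double instead of its value.
// Assumption: double and uint64_t have the same size, as stated in the question.
inline std::uint64_t copy_bits(double d) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// Hypothetical demo: for 3.0 the converted value and the copied bits differ.
inline void demo_value_vs_bits() {
    assert(convert_value(3.0) == 3u);
    assert(copy_bits(3.0) != convert_value(3.0));
}
```

The memcpy variant is only there for contrast; the code in the question never looks at the bits of a.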

But a simple comment threw me off:

Actually, casting to an unsigned type is always well defined – you’ll always get the value modulo 2^k where k is the size of the unsigned type in bits. Casting an out-of-range value to a signed integer type gives you an undefined value (but not undefined behavior)

So the comment claims that the conversion is well defined. I guess I always have the safe modulo conversion of unsigned types at the back of my head, even at times when I am almost certain that it doesn’t apply. Let’s look at what the standard says.

C++ Standard (N3797) 4.7 [conv.integral]/2:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]

Two parts of this paragraph matter. First, it sits in the section conv.integral (integral conversions). Second, it talks about a source integer. That is already enough to know that it doesn’t apply to a floating-point source. So what does apply?
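For contrast, here is a minimal sketch of the case that conv.integral does cover, where the source is already an integer. The names to_unsigned and demo_modular_conversion are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Integral conversion: the source is an integer, so the result is the value
// reduced modulo 2^64. This is well defined for every input.
constexpr std::uint64_t to_unsigned(std::int64_t value) {
    return static_cast<std::uint64_t>(value);
}

// Hypothetical demo of the modular behavior.
inline void demo_modular_conversion() {
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    assert(to_unsigned(-1) == max);      // -1 is congruent to 2^64 - 1
    assert(to_unsigned(-2) == max - 1);  // -2 is congruent to 2^64 - 2
    assert(to_unsigned(3) == 3u);        // in-range values are unchanged
}
```

None of this carries over when the source is a double.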

C++ Standard (N3797) 4.9 [conv.fpint]/1:

A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [ Note: If the destination type is bool, see 4.12. — end note ]

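This is the information we wanted. Conversions from floating-point numbers to unsigned types have no special case: the behavior is undefined if the truncated value can’t be represented. In the question, a negative a such as -3.0 truncates to a negative integer, which uint64_t cannot represent, so the large positive c is just one possible outcome of undefined behavior.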
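One way to stay clear of the undefined case is to check the range before casting. The following is only a sketch: checked_to_uint64 and demo_checked_conversion are hypothetical helpers, not standard library functions, and the sketch assumes that 2^64 is exactly representable as a double (true for the usual IEEE 754 binary64 format).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: converts only when the truncated value fits in uint64_t.
// Returns false and leaves `out` untouched otherwise.
inline bool checked_to_uint64(double value, std::uint64_t& out) {
    // 2^64 as a double (assumed to be exactly representable).
    const double upper = std::ldexp(1.0, std::numeric_limits<std::uint64_t>::digits);
    // Truncation discards the fraction, so every value in (-1, 2^64) is safe.
    // NaN fails both comparisons and is rejected as well.
    if (!(value > -1.0 && value < upper)) {
        return false;
    }
    out = static_cast<std::uint64_t>(value);
    return true;
}

// Hypothetical demo of accepted and rejected inputs.
inline void demo_checked_conversion() {
    std::uint64_t out = 0;
    assert(checked_to_uint64(3.9, out) && out == 3u);  // fraction is discarded
    assert(!checked_to_uint64(-3.0, out));             // truncated value is -3
    assert(out == 3u);                                 // out was not modified
}
```

Values strictly between -1 and 0 are accepted because they truncate to zero, which is representable.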