Large Negative Integer Literals in C++
Suppose that you're using a system on which sizeof(long long) is 8 (which you probably are). The smallest number which can be represented by a long long is then -2^63, which will have the bit pattern 1000000000000000000000000000000000000000000000000000000000000000 and the decimal value -9223372036854775808. Now suppose that we try to use this value explicitly in a program, maybe something like this:
#include <iostream>
int main(){
    std::cout << -9223372036854775808LL << std::endl;
    return 0;
}
You might reasonably expect this program to output -9223372036854775808, but in fact, using both clang (trunk 3.0) and gcc (4.2) I find that the result is instead 9223372036854775808. Note the missing negative sign.
What happened? One thing I glossed over above is that when compiling this program, both compilers issued a warning:
warning: integer constant is so large that it is unsigned
From reading the relevant parts of the clang source, it seems that the negative sign is not considered part of the literal, which then looks like positive 9223372036854775808 [1]. This value cannot be represented using a signed long long, so the compiler uses unsigned long long instead. The '-' has meanwhile been interpreted as a unary negation operator, and according to the C++0x FDIS (N3290):
[Paragraph 5.3.1.8] The operand of the unary - operator shall have arithmetic or unscoped enumeration type and the result is the negation of its operand. Integral promotion is performed on integral or enumeration operands. The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand. [Emphasis added] The type of the result is the type of the promoted operand.
So, -9223372036854775808LL computes 2^64 - 2^63 and gives the result 9223372036854775808ULL. As best I can tell, this is exactly what is supposed to happen, but it seems strange to me that no one cared about the way that these rules make it apparently impossible to write the most negative integer explicitly in a piece of code. In the end, though, it isn't a particularly big deal since there are a number of ways to work around this, such as explicitly casting the literal to a signed type or using a less error-prone way of writing it in the first place, like std::numeric_limits<long long>::min().
[1] This appears to be as required by the language standard: "[Paragraph 2.14.2.1] An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have a prefix that specifies its base and a suffix that specifies its type." From reading paragraphs 2.14.2.2 and 2.14.2.3, I'm not entirely sure that the compiler's choice to use an unsigned type as an 'extended integer type' for a decimal constant with the LL suffix is legal, but I'm starting to get in over my head here.