I have recently been reading the AVChat source code, and its GlobalDefs.h file contains the following code:
// TCP pack types
const long PT_AudioMediaType = 10001;
const long PT_VideoMediaType = 10002;
const long PT_Payload = 10003;

// Messages
const long msg_FilterGraphError = 'avct' + 1;
const long msg_MediaTypeReceived = 'avct' + 2;
const long msg_TCPSocketAccepted = 'avct' + 3;
const long msg_UDPCommandReceived = 'avct' + 4;
const long msg_ModifyFilterGraph = 'avct' + 5;

// Let the main thread modify filter graph
#define WM_ModifyFilterGraph (WM_USER+123)

// UDP command defines
const long MAX_COMMAND_SIZE = 100;
const long cmd_ClientCalling = 'avct' + 100;
const long cmd_DeviceConfig = 'avct' + 101;
const long cmd_BuildFilterGraph = 'avct' + 102;
const long cmd_DisconnectRequest = 'avct' + 103;
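While debugging, VS told me that 'avct' = 1635148660, which puzzled me. I searched the entire solution and found no definition of it anywhere. So I went to StackOverflow; since I did not know which keywords to search for, I asked a question, and the answers cleared it up.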
C++ standard, §2.14.3/1 - Character literals
(...) An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal has type int and implementation-defined value.
Because a multicharacter literal is treated as an int, VS performed the following conversion:
'a'=0x61
'v'=0x76
'c'=0x63
't'=0x74
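The way these four bytes combine can be sketched as below. This is only an illustration: `pack_fourcc` is a hypothetical helper, not part of AVChat, and it assumes an ASCII-compatible character set. It places the first character in the most significant byte, which matches the value VS reported, and it gives the same result on any compiler because it does not rely on a multicharacter literal.

```cpp
#include <cstdint>

// Hypothetical helper (not from AVChat): packs four characters into a
// 32-bit value, first character in the most significant byte.
constexpr std::uint32_t pack_fourcc(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// Assuming ASCII: 'a'=0x61, 'v'=0x76, 'c'=0x63, 't'=0x74.
static_assert(pack_fourcc('a', 'v', 'c', 't') == 0x61766374u,
              "bytes are combined most significant first");
static_assert(pack_fourcc('a', 'v', 'c', 't') == 1635148660u,
              "same value in decimal");
```

Because the checks are `static_assert`s, the arithmetic is verified at compile time; a constant such as `pack_fourcc('a', 'v', 'c', 't') + 1` could stand in for `'avct' + 1` where portability matters.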
'avct'=0x61766374=1635148660
This explains where the value of 'avct' comes from. However, James Kanze added a correction:
“Historically, the original C accepted multi-character character constants, and both C and C++ still do, on historical grounds. Unlike single character constants, the type is int, and the value is implementation defined (but will typically consist of some sort of combination of the characters involved).
Practically speaking, they should be avoided in new code, and cannot be used in portable code (because implementations do vary as to what they mean).”
EDIT:
For what it's worth: the most typical implementation would be more or less the equivalent of:
union { char c[sizeof(int)]; int i; };
placing the characters in order in c (and ignoring any which didn't fit, whether the first or the last depending on the implementation), and then use the value of i as the value. These results obviously depend on the encoding (but that's true of any character constant), but also on things like byte order and the size of an int. Thus, even assuming an ASCII based encoding, on systems I've used, the results could be 0x61766374, 0x74637661, 0x6374, 0x7463, 0x6176 or 0x7661. (And this doesn't consider "exotic" architectures with 9 bit bytes, or where the size of an int is 6.)
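The byte-order dependence described above can be observed with a small sketch. It is an approximation, not what any particular compiler does internally: `bytes_as_int` and `is_little_endian` are hypothetical helpers, a fixed 32-bit integer stands in for `int`, and `std::memcpy` replaces the union because reading an inactive union member is undefined behavior in C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: treats four characters, stored in order in memory,
// as the bytes of one 32-bit integer (the effect of the union above).
inline std::uint32_t bytes_as_int(const char (&chars)[4]) {
    std::uint32_t value = 0;
    std::memcpy(&value, chars, sizeof value);
    return value;
}

// Hypothetical helper: true when the lowest-addressed byte of an integer
// holds its least significant bits.
inline bool is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

Assuming ASCII, calling `bytes_as_int` on `{'a', 'v', 'c', 't'}` yields 0x74637661 on a little-endian machine and 0x61766374 on a big-endian one, which are two of the values listed in the quote.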