6.4.4.4 Character constants

860

character-constant:
                ' c-char-sequence '
                L' c-char-sequence '

c-char-sequence:
                c-char
                c-char-sequence c-char

c-char:
                any member of the source character set except
                    the single-quote ', backslash \, or new-line character
                escape-sequence

escape-sequence:
                simple-escape-sequence
                octal-escape-sequence
                hexadecimal-escape-sequence
                universal-character-name

simple-escape-sequence: one of
                \' \" \? \\
                \a \b \f \n \r \t \v

octal-escape-sequence:
                \ octal-digit
                \ octal-digit octal-digit
                \ octal-digit octal-digit octal-digit

hexadecimal-escape-sequence:
                \x hexadecimal-digit
                hexadecimal-escape-sequence hexadecimal-digit

861 An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'.

862 A wide character constant is the same, except prefixed by the letter L.

863 With a few exceptions detailed later, the elements of the sequence are any members of the source character set;

864 they are mapped in an implementation-defined manner to members of the execution character set.

865 The single-quote ', the double-quote ", the question-mark ?, the backslash \, and arbitrary integer values are representable according to the following table of escape sequences:

single quote   '       \'
double quote   "       \"
question mark  ?       \?
backslash      \       \\
octal character        \octal digits
hexadecimal character  \xhexadecimal digits

866 The double-quote " and question-mark ? are representable either by themselves or by the escape sequences \" and \?, respectively, but the single-quote ' and the backslash \ shall be represented, respectively, by the escape sequences \' and \\.
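
The rule in sentence 866 can be illustrated with a short sketch. The function names are hypothetical and chosen only for this illustration: the single-quote and the backslash are written with their escape sequences, while the double-quote and the question-mark compare equal in either spelling.

```c
#include <assert.h>

/* The single-quote and the backslash cannot appear unescaped inside a
   character constant, so the escape sequences are the only spelling. */
int single_quote(void) { return '\''; }
int backslash(void)    { return '\\'; }

/* The double-quote and the question-mark may be written either way;
   both spellings denote the same character. */
void check_optional_escapes(void)
{
    assert('"' == '\"');
    assert('?' == '\?');
}
```

Calling check_optional_escapes() should pass silently on a conforming implementation.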

867 The octal digits that follow the backslash in an octal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant.

868 The numerical value of the octal integer so formed specifies the value of the desired character or wide character.

869 The hexadecimal digits that follow the backslash and the letter x in a hexadecimal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant.

870 The numerical value of the hexadecimal integer so formed specifies the value of the desired character or wide character.

871 Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.
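
As a rough sketch of sentences 867 through 871, the following scanner reads the part of an octal or hexadecimal escape sequence that follows the backslash. It is not how any particular translator is written; scan_numeric_escape and hex_digit_value are hypothetical names, and overflow of very long hexadecimal sequences is deliberately ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hex_digit_value(char c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    p = strchr(digits, tolower((unsigned char)c));
    return p ? (int)(p - digits) : -1;
}

/* Hypothetical helper: s points just past the backslash.
   Stores the escape's numerical value in *value and returns the number
   of characters consumed, or 0 if s does not start an octal or
   hexadecimal escape sequence. Overflow is not checked. */
size_t scan_numeric_escape(const char *s, unsigned long *value)
{
    size_t n = 0;
    unsigned long v = 0;

    if (s[0] == 'x') {
        /* Hexadecimal: take every hexadecimal digit that follows. */
        n = 1;
        while (hex_digit_value(s[n]) >= 0) {
            v = v * 16 + (unsigned long)hex_digit_value(s[n]);
            n++;
        }
        if (n == 1)
            return 0;           /* \x needs at least one digit */
    } else {
        /* Octal: take at most three octal digits. */
        while (n < 3 && s[n] >= '0' && s[n] <= '7') {
            v = v * 8 + (unsigned long)(s[n] - '0');
            n++;
        }
        if (n == 0)
            return 0;
    }
    *value = v;
    return n;
}

void check_scan_numeric_escape(void)
{
    unsigned long v = 0;

    assert(scan_numeric_escape("x123", &v) == 4 && v == 0x123);
    assert(scan_numeric_escape("0223", &v) == 3 && v == 022);
}
```

The asymmetry is the point of the sketch: the octal branch stops after three digits even if more octal digits follow, whereas the hexadecimal branch stops only at the first character that is not a hexadecimal digit.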

872 In addition, characters not in the basic character set are representable by universal character names and certain nongraphic characters are representable by escape sequences consisting of the backslash \ followed by a lowercase letter: \a, \b, \f, \n, \r, \t, and \v.65)

873 65) The semantics of these characters were discussed in 5.2.2.
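
The seven lowercase-letter escape sequences can be used like any other integer character constant, for instance as case labels. The sketch below maps each one to the name used for it in 5.2.2; nongraphic_name is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the 5.2.2 name of a nongraphic
   character, or "other" for any other value. */
const char *nongraphic_name(int c)
{
    switch (c) {
    case '\a': return "alert";
    case '\b': return "backspace";
    case '\f': return "form feed";
    case '\n': return "new line";
    case '\r': return "carriage return";
    case '\t': return "horizontal tab";
    case '\v': return "vertical tab";
    default:   return "other";
    }
}

void check_nongraphic_name(void)
{
    assert(strcmp(nongraphic_name('\t'), "horizontal tab") == 0);
    assert(strcmp(nongraphic_name('x'), "other") == 0);
}
```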

874 If any other character follows a backslash, the result is not a token and a diagnostic is required.

875 See “future language directions” (6.11.4).

876 The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.
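
For an integer character constant, the constraint in sentence 876 amounts to a comparison against UCHAR_MAX. A minimal sketch, with escape_fits_integer_constant as a hypothetical name, covering only the unsigned char case and not the wchar_t case:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: nonzero if an octal or hexadecimal escape with
   this numerical value is within the range of unsigned char. */
int escape_fits_integer_constant(unsigned long value)
{
    return value <= UCHAR_MAX;
}

void check_escape_range(void)
{
    assert(escape_fits_integer_constant(0));
    assert(escape_fits_integer_constant(UCHAR_MAX));
}
```

On an implementation with eight-bit char, this accepts the value of \377 and rejects the value of \400.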

877 An integer character constant has type int.
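
One observable consequence of sentence 877 is that sizeof applied to an integer character constant yields sizeof(int), while an object of type char initialized from it still has size 1. The function names below are hypothetical.

```c
#include <assert.h>

/* 'x' has type int in C, so its size is that of int. */
int char_constant_size_matches_int(void)
{
    return sizeof 'x' == sizeof(int);
}

/* Storing the constant in a char object converts it to char. */
int char_object_size_is_one(void)
{
    char c = 'x';
    return sizeof c == 1;
}

void check_char_constant_type(void)
{
    assert(char_constant_size_matches_int());
    assert(char_object_size_is_one());
}
```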

878 The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer.

879 The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.

880 If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

881 A wide character constant has type wchar_t, an integer type defined in the <stddef.h> header.

882 The value of a wide character constant containing a single multibyte character that maps to a member of the extended execution character set is the wide character corresponding to that multibyte character, as defined by the mbtowc function, with an implementation-defined current locale.
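
The following sketch relates sentences 881 and 882 to something that can be checked at run time. It assumes the program is still in the default "C" locale and that the basic character x has the same value in both the locale used at translation time and the one used at run time; that holds on common implementations but is an assumption here. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* L'x' has type wchar_t, so the two sizes agree. */
int wide_constant_size_matches_wchar_t(void)
{
    return sizeof L'x' == sizeof(wchar_t);
}

/* Converts the one-byte multibyte character "x" with mbtowc and
   compares the result with the wide character constant L'x'. */
int mbtowc_agrees_for_x(void)
{
    wchar_t wc = 0;
    int len;

    (void)mbtowc(NULL, NULL, 0);    /* reset any shift state */
    len = mbtowc(&wc, "x", 1);
    return len == 1 && wc == L'x';
}

void check_wide_constant(void)
{
    assert(wide_constant_size_matches_wchar_t());
    assert(mbtowc_agrees_for_x());
}
```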

883 The value of a wide character constant containing more than one multibyte character, or containing a multibyte character or escape sequence not represented in the extended execution character set, is implementation-defined.

884 EXAMPLE 1 The construction '\0' is commonly used to represent the null character.
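
A typical use of '\0' is as the sentinel in a loop over a string. A small sketch, with count_until_null as a hypothetical stand-in for what strlen does:

```c
#include <assert.h>
#include <stddef.h>

/* Counts the characters that precede the terminating null character. */
size_t count_until_null(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

void check_null_character(void)
{
    assert('\0' == 0);                      /* octal escape with value zero */
    assert(count_until_null("abc") == 3);
}
```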

885 EXAMPLE 2 Consider implementations that use two's-complement representation for integers and eight bits for objects that have type char. In an implementation in which type char has the same range of values as signed char, the integer character constant '\xFF' has the value -1; if type char has the same range of values as unsigned char, the character constant '\xFF' has the value +255.
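
EXAMPLE 2 can be checked on a given implementation by deriving the expected value from the signedness of char. The sketch assumes eight-bit char (enforced with #error) and two's-complement representation (stated but not checked); the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#if CHAR_BIT != 8
#error "this sketch assumes eight-bit char, as in EXAMPLE 2"
#endif

/* Value EXAMPLE 2 predicts for '\xFF': -1 if char is signed,
   255 if char is unsigned. Two's complement is assumed. */
int expected_xff(void)
{
    return CHAR_MIN < 0 ? -1 : 255;
}

/* Value this implementation actually gives the constant. */
int actual_xff(void)
{
    return '\xFF';
}

void check_xff(void)
{
    assert(actual_xff() == expected_xff());
}
```

Compilers that let the signedness of char be selected by an option should give a different result from actual_xff() under each setting, with check_xff() passing in both.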

886 EXAMPLE 3 Even if eight bits are used for objects that have type char, the construction '\x123' specifies an integer character constant containing only one character, since a hexadecimal escape sequence is terminated only by a non-hexadecimal character. To specify an integer character constant containing the two characters whose values are '\x12' and '3', the construction '\0223' may be used, since an octal escape sequence is terminated after three octal digits. (The value of this two-character integer character constant is implementation-defined.)
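
Because the value of a two-character integer character constant is implementation-defined, the termination rule in EXAMPLE 3 is easier to observe in a string literal, where the same escape sequences apply (6.4.5). The sketch below uses that substitution; the array and function names are hypothetical. It also shows the adjacent-literal spelling "\x12" "3" from 6.4.5, which ends the hexadecimal escape by ending the literal.

```c
#include <assert.h>

/* The octal escape \022 stops after three digits, so the literal holds
   the character with value 0x12, the character '3', and the
   terminating null character. */
static const char octal_then_digit[] = "\0223";

/* Adjacent string literals are concatenated after escape sequences
   have been converted, so \x12 cannot absorb the following 3. */
static const char split_hex[] = "\x12" "3";

int octal_stops_after_three_digits(void)
{
    return sizeof octal_then_digit == 3
        && octal_then_digit[0] == 0x12
        && octal_then_digit[1] == '3';
}

int split_literal_is_equivalent(void)
{
    return sizeof split_hex == sizeof octal_then_digit
        && split_hex[0] == octal_then_digit[0]
        && split_hex[1] == octal_then_digit[1];
}

void check_escape_termination(void)
{
    assert(octal_stops_after_three_digits());
    assert(split_literal_is_equivalent());
}
```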

887 EXAMPLE 4 Even if 12 or more bits are used for objects that have type wchar_t, the construction L'\1234' specifies the implementation-defined value that results from the combination of the values 0123 and '4'.

888 Forward references: common definitions <stddef.h> (7.17), the mbtowc function (7.20.7.2).

Created at: 2005-06-29 02:18:58 The text from WG14/N1124 is copyright © ISO