327 The characteristics of floating types are defined in terms of a model that describes a representation of floating-point numbers and values that provide information about an implementation's floating-point arithmetic.16)
328 The following parameters are used to define the model for each floating-point type:
329
330
331
332
333
334 A floating-point number (x) is defined by the following model:
335
In addition to normalized floating-point numbers
(
336 A NaN is an encoding signifying Not-a-Number.
337 A quiet NaN propagates through almost every arithmetic operation without raising a floating-point exception;
338 a signaling NaN generally raises a floating-point exception when occurring as an arithmetic operand.17)
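As a minimal sketch of the quiet-NaN behavior described above, the following C fragment assumes an implementation that defines the NAN macro in <math.h> (C99 defines it only when quiet NaNs are supported for float). The function name is hypothetical and chosen for illustration.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a quiet NaN survives ordinary
 * arithmetic on this implementation. */
bool quiet_nan_propagates(void)
{
#ifdef NAN
    volatile double operand = NAN;        /* volatile discourages constant folding */
    double result = operand * 2.0 + 1.0;  /* arithmetic with a quiet NaN operand */

    /* A NaN compares unequal to everything, including itself. */
    return isnan(result) && result != result;
#else
    return false;                         /* no quiet NaN available for float */
#endif
}
```

Signaling NaNs are deliberately left out of the sketch, since C99 provides no portable way to create one.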
339
340
341 15) See 6.2.5.
342 16) The floating-point model is intended to clarify the description of each floating-point characteristic and does not require the floating-point arithmetic of the implementation to be identical.
343
The accuracy of the floating-point operations (
344 The implementation may state that the accuracy is unknown.
345
All integer values in the
346 all floating values shall be constant expressions.
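The practical consequence of the constant-expression requirement can be sketched as follows. The sketch assumes, as in the C99 text, that the integer characteristics used here are also usable in #if directives; all names other than the <float.h> macros are hypothetical.

```c
#include <float.h>
#include <stddef.h>

/* An integer characteristic can steer conditional compilation. */
#if FLT_RADIX == 2
#define RADIX_IS_BINARY 1
#else
#define RADIX_IS_BINARY 0
#endif

/* DBL_DIG is an integer constant expression, so it can size an array:
 * sign, DBL_DIG digits, decimal point, "e-NNNN" exponent, terminator. */
enum { DBL_TEXT_SIZE = 1 + DBL_DIG + 1 + 6 + 1 };
static char dbl_text[DBL_TEXT_SIZE];

/* A floating constant expression may initialize an object with
 * static storage duration. */
static const float smallest_normal_float = FLT_MIN;

int radix_is_binary(void) { return RADIX_IS_BINARY; }
size_t dbl_text_size(void) { return sizeof dbl_text; }
float smallest_normal(void) { return smallest_normal_float; }
```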
347
All except
348
The floating-point model representation is provided for all values
except
349
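The rounding mode for floating-point addition is characterized by the
implementation-defined value of FLT_ROUNDS:18)
-1 indeterminable
0 toward zero
1 to nearest
2 toward positive infinity
3 toward negative infinity
All other values for FLT_ROUNDS characterize implementation-defined rounding behavior.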
350 The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.
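A program can inspect both the rounding mode and the evaluation format in use. The sketch below assumes the <float.h> macros FLT_ROUNDS and FLT_EVAL_METHOD and the value meanings given for them in C99; the helper names and the wording of the returned strings are hypothetical.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: maps a FLT_ROUNDS value to a description. */
const char *rounding_mode_name(int mode)
{
    switch (mode) {
    case -1: return "indeterminable";
    case 0:  return "toward zero";
    case 1:  return "to nearest";
    case 2:  return "toward positive infinity";
    case 3:  return "toward negative infinity";
    default: return "other (implementation-specific)";
    }
}

/* Hypothetical helper: maps a FLT_EVAL_METHOD value to a description. */
const char *evaluation_method_name(int method)
{
    switch (method) {
    case -1: return "indeterminable";
    case 0:  return "range and precision of the type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    default: return "other (implementation-specific)";
    }
}

/* Prints the two characteristics for the current implementation. */
void print_float_environment(void)
{
    int rounds = FLT_ROUNDS;
    int method = (int)FLT_EVAL_METHOD;

    printf("FLT_ROUNDS      = %d (%s)\n", rounds, rounding_mode_name(rounds));
    printf("FLT_EVAL_METHOD = %d (%s)\n", method, evaluation_method_name(method));
}
```

Calling print_float_environment() from main reports the values for the implementation at hand; the output differs between platforms and compiler options.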
351
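The use of evaluation formats is characterized by the
implementation-defined value of FLT_EVAL_METHOD:19)
-1 indeterminable;
0 evaluate all operations and constants just to the range and precision of the type;
1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
2 evaluate all operations and constants to the range and precision of the long double type.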
352
353
354 17) IEC 60559:1989 specifies quiet and signaling NaNs.
355 For implementations that do not support IEC 60559:1989, the terms quiet NaN and signaling NaN are intended to apply to encodings with similar behavior.
356
18) Evaluation of
357 19) The evaluation method determines evaluation formats of expressions involving all floating types, not just real types.
358
For example, if
359
360
361
All other negative values for FLT_EVAL_METHOD characterize implementation-defined behavior.
362 The values given in the following list shall be replaced by constant expressions with implementation-defined values that are greater or equal in magnitude (absolute value) to those shown, with the same sign:
363
radix of exponent representation,
FLT_RADIX 2
364
number of base-FLT_RADIX digits in the floating-point significand, p
FLT_MANT_DIG
DBL_MANT_DIG
LDBL_MANT_DIG
365
number of decimal digits,
⌈1 + p_max log10(b)⌉
DECIMAL_DIG 10
366
number of decimal digits,
⌊(p - 1) log10(b)⌋
FLT_DIG 6
DBL_DIG 10
LDBL_DIG 10
367
minimum negative integer such that FLT_RADIX raised to one less than
that power is a normalized floating-point number, e_min
FLT_MIN_EXP
DBL_MIN_EXP
LDBL_MIN_EXP
368
minimum negative integer such that 10 raised to that
power is in the range of normalized floating-point numbers,
FLT_MIN_10_EXP -37
DBL_MIN_10_EXP -37
LDBL_MIN_10_EXP -37
369
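maximum integer such that FLT_RADIX raised to one less than that power
is a representable finite floating-point number, e_max
FLT_MAX_EXP
DBL_MAX_EXP
LDBL_MAX_EXP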
370
maximum integer such that 10 raised to that power is
in the range of representable finite floating-point numbers,
FLT_MAX_10_EXP +37
DBL_MAX_10_EXP +37
LDBL_MAX_10_EXP +37
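The two decimal-digit counts in the list above describe round trips in opposite directions, which the following sketch tries to make concrete. It assumes that the library's printf and strtod families convert with correct rounding (which IEC 60559 implementations generally do, but which this text does not itself guarantee); the function names are hypothetical.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Binary -> decimal -> binary: DECIMAL_DIG significant digits are enough
 * for the widest supported type, and therefore also for double. */
int double_survives_decimal_round_trip(double value)
{
    char text[64];

    /* A precision of DECIMAL_DIG - 1 yields DECIMAL_DIG significant digits. */
    snprintf(text, sizeof text, "%.*e", DECIMAL_DIG - 1, value);
    return strtod(text, NULL) == value;
}

/* Decimal -> binary -> decimal: a decimal value with FLT_DIG significant
 * digits is recovered unchanged after passing through float. The text is
 * expected in "%.*e" form with FLT_DIG - 1 digits after the point. */
int decimal_text_survives_float_round_trip(const char *text)
{
    char back[64];
    float stored = strtof(text, NULL);

    snprintf(back, sizeof back, "%.*e", FLT_DIG - 1, (double)stored);
    return strcmp(back, text) == 0;
}
```

For example, where FLT_DIG is 6, the text "9.87654e-03" is expected to pass the second check, whereas a text with more significant digits than FLT_DIG may not.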
371 The values given in the following list shall be replaced by constant expressions with implementation-defined values that are greater than or equal to those shown:
372
maximum representable finite floating-point number,
FLT_MAX 1E+37
DBL_MAX 1E+37
LDBL_MAX 1E+37
373 The values given in the following list shall be replaced by constant expressions with implementation-defined (positive) values that are less than or equal to those shown:
374
the difference between 1 and the least value greater
than 1 that is representable in the given floating point type,
FLT_EPSILON 1E-5
DBL_EPSILON 1E-9
LDBL_EPSILON 1E-9
375
minimum normalized positive floating-point number,
FLT_MIN 1E-37
DBL_MIN 1E-37
LDBL_MIN 1E-37
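The definition of FLT_EPSILON given in the list above can be checked directly. The sketch below is illustrative only; the function name is hypothetical, and the volatile qualifiers are there to force each intermediate result to be stored as a float, so that a wider evaluation format does not hide the effect.

```c
#include <float.h>

/* Hypothetical check: 1 + FLT_EPSILON is representable as a float, is
 * greater than 1, and differs from 1 by exactly FLT_EPSILON. */
int epsilon_is_visible_above_one(void)
{
    volatile float one = 1.0f;
    volatile float next = one + FLT_EPSILON;  /* least float greater than 1 */
    volatile float gap = next - one;          /* exact, no rounding needed */

    return next > one && gap == FLT_EPSILON;
}
```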
376
Conversion from (at least)
377
EXAMPLE 1
The following describes an artificial floating-point representation
that meets the minimum requirements of this International Standard,
and the appropriate values in a <float.h> header for type float:
FLT_RADIX 16
FLT_MANT_DIG 6
FLT_EPSILON 9.53674316E-07F
FLT_DIG 6
FLT_MIN_EXP -31
FLT_MIN 2.93873588E-39F
FLT_MIN_10_EXP -38
FLT_MAX_EXP +32
FLT_MAX 3.40282347E+38F
FLT_MAX_10_EXP +38
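The values in EXAMPLE 1 can be reproduced from the radix, precision and exponent range. The sketch below assumes the relationships epsilon = b^(1-p), minimum normalized value = b^(e_min-1) and maximum finite value = (1 - b^(-p)) b^(e_max), which is how C99 defines these macros in terms of the model. The function names are hypothetical, and double is used only as a convenient type in which these particular powers of 16 are exact.

```c
/* Hypothetical helper: base raised to an integer power by repeated
 * multiplication or division (exact for the powers of 16 used here). */
static double int_power(int base, int exponent)
{
    double result = 1.0;

    for (int i = 0; i < exponent; ++i)
        result *= base;
    for (int i = 0; i > exponent; --i)
        result /= base;
    return result;
}

/* epsilon = b^(1-p) */
double model_epsilon(int b, int p)
{
    return int_power(b, 1 - p);
}

/* minimum normalized positive value = b^(e_min - 1) */
double model_min(int b, int e_min)
{
    return int_power(b, e_min - 1);
}

/* maximum finite value = (1 - b^(-p)) * b^(e_max) */
double model_max(int b, int p, int e_max)
{
    return (1.0 - int_power(b, -p)) * int_power(b, e_max);
}
```

With b = 16, p = 6, e_min = -31 and e_max = 32, the three functions yield 2^-20, 2^-128 and (1 - 2^-24) 2^128, which correspond to the decimal constants 9.53674316E-07, 2.93873588E-39 and 3.40282347E+38 listed in the example.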
378
EXAMPLE 2
The following describes floating-point representations that also meet
the requirements for single-precision and double-precision normalized
numbers in IEC 60559,20) and the appropriate values
in a <float.h> header for types float and double:
FLT_RADIX 2
DECIMAL_DIG 17
FLT_MANT_DIG 24
FLT_EPSILON 1.19209290E-07F // decimal constant
FLT_EPSILON 0X1P-23F // hex constant
FLT_DIG 6
FLT_MIN_EXP -125
FLT_MIN 1.17549435E-38F // decimal constant
FLT_MIN 0X1P-126F // hex constant
FLT_MIN_10_EXP -37
FLT_MAX_EXP +128
FLT_MAX 3.40282347E+38F // decimal constant
FLT_MAX 0X1.fffffeP127F // hex constant
FLT_MAX_10_EXP +38
DBL_MANT_DIG 53
DBL_EPSILON 2.2204460492503131E-16 // decimal constant
DBL_EPSILON 0X1P-52 // hex constant
DBL_DIG 15
DBL_MIN_EXP -1021
DBL_MIN 2.2250738585072014E-308 // decimal constant
DBL_MIN 0X1P-1022 // hex constant
DBL_MIN_10_EXP -307
DBL_MAX_EXP +1024
DBL_MAX 1.7976931348623157E+308 // decimal constant
DBL_MAX 0X1.fffffffffffffP1023 // hex constant
DBL_MAX_10_EXP +308
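Whether a given implementation uses the values of EXAMPLE 2 can be tested by comparing its <float.h> macros against the constants listed there. The following is a sketch with hypothetical function names; it checks only the listed characteristics and says nothing about other IEC 60559 requirements such as NaNs, infinities or rounding behavior.

```c
#include <float.h>
#include <stdbool.h>

/* True when float has the single-precision characteristics of EXAMPLE 2. */
bool float_matches_example_2(void)
{
    return FLT_RADIX == 2
        && FLT_MANT_DIG == 24
        && FLT_MIN_EXP == -125
        && FLT_MAX_EXP == 128
        && FLT_EPSILON == 0X1P-23F
        && FLT_MIN == 0X1P-126F
        && FLT_MAX == 0X1.fffffeP127F;
}

/* True when double has the double-precision characteristics of EXAMPLE 2. */
bool double_matches_example_2(void)
{
    return FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MIN_EXP == -1021
        && DBL_MAX_EXP == 1024
        && DBL_EPSILON == 0X1P-52
        && DBL_MIN == 0X1P-1022
        && DBL_MAX == 0X1.fffffffffffffP1023;
}
```

The hexadecimal constants are used for the comparisons because they denote the intended values exactly, independent of how decimal constants are rounded.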
If a type wider than
379
20) The floating-point model in that standard sums powers of
380
Forward references:
conditional inclusion (6.10.1), complex arithmetic
Created at: 2005-06-29 02:18:54
The text from WG14/N1124 is copyright © ISO