ISO/IEC JTC1/SC22/WG14 N669

Document SC22/WG14/N669 (X3J11/97-032)

Comments on N641 and an outline proposal for secondary integral types.
Clive D.W. Feather
<clive@demon.net>


ABSTRACT

This paper consists of two parts. The first is an informal critique of
N641, explaining what I think is wrong with it. The second is an outline
proposal for a concept of "secondary integral type". If there is interest
in it, I can expand it to a formal proposal.


N641, AND WHAT'S WRONG WITH IT

I have to completely disagree with Randy's position on this paper. Randy
summarizes his position as "that the previous interpretation has bad
consequences, that it is unreasonable, and the revised interpretation is
a reasonable alternative". I think he's wrong in all three.

In section 2 of his paper, Randy says: "The previous interpretation
probably renders non-conforming any implementation with an extension
type."

This is purely and simply wrong. To introduce an extension type into a
program, the programmer must do something that is not strictly
conforming (include a non-Standard header, or use a typedef like
__int16). Once this has happened, all bets are off. In particular,
provided *one* diagnostic like "non-Standard type name seen" has been
generated, there is no need to generate a diagnostic every time it's
used. So this claim looks awfully like scaremongering to me.

In section 3, he talks about "a fairly artificial distinction between
the Standard types and extension types". Despite the waffle about Zen
koans, the distinction is that between the Standard and any extension.

Such types are a form of extension that is not strictly compatible with
the Standard (just like long long used to be). Once a diagnostic (only
one, note) has appeared, there's no problem with using them. It is not
"wrong to shift this implementation extension type, or add it, or ...";
it is wrong to use the type and believe that you are still strictly
conforming.

In section 4, he first (correctly) shows that extra types don't affect
strictly conforming programs, but then claims this means that "there is
no harm in letting the extension type also be a member of a Standard
type category". Here he makes the mistake that first caused me to submit
DR 067: treating random other types as being integral types adds a large
amount of semantic baggage to them. Allowing __uint16 to be an integral
type *automatically* allows it to be used for size_t, and this then
produces a whole range of undescribed behaviour in an apparently
strictly conforming program. This looks to me like begging the question.

Finally, in section 4.2, he appears to assume that the only significant
problem is the "biggest type" issue. To address his numbered points:

(1) The Standard never discusses integer-holding types larger than
unsigned long. Therefore, other than on esoteric systems (and *all*
systems using long long were esoteric at the time as far as I was
concerned) it is the largest type, and it is the largest type I was ever
likely to come across. Thus it *is* a useful interpretation.

(2) This is a major argument against long long.

(3) Sloppy programmers abound. When unsigned long was the largest
possible type, we careful programmers at least had a workable idiom.
Randy (and long long) takes it away from us (though I'm going to submit
a separate proposal on this).

(5) I disagree that inttypes.h is cleaner.

Randy totally ignores the other issues with using unknown types. For
example, what are the relevant promotion and conversion rules? What is
the result of (sizeof(V)+0)? Things like that.

However, this message is not intended solely as an attack on N641.
Curiously enough, I agree with the basic ideas behind it - I just feel
they've been handled badly. So here I present a rough set of ideas which
can be worked up into a formal proposal if people are actually
interested.


PROPOSAL - SECONDARY INTEGRAL TYPES

Introduce a concept of "secondary integral type". A secondary integral
type is a type which has the basic properties of integral types, but is
not one of the types so-named in the Standard. The secondary integral
types provided by an implementation are implementation-defined. The
following is intended to be an *exhaustive* list of areas that need to
be addressed, and what the action is.

[6.1.2.5]

(1) SITs always appear in pairs - a signed and an unsigned version. The
two members of the pair have the same storage and alignment
requirements, the representation of the common range of values is the
same, MAX_U_type >= MAX_S_type, and arithmetic in the unsigned type is
always modulo MAX_U_type+1.

(2) The range of values of all SITs is at least as great as that of
[un]signed char, and is no greater than that of [u]intmax_t.

(3) SITs are integral types (and are each either signed integral types
or unsigned integral types) and types such as wchar_t, size_t, and
ptrdiff_t can be SITs. SITs use pure binary notation. They take part in
the derivation and qualification processes in the normal way.

[6.1.2.6]

(4) Two different SITs are never compatible types, nor are they ever
compatible with the primary integral types.

[6.1.3.2]

(5) Integer constants can only be SITs if they have a larger value than
can be represented by unsigned long long.

[6.2.1.1]

(6) SITs divide into "small" SITs and "large" SITs. A SIT is "small" if
MAX_S_type <= INT_MAX and MAX_U_type <= UINT_MAX, and "large" otherwise.
The corresponding signed and unsigned types of a pair shall always be
both small or both large.

(7) Small SITs may be used in an expression wherever an int or unsigned
int can. The integral promotions apply to them in the same way. Large
SITs are unaffected by the integral promotions.

[6.2.1.2]

(8) With four (or now five) named types, the concept of "larger" is
obvious. Adding SITs requires a rethink. I propose:

  - Corresponding signed and unsigned types are "the same size".

  - If two types of the same signedness have different maximum
    representable values, then the "larger" type is the one with the
    greater maximum.

  - If two types have the same maximum representable value, then:
      * if both are primary types, use the "natural" order;
      * a SIT is "larger" than char and short, but "smaller" than
        int, long, and long long;
      * two SITs have an implementation-defined order which must be
        transitive among SITs of the same maximum representable value.

[I don't see why the last two cases should happen, but let's play safe.]

The number of bytes occupied by an object of a smaller type is less than
or equal to the number occupied by an object of larger type.

(8) The rules of 6.2.1.2 apply, except that the concept of "size"
should be read as "maximum value that can be represented" (a change that
is worth making anyway).

[6.2.1.7]

(9) The question of the usual arithmetic conversions is the knottiest
one. The following approach seems the cleanest to me - it is compatible
with the present wording, it is fairly easy to explain, and it works
even with SITs larger than long long.

  Replace the text following:

    Otherwise the integral promotions are performed on both operands.
    Then the following rules are applied:

  with:

    If both operands have signed types, or the operand with the larger
    type has an unsigned type, the operand with the smaller type is
    converted to the type of the other operand.

    If one operand has a signed type and the other has the corresponding
    unsigned type, the former is converted to the type of the latter.

    Otherwise, if the larger (signed) type can represent all the values
    of the smaller (unsigned) type, the operand with the smaller type is
    converted to the larger type.

    Otherwise both operands are converted to the unsigned type
    corresponding to the larger (signed) type.

[6.3 onwards]

(10) An expression required to have an integral type may be a (large)
SIT, and behaves like any other integral type.

(11) 6.3.7 paragraph 4 will require rewording.

[6.5.2]

(12) There are no type specifiers that explicitly generate SITs. An
implementation may define typedef names, additional type specifiers that
might be combined with existing ones, or additional combinations of
existing type specifiers, that specify SITs. All such additional
identifiers shall be in an appropriate reserved namespace or require
inclusion of a non-Standard header.

My intent here is to allow an implementation to provide concepts like

    signed __int24
    unsigned __int24
    long char
    int { 32767 }
    int __atleast __bigendian : 12

as well as simple typedefed names.

[6.8.1]

(13) All SITs smaller than long long shall be treated as long long in
preprocessor arithmetic. Alternatively, require preprocessor arithmetic
to use [u]intmax_t.

[7.4 <inttypes.h>]

(14) All the types provided may be SITs. However, if there is a primary
integral type that meets the criterion, it must be used instead of a
SIT. The implementation is not required to provide any SITs for these
purposes.

(15) Reiterate that no SIT can be larger than [u]intmax_t. However, the
latter might be SITs.

-- 
Clive D.W. Feather    | Associate Director  | Director
Tel: +44 181 371 1138 | Demon Internet Ltd. | CityScape Internet Services Ltd.
Fax: +44 181 371 1037 | <clive@demon.net>   | <cdwf@cityscape.co.uk>