Defect Report #117

Submission Date: 03 Dec 93
Submittor: WG14
Source: Ron Guilmette
Question
ANSI/ISO C Defect Report #rfg24:
Subject: Abstract semantics, sequence points, and expression evaluation.
Does the following code involve usage which renders the code itself not strictly conforming?
int example ()
{
    int x1 = 2, x2 = 1, x_temp;

    return (x_temp = x1, x_temp) + (x_temp = x2, x_temp);
}

Background:
Subclause 5.1.2.3:
The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.
Subclause 6.3:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.
Although it is quite clear that the above quoted ``modified at most once'' rule was intended to render certain programs ``not strictly conforming,'' there is an unfortunate amount of ambiguity built into the current wording of that rule.
Quite simply, while the ``modified at most once'' rule is obviously telling us what a ``strictly conforming program'' must not do between two particular points in time, it is altogether less than clear what events and/or actions (exactly) are associated with these two points in time. Additionally, it is also less than clear (from reading the remainder of the C Standard) what actions and/or events are allowed (or required) to take place between some pair of sequence points in cases where both members of the pair are part of some large single expression whose evaluation order is not completely dictated by the C Standard.
Note that despite the assertion given in subclause 5.1.2.3 (and quoted above) the C Standard does not fully specify the behavior of the ``abstract machine,'' especially when it comes to the issue of the ordering of sub-expression evaluation used by the ``abstract machine'' model.
This fact makes it inherently impossible to precisely determine even just the relative timings of various events (including the ``occurrence'' of or the ``execution'' of or the ``evaluation'' of sequence points) which may (or must) occur sometime during the evaluation of a larger containing expression (except in a few cases involving || or && or ?: or , operators).
To put it more plainly, if some pair of sequence points will be ``reached'' (or ``evaluated'' or ``executed'') during the evaluation of any pair of subexpressions which are themselves operands for some binary operator (other than the operators || or && or ?: or ,) then the C Standard's description of the ``abstract machine'' semantics is inadequate to enable us to know either which order these two sequence points will occur in, or even which other aspects of the evaluation of the overall expression may (or must) occur ``between'' the two sequence points.
Thus, it seems that it may also be inherently impossible to know whether or not the prohibition against multiple modifications of a given variable ``between'' two consecutive sequence points is (or may be) violated in such contexts.
Here is a simple example of an expression which illustrates these points:
(x = i, x) + (x = j, x)
In this expression there are two ``comma'' sequence points; however, nothing in the C Standard gives any indication as to which of these two may be (or must be) ``evaluated'' or ``reached'' first. (Indeed, it would seem that on a parallel machine of some sort, both points could perhaps be reached simultaneously.) It is fairly clear, however, that each of the references to the stored values of x must not be evaluated until their respective preceding ``comma sequence points'' have been ``reached'' or ``evaluated.'' Thus, a partial (but very incomplete) ordering is imposed upon the sequence of events which must occur during the evaluation of this expression.
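As a rough illustration of how incomplete this partial ordering is, the sketch below enumerates every interleaving that respects it. The model is an assumption made for this example and is not something the C Standard defines: each operand of + is treated as a chain of three atomic events (assign to x, comma sequence point, read x), and the names `tally`, `interleave`, and `count_interleavings` are hypothetical.

```c
/* Hypothetical model: each operand of + is the chain
 *   step 0: assignment to x
 *   step 1: comma sequence point
 *   step 2: read of x
 * and the two chains may be interleaved in any order. */
struct tally {
    int total;      /* interleavings that respect both chains */
    int violating;  /* those with two assignments and no sequence point between */
};

static void interleave(int l, int r, int mods, int violated, struct tally *t)
{
    int side;

    if (l == 3 && r == 3) {          /* both chains fully consumed */
        t->total++;
        t->violating += violated;
        return;
    }
    for (side = 0; side < 2; side++) {
        int step = side == 0 ? l : r;
        int m = mods, v = violated;

        if (step == 3)
            continue;                /* this chain has no events left */
        if (step == 0) {             /* assignment to x */
            m++;
            if (m > 1)
                v = 1;
        } else if (step == 1) {      /* comma sequence point resets the count */
            m = 0;
        }
        /* step == 2 only reads x, so the count is unchanged */
        interleave(l + (side == 0), r + (side == 1), m, v, t);
    }
}

struct tally count_interleavings(void)
{
    struct tally t = { 0, 0 };

    interleave(0, 0, 0, 0, &t);
    return t;
}
```

Under this model, `count_interleavings()` reports 20 interleavings in total. In 12 of them x is assigned twice with no intervening sequence point; in the remaining 8 a comma sequence point separates the two assignments.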
For the sake of this example, let us call the leftmost comma in the above expression ``lcomma'' and call the rightmost comma ``rcomma.'' Given this terminology, it would appear that the C Standard permits the following sequence of events during evaluation of the above expression:
    eval(i)
    x=        (leftmost assignment to x)
    lcomma    <==== sequence point
    eval(x)   (leftmost reference to stored value of x)
    eval(j)
    x=        (rightmost assignment to x)
    rcomma    <==== sequence point
    eval(x)   (rightmost reference to stored value of x)
    +
Note that in this (very realistic) example, the stored value of x is never modified more than once between any pair of sequence points. Given that the ordering described above is both a perfectly plausible and also a perfectly permissible ordering for the evaluation of the expression in question, and given that this particular permissible ordering of events does not violate the ``modified at most once'' rule (quoted earlier) it therefore appears that the expression in question may in fact be interpreted as being ``strictly conforming,'' and that such expressions may appear within ``strictly conforming'' programs.
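The ordering listed above can be replayed step by step and compared with a different ordering that respects the same partial constraints. This is only a sketch under stated assumptions: the event labels, the `outcome` structure, the `replay` function, and the two order arrays are hypothetical names introduced for illustration, and events are modelled as atomic steps.

```c
/* Hypothetical event labels for the evaluation of (x = i, x) + (x = j, x). */
enum event { ASSIGN_L, LCOMMA, READ_L, ASSIGN_R, RCOMMA, READ_R };

struct outcome {
    int sum;       /* value the expression yields under this ordering */
    int max_mods;  /* most modifications of x between two sequence points */
};

/* The ordering given in the question. */
static const enum event listed_order[] = {
    ASSIGN_L, LCOMMA, READ_L, ASSIGN_R, RCOMMA, READ_R
};

/* Another ordering that still performs each read after its own comma. */
static const enum event other_order[] = {
    ASSIGN_L, ASSIGN_R, LCOMMA, RCOMMA, READ_L, READ_R
};

struct outcome replay(const enum event *order, int n, int i, int j)
{
    struct outcome out = { 0, 0 };
    int x = 0, left = 0, right = 0, mods = 0;
    int k;

    for (k = 0; k < n; k++) {
        switch (order[k]) {
        case ASSIGN_L: x = i; mods++; break;
        case ASSIGN_R: x = j; mods++; break;
        case LCOMMA:
        case RCOMMA:   mods = 0;  break;  /* sequence point resets the count */
        case READ_L:   left = x;  break;
        case READ_R:   right = x; break;
        }
        if (mods > out.max_mods)
            out.max_mods = mods;
    }
    out.sum = left + right;
    return out;
}
```

With i = 2 and j = 1 (the values of x1 and x2 in the question), `replay(listed_order, 6, 2, 1)` gives a sum of 3 with at most one modification between sequence points, while `replay(other_order, 6, 2, 1)` gives a sum of 2 with two modifications and no intervening sequence point. The result therefore depends on which interleaving is chosen.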
I would like the Committee to either confirm or reject this view, and to provide some commentary explaining that confirmation or rejection.
Response
The C Standard does not forbid an implementation from interleaving the subexpressions in the given example as specified above. Similarly, there is no requirement that an implementation use this particular interleaving. It is irrelevant that one particular interleaving yields code that properly delimits multiple modifications of the same object with sequence points. Any program that depends on this particular interleaving is depending on unspecified behavior, and is therefore not strictly conforming.
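For comparison, here is a minimal sketch of a variant that does not depend on any particular interleaving. It is not part of the Committee's response; the function name `example_distinct_temps` and the variables `left_temp` and `right_temp` are hypothetical. Because each operand writes to its own object, no object is modified more than once, whichever order the subexpressions are evaluated in.

```c
int example_distinct_temps(void)
{
    int x1 = 2, x2 = 1;
    int left_temp, right_temp;  /* one temporary per operand (hypothetical names) */

    /* Each temporary is assigned exactly once and read only after its own
     * comma sequence point, so every permitted interleaving gives x1 + x2. */
    return (left_temp = x1, left_temp) + (right_temp = x2, right_temp);
}
```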