ISO/ IEC JTC1/SC22/WG14 N866

                            Document number: WG14 N866 (J11/99-001)

Title:  Proposals for restricted pointer issues
Author: Bill Homer
Date:   10-Jan-99
URL:    http://reality.sgi.com/homer_craypark/c9x/n866.txt
        http://reality.sgi.com/homer_craypark/c9x/n866.html

  1  ---------- Summary -------------------------------------------------
  2  
  3  This document addresses four issues with the restrict qualifier,
  4  three from PC-UK0118, and one a side effect of the recent change
  5  to the specification for realloc.
  6  
  7  It supersedes the changes proposed in:
  8    SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4
  9  and it supplies the details of the change proposed in:
 10    SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-5
 11  
 12  The primary goal of these changes is to limit undefined behavior
 13  to those cases in which it promotes optimization.  This allows a
 14  programmer to use the restrict qualifier in more contexts, but
 15  does not impose additional burdens on a translator.
 16  
 17  What follows is a brief description of the issues, then the
 18  proposed changes to the specification that resolve these issues,
 19  and finally some examples that illustrate the effects of the
 20  changes.
 21  
 22  ---------- Description of the issues -------------------------------
 23  
 24  1.  The FCD specification of restrict forbids aliasing of
 25      unmodified objects.  Doing so does not promote optimization,
 26      and has other disadvantages, which are discussed in examples
 27      A-E below.  It is also contrary to the prior art in Fortran.
 28  
 29  2.  If a restricted pointer points to an object with const-qualified
 30      type, the FCD specification allows casting away the const
 31      qualifier, and modifying the object.  Disallowing such
 32      modifications promotes optimization as illustrated in example F
 33      below.
 34  
 35  3.  The FCD specification does not address the effect of accessing
 36      objects through pointers of various types, all based on one
 37      restricted pointer, as in Example G below.  In particular,
 38      these objects are supposed to determine an array, but the
 39      element type is not specified.
 40  
 41  4.  The specification of realloc now states that the old object is
 42      freed, and a new object is allocated.  The old and new objects
 43      cannot, in general, be viewed as being members of an array of
 44      such objects.  With the FCD specification, this appears to
 45      prohibit the use of the restrict qualifier for a pointer that
 46      points to an object that is realloc'd.  There are also related
 47      issues for dynamically allocated linked lists.  See examples
 48      H-K below.
 49  
 50  ---------- Proposed edits to FCD 6.7.3.1 ---------------------------
 51  
 52  === Append to paragraph #1
 53      the text marked with > at left:
 54  
 55  [#1]   Let D be a declaration of an ordinary identifier that provides a
 56         means of designating an object P as a restrict-qualified pointer
 57       > to objects of type T.
 58  
 59  
 60  === Change paragraph #4 from
 61      (text being changed marked with < at left):
 62  
 63  [#4] < During each execution of B, let A be the array object that
 64       < is determined dynamically by all references through pointer
 65       < expressions based on P. Then all references to values of A
 66       < shall be through pointer expressions based on P.  Furthermore,
 67         if  P is assigned the value of a pointer expression E that is
 68         based on another restricted pointer object P2, associated with
 69         block B2, then either the execution of B2 shall begin before
 70         the execution of B, or the execution of B2 shall end prior to
 71         the assignment.  If these requirements are not met, then the
 72         behavior is undefined.
 73  
 74  === to
 75      (new text indicated by > at left):
 76  
 77  [#4] > During each execution of B, let L be any lvalue that has &L based
 78       > on P.  If L is used to access the value of the object X that it
 79       > designates, and X is also modified (by any means), then the
 80       > following requirements apply.  T shall not be const-qualified.
 81       > Every other lvalue used to access the value of X shall also have
 82       > its address based on P.  Every access that modifies X shall be
 83       > considered also to modify P, for the purposes of this section.
 84         If P is assigned the value of a pointer expression E that is
 85         based on another restricted pointer object P2, associated with
 86         block B2, then either the execution of B2 shall begin before
 87         the execution of B, or the execution of B2 shall end prior to
 88         the assignment.  If these requirements are not met, then the
 89         behavior is undefined.
 90  
 91  === Change example 1 from
 92      (text being changed marked with < at left):
 93  
 94  [#7] EXAMPLE 1 The file scope declarations
 95  
 96             int * restrict a;
 97             int * restrict b;
 98             extern int c[];
 99  
100       < assert that if an object is accessed using the value of  one
101       < of  a, b, or c, then it is never accessed using the value of
102       < either of the other two.
103  
104  === to
105      (modified text indicated by > at left):
106  
107  [#7] EXAMPLE 1 The file scope declarations
108  
109           int * restrict a;
110           int * restrict b;
111           extern int c[];
112  
113    > assert that if an object is accessed using one of a, b, or c,
114    > and that object is modified anywhere in the program,
115    > then it is never accessed using either of the other two.
116  
117  === Change example 3 from
118      (text being changed marked with < at left):
119  
120  [#10] EXAMPLE 3 The function parameter declarations
121  
122    <            void h(int n, int * const restrict p,
123    <                    int * const q, int * const r)
124                 {
125                         int i;
126                         for (i = 0; i < n; i++)
127                                 p[i] = q[i] + r[i];
128                 }
129  
130    < show how const can be used  in  conjunction  with  restrict.
131    < The  const qualifiers imply, without the need to examine the
132    < body of h, that q and r cannot become based on p.  The  fact
133    < that  p  is  restrict-qualified  therefore  implies  that an
134    < object accessed through p is never accessed  through  either
135    < of  q  or  r.   This  is  the  precise assertion required to
136    < optimize the loop.  Note that a call of the form  h(100,  a,
137    < b,  b)  has defined behavior, which would not be true if all
138    < three of p, q, and r were restrict-qualified.
139  
140  === to
141      (modified text indicated by > at left):
142  
143  [#10] EXAMPLE 3 The function parameter declarations
144  
145    >      void h(int n, int * restrict p,
146    >              int * restrict q, int * restrict r)
147           {
148               int i;
149               for (i = 0; i < n; i++)
150                   p[i] = q[i] + r[i];
151           }
152  
153    > illustrate how an unmodified object can be aliased through two
154    > restricted pointers.  In particular, if a and b are disjoint
155    > arrays, a call of the form h(100, a, b, b) has defined behavior,
156    > because array b is not modified within function h.
157  
158  ---------- End of proposed edits to FCD 6.7.3.1 --------------------
159  
160  The following examples are included here to illustrate the benefits
161  of the proposed changes, but are not themselves proposed as
162  additions to the standard.
163  
164  --------- Examples for Issue 1 -------------------------------------
165  
166  Example A:  restrict in standard library
167  
168          extern in printf(const char * restrict format, ...);
169  
170          printf("%s", "s");
171  
172      Under C89, a translator could "pool" unmodifiable string
173      literals so that "s" refers to a subobject of "%s".  Such
174      pooling is not allowed by the restrict qualifier on the
175      first parameter of printf with the FCD specification of
176      restrict, but is allowed with the proposed specification
177      (because there are no requirements for unmodified objects).
178  
179  Example B:  restricted pointer structure members
180  
181          typedef struct { int n; double * restrict v; } vector;
182  
183          void addB(vector a, vector b, vector c)
184          {
185              int i;
186              for(i=0; i<a.n; i++)
187                  a.v[i] = b.v[i] + c.v[i];
188          }
189  
190      Allowing calls of the form addB(x,y,y) does not inhibit
191      optimization of the loop, but the behavior of such calls is
192      undefined under the FCD specification.  (This is particularly
193      unfortunate because there is no convenient way to rewrite the
194      function and eliminate the qualifier from the second and third
195      arguments of addB.) In contrast, such calls are defined under
196      the proposed specification.  In particular, the objects that
197      are modified in this function are those designated by a.v[i],
198      and &(a.v[i]), or a.v+i, is based on the restricted pointer a.v.
199      Therefore the only requirement imposed by the proposed
200      specification is that all lvalues used to access a.v[i]
201      during an execution of addB shall have addresses based on a.v.
202  
203  Example C:  two levels of indirection
204  
205          void addC(vector * restrict p, vector * restrict q)
206          {
207              int i;
208              for(i=1; i<p->n; i++)
209                  p->v[i] += q->v[i-1];
210          }
211  
212      Calls of the form addC(&x, &x) should be undefined to promote
213      optimization of the loop, and they are undefined under both the
214      FCD and the proposed specifications.  The analysis for the
215      proposed specification is as follows.  Let X be p->v[i], and
216      let B be the block of addC.  Then for a call addC(&x, &x) with
217      x.n>1, &(p->v[i]) is based on the restricted pointer p->v, and
218      so the modification of p->v[i] is considered also to modify
219      p->v.  Recursively, &(p->v) is based on the restricted pointer
220      p, and the same object is also accessed through the lvalue q->v
221      (because p and q have the same value), but &(q->v) is not based
222      on p, so the behavior is undefined.
223  
224  Example D:  array arguments with unmodified overlap
225  
226          void addD(int n, int * restrict x, int * restrict y)
227          {
228              int i;
229              for(i=0; i<n; i++) {
230                x[i]     += x[i+n];
231                y[i+2*n] += y[i+n];
232              }
233          }
234  
235      If z is an array of 300 int's, then for calls of the form
236      addD(100,z,z), the last half of the array accessed through x
237      coincides with the first half of the array accessed through y,
238      but these common elements are not modified.  The FCD
239      specification needlessly gives such calls undefined behavior,
240      but the proposed specification gives them defined behavior.
241  
242  Example E:  comparison with "property M" of
243              SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4
244  
245          typedef struct {
246              int * restrict p;
247              int * restrict q;
248          } T;
249  
250          int f(T * restrict a, T * restrict b)
251          {
252              *a->p += 1; 
253              return *b->q;
254          }
255  
256      In this example, even if *a and *b happen to refer to the same
257      object of type T, the members a->p and b->q will be distinct
258      restricted pointers.  A translator can infer that *a->p and
259      *b->q are distinct objects, and so there is no potential
260      aliasing to inhibit optimization of the body of f().
261  
262      Note that this means that the restrict qualifiers in the
263      declarations of parameters a and b are not needed to promote
264      optimization of f, and so the question in this example is
265      whether they "do harm" by giving calls of the form f(x, x)
266      undefined behavior.  The answer is clearly yes for the FCD
267      specification.
268  
269      The answer is also yes for the specification proposed in
270      SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4.  In
271      particular, *a has property M because it contains a subobject,
272      (*a).p, or a->p, that is a restricted pointer with a target,
273      *a->p, that is modified.  This requires that every lvalue used
274      to access *a must have its address based on a.  But in a call
275      of the form f(x, x), *b will designate the same object as *a,
276      and b is not based on a.
277  
278      In contrast, the answer is no  for the proposed specification.
279      In particular, during an execution of f, the only object that
280      is actually modified is *a->p.  Now &(*a->p) is a->p, a
281      restricted pointer, so a->p is also considered to be modified.
282      Recursively, &(a->p) is based on a, another restricted pointer.
283      Because no other lvalues are used to access either *a->p or
284      a->p, all the requirements of the proposed specification are
285      met (and they do not involve b).
286  
287      For this example, the proposed specification can be said to
288      be more precise that the one based on property M because its
289      "bottom up" recursion better limits the extent of objects
290      considered to be modified as a side effect of modifying the
291      target of a restricted pointer.
292  
293  --------- Examples for Issue 2 -------------------------------------
294  
295  Example F:  asserting unmodifiability
296  
297          extern void g();
298  
299          T f(const T * restrict p)
300          {
301              g(p);
302              return *p;
303          }
304  
305      Under the proposed specification, but not under the FCD
306      specification, the use of const and restrict in this example
307      implies that function g only accesses the value of *p, and does
308      not modify it.  On some architectures, this would allow the
309      load from *p to be initiated prior to the function call, so
310      that its value would be available sooner both within that call
311      and for the return value.  The potential benefit could be large
312      if T is a large aggregate type.
313  
314  --------- Example for Issue 3 --------------------------------------
315  
316  Example G:  casting restricted pointer-to-double to pointer-to-char
317  
318      The following example from PC-UK0118 illustrated uncertainty
319      about how to determine the type and extent of the array
320      accessed through a restricted pointer, under the FCD
321      specification.
322  
323          void fred (double restrict *a, double restrict *b) {
324              /* Assume sizeof(double) > 1 */
325              ((char *)b)[sizeof(double)+2] =
326                                    ((char *)a)[sizeof(double)+1];
327              a[0] = b[2];
328          }
329  
330          double array[3];
331          fred(array,array);
332  
333      Under the proposed specification, the requirements implied by
334      the restrict qualifiers apply separately to each modified
335      object, and it follows that this example has defined behavior.
336      In particular, the 1-byte object ((char *)b)[sizeof(double)+2]
337      is accessed only through b, and the disjoint object a[0] is
338      accessed only through a.
339  
340      Here it is assumed that the modification of the 1-byte object
341      ((char *)b)[sizeof(double)+2] is not considered to be a
342      modification of the containing object b[1].  By the way, the
343      same issue arises if the expressions appear as operands of
344      a comma-operator:
345  
346          ((char *)b)[sizeof(double)+1]++,
347              ((char *)b)[sizeof(double)+2]++
348  
349      Thus the extent of the object modified by a particular
350      operation is an issue that is important for the correct
351      interpretation of the restrict qualifier, but is separate from
352      its specification.
353  
354  --------- Examples for Issue 4 -------------------------------------
355  
356  Example H:  realloc of a restricted pointer
357  
358          void f(T * restrict q) {
359              T * restrict p = (T *)malloc(sizeof(T));
360              p[0] = q[0];
361              ...
362              p = (T *)realloc(p, 2*sizeof(T));
363              p[1] = p[0] + q[1];
364              ...
365          }
366  
367      With recent changes to the specification of realloc, the two
368      instances of the lvalue p[0] refer to two different objects
369      (with the value copied from the old object to the new object by
370      realloc).  Because there is no guarantee that there is an array
371      containing both these objects, this use of restrict does not
372      conform to the FCD specification.  It does conform to the
373      proposed specification, which does not require the existence of
374      such an array.
375  
376  Example I:  disjoint linked lists
377  
378          typedef struct item_t {
379              double d;
380              struct item_t * next;
381          } item;
382  
383          void f(item * restrict list1, item * restrict list2)
384          {
385              while(list1 != 0 && list2 != 0) {
386                 list1->d += list2->d;
387                 list1 = list1->next;
388                 list2 = list2->next;
389              }
390          }
391  
392     This use of restrict does not conform to the FCD specification
393     unless each list is contained within an array.  It does conform
394     to the proposed specification, if the items that are modified
395     through list1 are disjoint from the items that are accessed
396     through list2.  This allows the translator greater flexibility
397     in scheduling the loads and stores of the d members of the
398     items.
399  
400  Example J:  disjoint linked lists, with a recursion
401  
402          void g(item * list1, item * restrict list2)
403          {
404              if(list1 != 0) {
405                 while(list1->next != 0 && list2 != 0) {
406                    list1->next->d += list1->d + list2->d;
407                    list1 = list1->next;
408                    list2 = list2->next;
409                 }
410              }
411          }
412  
413      In this example, the modified objects, list1->next->d,
414      are accessed though the 'next' pointers in the first list.
415      The values of these next pointers are not based on either
416      list1 or list2.  This implies that
417      1)  list1 could not be restrict-qualified, because the modified
418          objects are also accessed through list1->d, and
419          &(list1->d) is based on list1.
420      2)  list2->d is disjoint from the modified objects, because
421          &(list2->d) is based on the restrict-qualified pointer
422          list2.
423  
424  Example K:  even/odd index values
425  
426          void f(int n, char * restrict p, char * restrict q)
427          {
428             int i;
429             for(i=0; i<n; i+=2) {
430               ++p[i];
431               ++q[i];
432             }
433          }
434  
435      With the FCD specification, a call of the form f(100, a, a+1)
436      has undefined behavior, but under the proposed specification,
437      it does not.
438  
439      There is an objection that can be raised to the change for this
440      example.  Although there are no dependencies in the loop caused
441      by aliasing, there may be "architectural" dependencies that
442      limit the ways in which the loads and stores of p[i] and q[i]
443      can be reordered.  In particular, p[i] and q[i] may both be
444      contained within the same smallest loadable unit of storage (on
445      systems that cannot load individual bytes), or within the same
446      cache line (a concern for parallel execution on multiprocessor
447      systems).  The FCD semantics do have the advantage of limiting
448      potential sharing to the ends of the arrays being accessed.
449  
450      On balance, the advantage of the FCD specification in this
451      example, which affects only some systems, seems outweighed by
452      the advantages of the proposed specification in all the
453      previous examples.