Why are these constructs using pre and post-increment undefined behavior?

#include <stdio.h>

int main(void)
{
   int i = 0;
   i = i++ + ++i;
   printf("%d\n", i); // 3

   i = 1;
   i = (i++);
   printf("%d\n", i); // 2 Should be 1, no ?

   volatile int u = 0;
   u = u++ + ++u;
   printf("%d\n", u); // 1

   u = 1;
   u = (u++);
   printf("%d\n", u); // 2 Should also be one, no ?

   register int v = 0;
   v = v++ + ++v;
   printf("%d\n", v); // 3 (Should be the same as u ?)

   int w = 0;
   printf("%d %d\n", ++w, w); // shouldn't this print 1 1

   int x[2] = { 5, 8 }, y = 0;
   x[y] = y ++;
   printf("%d %d\n", x[0], x[1]); // shouldn't this print 0 8? or 5 0?
}

C has the concept of undefined behavior, i.e. some language constructs are syntactically valid but you can't predict the behavior when the code is run.

As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted some leeway in the semantics. Instead of, e.g., requiring that all implementations handle (signed) integer overflow in exactly the same way, which would very likely impose serious performance costs, they left the behavior undefined, so that if you write code that causes integer overflow, anything can happen.
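To make the overflow example concrete, here is a minimal sketch of how code can stay clear of that particular undefined behavior by testing before adding. The helper name checked_add is hypothetical (it is not from the standard or any library), and this is just one way to write such a check.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b only when the result fits in an int.
   Returns true and stores the sum in *out on success; returns false and
   leaves *out untouched when a + b would overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* the overflowing addition is never performed */
    *out = a + b;
    return true;
}
```

The point is that the range test itself never overflows, so the program never reaches the undefined case in the first place.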

So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile, that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.

Your most interesting-looking example, the one with

u = (u++);

is a textbook example of undefined behavior (see Wikipedia's entry on sequence points).


Just compile and disassemble your line of code if you want to know exactly how you get the result you are getting.

This is what I get on my machine, together with what I think is going on:

$ cat evil.c
void evil(){
  int i = 0;
  i+= i++ + ++i;
}
$ gcc evil.c -c -o evil.bin
$ gdb evil.bin
(gdb) disassemble evil
Dump of assembler code for function evil:
   0x00000000 <+0>:   push   %ebp
   0x00000001 <+1>:   mov    %esp,%ebp
   0x00000003 <+3>:   sub    $0x10,%esp
   0x00000006 <+6>:   movl   $0x0,-0x4(%ebp)  // i = 0   i = 0
   0x0000000d <+13>:  addl   $0x1,-0x4(%ebp)  // i++     i = 1
   0x00000011 <+17>:  mov    -0x4(%ebp),%eax  // j = i   i = 1  j = 1
   0x00000014 <+20>:  add    %eax,%eax        // j += j  i = 1  j = 2
   0x00000016 <+22>:  add    %eax,-0x4(%ebp)  // i += j  i = 3
   0x00000019 <+25>:  addl   $0x1,-0x4(%ebp)  // i++     i = 4
   0x0000001d <+29>:  leave  
   0x0000001e <+30>:  ret
End of assembler dump.

(I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)

A good explanation of what happens in this kind of computation is provided in document n1188 from the ISO WG14 site.

I think the relevant parts of the C99 standard are 6.5 Expressions, §2:

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

and 6.5.16 Assignment operators, §4:

The order of evaluation of the operands is unspecified. If an attempt is made to modify the result of an assignment operator or to access it after the next sequence point, the behavior is undefined.


The behavior can't really be explained, because the code invokes both unspecified behavior and undefined behavior, so we cannot make any general predictions about it. If you read Olve Maudal's work, such as Deep C and Unspecified and Undefined, you can sometimes make good guesses in very specific cases with a specific compiler and environment, but please don't do that anywhere near production.

Moving on to unspecified behavior, the draft C99 standard, section 6.5 paragraph 3, says (emphasis mine):

The grouping of operators and operands is indicated by the syntax. Except as specified later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.

So when we have a line like this:

i = i++ + ++i;

We do not know whether i++ or ++i will be evaluated first. This is mainly to give the compiler better options for optimization.
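Unspecified order on its own is harmless as long as the operands don't interfere with each other. A small sketch to illustrate (the function and variable names are made up for this example): the two operands are function calls, so each modification of the counter is separated by a sequence point, and the only thing left open is which call runs first.

```c
int calls;          /* how many operand functions have run so far */
int order_left;     /* position of left_operand() in the evaluation order */
int order_right;    /* position of right_operand() in the evaluation order */

int left_operand(void)  { order_left  = ++calls; return 1; }
int right_operand(void) { order_right = ++calls; return 2; }

int sum_in_unspecified_order(void)
{
    calls = 0;
    /* Which call happens first is unspecified, but the behavior is not
       undefined: the result is 3 either way. */
    return left_operand() + right_operand();
}
```

After the call, order_left and order_right hold 1 and 2 in some order. Which one gets 1 depends on the compiler, and a correct program must not rely on it.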

We also have undefined behavior here, since the program modifies variables (i, u, etc.) more than once between sequence points. From draft standard section 6.5 paragraph 2 (emphasis mine):

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

It cites the following code examples as being undefined:

i = ++i + 1;
a[i++] = i; 
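
For contrast, here is a sketch of a well-defined way to write what a[i++] = i; might have been meant to do. It assumes the intent was "store the current index, then advance", which is only one possible reading, and the function name is hypothetical.

```c
/* Hypothetical helper: stores the current index into a[*i], then advances
   the index. Each object is modified once per statement, and the ; between
   the statements is a sequence point. */
void store_then_advance(int a[], int *i)
{
    a[*i] = *i;     /* i is only read here */
    *i = *i + 1;    /* i is modified here, in its own statement */
}
```

Applied to the question's array { 5, 8 } with the index starting at 0, this leaves { 0, 8 } and an index of 1. If the other reading was intended, swap the two statements.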

In each of the following statements from the question, the code attempts to modify an object more than once before the next sequence point, which is at the ; in each case:

i = i++ + ++i;
^   ^       ^

i = (i++);
^    ^

u = u++ + ++u;
^   ^       ^

u = (u++);
^    ^

v = v++ + ++v;
^   ^       ^
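
If you actually need a value out of something like these statements, you have to pick an order yourself and spell it out. A minimal sketch, assuming the intended reading is "left operand first, then right"; that ordering is my assumption, not something the standard or any compiler promises for the original statements, and the function names are hypothetical.

```c
/* One explicit reading of "i = i++ + ++i": take the old value, finish the
   post-increment, then pre-increment and add. */
int post_plus_pre(int i)
{
    int left = i;       /* value that i++ yields: the old i */
    i += 1;             /* side effect of i++, now complete */
    i += 1;             /* ++i increments ... */
    int right = i;      /* ... and yields the new value */
    return left + right;
}

/* "i = (i++)" read as "increment i once". */
int increment_once(int i)
{
    i += 1;
    return i;
}
```

Starting from 0, this reading of the first statement gives 2, which matches neither of the outputs in the question. That is fine: the original statements have no defined result for a rewrite to match.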

Unspecified behavior is defined in the draft C99 standard in section 3.4.4 as:

use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

and undefined behavior is defined in section 3.4.3 as:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

and notes that:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).


Comments
  • Homework? Not trying to be a pain, but you should never write code with expressions like these. They are usually given as academic examples, sometimes showing that different compilers yield different output.
  • @Jarett, nope, just needed some pointers to "sequence points". While working I found a piece of code with i = i++, and I thought "This isn't modifying the value of i". I tested it and wondered why. Since then, I've removed this statement and replaced it with i++;
  • Explain these undefined behaviors? Explain what about them? How they behave is undefined.
  • I think it's interesting that everyone ALWAYS assumes that questions like this are asked because the asker wants to USE the construct in question. My first assumption was that PiX knows that these are bad, but is curious why they behave the way they do on whatever compiler s/he was using... And yeah, what unWind said... it's undefined, it could do anything... including JCF (Jump and Catch Fire)
  • I'm curious: Why don't compilers seem to warn on constructs such as "u = u++ + ++u;" if the result is undefined?
  • I knew it was undefined (the idea of seeing this code in production frightens me :)), but I tried to understand the reason for these results, especially why u = u++ incremented u. In Java, for example, u = u++ returns 0 as (my brain) expected :) Thanks for the sequence points links BTW.
  • @PiX: Things are undefined for a number of possible reasons. These include: there is no clear "right result", different machine architectures would strongly favour different results, existing practice is not consistent, or beyond the scope of the standard (e.g. what filenames are valid).
  • @PaulManta, If you see this, editing answers is not intended for the addition of irrelevant information to already-accepted answers. This is a C question and the answer was fine as it was to describe the situation in C standards from C90 to C11. Editing is for fixing syntax and style.
  • @rusty Not sure what you mean. The term "undefined behavior" is used in the C standard. It means that even though some constructs are syntactically valid and will typically compile, they lead to undefined behavior, i.e. they do not make sense and should be avoided, since your program is broken if it has undefined behavior.
  • @MattMcNabb that is only well defined in C++11 not in C11.
  • How do I get the machine code? I use Dev C++, and I played around with the 'Code Generation' option in the compiler settings, but got no extra file output or any console output.