Lab on Data Representation Consequences

This laboratory exercise explores some practical consequences of the representation of data in program processing.

Integer Overflow

Recall from your prior work in C/C++ that the constants INT_MIN and INT_MAX (defined in <limits.h> in C and <climits> in C++) give the smallest and largest int values available.

Suppose i and j are two non-negative integers, and a program is supposed to find their average (as an integer). (In case the arithmetic average is a real number ending in .5, then the average may be rounded either up or down. Thus, the actual average 7.5 of 6 and 9 may be rounded to either 7 or 8.)

Five approaches are proposed to find this average:
```
      avg1 = (i + j) / 2;
      avg2 = i/2 + j/2;
      avg3 = (i+1)/2 + j/2;
      avg4 = i/2 + (j+1)/2;
      avg5 = (i+1)/2 + (j+1)/2;
```
1. Which, if any, of these approaches will work reliably for all non-negative integers i and j? Explain.
2. Suppose i and j may be any integers (positive, negative, or zero). In this general case, which, if any, of these approaches will work reliably for all values of i and j? Explain.
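To experiment with the five formulas, it may help to compute them side by side for the same inputs. The following is a minimal sketch, not part of the lab's files; the names `five_averages` and `print_five_averages` are hypothetical. Keep in mind that signed integer overflow is undefined behavior in C, so results for very large inputs may differ between compilers.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lab files): stores the five
   proposed averages of i and j in out[0] through out[4]. */
void five_averages(int i, int j, int out[5])
{
    out[0] = (i + j) / 2;
    out[1] = i / 2 + j / 2;
    out[2] = (i + 1) / 2 + j / 2;
    out[3] = i / 2 + (j + 1) / 2;
    out[4] = (i + 1) / 2 + (j + 1) / 2;
}

/* Prints the five results on one line for a given pair of inputs. */
void print_five_averages(int i, int j)
{
    int avg[5];
    five_averages(i, j, avg);
    printf("i=%d j=%d:", i, j);
    for (int k = 0; k < 5; k++)
        printf(" avg%d=%d", k + 1, avg[k]);
    printf("\n");
}
```

Calling `print_five_averages(6, 9)` from `main` reproduces the example in the text; pairs of odd numbers, negative values, and values near INT_MAX are natural next cases to try.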
Consider the program integer-average.c.
1. Compile and run the program, and record what int values are possible within C programs.
2. Review the program to determine how the values of arr1 are computed, and how the value of sum compares to INT_MAX.
3. Check the program output. Is the computation of the average of values for arr1 correct?
4. Answer steps 2 and 3 for array arr2. What is different in the processing? To the extent that you can, explain why the average computation for this array yields an incorrect result.
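When comparing a running sum to INT_MAX, one option is to test each addition before performing it. The sketch below is an illustration under that assumption and is not taken from integer-average.c; the name `add_would_overflow` is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if a + b cannot be represented as an
   int, and 0 otherwise. The comparisons themselves never overflow. */
int add_would_overflow(int a, int b)
{
    if (b > 0 && a > INT_MAX - b)
        return 1;   /* the sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return 1;   /* the sum would fall below INT_MIN */
    return 0;
}
```

A summing loop could call this check before each `sum += value` and report the index at which the sum stops fitting in an int.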

Conditionals and Loops

Consider the code segment
```
      double start = 7.0;
      double factor = 3.0;
      double quotient = start / factor;
      double result = quotient * factor;
      /* compare the round-trip result with the original value */
      if (result == start)
         printf("start and result are the SAME\n");
      else
         printf("start and result are DIFFERENT\n");
```
Include this code segment within a C program, using several values of start and with factor being 10.0, 5.0, 4.0, 3.0, and 2.0.
1. For which values of factor does the code segment print SAME, and for which values does the code yield DIFFERENT?
2. What are the binary representations of 1.0/factor for those cases where SAME is printed, and what can you say about the representations for the cases involving DIFFERENT?
3. Can you reach a possible conclusion about the accuracy of divisions of real numbers in C (or other programming languages)?
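To inspect the binary representation of a value such as 1.0/factor, the stored bits of a double can be printed directly. The following is a minimal sketch that assumes a double is a 64-bit IEEE 754 value (true on most current systems, but not guaranteed by the C standard); the helper names `double_bits` and `print_double_bits` are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the raw 64 bits of a double.
   Assumes double is a 64-bit IEEE 754 value. */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* copy the bytes without conversion */
    return bits;
}

/* Prints the bits as: sign, 11-bit exponent, 52-bit fraction. */
void print_double_bits(double x)
{
    uint64_t bits = double_bits(x);
    for (int k = 63; k >= 0; k--) {
        putchar(((bits >> k) & 1) ? '1' : '0');
        if (k == 63 || k == 52)
            putchar(' ');   /* separate the three fields */
    }
    putchar('\n');
}
```

Calling `print_double_bits(1.0 / factor)` for each value of factor shows whether the fraction field ends in zeros or fills all 52 bits.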
Consider program float-loop.c. As explained in the Reading on Consequences of the Data Representation of Numbers, repeatedly adding 0.1 to a number (starting at 0.0) never yields a result that is exactly 1.0.
1. Run this program to confirm that the program contains an infinite loop. (When the program is running, you can stop it by holding down the "control" key and typing "c". This is sometimes written "CTL-c".)
2. Change the real numbers from float to double. Then recompile and rerun the program. Does this again result in an infinite loop? Briefly explain why.
3. Rather than using the increment 0.1, use the increment 0.5. Is the loop still infinite, or does it stop as hoped? (Rerun the program using both float and double variables.) Explain these results, based on the representation of floating point numbers in binary.
4. Change the end value to 12.0, and experiment with increments of 0.25, 0.3, 0.4, and 0.75. Which, if any, of these yield infinite loops, and which programs terminate? In each case, explain why this result occurs (referring to binary representations for floating point numbers). (In this discussion, it may be helpful to note that 0.5 and 0.25 (decimal) are represented exactly in binary by 0.1 and 0.01, respectively.)
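When trying many increments, it can be convenient to cap the number of iterations so that a loop that never hits its end value still stops on its own. The sketch below is one way to do this; it is not float-loop.c, and the function name `steps_to_reach` and its `max_steps` parameter are hypothetical.

```c
/* Hypothetical helper: starting at 0.0, repeatedly adds increment and
   returns the number of additions needed for the running value to equal
   end exactly. Returns -1 if that does not happen within max_steps. */
int steps_to_reach(double increment, double end, int max_steps)
{
    double x = 0.0;
    for (int step = 1; step <= max_steps; step++) {
        x += increment;
        if (x == end)       /* exact comparison, as in the lab's loop test */
            return step;
    }
    return -1;
}
```

A return value of -1 corresponds to the case where an uncapped loop testing for exact equality would run forever. Changing `double` to `float` in the helper allows the same experiment at lower precision.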

Associativity of Addition for Real Numbers

Consider program arithmetic-series.c.
1. Compile and run this program, and observe what happens.
2. Explain why the output of the first loop is obtained.
3. Why do you think the term always seems to be a power of 10 in the first loop?
4. Explain why the first loop terminates. (That is, why is there a point when oldSum == sum?)
5. Although all values of term are printed as 0's to 17 decimal places, the values of sum do not always end in 0's. Explain how this could happen.
6. Review the second loop. Explain why the sequences of steps of the second loop do (or do not) parallel the steps in the first loop (except perhaps in the opposite order).
7. Why do you think that some printed values of term are not an exact power of 10 in the second loop?
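The section title refers to the fact that floating-point addition need not be associative: the grouping of the additions can change the rounded result. The following minimal sketch is independent of arithmetic-series.c; the names `sum_left` and `sum_right` are hypothetical, and the code assumes IEEE 754 doubles with the default rounding mode.

```c
/* Hypothetical helpers: add three doubles with two different groupings.
   The volatile qualifier forces each partial sum to be rounded to a
   double before the second addition is performed. */

/* Computes (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}
```

Comparing `sum_left(1e16, -1e16, 1.0)` with `sum_right(1e16, -1e16, 1.0)` is one case worth printing, since 1.0 is smaller than the spacing between adjacent doubles near 1e16.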

created 31 March 2022; revised 31 March 2022; expanded 24 July 2022
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.
