Reading on Types and Variables

The simple programs introduced at the start of this course illustrated that many (most?) (all?) C programs utilize variables to store data during processing. These simple programs also illustrate the following.

A variable name can be almost any sequence of letters (upper or lower case), digits, and the underscore character (_).
- The variable name must start with a letter or underscore.
- Some keywords (e.g., int, float, double, if, else, etc.) have special meanings and cannot be redefined.
Variable names are case sensitive, so a capital letter (e.g., A) is considered distinct from a lower case letter (e.g., a).
C requires a variable to be declared before it can be used.
The declaration must include what type of data will be stored (e.g., int, double).
Once declared, variables store values which can be used in subsequent processing (e.g., in computations and printing). (Examples later in the course will show values stored in variables can be changed as well.)

Building upon these introductory examples, this reading explores data types and variables in C in moderate detail. Basic questions include:

Why declare variables at all; what is gained by having to specify variables and their types before the variables are used?
What happens when data and variables of differing types are mixed within expressions?
What options are available for printing data of various types?

Reading Outline

Data Representation and Storage
Aiding Error Detection
- Spelling
- Types
Assignment Operations
Auto-increment and auto-decrement
Arithmetic Operations
- Operator Precedence
- Resulting Type after an Operation
Conversion of Types and Casting
Printing
- Identifying the Type of Number to be Printed
- Additional Formatting Options

Data Representation and Storage

At a practical level, C identifies numerous types of data. Further, the 2011 C Standard (C11) allows some flexibility in the details underlying numeric types, particularly integer types. In the following description, details represent common implementations within contemporary computers, although these details may differ from one machine to another. General comparisons among types, however, will likely carry over to each C compiler and implementation.

For now, we only consider a few, commonly-used types of numbers. Details are discussed in the course segment on data representation.

short: integers (negative, zero, or positive numbers without decimal points), generally limited to the range -32,768 to 32,767.
int: integers (no decimal points), generally limited to the range -2,147,483,648 to 2,147,483,647.
unsigned short: non-negative integers (no decimal points), generally limited to the range 0 to 65,535
unsigned int: non-negative integers (no decimal points), generally limited to the range 0 to 4,294,967,295.
float: real numbers (with an explicit decimal point), stored with about 7 significant digits of accuracy
double: real numbers (with an explicit decimal point), stored with about 15-16 significant digits of accuracy

Behind the scenes, these types require different amounts of storage, and the details of storage for the types may vary.

Going beyond the details, perhaps the most important observation here is that the nature and details of storage differ for various data types.

Altogether, this background provides an important motivation for the declaration of variables in C: Declaration of each variable in C tells the computer how much storage to allocate to a variable and what details will apply in storing and processing data for that variable.

Comparison of basic numeric types

Although we will study the behind-the-scenes details of these data types in a later course segment, some general comparisons follow.

The short and various int types store integers exactly but the ranges of integers are constrained — perhaps severely constrained. Thus, these types allow exact storage and computation, but within a limited range. These types cannot be used for large positive or large negative integers.
The float and double types allow the storage of real numbers (numbers with decimal points) with a wide range of sizes. However, the accuracy of the numbers stored is limited.
Relatively little space is required for the short and unsigned short types; int and unsigned int types require more space.
Similarly, the float type requires half the space of the double type.

Altogether, each of these types represents a trade-off between the size/accuracy of stored values and the amount of storage required.

Aiding Error Detection

Programming languages vary considerably with regard to the introduction of variables within a program. At one extreme, some languages consider the appearance of each new variable name as an implicit request to create a variable by that name. Similarly, some languages infer from context what type of data might be involved with a variable, and processing proceeds by assuming the intended data type. Such a flexible environment introduces rather little overhead in writing programs, and coding in these languages can be relatively simple, as long as the programmer does not make typographical errors and the computing environment makes the correct guesses about what type(s) of data are intended. However, with the computer making inferences, the programming environment cannot perform much error checking.

In contrast, C takes a different approach — requiring the explicit declaration of variables and their types. Although this requires some initial overhead for a programmer, declaration of variables also aids the detection of errors in programs in at least two ways: detecting misspelling and finding inconsistent views of data by a programmer.

Spelling

A particularly common difficulty for programs that have just been written involves typographical errors. With C, variables must be declared before they are used, so the compiler can check that each variable has already been identified.

In the short-term, a programmer may be annoyed at having to react to compiler error messages about undeclared variables. It takes time to correct typographical errors and to compile a program — particularly when the programmer wants to move ahead to check how the new code will work.
In the medium- to long-term, finding typographical errors early can resolve difficulties that can take a long time to track down later. A small amount of time expended early can save substantial time later.

Limitations in Detecting Spelling Errors

Although declaration of variables can help identify many errors, declarations cannot help identify all typographical errors. For example,

A compiler cannot determine that a programmer intended one variable when another variable was typed.
A compiler cannot discover that the same variable name is used in different parts of the same program for different purposes.

Altogether, declaration of variables can help catch some types of errors quickly, but variable declaration cannot enable identification of all typographical errors.

Types

Some early studies of common errors in computer programs concluded that a substantial fraction of difficulties arose, because a programmer thought of a variable as one type during part of the program and as another type in a different part of the code.

Also, when a language does not require variables to be declared with a specified type and the computer tries to infer the intended data type, subtle errors can arise in some current programming languages. A programmer may write code, based on processing using one type of data, when the computer adopts a different type behind the scenes.

Since C requires variables to be declared as a specific type, and since the type cannot change within a program, C avoids possible inconsistencies in programmer thinking or processing inferences. C's requirements may seem inflexible and cumbersome at times, but C's use of data types anticipates and resolves many problems that arose historically and other difficulties still found in using current programming languages.

Historical and Current Examples

In early FORTRAN programs, it was common to use an integer variable to store several characters (e.g., the string "ab"). Processing could proceed without trouble if the variable consistently represented integers, for then arithmetic operations could be applied in a meaningful way. Similarly, storage of multiple characters in an integer could work fine. Difficulty could arise, however, when arithmetic operations were applied to data which were supposed to represent multiple characters.

The modern programming language PHP makes inferences and/or assumptions about the nature of data in a program, and processing details may depend upon the data types used. For example, the author has had the experience of assigning numbers to variables, and then trying to compare whether the numbers are equal. In some instances, the PHP environment interpreted one variable as a number and another as a character string. Since the two values were treated as different types, the computer concluded they could never be equal.

Arithmetic Operations

C provides these common arithmetic operations:

operation	meaning	data types	examples
+	addition	all types	1+2=3; 2.4+1.3=3.7
-	subtraction	all types	1-2=-1; 2.4-1.3=1.1
*	multiplication	all types	5*6=30; 2.4*1.3=3.12
/	division	all types	7/2=3; 2.4/1.3=1.84615...
%	remainder	integer types only	9%4=1

Operation Notes

Integer types support only a limited range of values. If an operation yields a number out of range, an error condition, called overflow, is created.
Real number types support only a limited degree of accuracy. If an operation yields a result beyond the accuracy available, an error condition, called round-off error, arises. Other types of errors for real numbers are discussed in the course segment on numeric data representation.
One integer divided by another integer yields an integer; the remainder is dropped. Thus, 7/3 yields 2 rather than 2.333....
Integer types have an additional operation: the remainder, denoted %. For example, the remainder is 1 when 7 is divided by 3, so 7%3 is 1.

Assignment Operations

In our programs up to now, we have assigned values to variables with an assignment operation (=). The general syntax is

variable = expression

In this setting, the expression on the right is evaluated, and the resulting value is stored in the variable — replacing the old value. For example, in the first program for this course quarts.c, we computed:

    liters = quarts / 1.056710 ;      /* arithmetic, assignment */

Often in future programs, we will want to update the value stored in a variable. For example,

   int a = 5;
   int b = 7;
   ...
   b = 2 * a + 12;

In this example, b begins with the value 7. However, after a while, the value for b is updated to 2 * 5 + 12 or 22 (assuming the value for a has not changed).

In the following variation, we use the value of a in computing the new value for a.

   int a = 5;
   ...
   a = 2 * a + 12;

Here, for example, the value of a is updated from 5 to 22.

Although many types of updates are possible, a common circumstance is to add, subtract, multiply, or divide one value by another. For example,

   a = a + 2;  // add 2 to a
   b = b - 5;  // subtract 5 from b
   c = c * 4;  // update c by multiplying it by 4
   d = d / 10; // update d by dividing it by 10

Since this type of computation is so common in C programs, C provides a shorthand type of assignment: +=, -=, *=, and /=. With this shorthand, the above examples become

  a += 2;
  b -= 5;
  c *= 4;
  d /= 10;

Auto-increment and Auto-decrement

Of all the possible updates to a variable, adding and subtracting 1 are particularly common. Here, C has two variations. For a variable a,

pre-increment: ++a means add 1 to a before its value is used in the rest of the expression.
post-increment: a++ means use the current value of a in the expression, and then add 1 to a.

The decrement operators --a and a-- work the same way, but subtract 1 instead of adding 1.

The following examples may help clarify these operators:

   int a = 5;   // a starts at 5
   int b = a++; // b is assigned a's initial value (5), and then a is incremented to 6
   int c = 12;  // c starts at 12
   int d = ++c; // c is incremented to 13, and this revised value is assigned to d

At the end of this code, a is 6, b is 5, c is 13, and d is also 13.

Operator Precedence

When multiple operations appear in the same expression, C follows the same conventions as mathematics. Thus, multiplication and division have higher priority or precedence than addition or subtraction. Further, when several operations of the same precedence appear in the same expression, the operations are done left to right.

In the expression 1 + 2 * 3, * has higher precedence than +. Thus, 2*3 is performed first, and the resulting 6 is added to 1 to yield 7 as the final answer. Effectively, precedence implies the addition of parentheses to the original expression to give 1 + (2 * 3) .
In the expression 1 - 2 - 3, evaluation proceeds left to right. Effectively, parentheses are added, so the left subtraction is done first, giving (1 - 2) - 3. The left subtraction gives -1. Subtracting 3 gives a final result of -4.

Resulting Type after an Operation

When two variables have the same numeric type (e.g., two int variables or two double variables) C performs arithmetic using that data type. For example,

a real number divided by a real number yields a real number (e.g., 5.7 / 2.2 = 2.590909...).
an integer divided by an integer yields an integer (e.g., 11 / 4 = 2 — any remainder is dropped in obtaining an integer).

However, when an arithmetic operation is applied to a real number and an integer, the integer is converted to a real number first, before the operation is performed.

In computing 2.3 + 5, the integer 5 is converted to the real 5.0, and the operation 2.3 + 5.0 is performed to yield 7.3. The same process and result is followed in computing 5 + 2.3.

For division, 11 / 4 yields 2 (the remainder 3 is dropped). However, in computing 11 / 4.0, the 11 is converted to 11.0 before division occurs, giving 11.0 / 4.0 which yields 2.75.

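Finally, when an expression contains a mixture of integers and real numbers, conversion to real numbers occurs as late as possible: each operation is considered on its own, and an integer is converted only when the other operand of that operation is a real number. (Integer arithmetic is often faster at the hardware level, so performing as much integer arithmetic as possible makes processing as quick as possible.)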

Consider the expression 1.0 + 11 / 4

Since division has higher precedence than addition, 11 / 4 is performed first.
Since both 11 and 4 are integers, integer division is performed, yielding 2
The expression has become 1.0 + 2
Since this expression contains both a real number and an integer, the integer is converted to the real number 2.0.
Adding 1.0 and 2.0 yields 3.0 as a result.

Similarly, consider the assignment double ans = 11 / 4

Space for a double is allocated for ans.
In initializing ans, the expression 11 / 4 is evaluated.
Since both 11 and 4 are integers, integer division applies, yielding 2.
The value 2 is converted to a double 2.0, and the result stored in the space allocated for ans.

Conversion of Types and Casting

In some cases, conversion from one data type to another seems easy and obvious. For example, conversion from an integer to a real number seems straightforward (conceptually, just add a decimal point, so 5 becomes 5. or 5.0).

When an arithmetic operation is applied to an integer and a real number (e.g., 2.3 + 5), we have noted that the integer will be converted to a real number before the operation is performed (e.g., 2.3 + 5 becomes 2.3 + 5.0).
Similarly, when an integer is assigned to a double (e.g., double a = 5), the integer is converted to a real number before the assignment occurs (e.g., double a = 5 becomes double a = 5.0).

However, sometimes we may want to explicitly convert one type to another. For example, in the expression 1.0 + 11/4, we may want the division to be performed for real numbers before the addition operation. In such instances, we can specify the data conversion by placing the desired type in parentheses, before the number to be converted. For example, we might write 1.0 + (double) 11 / (double) 4

Examples of Casting

Effectively, casting has a high operator precedence. Thus, casting takes place before various arithmetic operations. Use parentheses to force casting early or to clarify what data conversions will happen when.

Expression	Expression, with parentheses to show order of evaluation	Result	Notes
1.0 + (double) 11 / 4	1.0 + ((double) 11) / 4	3.75	Casting done first
1.0 + ((double) 11) / ((double) 4)	1.0 + (((double) 11) / ((double) 4))	3.75	Data conversion done first, then real-number division
1.0 + ((double) 11) / 4	1.0 + ((double) 11) / ((double) 4)	3.75	Since division of real and integer requires conversion of integer, both 11 and 4 changed to real.
1.0 + 11 / ((double) 4)	1.0 + ((double) 11) / ((double) 4)	3.75	Since division of real and integer requires conversion of integer, both 11 and 4 changed to real.

Printing

The first programs for this course included reasonably simple printf statements, so users could read the results of computations. As illustrated in those programs, the printf statement utilizes the following format:

   printf ("format specification", variable1, variable2, ... );

The "format specification", given within double quotes, may present text to be printed, together with instructions of what values are to be printed in what format.

First Printing Example

The very first program in this course converted a number of quarts to liters, and printing was accomplished with the statement:

    printf ("%d quarts = %lf liters\n", quarts, liters);

When the program is run with quarts being 2 and liters 1.892667, the program prints

2 quarts = 1.892667 liters

Looking at the "format statement":

%d indicates that a number should be inserted here (the value stored in quarts)
the letter d indicates the number will be an integer.
the next part of the "format statement" is " quarts = ", and this text is printed exactly as stated.
%lf indicates that a second number should be inserted (the value stored in liters) and the letters lf indicate the number will be a double.
the final part of the "format statement" is " liters\n". Again, this text is printed as stated — moving to a new line at the end.

Identifying the Type of Number to be Printed

In particular, a number will be inserted within the text of a "format specification" when a percent sign (%) is encountered, and the letters following the % indicate the type of data to be printed. Previous examples in the course have illustrated the use of %d for printing integers in decimal format and %lf for printing doubles. The following table shows a more complete list of number-formatting options for format specifications within printf statements.

Using the % character

Since the % character has a special meaning within a format specification, one must use %% when the % character itself is to be printed.

Conversion format	associated data type	Explanation	Example `printf` statement	Output
`d` or `i`	`int`	print number as a decimal integer	`printf("%d", 26);`	26
`o`	`unsigned int`	print number in octal (base 8) notation	`printf("%o", 26);`	32 (octal: 3 × 8 + 2 = 26)
`u`	`unsigned int`	print number as an unsigned decimal integer	`printf("%u", 26);`	26
`x`	`unsigned int`	print number in hexadecimal (base 16) notation	`printf("%x", 26);`	1a (hexadecimal: 1 × 16 + 10 = 26; a represents 10)
`f` or `F`	`double`	print a number in the decimal format [-]dddd.dddd; `f` format prints `nan` if the variable does not represent a valid number (`nan` stands for "not a number"); `F` format prints `NAN` in that case	`printf("%f", 10.0/3.0);`	3.333333
`e` or `E`	`double`	print a number in exponential, decimal notation; the number 123.456 is written 1.23456 E 2, representing 1.23456 × 10²; `e` format prints 1.234560e+02, with the exponent marked by a lower-case e; `E` format prints 1.234560E+02, with the exponent marked by an upper-case E	`printf("%e", 123.456);`	1.234560e+02

In addition, the prefix h may be added to d, i, o, u, and x for printing short and unsigned short integers. For example,

use "%hd" to print a variable declared as a short integer, and
use "%hu" to print a variable declared as an unsigned short integer.

Additional Formatting Options

In the example programs up to now, each number was printed in a default format. That is, the computer usually printed the number with no spaces before or after, and real numbers were printed with a default number of digits. Such formatting can work well, when numbers are to be printed within a sentence or other text.

However, printing of numbers within columns requires additional work. For example, we might want to print two rows of numbers in a table, with each column being 11 characters wide. Also, we may want to specify how many decimal places should be printed for real numbers.

To address these needs, C allows a more general designation of conversion format. In particular, for any format (e.g., f), two additional elements may be added:

field width: the minimum number of characters to be printed for a number.
- If the number to be printed requires fewer characters, the full width is allocated, and the number appears right justified in that field.
- If the number to be printed requires more characters, the full number is printed — possibly exceeding the specified field width.
precision:
- For integers, the minimum number of digits that must appear.
  If the precision is 0 and the value is 0, no digits are printed.
  If the precision equals the field width, a non-negative number is printed with leading zeros filling the field.
- For real numbers printed with f or e, the number of digits to appear after the decimal point.

To specify these optional parameters, a complete format specifier has the form

   % field_width.precision conversion_specifier

The following program printf-formatting.c provides several examples of these formatting options. This program generates two columns, each 11 characters wide, and the output is presented after the program.

/* Program to illustrate formatted printing of several data types */

#include <stdio.h>

int main ()
{
  short   sh = 13;          // short integer
  int     in = 13;          // normal integer
  float   fl = 10.0 / 3.0;  // basic floating point number
  double  db = -25.0 / 9.0; // double floating point number

  /* label columns to facilitate counting */
  printf ("data type:  /w f.width   /w precision\n");
  printf ("            12345678901 / 12345678901\n");

  printf ("integer printing\n");
  /* field width 11   field width 11, precision 11 */
  printf ("short:      %11hd / %11.11hd\n", sh, sh);
  printf ("integer:    %11d / %11.11d\n", in, in);

  printf ("\nfloating-point printing\n");
  printf ("float:      %11f / %11.3f\n", fl, fl);
  printf ("float:      %11f / %11.5f\n", fl, fl);
  printf ("double:     %11lf / %11.2lf\n", db, db);
  printf ("double:     %11lf / %11.8lf\n", db, db);

  return 0;
}

Program notes

In the program, variable sh is declared as a short integer. Hence, hd is used in the corresponding printf statement to print the value of sh as a decimal number. (Similarly, hx would be used to print the value of sh as a hexadecimal number.)

Program output

data type:  /w f.width   /w precision
            12345678901 / 12345678901
integer printing
short:               13 / 00000000013
integer:             13 / 00000000013

floating-point printing
float:         3.333333 /       3.333
float:         3.333333 /     3.33333
double:       -2.777778 /       -2.78
double:       -2.777778 / -2.77777778

In practice, when designing nicely formatted output, a programmer first needs to consider the desired format on a character-by-character basis. Once the layout is determined, the programmer should count how many characters must be allocated for each numeric value and what level of accuracy should be reported for real numbers. These width/precision values can then be added to format specifications.

created 15 July 2016 by Henry M. Walker
modest editing to address student feedback 7 September 2016 by Henry M. Walker
polishing of html to meet W3C Standard 1 February 2018 by Henry M. Walker

For more information, please contact Henry M. Walker at walker@cs.grinnell.edu .

CSC 115.005/006	Sonoma State University	Spring 2022
	CSC 115.005/006: Programming I
Instructor: Henry M. Walker Lecturer, Sonoma State University Professor Emeritus of Computer Science and Mathematics, Grinnell College
