CS 415, Section 001 | Sonoma State University | Spring, 2024
Algorithm Analysis
Instructor: Henry M. Walker
Lecturer, Sonoma State University
Although much of this course has been well developed in recent semesters, the SSU CS faculty have recently approved a partially-updated course description. Currently, the Web site is reasonably stable, but modest refinements are possible.
Since May, 2023, the California Faculty Association (CFA) – the labor union of professors, lecturers, librarians, counselors, and coaches across the 23 California State University campuses – has been in negotiations with the management of the California State University System. After a one-day strike on Monday, January 22, the two sides have reached a tentative agreement, and the strike has been called off. Effective Tuesday, January 23, SSU classes (including CS 415) will be held as scheduled.
This page is the second installment of a two-part reading on the binary representation of integers. The first installment discussed bits, bytes, and the representation of non-negative integers. This installment focuses upon signed integers.
In principle, two approaches might be considered in the storage of integers (or other numbers): fixed size and variable size.
Variable size storage allocates as many bits as needed to represent an integer value. The approach is analogous to our common experience writing decimal integers. We use a single decimal digit (e.g., 1 or 7) to write small integers, two digits (e.g., 12 or 65) for integers between 10 and 99, three digits (e.g., 123 or 948) for integers between 100 and 999, etc. Such an approach allows an unlimited range of integers — as long as we have sufficient space to write the digits.
Fixed size storage works within the constraint that only a specified number of digits (or binary bits) are available for use. For example, if 8 bits (one byte) are available, the number zero would be represented 00000000, the number one as 00000001, the number two as 00000010, ..., and the number 255 as 11111111. With this approach, numbers 256 or larger cannot be represented within the 8-bit constraint.
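The 8-bit example above can be sketched in C. This is only an illustration: the helper names `to_binary8` and `fits_in_8_bits` are hypothetical, and the sketch assumes the fixed-width type `uint8_t` from `<stdint.h>` is available (as it is on common platforms).

```c
#include <stdint.h>

/* Write the 8-bit pattern of value into buf as '0'/'1' characters.
   buf must have room for 9 chars (8 digits plus the terminating '\0').
   The name to_binary8 is illustrative, not a standard function. */
void to_binary8(uint8_t value, char buf[9])
{
    for (int i = 0; i < 8; i++) {
        /* bit 7 is the leftmost digit, bit 0 the rightmost */
        buf[i] = (value & (1u << (7 - i))) ? '1' : '0';
    }
    buf[8] = '\0';
}

/* Return 1 if n can be stored in 8 unsigned bits (0 through 255), else 0. */
int fits_in_8_bits(unsigned int n)
{
    return n <= 255u;
}
```

For instance, `to_binary8(2, buf)` fills `buf` with `00000010`, and `fits_in_8_bits(256)` returns 0, reflecting that 256 needs a ninth bit.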
In practice, although a few programming languages, such as Scheme, use variable size storage for integers, most programming languages, including C, use fixed size storage for integers. In the next section(s), we explore several fixed-size options for integers available within C.
The 2011 C Standard (C11) includes several required integer types and also allows a C implementation to define additional, extended types. Historically, the size of integer types has represented a compromise. Large integers require more bits than small integers. If an application can expect all integers to be small, then a small storage size (low number of bits) can save space. Of course, savings for just one number may be small. However, when using large arrays of numbers, the savings can be substantial.
As an example, MyroC stores images (e.g., from a camera) pixel by pixel. Each pixel contains red, green, and blue values, and a char variable (i.e., 1 byte) is used to store each color intensity. That is, the current storage mechanism requires 3 bytes for each pixel.
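As a rough sketch of the 3-bytes-per-pixel arithmetic, consider the following. The struct layout, the field names, and the function `image_bytes` are assumptions made for illustration; they are not taken from the MyroC source.

```c
#include <stddef.h>

/* Hypothetical pixel layout: one byte per color intensity (0 through 255). */
typedef struct {
    unsigned char R;
    unsigned char G;
    unsigned char B;
} Pixel;

/* Bytes needed for the color data of a width-by-height image,
   at 3 bytes (red, green, blue) per pixel. */
size_t image_bytes(size_t width, size_t height)
{
    return width * height * 3;
}
```

With these assumptions, an image of 192 by 266 pixels (dimensions chosen only as an example) needs `image_bytes(192, 266)`, or 153216 bytes, for its color data.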
Overall, storage and transmission factors can have a substantial effect on what can be stored and how long processing takes. C's different storage types allow programmers to choose a size that supports an application adequately without wasting resources and without yielding unnecessary communication delays.
Altogether, the 2011 C Standard defines five different basic types for storing [unsigned] integers. Also, the Standard indicates the minimum number of bits required for each integer type. However, with computer memories becoming larger and less expensive over time, the 2011 C Standard allows the use of larger bit allocations than the minimum. Run program integer-sizes.c on your local machine to determine how many bits are allocated for varying types on your own computer. Some additional notes follow:
The C Standard specifies minimal sizes for several unsigned integer types, but larger sizes are possible on a machine-by-machine basis.
Unsigned Integer Type | Standard Minimum (bits) | Common Size (bits) |
---|---|---|
unsigned char | 8 | 8 |
unsigned short int | 16 | 16 |
unsigned int | 16 | 32 |
unsigned long int | 32 | 64 |
unsigned long long int | 64 | 64 |
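The course program integer-sizes.c is not reproduced on this page. A minimal sketch of how such a program might work follows; the function names `bits_in` and `print_unsigned_sizes` are hypothetical. The sketch relies on the `sizeof` operator (which reports sizes in bytes) and the constant `CHAR_BIT` from `<limits.h>` (the number of bits per byte).

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of bits of storage in an object occupying the given number of bytes. */
int bits_in(size_t bytes)
{
    return (int)(bytes * CHAR_BIT);
}

/* Print the storage size, in bits, of each standard unsigned integer type. */
void print_unsigned_sizes(void)
{
    printf("unsigned char:          %d bits\n", bits_in(sizeof(unsigned char)));
    printf("unsigned short int:     %d bits\n", bits_in(sizeof(unsigned short int)));
    printf("unsigned int:           %d bits\n", bits_in(sizeof(unsigned int)));
    printf("unsigned long int:      %d bits\n", bits_in(sizeof(unsigned long int)));
    printf("unsigned long long int: %d bits\n", bits_in(sizeof(unsigned long long int)));
}
```

Calling `print_unsigned_sizes()` from `main` prints one line per type. The values printed depend on the machine and compiler, but each should be at least the Standard Minimum shown in the table.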
Addition of binary numbers follows much the same rules as with the addition of decimal numbers:
0 + 0 = 0
1 + 0 = 1
0 + 1 = 1
1 + 1 = 10    // in decimal 1+1 is 2, but 2 in binary is 10
In adding 1 + 1, think of the 1 as being a carry into the next place.
When adding multi-digit binary numbers, we proceed right-to-left, digit by digit, in much the same way as with decimal addition. First, we add the rightmost digits. For 0+0, 1+0, 0+1, the result is a single digit which we can write down directly. However, for 1+1, we write down 0 as the result for that column, and the 1 is carried: it must be added when we perform the addition for the next column to the left.
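To illustrate, we start at 0 and add 1 six times in succession:
Start | 0 | 1 | 10 | 11 | 100 | 101 |
---|---|---|---|---|---|---|
Add | +1 | +1 | +1 | +1 | +1 | +1 |
Sum | 1 | 10 | 11 | 100 | 101 | 110 |
Carries | no carry | carry from right digit | no carry | two carries, working right to left | no carry | carry from right digit |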
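The column-by-column procedure described above can be sketched in C. The function name `add_binary` is hypothetical, and the sketch assumes both inputs are strings of '0' and '1' characters of the same length.

```c
#include <string.h>

/* Add two binary numerals of equal length, given as strings of '0'/'1'.
   The sum is written to out, which must hold strlen(x) + 1 chars.
   Returns the carry out of the leftmost column (0 or 1). */
int add_binary(const char *x, const char *y, char *out)
{
    size_t len = strlen(x);
    int carry = 0;

    out[len] = '\0';
    for (size_t i = len; i > 0; i--) {              /* work right to left */
        int sum = (x[i - 1] - '0') + (y[i - 1] - '0') + carry;
        out[i - 1] = (char)('0' + sum % 2);         /* digit written in this column */
        carry = sum / 2;                            /* 1 when the column sum is 2 or 3 */
    }
    return carry;
}
```

For example, adding `00000011` and `00000001` produces `00000100` with a returned carry of 0; the two carries match the "11 + 1" column in the illustration.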
Throughout the discussion of integers so far, we have focused on non-negative integers, sometimes called unsigned integers.
However, if we want to allow both negative and non-negative values, then about half of the numbers will be negative and about half positive.
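With this in mind, C defines two versions of standard integers: unsigned types (such as unsigned int), which store only non-negative values, and signed types (such as int), which store both negative and non-negative values.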
Once both negative and non-negative integers are allowed, the natural next step is to determine how to represent these numbers within a specified number of bits.
Interestingly, the 2011 C Standard allows three alternatives for representing negative numbers: sign-magnitude notation, ones-complement notation, and twos-complement notation.
As noted earlier, the actual range of values for each integer data type varies according to the actual computer and compiler. The <limits.h> header file provides constants that present the actual range of integer values for each type.
Integer Type | Minimum in range (constant from <limits.h>) | Maximum in range (constant from <limits.h>) |
---|---|---|
unsigned char | 0 | UCHAR_MAX |
char | CHAR_MIN | CHAR_MAX |
unsigned short int | 0 | USHRT_MAX |
short int | SHRT_MIN | SHRT_MAX |
unsigned int | 0 | UINT_MAX |
int | INT_MIN | INT_MAX |
unsigned long int | 0 | ULONG_MAX |
long int | LONG_MIN | LONG_MAX |
unsigned long long int | 0 | ULLONG_MAX |
long long int | LLONG_MIN | LLONG_MAX |
Run program integer-ranges.c on your local machine to determine the ranges of the various types on your computer.
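The course program integer-ranges.c is not reproduced on this page. As a hedged sketch of what such a program might do, the following function (with the hypothetical name `print_integer_ranges`) prints the constants from the table above.

```c
#include <limits.h>
#include <stdio.h>

/* Print the range of each standard integer type, using the
   constants that <limits.h> defines for this machine and compiler. */
void print_integer_ranges(void)
{
    printf("unsigned char:          0 .. %u\n", (unsigned)UCHAR_MAX);
    printf("char:                   %d .. %d\n", CHAR_MIN, CHAR_MAX);
    printf("unsigned short int:     0 .. %u\n", (unsigned)USHRT_MAX);
    printf("short int:              %d .. %d\n", SHRT_MIN, SHRT_MAX);
    printf("unsigned int:           0 .. %u\n", UINT_MAX);
    printf("int:                    %d .. %d\n", INT_MIN, INT_MAX);
    printf("unsigned long int:      0 .. %lu\n", ULONG_MAX);
    printf("long int:               %ld .. %ld\n", LONG_MIN, LONG_MAX);
    printf("unsigned long long int: 0 .. %llu\n", ULLONG_MAX);
    printf("long long int:          %lld .. %lld\n", LLONG_MIN, LLONG_MAX);
}
```

The output varies from machine to machine, which is exactly the point of consulting `<limits.h>` rather than assuming a particular size.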
In all three notations for signed integers, the first (leftmost) bit designates whether the number is negative or non-negative.
As we shall discuss, the mechanism to determine the representation of a negative number varies according to the three notations, but in each case, the leftmost bit effectively indicates the sign of the number (- or +).
With the first bit devoted to the number's sign (- or +), the largest positive number that can be represented will be 0111...111. That is, the first bit is 0 (the + sign), and the remaining bits are 1. Numbers larger than this cannot be stored properly — a situation called overflow.
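For a 16-bit pattern, the role of the leftmost bit can be sketched as follows. The names `sign_bit16` and `MAX_POSITIVE16` are hypothetical, and the sketch assumes `uint16_t` from `<stdint.h>` is available.

```c
#include <stdint.h>

/* Largest positive 16-bit signed value: a leading 0 followed by fifteen 1s. */
enum { MAX_POSITIVE16 = 0x7FFF };

/* Return the leftmost (sign) bit of a 16-bit pattern:
   0 for a non-negative number, 1 for a negative number. */
int sign_bit16(uint16_t pattern)
{
    return (pattern >> 15) & 1;
}
```

Here `MAX_POSITIVE16` is 32767, and any pattern from `0x8000` upward has its sign bit set.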
In sign-magnitude notation, the first bit is 0 or 1 (for + or -, respectively), and the remaining bits give the binary bits of the non-negative number.
Examples using a 16-bit signed integer:
Decimal | 16-bit representation sign-magnitude notation |
---|---|
127 | 0000000001111111 |
-127 | 1000000001111111 |
0 | 0000000000000000 |
-0 | 1000000000000000 |
87 | 0000000001010111 |
-87 | 1000000001010111 |
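Notes: Two circumstances arise with sign-magnitude notation. First, zero has two representations (0 and -0). Second, addition must treat several cases separately, depending on the signs of the numbers being added.
Both of these circumstances (zero, addition) yield some complexity for circuitry.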
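A minimal sketch of 16-bit sign-magnitude encoding and decoding follows. The function names are hypothetical, and the patterns are held in `uint16_t` values purely for illustration (C's own `int` type is not assumed to use this notation).

```c
#include <assert.h>
#include <stdint.h>

/* Encode value in 16-bit sign-magnitude notation:
   bit 15 is the sign, bits 14..0 hold the magnitude.
   Requires -32767 <= value <= 32767. */
uint16_t to_sign_magnitude16(int value)
{
    assert(value >= -32767 && value <= 32767);
    if (value < 0)
        return (uint16_t)(0x8000u | (unsigned)(-value));
    return (uint16_t)value;
}

/* Decode a 16-bit sign-magnitude pattern.
   Both 0x0000 (0) and 0x8000 (-0) decode to zero. */
int from_sign_magnitude16(uint16_t pattern)
{
    int magnitude = pattern & 0x7FFF;
    return (pattern & 0x8000) ? -magnitude : magnitude;
}
```

For example, `to_sign_magnitude16(-127)` yields `0x807F`, which is the pattern 1000000001111111 from the table above.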
In ones-complement notation using n bits, as with sign-magnitude notation, all non-negative numbers must utilize the right n-1 bits and be written with a leading zero. Non-negative numbers are exactly the same with sign-magnitude and ones-complement notation (and also with twos-complement, to be discussed later).
Given a non-negative number, the corresponding negative number is determined by changing all 0's to 1's and all 1's to 0's. Several examples follow:
Decimal | 16-bit representation ones-complement notation |
---|---|
27 | 0000000000011011 |
-27 | 1111111111100100 |
0 | 0000000000000000 |
-0 | 1111111111111111 |
87 | 0000000001010111 |
-87 | 1111111110101000 |
114 | 0000000001110010 |
-114 | 1111111110001101 |
60 | 0000000000111100 |
-60 | 1111111111000011 |
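The bit-switching rule can be sketched in C with the bitwise complement operator `~`. The function name `to_ones_complement16` is hypothetical, and the pattern is held in a `uint16_t` purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Encode value in 16-bit ones-complement notation.
   Non-negative values are stored unchanged; for a negative value,
   every bit of the corresponding positive number is switched.
   Requires -32767 <= value <= 32767.
   (The pattern for -0, all 1s, cannot be requested through an int argument.) */
uint16_t to_ones_complement16(int value)
{
    assert(value >= -32767 && value <= 32767);
    if (value < 0)
        return (uint16_t)~(unsigned)(-value);   /* flip bits, keep the low 16 */
    return (uint16_t)value;
}
```

For example, `to_ones_complement16(-27)` yields `0xFFE4`, the pattern 1111111111100100.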
Although ones complement notation is clear and unambiguous, the following example illustrates that addition may seem a little quirky.
Consider binary addition for ±87 and ±27. (Refer to the example above for relevant values.)
Row | decimal | ones complement | Notes |
---|---|---|---|
addend | 27 | 0000000000011011 | |
addend | 87 | 0000000001010111 | |
binary sum | 114 | 0000000001110010 | correct: matches previously-computed value for 114 |
addend | 27 | 0000000000011011 | |
addend | -87 | 1111111110101000 | |
binary sum | -60 | 1111111111000011 | correct: matches previously-computed value for -60 |
addend | -27 | 1111111111100100 | |
addend | -87 | 1111111110101000 | |
binary sum | -114 | 11111111110001100 | carry to seventeenth bit; right 16 bits are off by one, so we need to add 1 |
addend | -27 | 1111111111100100 | |
addend | 87 | 0000000001010111 | |
binary sum | 60 | 10000000000111011 | carry to seventeenth bit; right 16 bits are off by one, so we need to add 1 |
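Notes: As with sign-magnitude notation, two circumstances arise with ones-complement notation. First, zero has two representations (0 and -0). Second, addition involves multiple cases: sometimes the sum of the bit patterns is correct as it stands, and sometimes a carry into an extra bit must be dropped and 1 added.
Again, both the two forms of zero and the multiple cases for addition yield some complexity for circuitry.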
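The four additions in the table above can be reproduced with a short sketch. The correction step (drop the carry into the seventeenth bit, then add 1) is commonly called an end-around carry. The function name `add_ones_complement16` is hypothetical.

```c
#include <stdint.h>

/* Add two 16-bit ones-complement patterns.
   The sum is formed in 32 bits so that a carry into the
   seventeenth bit is visible; when that carry occurs, it is
   dropped and 1 is added to the remaining 16 bits. */
uint16_t add_ones_complement16(uint16_t x, uint16_t y)
{
    uint32_t sum = (uint32_t)x + (uint32_t)y;
    if (sum > 0xFFFF)                   /* carry into the seventeenth bit */
        sum = (sum & 0xFFFF) + 1;       /* drop the carry, then add 1 */
    return (uint16_t)sum;
}
```

For example, `add_ones_complement16(0xFFE4, 0xFFA8)` adds the patterns for -27 and -87 and returns `0xFF8D`, the pattern for -114.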
The addition example for ones-complement arithmetic suggests an issue for n-bit signed numbers: sometimes addition of two ones-complement numbers seems to generate an extra bit, and in those cases, one must be added to the remaining n bits to obtain the proper answer. The following notes explain why this pattern occurs generally.
Suppose n bits are allocated for ones-complement numbers. Since n bits allow $2^n$ different patterns (about half for positives, about half for negatives, and 1 or 2 for zero), positive numbers will be between 0 and $2^{n-1}-1$.
In addition, since we only store n bits, we can add or subtract multiples of $2^n$ as we wish; such values are not stored in n-bit numbers.
Now suppose a and b are positive integers, $0 \le a, b \le 2^{n-1}-1$. The negative numbers -a and -b will be represented by binary values with a leading 1, that is, by numbers strictly between $2^{n-1}-1$ and $2^n$.
Next consider the process of switching bits from 0 to 1 and from 1 to 0. One way to accomplish this switch is subtracting the original bit from 1:
1 - 0 = 1
1 - 1 = 0
The positive number whose n bits are all 1's is $2^n-1$, so switching all bits of a can be achieved by subtracting the binary representation of a from $2^n-1$, bit by bit. That is, the ones-complement representation of the number -a is given by the number $2^n-1-a$, a number with a leading 1 that is larger than $2^{n-1}-1$. Similarly, -b is represented by $2^n-1-b$.
Now, let's consider cases of arithmetic; for simplicity, we assume b > a.
a + b will be positive. As long as this number can be represented with n-1 bits, addition will work without trouble.
Consider a - b. Since we are assuming b > a, (b - a) is positive, and (a - b) is negative. We have observed that -b is represented by $2^n-1-b$, so a - b is represented by $a + (2^n-1-b) = 2^n-1-(b-a)$. With ones-complement notation, this value corresponds exactly to -(b - a), as desired.
Consider -a - b. Since -a is represented by $2^n-1-a$ and -b is represented by $2^n-1-b$, -a - b is represented by $(2^n-1-a) + (2^n-1-b) = 2 \cdot 2^n - 2 - (a+b)$.
Here, the term $2 \cdot 2^n$ indicates a carry into the extra digit. Ignoring that carry removes one $2^n$ and leaves $2^n-2-(a+b)$, which is one less than $2^n-1-(a+b)$, the ones-complement representation of -(a + b). Altogether, we can obtain the correct result by ignoring the carry into the extra digit and adding 1.
Consider b - a. Since -a is represented by $2^n-1-a$, b - a is represented by $2^n-1-a+b$. Since we are assuming b > a, b - a is positive, and the answer to the addition should be b - a. In interpreting this number, we can ignore $2^n$ (the carry into an extra bit), because this value is not stored in n bits. This leaves the number $-1-a+b = (b-a)-1$, which is off by 1. Again, adding 1 to the result of the computation yields the correct answer.
Altogether, the representation of -a by $2^n-1-a$ sometimes works fine, but sometimes the result is off by 1 and includes an extra term $2^n$ (not recorded) that must be ignored for the result to work right.
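Readers who like to check algebra numerically may find the following sketch useful. It takes n = 16 and tests two claims from the notes above: switching all bits of a gives the same value as subtracting a from the all-1s number 65535, and the uncorrected sum of the patterns for -a and -b equals 2*65536 - 2 - (a + b). Both function names are hypothetical.

```c
#include <stdint.h>

/* Return 1 if switching all 16 bits of a gives (2^16 - 1) - a
   for every a from 0 through 2^15 - 1; otherwise return 0. */
int flip_matches_subtraction16(void)
{
    for (uint32_t a = 0; a <= 0x7FFF; a++) {
        uint32_t flipped = ~a & 0xFFFF;     /* switch bits, keep the low 16 */
        if (flipped != 0xFFFF - a)
            return 0;
    }
    return 1;
}

/* Uncorrected sum of the ones-complement patterns for -a and -b,
   kept in 32 bits so the carry into the seventeenth bit stays visible. */
uint32_t raw_sum_of_negatives16(uint32_t a, uint32_t b)
{
    return (0xFFFF - a) + (0xFFFF - b);
}
```

With a = 27 and b = 87, the raw sum is `0x1FF8C`: the leading 1 is the carry, and the remaining 16 bits (`0xFF8C`) are one less than `0xFF8D`, the ones-complement pattern for -114.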
As with the other n-bit notations for negative and non-negative integers, the left-most bit for non-negative integers is 0, the integers utilize the remaining n-1 bits, and non-negative integers must be in the range 0 through $2^{n-1}-1$.
Also, given a non-negative number a, the twos-complement notation for -a is obtained in two basic steps: first, form the ones-complement of a (change all 0's to 1's and all 1's to 0's); second, add 1 to the result.
Several examples follow:
decimal | ones-complement | twos-complement |
---|---|---|
27 | 0000000000011011 | 0000000000011011 |
-27 | 1111111111100100 | 1111111111100101 |
87 | 0000000001010111 | 0000000001010111 |
-87 | 1111111110101000 | 1111111110101001 |
60 | 0000000000111100 | 0000000000111100 |
-60 | 1111111111000011 | 1111111111000100 |
114 | 0000000001110010 | 0000000001110010 |
-114 | 1111111110001101 | 1111111110001110 |
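The twos-complement column of the table can be reproduced with a short sketch: switch all bits, then add 1. The function name `to_twos_complement16` is hypothetical, and the pattern is held in a `uint16_t` purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Encode value in 16-bit twos-complement notation.
   Non-negative values are stored unchanged; for a negative value,
   the bits of the corresponding positive number are switched
   (the ones-complement) and then 1 is added.
   Requires -32767 <= value <= 32767. */
uint16_t to_twos_complement16(int value)
{
    assert(value >= -32767 && value <= 32767);
    if (value < 0) {
        uint16_t flipped = (uint16_t)~(unsigned)(-value);   /* switch all bits */
        return (uint16_t)(flipped + 1);                     /* then add 1 */
    }
    return (uint16_t)value;
}
```

For example, `to_twos_complement16(-27)` yields `0xFFE5`, the pattern 1111111111100101.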
In the discussion of ones-complement notation, we noted that non-negative numbers a are represented by their unsigned binary representation, but negative numbers -a are represented by $2^n-1-a$.
Since the twos-complement representation for -a starts with ones-complement notation and adds one, the twos-complement representation for -a is $2^n-a$. With this notation, the issues for addition of ones-complement numbers (involving sometimes being off by one) are completely resolved. For example, $(2^n-a) + (2^n-b) = 2^n + (2^n-(a+b))$, and ignoring the term $2^n$ (which is not stored in n bits) leaves exactly the representation of -(a + b).
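As a hedged sketch of this point, twos-complement addition of 16-bit patterns needs no correction step: the patterns are added as unsigned numbers, and any carry out of the sixteenth bit is simply discarded. The function name `add_twos_complement16` is hypothetical.

```c
#include <stdint.h>

/* Add two 16-bit twos-complement patterns.
   The conversion back to uint16_t keeps only the low 16 bits,
   so a carry into the seventeenth bit is discarded. */
uint16_t add_twos_complement16(uint16_t x, uint16_t y)
{
    return (uint16_t)((uint32_t)x + (uint32_t)y);
}
```

For example, `add_twos_complement16(0xFFE5, 0xFFA9)` adds the patterns for -27 and -87 and returns `0xFF8E`, the pattern for -114, with no "add 1" adjustment.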
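In practice, twos-complement notation has two advantages over other approaches for representing negative integers: zero has a single representation, and addition works the same way regardless of the signs of the numbers being added.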
Altogether, although the 2011 C Standard allows the use of sign-magnitude notation, ones-complement notation, and twos-complement notation, [almost] all modern computers utilize twos-complement notation due to its simple representation of zero and its efficiency with addition of integers.
created August 7, 2022
revised December 29, 2022
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.