Representation of Signed Integers
Reading outline
- Unsigned integers
- Bits: Simple Data Elements
- Bytes: Groups of Bits
- Non-negative Integers
- Basic [unsigned] C Integer Types
- Signed Integers
This page is the second installment of a two-part reading on the binary representation of integers. The first installment discussed bits, bytes, and the representation of non-negative integers. This installment focuses upon signed integers.
Fixed and Variable Size Approaches
In principle, two approaches might be considered in the storage of integers (or other numbers): fixed size and variable size.
- Variable size storage allocates as many bits as needed to represent an integer value. The approach is analogous to our common experience writing decimal integers. We use a single decimal digit (e.g., 1 or 7) to write small integers, two digits (e.g., 12 or 65) for integers between 10 and 99, three digits (e.g., 123 or 948) for integers between 100 and 999, etc. Such an approach allows an unlimited range of integers, as long as we have sufficient space to write the digits.
- Fixed size storage works within the constraint that only a specified number of digits (or binary bits) are available for use. For example, if 8 bits (one byte) are available, the number zero would be represented 00000000, the number one as 00000001, the number two as 00000010, ..., and the number 255 as 11111111. With this approach, numbers 256 or larger cannot be represented within the 8-bit constraint.
In practice, although a few programming languages, such as Scheme, use variable size storage for integers, most programming languages, including C, use fixed size storage for integers. In the next sections, we explore several fixed-size options for integers available within C.
Basic [unsigned] C Integer Types
The 2011 C Standard (C11) includes several required integer types and also allows a C implementation to define additional, extended types. Historically, the size of integer types has represented a compromise. Large integers require more bits than small integers. If an application can expect all integers to be small, then a small storage size (low number of bits) can save space. Of course, savings for just one number may be small. However, when using large arrays of numbers, the savings can be substantial.
As an example, MyroC stores images (e.g., from a camera) pixel by pixel. Each pixel contains red, green, and blue values, and a char variable (i.e., 1 byte) is used to store each color intensity. That is, the current storage mechanism requires 3 bytes for each pixel.
- A medium-resolution image (800 by 1280 pixels) requires about 3,072,000 bytes (3 MB) when color data are stored with 1 byte per color per pixel. Experiments indicate that transmission of these data over Bluetooth requires 25-30 seconds per image.
- If 8 bytes were used for each color (C's long int or long long int type, depending upon the machine), one image would require about 24MB of storage, and transmission of a picture over Bluetooth would require over 7 minutes.
Overall, storage and transmission factors can have a substantial effect on what can be stored and how long processing takes. C's different storage types allow programmers to choose a size that supports an application adequately without wasting resources and without yielding unnecessary communication delays.
Altogether, the 2011 C Standard defines five different basic types for storing [unsigned] integers. Also, the Standard indicates the minimum number of bits required for each integer type. However, with computer memories becoming larger and less expensive over time, the 2011 C Standard allows the use of larger bit allocations than the minimum. Run program integer-sizes.c on your local machine to determine how many bits are allocated for varying types on your own computer. Some additional notes follow:
- Historically, the char type or unsigned char type (1 byte) has been used for character data, although char is considered a type for small integers. The char type will be discussed in more detail in the course segments, sessions, readings, and labs related to character data.
- Traditionally, characters have focused upon a Western alphabet although other options are available, and historically storage of these characters has required just one byte.
- New character options allow multiple bytes, and new capabilities are identified in the wchar.h header and library.
- In the progression of types, [unsigned] char, [unsigned] short int, [unsigned] int, [unsigned] long int, [unsigned] long long int, each type conceptually utilizes about twice the number of bits as the previous type, although bit allocations for specific implementations vary. For example, on many 64-bit systems a long int (64 bits) contains twice the number of bits as an int (32 bits). Details may be found in the table below.
Unsigned C Types
Unsigned Integer Type | Standard Minimum (bits) | Common Size (bits) |
---|---|---|
unsigned char | 8 | 8 |
unsigned short int | 16 | 16 |
unsigned int | 16 | 32 |
unsigned long int | 32 | 64 |
unsigned long long int | 64 | 64 |
Excursion: Basic Addition
Addition of binary numbers follows much the same rules as with the addition of decimal numbers:
    0 + 0 = 0
    1 + 0 = 1
    0 + 1 = 1
    1 + 1 = 10    // in decimal 1+1 is 2, but 2 in binary is 10
In adding 1 + 1, think of the 1 as being a carry into the next place.
When adding multi-digit binary numbers, we proceed right-to-left, digit by digit in much the same way as with decimal addition. First, we add the rightmost digits. For 0+0, 1+0, 0+1, the result is a single digit which we can write down directly. However, for 1+1, we write down 0 as the result for that column, but then the 1 needs to be added when we perform the addition for the next column of numbers to be added.
To illustrate, we add 1 to 0 six times:
Sum | Result | Carrying |
---|---|---|
0 + 1 | 1 | no carry |
1 + 1 | 10 | carry from right digit |
10 + 1 | 11 | no carry |
11 + 1 | 100 | two carries working right to left |
100 + 1 | 101 | no carry |
101 + 1 | 110 | carry |
Signed Integers
Throughout the discussion of integers so far, we have focused on non-negative integers, sometimes called unsigned integers.
- If 16 bits are allocated to an integer, then the integer could store $2^{16}$ bit patterns, and a variable could hold values between 0 and $2^{16}-1 = 65535$.
- If 32 bits are allocated, then an integer could store $2^{32}$ bit patterns, and a variable could hold values between 0 and $2^{32}-1 = 4294967295$.
However, if we want to allow both negative and non-negative values, then about half of the numbers will be negative and about half positive.
- If 16 bits are allocated for both negative and non-negative integers, then the range of numbers would be approximately -32767 to 32767.
- If 32 bits are allocated, then the range could be -2147483647 to 2147483647.
With this in mind, C defines two versions of standard integers:
- The basic integer types (char, short int, int, long int, long long int) allow negatives, zero, and positives. (Whether plain char is signed depends on the implementation; signed char is always signed.)
- Adding "unsigned" to the type (unsigned char, unsigned short int, unsigned int, unsigned long int, unsigned long long int) prescribes all bits will be used to represent non-negative integers.
Once both negative and non-negative integers are allowed, the natural next step is to determine how to represent these numbers within a specified number of bits.
Interestingly, the 2011 C Standard allows three alternatives for representing negative numbers.
Range of Integer Types Varies
As noted earlier, the actual range of values for each integer data type varies according to the actual computer and compiler. The <limits.h> header file provides constants that present the actual range of integer values for each type.
Integer Type | Minimum in range (from <limits.h>) | Maximum in range (from <limits.h>) |
---|---|---|
unsigned char | 0 | UCHAR_MAX |
char | CHAR_MIN | CHAR_MAX |
unsigned short int | 0 | USHRT_MAX |
short int | SHRT_MIN | SHRT_MAX |
unsigned int | 0 | UINT_MAX |
int | INT_MIN | INT_MAX |
unsigned long int | 0 | ULONG_MAX |
long int | LONG_MIN | LONG_MAX |
unsigned long long int | 0 | ULLONG_MAX |
long long int | LLONG_MIN | LLONG_MAX |
Run program integer-ranges.c on your local machine to determine the ranges of the various types on your computer.
Sign-magnitude Notation
In all three notations for signed integers, the first (leftmost) bit designates whether the number is negative or non-negative.
- If the leftmost bit is 0, the number is non-negative.
- If the leftmost bit is 1, the number is non-positive.
As we shall discuss, the mechanism to determine the representation of a negative number varies according to the three notations, but in each case, the leftmost bit effectively indicates the sign of the number (- or +).
With the first bit devoted to the number's sign (- or +), the largest positive number that can be represented will be 0111...111. That is, the first bit is 0 (the + sign), and the remaining bits are 1. Numbers larger than this cannot be stored properly — a situation called overflow.
In sign-magnitude notation, the first bit is 0 or 1 (for + or -, respectively), and the remaining bits give the binary bits of the non-negative number.
Examples using a 16-bit signed integer:
Decimal | 16-bit representation (sign-magnitude notation) |
---|---|
127 | 0000000001111111 |
-127 | 1000000001111111 |
0 | 0000000000000000 |
-0 | 1000000000000000 |
87 | 0000000001010111 |
-87 | 1000000001010111 |
Notes:
- As shown in the table, the number zero has two representations (+0 and -0).
- With this notation, addition is somewhat complicated, as one must consider the addition of two positive numbers, one positive and one negative, one negative and one positive, and two negatives.
Both of these circumstances (zero, addition) yield some complexity for circuitry.
Ones-complement Notation
As with sign-magnitude notation using n bits, all non-negative numbers must utilize the right n-1 bits and be written with a leading zero. Non-negative numbers are exactly the same with sign-magnitude and ones-complement notation (and also with twos-complement, to be discussed later).
Given a non-negative number, the corresponding negative number is determined by changing all 0's to 1's and all 1's to 0's. Several examples follow:
Decimal | 16-bit representation (ones-complement notation) |
---|---|
27 | 0000000000011011 |
-27 | 1111111111100100 |
0 | 0000000000000000 |
-0 | 1111111111111111 |
87 | 0000000001010111 |
-87 | 1111111110101000 |
114 | 0000000001110010 |
-114 | 1111111110001101 |
60 | 0000000000111100 |
-60 | 1111111111000011 |
Addition in Ones-Complement
Although ones-complement notation is clear and unambiguous, the following example illustrates that addition may seem a little quirky.
Consider binary addition for ±87 and ±27. (Refer to the example above for relevant values.)
Row | Decimal | Ones-complement | Notes |
---|---|---|---|
first addend | 27 | 0000000000011011 | |
second addend | 87 | 0000000001010111 | |
binary sum | 114 | 0000000001110010 | correct: matches the previously computed value for 114 |
first addend | 27 | 0000000000011011 | |
second addend | -87 | 1111111110101000 | |
binary sum | -60 | 1111111111000011 | correct: matches the previously computed value for -60 |
first addend | -27 | 1111111111100100 | |
second addend | -87 | 1111111110101000 | |
binary sum | -114 | 11111111110001100 | carry into a seventeenth bit; the right 16 bits are off by 1, so 1 must be added |
first addend | -27 | 1111111111100100 | |
second addend | 87 | 0000000001010111 | |
binary sum | 60 | 10000000000111011 | carry into a seventeenth bit; the right 16 bits are off by 1, so 1 must be added |
Notes: As with sign-magnitude notation, two circumstances arise with ones-complement notation:
- The number zero has two representations (all zeros and all ones).
- The example with addition suggests that sometimes addition works fine with ones-complement notation. However, sometimes there is a carry into a seventeenth bit, and when the seventeenth bit is removed, a 1 must be added to the computed sum to obtain the correct answer.
Again, both the two forms of zero and the multiple cases for addition yield some complexity for circuitry.
Optional: For the Mathematically Inclined and/or Curious
The addition example for ones-complement arithmetic suggests an issue for n-bit signed numbers: sometimes addition of two ones-complements numbers seems to generate an extra bit, and in those cases, one must be added to the remaining n bits to obtain the proper answer. The following notes explain why this pattern occurs generally.
Suppose n bits are allocated for ones-complement numbers. Since n bits allow $2^n$ different patterns (about half for positives, about half for negatives, and 1 or 2 for zero), positive numbers will be between 0 and $2^{n-1}-1$.
In addition, since we only store n bits, we can add or subtract multiples of $2^n$ as we wish; such values are not stored in n-bit numbers.
Now suppose a and b are positive integers, $0 \le a, b \le 2^{n-1}-1$. The negative numbers -a and -b will be represented by binary values with a leading 1, that is, by numbers between $2^{n-1}-1$ and $2^n$.
Next consider the process of switching bits from 0 to 1 and from 1 to 0. One way to accomplish this switch is subtracting the original bit from 1:
    1 - 0 = 1
    1 - 1 = 0
The positive number representing all n 1's is $2^n - 1$, so switching all bits of a can be achieved by subtracting the binary representation of a from $2^n - 1$, bit by bit. That is, the ones-complement representation of the number -a is given by the number $2^n - 1 - a$, a number with a leading 1 that is larger than $2^{n-1}-1$. Similarly, -b is represented by $2^n - 1 - b$.
Now, let's consider cases of arithmetic; for simplicity, we assume b > a.
- a + b will be positive. As long as this number can be represented with n-1 bits, addition will work without trouble.
- Consider a - b. Since we are assuming b > a, (b - a) is positive, and (a - b) would be negative. We have observed that -b would be represented by $2^n - 1 - b$, so a - b is computed as $a + (2^n - 1 - b) = 2^n - 1 - (b - a)$. With ones-complement notation, this value exactly corresponds to -(b - a), as desired.
- Consider -a - b. Since -a is represented by $2^n - 1 - a$ and -b is represented by $2^n - 1 - b$, -a - b is computed as $(2^n - 1 - a) + (2^n - 1 - b) = 2 \cdot 2^n - 2 - (a + b)$. Here, the term $2 \cdot 2^n$ indicates a carry into the extra digit: one $2^n$ is the carry itself, which is not stored. What remains is $2^n - 2 - (a + b)$, which is one less than the ones-complement representation $2^n - 1 - (a + b)$ of -(a + b). Altogether, we can obtain the correct result by ignoring the carry into the extra digit and adding 1.
- Consider b - a. Since -a is represented by $2^n - 1 - a$, b - a is computed as $b + (2^n - 1 - a) = 2^n - 1 + (b - a)$. Since we are assuming b > a, b - a is positive, and the answer to the addition should be b - a. In interpreting this number, we can ignore $2^n$ (the carry into an extra bit beyond the n stored bits), because this carry is not stored in n bits. This leaves the number (b - a) - 1, which is off by 1. Again, adding 1 to the result of the computation yields the correct answer.
Altogether, the representation of -a by $2^n - 1 - a$ sometimes works fine, but sometimes the result is off by 1 and we need an extra term $2^n$ (not recorded) for the result to work right.
Twos-complement Notation
As with the other n-bit notations for negative and non-negative integers, the left-most bit for non-negative integers is 0, the integers utilize the remaining n-1 bits, and non-negative integers must be in the range 0 through $2^{n-1}-1$.
Also, given a non-negative number a, the twos-complement notation for -a is obtained in two basic steps:
- Write -a in ones-complement notation.
- Add 1
Several examples follow:
decimal | ones-complement | twos-complement |
---|---|---|
27 | 0000000000011011 | 0000000000011011 |
-27 | 1111111111100100 | 1111111111100101 |
87 | 0000000001010111 | 0000000001010111 |
-87 | 1111111110101000 | 1111111110101001 |
60 | 0000000000111100 | 0000000000111100 |
-60 | 1111111111000011 | 1111111111000100 |
114 | 0000000001110010 | 0000000001110010 |
-114 | 1111111110001101 | 1111111110001110 |
Optional: For the Mathematically Inclined and/or Curious
In the discussion of ones-complement notation, we noted that non-negative numbers a are represented by their unsigned binary representation, but negative numbers -a are represented by $2^n - 1 - a$.
Since the twos-complement representation for -a starts with ones-complement notation and adds one, the twos-complement representation for -a is $2^n - a$. With this notation, the issues for addition for ones-complement numbers (involving sometimes being off by one) are completely resolved.
Concluding Observations
In practice, twos-complement notation has two advantages over other approaches for representing negative integers.
- In examining twos-complement notation, note that adding 1 to -0 in ones-complement notation yields 00000...000, the same value as +0. Thus, twos-complement notation has only one bit pattern for the number zero.
- The notes For the Mathematically Inclined or Curious explain why twos-complement addition resolves difficulties encountered with sign-magnitude notation (complexity) and with ones-complement notation (extra carry and sometimes off by one).
Altogether, although the 2011 C Standard allows the use of sign-magnitude notation, ones-complement notation, and twos-complement notation, [almost] all modern computers utilize twos-complement notation due to its simple representation of zero and its efficiency with addition of integers.