Reading Data with scanf
When a user enters information into a program, the user types a sequence of characters. Sometimes this information is intended to be a string of characters, such as a name or an address. In other applications, a sequence of characters, such as 123.45, should be interpreted as a number.
When characters are considered to be part of larger units, such as numbers, processing can follow either of two basic approaches:
- The program can proceed in two steps:
  - read the information as a sequence of characters
  - convert the character sequence to a number
- The program can rely upon a library function, such as scanf, to perform both steps as one logical operation.
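The two approaches can be sketched side by side. This is only an illustration: the function names two_step_read and one_step_read are hypothetical, the text is assumed to have been read into a string already (for example with fgets), and sscanf is used in place of scanf so that the input is a string rather than the keyboard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Approach 1: the characters are already in text (step 1);
   strtod then converts them to a number (step 2).
   Returns 1 on success and 0 if no number could be converted. */
int two_step_read(const char *text, double *value)
{
    char *end;
    double v = strtod(text, &end);
    if (end == text)        /* no characters were converted */
        return 0;
    *value = v;
    return 1;
}

/* Approach 2: the library scans and converts in one logical operation.
   Returns 1 on success and 0 otherwise. */
int one_step_read(const char *text, double *value)
{
    return sscanf(text, "%lf", value) == 1;
}
```

Both functions accept text such as "123.45" and reject text such as "abc"; the difference lies in whether the program or the library manages the conversion step.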
Previously in this course, we have used scanf to read numbers, combining both reading and conversion steps within the C library function. However, up to this point, examples have been reasonably simple — reading has involved numbers only. This session examines scanf in more detail, allowing input to include a mixture of numbers, characters, and strings in varying formats. The next session will consider reading input character-by-character and then converting parts of the input to numbers or other pieces as a separate step.
Conversion formats
The general scanf function prototype has the form:
int scanf ("format string", addr_arg_1, addr_arg_2, ...)
In reading input, the computer scans character-by-character. When a conversion specification is encountered, the computer attempts to find an element of the proper type. That element then is converted to the proper form (e.g., to a number), and the computed value is stored in the address given. When several conversion specifications are given, the first computed argument is stored at the addr_arg_1, the second at the addr_arg_2, etc.
Review Example
Consider the code segment
    double a, b;
    int i;
    scanf ("%lf%d%lf", &a, &i, &b);
From work earlier in this course, we know:
- The format string "%lf%d%lf" directs the computer to read three numbers: a double, an int, and another double.
- In each case, the computer skips initial whitespace (spaces, tabs, newline characters) and then expects to find a number of the specified type.
- The first number will be stored at the address of a.
- The second number will be stored at the address of i.
- The third number will be stored at the address of b.
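As a small sketch of the review example, the function below applies the same format string to a string of input and reports how many values were stored. The name read_three is hypothetical, and sscanf stands in for scanf so that the input can be supplied as a string; the conversion rules are the same.

```c
#include <stdio.h>

/* Reads a double, an int, and a double from text, in that order.
   Returns the number of values stored (3 when every conversion succeeds). */
int read_three(const char *text, double *a, int *i, double *b)
{
    return sscanf(text, "%lf%d%lf", a, i, b);
}
```

Because each conversion skips initial whitespace, the inputs "1.5 7 2.25" and "  1.5\n7\t2.25" should produce the same three values.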
In addition to %d, %f and %lf, C allows numerous other conversions. A list of common conversion formats follows:
Conversion symbol | Expected conversion | Address type |
---|---|---|
d | converts to a [possibly signed] decimal integer | pointer to an int |
u | converts to an unsigned decimal integer | pointer to an unsigned int |
o | converts to an unsigned octal (base 8) integer | pointer to an unsigned int |
x, X | converts to an unsigned hexadecimal (base 16) integer | pointer to an unsigned int |
i | converts to a [possibly signed] integer; if input starts with 0, the input is treated as an octal integer; if input starts with 0x or 0X, the input is treated as a hexadecimal integer; otherwise, the input is treated as a decimal integer | pointer to an int |
f, e, g, E, a | converts to a [possibly signed] float | pointer to a float |
lf, le, lg, lE, la | converts to a [possibly signed] double | pointer to a double |
c | reads the next character (without skipping whitespace) | pointer to a char |
s | after skipping initial whitespace, reads the next characters until either whitespace is encountered or the maximum field width is read; a null character is added at the end of the read characters | base address of a char array |
n | stores the number of characters read so far; no input is consumed | pointer to an int |
* (with a conversion) | reads input for the conversion but discards it; "%*10c" ignores the next 10 characters | none |
In addition to these commonly-used conversions, scanf provides some more complicated options as well. Consult the online document for complete details.
man scanf
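A short sketch may help show how several of these conversions differ. The names parse_bases and read_word are hypothetical, and sscanf is used instead of scanf so that the input is a string; the field width 7 in "%7s" is chosen to leave room for the null character in an 8-element array.

```c
#include <stdio.h>

/* Reads four integers written in different bases:
   %d decimal, %o octal, %x hexadecimal, and %i with the base
   taken from the prefix (0 for octal, 0x for hexadecimal).
   Returns the number of values stored. */
int parse_bases(const char *text, int *dec_value, unsigned *oct_value,
                unsigned *hex_value, int *any_value)
{
    return sscanf(text, "%d %o %x %i",
                  dec_value, oct_value, hex_value, any_value);
}

/* Reads one whitespace-delimited word of at most 7 characters.
   word must have room for 8 chars (7 characters plus the null). */
int read_word(const char *text, char word[8])
{
    return sscanf(text, "%7s", word);
}
```

For the input "42 17 ff 0x1A", the four values are 42, 15 (octal 17), 255 (hexadecimal ff), and 26 (hexadecimal 1A). For the input "  abcdefghijkl", read_word stores only "abcdefg".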
Notes:
- All conversion formats except %c (and the less common %[ and %n) skip initial whitespace.
- scanf returns the number of conversions correctly matched, i.e., the number of values actually assigned to variables.
- If input encountered does not conform to the conversion format (e.g., if letters or punctuation are found when the start of a number is expected), then reading fails and scanf does not attempt to read further variables.
- If scanf reaches the end of the input before any conversion succeeds, then scanf returns the special value EOF. EOF is a negative integer constant defined in stdio.h, not a character.
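The notes above suggest that a program should check the value returned by scanf before using the variables. The following is a minimal sketch under the assumption that two integers are expected; the names count_ints and describe are hypothetical, and sscanf replaces scanf so the input can be given as a string.

```c
#include <stdio.h>

/* Attempts to read two integers from text.
   Returns the number of values stored, or EOF if the input
   ends before any conversion succeeds. */
int count_ints(const char *text, int *first, int *second)
{
    return sscanf(text, "%d%d", first, second);
}

/* Translates the return value into a message for the user. */
const char *describe(int count)
{
    if (count == EOF)
        return "end of input before any conversion";
    if (count == 2)
        return "both integers read";
    if (count == 1)
        return "only the first integer read";
    return "no integer read";
}
```

With this sketch, "12 34" yields 2, "12 x" yields 1, "x" yields 0, and an empty input yields EOF. Only the variables counted in the return value hold newly read values.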
Input specifications and whitespace within a format string
In addition to conversion formats, the format string within a scanf statement can contain other characters. Overall, scanf processes input using the format string as a guide.
- When a format string contains a conversion specification (e.g., %d or %lf), scanf expects the proper type of input and stores the computed value in the address given by the next parameter.
- When a format string contains whitespace, scanf skips over any amount of whitespace in the input (e.g., spaces, tabs, vertical tabs, newlines).
Example 1
Consider the input
K2.71828 H 3.141592
With the code segment
    double a, b;
    int i, result;
    char ch1, ch2;
    result = scanf("%c %lf %c %d %lf", &ch1, &a, &ch2, &i, &b);
After this code is executed, variables have these values:
variable | value |
---|---|
ch1 | K |
a | 2.71828 |
ch2 | H |
i | 3 |
b | .141592 |
result | 5 |
Example 1 Commentary
- The %c specification expects a character, so K is read and stored for ch1.
- The space before %lf allows scanf to read any whitespace. Here, no whitespace is encountered, so this space in the format string is ignored.
- The %lf specification skips any whitespace (such space already has been discarded), and the real number 2.71828 is stored for a.
- The space before %c allows scanf to read any whitespace. Here the spaces between 2.71828 and H are read and discarded.
- The %c specification expects a character, so H is read and stored for ch2.
- The space before %d allows scanf to read any whitespace. Here the spaces between H and 3 are read and discarded.
- %d skips any whitespace (such space already has been discarded) and reads digits until it reaches the decimal point, which cannot be part of an integer. Thus, the integer part 3 of the next number is stored for i.
- Reading continues at the decimal point. No spaces come next in the input, so the remaining number .141592 is stored in b.
- Altogether, scanf has read and converted 5 items (2 characters and 3 numbers), so scanf returns the integer 5.
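Example 1 can be tried on a string with sscanf, which follows the same format rules as scanf. This is only a sketch: the struct fields and the function read_with_spaces are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Holds the five values read in Example 1. */
struct fields {
    char ch1, ch2;
    int i;
    double a, b;
};

/* Applies the Example 1 format string (with spaces) to text.
   Returns the number of values stored. */
int read_with_spaces(const char *text, struct fields *f)
{
    return sscanf(text, "%c %lf %c %d %lf",
                  &f->ch1, &f->a, &f->ch2, &f->i, &f->b);
}
```

Calling read_with_spaces on "K2.71828   H   3.141592" should give the values listed in the table above, with a return value of 5.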
Example 2
The input for this example is identical with Example 1.
K2.71828 H 3.141592
The code segment for this example is the same as Example 1, except that spaces are removed from the scanf format string.
    double a, b;
    int i, result;
    char ch1, ch2;
    result = scanf("%c%lf%c%d%lf", &ch1, &a, &ch2, &i, &b);
After this code is executed, variables have these values:
variable | value | comment |
---|---|---|
ch1 | K | same as example 1 |
a | 2.71828 | same as example 1 |
ch2 | space | no whitespace skipped |
i | no value stored | H cannot be part of an integer |
b | no value stored | reading stops after the failed %d conversion |
result | 3 | three values were stored |
Program scanf-example-1-2.c provides code for both Examples 1 and 2.
Example 2 Commentary
- The %c specification expects a character, so K is read and stored for ch1.
- The %lf specification skips any whitespace (none is present here), and the real number 2.71828 is stored for a.
- The %c specification expects a character, so a space is read and stored for ch2.
- The %d specification skips any initial whitespace, so any remaining spaces before H are read and discarded. Then H is encountered. Since H cannot be part of an integer, scanf fails, H remains unread in the input, and no further input processing is attempted.
- Altogether, scanf has successfully read and converted 3 numbers and characters, so scanf returns the integer 3.
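The behavior in Example 2 can be observed by giving i and b recognizable starting values and checking that they are unchanged afterward. This is a sketch only; the function name read_without_spaces is hypothetical, and sscanf is used so that the input is a string.

```c
#include <stdio.h>

/* Applies the Example 2 format string (no spaces) to text.
   Returns the number of values stored; variables for failed or
   skipped conversions keep whatever values they had before the call. */
int read_without_spaces(const char *text, char *ch1, double *a,
                        char *ch2, int *i, double *b)
{
    return sscanf(text, "%c%lf%c%d%lf", ch1, a, ch2, i, b);
}
```

If i is set to -1 and b to -1.0 before the call, then reading "K2.71828   H   3.141592" should return 3, store a space in ch2, and leave i and b at -1 and -1.0.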
Additional characters within a format string
Additional characters within a format string identify specific input that a user must type.
- When a format string contains another (non-whitespace) character, scanf expects exactly that character to be next in the input. If a different character is found, scanf stops and does not attempt further conversions.
- Since the percent character % has a special meaning (to identify a conversion type, such as %d or %lf), a format string should contain %% if the user is to type the percent character as input.
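As a brief sketch of literal matching, the function below expects an integer followed by a percent sign and then a second integer. The name read_percent_pair and the input layout are assumptions made for illustration, and sscanf is used so that the input is a string.

```c
#include <stdio.h>

/* Expects input such as "45% 10": an integer, a literal percent
   sign (written %% in the format string), and a second integer.
   Returns the number of values stored. */
int read_percent_pair(const char *text, int *percent, int *count)
{
    return sscanf(text, "%d%% %d", percent, count);
}
```

For "45% 10" the function returns 2. For "45 10" the percent sign is missing, so matching stops after the first integer and the function returns 1. A literal character placed after the last conversion cannot be checked through the return value alone, since it does not change the count of stored values.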
Example 3
Suppose a program is to read a time in the format hour:minutes:seconds, such as 12:34:56 or 5:8:27. In this setting, the user must enter the colon character between the integers. The following code segment would perform such a read operation:
    int hr, min, sec;
    scanf ("%d:%d:%d", &hr, &min, &sec);
Program scanf-example-3.c provides code for Example 3.
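To see how this format treats spaces around the colons, the following sketch compares the format above with a variant that places a space before each colon in the format string. The function names read_time_strict and read_time_relaxed are hypothetical, and sscanf is used so that the input is a string.

```c
#include <stdio.h>

/* Format from Example 3: each colon must immediately follow a number. */
int read_time_strict(const char *text, int *hr, int *min, int *sec)
{
    return sscanf(text, "%d:%d:%d", hr, min, sec);
}

/* Variant: the space before each colon in the format string lets
   scanf skip any whitespace that precedes the colon in the input. */
int read_time_relaxed(const char *text, int *hr, int *min, int *sec)
{
    return sscanf(text, "%d :%d :%d", hr, min, sec);
}
```

Both functions return 3 for "12:34:56" and for "5: 8: 27", since %d skips whitespace before each integer. For "12 : 34 : 56", the strict version returns 1 because a space appears where the first colon is expected, while the relaxed version returns 3.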
Example 3 Commentary
- As written, spaces may appear in the input before each integer, but not before each colon.
- If spaces are to be allowed both before and after each colon, the format string might read: "%d :%d :%d"
created 13 May 2008 by Henry M. Walker
revised 9 January 2010 by Henry M. Walker
rewritten, expanded, and reformatted 30 May 2016 by Henry M. Walker
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.