Laboratory Exercise on Characters and Strings
Goals
This laboratory exercise examines characters, character functions, details of string storage, and the operations of string library functions within the C programming language.
Work Started in Class
Characters
-
Review the sample program character-example.c described in the Reading on Characters.
-
Be sure you understand each step in processing and why each line of output is printed.
-
Consider the lines
ch5 = ch3 + 1; ch6 = ch5 - 4;
What happens if these lines are changed to
ch5 = ch3 - 5; ch6 = ch5 - 1;
Explain briefly.
-
What happens if the lines are changed to
ch5 = ch3 + 10 ; ch6 = ch5 - 20;
Explain.
-
What happens if the lines are changed to
ch5 = ch3 + 100 ; ch6 = ch5 - 200;
Explain.
-
-
Write a program that reads a character from a terminal and performs the following:
-
prints the character and its integer encoding,
-
prints whether the character is an alphabetic character,
-
prints whether the character is either a letter or a numeric character,
-
capitalizes the letter, if it was a lowercase letter.
-
scanf can be used to read a character with the code:
char ch; scanf ("%c", &ch);
Later sessions in this module will expand on this approach for user input and also will discuss additional mechanisms for reading characters and strings.
Character Arrays, Strings, char * and Storage
Program string-intro.c shows several variations related to the declaration of character arrays, strings, and char * variables.
One run of this program produced the following output:
first 3 characters in each array first: Col second: Wor third: Com fourth: Wor fifth: Hel Variable addresses and array base addresses first address: 359157264, array base address: 359157264 second address: 359157248, array base address: 359157248 third address: 359157232, array base address: 359157232 fourth address: 359157224, array base address: 359157248 fifth address: 359157216, array base address: 4196464 variables printed as strings first: Cold\ufffd second: World third: Computer ScienceWorld fourth: World fifth: Hello
Understanding this program and output can provide substantial insights to how C works with arrays, characters, strings, and pointers.
Storage
The memory schematic below shows (in extreme detail) the allocation of memory for program string-intro.c, based upon the above run. Starting at the top of the program:
- first is allocated space for four characters, beginning in storage location 359157264 (see bottom part of the table). Following the normal approach of initializing arrays, the letters, C, o, l, and d are stored in these locations. The program does not specify what data might be located after this part of memory.
- second is allocated space for six characters, beginning in storage location 359157248. In C, a string contains a sequence of characters, followed by a null character (code zero). Since World contains five characters, the string requires six characters to include the code 0 at the end.
- In organizing memory, the gcc compiler decided not to use the space between second and first for data storage. Although these memory locations are present, the data in those unallocated memory addresses may be left over from the work of previous programs.
- third is allocated space for sixteen characters, beginning in storage location 359157232. As with first, this space is initialized with specified characters. As an array of characters (not a string), no code zero is placed in memory at the end of this array.
- fourth specifies the address of a character (i.e., a pointer to a character). In this case, fourth is given the address that begins the string second defined earlier. Note that the variable fourth itself occupies a location in memory (359157224), and the address of second (359157248) is the value stored in fourth.
- fifth specifies the address of a character. The address of a character can be the base address of a character array, so a char * may be considered either the location of a single character or the starting point for a string. In this case, the variable fifth is located at 359157216, and that location contains 4196464, the starting address of the literal string "Hello". Compilers often reserve a separate part of main memory for literal data, such as literal strings.
Output
The first set of printf statements accesses the first three characters in each character array. Within a printf statement, the %c format prints exactly one data element as a character, so that three characters are printed for each printf statement here. Note that arrays and subscripts work the same whether the variable is declared as an array or as the base address of an array found elsewhere.
The second set of printf statements displays where each variable is mapped in main memory. The output shown above maps to the memory schematic below.
The third set of printf statements prints data as C strings. In C, a string variable identifies a starting or base address, and the string is considered to continue until a code 0 or null character is encountered.
- For the variables second, fourth, and fifth, the character data were stored with a null character at the end, and these character strings are printed without difficulty.
- For the variable third, the initialization placed characters in the array, but no null character was at the end. Rather, from the mapping of memory identified in the table, the string "World" was located immediately after the characters in the third array. When printing third, the printf started with the first character of third (i.e., the C character) and continued character by character until reaching a null. Since no null character was encountered in the processing of the third array, printing continued with the data from the second array.
- For the variable first, the array declaration specified four characters, without a null character at the end. Although this works fine for arrays, work with strings requires processing to continue until a null is found. In this case, first is stored in memory at the end of the program area, and we have no idea what might follow. Thus, processing proceeds with the printing of random material until a null is found.
Schematic Memory Diagram
| variable | value stored | memory address |
|---|---|---|
| section of memory for literal strings | H | 4196464 |
| | e | 4196465 |
| | l | 4196466 |
| | l | 4196467 |
| | o | 4196468 |
| | \0 (number) | 4196469 |
| … | | |
| fifth | integer value 4196464 | 359157216 |
| | | 359157217 |
| | | 359157218 |
| | | 359157219 |
| | | 359157220 |
| | | 359157221 |
| | | 359157222 |
| | | 359157223 |
| fourth | integer value 359157248 | 359157224 |
| | | 359157225 |
| | | 359157226 |
| | | 359157227 |
| | | 359157228 |
| | | 359157229 |
| | | 359157230 |
| | | 359157231 |
| third | C | 359157232 |
| | o | 359157233 |
| | m | 359157234 |
| | p | 359157235 |
| | u | 359157236 |
| | t | 359157237 |
| | e | 359157238 |
| | r | 359157239 |
| | <space> | 359157240 |
| | S | 359157241 |
| | c | 359157242 |
| | i | 359157243 |
| | e | 359157244 |
| | n | 359157245 |
| | c | 359157246 |
| | e | 359157247 |
| second | W | 359157248 |
| | o | 359157249 |
| | r | 359157250 |
| | l | 359157251 |
| | d | 359157252 |
| | \0 (number) | 359157253 |
| | not specified | 359157254 |
| | not specified | 359157255 |
| | not specified | 359157256 |
| | not specified | 359157257 |
| | not specified | 359157258 |
| | not specified | 359157259 |
| | not specified | 359157260 |
| | not specified | 359157261 |
| | not specified | 359157262 |
| | not specified | 359157263 |
| first | C | 359157264 |
| | o | 359157265 |
| | l | 359157266 |
| | d | 359157267 |
| | not specified | 359157268 |
| | not specified | 359157269 |
| … | | |
-
Copy string-intro.c to your account, compile and run it, and examine the output.
- Each run of this program likely places variables in different memory locations. Absolute addresses may change, but do the relative addresses change? That is, to what extent does the memory schematic in the above table need to change for different runs of the program?
-
Immediately after the declaration of all arrays, but before any printing, insert the line:
first[3] = second[3] = third[3] = 0;
This line inserts a null character at index 3 for each of the three strings.
Recompile and rerun the program, describe what (if any) differences result in the output printed, and explain why this output is obtained.
-
Immediately after the declaration of all arrays, but before any printing, insert the line:
fifth[3] = 0;
What happens when you try to compile and run this program? Why do you think this result occurs?
Declaring Strings
- Here are a number of different string declarations.
char *baboon;
char *chimpanzee = "animal";
char dolphin[];
char emu[] = "animal";
char fox[4] = "animal";
char giraffe[8] = "animal";
char elephant[10];
elephant = "animal";
- Which are valid and which are invalid?
- How do the valid declarations differ?
- What happens if you switch fox and giraffe? How do you think this can be explained? Think about the bounds of arrays, and the layout of characters in main memory.
Initialized and Uninitialized Strings
- The sizeof operator tells you how much memory (in bytes) has been allocated to a variable or type.
- The strlen function tells you the length of a string, which is the number of characters stored before the null character ('\0').
-
Start a new program for several experiments with strings.
-
Copy the following declaration and code into a main procedure, making sure that you include the library string.h:
char computerscience[16] = "isawesome";
char isawesome[16] = "computerscience";
printf ("strlen (computerscience): %zu\n", strlen (computerscience) );
printf ("strlen (isawesome): %zu\n", strlen (isawesome) );
printf ("computerscience: %s\n", computerscience );
printf ("isawesome: %s\n", isawesome );
What output is obtained? Briefly explain why these results are printed.
-
What would you expect to get if you had written:
char computerscience[16];
instead of:
char computerscience[16] = "isawesome";
Run the program to check whether your expectations match what is printed.
-
Restore the initialization in Step 3a and then add this line of code:
printf ("Concatenate the strings: %s", strcat (isawesome, computerscience));
- What is the result? Is this what you expected? Change the bounds of array isawesome to 32 and run the program again. What happens now?
- What did the string function strcat() do? Explain conceptually what happens in the array and where the null character or characters are located.
-
Strings and Robot Names
-
Write a program that beeps once for each uppercase letter in the string and twice for each lowercase letter. If a string contains spaces, punctuation, or digits, those characters should not cause beeps.
For this program, do not use the string function strlen().
Control Characters
Characters can represent actions rather than just printing a symbol. Here is a short list of what can be done with some characters:
- '\t' = tab (horizontal tab)
- '\b' = backspace
- '\v' = vertical tab
- Write a program that:
- prints out a sentence with tabs in between each word
- prints out a sentence with vertical tabs between each word
- illustrates how backspace works (print a word with a few backspaces and see how much of the word you can read)
String Functions
-
Write a function with the following prototype:
void string_reverse (char str[]);
It should reverse the order of the characters in str (except the null character). Note that it will not return a new string, but it will modify the given string.
Homework
Robot Names as Strings
-
Write a program that takes the name of the robot by using rGetName(), converts all the characters of the name to uppercase (use the character function toupper from ctype.h), and then sets the robot's name to the uppercase version, using rSetName().
Keep in mind that a character value is written between single quotes (e.g., 'a'), and a string is written between double quotes (e.g., "a").
The MyroC header includes the following description:
/**
 * @brief Get the name of the robot
 * @return information about the name of the robot
 * @post the returned name is a newly-allocated 17-byte string
 */
const char * rGetName();
Since the return type of rGetName specifies const, this returned string cannot be changed directly. Rather, this string should be copied to another char array, after which changing the string can work without difficulty.
String Operators
Remember from the reading on characters and strings that:
- strlen(s) = returns the length of s.
- strcmp(s,t) = returns 0 if the two strings are the same, a value less than 0 if s<t, and a value greater than 0 if s>t.
- strcpy(s,t) = copies the string t to s.
-
The terminal command man can provide helpful information about the standard C functions.
-
Go to your terminal and type man strcmp. What are the two different ways that you can compare two strings?
-
In the same manual page, find what the parameters are for strncmp.
Remember that to quit the man pages, you can simply type q.
-
-
Use the man-page capability within a terminal window to obtain information about string functions (e.g., type man string).
-
What does strcat do? Using what you have learned about how strings are stored, and their null characters, explain how strcat works.
-
Copy the program catstr.c. Run the program and observe PART 1 of the output.
-
Follow the directions in the comments and fill in the blanks in the program.
-
What happened in part 2 that caused the output to be what we didn't expect? Hint: Think about how strcat works and the null character.
-
Strings of Music
-
Write a program that makes the robot beep at the frequencies that correspond to the musical letters (A, B, C, D, E, F, G, and H, ignoring sharps and flats) that are given in a string. Make the program case-insensitive. Here is a simplified header file to define pitches and their corresponding letters: pitches.h. Remember that a string is actually an array.
For example, give it the word "BED" and play the frequencies for B, E, and D.
-
Now make this program work for all the letters in the alphabet. Hint: Use mod to wrap the letters back to the musical letters. For example 'H' would wrap to be 'A' ,'I' would wrap to be 'B', 'J' would wrap to be 'C', and so on.
created 26 July 2011 by Dilan Ustek
revised 11 August 2011 by Dilan Ustek: citation: Samuel Rebelsky and Henry Walker
revised 28 October 2011 by Dilan Ustek
revised (correct html and reference) 21 July 2012 by Henry M. Walker
introduction added, moderate editing 1 February 2014 by Henry M. Walker
readings added 19 September 2014 by Henry M. Walker
expanded and reformatted 27 May 2016 by Henry M. Walker
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.