An Introduction to Program Organization
At this point in the course, laboratory exercises have involved writing short programs to solve some simple problems. As you would expect, future tasks will involve larger and more complex problems. In anticipation of this later work, we need to develop strategies and practices that can help us to be effective and efficient in writing correct solutions to these problems.
Throughout the course, we will methodically expand our problem-solving insights and techniques. In this reading, we begin with three principles:
-
Principle: Write code once, not multiple times — using mechanisms called functions and procedures.
-
Principle: Comments can clarify thinking, reduce time for code development, and minimize the need for correcting errors after initial code is written.
-
Principle: Write code incrementally, not all at once — a process called incremental program development.
This reading introduces each of these principles.
Some Definitions
When a solution to a problem might include the same code in several places, a mechanism is needed to collect the details in one common place and then reference them as needed. In C, the mechanism to collect this common material is called a function or procedure.
When details of a solution are collected in one place (e.g., a function or procedure) and used in several places, details are separated from their use. This is procedural abstraction; one part of the code serves as a high-level outline of the steps of an algorithm using function/procedure references, while the low-level details of those steps are given elsewhere and do not mask the overall flow of the solution.
When code is first written, it must be tested. When errors are identified (i.e., when the answers produced do not correctly solve the problem), they must be resolved. The process of finding and correcting errors is called debugging.
Write code once, not multiple times
Although the principle, "write code once, not multiple times", may seem straightforward and obvious, applying this principle may require some planning and insight. To illustrate this point, consider the myroc-espeak-1.c program from the reading on Using the Scribbler 2.
You may recall that the program header identified three main parts of the program:
/* Program combining text-to-speech capabilities
 * with Scribbler 2 motion and sound
 *
 * Part 1: robot moves forward and right three times
 *         while playing an ascending musical chord
 * Part 2: robot moves forward and left three times
 *         while playing a descending musical chord
 * Part 3: repeats Part 1
 *
 * Author: Henry M. Walker
 * Date created: 12 May 2016
 *
 */
Although the code worked without difficulty, at least four criticisms are possible.
-
Since the application commands the robot to beep, an early section of the program identifies a sequence of pitch names and frequencies. Such a listing of notes has two weaknesses:
-
A long listing of notes is error prone. For example, it would be easy to mistype
const int pitchG6 = 1568;
with a frequency of 1586 (switching the last two digits) or 1567 (mistyping the last digit).
-
This type of listing will likely be needed in many programs involving robot beeping, and retyping the notes in many programs would be tedious and time consuming.
-
The names used for the pitches in myroc-espeak-1.c come from common labeling of keys on a piano.
- Following Western musical tradition, notes on a scale are grouped into octaves. In particular, a piano has 88 keys, arranged in 12-note groupings. The lowest grouping is called octave 0 (with just 3 notes), the next octave 1 (with 12 notes), the next octave 2 (with 12 notes), and so forth through octave 7 (with 12 notes) and octave 8 (with just 1 note).
- Within an octave, notes are labeled C, D, E, F, G, A, B. In addition, "half tones" are identified between some notes. For example, notes B flat and A sharp represent frequencies between A and B. On a piano, B flat and A sharp are the same (the scale is called well-tempered). On other instruments (e.g., a violin), distinctions are made between such half tones — for this course, we leave such matters to musicians and other courses.
- In the myroc-espeak-1.c program, variables follow the naming convention: pitch, note label, and octave. Thus, pitchC6 is the frequency for the note C in the 6th octave. For notes involving sharps or flats, such a convention might use pitchAs0 and pitchBf0 for the notes A sharp and B flat in octave 0.
- In Western music, frequencies are given as real numbers (with decimal points). Thus, pitchA0 is often specified as 27.5000 Hz and pitchC6 as 1046.50. For the Scribbler 2 robots, however, pitches must be integers, so these pitches must be rounded to either 27 or 28 for pitchA0 and 1046 or 1047 for pitchC6. Other pitches are rounded to the nearest integer without decision making.
-
The code for Part 3 duplicates the code for Part 1: 13 lines of commands (plus 3 blank lines) are repeated. This repetition also has two weaknesses:
-
If we made an error in Part 1 (e.g., if rTurnRight (0.7, 0.5) should have been rTurnRight (0.5, 0.7)), we would need to remember to make the correction in both Parts 1 and 3. A similar issue would arise if, after initial development, we wanted to change Part 1 (perhaps using different pitches).
-
In testing the program, just because we know Part 1 is correct, we cannot assume Part 3 is correct. Since the code is typed separately, a typographical error (or other difficulty) in one part might or might not be present in the other part.
-
The main flow of the program is obscured. Again, two weaknesses can be identified.
-
More than a dozen lines of details are combined with the high-level outline of the program, so it may be difficult for the author or another reader to follow both the high-level logic and the low-level details.
-
Adjusting one part of the program might undermine the logic in another part of the program.
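The rounding of pitch frequencies to integers, described earlier for the Scribbler 2, can be sketched as a short function. This is only an illustration: the name nearest_pitch is hypothetical, and the choice to round a value exactly halfway between two integers upward (so 27.5 becomes 28) is an assumption, since the reading leaves that decision open.

```c
/* Returns the integer nearest to a positive frequency given in Hz.
   A value exactly halfway between two integers (such as 27.5) is
   rounded up here; that tie-breaking choice is an assumption.
   The name nearest_pitch is hypothetical. */
int nearest_pitch (double frequency)
{
   /* adding 0.5 and then dropping the fraction rounds a positive value */
   return (int) (frequency + 0.5);
}
```

With this sketch, nearest_pitch (1567.98) yields 1568, which agrees with the value of pitchG6 quoted above, while nearest_pitch (27.5) yields 28.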
Header Files
With these weaknesses identified for myroc-espeak-1.c, we consider how to construct a better version, myroc-espeak-2.c.
To begin, we create a new file, scale-notes.h, that contains C code to define the 88 notes on a piano. Rather than focus on just the four notes needed for this specific program, scale-notes.h defines the full range of pitches on the Western well-tempered scale. In any particular program, we may not need all of these pitches defined. However, by having all pitches available, we can choose what we need; if we decide to expand the program later, the full range of pitches will already be available.
Once defined, we should place the file scale-notes.h in the directory with our other programs. Then we can reference it with an include statement, in much the same way we identify standard libraries.
When using an include statement
-
references to C libraries are specified with angle brackets:
#include <stdio.h>
-
references to our own files are specified in double quotes:
#include "scale-notes.h"
Notes for include
Operationally, when a C compiler encounters an include statement, the compiler inserts the text of the named file into the current program. Thus, if a compiler were working with myroc-espeak-2.c and encountered #include "scale-notes.h", the compiler would locate this file and read it line by line into myroc-espeak-2.c. One might consider include as a directive to copy and paste additional text into a file.
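To make this concrete, here is a minimal sketch of what a short excerpt of a header such as scale-notes.h might look like. It is not the course file itself: only four notes are shown, the guard name SCALE_NOTES_H is an assumption, and the values are standard piano frequencies rounded to integers (the actual file may round differently, for example using 1046 rather than 1047 for pitchC6).

```c
/* Sketch of an excerpt from a header like scale-notes.h.
   The guard name and the choice of four notes are assumptions;
   the real file defines all 88 piano notes. */
#ifndef SCALE_NOTES_H
#define SCALE_NOTES_H

/* frequencies in Hz, rounded to integers for the Scribbler 2;
   const int matches the declaration style quoted from myroc-espeak-1.c */
const int pitchC6 = 1047;   /* 1046.50 Hz, rounded up in this sketch */
const int pitchE6 = 1319;   /* 1318.51 Hz */
const int pitchG6 = 1568;   /* 1567.98 Hz */
const int pitchC7 = 2093;   /* 2093.00 Hz */

#endif
```

A program stored in the same directory would then begin with #include "scale-notes.h" and could use a name such as pitchC6 directly, as in rBeep (0.5, pitchC6). The #ifndef / #define / #endif lines simply keep the definitions from being pasted in twice if the file happens to be included more than once.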
Functions and procedures
To address the challenge of duplicate code, we proceed in 3 steps:
- We give the block of code a name, following a specified header format.
- We place the code within braces { } following the header.
- Within the main program itself, we use the name to reference the code block.
As an example, consider myroc-espeak-1.c as the base for the revised program myroc-espeak-2.c.
We will use the name forward_right to describe the common block of code for Parts 1 and 3. Within C the beginning of this block of code follows a specific format that also describes what the code will do:
/* Procedure to move the robot forward and right three times
   while playing an ascending musical chord
 */
void forward_right ()
Following this header, the desired block of code is placed in braces. The full procedure follows:
/* Procedure to move the robot forward and right three times
   while playing an ascending musical chord
 */
void forward_right ()
{
   eSpeakTalk ("move forward and turn right");
   rForward (1.0, 1.0);
   rBeep (0.5, pitchC6);
   rTurnRight (0.7, 0.75);

   eSpeakTalk ("move forward and turn right again");
   rForward (1.0, 1.0);
   rBeep (0.5, pitchE6);
   rTurnRight (0.7, 0.5);

   eSpeakTalk ("move and turn a third time");
   rForward (1.0, 1.0);
   rBeep (0.5, pitchG6);
   rTurnRight (0.7, 0.25);

   rBeep (0.5, pitchC7);
}
Notes on procedure forward_right
-
As with any variable, the name of a function or procedure can be any combination of letters, digits, and underscore characters, but it cannot start with a digit.
-
In specifying a procedure or function header, void indicates the code will not return a value to the main program. This code sends commands to the robot and sends text to the speech synthesizer. When done, the main program does not expect a value in return.
- In programming, when a block of code does not return a value, it often is called a procedure. Thus, many programmers would designate forward_right as a procedure.
-
When a block of code returns a value, it is often called a function. As an example, C's mathematics library includes a square root function: given a number, it returns the square root of that number. A possible line in a program might be:
double rt = sqrt (9.0);
Here, sqrt takes 9.0 as its input and returns 3.0 as the square root. Hence 3.0 is assigned to the variable rt.
-
In this context, the parentheses ( ) after forward_right indicate what follows is a procedure. Later in the course, we will consider options for what might be inserted within these parentheses.
-
To use a procedure (e.g., in main), we write its name with the parentheses:
forward_right ();
- Within C, one should declare a procedure before main, so the compiler will know what the procedure's label means while it is reading main. (Later we shall encounter variations in how procedures can be declared, but it is always safe to declare a procedure before it is first used.)
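The notes above can be pulled together in a small sketch that needs no robot. The names print_banner, chord_note_count, and show_calls are hypothetical and chosen only for this illustration; printf stands in for robot and speech commands.

```c
#include <stdio.h>

/* Procedure: does some work but returns no value (hence void) */
void print_banner ()
{
   printf ("starting Part 1\n");
}

/* Function: returns a value, here the number of notes
   beeped by forward_right (C6, E6, G6, and C7) */
int chord_note_count ()
{
   return 4;
}

/* Both blocks above are declared before this code uses them,
   so the compiler already knows what each name means. */
void show_calls ()
{
   print_banner ();                    /* procedure call: nothing comes back */
   int count = chord_note_count ();    /* function call: the value is stored */
   printf ("the chord has %d notes\n", count);
}
```

In a complete program, main would appear after these definitions and could call show_calls (); in the same way that main in the reading calls forward_right ();.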
Similarly for myroc-espeak-2.c, we could define a procedure forward_left for Part 2. Although this code is only used once, definition of a procedure would highlight the separate nature of those details.
With forward_right and forward_left defined, the main procedure highlights the main flow of the overall program:
int main ()
{
   // connect for both MyroC and eSpeak
   rConnect ("/dev/rfcomm0");
   eSpeakConnect ();

   /* Part 1: robot moves forward and right three times
      while playing an ascending musical chord */
   printf ("starting Part 1\n");
   forward_right ();

   /* Part 2: robot moves forward and left three times
      while playing a descending musical chord */
   eSpeakSetGender ("female");  // specify voice characteristics
   printf ("starting Part 2\n");
   forward_left ();

   /* Part 3: robot moves forward and right three times
      while playing an ascending musical chord */
   eSpeakSetGender ("male");    // specify voice characteristics
   printf ("starting Part 3\n");
   eSpeakTalk ("Part 3 repeats Part 1");
   forward_right ();

   // finish with no errors
   eSpeakTalk ("Enough of this; I am going to stop now");
   rDisconnect ();
   eSpeakDisconnect ();

   return 0;
}
The complete, revised program, with needed include statements, is available as myroc-espeak-2.c.
-
Much code in main is descriptive (e.g., comments, printing for the user, and eSpeak commentary).
-
Details are placed in procedures, so low-level details do not interfere with the high-level logic of the main program.
-
The procedures are declared in the file before main.
Comments
The very first reading on C programming emphasized that "Since a program articulates a proposed algorithm to accomplish a desired task, the program should be considered as a formal mechanism for precise communication. As a communication vehicle, a program has at least three audiences:" the author, other people, and computers. C's specific syntax and semantics allow computers to compile and run programs, which largely handles communication to computers.
Comments and formatting can make a substantial difference in communication to the programmer and to others.
Although comments can help others understand a program after it is written, comments can be particularly helpful to authors as they are writing the code.
-
Writing comments first (before code) can help an author understand the problem at hand.
- If the author truly understands the problem, writing comments is fast and easy — often requiring just a few minutes.
- The act of writing comments can uncover fuzziness in the specification of a problem or ambiguity in what must be done.
-
Once the problem is understood, comments can help clarify an author's thinking.
- Writing comments requires full understanding of the algorithm involved.
- Writing comments helps clarify the logical flow of a solution.
Altogether, programmers are urged to write comments early!
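As one possible illustration of writing comments first, the sketch below shows a small function whose header comment and step comments were drafted before any code; the statements were then filled in beneath each comment. The function name announce_parts and its task are hypothetical, and printf stands in for robot and speech commands.

```c
#include <stdio.h>

/* Function to announce the three parts of the robot program
   Returns: the number of parts announced
   (These comments were written first; the code came afterward.) */
int announce_parts ()
{
   int parts = 0;

   /* Step 1: announce Part 1 (forward and right, ascending chord) */
   printf ("Part 1: forward and right, ascending chord\n");
   parts = parts + 1;

   /* Step 2: announce Part 2 (forward and left, descending chord) */
   printf ("Part 2: forward and left, descending chord\n");
   parts = parts + 1;

   /* Step 3: announce Part 3 (a repeat of Part 1) */
   printf ("Part 3: repeats Part 1\n");
   parts = parts + 1;

   /* Step 4: report how many parts were announced */
   return parts;
}
```

Because the step comments were settled first, each group of statements has a stated purpose, and a missing or misplaced step is easier to spot by reading the comments alone.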
Words of common wisdom
-
A few minutes spent clarifying thinking at an early stage can help identify issues and potential errors later on.
-
"Minutes spent in writing comments can save hours in debugging!"
Write code incrementally, not all at once
Once an overall program is organized into pieces (e.g., with procedures), an author often can write many elements of main. In getting started,
-
comments identify the main steps of a solution,
-
program setup can be accomplished (e.g., with include statements),
-
each main step can be identified as a procedure, and
-
a stub can be created for each procedure, giving a header, braces { }, and a largely empty body.
With this arrangement, the program represents the proper overall structure of a solution, but few details are completed.
Stubs
A stub is a small block of code that identifies a logical step to be performed, but which has limited initial functionality.
For example, when starting to write the myroc-espeak-2.c program, the forward_right and forward_left procedures might be simple stubs:
void forward_right ()
{
   printf ("procedure forward_right not yet implemented\n");
}

void forward_left ()
{
   printf ("procedure forward_left not yet implemented\n");
}
With stubs for procedures, the overall program, as given above in the reading, can run using the proper structure, and the printf statements print text that checks the flow of operations.
The code may not do much when development of a program begins, but the main pieces are in place.
With this structure in place, a programmer can focus on one piece of the overall code at a time. No need to write all details at once and then have to contend with possible issues in dozens or hundreds of lines of code!
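A sketch of what the program might look like after one increment follows: forward_right has been given its details, while forward_left is still a stub. So that the sketch runs without a Scribbler 2, printf stands in for the robot commands. The counter completed_steps_run and the procedure run_parts are hypothetical aids for checking the flow; they are not part of MyroC or of myroc-espeak-2.c.

```c
#include <stdio.h>

/* Counts how many times a completed (non-stub) procedure has run;
   a hypothetical aid for checking progress, not part of MyroC */
int completed_steps_run = 0;

/* Increment 1 is finished: this procedure now has its details.
   printf stands in for the robot commands. */
void forward_right ()
{
   printf ("forward, beep, turn right (three times)\n");
   completed_steps_run = completed_steps_run + 1;
}

/* Still a stub: to be completed in the next increment */
void forward_left ()
{
   printf ("procedure forward_left not yet implemented\n");
}

/* Same overall flow as main in the reading: Parts 1, 2, and 3 */
void run_parts ()
{
   forward_right ();
   forward_left ();
   forward_right ();
}
```

Running run_parts (); at this stage shows the detailed output twice (Parts 1 and 3) with the stub message in between, so any new error is most likely in the few lines just added to forward_right.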
By writing one piece of code at a time, a programmer can focus on just a few lines while writing, compiling, and running the program.
-
Writing a short, focused piece of code often seems less error-prone than writing a long, complex piece.
-
If something goes wrong, the error is likely to be in the short code segment that was just added. A programmer often can avoid looking at the entire program, if just one small piece has been inserted or changed.
In developing large programs, an important challenge is to manage complexity. If a complex solution can be divided into small tasks, then work can proceed methodically step-by-step, and a programmer does not have to keep many details and logical connections in mind all at once.