Sonoma State University
 
Algorithm Analysis
Instructor: Henry M. Walker

Lecturer, Sonoma State University
Professor Emeritus of Computer Science and Mathematics, Grinnell College

Although CS 415 has been well developed for several years, last year the CS faculty made a significant, long-term curricular change regarding SSU's Upper Division GE Area B Requirement.

Assignment on Brute Force Algorithms and Merge Sort

This assignment has two parts: Brute Force Algorithms and Merge Sort.

Brute Force Algorithm

This section contains two exercises related to brute force algorithms.

  1. Consider the Brute-Force String Matching Problem, in which a string or text T[0..n-1] is searched for the occurrence of a substring or pattern P[0..m-1]. For details, see the discussion in Section 3.2 (pp. 105-106) of Levitin's textbook or in numerous Web sources.

    For this problem, we consider the following code, slightly edited from the textbook and numerous Web sites:

            for (int i = 0; i <= n-m; i++) {
                int j;  /* declared before the inner loop so it is still in scope for the test below */
                for (j = 0; (j < m) && (P[j] == T[i+j]); j++) {
                }
                if (j == m) {
                   return i;
                }
            }
            return -1;
          
    1. Show an example of how this algorithm proceeds, using alphabetic strings other than "NOBODY_NOTICED_HIM" and "NOT" (the strings used in the textbook).
    2. Suppose the first character of string/text T is the character 'a'. What can you say about the worst case for run time, regarding the first character of substring/pattern P and the remaining characters in T? (For a worst case, execution must enter the inner loop as often as possible!)
    3. Based on your answer to the previous part (when the first character of T is 'a'), what can you say about the substring/pattern P for the worst case? (For a worst case, execution must enter the inner loop as often as possible, and each pass through the inner loop must continue for as long as possible.)
    4. For the worst case, the textbook and many other sources state, "the algorithm makes m(n-m+1) character comparisons". Using the rules of C and C++, do you agree with that conclusion? Justify your answer. (If your conclusion differs from the textbook's, where do you think the textbook's answer comes from?)
    5. Based on your analysis, determine Big-O and Big-Ω for this algorithm. Does Big-Θ exist for this code? Explain briefly.

Merge Sort

  1. Merge Sort is discussed in the "Divide and Conquer" section of the textbook. What part of the algorithm corresponds to a "divide" segment and what part corresponds to a "conquer" segment? Explain in at least 5 sentences.

  2. Figure 5.2 in the textbook shows a graph, giving the steps involved for sorting an 8-element array using a Merge Sort.

    1. Draw the parallel graph for the 12-element array a = {3, 1, 4, 1, 5, 9, 2, 9, 5, 3, 5, 8}.
    2. Describe how processing actually proceeds through the diagram, using the traditional, recursive algorithm. That is, write several sentences, indicating what part of the diagram is processed first, what happens second, etc.
  3. As discussed in the textbook and in class, Merge Sort runs in Θ(n log n) time, and this provides a general sense of how the algorithm will scale as the array increases in size. This problem explores how much variation might be expected in the time for sorting different types of data. To begin, consider the program mergesort-data-sets.c, which runs the traditional, recursive merge sort on ascending, random, and descending data for arrays of different sizes.

    1. The program is designed to run with array sizes 800,000, 1,600,000, 3,200,000, . . . , 25,600,000. Compile and run the program, and describe what happens.

    2. With size constraints on the run-time stack, the stack overflows, since the merge procedure allocates local storage with each call. (On my machine, stack overflow occurs with an array size between 2,000,000 and 2,100,000—but your experience may be different.) To resolve this problem, replace the local declarations

                  int larr [lsize];
                  int rarr [rsize];
      with dynamic memory (e.g., declare variables int * larr and initialize using malloc, but be sure to free that memory when the merge procedure terminates).
      With that change, compile and run the program several times, and answer the following questions.
      • To what extent do the times differ from one run of the program to another?
      • How much variation is reported for the different data sets for a given array size? Do you observe any consistent differences moving across a row? Explain briefly.
      • Looking at the column for arrays in ascending order, what can you say about sorting times, when the array size doubles?
        • Algorithms which have Θ(n) would have times double (approximately) when size doubles. Do the times here fit that pattern? Explain.
        • Algorithms which have Θ(n²) would have times increase by about a factor of 4 when size doubles. Do the times here fit that pattern? Explain.
        • The analysis of Merge Sort indicates a Θ(n log n) algorithm. Discuss how the actual timings might fit this analysis.
      • Answer the same questions for arrays in random or descending order.
    3. To further explore how timing might be affected by the data set, expand the program to include two more data sets for each array (adding additional columns to the output).

      • One data set should be in ascending order, except that the first 10 elements are swapped with last 10 elements.
      • One data set should be a separate array initialized independently with random data.
      Compare these additional data sets to the previous ones, and describe any similarities or differences in the run times.
created December 1, 2018
revised December 2, 2018
revised December 27-30, 2021
revised February 4, 2022
reformatted and heap material added July 28, 2022
reorganized with brute force/merge sort added October 3-6, 2022
revised December 30, 2022
reorganized with moderate editing Summer, 2023
additional editing November 30-December 7, 2024
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.

Copyright © 2011-2025 by Henry M. Walker.
Selected materials copyright by Marge Coahran, Samuel A. Rebelsky, John David Stone, and Henry Walker and used by permission.
This page and other materials developed for this course are under development.
This and all laboratory exercises for this course are licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.