CS 415, Section 001 Sonoma State University Fall, 2022
 
Algorithm Analysis
Instructor: Henry M. Walker

Lecturer, Sonoma State University
Professor Emeritus of Computer Science and Mathematics, Grinnell College

Although much of this course is well developed, some details can be expected to evolve as the semester progresses.
Any changes in course details will be announced promptly during class.

Worksheet/Lab on Topological Sorting, Partition, and Quicksort

As suggested by this worksheet/lab's title, this exercise is organized into three parts:

Topological Sorting

  1. Consider the following directed graph.

    [Figure: a directed graph]
  2. (not required, but available for extra credit) Describe the efficiency (Big-Θ) of the topological sort algorithm for a directed graph with v vertices and e edges, assuming edge information is stored in an adjacency list, and justify your answer.
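
As a point of reference for the efficiency question, here is a minimal sketch of one common topological sort, the source-removal (Kahn) method, working from an adjacency list. The course reading may use a different algorithm (for example, one based on depth-first search). The names edgeNode, addEdge, and topologicalSort are hypothetical, and vertices are assumed to be numbered 0 through v-1.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical adjacency-list node: one outgoing edge. */
typedef struct edgeNode {
    int to;                    /* head of the edge */
    struct edgeNode *next;     /* next edge leaving the same vertex */
} edgeNode;

/* Insert the directed edge from -> to at the front of adj[from]. */
void addEdge(edgeNode *adj[], int from, int to) {
    edgeNode *e = malloc(sizeof *e);
    assert(e != NULL);
    e->to = to;
    e->next = adj[from];
    adj[from] = e;
}

/* Write a topological order of vertices 0..v-1 into order[].
   Returns the number of vertices placed; a result less than v
   means the graph contains a cycle. */
int topologicalSort(edgeNode *adj[], int v, int order[]) {
    assert(v > 0);
    int *indegree = calloc((size_t) v, sizeof *indegree);
    assert(indegree != NULL);

    /* Counting pass: every edge is examined once. */
    for (int u = 0; u < v; u++)
        for (edgeNode *e = adj[u]; e != NULL; e = e->next)
            indegree[e->to]++;

    /* order[] doubles as a queue: order[head..tail-1] holds
       vertices with no remaining incoming edges. */
    int head = 0, tail = 0;
    for (int u = 0; u < v; u++)
        if (indegree[u] == 0)
            order[tail++] = u;

    /* Remove each source and release the vertices it was blocking. */
    while (head < tail) {
        int u = order[head++];
        for (edgeNode *e = adj[u]; e != NULL; e = e->next)
            if (--indegree[e->to] == 0)
                order[tail++] = e->to;
    }

    free(indegree);
    return tail;
}
```

A caller would set every entry of adj to NULL, call addEdge once per edge, and then pass an int array of length v to topologicalSort. Counting how often each loop body runs, in terms of v and e, is one way to approach the justification.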

The Partition Procedure and Alternative Loop Invariants

  1. Based on the Reading on Quicksort, the program partitions.c provides

    1. Review the program and explain
      • the purpose of the code
          typedef struct algs {
             char * name;
             int (*proc) (int [ ], int, int, int);
          } partitionType;
        
                    
        and how this structure is used.
      • the purpose of the line
        #define printCopyTime 0  // 1 =  print times to copy arrays; 0 = omit this output
                    
        and how printCopyTime is used.
      • what steps are involved in timing segments of code in C/C++ (be sure to identify which function or functions are called).
      • the purpose of the variables maxreps and cop_time, and why these variables are needed.
    2. Expand this program by inserting the following two functions and updating the main procedure to include them in test runs:

      • a procedure that implements Loop Invariant 2 (most of the code is given in the Reading on Quicksort).
      • a procedure that implements Loop Invariant 3 (from the same reading).
      Note: Both of these procedures must be based on the Loop Invariants specified. Procedures violating the specified Loop Invariant will lose [almost] all credit for that part.
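
To make the review questions above concrete, the following minimal sketch shows how a structure holding a name and a function pointer can drive a timing loop built on clock() from <time.h>. It is not taken from partitions.c: the meaning of the four parameters (array, size, left, right) is an assumption, partitionSketch and timePartition are hypothetical names, and the partition shown is a simple Lomuto-style scan rather than Loop Invariant 2 or 3 from the reading.

```c
#include <assert.h>
#include <time.h>

typedef struct algs {
    char *name;                          /* label printed with the results */
    int (*proc)(int[], int, int, int);   /* the partition procedure to call */
} partitionType;

/* Hypothetical partition: assumed parameters are (array, size, left, right).
   Uses a[left] as the pivot and returns the pivot's final index. */
int partitionSketch(int a[], int size, int left, int right) {
    (void) size;               /* unused in this sketch */
    int pivot = a[left];
    int last = left;           /* a[left+1..last] holds items < pivot */
    for (int i = left + 1; i <= right; i++) {
        if (a[i] < pivot) {
            last++;
            int temp = a[i]; a[i] = a[last]; a[last] = temp;
        }
    }
    int temp = a[left]; a[left] = a[last]; a[last] = temp;
    return last;
}

/* Call alg.proc reps times and return the elapsed processor time
   in seconds, computed from two clock() readings. */
double timePartition(partitionType alg, int a[], int size, int reps) {
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        alg.proc(a, size, 0, size - 1);
    clock_t stop = clock();
    return (double) (stop - start) / CLOCKS_PER_SEC;
}
```

With an array of such structures, one loop can run and label every partition variant without naming each function at the call site. Because clock() has limited resolution, a single short call often measures as zero, which is one reason a timing harness typically repeats the call many times.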
  2. Run your expanded program from Step 3, and print out the output obtained. Then answer these questions.

    1. Which, if any, of the implementations of the partition are most efficient? Why do you think this result is observed?
    2. What, if any, time penalty is obtained by using a separate swap function, rather than writing the three lines of code inline within a partition function?
    3. Roughly, how do the times change for each procedure when the data sets double in size with each main iteration? Does this experimental timing suggest a Big-Θ for the run time? Explain.
  3. Finding the kth largest item: The partition method may be used to find the kth smallest element in an array, by narrowing the range to be examined within the overall array. For example, suppose that partition returns index middle as the final location of the pivot. Basic processing involves three cases:

    Write a procedure select to find the kth largest element in any array. select should use procedure partition and the notes above to guide its processing. Your lab write-up should include the code for select, the enclosing program used for testing, and the test runs used for checking correctness.

    Note: For an array of size n, setting k to n/2 enables select to find the median value.
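
The range-narrowing idea can be sketched as follows, under stated assumptions. This version finds the kth smallest element with k counted from 1, so the kth largest of n items is the (n-k+1)th smallest. The names partitionSketch and selectKthSmallest are hypothetical, the partition is a Lomuto-style scan rather than one of the loop invariants from the reading, and the three branches are the standard quickselect cases, which may or may not match the cases intended in the lab.

```c
#include <assert.h>

/* Hypothetical partition of a[left..right] around the pivot a[left];
   returns the pivot's final index. */
static int partitionSketch(int a[], int left, int right) {
    int pivot = a[left];
    int last = left;           /* a[left+1..last] holds items < pivot */
    for (int i = left + 1; i <= right; i++) {
        if (a[i] < pivot) {
            last++;
            int temp = a[i]; a[i] = a[last]; a[last] = temp;
        }
    }
    int temp = a[left]; a[left] = a[last]; a[last] = temp;
    return last;
}

/* Return the kth smallest element (k = 1..n) of a[0..n-1].
   The array is rearranged as a side effect. */
int selectKthSmallest(int a[], int n, int k) {
    assert(n > 0 && k >= 1 && k <= n);
    int left = 0, right = n - 1;
    int target = k - 1;        /* index the answer occupies once sorted */
    for (;;) {
        int middle = partitionSketch(a, left, right);
        if (middle == target)
            return a[middle];          /* pivot landed on the target */
        else if (target < middle)
            right = middle - 1;        /* answer lies left of the pivot */
        else
            left = middle + 1;         /* answer lies right of the pivot */
    }
}
```

For the array {7, 2, 9, 4, 1, 8}, a call with k equal to 3 returns 4, the third smallest value. The name select is avoided here only because it can collide with the POSIX function of the same name on some systems.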

Quicksort, Improved Quicksort, and Hybrid Quicksort

  1. Program quicksort-comparisons.c contains two copies of quicksort procedures and a framework for timing the running of these procedures on ascending, random, and descending data sets of varying sizes. In particular,

    These functions come directly from the Reading on Quicksort

    1. Revise the relevant function(s) labeled "impr", to transform the code to implement an "improved quicksort". In brief, an "improved quicksort" modifies the "basic quicksort" by selecting a random element in the array segment between index left and right, and swapping that element with the element at array index left. Otherwise, the "improved quicksort" is the same as the "basic" version.
    2. Run the program and describe what happens. Why do you think the program crashes on ascending and/or descending data for the basic quicksort, once the data set reaches a certain size?
    3. Modify the testing component of the main procedure, so that the basic quicksort is run on ascending or descending data sets only when they are relatively small; for larger ascending or descending data sets, the time is reported only as ---. The full program still should produce timing output for the basic quicksort for all sizes of random data and for the improved quicksort for all data sets. For example, part of the output might have the following format (although the numbers may be [quite] different).
                          Data Set                   Times
      Algorithm             Size     Ascending Order   Random Order  Descending Order
      basic quicksort      40000          1.1  ok        0.0  ok           1.1  ok
      improved quicksort   40000          1.1  ok        0.0  ok           1.1  ok
      
      . . .
                          
      basic quicksort     160000         18.1  ok        0.0  ok          18.0  ok
      improved quicksort  160000         17.9  ok        0.0  ok          18.0  ok
      
      basic quicksort     320000         ----            0.0  ok          ----
      improved quicksort  320000          0.0  ok        0.0  ok          0.0  ok
      
      . . .
                          
      basic quicksort    2560000         ----            0.4  ok          ----
      improved quicksort 2560000          0.2  ok        0.4  ok          0.2  ok
      
      . . .
              
    4. Review the output produced by this updated program (and turn it in with the revised program and other answers to this assignment). Under what circumstances, if any, does the improved quicksort yield better results than the basic version? Explain these results briefly, based on your program runs.
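
The pivot-selection step that distinguishes the "improved quicksort" can be sketched in isolation. The helper name moveRandomPivotToLeft is hypothetical, and using rand() with the remainder operator is an assumption made for brevity (it has a slight bias toward smaller offsets when the range does not divide RAND_MAX + 1 evenly); the Reading on Quicksort may choose the random index differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: choose a random index r in left..right and
   swap a[r] with a[left], so that a partition procedure expecting
   its pivot at a[left] can run unchanged afterward. */
void moveRandomPivotToLeft(int a[], int left, int right) {
    assert(left <= right);
    int r = left + rand() % (right - left + 1);
    int temp = a[left];
    a[left] = a[r];
    a[r] = temp;
}
```

A quicksort would call such a helper just before partitioning a[left..right]. Calling srand once at program start, for example with a value derived from time(NULL), varies the sequence of pivots between runs.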
  2. Expand the program in Step 6 to include a hybrid quicksort function (with any needed helper functions, perhaps copied with minor revision from the improved quicksort). The hybrid quicksort is described in the Reading on Quicksort.

    1. The expanded program should include these elements:
      • The revised hybrid quicksort function should include another parameter—the maximum size of the array segment for an insertion sort (before the improved quicksort is used).
      • For each data set, the main program should call the hybrid quicksort with the maximum size for the insertion sort having values 4, 5, 6, . . . 11.
      • The maximum-sized data set should be set as 40960000.
    2. Print out the results of a sample run of this program, and answer these questions:
      • For which array-segment sizes, if any, does the insertion sort improve the performance of the hybrid quicksort?
      • What optimal size of an array segment should be used for an insertion sort, rather than a quicksort, in this hybrid algorithm? Explain briefly.
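
One possible shape for a hybrid quicksort with a cutoff parameter is sketched below. It assumes the usual interpretation that segments of at most maxInsertion elements are handed to an insertion sort and larger segments are partitioned around a randomly chosen pivot; the Reading on Quicksort may organize this differently. The names insertionSortRange and hybridQuicksortSketch are hypothetical, and the partition is a Lomuto-style scan.

```c
#include <assert.h>
#include <stdlib.h>

/* Insertion sort restricted to a[left..right]. */
static void insertionSortRange(int a[], int left, int right) {
    for (int i = left + 1; i <= right; i++) {
        int item = a[i];
        int j = i - 1;
        while (j >= left && a[j] > item) {
            a[j + 1] = a[j];   /* shift larger items one place right */
            j--;
        }
        a[j + 1] = item;
    }
}

/* Hypothetical hybrid: segments of at most maxInsertion elements are
   finished by insertion sort; larger segments are partitioned around
   a random pivot and sorted recursively. */
void hybridQuicksortSketch(int a[], int left, int right, int maxInsertion) {
    if (left >= right)
        return;                              /* 0 or 1 element */
    if (right - left + 1 <= maxInsertion) {
        insertionSortRange(a, left, right);
        return;
    }

    /* Move a randomly chosen element into a[left] to serve as pivot. */
    int r = left + rand() % (right - left + 1);
    int temp = a[left]; a[left] = a[r]; a[r] = temp;

    /* Lomuto-style partition: a[left+1..last] holds items < pivot. */
    int pivot = a[left];
    int last = left;
    for (int i = left + 1; i <= right; i++) {
        if (a[i] < pivot) {
            last++;
            temp = a[i]; a[i] = a[last]; a[last] = temp;
        }
    }
    temp = a[left]; a[left] = a[last]; a[last] = temp;

    hybridQuicksortSketch(a, left, last - 1, maxInsertion);
    hybridQuicksortSketch(a, last + 1, right, maxInsertion);
}
```

Passing the cutoff as a parameter lets a test loop call the same function with the values 4 through 11 on identical copies of a data set, so the timings differ only in the cutoff.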
created August 6, 2022
revised August 9, 2022
revised September 27, 2022
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.