Goals

Upon successful completion of this lab, you should be able to understand and write programs using:


Setup and Practice

Setup

  1. If your lab machine was not started in MacOS when you went to use it, reboot it to MacOS.
  2. Follow these procedures to mount your blue home-directory on the local machine and create a PyCharm project (call it Lab10). You will use PyCharm in the next part, but, it is important that you set up your project now so that if you run into any issue, we can resolve them quickly.
  3. Right-click or control-click to download these 4 files and place them into the directory for this lab:

Practice

This lab is all about different ways of answering one question: If I have a list L and a value v, how can I determine whether v is an element of L, and if so, what its index is?

For example, if I have the list ['A', 'B', 'C'] and the value 'C', my answer would be that 'C' is at position 2 in the list.

The functions we write will return None if the value is not in the list. So if I search the same list for 'D', the answer would be None. Note that None (no quotes) is a built-in Python constant, just like True and False.

  1. Answer Question 1 in your Moodle writeup.

  2. Linear search

    If the list is not guaranteed to be in any particular order, we can't do any better than looping through it, looking at each element, and seeing if that element is equal to our desired value.

    For example, if we have the list ['A', 'B', 'C'], and we're looking for 'B', we would start by comparing 'B' to 'A'. In the next iteration of the loop, we would compare 'B' to 'B' and return a 1.

    However, if we were searching for 'D' instead, we would have to look at every element of the list before reaching the conclusion that 'D' is not present.

  3. Answer Question 2 in your Moodle writeup.

  4. Binary search

    If the list is guaranteed to be in sorted order, we can actually do better than examining every single element. Here's how:

    If we have the list ['A', 'B', 'C'], and we're looking for 'C', and we know that the list is in alphabetical order, we can start by comparing 'C' to the middle element (which is 'B' in our example). Since 'C' comes after the middle element alphabetically, we can rule out the element in position 0 without even looking at it. For a longer list, we can keep repeating this trick to cut the list in half, and then half again, and so on.

  5. Answer Question 3 in your Moodle writeup.


Part A

In this part, we are going to read data from a file and print its contents. In later sections, we will search through this data.

Instructions

  1. Create a new Python source code file named lab10a.py:
    """
    Program: CS 115 Lab 10
    Author: Your name
    Description: This program will open a file and then search its contents
        using linear and binary search.
    """
    
    
    def readfile(filename):
        '''Reads the entire contents of a file into a single string using
        the read() method.
    
        Args:
            filename (str): The name of the file to read.
    
        Returns:
            str: The text in the file as a large, possibly multi-line, string.
        '''
        infile = open(filename, 'r')  # opens file for reading
    
        filetext = infile.read()  # reads the file contents into filetext
    
        infile.close()  # closes the file
    
        return filetext  # returns text of the file, as a long string
    
    
    def main():
        '''Reads and prints a file's contents.
        '''
        pass
    
    
    main()
    
  2. Read this code carefully before continuing and answer Question 4.
  3. Right now, this code doesn't do anything because the definition of main is empty. To fill out the definition of main, replace the pass with the following 3 lines of code:
    • A statement that asks the user to supply the name of a file to read. This statement should save the user's answer as a string.

    • A statement that calls the readfile function to retrieve the contents of the user's file.

      You should pass readfile the user-supplied filename (as a string).
      You should save readfile's return value in a different variable.

    • Write a statement to print the value of the variable that holds readfile's return value.
  4. Run your program. Type cities.txt when prompted. Verify that your program prints the names of cities.
  5. Answer Question 5 in your writeup. To help you answer these questions, you can print the length of the information you read from the file --- just comment out any extra print statements when you are finished.
  6. Inside the definition of readfile, add a call to split() at the end of the call to read:

    filetext = infile.read().split()
    
    Answer Question 6 in your writeup.
  7. To split the text from the file into a list of lines rather than a list of words, modify your line of code as shown:

    filetext = infile.read().splitlines()
    
    Answer Question 7 in your writeup.
  8. Here are some samples of what your program should be doing at this point. Modify your wording to match the example, and add code to print the number of lines in the file if you do not have it already.
    Sample 1
    Name of input file: cities-small.txt
    Number of lines in file: 5
    ['Santa Rosa', 'Petaluma', 'Rohnert Park', 'Windsor', 'Healdsburg']
    
    Sample 2
    Name of input file: cities.txt
    Number of lines in file: 50
    ['Los Angeles', 'San Diego', 'San Jose', 'San Francisco', 'Fresno', 'Sacramento', 'Long Beach', 'Oakland', 'Bakersfield',
    'Anaheim', 'Santa Ana', 'Riverside', 'Stockton', 'Chula Vista', 'Fremont', 'Irvine', 'San Bernardino', 'Modesto', 'Oxnard', 
    Fontana', 'Moreno Valley', 'Glendale', 'Huntington Beach', 'Santa Clarita', 'Garden Grove', 'Santa Rosa', 'Oceanside', 'Rancho
    Cucamonga', 'Ontario', 'Lancaster', 'Elk Grove', 'Palmdale', 'Corona', 'Salinas', 'Pomona', 'Torrance', 'Hayward', 
    Escondido', 'Sunnyvale', 'Pasadena', 'Orange', 'Fullerton', 'Thousand Oaks', 'Visalia', 'Simi Valley', 'Concord', 'Roseville', 
    Santa Clara', 'Vallejo', 'Victorville']
    
  9. Add the following function definition to your program:
    def print_list(list_to_print):
        '''Prints the contents of a list.
    
        Args:
            list_to_print (list): The list to print.
        '''
        for i, item in enumerate(list_to_print):
            print(i, ': ', item, sep="")
    
    We probably have not discussed the enumerate function in class. You may not understand how it works, so please experiment with it in PythonTutor using this starter code. Call over a lab helper with any questions.
  10. Modify your main function to call your print_list function instead of printing the list directly. Sample output:
    Name of input file: cities.txt
    Number of lines in file: 50
    0: Los Angeles
    1: San Diego
    2: San Jose
    3: San Francisco
    [etc]
    
  11. Answer Question 8 in your writeup.
  12. Add code at the end of your main function to prompt the user for the names of cities until the user types quit:
    Name of input file: cities.txt
    Number of lines in file: 50
    0: Los Angeles
    1: San Diego
    2: San Jose
    3: San Francisco
    [etc]
    
    Enter the name of a city: Cotati
    
    Enter the name of a city: Fresno
    
    Enter the name of a city: Santa Rosa
    
    Enter the name of a city: quit
    
  13. Continue to Part B.

Part B

In this part of the lab, we will implement our own version of linear search.

Instructions

  1. Make a copy of your lab10a.py file, and name it lab10b.py.
  2. Add the following function definition to your program:
    def linear_search(search_list, value_to_find):
        '''Uses linear search to find the position of an item in a list.
    
        Parameters:
            search_list (list): The list.
            value_to_find (str): The item to search for.
    
        Returns:
            int: The position of the item in the list, or None if not found.
        '''
        pass
    
  3. Write code in linear_search to do the following:

    • Loop over the items in your list
    • Compare each item, in turn, to value_to_find
    • If an item matches the value you are trying to find, return its position in the list immediately.

    You do not need to explicitly return None, but you should make sure that your code does not return anything if the item is not in the list.

    Note: you may have learned about or already know about the search_list.index() method function, but do not use that method. We are trying to duplicate what the index() method function does, by writing our own code rather than make-use of existing functions.

  4. In main, add code to your loop to do the following:
    • Call your linear search function to get the position of each city entered by the user.
    • Print the result.
  5. Make sure your code matches the sample inputs and outputs below exactly (except for the etc.), including the blank lines:
    Name of input file: cities.txt
    Number of lines in file: 50
    0: Los Angeles
    1: San Diego
    2: San Jose
    3: San Francisco
    [etc]
    
    Enter the name of a city: Santa Rosa
    The position of Santa Rosa is:
    Linear search: 25
    
    Enter the name of a city: Oakland
    The position of Oakland is:
    Linear search: 7
    
    Enter the name of a city: Cotati
    The position of Cotati is:
    Linear search: None
    
    Enter the name of a city: quit
    
  6. Answer Question 9 in your writeup, using cities.txt as the input file.
  7. Comment out the line of code that prints the number of lines in the input file.
  8. Continue to Part C.

Part C

In this part of the lab, we will extend our code to also implement binary search.

Instructions

  1. Make a copy of your lab10b.py file, and name it lab10c.py.
  2. Binary search only works if the input list is sorted. Currently, our lists are not sorted.

    In main, after printing the list of cities, add another line of code to sort the list:

    city_list.sort()  # use the name of your list in place of city_list
    

    Then print the sorted version of the list.

    Sample input and output:

    Name of input file: cities-small.txt
    
    The original list of cities is:
    0: Santa Rosa
    1: Petaluma
    2: Rohnert Park
    3: Windsor
    4: Healdsburg
    
    After sorting, the new list is:
    0: Healdsburg
    1: Petaluma
    2: Rohnert Park
    3: Santa Rosa
    4: Windsor
    
    Enter the name of a city: Petaluma
    The position of Petaluma is:
    Linear search: 1
    
    Enter the name of a city: Windsor
    The position of Windsor is:
    Linear search: 4
    
    Enter the name of a city: Cotati
    The position of Cotati is:
    Linear search: None
    
    Enter the name of a city: quit
    
  3. Add the following function definition to your program:

    def binary_search(search_list, value_to_find):
        '''Uses a binary search function to find the position of an item in a list.
        
        Args:
            search_list (list): The list.
            value_to_find (str): The item to search for.
    
        Returns:
            int: The position of the item in the list, or None if it is not in the list.
        '''
        pass
    
  4. In main, call binary_search inside your loop after calling linear search. Sample output so far:

    Name of input file: cities-small.txt
    
    The original list of cities is:
    0: Santa Rosa
    1: Petaluma
    2: Rohnert Park
    3: Windsor
    4: Healdsburg
    
    After sorting, the new list is:
    0: Healdsburg
    1: Petaluma
    2: Rohnert Park
    3: Santa Rosa
    4: Windsor
    
    Enter the name of a city: Petaluma
    The position of Petaluma is:
    Linear search: 1
    Binary search: None
    
    Enter the name of a city: Windsor
    The position of Windsor is:
    Linear search: 4
    Binary search: None
    
    Enter the name of a city: Cotati
    The position of Cotati is:
    Linear search: None
    Binary search: None
    
    Enter the name of a city: quit
    
  5. Inside the definition of binary_search, create three variables:
    • first --- this should be equal to the index of the first element in the list that was passed to binary_search
    • last --- this should be equal to the index of the last element in the list that was passed to binary_search
    • middle --- this should be equal to the index of the middle element in the list that was passed to binary_search. If there is an even number of elements, middle should be rounded down so that it refers to the earlier of the two middle elements. Be sure to use a formula that will be correct as long as first is less than or equal to last.
    For now, print out the values of these 3 variables, and then answer Question 7. Once you have completed Question 10, comment out these print statements.
  6. Inside binary_search, add a while loop. For now, make the condition while True: Inside this loop, add code to do the following:
    • Calculate middle (move the line of code for middle from the previous step inside the loop).
    • Return middle if value_to_find is equal to the middle element of the list.
    • Adjust first if value_to_find comes after the middle element of the list.
    • Adjust last if value_to_find comes before the middle element of the list.
    Answer Question 11 in your writeup.
  7. Based on your answer to Question 11, modify your while loop so that the loop ends if the positions of first and last indicate that value_to_find is not in the list.
  8. Rerun your program on all of the sample inputs from Question 11, and be sure that it always returns the correct position without infinite looping.
  9. Inside the definition of binary_search, add code to your while loop to count the number of loop iterations. Print the value of this counter just before you return from the function. (Because your function can return from two different places, you will need to insert two print statements.)
  10. Edit your program until you match this sample output:

    Name of input file: cities-small.txt
    
    The original list of cities is:
    0: Santa Rosa
    1: Petaluma
    2: Rohnert Park
    3: Windsor
    4: Healdsburg
    
    After sorting, the new list is:
    0: Healdsburg
    1: Petaluma
    2: Rohnert Park
    3: Santa Rosa
    4: Windsor
    
    Enter the name of a city: Windsor
    Linear search: 4
    **Binary search iterations: 3
    Binary search: 4
    
    Enter the name of a city: Rohnert Park
    Linear search: 2
    **Binary search iterations: 1
    Binary search: 2
    
    Enter the name of a city: SSU
    Linear search: None
    **Binary search iterations: 2
    Binary search: None
    
    Enter the name of a city: quit
    
  11. Test your code on the larger cities.txt file, and answer Questions 12–15 in your writeup.
  12. Demo. Demo your program.
  13. Continue to the next part to submit your program.

Assignment Submission

Instructions

  1. Answer the last question (#16) in your Moodle writeup. Review your answers, and then click the "Next" button at the bottom of the quiz. Once you do that, you should see a "Summary of Attempt" screen.
  2. Click the "Submit all and finish" button. Warning: You must hit "Submit all and finish" so that your writeup can be graded! It is not submitted until you do this. Once you have submitted your quiz, you should see something similar to this at the top of your Moodle window. The important part is that the State shows up as Finished.
    Quiz confirmation
    Please leave this tab open in your browser.
  3. Click on the "Lab 10 code" link in Moodle and open in a new tab. Follow the instructions to upload your source code (lab10c.py) for Lab10. You could either browse for your code or, using a finder window, drag and drop your lab10c.py from your cs115/Lab10 folder to Moodle. You should subsequently see a dialog box which indicates 'Submission Status' as 'Submitted for grading'. Leave this tab open in your browser.
  4. With these confirmation pages open in your browser, you may call an instructor over to verify that you have completed every part of the lab. Otherwise, you are done!