Goals
Upon successful completion of this lab, you should be able to understand and write programs using:
- linear search
- binary search
- file I/O
Setup and Practice
Setup
- If your lab machine was not started in MacOS when you went to use it, reboot it to MacOS.
- Follow these procedures to mount your blue home-directory on the local machine and create a PyCharm project (call it Lab10). You will use PyCharm in the next part, but, it is important that you set up your project now so that if you run into any issue, we can resolve them quickly.
- Right-click or control-click to download these 4 files and place them into the directory for this lab:
Practice
This lab is all about different ways of answering one question: If I have a list L and a value v, how can I determine whether v is an element of L, and if so, what its index is?
For example, if I have the list ['A', 'B', 'C']
and the value 'C'
, my answer would be that 'C'
is at position 2 in the list.
The functions we write will return None
if the value is not in the list. So if I search the same list for 'D'
, the answer would be None
. Note that None
(no quotes) is a built-in Python constant, just like True
and False
.
Answer Question 1 in your Moodle writeup.
Linear search
If the list is not guaranteed to be in any particular order, we can't do any better than looping through it, looking at each element, and seeing if that element is equal to our desired value.
For example, if we have the list
['A', 'B', 'C']
, and we're looking for'B'
, we would start by comparing'B'
to'A'
. In the next iteration of the loop, we would compare'B'
to'B'
and return a 1.However, if we were searching for
'D'
instead, we would have to look at every element of the list before reaching the conclusion that'D'
is not present.Answer Question 2 in your Moodle writeup.
Binary search
If the list is guaranteed to be in sorted order, we can actually do better than examining every single element. Here's how:
If we have the list
['A', 'B', 'C']
, and we're looking for'C'
, and we know that the list is in alphabetical order, we can start by comparing'C'
to the middle element (which is'B'
in our example). Since'C'
comes after the middle element alphabetically, we can rule out the element in position 0 without even looking at it. For a longer list, we can keep repeating this trick to cut the list in half, and then half again, and so on.Answer Question 3 in your Moodle writeup.
Part A
In this part, we are going to read data from a file and print its contents. In later sections, we will search through this data.Instructions
- Create a new Python source code file named lab10a.py:
""" Program: CS 115 Lab 10 Author: Your name Description: This program will open a file and then search its contents using linear and binary search. """ def readfile(filename): '''Reads the entire contents of a file into a single string using the read() method. Args: filename (str): The name of the file to read. Returns: str: The text in the file as a large, possibly multi-line, string. ''' infile = open(filename, 'r') # opens file for reading filetext = infile.read() # reads the file contents into filetext infile.close() # closes the file return filetext # returns text of the file, as a long string def main(): '''Reads and prints a file's contents. ''' pass main()
- Read this code carefully before continuing and answer Question 4.
- Right now, this code doesn't do anything because the definition of
main
is empty. To fill out the definition ofmain
, replace thepass
with the following 3 lines of code:A statement that asks the user to supply the name of a file to read. This statement should save the user's answer as a string.
A statement that calls the
readfile
function to retrieve the contents of the user's file.You should pass
readfile
the user-supplied filename (as a string).
You should savereadfile
's return value in a different variable.- Write a statement to print the value of the variable that holds
readfile
's return value.
- Run your program. Type cities.txt when prompted. Verify that your program prints the names of cities.
- Answer Question 5 in your writeup. To help you answer these questions, you can print the length of the information you read from the file --- just comment out any extra print statements when you are finished.
Inside the definition of
readfile
, add a call tosplit()
at the end of the call toread
:filetext = infile.read().split()
Answer Question 6 in your writeup.To split the text from the file into a list of lines rather than a list of words, modify your line of code as shown:
filetext = infile.read().splitlines()
Answer Question 7 in your writeup.- Here are some samples of what your program should be doing at this point.
Modify your wording to match the example, and add code to print the number of lines in the file if you do not have it already.
- Sample 1
Name of input file: cities-small.txt Number of lines in file: 5 ['Santa Rosa', 'Petaluma', 'Rohnert Park', 'Windsor', 'Healdsburg']
- Sample 2
Name of input file: cities.txt Number of lines in file: 50 ['Los Angeles', 'San Diego', 'San Jose', 'San Francisco', 'Fresno', 'Sacramento', 'Long Beach', 'Oakland', 'Bakersfield', 'Anaheim', 'Santa Ana', 'Riverside', 'Stockton', 'Chula Vista', 'Fremont', 'Irvine', 'San Bernardino', 'Modesto', 'Oxnard', Fontana', 'Moreno Valley', 'Glendale', 'Huntington Beach', 'Santa Clarita', 'Garden Grove', 'Santa Rosa', 'Oceanside', 'Rancho Cucamonga', 'Ontario', 'Lancaster', 'Elk Grove', 'Palmdale', 'Corona', 'Salinas', 'Pomona', 'Torrance', 'Hayward', Escondido', 'Sunnyvale', 'Pasadena', 'Orange', 'Fullerton', 'Thousand Oaks', 'Visalia', 'Simi Valley', 'Concord', 'Roseville', Santa Clara', 'Vallejo', 'Victorville']
- Add the following function definition to your program:
def print_list(list_to_print): '''Prints the contents of a list. Args: list_to_print (list): The list to print. ''' for i, item in enumerate(list_to_print): print(i, ': ', item, sep="")
We probably have not discussed theenumerate
function in class. You may not understand how it works, so please experiment with it in PythonTutor using this starter code. Call over a lab helper with any questions. - Modify your
main
function to call yourprint_list
function instead of printing the list directly. Sample output:Name of input file: cities.txt Number of lines in file: 50 0: Los Angeles 1: San Diego 2: San Jose 3: San Francisco [etc]
- Answer Question 8 in your writeup.
- Add code at the end of your
main
function to prompt the user for the names of cities until the user types quit:Name of input file: cities.txt Number of lines in file: 50 0: Los Angeles 1: San Diego 2: San Jose 3: San Francisco [etc] Enter the name of a city: Cotati Enter the name of a city: Fresno Enter the name of a city: Santa Rosa Enter the name of a city: quit
- Continue to Part B.
Part B
In this part of the lab, we will implement our own version of linear search.Instructions
- Make a copy of your lab10a.py file, and name it lab10b.py.
- Add the following function definition to your program:
def linear_search(search_list, value_to_find): '''Uses linear search to find the position of an item in a list. Parameters: search_list (list): The list. value_to_find (str): The item to search for. Returns: int: The position of the item in the list, or None if not found. ''' pass
Write code in
linear_search
to do the following:- Loop over the items in your list
- Compare each item, in turn, to
value_to_find
- If an item matches the value you are trying to find, return its position in the list immediately.
You do not need to explicitly return
None
, but you should make sure that your code does not return anything if the item is not in the list.Note: you may have learned about or already know about the
search_list.index()
method function, but do not use that method. We are trying to duplicate what theindex()
method function does, by writing our own code rather than make-use of existing functions.- In
main
, add code to your loop to do the following:- Call your linear search function to get the position of each city entered by the user.
- Print the result.
- Make sure your code matches the sample inputs and outputs below exactly (except for the etc.), including the blank lines:
Name of input file: cities.txt Number of lines in file: 50 0: Los Angeles 1: San Diego 2: San Jose 3: San Francisco [etc] Enter the name of a city: Santa Rosa The position of Santa Rosa is: Linear search: 25 Enter the name of a city: Oakland The position of Oakland is: Linear search: 7 Enter the name of a city: Cotati The position of Cotati is: Linear search: None Enter the name of a city: quit
- Answer Question 9 in your writeup, using cities.txt as the input file.
- Comment out the line of code that prints the number of lines in the input file.
- Continue to Part C.
Part C
In this part of the lab, we will extend our code to also implement binary search.Instructions
- Make a copy of your lab10b.py file, and name it lab10c.py.
Binary search only works if the input list is sorted. Currently, our lists are not sorted.
In
main
, after printing the list of cities, add another line of code to sort the list:city_list.sort() # use the name of your list in place of city_list
Then print the sorted version of the list.
Sample input and output:
Name of input file: cities-small.txt The original list of cities is: 0: Santa Rosa 1: Petaluma 2: Rohnert Park 3: Windsor 4: Healdsburg After sorting, the new list is: 0: Healdsburg 1: Petaluma 2: Rohnert Park 3: Santa Rosa 4: Windsor Enter the name of a city: Petaluma The position of Petaluma is: Linear search: 1 Enter the name of a city: Windsor The position of Windsor is: Linear search: 4 Enter the name of a city: Cotati The position of Cotati is: Linear search: None Enter the name of a city: quit
Add the following function definition to your program:
def binary_search(search_list, value_to_find): '''Uses a binary search function to find the position of an item in a list. Args: search_list (list): The list. value_to_find (str): The item to search for. Returns: int: The position of the item in the list, or None if it is not in the list. ''' pass
In
main
, callbinary_search
inside your loop after calling linear search. Sample output so far:Name of input file: cities-small.txt The original list of cities is: 0: Santa Rosa 1: Petaluma 2: Rohnert Park 3: Windsor 4: Healdsburg After sorting, the new list is: 0: Healdsburg 1: Petaluma 2: Rohnert Park 3: Santa Rosa 4: Windsor Enter the name of a city: Petaluma The position of Petaluma is: Linear search: 1 Binary search: None Enter the name of a city: Windsor The position of Windsor is: Linear search: 4 Binary search: None Enter the name of a city: Cotati The position of Cotati is: Linear search: None Binary search: None Enter the name of a city: quit
- Inside the definition of
binary_search
, create three variables:first
--- this should be equal to the index of the first element in the list that was passed tobinary_search
last
--- this should be equal to the index of the last element in the list that was passed tobinary_search
middle
--- this should be equal to the index of the middle element in the list that was passed tobinary_search
. If there is an even number of elements,middle
should be rounded down so that it refers to the earlier of the two middle elements. Be sure to use a formula that will be correct as long asfirst
is less than or equal tolast
.
- Inside
binary_search
, add awhile
loop. For now, make the conditionwhile True:
Inside this loop, add code to do the following:- Calculate
middle
(move the line of code formiddle
from the previous step inside the loop). - Return
middle
ifvalue_to_find
is equal to the middle element of the list. - Adjust
first
ifvalue_to_find
comes after the middle element of the list. - Adjust
last
ifvalue_to_find
comes before the middle element of the list.
- Calculate
- Based on your answer to Question 11, modify your
while
loop so that the loop ends if the positions offirst
andlast
indicate thatvalue_to_find
is not in the list. - Rerun your program on all of the sample inputs from Question 11, and be sure that it always returns the correct position without infinite looping.
- Inside the definition of
binary_search
, add code to yourwhile
loop to count the number of loop iterations. Print the value of this counter just before you return from the function. (Because your function can return from two different places, you will need to insert two print statements.) Edit your program until you match this sample output:
Name of input file: cities-small.txt The original list of cities is: 0: Santa Rosa 1: Petaluma 2: Rohnert Park 3: Windsor 4: Healdsburg After sorting, the new list is: 0: Healdsburg 1: Petaluma 2: Rohnert Park 3: Santa Rosa 4: Windsor Enter the name of a city: Windsor Linear search: 4 **Binary search iterations: 3 Binary search: 4 Enter the name of a city: Rohnert Park Linear search: 2 **Binary search iterations: 1 Binary search: 2 Enter the name of a city: SSU Linear search: None **Binary search iterations: 2 Binary search: None Enter the name of a city: quit
- Test your code on the larger cities.txt file, and answer Questions 12–15 in your writeup.
- Demo. Demo your program.
- Continue to the next part to submit your program.
Assignment Submission
Instructions
- Answer the last question (#16) in your Moodle writeup. Review your answers, and then click the "Next" button at the bottom of the quiz. Once you do that, you should see a "Summary of Attempt" screen.
- Click the "Submit all and finish" button.
Warning: You must hit "Submit all and finish" so that your writeup can be graded! It is not submitted until you do this.
Once you have submitted your quiz, you should see something similar to this at the top of your Moodle window. The important part is that the State shows up as Finished.
Please leave this tab open in your browser. - Click on the "Lab 10 code" link in Moodle and open in a new tab. Follow the instructions to upload your source code (lab10c.py) for Lab10. You could either browse for your code or, using a finder window, drag and drop your lab10c.py from your cs115/Lab10 folder to Moodle. You should subsequently see a dialog box which indicates 'Submission Status' as 'Submitted for grading'. Leave this tab open in your browser.
- With these confirmation pages open in your browser, you may call an instructor over to verify that you have completed every part of the lab. Otherwise, you are done!