Assignment 2

· Due Nov 5 by 4pm

· Points 9.5

· Submitting on paper

CSC108 Assignment 2: Rent-a-Bike

Updates:

·         October 26, 2020: update to handout description of has_kiosk regarding use of constant

·         October 22, 2020: update to starter code to correct docstring for get_nearest_station. Clarification in redistribute_bikes about capacity.

·         October 21, 2020: update to starter code to change stations_files.close() to stations_file.close() in the __main__ block at the end of the file.

Due Date: Thursday, November 5, 2020 before 4:00pm

This assignment must be completed alone (no partners). Please see the syllabus for information about academic offenses.

Late policy: There are penalties for submitting the assignment after the due date. These penalties depend on how many hours late your submission is. Please see the syllabus on Quercus for more information.

Introduction

This handout explains the problem you are to solve, and the tasks you need to complete for the assignment. Please read it carefully.

Goals of this Assignment

·         Develop code that uses loops, conditionals, and other earlier course concepts.

·         Practice with lists, including looping over lists, list methods, and list mutation.

·         Practice reading a problem description in English and provided docstring examples, and implementing function bodies to solve the problem.

·         Continue to use Python 3, Wing 101, provided starter code, a checker module, and other tools.

Rent-a-Bike

Toronto's bike share network debuted in 2011, offering rental bikes to Torontonians and visitors in the downtown core. This network consists of hundreds of docking stations scattered around downtown. Bikes can be rented from any docking station and returned to any docking station in the city. In this assignment, you will write several functions to help manage and track bike rentals across this network. Using real data from Toronto's bike share system, your functions will simulate bike rentals and returns as well as keep track of the current state of the network and even provide directions to riders.

The data that you will work with is provided by the Toronto bike share network. The data contains information about the docking stations, such as the location of the station and how many bikes are currently available. More information about the data provided is given later in this handout.

The purpose of this assignment is to give you practice using the programming concepts that you have seen in the course so far, including (but not limited to) strings, lists and list methods, and loops.

Files to Download

Please download the Assignment 2 Starter Files and extract the zip archive. A description of each of the files that we have provided is given in the paragraphs below:

Starter code: bikes.py

The bikes.py file contains some constants, and a couple of complete helper functions that you may use. You must not modify the provided helper functions.

The bikes.py file also contains function headers and docstrings for the A2 functions to which you are required to add function bodies. For each function, read the header and docstring (especially the examples) to learn what task the function performs. Doing so may help you to determine what you need to do for each required function. To gain a better understanding of each function, you may want to add more examples to the docstrings.

Data: stations.csv

The stations.csv file contains bike share data in comma-separated values (CSV) format. See below for detailed information on the file format. You must not modify this file.

Checker: a2_checker.py

We have provided a checker program (a2_checker.py) that tests two things:

·         whether your functions have the correct parameter and return types, and

·         whether your code follows the Python and CSC108 style guidelines.

The checker program does not test the correctness of your functions, so you must do that yourself.

The Data

For this assignment, you will use data from a Comma Separated Value (CSV) file named stations.csv. Each row of this file contains the following information about a single bike rental station:

·         station ID: the unique identification (ID) number of the station

·         name: the name of the station (not necessarily unique)

·         latitude: the latitude of the station location

·         longitude: the longitude of the station location

·         capacity: the total number of bike docks (empty or with bike) at the station

·         bikes available: the number of bikes currently available to rent at the station

·         docks available: the number of empty and working docks at the station

Note: While the sum of the number of bikes available at a station and the number of docks available at a station will usually equal the station's capacity, this need not be the case. When a bike or a dock is broken, the sum of the two availability numbers will not match the capacity.

Another feature of a bike rental station is whether or not it has a kiosk. A kiosk allows a renter to pay for their bike rental using a credit card. Without a kiosk, renters can only pay for their bike through an app. Stations that are app-only (that is, that do not have a kiosk) have SMART somewhere in their name.

We have provided a function named csv_to_list, which reads a CSV file and returns its contents as a List[List[str]]. As you develop your program, you can use the csv_to_list function to produce a larger data set for testing your code. See the main block at the end of bikes.py for an example.
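To make the List[List[str]] format concrete, here is a minimal sketch of a reader that produces the same kind of result. It is not the provided csv_to_list: the name csv_to_list_sketch, its TextIO parameter, and the use of io.StringIO to stand in for an open file are all assumptions made for illustration, and the simple comma split would not handle quoted fields that contain commas.

```python
import io
from typing import List, TextIO


def csv_to_list_sketch(csv_file: TextIO) -> List[List[str]]:
    """Return the non-empty lines of csv_file as lists of strings.

    Hypothetical stand-in for the provided csv_to_list; it splits each
    line on commas and does not handle quoted fields.
    """
    rows = []
    for line in csv_file:
        stripped = line.strip()
        if stripped != '':
            rows.append(stripped.split(','))
    return rows


# io.StringIO behaves like an open text file, so no real file is needed.
sample_file = io.StringIO(
    '7000,Ft. York / Capreol Crt.,43.639832,-79.395954,31,20,11\n'
    '7001,Lower Jarvis St SMART / The Esplanade,43.647992,-79.370907,15,5,10\n'
)
sample_rows = csv_to_list_sketch(sample_file)
```

Every item in sample_rows is still a str at this point, including the ID and the counts.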

Your Tasks

Imagine that it is your job to manage Toronto's bike share system. As the manager, you need to know everything about the system. But, there are hundreds of docking stations, which is way too many to keep track of in your head. To make your life easier, you will write Python functions to help you manage the system.

Your functions will fall into three categories: functions for data cleaning, functions for data queries, and functions for data modification.

Data cleaning

We provided a function named csv_to_list that reads data from a CSV file and returns it in a List[List[str]]. Here is a sample of the type of list returned:

[['7000', 'Ft. York / Capreol Crt.', '43.639832', '-79.395954', '31', '20', '11'],
['7001', 'Lower Jarvis St SMART / The Esplanade', '43.647992', '-79.370907', '15', '5', '10']]

Notice that all of the data in the inner lists are represented as strings. You are to write the function clean_data, which should make modifications to the list according to the following rules:

·         If and only if a string represents a whole number (e.g., '3' or '3.0'), convert it to an int.

·         If and only if a string represents a number that is not a whole number (e.g., '3.14'), convert it to a float.

·         Otherwise, leave it as a str.

After applying the clean_data function to the example list, it should look like this:

[[7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
[7001, 'Lower Jarvis St SMART / The Esplanade', 43.647992, -79.370907, 15, 5, 10]]

Before you write the clean_data function body, please note:

·         you must not use the built-in function eval, and

·         this function is one of the more challenging functions in A2, because it mutates a list. We suggest that you start with some of the other functions, and come back to this one later.

Data cleaning function to implement in bikes.py

Function name:
(Parameter types) -> Return type

Full Description (paraphrase to get a proper docstring description)

clean_data
(List[List[str]]) -> None

The parameter represents a list of lists of strings. The list could have the format of stations data, but is not required to. See the starter code docstring for some examples.

Modify the input list so that strings that represent whole numbers are converted to ints, and strings that represent numbers that are not whole numbers are converted to floats. Strings that do not represent numbers are left as is.
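One possible way to apply the three rules without eval is sketched below. This is only an illustration under stated assumptions: the names is_number and clean_data_sketch are hypothetical, and the helper only recognizes plain decimal notation (an optional leading minus sign, digits, and at most one decimal point).

```python
from typing import List


def is_number(text: str) -> bool:
    """Return True if and only if text looks like a plain decimal number.

    Hypothetical helper: accepts an optional leading '-', digits, and at
    most one '.'.
    """
    digits = text
    if digits.startswith('-'):
        digits = digits[1:]
    # Remove at most one decimal point, then require digits only.
    digits = digits.replace('.', '', 1)
    return digits.isdecimal()


def clean_data_sketch(data: List[List[str]]) -> None:
    """Replace numeric strings in data with ints or floats, in place."""
    for row in data:
        # Loop over indexes so that items can be reassigned (mutation).
        for i in range(len(row)):
            if is_number(row[i]):
                value = float(row[i])
                if value == int(value):
                    row[i] = int(value)   # whole number, e.g. '3' or '3.0'
                else:
                    row[i] = value        # not whole, e.g. '3.14'


sample = [['7000', 'Ft. York / Capreol Crt.', '43.639832', '3.0', 'abc']]
result = clean_data_sketch(sample)
# result is None; sample itself has been changed.
```

Note that the inner loop runs over indexes rather than over the items directly. Assigning to the loop variable in "for item in row" would not change the list.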

Data queries

Once the data has been cleaned, you can use the following functions to extract information from the data. All the examples shown below assume that you are calling the function on the cleaned example list shown above.

You can work on these functions even before you have completed the clean_data function by working with the provided sample data in the starter code, and by creating your own small lists of clean station data for testing. See the docstrings in the starter code for examples of how to work with that data.

We will use 'Station' in the type contracts to represent a cleaned list that represents a single station.

List of data query functions to implement in bikes.py.

Function name:
(Parameter types) -> Return type

Full Description (paraphrase to get a proper docstring description)

has_kiosk
('Station') -> bool

The parameter represents cleaned station data for a single station.

The function should return True if and only if the given station has a kiosk. Recall that a station has a kiosk if and only if the string that the constant NO_KIOSK refers to is not part of its name.
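A minimal sketch of this check follows. The values given to NAME and NO_KIOSK here are assumptions based on the column order and the SMART naming rule described in this handout; in your own code, use the constants as defined in the starter code.

```python
# Assumed values for illustration; the real constants are in the starter code.
NAME = 1            # index of the station name in a cleaned station list
NO_KIOSK = 'SMART'  # substring that marks an app-only station


def has_kiosk_sketch(station: list) -> bool:
    """Return True if and only if the station name does not contain NO_KIOSK."""
    return NO_KIOSK not in station[NAME]


with_kiosk = [7000, 'Ft. York / Capreol Crt.',
              43.639832, -79.395954, 31, 20, 11]
app_only = [7001, 'Lower Jarvis St SMART / The Esplanade',
            43.647992, -79.370907, 15, 5, 10]
first_result = has_kiosk_sketch(with_kiosk)   # True
second_result = has_kiosk_sketch(app_only)    # False
```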

get_station_info
(int, List['Station']) -> list

The first parameter represents a station ID number and the second parameter represents cleaned stations data.

The function should return a list containing the name, the number of bikes available, the number of docks available, and whether the station has a kiosk (in this order), for the station with the given station ID.
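The lookup described above could be sketched as follows. The constant names and values are assumptions that follow the column order given in The Data, and returning an empty list when no station matches is also an assumption, since the handout does not say what happens in that case.

```python
from typing import List

# Assumed index constants; use the ones defined in the starter code.
ID = 0
NAME = 1
BIKES_AVAILABLE = 5
DOCKS_AVAILABLE = 6
NO_KIOSK = 'SMART'


def get_station_info_sketch(station_id: int, stations: List[list]) -> list:
    """Return [name, bikes available, docks available, has kiosk] for the
    station with ID station_id, or [] if there is no such station.
    """
    info = []
    # No break statement: station IDs are unique, so at most one station
    # matches and the loop can simply run to the end.
    for station in stations:
        if station[ID] == station_id:
            info = [station[NAME],
                    station[BIKES_AVAILABLE],
                    station[DOCKS_AVAILABLE],
                    NO_KIOSK not in station[NAME]]
    return info


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7001, 'Lower Jarvis St SMART / The Esplanade',
     43.647992, -79.370907, 15, 5, 10],
]
info_7001 = get_station_info_sketch(7001, sample)
# ['Lower Jarvis St SMART / The Esplanade', 5, 10, False]
```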

get_total
(int, List['Station']) -> int

The first parameter represents an index and the second parameter represents cleaned stations data.

The function should return the sum of the values at the given index in each inner list of the cleaned stations data.

You can assume that the given index is valid for the given data, and that the items in the list at the given index position are integers.
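As a rough sketch, the sum can be built with an accumulator. The function name get_total_sketch is hypothetical, and the indexes 4 and 5 used in the example calls assume the column order from The Data (capacity, then bikes available).

```python
from typing import List


def get_total_sketch(index: int, stations: List[list]) -> int:
    """Return the sum of the items at position index in each inner list."""
    total = 0
    for station in stations:
        total += station[index]
    return total


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7001, 'Lower Jarvis St SMART / The Esplanade',
     43.647992, -79.370907, 15, 5, 10],
]
total_capacity = get_total_sketch(4, sample)  # 31 + 15 == 46
total_bikes = get_total_sketch(5, sample)     # 20 + 5 == 25
```

In your own code, pass one of the provided index constants rather than a literal number such as 4.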

get_stations_with_n_docks
(int, List['Station']) -> List[int]

The first parameter represents a required minimum number of available docks and the second parameter represents cleaned stations data.

The function should return a list of the station IDs of stations that have at least the required minimum number of available docks. The station IDs should appear in the same order as in the given stations data list.

You can assume that the given number is non-negative.
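This filter could be sketched as shown below. The constant values are assumptions based on the column order in The Data; the starter code defines the real ones.

```python
from typing import List

# Assumed index constants; use the ones defined in the starter code.
ID = 0
DOCKS_AVAILABLE = 6


def get_stations_with_n_docks_sketch(n: int, stations: List[list]) -> List[int]:
    """Return the IDs of the stations that have at least n docks available,
    in the same order as they appear in stations.
    """
    ids = []
    for station in stations:
        if station[DOCKS_AVAILABLE] >= n:
            ids.append(station[ID])
    return ids


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7001, 'Lower Jarvis St SMART / The Esplanade',
     43.647992, -79.370907, 15, 5, 10],
]
at_least_10 = get_stations_with_n_docks_sketch(10, sample)  # [7000, 7001]
at_least_11 = get_stations_with_n_docks_sketch(11, sample)  # [7000]
```

Appending while looping from front to back keeps the IDs in their original order, which is what the description requires.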

get_nearest_station
(float, float, bool, List['Station']) -> int

The first and second parameters represent the latitude and longitude of a location, the third parameter represents whether or not to search for a station with a kiosk, and the fourth parameter represents cleaned stations data.

If the third argument is True, the function should return the ID of the station that is nearest to the given lat/lon location that has a kiosk. If the third argument is False, the function should return the ID of the nearest station, which may or may not have a kiosk. In the case of a tie, the function should return the ID of the station that appears first in the given list.

You can assume there is at least one station in the input data list. Furthermore, you can assume that if the third argument is True, then there is at least one station with a kiosk in the given data.
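The search and the tie-breaking rule can be sketched as follows. Two things here are assumptions: the index constants, and the helper squared_distance_sketch, which is a simple planar stand-in and not a true geographic distance. Any suitable distance helper could be substituted for it.

```python
from typing import List

# Assumed index constants; use the ones defined in the starter code.
ID = 0
NAME = 1
LATITUDE = 2
LONGITUDE = 3
NO_KIOSK = 'SMART'


def squared_distance_sketch(lat1: float, lon1: float,
                            lat2: float, lon2: float) -> float:
    """Return a planar squared distance in degrees (hypothetical helper)."""
    return (lat1 - lat2) ** 2 + (lon1 - lon2) ** 2


def get_nearest_station_sketch(lat: float, lon: float, with_kiosk: bool,
                               stations: List[list]) -> int:
    """Return the ID of the nearest eligible station to (lat, lon)."""
    found = False
    nearest_id = -1
    nearest_distance = 0.0
    for station in stations:
        eligible = not with_kiosk or NO_KIOSK not in station[NAME]
        if eligible:
            distance = squared_distance_sketch(
                lat, lon, station[LATITUDE], station[LONGITUDE])
            # Strict < keeps the earlier station when distances tie.
            if not found or distance < nearest_distance:
                found = True
                nearest_id = station[ID]
                nearest_distance = distance
    return nearest_id


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7001, 'Lower Jarvis St SMART / The Esplanade',
     43.647992, -79.370907, 15, 5, 10],
]
any_station = get_nearest_station_sketch(43.647992, -79.370907, False, sample)
kiosk_only = get_nearest_station_sketch(43.647992, -79.370907, True, sample)
# any_station is 7001; kiosk_only is 7000 because 7001 is app-only.
```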

Data modification

The functions that we have described up to this point have allowed us to clean our data and extract specific information from it. Now we will describe functions that let us change the data.

Notice that each of these functions mutates the given list. We recommend getting your rent_bike and return_bike functions working correctly first before attempting redistribute_bikes.

List of data modification functions to implement in bikes.py.

Function name
(Parameter types) -> Return type

Full Description

rent_bike
(int, List['Station']) -> bool

The first parameter represents a station ID and the second represents cleaned stations data.

A bike can be rented from a station if and only if there is at least one bike available at that station. If the condition is met, this function successfully rents a single bike from the station. A successful bike rental requires updating the bikes available count and the docks available count as if a single bike was removed from the station. Return True if the bike rental is successful, and False otherwise.

Precondition: the station ID will appear in the cleaned stations data.

return_bike
(int, List['Station']) -> bool

The first parameter represents a station ID and the second represents cleaned stations data.

A bike can be returned to a station if and only if there is at least one dock available at that station. If the condition is met, this function successfully returns a single bike to the station. A successful bike return requires updating the bikes available count and the docks available count as if a single bike was added to the station. Return True if the bike return is successful, and False otherwise.

Precondition: the station ID will appear in the cleaned stations data.
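The two updates mirror each other: a rental lowers the bike count and raises the dock count, and a return does the opposite. A minimal sketch of both follows, assuming index constants that match the column order in The Data.

```python
from typing import List

# Assumed index constants; use the ones defined in the starter code.
ID = 0
BIKES_AVAILABLE = 5
DOCKS_AVAILABLE = 6


def rent_bike_sketch(station_id: int, stations: List[list]) -> bool:
    """Rent one bike from the station with ID station_id, if possible."""
    rented = False
    for station in stations:
        if station[ID] == station_id and station[BIKES_AVAILABLE] >= 1:
            station[BIKES_AVAILABLE] -= 1   # one fewer bike to rent
            station[DOCKS_AVAILABLE] += 1   # its dock is now empty
            rented = True
    return rented


def return_bike_sketch(station_id: int, stations: List[list]) -> bool:
    """Return one bike to the station with ID station_id, if possible."""
    returned = False
    for station in stations:
        if station[ID] == station_id and station[DOCKS_AVAILABLE] >= 1:
            station[BIKES_AVAILABLE] += 1   # one more bike to rent
            station[DOCKS_AVAILABLE] -= 1   # one dock is now occupied
            returned = True
    return returned


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7002, 'Hypothetical empty station', 43.65, -79.38, 10, 0, 10],
]
rented_ok = rent_bike_sketch(7000, sample)     # True; counts become 19 and 12
rented_empty = rent_bike_sketch(7002, sample)  # False; nothing changes
```

The station with ID 7002 is made up for this example so that the unsuccessful case can be shown.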

redistribute_bikes
(List['Station']) -> int

The parameter represents cleaned stations data.

Modify the stations data so that the percentage of bikes available at each station is as close as possible to the overall percentage of bikes available across all stations.

To get your percentage, you'll need to first calculate the total number of bikes in the system, and the total capacity. Do not round this number.

Once you have computed the target percentage for the whole system, you can compute the target number of bikes for each station. Since this number needs to be a whole number of bikes (we won't have any fractions of bikes), use Python's built-in round.

Then, update the bikes and docks available at the station so that the number of bikes matches the target. When this has been completed for every station, your function should return the difference between the number of bikes rented and the number of bikes returned to complete this modification. For example, if more bikes were returned than were rented, the function should produce a negative number. You can think of this as the number of excess bikes (if positive) or the number of needed bikes (if negative) to achieve the target percentage across the system.

An example: To illustrate this, let's consider our cleaned example list from before. This list contains two stations, one that has 20 bikes available out of a 31 dock capacity, and one that has 5 bikes available out of a 15 dock capacity. We want both of these docking stations to have a percentage available that is as close as possible to the percentage available across all stations. In our example, 20+5 bikes out of 31+15 capacity gives a goal percentage of 54.34782609%.

For each station, based on its capacity, we calculate how close we can get to the goal percentage. With the cleaned example list, we are aiming for 17 bikes in the first station (54.34782609% of 31 is 17, after rounding to a whole number of bikes) and 8 in the other station (54.34782609% of 15 is 8, after rounding to a whole number of bikes). Now, for each station, we rent and/or return enough bikes to reach the target. Because we remove 3 bikes (to get down to 17 from 20) from one station, and add 3 bikes (to get up to 8 from 5) to the other, the function should return 0 in this case.

If you are still not sure what you need to do for this function, try going through the docstring examples carefully on paper to make sure you follow what happens.

Notes:

·         This function must be able to redistribute the bikes no matter how many stations are in the cleaned stations data.

·         You are not moving bikes around between stations as you add or remove them. You can assume there are bikes that can be added to or removed from the system as needed; this is actually what the number you return at the end represents.

·         This function is to return the difference: a positive number if more bikes were rented than returned, 0 if the same number were rented as returned, and a negative number if fewer bikes were rented than returned.

·         This is the most challenging function in the assignment. We suggest you focus on completing the others first before spending too much time on this one.

·         [Update Oct 22] Capacity is determined based on the capacity attribute of each station, and the bikes and docks attributes are the ones that should be modified. You do not need to worry about correcting for any inconsistencies if (bikes + docks) != capacity for a particular station.
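The arithmetic in the worked example can be traced with the sketch below. It is one possible reading of the steps described above, not a reference solution: the index constants are assumed from the column order in The Data, and the sketch assumes the total capacity is greater than zero.

```python
from typing import List

# Assumed index constants; use the ones defined in the starter code.
CAPACITY = 4
BIKES_AVAILABLE = 5
DOCKS_AVAILABLE = 6


def redistribute_bikes_sketch(stations: List[list]) -> int:
    """Move each station to its target bike count and return the number
    of bikes rented minus the number of bikes returned.
    """
    total_bikes = 0
    total_capacity = 0
    for station in stations:
        total_bikes += station[BIKES_AVAILABLE]
        total_capacity += station[CAPACITY]
    # Unrounded system-wide fraction, e.g. 25 / 46 for the example list.
    target_fraction = total_bikes / total_capacity

    net_rented = 0
    for station in stations:
        target = round(target_fraction * station[CAPACITY])
        # Positive change: bikes rented (removed); negative: bikes returned.
        change = station[BIKES_AVAILABLE] - target
        station[BIKES_AVAILABLE] -= change
        station[DOCKS_AVAILABLE] += change
        net_rented += change
    return net_rented


sample = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11],
    [7001, 'Lower Jarvis St SMART / The Esplanade',
     43.647992, -79.370907, 15, 5, 10],
]
difference = redistribute_bikes_sketch(sample)
# difference is 0: 3 bikes rented at station 7000, 3 returned at 7001.
```

After the call, station 7000 has 17 bikes and 14 docks available, and station 7001 has 8 bikes and 7 docks available. One detail worth knowing about Python's built-in round: a value exactly halfway between two integers is rounded to the even one, so round(2.5) is 2 and round(3.5) is 4.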

Using Constants

As in Assignment 1, your code should make use of the provided constants for the indexes in your stations data. This will not only make your code easier to read, but if the columns in the data moved around, your code would still work.

Additional requirements

·         Do not add statements that call print, input, or open, or use an import statement.

·         Do not use any break or continue statements. We are imposing this restriction (and we have not even taught you these statements) because they are very easy to abuse, resulting in terrible code.

·         Do not modify or add to the import statements provided in the starter code.

Testing your Code

We strongly recommend that you test each function as you write it. As usual, follow the Function Design Recipe (we've done the first couple of steps for you). Once you've implemented a function, run it on the examples in the docstring, as well as some other examples that you come up with yourself to convince yourself it works.

Here are a few tips:

·         Be careful that you test the right thing. Some functions return values; others modify the data in-place. Be clear on what the functions are doing before determining whether your tests work.

·         Can you think of any special cases for your functions? Test each function carefully.

·         Once you are happy with the behaviour of a function, move to the next function, implement it, and test it.
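To illustrate the first tip, here is a small sketch of the difference between testing a function that returns a value and testing one that modifies its argument. Both functions are hypothetical and are not part of the assignment.

```python
from typing import List


def get_sum(values: List[int]) -> int:
    """Return the sum of values. The list is not changed."""
    total = 0
    for value in values:
        total += value
    return total


def add_one_to_each(values: List[int]) -> None:
    """Add 1 to every item of values, in place."""
    for i in range(len(values)):
        values[i] += 1


numbers = [1, 2, 3]

# A function that returns a value: test the returned value.
sum_result = get_sum(numbers)          # 6, and numbers is unchanged

# A function that mutates: the returned value is None, so test the
# list itself after the call.
mutate_result = add_one_to_each(numbers)
# mutate_result is None; numbers is now [2, 3, 4]
```

Comparing mutate_result to an expected list would always fail, even when the function works, because the interesting change is in numbers.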

Remember to run the checker!

Marking

These are the aspects of your work that may be marked for A2:

· Coding style (20%):

o    Make sure that you follow Python style guidelines that we have introduced and the Python coding conventions that we have been using throughout the semester. Although we don't provide an exhaustive list of style rules, the checker tests for style are complete, so if your code passes the checker, then it will earn full marks for coding style with one exception: docstrings for any helper functions you add may be evaluated separately. One mark (out of 20) will be deducted for each occurrence of a PyTA error. For example, if a C0301 (line-too-long) error occurs 3 times, then 3 marks will be deducted.

o    All functions you design and write on your own (helper functions) should have complete docstrings, including preconditions when you think they are necessary.

· Correctness (80%):

Your functions should perform as specified. Correctness, as measured by our tests, will count for the largest single portion of your marks. Once your assignment is submitted, we will run additional tests not provided in the checker. Passing the checker does not mean that your code will earn full marks for correctness.

How should you test whether your code works

First, run the checker and review ALL output (you may need to scroll). Remember that the checker ONLY gives you style feedback and confirms that your functions take and return the correct types. Passing the checker does not tell you anything about the correctness of your code.

A2 Checker

We are providing a checker module (a2_checker.py) that tests two things:

·         whether your code follows the Python style guidelines, and

·         whether your functions are named correctly, have the correct number of parameters, and return the correct types.

To run the checker, open a2_checker.py and run it. Note: the checker file should be in the same directory as your bikes.py, as provided in the starter code zip file. When you run your own checker, be sure to scroll up to the top and read all messages.

If the checker passes for both style and types:

·         Your code follows the style guidelines.

·         Your function names, number of parameters, and return types match the assignment specification. This does not mean that your code works correctly in all situations. We will run a different set of tests on your code once you hand it in, so be sure to thoroughly test your code yourself before submitting.

If the checker fails, carefully read the message provided:

·         It may have failed because your code did not follow the style guidelines. Review the error description(s) and fix the code style. Please see the PyTA documentation for more information about errors.

·         It may have failed because:

o    you are missing one or more functions,

o    one or more of your functions is misnamed,

o    one or more of your functions has the incorrect number or type of parameters,

o    one or more of your functions' return types does not match the assignment specification, or

o    your .py file is misnamed or in the wrong place.

Read the error message to identify the problematic function, review the function specification in the handout, and fix your code.

Make sure the checker passes before submitting.

Running the checker program on MarkUs

In addition to running the checker program on your own computer, run the checker on MarkUs as well. You will be able to run the checker program on MarkUs once every 12 hours (note: we may have to revert to every 24 hours if MarkUs has any issues handling every 12 hours). This can help to identify issues such as uploading the incorrect file.

First, submit your work on MarkUs. Next, click on the "Automated Testing" tab and then click on "Run Tests". Wait for a minute or so, then refresh the webpage. Once the tests have finished running, you'll see results for the Style Checker and Type Checker components of the checker program (see both the Automated Testing tab and results files under the Submissions tab). Note that these are not actually marks -- just the checker results. This is the same checker that we have provided to you in the starter code. If there are errors, edit your code, run the checker program again on your own machine to check that the problems are resolved, resubmit your assignment on MarkUs, and (if time permits) after the waiting period between checker runs has elapsed, rerun the checker on MarkUs.

No Remark Requests

No remark requests will be accepted. A syntax error could result in a grade of 0 on the assignment. Before the deadline, you are responsible for running your code and the checker program to identify and resolve any errors that will prevent our tests from running.

What to Hand In

The very last thing you do before submitting should be to run the checker program one last time.

Otherwise, you could make a small error in your final changes before submitting that causes your code to receive zero for correctness.

Submit bikes.py on MarkUs by following the instructions on the course website. Remember that spelling of filenames, including case, counts: your file must be named exactly as above.