The primary purpose of this assignment is to work with invariants and asserts. Secondary purposes include continuing to work with Python data structures and working on good programming style.

Background

This problem uses the same datasets as the biodiversity problem, but does a different data analysis: it examines the distribution of species across the different national parks and prints out the most widely-distributed species (in this case, interpreted as "occurring in the largest number of parks").

Expected Behavior

Write a program, in a file abundance.py, that behaves as follows.
  1. Use input() (without any arguments) to read in the name of a file sinfo ("species information"). Read the file sinfo and for each line in this file, use the Scientific Name field (see Input below) to count the number of national parks that species appears in. In order to avoid any issues that may arise from upper/lower case entries for the same species in different parks, you should process species names in a case-insensitive manner.
  2. Print out (in any order) the species that are found in the largest number of parks (see Output below). Note that there may be more than one such species.

Data Structures

Use a dictionary.
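
As a rough illustration of the idea, the sketch below keeps a dictionary that maps each lower-cased species name to the number of lines it appears on. The function name count_parks is hypothetical, and the sketch assumes that a species is listed at most once per park, so that counting lines is the same as counting parks.

```python
def count_parks(names):
    """Map each species name (lower-cased) to how many times it occurs."""
    counts = {}
    for name in names:
        key = name.lower()  # "Canis Lupus" and "canis lupus" share one entry
        counts[key] = counts.get(key, 0) + 1
    return counts


print(count_parks(["Canis lupus", "CANIS LUPUS", "Ursus arctos"]))
```

Using counts.get(key, 0) avoids a separate test for whether the species has been seen before.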

Input

The species information file is a CSV file. A line that begins with the character '#' is a comment and should be discarded. The first line of the file is a comment line that gives information about the columns in the file.

The species information file has the following format, with each line containing information about one species at one park:

Park Name, Category, Scientific name, Common names, Occurrence, Nativeness, Abundance, Seasonality, Conservation status, (empty)

Of this information, we will only use the Scientific Name field in this assignment. An example of a species information file is given here.
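
One possible way to pull the scientific name out of a single line is sketched below. It assumes the scientific name is the third comma-separated field (as the column list suggests) and that no earlier field contains a comma of its own. The names scientific_name and SCIENTIFIC_NAME_COLUMN, the handling of blank lines, and the sample line in the usage are illustrative assumptions, not part of the assignment.

```python
SCIENTIFIC_NAME_COLUMN = 2  # assumption: third field, per the column list


def scientific_name(line):
    """Return the lower-cased scientific name, or None for a comment or blank line."""
    assert isinstance(line, str)
    line = line.strip()
    if line == "" or line[0] == "#":
        return None  # comments (and, by assumption, blank lines) are discarded
    fields = line.split(",")
    assert len(fields) > SCIENTIFIC_NAME_COLUMN, "too few fields: " + line
    return fields[SCIENTIFIC_NAME_COLUMN].strip().lower()


# Made-up sample line, for illustration only.
print(scientific_name("Acadia National Park,Mammal,Alces alces,Moose\n"))
```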

Output

Print out information for all of the species that occur in the largest number of parks using the following:
print("{} -- {:d} parks".format(species_name, number_of_parks))

The species may be printed out in any order.
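
To see what the format string produces, here is a small sketch that wraps it in a hypothetical helper, format_result. The species name and park count in the usage are made up. Note that {:d} only accepts integers, so the count must be an int rather than a float.

```python
def format_result(species_name, number_of_parks):
    """Build one output line in the required format."""
    assert isinstance(species_name, str)
    assert isinstance(number_of_parks, int) and number_of_parks > 0
    return "{} -- {:d} parks".format(species_name, number_of_parks)


print(format_result("alces alces", 3))  # prints: alces alces -- 3 parks
```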

Assertions

Your program should use assert statements to check the following (however, see below for replacements for asserts in situations where asserts are difficult to state).
  • For each function, any pre-conditions on the arguments for that function. The assert should be placed at or very soon after the beginning of the function.
  • For each loop that either (i) computes a value; or (ii) transforms some data, at least one assert that holds in the body of the loop. You can choose what the invariant is, but it must be something that:
    • reflects the computation of the loop; and
    • is not simply a statement of the iteration condition (or some expression whose value follows directly from the iteration condition).

Asserts are not necessary for loops that neither compute values nor transform data, e.g., loops that simply read in data or print out results.
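
The following sketch shows one way these two kinds of asserts might look for a loop that computes a value. The function largest_count is a hypothetical helper, and the invariant chosen is only one of several that would satisfy the requirements above.

```python
def largest_count(counts):
    """Return the largest value stored in a non-empty dictionary of counts."""
    # Pre-conditions on the argument, stated at the top of the function.
    assert isinstance(counts, dict) and len(counts) > 0
    assert all(isinstance(n, int) and n > 0 for n in counts.values())

    largest = 0
    for name in counts:
        if counts[name] > largest:
            largest = counts[name]
        # Invariant: largest is a value actually stored in counts and is
        # no smaller than the count just examined.
        assert largest in counts.values() and largest >= counts[name]
    return largest


print(largest_count({"alces alces": 2, "ursus arctos": 5}))
```

The invariant says something about the running maximum itself, so it reflects the computation rather than restating the iteration condition.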

This level of assertion-checking is almost certainly overkill, but we'll do this for a little while in order to get more experience with pre-conditions and loop invariants and to practise working with assert statements.

Try to make your asserts as precise and specific as you can. This document shows a simple way to check types of Python variables and values.

Replacements for asserts

In some situations, it may be difficult or impossible to write an assert that captures what you want to capture. In such situations, in place of an assert you can write a comment giving the invariant or assumption you want to state. Such a comment should be written as follows:
### INVARIANT: ...your invariant in English and/or Python
or
### ASSUMPTION: ...your assumption in English and/or Python
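
As an illustration, the sketch below uses both comment forms in a hypothetical function, most_widespread. The assumption about largest could only be checked by repeating the caller's computation, and the invariant speaks about "the species examined so far", which the loop does not track, so both are written as comments here.

```python
def most_widespread(counts, largest):
    """Return the species whose count equals largest."""
    assert isinstance(counts, dict) and isinstance(largest, int)
    ### ASSUMPTION: largest is the maximum of counts.values(); the caller
    ### computed it, and re-checking here would repeat that computation
    result = []
    for name in counts:
        if counts[name] == largest:
            result.append(name)
        ### INVARIANT: result contains every species examined so far whose
        ### count equals largest, and no other species
    return result


print(most_widespread({"alces alces": 2, "ursus arctos": 5}, 5))
```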

Programming Requirements

  1. Input files:
    • Read the files yourself. Don't import Python's csv library.
    • Make sure you close the file when you have finished reading it.
    • Do not read any file more than once.
  2. Make sure that the data structures in your code satisfy the requirements for the assignment (see Data Structures above).
  3. Make sure you use asserts to check function preconditions and at least one invariant within each loop (see Assertions above).
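
A minimal sketch of the input-file requirements, assuming a hypothetical helper named read_lines: it opens the file without the csv library, reads it exactly once, and closes it before returning.

```python
def read_lines(filename):
    """Read the whole file once and return its lines as a list of strings."""
    assert isinstance(filename, str) and filename != ""
    infile = open(filename)
    lines = infile.readlines()  # the only read of this file
    infile.close()              # close as soon as reading is done
    return lines
```

A typical call would be read_lines(input()), with all later processing working on the returned list rather than reopening the file.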

Examples

Several examples of this data analysis are given here.