

CS 22A: Introduction to Python

Fall 2021

Homework Three: Chapters 5 & 6


Please submit, via Canvas, your solutions to the following problems.


Problems 1 to 5:

The text file data.csv contains data for a number of strains. Each line contains the following fields for a single strain, in this order:

a) id

b) domain

c) genus

d) species

e) strain

f) sequence

The fields are separated by commas (hence the file extension: csv stands for Comma Separated Values). Think of it as a representation of a table in a spreadsheet: each line is a row, and each field in a line is a column, identified by its index. Problems 1 to 5 use the data read from this file.
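As an illustration of the row/column idea, here is a minimal sketch that splits one line into its six fields. The sample line and the helper name split_fields are made up for this example; the real values in data.csv will differ.

```python
# Hypothetical sample line; the actual contents of data.csv are not shown here.
sample_line = "S001,Bacteria,Escherichia,coli,K-12,ATGCGCTA\n"

# Column indices, following the field order listed in the assignment.
ID, DOMAIN, GENUS, SPECIES, STRAIN, SEQUENCE = range(6)


def split_fields(line):
    """Strip the trailing newline and split a line on commas."""
    return line.rstrip("\n").split(",")


fields = split_fields(sample_line)
print(fields[DOMAIN])    # Bacteria
print(fields[SEQUENCE])  # ATGCGCTA
```

When reading the real file, the same call can be applied to each line inside a loop such as `for line in open("data.csv"):`.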


This is a multipart exercise which involves extracting and printing data from the file. The nature of this type of problem means that it is quite easy to get a program that runs without errors, but does not quite produce the correct output. So, be sure to check your solutions manually!

Do not forget to use the Python Coding Style conventions.


Problem One:

Write a program that prints the GC content of each strain in the domain Bacteria only. Name your program Bacteria_gc_content.py.
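The assignment does not define GC content, so the sketch below assumes the usual definition: the number of G and C bases divided by the sequence length, expressed as a percentage. The function name gc_content and the handling of empty or lowercase sequences are choices made for this example.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA sequence."""
    sequence = sequence.upper()  # assumption: treat lowercase bases the same
    if not sequence:
        return 0.0  # assumption: report 0 for an empty sequence
    gc_count = sequence.count("G") + sequence.count("C")
    return 100 * gc_count / len(sequence)


print(gc_content("ATGC"))  # 50.0
```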


Problem Two:

Print out the strain names for strains that have GC content greater than 40% and less than 60%.

Name your program gc_content.py.
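A range test like this one can be written as a single chained comparison in Python. The sketch below assumes the GC content has already been computed as a percentage; the function name in_mid_range is hypothetical.

```python
def in_mid_range(gc_percent):
    """True when GC content is strictly between 40% and 60%."""
    # Equivalent to: gc_percent > 40 and gc_percent < 60
    return 40 < gc_percent < 60


print(in_mid_range(50.0))  # True
print(in_mid_range(40.0))  # False, because the bounds are exclusive
```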


Problem Three: Complex Condition

Write a program that prints the id of each strain whose name starts with “Fusobacterium” or “Brevitalea” and whose GC content is greater than 50%.

Name your program complex_condition.py.
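Because `and` binds more tightly than `or` in Python, a condition of this shape needs parentheses around the `or` part. The sketch below assumes that the two names are compared against the genus field and that the GC content is already available as a percentage; the function name is_match is hypothetical.

```python
def is_match(genus, gc_percent):
    """True for the two target genera when GC content exceeds 50%."""
    # Parentheses make the 'or' evaluate before the 'and'.
    return (genus == "Fusobacterium" or genus == "Brevitalea") and gc_percent > 50


print(is_match("Fusobacterium", 55.0))  # True
print(is_match("Fusobacterium", 30.0))  # False
print(is_match("Escherichia", 70.0))    # False
```

Without the parentheses, the second call would return True, because the GC test would then apply only to “Brevitalea”.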


Problem Four:

For each strain in the file, print a message stating whether the sequence has high GC content (greater than 60%), medium GC content (between 45% and 60%), or low GC content (less than 45%).

Name your program high_medium_low.py.
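A three-way decision like this maps naturally onto an if/elif/else chain. The sketch below assumes that values of exactly 45% and 60% count as medium, which the problem statement leaves open; the function name classify_gc is hypothetical.

```python
def classify_gc(gc_percent):
    """Label a GC percentage as 'high', 'medium', or 'low'."""
    if gc_percent > 60:
        return "high"
    elif gc_percent >= 45:  # assumption: 45 and 60 themselves are medium
        return "medium"
    else:
        return "low"


for value in (72.5, 50.0, 30.0):
    print(value, "->", classify_gc(value), "GC content")
```

Each branch is only reached when the ones above it were false, so the middle test does not need to repeat the upper bound.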