
CS 230 Final Project

Fall 2023

Due on December 15, 5:00 PM

Interactive Data-Explorer:  Tell a Story with Real-World Data

In this last project you will develop an interactive, data-driven, web-based Python application that tells a story with real-world data. You will show your mastery of many coding concepts as you interact with real-world data. You will use Pandas for managing and interacting with data, Matplotlib (or other charting packages) for creating charts and graphs, PyDeck (or other mapping packages) for maps, and the Streamlit.io package for creating interactive web applications using Python.

The datasets we will be using this semester all come from Analyze Boston (http://data.boston.gov), the city of Boston’s open data hub.

Name | Description
Boston Blue Bikes | Where do Blue Bikes riders ride? When do they ride? How far do they go? Which stations are most popular? On what days of the week are most rides taken? Two files are included: Trip History from Q1 2015 and a list of stations.
Trash Schedule by Address | When is trash collected in Boston's neighborhoods?
Boston Crime 2023 | Where shouldn’t you hang out in Boston at night? Crime incident reports are provided by Boston Police Department (BPD) to document the initial details surrounding an incident to which BPD officers respond. This data set includes the type of incident as well as when and where it occurred.
Cannabis Registry | Where to go for cannabis? This Open Data Registry includes currently licensed applicants as well as pending cannabis license applicants.
Boston Parking Meters | Looking for a place to park in Boston? This data set shows where to park in downtown Boston on each block, and hours of meter operation.
Big Belly Trash Alerts and Locations | Big Belly trash receptacles are solar powered, internet connected, compacting trash receptacles that can collect up to five times as much waste as traditional bins and help the city more efficiently manage the waste collection process. This is a legacy dataset containing all signals received from the trash receptacles for the calendar year 2014.
Boston Building Violations | Which buildings in Boston are unsafe to enter? This data set contains violations on Boston buildings or properties issued by inspectors from the Building and Structures Division of the Inspectional Services Department.

The links for each data set provide background information and sources for the data. Please read them, as they often contain data dictionaries describing the fields or columns in each data table. You should get the CSV data files here, rather than from the Analyze Boston website, as the data files provided for this project are in some cases sampled or cleaned versions of the original data files. (If there’s a file that ends with 7000_sample, use that one. It means the original file from Analyze Boston is much larger, and a random sample of 7000 records is provided.)

You will be assigned to a specific dataset. Find your assigned dataset here. Failure to use the assigned dataset will result in a zero for your final project.

Project Description

Part 1.   Design

The purpose of the design phase is to start thinking about what you might do before you jump into coding. Identify at least three different queries or questions you can ask about your data set. Try to phrase your questions so that    they can have a parameter which can come from user input.

For example: (and these queries don’t match the data sets you are given but are here to inspire you!)

•    What’s the cost of the most expensive <house_type> in <city>?

•    Find all of the apartments in <city> that rent for under <amount>.

Then think about the interactive widgets from Streamlit that you can bring to your application to obtain user input. For example:  you can use a numeric slider to have the user enter a monthly rental amount.
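As a rough sketch of how a parameterized query might look in code, the function below answers the second example question with pandas. The DataFrame, its column names (city, address, rent), and the function name apartments_under are all made up for illustration and do not correspond to any of the project data sets. In a real application the two arguments would come from Streamlit widgets, such as a drop-down for the city and a numeric slider for the rent.

```python
import pandas as pd

# Hypothetical rental listings, invented for illustration only.
listings = pd.DataFrame({
    "city": ["Boston", "Boston", "Cambridge", "Boston"],
    "address": ["1 Main St", "22 Elm St", "5 Oak Ave", "9 Pine Rd"],
    "rent": [2400, 1800, 2100, 1500],
})


def apartments_under(df, city, max_rent):
    """Return listings in `city` renting for less than `max_rent`, cheapest first."""
    matches = df[(df["city"] == city) & (df["rent"] < max_rent)]
    return matches.sort_values("rent")


# In a Streamlit app, "Boston" and 2000 would be replaced by widget values.
result = apartments_under(listings, "Boston", 2000)
print(result)
```

Writing each query as a function of its parameters keeps the user-input code separate from the data logic, which also makes the query easier to check on its own.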

Next, describe how you will visually present the data or query results using charts, graphs, tables, or maps.

Be sure your web pages and visualizations are "user friendly" and as "polished" as possible. Label controls requiring user interaction, and make sure your charts have titles, legends, or explanations that would be helpful to the user. Think about how the user will navigate from one part of your site to another.

Feel free to add to your project as you explore Pandas and Streamlit capabilities and find cool ways to implement  new or additional features. Part of your grade will be a "complexity/originality" score.   If you use a module or do something cool that we may not have discussed in class or implement more than the minimum requirements, you will receive a higher complexity score.

A complexity score of 1 means you implemented the minimum requirements for this project. A complexity score of zero means you didn’t meet the requirements.

Part 2.  Coding

Create your Python application with a Streamlit UI and several visualizations.

Create charts and graphs of different types (with custom legends, axis labels, tick marks, colors, and other features), and at least one map showing locations and data based on latitude and longitude. Your charts should tell a story, so be sure elements are labelled appropriately, and add any narrative that will help the reader understand your visualizations, cue the reader about which values to specify, and explain the purpose of each chart or graph. You may wish to add a few sentences explaining each chart as a paragraph of text on the screen.

You might also use pandas to create a summary report based on the data itself (max/min values, relationships between columns, etc.).

See the documentation for how to use different Streamlit features. You might make use of sidebars to place your widgets, multi-page applications, or caching to improve performance.

Read the documentation for PyDeck Maps. Our examples of maps in class used PyDeck’s ScatterplotLayer or IconLayer, but PyDeck supports several other layer types, such as TextLayer and HeatmapLayer. Have a look.

To explore another chart library, consider Seaborn charts, which have additional chart types and customization options. You might also look at Folium maps (here’s a simple tutorial) if you’d like to play with a different mapping library.

If your project contains more than one file (i.e., one or more Python code files and images), create a zip folder containing all of your project files and submit it. You do not have to submit the data file that you used.

Part 3.  In-Class Presentation (December 15, 3:00 - 5:00 PM)

You will have five minutes to present your project. Give an overview of your project’s capabilities.   Demonstrate    what you feel is the most interesting part of your project, and then show how you implemented the cool stuff in    your program.  Describe how you used various coding features and pandas queries.  Then talk through the pandas and Streamlit code well enough to convince me that you understand how your code works and what you did.

The presentation is mandatory. Failure to show up and present your project will result in a zero for your final project grade.

Requirements

As you write your program, be sure to include code that demonstrates each of these items. Each contributes to your project grade (see the rubric below).

Python Features:

•    A function with two or more parameters, one of which has a default value

•    A function that returns more than one value

•    A list comprehension

•    A loop that iterates through items in a list, dictionary, or data frame

•    At least two different methods of lists, dictionaries, or tuples.
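The short sketch below shows one way these five features might appear together. It is only an illustration: the function summarize, the trips dictionary, and its values are hypothetical and are not taken from any of the project data sets.

```python
def summarize(values, label="value", digits=1):
    """Return the smallest value, the largest value, and a summary string."""
    low, high = min(values), max(values)
    text = f"{label}: {round(low, digits)} to {round(high, digits)}"
    return low, high, text  # more than one value, returned as a tuple


# Hypothetical trip durations in minutes, keyed by day of the week.
trips = {"Mon": [12.5, 30.0], "Tue": [8.25], "Sat": [45.0, 60.5, 5.0]}

# Loop over the items of a dictionary (dict method: items)
longest_by_day = {}
for day, durations in trips.items():
    low, high, text = summarize(durations, label=day)  # default digits is used
    longest_by_day[day] = high

# List comprehension: days whose longest trip exceeded 30 minutes
busy_days = [day for day, longest in longest_by_day.items() if longest > 30]

# Two more methods: dict.get with a fallback value, and list.sort
sunday_longest = longest_by_day.get("Sun", 0)
busy_days.sort()
```

In your own code, add a comment next to each feature so it is easy to find when the project is graded.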

Streamlit Features:

•    At least three Streamlit widgets (sliders, drop downs, multi-selects, text box, etc.)

•     Page design features (sidebar, fonts, colors, images, navigation)

Visualizations:

•    At least three different charts with titles, colors, labels, legends, as appropriate

•     At least one detailed map (st.map will only get you partial credit) – for full credit, include dots, icons, text that appears when hovering over a marker, or other map features
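As one possible illustration of a chart with a title, colors, axis labels, a legend, and grid lines, here is a minimal Matplotlib sketch. The ride counts are invented for the example and do not come from the project data. In a Streamlit app, the finished figure would typically be displayed with st.pyplot(fig).

```python
import matplotlib.pyplot as plt

# Hypothetical ride counts per weekday, invented for illustration only.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
members = [120, 135, 150, 140, 160]
casual = [30, 25, 40, 35, 80]

fig, ax = plt.subplots(figsize=(6, 4))

# Stacked bars: casual riders are drawn on top of members
ax.bar(days, members, color="steelblue", label="Members")
ax.bar(days, casual, bottom=members, color="orange", label="Casual riders")

# Customization: title, axis labels, horizontal grid lines, legend
ax.set_title("Rides per Weekday")
ax.set_xlabel("Day of week")
ax.set_ylabel("Number of rides")
ax.grid(axis="y", linestyle=":", alpha=0.5)
ax.legend()
```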

Data Analytics Capabilities:

•    Sorting data in ascending or descending order, by one or more columns

•     Filtering data by one condition

•     Filtering data by two or more conditions with AND or OR

•    Analyzing data with pivot tables

•    Add/drop/select/create new/group columns, frequency count, other features

•    Text analysis based on word frequencies, etc.
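The sketch below walks through these capabilities on a tiny, made-up table. The DataFrame and its columns (neighborhood, category, days_open, note) are assumptions for illustration only. Your own queries will depend on the columns in your assigned data set.

```python
from collections import Counter

import pandas as pd

# Hypothetical service requests, invented for illustration only.
df = pd.DataFrame({
    "neighborhood": ["Back Bay", "Fenway", "Back Bay", "Roxbury", "Fenway"],
    "category": ["Noise", "Trash", "Trash", "Noise", "Noise"],
    "days_open": [3, 10, 1, 7, 4],
    "note": ["loud party", "missed trash pickup", "trash overflow",
             "loud music", "loud construction"],
})

# Sort by two columns: category ascending, then days_open descending
ordered = df.sort_values(["category", "days_open"], ascending=[True, False])

# Filter by one condition
noise = df[df["category"] == "Noise"]

# Filter by two conditions combined with AND (&); use | for OR
slow_noise = df[(df["category"] == "Noise") & (df["days_open"] > 3)]

# Create a new column from an existing one
df["weeks_open"] = df["days_open"] / 7

# Pivot table: number of requests per neighborhood and category
pivot = df.pivot_table(index="neighborhood", columns="category",
                       values="days_open", aggfunc="count", fill_value=0)

# Frequency count of a single column
per_category = df["category"].value_counts()

# Simple text analysis: the two most common words in the note column
words = " ".join(df["note"]).lower().split()
top_words = Counter(words).most_common(2)
```

Note that each condition in a combined filter needs its own parentheses, because & and | bind more tightly than comparison operators such as == and >.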

Usual rules about writing "good" code apply:

•     Make your code as modular and easy to follow as possible.

•     Include a docstring, comments, and meaningful variable names.

•    If you did something "cool" in your code that you are incredibly proud of, please write a comment calling attention to what you did.

•     If you referred to any online articles or other information beyond class examples, please be sure to list them as references / comments in your code.

•     Make sure the program runs and the output is correct.

Documentation String

Use this documentation string at the top of your Python code file:

"""
Name: Your Name
CS230: Section XXX
Data: Which data set you used
Description: This program ... (a few sentences about your program and the queries and charts)
"""

Grading

The Final Project will be worth 16% of your course grade. It is based on 60 points, as follows:

Requirement | Points
Project: Proposal, Design and Queries submitted on time | 2
Python Coding Features (at least 4 @ 2.5 points) | 10
Code quality | 3
Streamlit Features (3 controls and other page design features) | 8
Visualizations (3 different charts and one map, 4 points each) | 16
•    2 points for displaying the data correctly
•    2 points for customization (colors, grid lines, legend, etc.)
Data Analytics Capabilities (at least 4 – sort, filter, etc.) | 12
Presentation | 6
Complexity | 3
0 = Your project implements less than the minimum requirements
1 = Your project meets the minimum requirements
2 = Your project includes some complex queries, charts, or UI features, or adds a small number of extra features beyond those which are required
3 = Wow! You went above and beyond the requirements, either by doing more than what is required, or by including features, modules, or packages learned independently or not described in class
Total | 60

Submission

Submit your Python file (or zip file containing multiple files) on the BS by 5:00 PM on December 15. Also submit your data files.

Getting Help and Academic Integrity:

•     Please do not discuss your program with anyone other than your instructor.

•    You can ask CIS Sandbox tutors for assistance on related or general topics, but you cannot ask them to help you write your code for this project. For example, you can ask tutors to help review examples of how to create bar charts in Python (in general), but you cannot ask them to help you debug a bar chart you might create using the data set for this project. You can ask for help with fixing syntax or runtime errors.

•    You are prohibited from seeking help from anyone other than your instructor or CIS Sandbox tutors.

•    You are prohibited from using ChatGPT or any other AI tools to do any part of your project.

•    Any violation of these policies will result in a zero for the project at minimum or even a grade of ‘F’ for the course.