CSE115 Project - Part 1
Processing Data from real-world data sets
Submit a ZIP file of your repl in AutoLab as Project Part 1 by Thurs. Oct. 20 at 5:00 PM. This submission is worth 4 out of the project's 60 total points.
NOTE: You have a limited number (5) of AutoLab submissions. Use them wisely. Your LAST submission counts.
Language: Python
Topics: lists, dictionaries, loops, conditionals (lectures up to and including Day 17 - Oct. 8)
Overview
By the time you complete Project 1, you will have developed a web application which visualizes a data set the instructor selected from among the datasets published on the city of Buffalo’s website.
Later parts of the project will have you writing code that downloads data from the city of Buffalo’s website, makes that data usable within your app, and eventually builds a web frontend to visualize the data. But breaking up the project into smaller pieces makes writing and testing each piece easier and completing this project more manageable. For this first step, you will write the functions which process a dataset to generate the numbers we will need.
The final project will use real data and so "correct" results will have to change with the release of each day's data. These daily updates are critical for the professionals relying on these data, but complicate the steps needed to check that our code works properly. Following common practice, you can use our sample input (found as an assignment statement in the Sample Data For Testing section below) to begin testing your code. This code declares a variable named data and assigns to it a list of dictionaries.
Remember that the majority of the project points are earned by your final project submission. So it is important to complete these functions, even if you miss the Part 2 submission deadline.
Functions to be Graded
These specify the 4 functions AutoLab will evaluate for Part 1. While you are not required to write any additional functions, we are happy for you to do so if it helps you complete this work.
gen_dictionary
Define a function named gen_dictionary with two parameters. The first parameter will be a list of dictionaries (the data). The second parameter will be a string (a key). gen_dictionary should use the accumulator pattern to create and return a new dictionary. The initial value for your accumulator variable should be an empty dictionary (e.g., {}). For the accumulator pattern's update step, you will first need to check if the loop variable has the second parameter as a key. If the loop variable has the second parameter as a key, your function should assign a variable, v, to that key's value in the loop variable. The function should then update the accumulator variable to include v as a key and associate it with a value of 0. You SHOULD NOT assume that all dictionaries in the first parameter will have the second parameter as a key.
Sample function call:
gen_dictionary(data, 'hour_of_day')
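The description above can be sketched as follows. This is one possible reading, not a reference solution, and the three-entry sample_entries list is hypothetical toy data rather than the course data set.

```python
def gen_dictionary(lod, key):
    """Map every value found under `key` to a starting count of 0."""
    result = {}  # accumulator: starts as an empty dictionary
    for entry in lod:
        # Skip dictionaries that do not contain the key at all.
        if key in entry:
            v = entry[key]
            result[v] = 0
    return result


# Hypothetical toy data, for illustration only.
sample_entries = [{"hour_of_day": "8"}, {"city": "Buffalo"}, {"hour_of_day": "9"}]
print(gen_dictionary(sample_entries, "hour_of_day"))  # {'8': 0, '9': 0}
```

The second toy entry has no hour_of_day key, so it contributes nothing to the result.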
total_matches
Define a function named total_matches with three parameters:
● the 1st parameter, called lod in this description, is a list of dictionaries (the data);
● the 2nd parameter, called k in this description, is a string (a key);
● the 3rd parameter, called v in this description, is a value (a value);
total_matches should use the accumulator pattern to calculate and return a float. It should initialize the accumulator variable to 0. The accumulator loop should iterate over the dictionaries in lod. Inside the loop, get the value that the current dictionary associates with the key k. If that value is equal to v, increase the accumulator variable's value by one. When testing, you SHOULD make sure that the v you use matches the data type of the corresponding values in the data set.
Sample function call:
total_matches(data, 'hour_of_day', '11')
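A minimal sketch of this counting loop, under two assumptions that the description does not state: entries lacking the key k are skipped rather than raising an error, and the count is converted with float() just before it is returned. The sample_rows list is hypothetical toy data.

```python
def total_matches(lod, k, v):
    """Count the dictionaries in `lod` whose value under `k` equals `v`."""
    count = 0  # accumulator
    for entry in lod:
        # Assumption: an entry without the key simply does not match.
        if k in entry and entry[k] == v:
            count += 1
    # Assumption: convert at the end so the result is a float.
    return float(count)


# Hypothetical toy data, for illustration only.
sample_rows = [
    {"hour_of_day": "8"},
    {"hour_of_day": "8"},
    {"hour_of_day": "9"},
    {"city": "Buffalo"},
]
print(total_matches(sample_rows, "hour_of_day", "8"))  # 2.0
print(total_matches(sample_rows, "hour_of_day", 8))    # 0.0
```

The second call returns 0.0 because the integer 8 is not equal to the string "8", which is why the data type of v matters when testing.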
total_matches_specific
Define a function named total_matches_specific with five parameters:
● the 1st parameter, called lod in this description, is a list of dictionaries (the data);
● the 2nd parameter, called k in this description, is a string (a key);
● the 3rd parameter, called v in this description, is a value (a value);
● the 4th parameter, called k2 in this description, is a string (a key);
● the 5th parameter, called v2 in this description, is a value (a value);
total_matches_specific should use the accumulator pattern to calculate and return a float. It should initialize the accumulator variable to 0. The accumulator loop should iterate over the dictionaries in lod. Inside the loop, get the value that the current dictionary associates with the key k. If that value is equal to v, and the value that dictionary associates with the key k2 is equal to v2, increase the accumulator variable's value by one. You SHOULD assume that k and k2 are keys in each of lod's entries. When testing, you SHOULD make sure that the v and v2 you use match the data types of the corresponding values in the data set.
Sample function call:
total_matches_specific(data, 'census_tract_2010', '31', 'hour_of_day', '11')
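The two-condition version can be sketched the same way. This is an illustrative reading rather than a reference solution: it relies on the stated assumption that k and k2 are present in every entry, it converts the count with float() at the end (an assumption), and tract_rows is hypothetical toy data.

```python
def total_matches_specific(lod, k, v, k2, v2):
    """Count the entries where `k` maps to `v` and `k2` maps to `v2`."""
    count = 0  # accumulator
    for entry in lod:
        # Both keys are assumed to exist in every entry, so plain
        # indexing is used; a missing key would raise a KeyError.
        if entry[k] == v and entry[k2] == v2:
            count += 1
    # Assumption: convert at the end so the result is a float.
    return float(count)


# Hypothetical toy data, for illustration only.
tract_rows = [
    {"census_tract_2010": "31", "hour_of_day": "11"},
    {"census_tract_2010": "31", "hour_of_day": "8"},
    {"census_tract_2010": "59", "hour_of_day": "11"},
]
print(total_matches_specific(tract_rows, "census_tract_2010", "31", "hour_of_day", "11"))  # 1.0
```

Only the first toy entry satisfies both conditions, so the result is 1.0.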
remove_min
Define a function named remove_min with two parameters. The first parameter will be a dictionary (the data). The second parameter will be an integer (the minimum value threshold). remove_min must create a new dictionary and copy into it every entry from the data whose value is above the minimum value threshold. You SHOULD assume that all dictionary entries have an integer as their value.
Sample function call:
remove_min(data, 20)
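A minimal sketch of this filter, assuming "above" means strictly greater than the threshold, so a value equal to the threshold is dropped. The hour_counts dictionary is hypothetical toy data.

```python
def remove_min(d, threshold):
    """Return a new dictionary holding only the entries above `threshold`."""
    result = {}  # new dictionary; the input is left unchanged
    for key, value in d.items():
        # Assumption: "above" is read as strictly greater than.
        if value > threshold:
            result[key] = value
    return result


# Hypothetical toy data, for illustration only.
hour_counts = {"8": 25, "9": 20, "11": 3}
print(remove_min(hour_counts, 20))  # {'8': 25}
```

Note that the first argument here is a single dictionary of integer values, not the list of dictionaries used by the other three functions.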
Sample Data For Testing
data = [
{"case_number":"22-1710216","incident_datetime":"2022-06-20T08:10:06.000","incident_type_primary":"LARCENY/THEFT","incident_description":"Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.","parent_incident_type":"Theft","hour_of_day":"8","day_of_week":"Monday","address_1":"100 Block N PARADE AV","city":"Buffalo","state":"NY","created_at":"2022-06-20T08:10:06.000"},
{"case_number":"22-1790230","incident_datetime":"2022-06-27T14:40:00.000","incident_type_primary":"LARCENY/THEFT","incident_description":"Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.","parent_incident_type":"Theft","hour_of_day":"8","day_of_week":"Tuesday","address_1":"300 Block DEARBORN ST","city":"Buffalo","state":"NY","location":{"type":"Point","coordinates":[-78.903,42.939]},"latitude":"42.939","longitude":"-78.903","created_at":"2022-06-28T08:49:00.000","census_tract_2010":"59","census_block_group_2010":"3","census_block_2010":"3007","census_tract":"59","census_block":"3005","census_block_group":"3","neighborhood_1":"Black Rock","police_district":"District D","council_district":"NORTH","tractce20":"005900","geoid20_tract":"36029005900","geoid20_blockgroup":"360290059003","geoid20_block":"360290059003005",":@computed_region_kwzn_pe6v":"10",":@computed_region_eziv_p4ck":"72",":@computed_region_uh5x_q5mi":"50",":@computed_region_dwzh_dtk5":"147",":@computed_region_xbxg_7ifr":"9",":@computed_region_tmcg_v66k":"1",":@computed_region_fk4y_hpmh":"5",":@computed_region_ff6v_jbaa":"88",":@computed_region_h7a8_iwt4":"10",":@computed_region_vsen_jbmg":"2"},
{"case_number":"22-1800242","incident_datetime":"2022-06-28T23:30:00.000","incident_type_primary":"ASSAULT","incident_description":"Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.","parent_incident_type":"Assault","hour_of_day":"9","day_of_week":"Wednesday","address_1":"19TH ST & DELAWARE AV","city":"Buffalo","state":"NY","created_at":"2022-06-29T09:50:00.000"}
]
2022-10-29