
CSE115 Project - Part 1

Processing Data from real-world data sets

Submit a ZIP file of your repl in AutoLab as Project Part 1 by Thurs. Oct. 20 at 5:00 PM. This submission is worth 4 out of the project's 60 total points.

NOTE: You have a limited number (5) of AutoLab submissions. Use them wisely. Only your LAST submission counts.

Language: Python

Topics: lists, dictionaries, loops, conditionals (lectures up to and including Day 17 - Oct. 8)

Overview

By the time you complete Project 1, you will have developed a web application that visualizes a dataset the instructor selected from among those published on the City of Buffalo's website.

Later parts of the project will have you write code that downloads data from the City of Buffalo's website, makes that data usable within your app, and eventually builds a web frontend to visualize the data. Breaking the project into smaller pieces makes each piece easier to write and test, and makes the whole project more manageable. For this first step, you will write the functions that process a dataset to generate the numbers we will need.

The final project will use real data, so the "correct" results will change with the release of each day's data. These daily updates are critical for the professionals relying on these data, but they complicate the steps needed to check that our code works properly. Following common practice, you can use our sample input (found as an assignment statement in the Sample Data For Testing section below) to begin testing your code. That code declares a variable named data and assigns to it a list of dictionaries.

Remember that the majority of the project points are earned by your final project submission. So it is important to complete these functions, even if you miss the Part 2 submission deadline.

Functions to be Graded

This section specifies the 4 functions AutoLab will evaluate for Part 1. You are not required to write any additional functions, but you are welcome to do so if it helps you complete this work.

gen_dictionary

Define a function named gen_dictionary with two parameters. The first parameter will be a list of dictionaries (the data). The second parameter will be a string (a key). gen_dictionary should use the accumulator pattern to create and return a new dictionary. The initial value for your accumulator variable should be an empty dictionary (e.g., {}). For the accumulator pattern's update step, first check whether the loop variable (the current dictionary) has the second parameter as a key. If it does, your function should assign that key's value in the loop variable to a variable, v. The function should then update the accumulator variable to include v as a key associated with a value of 0. You SHOULD NOT assume that all dictionaries in the first parameter will have the second parameter as a key.

Sample function call:

gen_dictionary(data, 'hour_of_day')
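The steps described for gen_dictionary can be sketched as follows. This is only one possible reading of the description, not the graded solution. The parameter names lod and key, the accumulator name acc, and the small sample list are assumptions made for illustration.

```python
def gen_dictionary(lod, key):
    """Sketch: collect every distinct value stored under `key`, each mapped to 0."""
    acc = {}  # accumulator starts as an empty dictionary
    for record in lod:  # record is the loop variable (one dictionary)
        if key in record:  # not every dictionary is guaranteed to have the key
            v = record[key]
            acc[v] = 0  # v becomes a key in the accumulator with value 0
    return acc


# Hypothetical miniature data set, much smaller than the real sample data.
sample = [
    {"hour_of_day": "8", "day_of_week": "Monday"},
    {"hour_of_day": "8"},
    {"day_of_week": "Wednesday"},  # no "hour_of_day" key: skipped
    {"hour_of_day": "9"},
]

print(gen_dictionary(sample, "hour_of_day"))  # {'8': 0, '9': 0}
```

Repeated values such as "8" simply overwrite the same entry, so each distinct value appears once in the result.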

total_matches

Define a function named total_matches with three parameters:

● the 1st parameter, called lod in this description, is a list of dictionaries (the data);

● the 2nd parameter, called k in this description, is a string (a key);

● the 3rd parameter, called v in this description, is a value (the value to match);

total_matches should use the accumulator pattern to calculate and return a float. It should initialize the accumulator variable to 0. The accumulator loop should iterate over the dictionaries in lod. Inside the loop, get the value associated with the key k in the current dictionary. If that value is equal to v, increase the accumulator variable's value by one. When testing, you SHOULD make sure that the v you use has the same data type as the corresponding values in the data set (for example, the string '11' rather than the integer 11).

Sample function call:

total_matches(data, 'hour_of_day', '11')
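A minimal sketch of total_matches under two assumptions that the description leaves open: the count is converted with float() on return so that the result is a float even though the accumulator starts at 0, and dictionaries lacking the key k are skipped rather than raising a KeyError. The sample list is hypothetical.

```python
def total_matches(lod, k, v):
    """Sketch: count the dictionaries in `lod` whose value for key `k` equals `v`."""
    count = 0  # accumulator starts at 0
    for record in lod:
        # Assumption: a dictionary without key k simply does not match.
        if k in record and record[k] == v:
            count = count + 1
    return float(count)  # assumption: convert so a float is returned


# Hypothetical miniature data set for illustration.
sample = [
    {"hour_of_day": "8"},
    {"hour_of_day": "8"},
    {"hour_of_day": "9"},
    {"day_of_week": "Monday"},  # no "hour_of_day" key
]

print(total_matches(sample, "hour_of_day", "8"))  # 2.0
print(total_matches(sample, "hour_of_day", 8))    # 0.0 (int 8 is not the string "8")
```

The second call shows why the data type of v matters: the values in the data set are strings, so an integer argument never matches.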

total_matches_specific

Define a function named total_matches_specific with five parameters:

● the 1st parameter, called lod in this description, is a list of dictionaries (the data);

● the 2nd parameter, called k in this description, is a string (a key);

● the 3rd parameter, called v in this description, is a value (the value to match for k);

● the 4th parameter, called k2 in this description, is a string (a key);

● the 5th parameter, called v2 in this description, is a value (the value to match for k2);

total_matches_specific should use the accumulator pattern to calculate and return a float. It should initialize the accumulator variable to 0. The accumulator loop should iterate over the dictionaries in lod. Inside the loop, get the value associated with the key k in the current dictionary. If that value is equal to v, and the same dictionary's value associated with the key k2 is equal to v2, increase the accumulator variable's value by one. You SHOULD assume that k and k2 are keys in each of lod's entries. When testing, you SHOULD make sure that the v and v2 you use have the same data types as the corresponding values in the data set.

Sample function call:

total_matches_specific(data, 'census_tract_2010', '31', 'hour_of_day', '11')
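A possible sketch of total_matches_specific follows. The description says k and k2 can be assumed present in every entry, but two of the three records in the sample data have no census_tract_2010 key, so this sketch checks membership first as a defensive assumption. Returning float(count) is likewise an assumption, and the sample list is hypothetical.

```python
def total_matches_specific(lod, k, v, k2, v2):
    """Sketch: count dictionaries where record[k] == v and record[k2] == v2."""
    count = 0  # accumulator starts at 0
    for record in lod:
        # Assumption: a dictionary missing either key does not match.
        if k in record and k2 in record:
            if record[k] == v and record[k2] == v2:
                count = count + 1
    return float(count)  # assumption: convert so a float is returned


# Hypothetical miniature data set for illustration.
sample = [
    {"hour_of_day": "8", "day_of_week": "Monday"},
    {"hour_of_day": "8", "day_of_week": "Tuesday", "census_tract_2010": "59"},
    {"hour_of_day": "9", "day_of_week": "Wednesday"},
]

print(total_matches_specific(sample, "census_tract_2010", "59", "hour_of_day", "8"))  # 1.0
print(total_matches_specific(sample, "hour_of_day", "8", "day_of_week", "Monday"))    # 1.0
```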

remove_min

Define a function named remove_min with two parameters. The first parameter will be a dictionary (the data). The second parameter will be an integer (the minimum value threshold). remove_min must create a new dictionary and copy into it every entry from the data whose value is above the minimum value threshold. You SHOULD assume that every dictionary entry has an integer as its value.

Sample function call:

remove_min(data, 20)
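The following sketch shows one way remove_min might be written. It reads "above" as strictly greater than the threshold, which is an interpretation rather than something the description states outright. The dictionary counts is a hypothetical input, not part of the sample data.

```python
def remove_min(d, threshold):
    """Sketch: return a new dictionary with only the entries whose value exceeds `threshold`."""
    kept = {}  # new dictionary; the input is left unchanged
    for key in d:
        if d[key] > threshold:  # assumption: "above" means strictly greater than
            kept[key] = d[key]
    return kept


# Hypothetical dictionary of integer counts for illustration.
counts = {"8": 25, "9": 20, "11": 3}

print(remove_min(counts, 20))  # {'8': 25}
print(counts)                  # {'8': 25, '9': 20, '11': 3}
```

The entry with value 20 is dropped because 20 is not above a threshold of 20.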

Sample Data For Testing

data = [
    {"case_number": "22-1710216",
     "incident_datetime": "2022-06-20T08:10:06.000",
     "incident_type_primary": "LARCENY/THEFT",
     "incident_description": "Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.",
     "parent_incident_type": "Theft",
     "hour_of_day": "8",
     "day_of_week": "Monday",
     "address_1": "100 Block N PARADE AV",
     "city": "Buffalo",
     "state": "NY",
     "created_at": "2022-06-20T08:10:06.000"},
    {"case_number": "22-1790230",
     "incident_datetime": "2022-06-27T14:40:00.000",
     "incident_type_primary": "LARCENY/THEFT",
     "incident_description": "Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.",
     "parent_incident_type": "Theft",
     "hour_of_day": "8",
     "day_of_week": "Tuesday",
     "address_1": "300 Block DEARBORN ST",
     "city": "Buffalo",
     "state": "NY",
     "location": {"type": "Point", "coordinates": [-78.903, 42.939]},
     "latitude": "42.939",
     "longitude": "-78.903",
     "created_at": "2022-06-28T08:49:00.000",
     "census_tract_2010": "59",
     "census_block_group_2010": "3",
     "census_block_2010": "3007",
     "census_tract": "59",
     "census_block": "3005",
     "census_block_group": "3",
     "neighborhood_1": "Black Rock",
     "police_district": "District D",
     "council_district": "NORTH",
     "tractce20": "005900",
     "geoid20_tract": "36029005900",
     "geoid20_blockgroup": "360290059003",
     "geoid20_block": "360290059003005",
     ":@computed_region_kwzn_pe6v": "10",
     ":@computed_region_eziv_p4ck": "72",
     ":@computed_region_uh5x_q5mi": "50",
     ":@computed_region_dwzh_dtk5": "147",
     ":@computed_region_xbxg_7ifr": "9",
     ":@computed_region_tmcg_v66k": "1",
     ":@computed_region_fk4y_hpmh": "5",
     ":@computed_region_ff6v_jbaa": "88",
     ":@computed_region_h7a8_iwt4": "10",
     ":@computed_region_vsen_jbmg": "2"},
    {"case_number": "22-1800242",
     "incident_datetime": "2022-06-28T23:30:00.000",
     "incident_type_primary": "ASSAULT",
     "incident_description": "Buffalo Police are investigating this report of a crime. It is important to note that this is very preliminary information and further investigation as to the facts and circumstances of this report may be necessary.",
     "parent_incident_type": "Assault",
     "hour_of_day": "9",
     "day_of_week": "Wednesday",
     "address_1": "19TH ST & DELAWARE AV",
     "city": "Buffalo",
     "state": "NY",
     "created_at": "2022-06-29T09:50:00.000"},
]