

AcF 351b Python Stream Final Exam Part I


This is the first part of the final exam for AcF 351b: Python Stream.

Students are expected to act according to the highest ethical standards. All students enrolled at Lancaster University are to perform their academic work according to standards set by faculty members, departments, schools and colleges of the university; cheating and plagiarism constitute fraudulent misrepresentation for which appropriate sanctions are warranted and will be applied. Please note that any violation of the following rules will be treated as plagiarism:

1. Answer the questions yourself without asking others for assistance. This is a test of your ability in data science and computer programming.

2. Do not share the questions or your answers with anyone. This includes posting the questions or your solutions publicly on services like Quora, Stack Overflow, or GitHub.

We will run a system to detect any kind of plagiarism, e.g., coding scripts with high similarities.

Do NOT erase the #export at the top of any cell, as it is used by notebook2script.py to extract cells for submission.


Import modules.

Do NOT change the following cell!


In [ ]:

#export

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt


If you need extra modules, use the following cell to import them.


In [ ]:

#export

# imported extra modules:


Section 1: Basic Data Science

In this section, you will be asked to answer questions regarding WiFi hotspot locations in NYC.

Please make sure that the dataset entitled "NYC_Wi-Fi_Hotspot_Locations.csv" is in the same folder as the Jupyter Notebook.

Each row in the data represents one reported WiFi hotspot.

A data dictionary is also provided.

The following script reads the CSV file into memory and stores it in a DataFrame called df. Do not change it.


In [ ]:

#export

# if connected to the internet, import the dataset from the internet address

try:

    df = pd.read_csv("https://frankxu1987.weebly.com/uploads/6/2/5/8/62583677/nyc_wi-fi_hotspot_loca

# otherwise, import the dataset from the local .csv file

except:

    df = pd.read_csv("NYC_Wi-Fi_Hotspot_Locations.csv")


Please answer the following 8 questions. For each question,

1. Write the script used to compute your response in the Code Cell, and end it with a final print() call that prints the final numeric answer. Important: make sure your scripts are executable!

2. Please fill out the final numeric answers in the cells at the end of the section. See below.


Question 1.1: How many unique providers are there? (10 pts)


In [ ]:

#export

# Code script for Q 1.1

# Write your code script below


Question 1.2: What fraction of WiFi hotspots are in parks? For simplicity, you can consider a park a


In [ ]:

#export

# Code script for Q 1.2

# Write your code script below


Question 1.3: How are WiFi hotspots distributed across neighborhoods? For this question, calculate the number of WiFi hotspots per capita for each Neighborhood Tabulation Area (NTA). Exclude NTAs with fewer than 30 reported WiFi hotspots. Report the interquartile range (https://en.wikipedia.org/wiki/Interquartile_range) of the averages.

For population data for each NTA, use this dataset (https://data.cityofnewyork.us/api/views/rnsn-acs2/rows.csv); information on the dataset is found here (https://data.cityofnewyork.us/City-Government/Census-Demographics-at-the-Neighborhood-Tabulation/rnsn-acs2). Use the population data for the column corresponding to 2010. (10 pts)
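As a reminder of what an interquartile range measures, here is a minimal sketch on made-up numbers (not the hotspot data). The helper name `interquartile_range` and the sample values are assumptions for illustration only. Quartile conventions differ between tools; this sketch uses the standard library's "inclusive" method, which interpolates linearly between data points.

```python
import statistics


def interquartile_range(values):
    """Return Q3 - Q1 for a sequence of numbers (hypothetical helper)."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    # method="inclusive" treats the data as the full population and
    # interpolates linearly between neighbouring data points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Made-up sample, unrelated to the exam dataset.
sample = [1, 2, 3, 4, 5, 6, 7, 8]
print(interquartile_range(sample))
```

A different quartile convention can give a slightly different value on small samples, so it is worth checking which one your chosen library applies by default.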



In [ ]:

#export

# Code script for Q 1.3

# Hint: you will probably find the following line useful ***

# df_nta = pd.read_csv("https://data.cityofnewyork.us/api/views/rnsn-acs2/rows.csv")

# Write your code script below


Question 1.4: The dataset contains information on the date the hotspot was activated. What fraction of all activations occurred on the day of week that had the most activations? In other words, if Monday had the most activations, what fraction of activations occurred on Monday? Note: there are some dates that don't make sense. Ignore them for the analysis. (10 pts)


In [ ]:

#export

# Code script for Q 1.4

# Write your code script below


Question 1.5: How many WiFi hotspots are there by the second most common provider in the Bronx?


In [ ]:

#export

# Code script for Q 1.5

# Write your code script below


Question 1.6: What is the probability that a WiFi hotspot is free (without any limitations) given that it's


In [ ]:

#export

# Code script for Q 1.6

# Write your code script below


Question 1.7: How far must one travel from one hotspot to another? For this question, compute, for each hotspot, the average distance to its 3 nearest hotspots, and report the median of these averages in feet. Calculate the distance "as the crow flies" (https://en.wikipedia.org/wiki/As_the_crow_flies). For simplicity, use the "spherical Earth projected to a plane" equation to calculate distances. Use 6371 km as the radius of the Earth. (10 pts)

Remember, report your answer in feet.
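The "spherical Earth projected to a plane" approximation for a single pair of points can be sketched as follows. This is only an illustration under stated assumptions: the function name `plane_distance_feet` and the sample coordinates are hypothetical, inputs are taken to be latitude and longitude in degrees, and the formula used is the common form D = R * sqrt(dphi^2 + (cos(phi_m) * dlambda)^2), where phi_m is the mean latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0       # radius stated in the question
FEET_PER_KM = 1000.0 / 0.3048  # one foot is exactly 0.3048 m


def plane_distance_feet(lat1, lon1, lat2, lon2):
    """Approximate distance in feet between two points given in degrees.

    Hypothetical helper: projects the sphere onto a plane around the
    mean latitude, which is reasonable only for short distances.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1                  # latitude difference in radians
    d_lam = math.radians(lon2 - lon1)    # longitude difference in radians
    phi_m = (phi1 + phi2) / 2.0          # mean latitude
    # East-west separation shrinks by cos(mean latitude).
    d_km = EARTH_RADIUS_KM * math.hypot(d_phi, math.cos(phi_m) * d_lam)
    return d_km * FEET_PER_KM


# Made-up coordinates, not taken from the dataset.
print(plane_distance_feet(40.0, -74.0, 40.001, -74.001))
```

The sketch handles one pair of points only; finding each hotspot's 3 nearest neighbours and aggregating the results is left as the question describes.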



In [ ]:

#export

# Code script for Q 1.7

# Write your code script below


Question 1.8: If you plot the number of hotspot activations for each month, you'll notice a general


In [ ]:

#export

# Code script for Q 1.8

# Write your code script below


Now, please fill in all the numeric answers to Question 1.1-Question 1.8 in the following code cells. DO NOT UNCOMMENT.


In [ ]:


#export

# Your answer to ***Question 1.1***:


In [ ]:


#export

# Your answer to ***Question 1.2***:


In [ ]:


#export

# Your answer to ***Question 1.3***:


In [ ]:


#export

# Your answer to ***Question 1.4***:


In [ ]:

#export

# Your answer to ***Question 1.5***:

In [ ]:


#export

# Your answer to ***Question 1.6***:



In [ ]:

#export

# Your answer to ***Question 1.7***:

In [ ]:


#export

# Your answer to ***Question 1.8***:



Section 2: Simple Programming

Consider N buildings on a grid at positions 0 to N + 1. Each building has a height h. You can arrange the buildings in any order, but you must leave the slot at position 0 open. Imagine a laser is shot just below the roof and to the left of each building. The laser travels any number of grid points until it encounters a building of the same height or taller, or reaches the end of the grid at position 0. For example, consider 4 buildings of heights 3, 3, 4, and 1 arranged at grid points 1, 2, 3, and 4, respectively. The lasers travel 1, 1, 3, and 1 grid points for these buildings, respectively. Call the sum of the lasers' distances the total distance; for this example, the total distance is 6. For all questions, give your answer to 10 places after the decimal point.
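The laser rule can be checked against the worked example with a short sketch. The function name `laser_distances` is hypothetical, and the sketch assumes the buildings occupy consecutive grid points starting at position 1, as in the example.

```python
def laser_distances(heights):
    """Return the distance each building's laser travels to the left.

    heights[i] is the height of the building at grid position i + 1
    (position 0 is left open). Hypothetical helper for illustration.
    """
    distances = []
    for i, h in enumerate(heights):
        # Walk left past every strictly shorter building.
        j = i - 1
        while j >= 0 and heights[j] < h:
            j -= 1
        # j is now the index of the blocking building (same height or
        # taller), or -1 if the laser reached position 0.
        # Positions are index + 1, so the distance is (i + 1) - (j + 1).
        distances.append(i - j)
    return distances


example = [3, 3, 4, 1]            # the arrangement from the text
per_building = laser_distances(example)
print(per_building, sum(per_building))  # [1, 1, 3, 1] 6
```

This evaluates a single arrangement only; it reproduces the distances 1, 1, 3, and 1 and the sum of 6 stated in the example.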


Please answer the following 5 questions. For each question,

1. Write the script used to compute your response in the Code Cell, and end it with a final print() call that prints the final numeric answer. Important: make sure your scripts are executable!

2. Please fill out the final numeric answers in the cells at the end of the section.



Question 2.1: Consider 4 buildings of heights 1, 1, 3, and 4. For all possible configurations, what is the


In [ ]:

#export

# Code script for Q 2.1

# Write your code script below


Question 2.2: Consider 10 buildings of heights 1, 2, 3, ..., 10. For all possible configurations, what is the


In [ ]:

#export

# Code script for Q 2.2

# Write your code script below