CUSP-GX 5005 Urban Science Intensive I (Capstone)
Problem Definition & Project Timeline
Comparing How Representative Public Feedback is to the Community
Abstract
Municipalities rely heavily on resident feedback, which is often mandatory for projects both large and small. However, it can be difficult to determine whether the feedback received accurately represents the broader community. Our objective is to create a comparative tool that policymakers can use to evaluate how representative public feedback is of the community at various levels. We will collaborate with the sponsor's existing clients to improve the quality of survey respondents, use real or proxy respondent data, create survey scenarios, collect relevant data, and design a data tool to compare respondents with the constructed community. The final product will be a prototype tool that can efficiently ingest resident data, compare it to the constructed community, and promptly report the results to clients and policymakers. Our primary focus will be NYC, and we may extend to smaller jurisdictions outside the region, depending on agreements with other localities.
Introduction
Public engagement is a crucial component of democratic governance, as it allows citizens to participate in decision-making processes that affect their communities. Numerous studies have examined the challenges and limitations of current community engagement practices. These studies provide insight into the shortcomings of traditional approaches, highlighting the need for more innovative methods that better represent the diversity of local communities and respond to their perspectives and needs.
One study by Hahrie Han and Elizabeth A. Bennion (2005) focused on the issue of civic engagement and found that individuals with higher levels of education and income tend to dominate the decision-making process. This finding suggests that the voices of those most marginalized in society may not be fully represented in public engagement efforts, which can have significant implications for policy outcomes.
Another study by Jennifer J. Griffin et al. (2019) explored the underrepresentation of low-income and racial/ethnic minority communities in public engagement efforts related to housing policy. This study revealed that these groups are often excluded from decision-making processes, despite being disproportionately affected by housing policies. This underrepresentation may lead to policies failing to address these communities' needs and concerns sufficiently.
Furthermore, a study by Maxwell Palmer (2021) uncovered disparities in how elected officials respond to their constituents' preferences regarding housing policy. The study found that elected officials tend to be more responsive to the preferences of white constituents than to those of Black constituents, which can perpetuate racial inequalities in housing policy.
Problem Statement
To summarize the issue: local public officials are often required, and want, to gather resident feedback on a per-project basis. For the feedback collected, it is often hard to be sure how representative of the community the attendees were. It is then challenging to identify where additional sessions should be held if the initial respondents do not adequately cover the community.
To address these challenges, our sponsor aims to create a comparative tool that helps policymakers determine whether the public feedback received truly represents the community. We have been advised to use statistical and machine learning methods. Such a tool will allow policymakers to compare the public feedback received with the whole community along different dimensions, such as demographics. This will enable policymakers to identify potential gaps and biases in public engagement efforts and take steps to address them. Ultimately, this project seeks to promote more inclusive and equitable decision-making processes in democratic governance.
Expectations of the final product mainly have three components: Structured Intake, Comparison Engine, and Location Suggestion.
The Structured Intake process aims to design a structured intake form that collects just the right amount of information from residents, ensuring that city employees can accurately determine which residents they spoke with. However, several competing factors must be considered when designing the intake form, including:
● Balancing the need to collect sufficient data against protecting residents' personally identifying information by limiting data collection to the minimum.
● Asking for location and socio-economic data at the right level of precision (e.g., ZIP code or neighborhood) so that city employees can analyze the data meaningfully.
● Choosing the most streamlined and efficient collection means, whether it's a paper and pencil form or a digital form with a clickable neighborhood map.
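As a concrete starting point for the first consideration, the intake form could be modeled as a record that stores only coarse, bracketed fields. This is a minimal sketch; the field names and bracket choices are our assumptions, not a finalized schema:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical minimal intake record. Brackets rather than exact values
# (age bracket, income bracket, ZIP code rather than street address) keep
# collection to the minimum needed for neighborhood-level comparison.
@dataclass
class IntakeRecord:
    session_id: str                        # which feedback session the response came from
    zip_code: str                          # coarse location; no street address collected
    age_bracket: str                       # e.g. "18-24", "25-34", ...
    gender: Optional[str] = None           # optional, self-reported
    race_ethnicity: Optional[str] = None   # Census-defined categories
    income_bracket: Optional[str] = None   # e.g. "<25k", "25-50k", ...

# Example: a response from a hypothetical Long Island City session.
record = IntakeRecord(session_id="LIC-2023-01", zip_code="11101", age_bracket="25-34")
```

Keeping every field except session and location optional lets the same record work for both the paper form and a digital form with a clickable map.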
The Comparison Engine is the most important of the three components. A comparison engine is needed to ensure that the feedback received from residents accurately represents the community, which is the goal of our project. This engine will compare the respondent list to "real world" data, such as NYC Open Data and US Census Bureau data, and then clearly communicate the results. To be effective, the comparison engine must meet the following requirements:
● The Comparison Engine must be given a geographic location of the feedback session.
● The engine must analyze and communicate the comparison between the intake and ground truth at various levels. Different political boundaries capture different views of New York residents, including NYC-wide, borough, community board, council district, and neighborhood.
● The Comparison Engine must present the information in a visually appealing manner. Consider using Edward Tufte's data visualization principles.
● The engine must allow for multiple respondent feedback sessions for each project or location, as there will be several feedback sessions with residents for each project.
● The city employees must be able to use commonly available software to report out, such as Google Docs, Sheets, Slides, Microsoft Word, Excel, and PowerPoint.
● The Comparison Engine must save the analysis in multiple formats, such as JPG, CSV, and PDF, for use in common software.
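The core of the engine described above is a share-by-share comparison between the intake data and the ground truth. A minimal sketch of that comparison step, with function and category names as our own illustrative choices:

```python
from collections import Counter

def compare_shares(respondents, census_counts):
    """Compare the demographic mix of one feedback session to Census
    ground truth. `respondents` is a list of category labels (e.g. age
    brackets, one per respondent); `census_counts` maps the same labels
    to community population counts. Returns, per category, the tuple
    (respondent_share, census_share, gap)."""
    resp_counts = Counter(respondents)
    n_resp = sum(resp_counts.values())
    n_pop = sum(census_counts.values())
    report = {}
    for cat in census_counts:
        r = resp_counts.get(cat, 0) / n_resp   # share among respondents
        c = census_counts[cat] / n_pop         # share in the community
        report[cat] = (round(r, 3), round(c, 3), round(r - c, 3))
    return report

# Illustrative numbers: two younger and one older respondent versus a
# community that skews the other way.
report = compare_shares(["18-24", "18-24", "25-34"],
                        {"18-24": 100, "25-34": 300})
```

Because the result is a plain dictionary, writing it out as CSV (for Excel/Sheets) or rendering it as a chart saved to JPG/PDF is a small additional step, which fits the multi-format reporting requirement above.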
There are also different levels of comparison. According to our sponsor, this is a potential ordering based on his previous experience:
● Location
● Ethnicity (Census-defined)
● Gender
● Social comparison
● Household income
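For any one of these levels, a standard way to judge whether the observed mix plausibly comes from the community is a chi-square goodness-of-fit test. This is a sketch using the gender level with made-up counts; the statistic is computed directly so no statistics library is needed:

```python
def chi_square_stat(observed, expected_shares):
    """Chi-square goodness-of-fit statistic: do the observed respondent
    counts plausibly match the community's expected shares?
    `observed` maps category -> respondent count; `expected_shares`
    maps category -> community proportion (summing to 1)."""
    n = sum(observed.values())
    stat = 0.0
    for cat, share in expected_shares.items():
        exp = n * share                       # expected count under the community mix
        obs = observed.get(cat, 0)
        stat += (obs - exp) ** 2 / exp
    return stat

# Illustrative numbers: a 60/40 observed split against a 50/50 community.
stat = chi_square_stat({"female": 60, "male": 40},
                       {"female": 0.5, "male": 0.5})
# With 1 degree of freedom, the 5% critical value is about 3.84 (standard
# chi-square table), so a statistic above 3.84 flags the session as
# unlikely to be representative on this dimension.
```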
The Location Suggestion is the final goal for this project. To ensure effective engagement and maximum responses, city employees must identify the optimal locations to host feedback sessions after analyzing data with the Comparison Engine. The tool can help identify the most suitable spots in a particular area, based on the target audience, which may include specific socio-economic groups or a predetermined number of people in a given geography. To achieve this, city employees must determine the criteria for selecting respondents, such as age, gender, census-defined race or ethnicity, household income, children, marriage status, etc. Additionally, they need to determine the geographic scope for the search, which could be a borough, community board, council district, neighborhood, or centered on a particular address.
Methodology & Data
4/28 update:
Comparison Engine:
We did not have the historical data because of legal issues, and we were initially reluctant to rely on synthetic data, since it is not "real" and might not be meaningful to use. Nevertheless, with no real data available, we spent weeks generating synthetic datasets.
Approach & Challenges:
Later, we realized that the input data does not really matter as long as we have a cleaned and structured database (i.e., Census data for comparison). What we really need to focus on at this stage is the comparison process itself: the code that compares the input data with the database, regardless of what that input data is. We should also focus on cleaning and structuring our database so that the comparison runs smoothly.
We struggled to find the "right" Census dataset for our database, but then realized we could use the Census API to retrieve the data. However, based on what we have learned at CUSP, API results are time-sensitive and best suited to one-off analysis. In our case, relying on live API calls may not serve the long-term goal of a Comparison Engine, since the retrieved data would go stale and the product would behave more like a Comparison Report requiring periodic adjustment. We are not experts on APIs, but it looks like the best option available so far, and we will do our best to work out the process and results.
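One way to reconcile the API concern above is to fetch the Census data once, store it as the structured database, and refresh on a schedule rather than calling the API live. The sketch below only builds the request URL (no network call); the variable code B01001_001E (total population) and FIPS codes 36/081 (New York State / Queens County, which contains Long Island City) are real Census identifiers, but the surrounding workflow is our assumption:

```python
from urllib.parse import urlencode

# ACS 5-year estimates endpoint; a key is optional at low request volumes.
BASE = "https://api.census.gov/data/2021/acs/acs5"

def build_acs_url(variables, state_fips="36", county_fips="081"):
    """Return a Census API request URL for the given ACS variables,
    for every census tract in one county."""
    params = {
        "get": ",".join(["NAME"] + variables),     # which variables to retrieve
        "for": "tract:*",                          # geography to enumerate
        "in": f"state:{state_fips} county:{county_fips}",  # containing geography
    }
    return f"{BASE}?{urlencode(params)}"

url = build_acs_url(["B01001_001E"])
```

Downloading the JSON response from this URL once and saving it as CSV gives a stable local database, so the Comparison Engine does not depend on the API at run time.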
Location Suggestion:
For the location suggestion part, the initial goal is to use the algorithm from the Comparison Engine to move forward into the Location Suggestion. In our understanding, the location suggestion and the comparison engine are logically connected: When the result of a comparison is not "representative," the tool gives optimization advice, and we collect data at the suggested location to improve the composition of the survey takers.
Challenges:
However, different scenarios bring different considerations. There is no "fixed" algorithm for such a location suggestion tool. We now view the Comparison Engine and the Location Suggestion as two separate processes, since neither needs the other to function. For example, the algorithm for "redesigning the plaza" would differ from that for "relocating a school": the former would focus on residents around the plaza, while the latter would focus more on families with children.
Temporary Approach:
On the other hand, we still came up with an approach for our particular scenario: redesigning the plaza in LIC. Such a tool aims to identify the optimal location for an event in Long Island City that will generate the highest level of engagement and response. It would use a circular shape to determine the most suitable area for the event, with a radius selected to balance bias and variance. Initially, the tool considers only census block population data, but additional features such as race and age can be incorporated to identify an ideal location range. In this case, even though the scenario is different, the consideration remains the same.
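The circular-radius idea above can be sketched as scoring each candidate venue by the census-block population within a fixed radius and keeping the best. The radius default and all coordinates below are illustrative assumptions, not real LIC data:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))

def best_location(candidates, blocks, radius_km=0.8):
    """Score each candidate venue by the census-block population within
    `radius_km` and return the highest-scoring candidate. `candidates` is
    a list of (name, lat, lon); `blocks` is a list of (lat, lon, population).
    The 0.8 km default is an assumed starting point for the bias-variance
    trade-off: too small a circle misses residents, too large a circle
    blurs distinct neighborhoods."""
    def score(lat, lon):
        return sum(pop for blat, blon, pop in blocks
                   if haversine_km(lat, lon, blat, blon) <= radius_km)
    return max(candidates, key=lambda c: score(c[1], c[2]))
```

Extending the `blocks` tuples with race or age counts, and the score with weights on those counts, gives the "additional features" version described above without changing the structure of the search.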
Preliminary discussion of expected deliverables
● Comparison engine: A functional comparison engine that analyzes feedback data from the public questionnaire and provides a comparison report.
● User interface: A user-friendly interface that allows users to input data, select criteria, and view the comparison report.
● The technical documentation includes project requirements, system architecture, and technical specifications.
● User manual: A document that provides instructions on how to use the comparison engine.
4/28 update:
We had envisioned an interactive UI/UX to showcase the results but realized it would be too ambitious for a three-person team. Instead, we can deliver the same outputs in a simpler, non-interactive form, or later use an easier platform to build an interactive version.
Potential Tools
Most of us are very familiar with ArcGIS, though it is more of a manual tool for data visualization. We will keep it under consideration, since it would be helpful for producing more detailed visualizations, and we may need it for certain purposes.
Project risks and mitigation strategies
● Low response rate: Populations vary throughout the boroughs of New York City. For example, we might receive 200 responses from Manhattan but only 20 from the Bronx. One possible solution would be to use the sample mean to estimate the population results.
● Hidden biases during the sampling process: People tend not to disclose information about themselves, making it difficult to obtain a survey without biases and variances. One possible solution would be to use clustered sampling methods to obtain the data and minimize biases.
● Limited resources: We do not have unlimited time, funding, or personnel to deliver our project. One possible solution would be prioritizing what is most important and deprioritizing what is less important.
● Ethical considerations: Once the data are gathered, we must handle them carefully, because personal data are sensitive and private. One possible solution would be to use synthetic, computer-generated data so that we avoid these ethical issues.
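For the low-response-rate risk above, one standard mitigation is post-stratification: weight each respondent group so the weighted sample matches the known population mix. This is a sketch with illustrative group labels and rounded borough populations, not our finalized method:

```python
def poststratify_weights(sample_counts, population_counts):
    """Return a post-stratification weight per group: the ratio of the
    group's population share to its sample share. Underrepresented groups
    get weights above 1, overrepresented groups below 1."""
    n_sample = sum(sample_counts.values())
    n_pop = sum(population_counts.values())
    return {g: (population_counts[g] / n_pop) / (sample_counts[g] / n_sample)
            for g in sample_counts}

# Illustrative numbers from the risk above: 200 Manhattan responses versus
# only 20 from the Bronx, against roughly comparable populations.
w = poststratify_weights({"Manhattan": 200, "Bronx": 20},
                         {"Manhattan": 1_600_000, "Bronx": 1_400_000})
```

Each Bronx response then counts for roughly five Manhattan responses in any weighted estimate, partially correcting the imbalance without collecting new data.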
References
● Han, H., & Bennion, E. A. (2005). Education and democratic engagement in Japan and the United States. PS: Political Science & Politics, 38(4), 687-692.
● Griffin, J. J., Jordan, R., Vines, P. L., & Reynolds, K. (2019). "Nobody's listening to us": A study of public engagement processes related to low-income housing policy. Journal of Poverty, 23(6), 393-412.
● Palmer, M. (2021). Racial Disparities in Housing Politics: Evidence from the American Housing Survey. Journal of Race, Ethnicity, and Politics, 6(2), 311-338.
2023-06-28