Analyzing Shifting Perspectives: UC Berkeley Students’ Views on the People’s Park Project
Author
Natania Wong
Introduction
The People’s Park project in Berkeley, California, has long been a subject of controversy. This housing initiative aims to provide support for unhoused and low-income individuals in the community while also offering below-market-rate housing for over 1,100 undergraduates in a new residential facility. After years of debate, the project was recently approved by the Regents of the University of California.
To gain a deeper understanding of the opinions held by UC Berkeley students regarding this project, the UC Berkeley Chancellor’s Office conducted a student survey (https://chancellor.berkeley.edu/sites/default/files/uc_berkeley_institutional_housing_qual5678-0716-institutionalhousing.pdf) throughout the 2021-2022 academic year. The primary objective of this data analysis project is to explore the survey results, with a particular focus on how students’ perspectives on the People’s Park project changed after they received information from the university administration.
Key Objectives
Examine the baseline opinions of UC Berkeley students regarding the People’s Park project before they received additional information.
Analyze how students’ views on the project changed after they were provided with information from the university administration.
Identify any notable trends or shifts in sentiment among students.
Assess the implications of these findings for future university-community engagement and decision-making processes.
By conducting this data analysis, we aim to shed light on the dynamics of information and perception surrounding contentious community projects and offer insights that can inform future initiatives and public engagement strategies.
Part I: Data Loading and Cleaning
The survey data are loaded from the stat20data package and stored in the data frame “ppk”.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.2 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.3 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(stat20data)
data(ppk)
head(ppk)
# A tibble: 6 × 43
Q1 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q8 Q9_1 Q9_2 Q9_3 Q9_4
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 Senior 0 0 0 0 0 0 1 Very… 4 3 1 5
2 Junior 0 1 0 0 0 0 0 Very… 1 3 4 6
3 Gradu… NA NA NA NA NA NA NA Very… 4 2 1 5
4 Junior 1 0 1 0 1 0 0 Some… 1 2 3 4
5 Gradu… NA NA NA NA NA NA NA Some… 1 2 3 5
6 Gradu… NA NA NA NA NA NA NA Some… 1 2 3 4
# ℹ 30 more variables: Q9_5 <dbl>, Q9_6 <dbl>, Q10 <fct>, Q11_1 <dbl>,
# Q11_2 <dbl>, Q11_3 <dbl>, Q11_4 <dbl>, Q11_5 <dbl>, Q11_6 <dbl>,
# Q11_7 <dbl>, Q15_1 <dbl>, Q17 <chr>, Q18 <dbl>, Q18_words <fct>,
# Q20_1 <dbl>, Q20_2 <dbl>, Q20_3 <dbl>, Q20_4 <dbl>, Q20_5 <dbl>,
# Q20_6 <dbl>, Q20_7 <dbl>, Q21 <dbl>, Q21_words <fct>, Q22_1 <dbl>,
# Q22_2 <dbl>, Q22_3 <dbl>, Q22_4 <dbl>, Q22_5 <dbl>, Q22_6 <dbl>,
# Q22_7 <dbl>
The way I encoded the data from Question 7 is consistent with the encoding in the variable ppk. In this encoding scheme, we use “1” to indicate that students selected a particular option, and “0” to indicate that students did not choose that option.
# A tibble: 6 × 9
Q1 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q8
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 Senior 0 0 0 0 0 0 1 Very …
2 Junior 0 1 0 0 0 0 0 Very …
3 Graduate/Professional student NA NA NA NA NA NA NA Very …
4 Junior 1 0 1 0 1 0 0 Somew…
5 Graduate/Professional student NA NA NA NA NA NA NA Somew…
6 Graduate/Professional student NA NA NA NA NA NA NA Somew…
Part II: Exploratory Data Analysis
Visualizations: Question 10 (How important is it for UC Berkeley to provide more student housing?)
The majority of students believe that it is either “very important” or “somewhat important” for UC Berkeley to provide more student housing. This is evident as most of the data falls within these two categories. There is only a small percentage of students who do not consider it important for UC Berkeley to increase student housing.
## set the levels in order
ppk_10 <- within(ppk, Q10 <- factor(Q10, levels=names(sort(table(Q10), decreasing=TRUE))))
ppk_10 %>%
  drop_na() %>%
  ggplot(aes(x=Q10, na.rm=TRUE, y=(..count..)/sum(..count..))) +
  geom_bar(color="black", fill="darksalmon", alpha=0.6) +
  ggtitle("Level of importance to provide more student housing") +
  xlab("Extent of Importance") +
  ylab("Density") +
  coord_flip() +
  theme_bw()
Warning: The dot-dot notation (`..count..`) was deprecated in ggplot2 3.4.0.
ℹ Please use `after_stat(count)` instead.
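The deprecation warning above points to `after_stat()`. A minimal sketch of the same bar chart written that way follows; it assumes a recent ggplot2 release and uses a small invented data frame (`toy`) in place of `ppk`, so the bar heights are illustrative only.

```r
library(ggplot2)

# Toy stand-in for ppk$Q10 (hypothetical responses, not the survey data)
toy <- data.frame(
  Q10 = c("Very important", "Very important", "Very important",
          "Somewhat important", "Somewhat important", "Not important")
)

# Order the levels from most to least frequent, as in the chunk above
toy$Q10 <- factor(toy$Q10, levels = names(sort(table(toy$Q10), decreasing = TRUE)))

# after_stat() replaces the deprecated ..count.. notation;
# count / sum(count) turns raw counts into proportions
p <- ggplot(toy, aes(x = Q10, y = after_stat(count / sum(count)))) +
  geom_bar(color = "black", fill = "darksalmon", alpha = 0.6) +
  xlab("Extent of Importance") +
  ylab("Proportion of respondents") +
  coord_flip() +
  theme_bw()
```

Printing `p` draws the chart. Because the bars show each category's share of all responses, the axis is labelled as a proportion here rather than a density.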
Visualizations: Question 18 and 21 (Change of each individual respondent before and after the information)
The data indicate a positive linear relationship between the level of support for the People’s Park Project before and after reading the provided information. This suggests that students who initially had lower support for the project maintained lower support even after reading the information, and those with higher initial support continued to support the project.
In essence, the scatter plot demonstrates that there isn’t a significant shift in students’ attitudes toward the People’s Park Project before and after reading the information. This is because students’ levels of support remained relatively consistent; those in favor of the project before continued to be in favor after, and vice versa.
The histogram reinforces this observation, as it shows that the majority of students did not experience a change in their level of support for the project. The bin with a value of 0, indicating no change, has the highest frequency, indicating that most students’ opinions remained unchanged.
selected_ppk_1 <- ppk %>%
  select("Q18", "Q21") %>%
  ggplot(aes(x=Q18, y=Q21, na.rm=TRUE)) +
  geom_jitter(alpha=0.2, color="darksalmon", alpha=0.6) +
  ggtitle("Level of support for the People's Park Project before and after") +
  xlab("Level of support (pre)") +
  ylab("Level of support (post)") +
  theme_bw()
Warning: Duplicated aesthetics after name standardisation: alpha
ppk %>%
  mutate(change_in_support=Q18-Q21) %>%
  ggplot(aes(x=change_in_support, y=..density.., na.rm=TRUE)) +
  geom_histogram(color="black", fill="darksalmon", alpha=0.6) +
  xlab("Change in Support") +
  ylab("Density") +
  ggtitle("Change in Support towards the People's Park Project ")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
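The consistency between pre and post ratings described above can also be summarised numerically. The sketch below is one possible way to do so, using base R and two invented rating vectors (`pre`, `post`) rather than the survey columns; the numbers it produces say nothing about the actual data.

```r
# Hypothetical numeric pre/post ratings for the same respondents (not the survey data)
pre  <- c(1, 2, 2, 3, 4, 5, 6, NA)
post <- c(1, 2, 1, 3, 4, 4, 6, 5)

# Pearson correlation, using only respondents who answered both questions
r <- cor(pre, post, use = "complete.obs")

# Share of those respondents whose rating did not change at all
complete <- !is.na(pre) & !is.na(post)
share_unchanged <- mean(pre[complete] == post[complete])
```

A correlation near 1 together with a large unchanged share would correspond to the pattern seen in the scatter plot and the histogram.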
Among the 1,658 students surveyed, the level of support for the People’s Park Project varies by class level. Graduate students exhibited the highest level of support at 8.7%. Juniors followed at 7.0% and sophomores at 6.5%, while freshmen and seniors displayed relatively lower levels of support at 5.7% and 6.0%, respectively.
This distribution suggests a tentative trend: support for the People’s Park Project broadly rises with class level, although seniors (6.0%) fall below both sophomores and juniors. Graduates, who are likely more informed about housing issues, exhibit the highest support, possibly due to a better understanding of the project’s potential benefits. Juniors also show strong support, which could be influenced by their increased exposure to relevant information during their time at the university.
In contrast, freshmen and seniors, who may have less exposure or experience in understanding the project’s implications, show relatively lower support. This trend in support across different class levels sheds light on the importance of information dissemination and education about projects like these, as it seems to correlate with higher levels of support.
EDA: Question 15 (Mean/ median rating of the condition of People’s Park)
Survey participants rated the condition of People’s Park on a scale of 0 to 10 (where 0 represents “terrible”). The mean rating of 3.047 sits well below the midpoint of the scale, suggesting that, on average, respondents view the condition of People’s Park unfavorably.
The median rating is lower still, at 2. This means that at least half of respondents rated the park’s condition at 2 or below, and the gap between the mean and the median suggests that a smaller group of higher ratings pulls the average up.
These statistics collectively imply that there is a prevailing sentiment among survey participants that People’s Park is not in great condition. This could potentially tie into the controversies surrounding the People’s Park Project and its relevance to the community. Further analysis could explore how these perceptions correlate with respondents’ opinions on the project itself.
# A tibble: 1 × 2
mean median
<dbl> <dbl>
1 3.05 2
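The chunk that produced this table is not shown, so the following is only a sketch of how such a summary could be computed. It uses base R and an invented vector (`rating`) standing in for `ppk$Q15_1`.

```r
# Hypothetical 0-10 ratings standing in for ppk$Q15_1 (not the survey data)
rating <- c(0, 1, 2, 2, 2, 3, 5, 7, NA)

# na.rm = TRUE drops missing answers before summarising
summary_stats <- c(
  mean   = mean(rating, na.rm = TRUE),
  median = median(rating, na.rm = TRUE)
)
```

In this toy vector the mean (2.75) also exceeds the median (2), which is the usual signature of a distribution with a tail of higher values.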
EDA: Average change in support in each class for the People’s Park Project post-reading page 14 of the questionnaire
The data reveals interesting insights into how survey participants’ support for the People’s Park Project changed after being presented with information from page 14 of the questionnaire. A new column, change_in_support, was created to measure this change by subtracting the level of support before reading the information from the level of support after reading it.
Among the different class groups, the results are as follows:
Freshman: The average change in support is notably high at 0.48, indicating that, on average, freshman students became more supportive of the People’s Park Project after reading the provided information.
Sophomore: Sophomore students also experienced a positive change in support, although it was less pronounced at 0.11. This suggests a smaller shift in opinion compared to the freshman class.
Junior: Junior students exhibited an average change in support of 0.22, indicating a moderate increase in support after reading the information.
Senior: Senior students, on average, showed a change in support of 0.32, suggesting a substantial increase in support for the People’s Park Project post-information.
Graduates: Graduate students had an average change in support of 0.29, indicating a notable positive shift in their support for the project.
These findings provide valuable insights into how different class groups responded to the information presented in the questionnaire. It appears that, across the board, students tended to become more supportive of the People’s Park Project after being provided with additional information, with freshmen exhibiting the most significant shift in support. This highlights the importance of information dissemination in influencing opinions on such projects.
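The per-class averages above come from a chunk that is not shown. A minimal sketch of that kind of calculation follows, assuming the post-minus-pre definition given in the text and using an invented data frame (`toy`) whose column names mirror `ppk` but whose values are made up.

```r
# Hypothetical respondents; column names mirror ppk but the values are invented
toy <- data.frame(
  Q1  = c("Freshman", "Freshman", "Senior", "Senior", "Junior"),
  Q18 = c(4, 3, 2, 5, 3),   # rating before reading the information
  Q21 = c(3, 3, 2, 3, NA)   # rating after reading the information
)

# Post minus pre, following the definition in the text
toy$change_in_support <- toy$Q21 - toy$Q18

# Mean change per class level; the formula interface of aggregate()
# omits rows with missing values by default
by_class <- aggregate(change_in_support ~ Q1, data = toy, FUN = mean)
```

The sign of the result depends on both the order of subtraction and the direction of the scale. If lower numbers denote stronger support, as the 1-to-3 coding used later in the report suggests, a negative post-minus-pre value is a move toward support.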
Part III: Making Inferences about Berkeley Students
To gauge how much the estimate could vary from sample to sample, a 95% bootstrap confidence interval was created for the median rating of the condition of People’s Park as reported by Berkeley students. The resulting confidence interval is [3.11, 3.34].
This 95% confidence interval should be read as a statement about the method: intervals built this way are expected to capture the true median rating among all Berkeley students about 95% of the time. In simpler terms, values between 3.11 and 3.34 are plausible for the population value given the data, while values outside this range are less plausible. The interval does not describe the spread of individual students’ ratings.
This interval serves as a statistical tool to assess the reliability of the reported median rating and to gauge whether it is significantly different from other potential values. It provides researchers and stakeholders with a degree of certainty regarding the central tendency of students’ opinions about the condition of People’s Park.
library("tidyr")
ppk_narm <- ppk %>% drop_na()
set.seed(103122)
samp_1 <- sample(ppk_narm$Q15_1, nrow(ppk_narm))
# Create the bootstrap population
boot_pop_1 <- samp_1
boot_pop_dist_1 <- ggplot(as.data.frame(boot_pop_1), aes(x=boot_pop_1, y=..density..)) +
  geom_histogram(color="black", fill="darksalmon", alpha=0.6) +
  xlab("Rating of the Condition of People's Park") +
  ylab("Density") +
  ggtitle("Bootstrap Population") +
  theme_bw()
boot_pop_dist_1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Graph bootstrap sampling distribution
boot_sampling_dist_1 <- ggplot(as.data.frame(boot_samp_means_1), aes(x=boot_samp_means_1, y=..density..)) +
  geom_histogram(bins=45, color="black", fill="darksalmon", alpha=0.6) +
  xlab("Median Rating of the Condition of People's Park") +
  ylab("Density") +
  ggtitle("Bootstrap Sampling Distribution") +
  theme_bw()
boot_sampling_dist_1
# 95% bootstrap CI: Berkeley student median rating of the condition of People's Park
quantile(boot_samp_means_1, probs=c(0.025, 0.975))
2.5% 97.5%
3.112786 3.344994
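The step that fills `boot_samp_means_1` does not appear in the chunks above. The sketch below shows one common way a percentile bootstrap interval is built in base R; the vector `rating`, the number of resamples `B`, and the choice of `mean` as the statistic are all assumptions made for illustration, not a reconstruction of the original chunk.

```r
set.seed(103122)  # seed value borrowed from the report; any fixed value works

# Hypothetical ratings standing in for ppk_narm$Q15_1 (not the survey data)
rating <- c(0, 1, 2, 2, 2, 3, 3, 4, 5, 7)

# Draw B resamples WITH replacement and record the statistic of each one
B <- 2000
boot_stats <- replicate(
  B,
  mean(sample(rating, size = length(rating), replace = TRUE))
)

# Percentile interval: the middle 95% of the bootstrap statistics
ci <- quantile(boot_stats, probs = c(0.025, 0.975))
```

Swapping `mean` for `median` inside `replicate()` gives the corresponding interval for the median. With integer ratings, bootstrap medians can only take whole or half values, so that interval tends to have coarser endpoints.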
To assess the proportion of Berkeley students who support the People’s Park Project with a 95% confidence level, a confidence interval was calculated using the normal distribution. Here’s how it was done:
Data Filtering: Only students who indicated a level of support towards the project as 1 to 3 (1: Very strongly support; 2: Strongly support; 3: Somewhat support) were considered for this calculation.
Calculating Mean and Standard Deviation: From the filtered data, the mean and standard deviation (sd) were computed. These statistics provide insight into the central tendency and variability of the data.
Standard Error (SE): The standard error (se) was calculated using the formula sd/sqrt(n), where n is the sample size. The SE indicates the precision of the estimate.
Z-Score: A z-score of 1.96, corresponding to the 95% confidence level, was used for this calculation.
Confidence Interval Calculation: With the mean, SE, and z-score, the 95% confidence interval (CI) was calculated.
The resulting 95% confidence interval for the proportion of Berkeley students who support the People’s Park Project is [0.2774, 0.3216]. This means that, with 95% confidence, it is estimated that the proportion of Berkeley students who support the People’s Park Project falls within this interval. In simpler terms, this interval provides a range within which the true proportion of supporting students is likely to lie based on the collected data and statistical analysis.
# A tibble: 1 × 2
mean sd
<dbl> <dbl>
1 0.299 0.459
## sample size
n <- nrow(ppk)
## mean
mean <- 0.2994924
## sd
sd <- 0.4586181
## se
se <- sd/sqrt(n)
## z-score
z <- 1.96
## 95% CI
lower <- mean - z * se
upper <- mean + z * se
c(lower, upper)
[1] 0.2774167 0.3215681
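The calculation above plugs in a hard-coded mean and standard deviation. As a sketch of how the same steps could run directly from the responses, the block below codes each respondent as a supporter (1) or not (0) and applies the normal approximation. The vector `support_rating` is invented, and treating ratings of 1 to 3 as support follows the description in the text.

```r
# Hypothetical support ratings (not the survey data); 1-3 count as support
support_rating <- c(1, 2, 3, 4, 5, 6, 2, 5, 4, 6)

# Indicator variable: 1 = supporter, 0 = otherwise
supports <- as.numeric(support_rating <= 3)

n     <- length(supports)
p_hat <- mean(supports)             # sample proportion of supporters
se    <- sd(supports) / sqrt(n)     # standard error, sd / sqrt(n)
z     <- qnorm(0.975)               # about 1.96 for a 95% interval

ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)
```

The mean of a 0/1 indicator is the sample proportion, which is why a mean and a standard deviation are enough to build an interval for a proportion. Using `qnorm(0.975)` instead of typing 1.96 keeps the z-score tied to the chosen confidence level.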
To assess the average change in support for the People’s Park Project among Berkeley students before and after being exposed to the information on page 14 of the questionnaire, a 95% bootstrap confidence interval was calculated. Here’s what this means:
Bootstrap Resampling: The data was resampled multiple times with replacement. This process creates multiple simulated datasets, each representing a possible variation of the original data.
Change in Support Calculation: For each resampled dataset, the average change in support for the People’s Park Project was calculated by subtracting the level of support before reading the information from the level of support after reading the information. This step was repeated for each simulated dataset.
Confidence Interval: The resulting distribution of average changes in support was used to create a 95% confidence interval. This interval represents the range within which the true average change in support for the Project among Berkeley students before and after being exposed to the information is likely to lie.
The 95% bootstrap confidence interval for the average change in support for the People’s Park Project among Berkeley students before and after being exposed to the information is [-0.48, -0.37]. Importantly, this interval does not include 0. Because the change is computed as the rating after reading minus the rating before, and lower ratings denote stronger support (1 is “Very strongly support”), a negative change corresponds to an increase in support. In simpler terms, with 95% confidence, the average change among Berkeley students is estimated to fall within this interval, and its direction indicates that students’ opinions of the project shifted positively on average.
# Questions 18 and 21: change in support (post minus pre)
ppk_narm_change <- ppk %>%
  drop_na() %>%
  mutate(change_in_support=Q21-Q18)
set.seed(103122)
samp_3 <- sample(ppk_narm_change$change_in_support, nrow(ppk_narm_change))
# Create the bootstrap population
boot_pop_3 <- samp_3
boot_pop_dist_3 <- ggplot(as.data.frame(boot_pop_3), aes(x=boot_pop_3, y=..density..)) +
  geom_histogram(color="black", fill="darksalmon", alpha=0.6) +
  xlab("Average change in support pre and post") +
  ylab("Density") +
  ggtitle("Bootstrap Population") +
  theme_bw()
boot_pop_dist_3
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Graph bootstrap sampling distribution
boot_sampling_dist_3 <- ggplot(as.data.frame(boot_samp_means_3), aes(x=boot_samp_means_3, y=..density..)) +
  geom_histogram(bins=45, color="black", fill="darksalmon", alpha=0.6) +
  xlab("Average change in support pre and post") +
  ylab("Density") +
  ggtitle("Bootstrap Sampling Distribution") +
  theme_bw()
boot_sampling_dist_3
# 95% bootstrap CI for the average change in support for the Project among Berkeley students before and after being exposed to the information on page 14 of the questionnaire
quantile(boot_samp_means_3, probs=c(0.025, 0.975))
2.5% 97.5%
-0.4837153 -0.3685163
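As with the earlier interval, the step that generates `boot_samp_means_3` is not shown. The sketch below illustrates the paired version of the idea under stated assumptions: the `pre` and `post` vectors are invented, each respondent's difference is computed first, and whole respondents are then resampled so that every pre/post pair stays together.

```r
set.seed(103122)

# Hypothetical pre/post ratings for the same respondents (not the survey data)
pre  <- c(4, 3, 2, 5, 3, 4, 2, 5)
post <- c(3, 3, 2, 3, 2, 4, 1, 4)

# One difference per respondent; negative = lower rating afterwards
change <- post - pre

# Resample the differences with replacement and average each resample
boot_means <- replicate(2000, mean(sample(change, replace = TRUE)))

# Percentile interval and a check of whether it excludes zero
ci <- quantile(boot_means, probs = c(0.025, 0.975))
excludes_zero <- ci[[1]] > 0 || ci[[2]] < 0
```

An interval that excludes zero suggests the average rating moved. Whether that move counts as more or less support depends on the direction of the rating scale.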
Conclusion
During the course of my exploratory data analysis, I delved into the statistical insights encapsulated within the UC Berkeley Chancellor’s Office student survey concerning the People’s Park Project. This examination led me to a significant finding: on average, student opinion shifted in a positive direction following exposure to additional information about the project, and the confidence interval for that change excludes zero.
This discovery carries considerable implications. It suggests that informed communication and transparency can play a pivotal role in shaping public perception, particularly in a university setting where students are not only stakeholders but also active participants in the decision-making process.
However, it is imperative for the university administration to recognize the importance of ongoing consultation with the student body. While the initial data indicates a positive change in sentiment, it is vital to maintain this momentum by ensuring that students remain engaged and informed throughout the project’s development. Continued dialogue and transparency will be key to building and sustaining a sense of trust and collaboration between the university and its students.