*Source: https://kgumusic.com/blogs/news/music-project-funding?srsltid=AfmBOopZsltJPBVD-wFIrt-zUuJB98t9L_Uu9HRRqtnqfMkfhztYEhjj

For my final Data 110 project, I have decided to use the 2022 Music Census published by the City of Austin on Data.gov. My outcome variable will be Income Level, a score that I will create by converting the categorical values in the seven income variables to quantitative values and then adding them up. My first predictor variable will be years of experience, the amount of time a musician has been playing. My second predictor variable will be annual spending on new recordings, the amount of money spent yearly to make new songs. My last predictor variable will be annual spending on gear or rentals, the amount of money spent yearly on instruments, musical equipment, and studio space. My topic question is: how much do these predictor variables help determine a musician's income level? Additionally, I want to see whether these variables are among the most important factors in a musician's financial success.

I chose this dataset because I am very fond of music. I listen to music while I do everything, so I thought it would be interesting to delve into the factors that make a musician financially successful. There is no README file, so the most logical explanation for how the City of Austin gathered the data is through a survey. The word "census" in the name also implies a survey, since census data are typically collected that way.

#Loading the necessary libraries
library(tidyverse)
## Warning: package 'dplyr' was built under R version 4.5.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)
library(readr)
#Loading the dataset
MusicData<-readr::read_csv("C:/Users/panca/Downloads/2022_MusicCensus_Details.csv")
## Rows: 2227 Columns: 107
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (84): County of Residence, Primary Music Ecosystem Sector, Music Busines...
## dbl (20): Residence Distance from Downtown in miles, CREATIVES Percentage of...
## lgl  (3): City of Austin Resident, Restored Pre Pandemic Workload, Hispanic ...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(MusicData)
#Checking the column names to see what variables to use
colnames(MusicData)
##   [1] "County of Residence"                                                           
##   [2] "Residence Distance from Downtown in miles"                                     
##   [3] "City of Austin Resident"                                                       
##   [4] "Primary Music Ecosystem Sector"                                                
##   [5] "Music Business Structure"                                                      
##   [6] "Work Location"                                                                 
##   [7] "Years Experience"                                                              
##   [8] "Music Education"                                                               
##   [9] "Community or Business Participation"                                           
##  [10] "Restored Pre Pandemic Workload"                                                
##  [11] "Registrations Memberships"                                                     
##  [12] "Suggested Assistance for Royalty Collecting"                                   
##  [13] "Intent to Continue Music Career Next 3 Years"                                  
##  [14] "Intent to Stay in Austin Next 3 Years"                                         
##  [15] "Work Space Status including Venue Performance Space"                           
##  [16] "Conditions of Current Lease"                                                   
##  [17] "Intent To Renew Current Lease"                                                 
##  [18] "Suggested Financial Tools"                                                     
##  [19] "Suggested Terms for Outside Investment"                                        
##  [20] "Confidence Navigating Grant Process"                                           
##  [21] "Suggested Grant Assistance"                                                    
##  [22] "Suggested Music Training Topics"                                               
##  [23] "Suggested Level of Music Training"                                             
##  [24] "CREATIVES Income from Live Performance Locally"                                
##  [25] "CREATIVES Income from Live Performance Touring"                                
##  [26] "CREATIVES Income Related to Recordings"                                        
##  [27] "CREATIVES Income Related to Songwriting"                                       
##  [28] "CREATIVES Income Related to Studio Work"                                       
##  [29] "CREATIVES Income Related to Merchandise"                                       
##  [30] "CREATIVES Income Related to Teaching"                                          
##  [31] "CREATIVES Paid Performances per Month Currently"                               
##  [32] "CREATIVES Percentage of Gigs With Base Guarantee"                              
##  [33] "CREATIVES Work Also as Gig Presenter"                                          
##  [34] "CREATIVES Annual Spending on New Recordings"                                   
##  [35] "CREATIVES Annual Spending on Publicity Promotion"                              
##  [36] "CREATIVES Annual Spending on Web or Social Media"                              
##  [37] "CREATIVES Annual Spending on Supplies"                                         
##  [38] "CREATIVES Annual Spending on Rehearsal or Work Space"                          
##  [39] "CREATIVES Annual Spending on Gear or Rentals"                                  
##  [40] "CREATIVES Annual Spending on Merchandise"                                      
##  [41] "CREATIVES Annual Spending on Accounting or Legal"                              
##  [42] "CREATIVES Percentage of Annual Spending Paid to Local Providers"               
##  [43] "CREATIVES Hiring for Booking"                                                  
##  [44] "CREATIVES Hiring for Management"                                               
##  [45] "CREATIVES Hiring for Record Label"                                             
##  [46] "CREATIVES Hiring for Merchandise"                                              
##  [47] "CREATIVES Hiring for Publicity Social Media"                                   
##  [48] "CREATIVES Hiring for Recording Studio"                                         
##  [49] "CREATIVES Hiring for Equipment Supplies"                                       
##  [50] "CREATIVES Hiring for Publishing Licensing"                                     
##  [51] "CREATIVES Hiring for Legal Accounting"                                         
##  [52] "PRESENTER Ownership Structure"                                                 
##  [53] "PRESENTER Venue Type"                                                          
##  [54] "PRESENTER Role of Live Music"                                                  
##  [55] "PRESENTER Venue Capacity"                                                      
##  [56] "PRESENTER Outdoor Live Music Possible"                                         
##  [57] "PRESENTER Ranking of Pressures TALENT COSTS 1 highest 10 lowest"               
##  [58] "PRESENTER Ranking of Pressures CHANGING AUDIENCE BEHAVIORS 1 highest 10 lowest"
##  [59] "PRESENTER Ranking of Pressures LABOR 1 highest 10 lowest"                      
##  [60] "PRESENTER Ranking of Pressures PROPERTY TAX 1 highest 10 lowest"               
##  [61] "PRESENTER Ranking of Pressures MARKETING 1 highest 10 lowest"                  
##  [62] "PRESENTER Ranking of Pressures ORDINANCES PERMITS 1 highest 10 lowest"         
##  [63] "PRESENTER Ranking of Pressures UNPREDICTABILITY OF COSTS 1 highest 10 lowest"  
##  [64] "PRESENTER Ranking of Pressures NEIGHBORHOOD REDEVELOPMENT 1 highest 10 lowest" 
##  [65] "PRESENTER Ranking of Pressures BUILDING OPERATIONS 1 highest 10 lowest"        
##  [66] "PRESENTER Percentage of Talent Deals Paid by Guarantee as Business Expense"    
##  [67] "PRESENTER Percentage of Talent Deals Paid by Guarantee from Door Proceeds"     
##  [68] "PRESENTER Percentage of Talent Deals Paid by Percentage of Door Only"          
##  [69] "PRESENTER Percentage of Talent Deals Paid by Fixed Percentage of Bar Sales"    
##  [70] "PRESENTER Percentage of Talent Deals Paid by Tips Only"                        
##  [71] "PRESENTER Local Talent Bookings as Percentage of Total Bookings"               
##  [72] "PRESENTER Confidence Booking Local Talent Next 3 years"                        
##  [73] "INDUSTRY Professional Experience in 13 subcategories"                          
##  [74] "INDUSTRY Percentage of Work Related to Music"                                  
##  [75] "INDUSTRY Ranking of Client Geography AUSTIN AREA"                              
##  [76] "INDUSTRY Ranking of Client Geography REST of USA"                              
##  [77] "INDUSTRY Ranking of Client Geography INTERNATIONAL"                            
##  [78] "Health Insurance Currently Covered"                                            
##  [79] "Health Insurance Provider"                                                     
##  [80] "Health Services"                                                               
##  [81] "Reasons For Not Pursuing Help from HAAM or SIMS"                               
##  [82] "Choice for Care When Sick or Need healthcare"                                  
##  [83] "Health Wellness Concerns"                                                      
##  [84] "Affordability Struggles"                                                       
##  [85] "Received COVID Related Relief"                                                 
##  [86] "Covid Relief Was First Ever Government Support for Music Person"               
##  [87] "Current Housing"                                                               
##  [88] "Housing Changes Over Past 2 Years"                                             
##  [89] "Race"                                                                          
##  [90] "Hispanic Latino Latina Latinx Origin"                                          
##  [91] "Age"                                                                           
##  [92] "Gender"                                                                        
##  [93] "Sexual Orientation"                                                            
##  [94] "DEI Austin music scene HOSTILE FRIENDLY"                                       
##  [95] "DEI Austin music scene HOMOGENEOUS DIVERSE"                                    
##  [96] "DEI Austin music scene RACIST INCLUSIVE"                                       
##  [97] "DEI Austin music scene DISRESPECTFUL RESPECTFUL"                               
##  [98] "DEI Austin music scene COMPETITIVE COOPERATIVE"                                
##  [99] "DEI Austin music scene CONTENTIOUS COLLEGIAL"                                  
## [100] "DEI Austin music scene  SEXIST NON SEXIST"                                     
## [101] "DEI Austin music scene INDIVIDUALISTIC COLLABORATIVE"                          
## [102] "DEI Austin music scene HOMOPHOBIC NON HOMOPHOBIC"                              
## [103] "DEI Austin music scene UNSUPPORTIVE SUPPORTIVE"                                
## [104] "DEI Austin music scene AGEIST NON AGEIST"                                      
## [105] "DEI Austin music scene UNWELCOMING WELCOMING"                                  
## [106] "DEI Austin music scene ELITIST NON ELITIST"                                    
## [107] "DEI Austin music scene UNCOMFORTABLE SAFE"
#Mutating each NA value in the income variables to None
MusicData <- MusicData |>mutate(
  `CREATIVES Income from Live Performance Locally` 
    = replace_na(`CREATIVES Income from Live Performance Locally`,"None"),
  `CREATIVES Income from Live Performance Touring` 
    = replace_na(`CREATIVES Income from Live Performance Touring`, "None"),
  `CREATIVES Income Related to Recordings`
    = replace_na(`CREATIVES Income Related to Recordings`,"None"),
  `CREATIVES Income Related to Songwriting`
    = replace_na(`CREATIVES Income Related to Songwriting`,"None"),
  `CREATIVES Income Related to Studio Work`
    = replace_na( `CREATIVES Income Related to Studio Work`,"None"),
  `CREATIVES Income Related to Merchandise`
    = replace_na(`CREATIVES Income Related to Merchandise`,"None"),
  `CREATIVES Income Related to Teaching`
    = replace_na(`CREATIVES Income Related to Teaching`,"None")
#Mutating the categorical values in the income variables to quantitative values  
) |> mutate(
  Income1 = case_when(
    `CREATIVES Income from Live Performance Locally` == "None" ~ 0,
    `CREATIVES Income from Live Performance Locally` == "Very Little" ~ 1,
    `CREATIVES Income from Live Performance Locally` == "Some" ~ 2,
    `CREATIVES Income from Live Performance Locally` == "Most" ~ 3,
    `CREATIVES Income from Live Performance Locally` == "All" ~ 4
  ),
  Income2 = case_when(
    `CREATIVES Income from Live Performance Touring` == "None" ~ 0,
    `CREATIVES Income from Live Performance Touring` == "Very Little" ~ 1,
    `CREATIVES Income from Live Performance Touring` == "Some" ~ 2,
    `CREATIVES Income from Live Performance Touring` == "Most" ~ 3,
    `CREATIVES Income from Live Performance Touring` == "All" ~ 4
  ),
   Income3 = case_when(
    `CREATIVES Income Related to Recordings` == "None" ~ 0,
    `CREATIVES Income Related to Recordings` == "Very Little" ~ 1,
    `CREATIVES Income Related to Recordings` == "Some" ~ 2,
    `CREATIVES Income Related to Recordings` == "Most" ~ 3,
    `CREATIVES Income Related to Recordings` == "All" ~ 4
  ),
  Income4 = case_when(
    `CREATIVES Income Related to Songwriting` == "None" ~ 0,
    `CREATIVES Income Related to Songwriting` == "Very Little" ~ 1,
    `CREATIVES Income Related to Songwriting` == "Some" ~ 2,
    `CREATIVES Income Related to Songwriting` == "Most" ~ 3,
    `CREATIVES Income Related to Songwriting` == "All" ~ 4
  ),
   Income5 = case_when(
    `CREATIVES Income Related to Studio Work` == "None" ~ 0,
    `CREATIVES Income Related to Studio Work` == "Very Little" ~ 1,
    `CREATIVES Income Related to Studio Work` == "Some" ~ 2,
    `CREATIVES Income Related to Studio Work` == "Most" ~ 3,
    `CREATIVES Income Related to Studio Work` == "All" ~ 4
  ),
    Income6 = case_when(
    `CREATIVES Income Related to Merchandise` == "None" ~ 0,
    `CREATIVES Income Related to Merchandise` == "Very Little" ~ 1,
    `CREATIVES Income Related to Merchandise` == "Some" ~ 2,
    `CREATIVES Income Related to Merchandise` == "Most" ~ 3,
    `CREATIVES Income Related to Merchandise` == "All" ~ 4
  ),
    Income7 = case_when(
    `CREATIVES Income Related to Teaching` == "None" ~ 0,
    `CREATIVES Income Related to Teaching` == "Very Little" ~ 1,
    `CREATIVES Income Related to Teaching` == "Some" ~ 2,
    `CREATIVES Income Related to Teaching` == "Most" ~ 3,
    `CREATIVES Income Related to Teaching` == "All" ~ 4
  ) 
)
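The seven `case_when()` blocks above all apply the same label-to-score mapping, so the recoding could also be written once and applied to every income column. The following is a minimal sketch of that idea, not the code used for this project: the `toy` tibble is hypothetical stand-in data, and `income_scores` and the `" Score"` column suffix are names invented for illustration.

```r
library(dplyr)

# Lookup table for the five response labels used in the case_when() calls
income_scores <- c("None" = 0, "Very Little" = 1, "Some" = 2, "Most" = 3, "All" = 4)

# Hypothetical toy data standing in for two of the seven income columns
toy <- tibble(
  `CREATIVES Income Related to Teaching`   = c("Some", NA, "All"),
  `CREATIVES Income Related to Recordings` = c("None", "Most", "Very Little")
)

scored <- toy |>
  mutate(across(
    starts_with("CREATIVES Income"),
    # Treat NA as "None", then look up the numeric score by label;
    # a label missing from the lookup table would come back as NA
    ~ unname(income_scores[coalesce(.x, "None")]),
    .names = "{.col} Score"
  ))

print(scored)
```

With the real data, the same `across()` call would be applied to `MusicData`, and the new score columns would then be summed to form the income level.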

#Creating a new income variable by adding the income variables together
CleanMusic <- MusicData |> mutate(
  IncomeLevel = Income1+Income2+Income3+Income4+Income5+Income6+Income7
  #Selecting the outcome and predictor variables  
  ) |> select(
    IncomeLevel, `CREATIVES Annual Spending on New Recordings`,`CREATIVES Annual Spending on Gear or Rentals`, `Years Experience`
    #Filtering out any missing values in the predictor variables
  ) |> filter(
    !is.na(`CREATIVES Annual Spending on New Recordings`),
    !is.na(`CREATIVES Annual Spending on Gear or Rentals`),
    !is.na(`Years Experience`)
  ) 

                                 
View(CleanMusic)
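One detail of adding the income columns with `+` is that a single `NA` in any component makes the whole sum `NA`. `case_when()` returns `NA` for any label it does not match, so an unexpected response label would silently drop that row from the model later. The sketch below uses hypothetical toy values (not the census data) to show the behavior and one possible alternative, `rowSums()` with `na.rm = TRUE`. Whether ignoring missing components is appropriate is a judgment call for the analysis.

```r
library(dplyr)

# Hypothetical toy scores; the NA stands for a label case_when() did not match
toy <- tibble(
  Income1 = c(0, 2, NA),
  Income2 = c(1, 3, 4)
)

toy <- toy |>
  mutate(
    # Plain addition: any NA component makes the total NA
    IncomeLevel = Income1 + Income2,
    # Alternative: add up only the components that are present
    IncomeLevelNaRm = rowSums(across(c(Income1, Income2)), na.rm = TRUE)
  )

# Number of rows whose total is missing
n_missing <- sum(is.na(toy$IncomeLevel))
print(toy)
print(n_missing)
```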
#Creating a multivariable plot
ggplot(CleanMusic, aes(
  x=`CREATIVES Annual Spending on New Recordings`,
  y=IncomeLevel, 
  color=`Years Experience`,
  size=`CREATIVES Annual Spending on Gear or Rentals`))+
  geom_point(alpha=0.5)+
  #Chooses a color palette
  scale_color_brewer(palette="Set3")+
  #Adds labels for title, x-axis, y-axis, and the caption
  labs(
    x="Annual Spending on New Recordings",
    y="Income Level",
    title="Income Level Vs Quantity of Money Spent",
    caption="Source: City of Austin, Texas - Data.Gov",
    color="Years of Experience",
    size="Annual Spending on Gear or Rentals"
  )+
  #Darker theme to be able to see the lighter colors
  theme_dark()

Tableau Visualization

Explore the interactive dashboard here: Tableau Public Dashboard

Source: City of Austin, 2022 Music Census

#Builds a multiple linear regression model to predict income level using annual spending on new recordings, annual spending on gear or rentals, and years of experience as predictor variables
model <- lm(IncomeLevel~
              `CREATIVES Annual Spending on New Recordings`+
              `CREATIVES Annual Spending on Gear or Rentals`+
              `Years Experience`,
              data=CleanMusic
              )
#Shows results like p-value and R-squared
summary(model)
## 
## Call:
## lm(formula = IncomeLevel ~ `CREATIVES Annual Spending on New Recordings` + 
##     `CREATIVES Annual Spending on Gear or Rentals` + `Years Experience`, 
##     data = CleanMusic)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.8773 -2.2881 -0.2164  2.2861 18.6969 
## 
## Coefficients:
##                                                  Estimate Std. Error t value
## (Intercept)                                     6.635e+00  4.526e-01  14.660
## `CREATIVES Annual Spending on New Recordings`   8.299e-05  2.545e-05   3.261
## `CREATIVES Annual Spending on Gear or Rentals` -8.664e-06  4.732e-05  -0.183
## `Years Experience`6 to 10                      -4.130e-01  5.460e-01  -0.756
## `Years Experience`Less than 3                  -2.189e+00  8.230e-01  -2.659
## `Years Experience`More than 10                  5.376e-01  4.740e-01   1.134
##                                                Pr(>|t|)    
## (Intercept)                                     < 2e-16 ***
## `CREATIVES Annual Spending on New Recordings`   0.00117 ** 
## `CREATIVES Annual Spending on Gear or Rentals`  0.85478    
## `Years Experience`6 to 10                       0.44965    
## `Years Experience`Less than 3                   0.00803 ** 
## `Years Experience`More than 10                  0.25718    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.313 on 632 degrees of freedom
## Multiple R-squared:  0.04937,    Adjusted R-squared:  0.04185 
## F-statistic: 6.565 on 5 and 632 DF,  p-value: 5.739e-06
par(mfrow=c(2,2))
plot(model)

plot(model, which=1)
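The coefficient table above has rows for the `6 to 10`, `Less than 3`, and `More than 10` experience groups but none for `3 to 5`. This suggests that R used `3 to 5` as the baseline, since `lm()` converts a character predictor to a factor whose levels are sorted. The sketch below, on hypothetical toy data with made-up column names `years` and `income`, shows one way to set the baseline explicitly so that every coefficient is a comparison against the least experienced group.

```r
# Hypothetical toy data; in the project the column is `Years Experience`
set.seed(1)
toy <- data.frame(
  years  = rep(c("Less than 3", "3 to 5", "6 to 10", "More than 10"), each = 5),
  income = rnorm(20)
)

# Put the levels in their natural order; the first level becomes the baseline
toy$years <- factor(
  toy$years,
  levels = c("Less than 3", "3 to 5", "6 to 10", "More than 10")
)

fit <- lm(income ~ years, data = toy)

# The baseline level ("Less than 3") has no coefficient of its own
print(names(coef(fit)))
```

Changing the baseline does not change the model's fit or R-squared. It only changes which group each experience coefficient is compared against.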

Final Conclusions: My visualization has the outcome variable, income level, on the y-axis. The x-axis is the annual money spent on new recordings. The color of each point represents how many years of experience the musician had. The size of each point shows how much they spent annually on gear or rentals. One interesting pattern was that musicians with less than 3 years of experience appeared to have lower income levels. Most of the points are clustered at lower spending levels, mostly between 0 and 10,000 for money spent on new recordings. There seems to be a positive relationship between money spent on new recordings and income level, but it does not appear to be very strong. Annual spending on gear or rentals seems to be spread across the graph and does not appear to have any effect on income level.

Looking at the multiple linear regression, I can come to more definitive conclusions. The baseline group for years of experience is 3 to 5 years, so each experience coefficient compares a group against that baseline. Musicians with less than 3 years of experience tend to have an income level about 2.19 points lower than musicians with 3 to 5 years of experience, which is backed by the p-value of 0.00803. The 6 to 10 years group had a p-value of 0.450 and the more than 10 years group had a p-value of 0.257, which means neither differs significantly from the baseline. Essentially, having more than 5 years of experience is not associated with a significantly higher income level than having 3 to 5 years. Annual spending on new recordings was statistically significant with a p-value of 0.00117. However, an increase of one dollar in annual spending on new recordings is associated with an increase in income level of only 0.0000830. Additionally, annual spending on gear or rentals was not statistically significant with a p-value of 0.855, which means that the amount of money spent on musical equipment or studio space did not have a detectable effect on a musician's income level. Most important of all, the multiple R-squared value of 0.04937 suggests that these predictor variables explain only 4.937% of the variability in a musician's income level, which means many other factors contribute to the variation.
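A per-dollar coefficient is hard to read on its own, so it can help to rescale it to a more realistic change in spending. The short calculation below takes the estimate for annual spending on new recordings from the `summary(model)` output (8.299e-05) and rescales it. The variable names are illustrative only, and the 0 to 28 range assumes each of the seven income sources is scored from 0 to 4 as in the recoding step.

```r
# Estimate for annual spending on new recordings, from summary(model)
coef_per_dollar <- 8.299e-05

# Expected change in the income level score for larger spending differences
per_1000  <- coef_per_dollar * 1000    # about 0.083 points per $1,000
per_10000 <- coef_per_dollar * 10000   # about 0.83 points per $10,000

# Maximum possible score: seven income sources, each scored 0 to 4
max_score <- 7 * 4

print(c(per_1000 = per_1000, per_10000 = per_10000, max_score = max_score))
```

Even a difference of $10,000 in annual recording spending corresponds to less than one point on the income level score, which is consistent with the low R-squared reported above.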