Question 1a: Number of Cases (Rows)
case_count <- nrow(data)
case_count
## [1] 500
Question 1b: Number of Variables (Columns)
variable_count <- ncol(data)
variable_count
## [1] 10
Question 1c: First 10 Instances
head(data, 10) %>% kable() %>% kable_styling()
| Job_Title | Industry | Company_Size | Location | AI_Adoption_Level | Automation_Risk | Required_Skills | Salary_USD | Remote_Friendly | Job_Growth_Projection |
|---|---|---|---|---|---|---|---|---|---|
| Cybersecurity Analyst | Entertainment | Small | Dubai | Medium | High | UX/UI Design | 111392.17 | Yes | Growth |
| Marketing Specialist | Technology | Large | Singapore | Medium | High | Marketing | 93792.56 | No | Decline |
| AI Researcher | Technology | Large | Singapore | Medium | High | UX/UI Design | 107170.26 | Yes | Growth |
| Sales Manager | Retail | Small | Berlin | Low | High | Project Management | 93027.95 | No | Growth |
| Cybersecurity Analyst | Entertainment | Small | Tokyo | Low | Low | JavaScript | 87752.92 | Yes | Decline |
| UX Designer | Education | Large | San Francisco | Medium | Medium | Cybersecurity | 102825.01 | No | Growth |
| HR Manager | Finance | Medium | Singapore | Low | High | Sales | 102065.72 | Yes | Growth |
| Cybersecurity Analyst | Technology | Small | Dubai | Medium | Low | Machine Learning | 86607.32 | Yes | Decline |
| AI Researcher | Retail | Large | London | High | Low | JavaScript | 75015.86 | No | Stable |
| Sales Manager | Entertainment | Medium | Singapore | High | Low | Cybersecurity | 96834.58 | Yes | Decline |
Question 1d: Missing Values Analysis
missing_summary <- sapply(data, function(x) sum(is.na(x)))
missing_summary <- missing_summary[missing_summary > 0]
missing_summary
## named integer(0)
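The check above counts only true NA values. In text columns, blank strings or placeholder labels would pass through it unnoticed. A minimal sketch of a broader check follows, assuming placeholders are spelled "", "NA", or "N/A" (an assumption, not something observed in this dataset). The data frame demo and the helper count_missing_like are illustrative names, and the blank and "N/A" entries are invented for the demonstration.

```r
# Toy stand-in for `data`; the "" and "N/A" entries and the NA salary
# are invented purely to exercise the check.
demo <- data.frame(
  Industry = c("Retail", "", "Finance", "N/A"),
  Salary_USD = c(93027.95, 93792.56, NA, 75015.86),
  stringsAsFactors = FALSE
)

# Assumed placeholder spellings; adjust to whatever the real file uses.
placeholder_tokens <- c("", "NA", "N/A")

# Count NA values, plus blank/placeholder strings in text columns.
count_missing_like <- function(x) {
  if (is.character(x) || is.factor(x)) {
    x <- as.character(x)
    sum(is.na(x) | trimws(x) %in% placeholder_tokens)
  } else {
    sum(is.na(x))
  }
}

missing_like <- sapply(demo, count_missing_like)
missing_like
```

Applied to the real data, sapply(data, count_missing_like) would confirm whether the empty result above also holds for placeholder strings.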
Part 2: Research Questions
Part 2a: Research Questions and Hypotheses
Question 1: Does the AI_Adoption_Level of a company impact the Salary_USD for jobs within that company?
Hypothesis 1: Higher AI_Adoption_Level is associated with higher Salary_USD.
Question 2: Can the Automation_Risk level of a job predict its Job_Growth_Projection?
Hypothesis 2: Jobs with higher Automation_Risk are more likely to have a “Decline” in Job_Growth_Projection.
Part 2b: Relevant Variables for Each Research Question
Question 1: Relevant Variables - AI_Adoption_Level, Salary_USD
Question 2: Relevant Variables - Automation_Risk, Job_Growth_Projection
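Both explanatory variables take the values Low, Medium, and High. If they are stored as character columns, R orders them alphabetically (High, Low, Medium) in plots and tables. A minimal sketch of fixing the level order follows, using a toy data frame named demo (an illustrative name) with a few values copied from the first rows shown earlier.

```r
# Toy stand-in for `data`, using values from the first rows of the table.
demo <- data.frame(
  AI_Adoption_Level = c("Medium", "Low", "High", "Medium"),
  Automation_Risk = c("High", "Low", "Low", "Medium"),
  stringsAsFactors = FALSE
)

# Natural ordering shared by both variables.
level_order <- c("Low", "Medium", "High")

demo$AI_Adoption_Level <- factor(demo$AI_Adoption_Level, levels = level_order)
demo$Automation_Risk <- factor(demo$Automation_Risk, levels = level_order)

# Levels now follow Low < Medium < High on axes and in table() output.
levels(demo$AI_Adoption_Level)
table(demo$Automation_Risk)
```

The same two factor() calls applied to data would make the boxplots and bar charts read from Low to High.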
Part 2c: Identify Response Variables
Question 1: Response Variable: Salary_USD
Question 2: Response Variable: Job_Growth_Projection
Part 2d: Missing Values for Response Variables
missing_salary <- sum(is.na(data$Salary_USD))
missing_growth <- sum(is.na(data$Job_Growth_Projection))
missing_values_responses <- list(Salary_USD = missing_salary, Job_Growth_Projection = missing_growth)
missing_values_responses
## $Salary_USD
## [1] 0
##
## $Job_Growth_Projection
## [1] 0
Part 2e: Distribution of Response Variables
Salary Distribution
ggplot(data, aes(x = Salary_USD)) +
  geom_histogram(binwidth = 5000) +
  labs(title = "Salary Distribution", x = "Salary (USD)", y = "Count")
Job Growth Projection Distribution
ggplot(data, aes(x = Job_Growth_Projection)) +
  geom_bar() +
  labs(title = "Job Growth Projection Distribution", x = "Job Growth Projection", y = "Count")
Part 2f: Relationship Between Response and Explanatory Variables
Relationship between Salary and AI Adoption Level
ggplot(data, aes(x = AI_Adoption_Level, y = Salary_USD, fill = AI_Adoption_Level)) +
  geom_boxplot() +
  labs(title = "Salary by AI Adoption Level", x = "AI Adoption Level", y = "Salary (USD)") +
  theme_minimal()
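The boxplot gives a visual comparison only. One way to put numbers on Hypothesis 1 is to compare group medians and run a Kruskal-Wallis rank test. This is a sketch, not necessarily the method the analysis will adopt: the test only indicates whether salary distributions differ across adoption levels, not the direction of the difference. The toy data frame demo (an illustrative name) holds the ten rows shown earlier, so its result says nothing about the full 500 cases.

```r
# The ten rows displayed in Question 1c, restricted to the two variables
# relevant to Question 1.
demo <- data.frame(
  AI_Adoption_Level = c("Medium", "Medium", "Medium", "Low", "Low",
                        "Medium", "Low", "Medium", "High", "High"),
  Salary_USD = c(111392.17, 93792.56, 107170.26, 93027.95, 87752.92,
                 102825.01, 102065.72, 86607.32, 75015.86, 96834.58)
)
demo$AI_Adoption_Level <- factor(demo$AI_Adoption_Level,
                                 levels = c("Low", "Medium", "High"))

# Median salary per adoption level.
group_medians <- aggregate(Salary_USD ~ AI_Adoption_Level,
                           data = demo, FUN = median)
group_medians

# Rank-based test of whether salary distributions differ across the levels.
kw <- kruskal.test(Salary_USD ~ AI_Adoption_Level, data = demo)
kw$p.value
```

Replacing demo with data in the aggregate() and kruskal.test() calls would run the same comparison on the full dataset.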
Relationship between Salary and Company Size
ggplot(data, aes(x = Company_Size, y = Salary_USD, fill = Company_Size)) +
  geom_boxplot() +
  labs(title = "Salary by Company Size", x = "Company Size", y = "Salary (USD)") +
  theme_minimal()
Relationship between Job Growth Projection and Automation Risk
ggplot(data, aes(x = Automation_Risk, fill = Job_Growth_Projection)) +
  geom_bar(position = "dodge") +
  labs(title = "Job Growth Projection by Automation Risk", x = "Automation Risk", y = "Count")
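The dodged bars show raw counts, which are hard to compare when risk groups differ in size. A contingency table with row proportions, followed by a test of independence, is one possible way to examine Hypothesis 2. This sketch uses the ten rows shown earlier in a toy data frame named demo (an illustrative name). Fisher's exact test is used here only because the toy counts are small, and the result does not describe the full dataset.

```r
# The ten rows displayed in Question 1c, restricted to the two variables
# relevant to Question 2.
demo <- data.frame(
  Automation_Risk = c("High", "High", "High", "High", "Low",
                      "Medium", "High", "Low", "Low", "Low"),
  Job_Growth_Projection = c("Growth", "Decline", "Growth", "Growth", "Decline",
                            "Growth", "Growth", "Decline", "Stable", "Decline")
)
demo$Automation_Risk <- factor(demo$Automation_Risk,
                               levels = c("Low", "Medium", "High"))
demo$Job_Growth_Projection <- factor(demo$Job_Growth_Projection,
                                     levels = c("Decline", "Stable", "Growth"))

# Cross-tabulate risk level (rows) against growth projection (columns).
risk_by_growth <- table(demo$Automation_Risk, demo$Job_Growth_Projection,
                        dnn = c("Automation_Risk", "Job_Growth_Projection"))
risk_by_growth

# Share of each projection within a risk level (each row sums to 1).
row_shares <- prop.table(risk_by_growth, margin = 1)
round(row_shares, 2)

# Exact test of independence, suited to the small toy counts.
ft <- fisher.test(risk_by_growth)
ft$p.value
```

With all 500 rows, chisq.test(table(data$Automation_Risk, data$Job_Growth_Projection)) would be the more usual choice, since expected cell counts should then be large enough for the chi-squared approximation.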
Relationship between Job Growth Projection and Industry
ggplot(data, aes(x = Industry, fill = Job_Growth_Projection)) +
  geom_bar(position = "dodge") +
  labs(title = "Job Growth Projection by Industry", x = "Industry", y = "Count") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))