---
title: "Sai Lasya Navoor"
output: html_document
---
#Importing data
data <- read.csv("C:\\Users\\91814\\Desktop\\Statistics\\nurses.csv")
# Install and load ggplot2
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
#Column 1 summary
col1_summary <- summary(data$Total_Employed_RN)
print(col1_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 240 12210 31160 47704 60230 307060 5
The summary statistics for Total_Employed_RN describe the distribution of employed Registered Nurses (RNs) across all rows of the dataset (each state contributes several rows):
Min: 240 (lowest RN count in any record)
1st Qu.: 12,210 (25% of records fall below)
Median: 31,160 (middle value)
Mean: 47,704 (average across records)
3rd Qu.: 60,230 (75% of records fall below)
Max: 307,060 (highest RN count in any record)
NA’s: 5 missing values
Significance: The mean (47,704) sits well above the median (31,160), so the distribution is right-skewed: a small number of records with very large RN workforces pull the average up.
Further Questions:
Outliers: Are there states with exceptionally high or low RN employment?
Regional Disparities: Are there regional patterns in RN distribution?
Healthcare Impact: How does variation in RN employment correlate with healthcare services?
Temporal Trends: Are there trends in RN employment over different years?
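The outlier question can be approached with the usual 1.5 × IQR rule. The sketch below is illustrative only: `toy` is an invented stand-in for the real `data` frame, and `flag_iqr_outliers` is a hypothetical helper, not part of the original analysis.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = c("A", "B", "C", "D", "E", "F"),
  Total_Employed_RN = c(12000, 31000, 45000, 60000, NA, 307000)
)

# Hypothetical helper: return rows lying more than k * IQR outside the quartiles
flag_iqr_outliers <- function(df, column, k = 1.5) {
  x <- df[[column]]
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[[2]] - q[[1]])
  # Missing values are never flagged
  is_outlier <- !is.na(x) & (x < q[[1]] - fence | x > q[[2]] + fence)
  df[is_outlier, , drop = FALSE]
}

outliers <- flag_iqr_outliers(toy, "Total_Employed_RN")
print(outliers)
```

Applied to the real data, the call would be `flag_iqr_outliers(data, "Total_Employed_RN")`. Because the distribution is right-skewed, most flagged rows would be expected on the high side.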
#Column 2 summary
col2_summary <- summary(data$Annual_Salary_Avg)
print(col2_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 19190 49300 58750 59248 67378 120560 6
The summary statistics for Annual_Salary_Avg describe the distribution of average annual salaries for Registered Nurses across all rows of the dataset:
Min: $19,190 (lowest reported salary)
1st Qu.: $49,300 (25% of records fall below)
Median: $58,750 (middle value)
Mean: $59,248 (average across records)
3rd Qu.: $67,378 (75% of records fall below)
Max: $120,560 (highest reported salary)
NA’s: 6 missing values
Significance: The mean ($59,248) and median ($58,750) are close, suggesting that the bulk of salaries is roughly symmetric around $59,000, while the extremes ($19,190 and $120,560) show how far individual records sit from that center.
Further Questions:
Regional Disparities: Are there regional patterns in RN average annual salaries?
Correlation with Employment: How does salary variation correlate with the total number of employed RNs?
Workforce Impact: How do salary levels influence RN availability and retention?
Factors Affecting Salaries: What state-specific factors contribute to the wide range of RN salaries?
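The question about salary and employment can be checked with a Pearson correlation. This is a minimal sketch on invented numbers; only the column names come from the report, and both columns are given missing values to show how they are handled.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  Total_Employed_RN = c(10000, 20000, 30000, NA, 50000),
  Annual_Salary_Avg = c(50000, 56000, NA, 62000, 69000)
)

# use = "complete.obs" drops every row where either variable is missing
r <- cor(toy$Total_Employed_RN, toy$Annual_Salary_Avg, use = "complete.obs")

# Report how many rows actually entered the calculation
n_used <- sum(complete.cases(toy))
cat("Pearson r:", round(r, 3), "based on", n_used, "rows\n")
```

Reporting `n_used` next to the coefficient makes it clear how much data the missing values removed.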
col3_unique_val <- unique(data$State)
col3_val_count <- table(data$State)
cat("Categorical Summary for State :\n")
## Categorical Summary for State :
# Take the labels from the table itself: table() sorts states alphabetically,
# while unique() keeps file order, so pairing the two misaligns the rows
print(data.frame(State = names(col3_val_count), Count = as.vector(col3_val_count)))
## State Count
## 1 Alabama 23
## 2 Alaska 23
## 3 Arizona 23
## 4 Arkansas 23
## 5 California 23
## 6 Colorado 23
## 7 Connecticut 23
## 8 Delaware 23
## 9 District of Columbia 23
## 10 Florida 23
## 11 Georgia 23
## 12 Guam 23
## 13 Hawaii 23
## 14 Idaho 23
## 15 Illinois 23
## 16 Indiana 23
## 17 Iowa 23
## 18 Kansas 23
## 19 Kentucky 23
## 20 Louisiana 23
## 21 Maine 23
## 22 Maryland 23
## 23 Massachusetts 23
## 24 Michigan 23
## 25 Minnesota 23
## 26 Mississippi 23
## 27 Missouri 23
## 28 Montana 23
## 29 Nebraska 23
## 30 Nevada 23
## 31 New Hampshire 23
## 32 New Jersey 23
## 33 New Mexico 23
## 34 New York 23
## 35 North Carolina 23
## 36 North Dakota 23
## 37 Ohio 23
## 38 Oklahoma 23
## 39 Oregon 23
## 40 Pennsylvania 23
## 41 Puerto Rico 23
## 42 Rhode Island 23
## 43 South Carolina 23
## 44 South Dakota 23
## 45 Tennessee 23
## 46 Texas 23
## 47 Utah 23
## 48 Vermont 23
## 49 Virgin Islands 23
## 50 Virginia 23
## 51 Washington 23
## 52 West Virginia 23
## 53 Wisconsin 23
## 54 Wyoming 23
The summary for the variable State shows that each of the 54 jurisdictions (50 states, the District of Columbia, Guam, Puerto Rico, and the Virgin Islands) appears in exactly 23 rows, so the dataset is balanced by row count.
Significance:
The uniform row count means no state is over-represented in pooled summaries, although missing values (5 in Total_Employed_RN, 6 in Annual_Salary_Avg) leave some states with fewer usable observations.
Inclusion of territories and the District of Columbia enriches the dataset’s geographical diversity.
Further Questions:
Regional Patterns: Are there distinct regional patterns in RN employment and salaries?
Territorial Influence: How do territories and the District of Columbia impact overall healthcare workforce trends?
State-specific Analyses: Should certain analyses be tailored to specific states based on unique characteristics?
Temporal Trends: How have the representation and characteristics of states evolved over different years?
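Equal row counts do not guarantee equal numbers of usable observations, so it can help to check both. The sketch below uses an invented `toy` frame with the report's column names and is meant as an illustration rather than the author's procedure.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = rep(c("Alabama", "Guam", "Texas"), each = 3),
  Total_Employed_RN = c(100, 110, 120, NA, 5, 6, 900, NA, NA)
)

# Balanced by row count when every state has the same number of rows
rows_per_state <- table(toy$State)
is_balanced <- length(unique(as.vector(rows_per_state))) == 1

# tapply() sums the missing-value indicator within each state
missing_per_state <- tapply(is.na(toy$Total_Employed_RN), toy$State, sum)

print(is_balanced)
print(missing_per_state)
```

On the real data, a state with many missing values would still show 23 rows in the categorical summary, which is why the second check is useful.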
What are the variations in average and median hourly wages for Registered Nurses (RN) across different states in 2020?
How does the total number of employed Registered Nurses vary among states in 2020?
Is there a strong correlation between the hourly wage and annual salary for RNs across all states?
Hourly Wage Variations:
Insight: Significant variations in average and median hourly wages for RNs across states in 2020.
Significance: Essential for assessing financial landscapes, aiding workforce planning, and attracting healthcare professionals.
Further Questions:
Are there specific states with significantly higher or lower hourly wages?
What factors contribute to the observed variations?
RN Employment Variability:
Insight: Disparities in the total number of employed RNs across states in 2020.
Significance: Crucial for healthcare resource allocation, addressing shortages, and optimizing workforce distribution.
Further Questions:
Are there states experiencing shortages or surpluses of RNs?
What regional factors contribute to the observed variability?
Hourly Wage vs. Annual Salary Correlation:
Insight: The correlation between hourly wages and annual salaries shows how closely the two pay measures track each other for RNs across states.
Significance: Helps assess how changes in hourly wages may impact annual salaries, aiding both employers and employees.
Further Questions:
Are there states with notably strong or weak correlations?
What external factors contribute to observed correlation patterns?
Overall Implications: These insights inform healthcare policies, workforce planning, and financial considerations for RNs across states, contributing to more informed decision-making and efficient healthcare delivery. Further investigations into specific states and long-term trends can deepen our understanding of nursing workforce dynamics.
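One way to probe the wage-salary relationship is to compute the correlation and the hours per year implied by each row. The sketch assumes, purely for illustration, that annual salary equals hourly wage times 2,080 hours (a common full-time convention). The toy values are constructed to satisfy that assumption, and it may not hold for the real data.

```r
# Toy stand-in for the real data frame (values invented for illustration;
# salaries are built as hourly wage * 2080, an assumed full-time convention)
toy <- data.frame(
  Hourly_Wage_Avg = c(25, 30, 35, NA, 45),
  Annual_Salary_Avg = c(52000, 62400, 72800, 80000, 93600)
)

# Correlation over rows where both measures are present
r <- cor(toy$Hourly_Wage_Avg, toy$Annual_Salary_Avg, use = "complete.obs")

# Hours per year implied by each row; NA where the hourly wage is missing
implied_hours <- toy$Annual_Salary_Avg / toy$Hourly_Wage_Avg

print(r)
print(summary(implied_hours))
```

If the implied hours are nearly constant in the real data, the two columns carry the same information and a near-perfect correlation follows by construction.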
total_employed_rn_by_state <- aggregate(Total_Employed_RN ~ State, data = data, FUN = sum)
print(total_employed_rn_by_state)
## State Total_Employed_RN
## 1 Alabama 946180
## 2 Alaska 121130
## 3 Arizona 937700
## 4 Arkansas 503920
## 5 California 5566540
## 6 Colorado 902540
## 7 Connecticut 767150
## 8 Delaware 207590
## 9 District of Columbia 222020
## 10 Florida 3526190
## 11 Georgia 1461450
## 12 Guam 10070
## 13 Hawaii 219830
## 14 Idaho 253310
## 15 Illinois 2569850
## 16 Indiana 1309490
## 17 Iowa 698910
## 18 Kansas 593800
## 19 Kentucky 926360
## 20 Louisiana 909870
## 21 Maine 312040
## 22 Maryland 1122460
## 23 Massachusetts 1825050
## 24 Michigan 1979710
## 25 Minnesota 1272460
## 26 Mississippi 606990
## 27 Missouri 1394100
## 28 Montana 194690
## 29 Nebraska 434100
## 30 Nevada 374640
## 31 New Hampshire 290730
## 32 New Jersey 1756880
## 33 New Mexico 305610
## 34 New York 3882230
## 35 North Carolina 1888280
## 36 North Dakota 171380
## 37 Ohio 2672810
## 38 Oklahoma 592930
## 39 Oregon 681800
## 40 Pennsylvania 2934050
## 41 Puerto Rico 379720
## 42 Rhode Island 264480
## 43 South Carolina 838800
## 44 South Dakota 237720
## 45 Tennessee 1283370
## 46 Texas 3935710
## 47 Utah 402360
## 48 Vermont 135440
## 49 Virgin Islands 7480
## 50 Virginia 1323510
## 51 Washington 1146690
## 52 West Virginia 405170
## 53 Wisconsin 1202980
## 54 Wyoming 99430
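Insights: Summing Total_Employed_RN by state adds up all of a state's rows (the formula interface drops rows with missing values), so these figures are cumulative totals over the whole dataset rather than 2020 headcounts. They still rank states by workforce size: California (5,566,540), Texas (3,935,710), and New York (3,882,230) are the largest, while the Virgin Islands (7,480) and Guam (10,070) are the smallest.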
Significance:
Workforce Magnitude: Highlights states with larger or smaller RN populations.
Healthcare Capacity: Crucial for assessing each state’s healthcare workforce capacity.
Resource Planning: Aids policymakers in resource allocation and targeted interventions.
Further Questions:
Regional Patterns: Are there regional disparities in RN distribution?
Healthcare Impact: How does RN workforce variation correlate with healthcare service quality?
Influencing Factors: What contributes to differences in RN workforce size across states?
Temporal Trends: How has RN distribution evolved over different years?
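A sum over all rows and a single-year headcount answer different questions. The sketch below contrasts them on invented data. The `Year` column is an assumption, since the code in this report never references a year variable; the name would need to match the actual file.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = rep(c("Alaska", "Texas"), each = 3),
  Year = rep(c(2018, 2019, 2020), times = 2),  # hypothetical column name
  Total_Employed_RN = c(5000, 5100, 5200, 200000, 210000, NA)
)

# Sum over every row of a state; rows with NA are dropped by the formula interface
summed_all_rows <- aggregate(Total_Employed_RN ~ State, data = toy, FUN = sum)

# Headcount for one year only; a missing value stays visible as NA
only_2020 <- subset(toy, Year == 2020, select = c(State, Total_Employed_RN))

print(summed_all_rows)
print(only_2020)
```

The single-year subset keeps the missing Texas value visible, whereas the summed version silently omits it.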
boxplot(data$Hourly_Wage_Avg, main="Boxplot for Hourly Wages", ylab="Hourly Wage", col="skyblue")
Insights: The boxplot shows the distribution of Hourly_Wage_Avg across every row of the dataset; the code applies no filter for 2020, so each state can contribute several values.
Significance:
Variability Highlighted: The plot reveals the range and central tendency of hourly wages, identifying potential outliers.
Outlier Identification: States with exceptionally high or low wages stand out, providing insights into potential disparities.
Financial Landscape Snapshot: A visual representation aids in understanding how hourly wages vary, crucial for assessing the financial aspects of RN positions.
Further Questions:
Factors Behind Outliers: What contributes to the extreme values in hourly wages in certain states?
Regional Patterns: Are there discernible regional patterns in RN hourly wage distributions?
Cost of Living Comparison: How does the wage distribution align with the cost of living in different states?
Temporal Trends: How has the distribution evolved over different years?
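The points a boxplot draws beyond its whiskers can be listed with `boxplot.stats()`, which uses the same rule as `boxplot()`. A minimal sketch on invented values, with only the column names taken from the report:

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = c("A", "B", "C", "D", "E", "F", "G", "H"),
  Hourly_Wage_Avg = c(28, 30, 31, 33, 34, 36, NA, 75)
)

# boxplot.stats() ignores NA values and returns the points beyond the whiskers in $out
stats <- boxplot.stats(toy$Hourly_Wage_Avg)

# Recover the full rows behind those points
outlier_rows <- toy[toy$Hourly_Wage_Avg %in% stats$out, ]
print(outlier_rows)
```

Listing the rows makes it possible to name the states behind the extreme wages instead of only seeing unlabeled dots.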
barplot(height = data$Annual_Salary_Avg, names.arg = data$State,
main = "Barplot for Annual Salaries", xlab = "State", ylab = "Annual Salary",
col = "lightcoral", border = "black", space = 0.5)
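Insights: The barplot draws one bar for every row of the dataset and labels it with that row's state; the code neither filters for 2020 nor averages within a state, so each state appears as a run of bars rather than a single value.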
Significance:
Quick Comparison: Allows for a rapid comparison of RN salaries across states.
Disparity Identification: Highlights variations in salaries, indicating potential economic and workforce differences.
Policy Considerations: Useful for policymakers in allocating resources and planning interventions based on states with specific salary challenges.
Further Questions:
Regional Patterns: Are there noticeable regional trends in RN annual salary distributions?
Workforce Impact: How do salary variations relate to RN availability and retention in different states?
External Factors: What external factors contribute to observed salary disparities?
Temporal Analysis: How has the distribution of annual salaries changed over different years?
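For a state-by-state comparison, one option is to reduce the data to a single value per state before plotting. The sketch below uses the mean over each state's rows, which is an illustrative choice rather than the report's method, and runs on invented data.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = rep(c("Alabama", "California", "Guam"), each = 2),
  Annual_Salary_Avg = c(48000, 52000, 90000, 110000, NA, 40000)
)

# One mean per state; rows with missing salaries are dropped by the formula interface
mean_salary <- aggregate(Annual_Salary_Avg ~ State, data = toy, FUN = mean)

# Sort so the tallest bar comes first
mean_salary <- mean_salary[order(mean_salary$Annual_Salary_Avg, decreasing = TRUE), ]

# las = 2 turns the state labels vertical so they do not overlap
barplot(height = mean_salary$Annual_Salary_Avg,
        names.arg = as.character(mean_salary$State),
        main = "Mean Annual Salary by State (toy data)",
        ylab = "Annual Salary", col = "lightcoral", las = 2)
```

With 54 jurisdictions, sorted bars and vertical labels keep the chart readable.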
pairs(data[, c("Hourly_Wage_Avg", "Annual_Salary_Avg", "Total_Employed_RN")],
main="Scatterplot Matrix", col="darkgreen", pch=16)
Insights: The scatterplot matrix plots hourly wages, annual salaries, and the total number of employed Registered Nurses (RNs) against one another, pair by pair, using every row of the dataset.
Significance:
Pattern Identification: Reveals patterns and relationships, helping understand how changes in one variable may impact others.
Correlation Assessment: Aids in assessing the strength and direction of correlations between key workforce variables.
Comprehensive Analysis: Examining all variables simultaneously provides a comprehensive view of their interdependencies, enhancing our understanding of workforce dynamics.
Further Questions:
Correlation Strength: How strong are the observed correlations?
Outlier Analysis: Are there states with outlier behavior in these relationships?
Causal Factors: What contributes to the observed trends, and can causality be inferred?
Temporal Changes: How have these relationships evolved over different years?
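The question of correlation strength can be answered numerically with a correlation matrix over the same three columns. A minimal sketch on invented values, with only the column names taken from the report:

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  Hourly_Wage_Avg = c(25, 30, 35, 40, NA, 50),
  Annual_Salary_Avg = c(52000, 61000, 74000, 82000, 95000, NA),
  Total_Employed_RN = c(12000, 8000, 40000, NA, 150000, 90000)
)

cols <- c("Hourly_Wage_Avg", "Annual_Salary_Avg", "Total_Employed_RN")

# pairwise.complete.obs uses, for each pair, all rows where both columns are present
cor_matrix <- cor(toy[, cols], use = "pairwise.complete.obs")
print(round(cor_matrix, 2))
```

With pairwise deletion each coefficient can rest on a different set of rows, so it is worth keeping in mind when comparing cells of the matrix.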
scatterplot_matrix <- ggplot(data, aes(x = Hourly_Wage_Avg, y = Annual_Salary_Avg, color = State)) +
geom_point() +
labs(title = "Scatterplot Matrix with Interactions", x = "Hourly Wage", y = "Annual Salary") +
theme_minimal()
print(scatterplot_matrix)
## Warning: Removed 6 rows containing missing values (`geom_point()`).
Insights: Despite its title, this plot is a single scatterplot of annual salary against hourly wage, with each point colored by state; it uses every row with non-missing values (6 rows were dropped).
Significance:
State-wise Patterns: Color-coded points reveal variations in wage-salary relationships among states.
Visual Comparison: Color lets states be compared within one static plot, although 54 colors are hard to tell apart.
Cluster Identification: Clusters indicate groups of states with similar dynamics, hinting at regional trends.
Further Questions:
Cluster Analysis: What factors contribute to distinct clusters in wage-salary relationships among states?
Outlier Identification: Do specific states exhibit outlier behavior, and what factors explain such outliers?
Regional Disparities: How do regional variations impact observed patterns?
Temporal Analysis: How have state-wise relationships evolved over different years?
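State-wise relationships can also be summarized numerically by computing the wage-salary correlation within each state. This is a sketch on invented data, not the report's method; `per_state_r` is a hypothetical name.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  State = rep(c("Ohio", "Utah"), each = 4),
  Hourly_Wage_Avg = c(20, 22, 24, 26, 21, 23, 25, NA),
  Annual_Salary_Avg = c(41600, 45760, 49920, 54080, 44000, 47500, 52500, 56000)
)

# split() gives one data frame per state; sapply() returns a named vector of correlations
per_state_r <- sapply(split(toy, toy$State), function(d) {
  cor(d$Hourly_Wage_Avg, d$Annual_Salary_Avg, use = "complete.obs")
})

# Weakest relationships first
print(sort(per_state_r))
```

Sorting the result puts the states with the weakest relationships at the top, which is a direct way to look for outlier behavior.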
interaction_plot <- ggplot(data, aes(x = Hourly_Wage_Avg, y = Annual_Salary_Avg, color = State, shape = as.factor(Location_Quotient))) +
geom_point() +
labs(title = "Interactions between Categorical and Continuous Variables",
x = "Hourly Wage", y = "Annual Salary", color = "State", shape = "Location Quotient") +
theme_minimal()
print(interaction_plot)
## Warning: The shape palette can deal with a maximum of 6 discrete values because more
## than 6 becomes difficult to discriminate
## ℹ you have requested 130 values. Consider specifying shapes manually if you
## need that many have them.
## Warning: Removed 1235 rows containing missing values (`geom_point()`).
Insights: The interaction plot relates hourly wages to annual salaries, with color for state and shape for Location Quotient (LQ). Because LQ is numeric, converting it with as.factor() produced 130 distinct levels; ggplot2 assigns shapes to at most 6, so 1,235 rows were removed and only 7 of the 1,242 rows (54 states × 23 rows) are drawn.
Significance:
Dual Encoding: The plot attempts to show state (color) and LQ (shape) influences on the wage-salary relationship at once.
Visual Differentiation: Color indicates states and shape represents LQ, but with 54 states and 130 LQ values the legend is too large for trends to be read reliably.
Static View: The plot is not interactive; comparing groups depends on telling the colors and shapes apart.
Further Questions:
State-specific Trends: What trends emerge for states, and how do they relate to workforce dynamics?
LQ Impact: How does LQ influence patterns, especially in states with notable variations?
Policy Implications: How can policymakers use this information for targeted interventions?
Temporal Changes: How have these interactions evolved over different years?
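The shape warning above arises because a numeric variable was turned into one factor level per distinct value. A common workaround is to bin the variable into a handful of bands first. In the sketch below the band edges (0.8, 1.0, 1.2) and the `LQ_Band` column are illustrative assumptions, and the data are invented.

```r
# Toy stand-in for the real data frame (values invented for illustration)
toy <- data.frame(
  Hourly_Wage_Avg = c(25, 30, 35, 40, 45, 50),
  Annual_Salary_Avg = c(52000, 62000, 73000, 83000, 94000, 104000),
  Location_Quotient = c(0.62, 0.85, 0.98, 1.05, 1.31, NA)
)

# Band edges are assumptions for illustration; right = FALSE makes intervals [low, high)
toy$LQ_Band <- cut(toy$Location_Quotient,
                   breaks = c(-Inf, 0.8, 1.0, 1.2, Inf),
                   labels = c("< 0.8", "0.8-1.0", "1.0-1.2", "> 1.2"),
                   right = FALSE)
print(table(toy$LQ_Band, useNA = "ifany"))

# Four bands fit within ggplot2's six-shape palette
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  banded_plot <- ggplot(toy[!is.na(toy$LQ_Band), ],
                        aes(x = Hourly_Wage_Avg, y = Annual_Salary_Avg, shape = LQ_Band)) +
    geom_point() +
    labs(x = "Hourly Wage", y = "Annual Salary", shape = "Location Quotient") +
    theme_minimal()
  print(banded_plot)
}
```

Dropping the color-by-state mapping, or faceting by region, would further reduce clutter on the full dataset.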