R Markdown

Question
library(ggplot2)

# Read the CSV file (adjust the path as needed)
data <- read.csv("dataa.csv")

# Renaming columns for ease of use
colnames(data) <- c("Country", "Internet", "Facebook")

# Display the first few rows of the dataset
head(data)
##     Country Internet Facebook
## 1 Argentina   55.80%    48.8%
## 2 Australia   82.40%    51.5%
## 3   Belgium      82%    44.2%
## 4    Brazil   49.90%    29.5%
## 5    Canada   86.80%    51.9%
## 6     Chile   61.40%    55.5%
# Convert the Internet and Facebook columns to numeric by stripping the "%" sign
# (as.numeric() turns any entry that is still non-numeric into NA, with a warning)

data$Internet <- as.numeric(gsub("%", "", data$Internet))
data$Facebook <- as.numeric(gsub("%", "", data$Facebook))
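
The conversion step can be wrapped in a small helper that also reports which entries failed to parse. This is a minimal sketch, not part of the original analysis: `parse_percent` is a hypothetical name, and the example strings are taken from the `head(data)` output shown earlier.

```r
# Hypothetical helper (an assumption, not in the original analysis):
# strip "%" and surrounding whitespace, convert to numeric, and warn
# with the entries that could not be converted.
parse_percent <- function(x) {
  cleaned <- trimws(gsub("%", "", x, fixed = TRUE))
  values <- suppressWarnings(as.numeric(cleaned))
  # Entries that were present in the input but became NA after conversion
  bad <- is.na(values) & !is.na(x)
  if (any(bad)) {
    warning("Could not parse: ", paste(unique(x[bad]), collapse = ", "))
  }
  values
}

# Strings copied from the first rows of the dataset
parse_percent(c("55.80%", "82.40%", "82%"))
```

With a helper like this, the two assignments would become `data$Internet <- parse_percent(data$Internet)` and likewise for `Facebook`, so unparseable entries are named in a warning instead of silently turning into `NA`.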
# Graphical Summary: Histogram for Internet Penetration

ggplot(data, aes(x = Internet)) + 
  geom_histogram(binwidth = 5, fill = "blue", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Internet Penetration", x = "Internet Penetration (%)", y = "Frequency")

# Graphical Summary: Histogram for Facebook Penetration
ggplot(data, aes(x = Facebook)) + 
  geom_histogram(binwidth = 5, fill = "green", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Facebook Penetration", x = "Facebook Penetration (%)", y = "Frequency")

# Numerical Summaries: Internet Penetration
internet_summary <- summary(data$Internet)
internet_sd <- sd(data$Internet)

# Numerical Summaries: Facebook Penetration
facebook_summary <- summary(data$Facebook)
facebook_sd <- sd(data$Facebook)

# Display summaries
print(internet_summary)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   12.60   43.65   56.90   59.16   81.25   94.00
print(internet_sd)
## [1] 22.38135
print(facebook_summary)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.10   24.38   34.45   33.82   47.08   56.40
print(facebook_sd)
## [1] 15.92896
# Scatterplot to display the relationship between Internet and Facebook penetration
ggplot(data, aes(x = Internet, y = Facebook)) + 
  geom_point(color = "purple") +
  labs(title = "Scatterplot of Internet vs. Facebook Penetration", x = "Internet Penetration (%)", y = "Facebook Penetration (%)") +
  theme_minimal()

# Conclusion: countries with higher Internet penetration tend to have higher Facebook penetration.
# Calculate the correlation coefficient
correlation_coefficient <- cor(data$Internet, data$Facebook)
print(correlation_coefficient)
## [1] 0.6108464
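
To put the upward trend in the scatterplot into the form of a line, a least-squares fit can be used. The following is a minimal sketch under a stated assumption: it uses only the six rows printed by `head(data)`, stored in a hypothetical data frame `sample_rows`, so its coefficients are illustrative and are not results for the full dataset.

```r
# Illustration only: the six rows shown by head(data), with "%" removed.
# sample_rows is a hypothetical name; the full dataset has more countries.
sample_rows <- data.frame(
  Country  = c("Argentina", "Australia", "Belgium", "Brazil", "Canada", "Chile"),
  Internet = c(55.8, 82.4, 82.0, 49.9, 86.8, 61.4),
  Facebook = c(48.8, 51.5, 44.2, 29.5, 51.9, 55.5)
)

# Least-squares line: Facebook penetration as a linear function of
# Internet penetration
fit <- lm(Facebook ~ Internet, data = sample_rows)

# Intercept and slope; the slope is the estimated change in Facebook
# penetration (percentage points) per one-point rise in Internet penetration
coef(fit)
```

Running the same `lm()` call on the full `data` frame would give the line for all countries; in ggplot2 the fitted line can be drawn on the scatterplot with `geom_smooth(method = "lm")`.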
r = 0.6108: interpreting the result

A correlation coefficient of 0.6108 is positive and moderate in size, so there is a moderate positive linear relationship between Internet penetration and Facebook penetration in the dataset.

This suggests that, in general, countries with a higher Internet penetration rate tend to also have a higher Facebook penetration rate.

However, since the correlation is not very close to 1, it is not a strong linear relationship. There may be other factors influencing Facebook penetration beyond just Internet penetration, or the relationship might not be perfectly linear.

In short, Internet penetration and Facebook penetration tend to rise together across the countries in this dataset, but the association is moderate, and a correlation on its own does not show that one causes the other.
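
One way to make "moderate" more concrete is to square the reported coefficient, which gives the share of variance accounted for by a straight-line fit, and to compare Pearson's r with a rank-based measure that does not assume linearity. The sketch below is hedged accordingly: `r` is the value printed above, while `sample_rows` is a hypothetical data frame holding only the six rows shown by `head(data)`, so the test and rank correlation it computes illustrate the calls and are not results for the full dataset.

```r
# r as reported by cor(data$Internet, data$Facebook) above
r <- 0.6108464

# Coefficient of determination: proportion of variance in one variable
# accounted for by a linear fit on the other
r_squared <- r^2
round(r_squared, 3)  # 0.373

# Illustration only: the six rows shown by head(data), not the full dataset
sample_rows <- data.frame(
  Internet = c(55.8, 82.4, 82.0, 49.9, 86.8, 61.4),
  Facebook = c(48.8, 51.5, 44.2, 29.5, 51.9, 55.5)
)

# Pearson correlation test: adds a p-value and a confidence interval
pearson <- cor.test(sample_rows$Internet, sample_rows$Facebook,
                    method = "pearson")
pearson$conf.int

# Spearman rank correlation: only assumes a monotonic relationship,
# so a large gap from Pearson's r can hint at non-linearity
spearman <- cor(sample_rows$Internet, sample_rows$Facebook,
                method = "spearman")
spearman
```

On the full data, the same calls would be `cor.test(data$Internet, data$Facebook)` and `cor(data$Internet, data$Facebook, method = "spearman")`.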