1. The airquality Dataset

(a) Descriptive Statistics for Ozone

mean_ozone <- mean(airquality$Ozone, na.rm = TRUE)
median_ozone <- median(airquality$Ozone, na.rm = TRUE)
sd_ozone <- sd(airquality$Ozone, na.rm = TRUE)
mean_ozone
## [1] 42.12931
median_ozone
## [1] 31.5
sd_ozone
## [1] 32.98788
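
The `na.rm = TRUE` argument matters here because Ozone contains missing readings. As a minimal sketch (the variable names `ozone`, `n_missing`, and `n_used` are illustrative and not part of the original analysis), the following counts how many observations actually enter the statistics above and shows what happens when the argument is omitted:

```r
# Ozone column of the built-in airquality dataset
ozone <- airquality$Ozone

# Number of missing readings and number of readings actually used
n_missing <- sum(is.na(ozone))
n_used <- sum(!is.na(ozone))
n_missing
n_used

# Without na.rm = TRUE, any NA in the vector propagates to the result
mean(ozone)
```

Reporting the number of usable observations next to the mean, median, and standard deviation makes it clear that the summaries describe only the days with a recorded Ozone value.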

(b) Scatter Plot of Wind and Temp

plot(airquality$Wind, airquality$Temp,
     main = "Scatter Plot of Wind vs Temp",
     xlab = "Wind",
     ylab = "Temp")

2. Bar Chart of the cyl Variable in the mtcars Dataset

library(ggplot2)
ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  # after_stat(count) replaces the ..count.. notation deprecated in ggplot2 3.4.0
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5) +
  labs(title = "Bar Chart of Cylinders", x = "Number of Cylinders", y = "Count")

3. The iris Dataset

(a) Boxplot of Petal.Width by Species

boxplot(Petal.Width ~ Species, data = iris,
        main = "Boxplot of Petal Width by Species")
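
The boxplot can be read alongside the numbers it draws. The sketch below is an addition to the original analysis (the name `petal_medians` is illustrative) and computes the per-species medians, which correspond to the center line of each box:

```r
# Median Petal.Width within each species (the center line of each box)
petal_medians <- tapply(iris$Petal.Width, iris$Species, median)
petal_medians

# Five-number summary per species, matching the whiskers, hinges, and median
tapply(iris$Petal.Width, iris$Species, fivenum)
```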

(b) Correlation between Sepal.Length and Petal.Length

correlation <- cor(iris$Sepal.Length, iris$Petal.Length)
correlation
## [1] 0.8717538
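
`cor()` returns only the point estimate. If a confidence interval or a significance test is also wanted, one option is `cor.test()` from base R's stats package. This is a sketch added for illustration, not part of the original analysis, and the names `ct` and `r` are assumptions:

```r
# Pearson correlation test for the same pair of variables
ct <- cor.test(iris$Sepal.Length, iris$Petal.Length)

# Point estimate (same value as cor() above), stripped of its name attribute
r <- unname(ct$estimate)
r

# 95% confidence interval and p-value for the null hypothesis of zero correlation
ct$conf.int
ct$p.value
```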

(c) Scatter Plot of Sepal.Length vs Sepal.Width with Regression Lines

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Scatter Plot of Sepal Length vs Sepal Width")
## `geom_smooth()` using formula = 'y ~ x'

4. Chi-Square Test for the vs and am Variables in the mtcars Dataset

chisq_test_result <- chisq.test(table(mtcars$vs, mtcars$am))
chisq_test_result
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(mtcars$vs, mtcars$am)
## X-squared = 0.34754, df = 1, p-value = 0.5555
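
With p = 0.5555, the test gives no evidence against independence of vs and am at the usual 0.05 level. Because the table is only 2 x 2 with 32 cars, it is worth checking the expected cell counts that the chi-square approximation relies on. The following sketch is an addition to the original analysis; `tab`, `res`, `min_expected`, and `fisher_res` are illustrative names, and Fisher's exact test is shown as a cross-check rather than as the method used above:

```r
# Contingency table of engine shape (vs) against transmission type (am)
tab <- table(vs = mtcars$vs, am = mtcars$am)
tab

# Expected counts under independence; the approximation is usually
# considered adequate when all of them are at least 5
res <- chisq.test(tab)
res$expected
min_expected <- min(res$expected)
min_expected

# Fisher's exact test does not rely on the large-sample approximation
fisher_res <- fisher.test(tab)
fisher_res$p.value
```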

5. Simple Linear Regression Model Using the airquality Dataset

(a) Model Summary

model <- lm(Temp ~ Solar.R, data=airquality)
summary(model)
## 
## Call:
## lm(formula = Temp ~ Solar.R, data = airquality)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.3787  -4.9572   0.8932   5.9111  18.4013 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 72.863012   1.693951  43.014  < 2e-16 ***
## Solar.R      0.028255   0.008205   3.444 0.000752 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.898 on 144 degrees of freedom
##   (7 observations deleted due to missingness)
## Multiple R-squared:  0.07609,    Adjusted R-squared:  0.06967 
## F-statistic: 11.86 on 1 and 144 DF,  p-value: 0.0007518

(b) Scatter Plot with Regression Line

plot(airquality$Solar.R, airquality$Temp,
     main = "Scatter Plot of Solar.R vs Temp")
abline(model)

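(c) Interpretation of Results

Solar.R is significantly associated with temperature (slope = 0.028, p = 0.00075), but its contribution is small because R² is low: at 0.076, the model explains only about 7.6% of the variance in Temp. The fitted line captures a real upward trend, yet it leaves most of the variation in temperature unexplained, with a residual standard error of about 8.9 degrees. Adding other variables may be needed to improve model performance.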
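
The quantities behind this interpretation can be pulled out of the fitted model directly instead of being read off the printed summary. The sketch below refits the same model so that it is self-contained; `r_squared` and `slope_ci` are illustrative names added here:

```r
# Refit the model from part (a); rows with missing Solar.R are dropped by lm()
model <- lm(Temp ~ Solar.R, data = airquality)

# Proportion of variance in Temp explained by Solar.R
r_squared <- summary(model)$r.squared
r_squared

# 95% confidence interval for the Solar.R slope
slope_ci <- confint(model)["Solar.R", ]
slope_ci
```

An interval for the slope that excludes zero is consistent with the significant p-value, while the small R² shows why significance alone does not imply a strong predictive relationship.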