Generating a Random Normal Distribution
set.seed(300)
myData <- rnorm(2000,20,4.5) # random normal sample (n=2000, mean=20, sd=4.5)
To confirm that the vector myData was created correctly, we verify that: 1. it contains 2000 observations (using length()), 2. its mean is close to 20 (using mean()), and 3. its standard deviation is close to 4.5 (using sd()).
length(myData)
## [1] 2000
mean(myData)
## [1] 20.25773
sd(myData)
## [1] 4.590852
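"Close to 20" can be made more concrete by expressing the gap in standard errors. The lines below are a minimal sketch of that check, not part of the original workflow. The names `n`, `se.mean`, and `z` are introduced here for illustration, and the sketch assumes the known population standard deviation of 4.5.

```r
# Recreate the data exactly as above so the sketch runs on its own
set.seed(300)
myData <- rnorm(2000, 20, 4.5)

n <- length(myData)

# Standard error of the mean under the known data-generating process (sd = 4.5)
se.mean <- 4.5 / sqrt(n)

# Distance between the observed mean and the target mean of 20, in standard errors
z <- (mean(myData) - 20) / se.mean
z
```

With the mean printed above (20.25773), this works out to roughly 2.56 standard errors. That is on the high side but still possible for a single random draw.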
Plotting myData
hist(myData, breaks = 30, col = "lightblue", main = "Distribution of myData", xlab = "Value", ylab = "Frequency") # Draw the histogram
abline(v = mean(myData), col = "red", lwd = 2, lty = 2) # Mark the mean; colour matches the legend entry
legend("topright", legend = paste("Mean =", round(mean(myData), 5)), col = "red", lty = 2, lwd = 2)
Bootstrap Resampling 1000 Times Using a for Loop
set.seed(200)
sample.size <- 2000
n.samples <- 1000
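bootstrap.results <- numeric(n.samples) # preallocate one slot per bootstrap mean
for (i in seq_len(n.samples))
{
  # draw 2000 indices with replacement, then average the resampled values
  obs <- sample(1:sample.size, replace=TRUE)
  bootstrap.results[i] <- mean(myData[obs])
}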
length(bootstrap.results)
## [1] 1000
summary(bootstrap.results)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 19.92 20.19 20.26 20.26 20.33 20.57
sd(bootstrap.results)
## [1] 0.1021229
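The standard deviation of the bootstrap means estimates the standard error of the mean, so it can be compared with the analytic formula sd(x) / sqrt(n). From the values printed above, 4.590852 / sqrt(2000) is about 0.1027, close to the bootstrap value of 0.1021. The sketch below repeats that comparison and adds a 95% percentile interval. It uses `replicate()` instead of the `for` loop, and the names `boot.means`, `se.analytic`, `se.boot`, and `ci.95` are introduced here for illustration only.

```r
# Recreate the data so the sketch runs on its own
set.seed(300)
myData <- rnorm(2000, 20, 4.5)

set.seed(200)
# replicate() is an alternative to the explicit for loop:
# each call resamples myData with replacement and returns the mean
boot.means <- replicate(1000, mean(sample(myData, replace = TRUE)))

# Analytic standard error of the mean, estimated from the sample itself
se.analytic <- sd(myData) / sqrt(length(myData))

# Bootstrap estimate of the same quantity
se.boot <- sd(boot.means)

# 95% percentile bootstrap interval for the mean
ci.95 <- quantile(boot.means, probs = c(0.025, 0.975))

c(analytic = se.analytic, bootstrap = se.boot)
ci.95
```

The percentile interval is only one of several ways to build a bootstrap interval. It is used here because it needs nothing beyond `quantile()`.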
par(mfrow=c(2,1), pin=c(5.8,0.98))
hist(bootstrap.results,
col="#d83737",
xlab="Mean",
main=paste("Means of 1000 bootstrap samples from myData"))
hist(myData,
col="#37aad8",
xlab="Value",
main=paste("Distribution of myData"))
Resampling 1000 Times from the Actual Data-Generating Process Using for(i in x)
set.seed(200)
sample.size <- 2000
n.samples <- 1000
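bootstrap.results <- numeric(n.samples) # preallocate one slot per sample mean
for (i in seq_len(n.samples))
{
  # draw a fresh sample from the data-generating process (myData is not resampled here)
  bootstrap.results[i] <- mean(rnorm(sample.size,20,4.5))
}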
length(bootstrap.results)
## [1] 1000
summary(bootstrap.results)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 19.64 19.93 20.00 20.00 20.07 20.32
sd(bootstrap.results)
## [1] 0.1041927
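Because these samples come straight from the data-generating process, their means centre on the true mean of 20 (the summary above shows 20.00). The bootstrap means in the previous section centred on the sample mean of about 20.26. The spread is almost the same in both cases and can be checked against the theoretical standard error, 4.5 / sqrt(2000), which is about 0.1006. The following is a minimal sketch of that check; `dgp.means` and `se.theory` are names introduced here for illustration.

```r
set.seed(200)
sample.size <- 2000
n.samples <- 1000

# Mean of each fresh sample drawn from N(20, 4.5^2)
dgp.means <- replicate(n.samples, mean(rnorm(sample.size, 20, 4.5)))

# Theoretical standard error of the mean for this process
se.theory <- 4.5 / sqrt(sample.size)

c(simulated = sd(dgp.means), theoretical = se.theory)
```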
par(mfrow=c(2,1), pin=c(5.8,0.98))
hist(bootstrap.results,
col="#d83737",
xlab="Mean",
main=paste("Means of 1000 bootstrap samples from the DGP"))
hist(myData,
col="#37aad8",
xlab="Value",
main=paste("Distribution of myData"))
EXERCISE
Exercise 1
Set your seed to 150. Generate a random normal distribution of 1000 observations, with a mean of 30 and a standard deviation of 2.5. Compute the mean of each of 50 samples of 1000 observations drawn from that dataset. Save your results in a vector. Relevant functions: set.seed(), rnorm(), for(i in x), sample().
set.seed(150)
data <- rnorm(1000,30,2.5)
length(data)
## [1] 1000
mean(data)
## [1] 29.92068
sd(data)
## [1] 2.475175
hist(data, breaks = 30, col = "lightblue", main = "Distribution of data", xlab = "Value", ylab = "Frequency") # Draw the histogram
abline(v = mean(data), col = "red", lwd = 2, lty = 2)
legend("topright", legend = paste("Mean =", round(mean(data), 5)), col = "red", lty = 2, lwd = 2)
set.seed(150)
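n.samples <- 50 # the exercise asks for 50 samples
bootstrap.results <- numeric(n.samples) # preallocate one slot per sample mean
for (i in seq_len(n.samples))
{
  # each draw is a fresh sample from the data-generating process via rnorm(); `data` itself is not resampled here
  bootstrap.results[i] <- mean(rnorm(1000,30,2.5))
}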
length(bootstrap.results)
## [1] 50
summary(bootstrap.results)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 29.84 29.97 30.03 30.02 30.08 30.18
sd(bootstrap.results)
## [1] 0.08389592
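The solution above draws new samples with `rnorm()`. The exercise text also lists `sample()` and refers to samples taken from the dataset, which can be read as a bootstrap of the generated data. Below is a minimal sketch of that reading, assuming sampling with replacement and 1000 observations per resample. The names `ex.data` and `resample.means` are hypothetical stand-ins chosen so that `data` and `bootstrap.results` are not overwritten.

```r
set.seed(150)
ex.data <- rnorm(1000, 30, 2.5)   # stand-in for `data` in the exercise

n.samples <- 50
resample.means <- numeric(n.samples)

for (i in seq_len(n.samples)) {
  # draw 1000 indices with replacement from the existing observations
  obs <- sample(seq_along(ex.data), replace = TRUE)
  resample.means[i] <- mean(ex.data[obs])
}

length(resample.means)
summary(resample.means)
```

Under this reading the means should cluster around the mean of the generated data rather than around the true mean of 30.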
Exercise 2
Produce two histograms that show the distribution of the means obtained in Exercise 1 and the values of the 1000 observations in your original dataset. Combine these histograms into a single figure. Relevant functions: par(), hist().
par(mfrow=c(2,1), pin=c(5.8,0.98))
hist(bootstrap.results,
col="pink",
xlab="Mean",
main=paste("Means of 50 samples from the DGP"))
hist(data,
col="#37aad8",
xlab="Value",
main=paste("Distribution of data"))