Generating a Random Normal Distribution

set.seed(300)
myData <- rnorm(2000,20,4.5) # random normal distribution (n=2000, mean=20, sd=4.5)

To confirm that the vector myData was created correctly, we verify that: 1. it contains 2000 observations (using length()), 2. its mean is close to 20 (using mean()), and 3. its standard deviation is close to 4.5 (using sd()).

length(myData)
## [1] 2000
mean(myData)
## [1] 20.25773
sd(myData)
## [1] 4.590852
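
The checks above are reproducible only because set.seed() fixes the state of the random number generator before rnorm() is called. A minimal sketch of that behavior follows; the names first.draw and second.draw are illustrative and do not appear in the original analysis.

```r
# Resetting the seed to the same value reproduces the same random draws.
set.seed(300)
first.draw <- rnorm(5, mean = 20, sd = 4.5)

set.seed(300)
second.draw <- rnorm(5, mean = 20, sd = 4.5)

# Both vectors hold exactly the same five values.
print(identical(first.draw, second.draw))
```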

Plotting myData

hist(myData, breaks = 30, col = "lightblue", main = "Distribution of myData", xlab = "Value", ylab = "Frequency") # Draw the histogram
abline(v = mean(myData), col = "red", lwd = 2, lty = 2) # Mark the mean; color matches the legend
legend("topright", legend = paste("Mean =", round(mean(myData), 5)), col = "red", lty = 2, lwd = 2)

Bootstrap resampling 1000 times using a for loop

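set.seed(200)
sample.size <- 2000 # size of each bootstrap sample (same as myData)
n.samples <- 1000 # number of bootstrap samples
bootstrap.results <- numeric(n.samples) # preallocate the vector of means
for (i in 1:n.samples)
{
 obs <- sample(1:sample.size, replace=TRUE) # draw row indices with replacement
 bootstrap.results[i] <- mean(myData[obs]) # mean of the resampled data
 }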
length(bootstrap.results) 
## [1] 1000
summary(bootstrap.results)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   19.92   20.19   20.26   20.26   20.33   20.57
sd(bootstrap.results)
## [1] 0.1021229
par(mfrow=c(2,1), pin=c(5.8,0.98)) 
hist(bootstrap.results, 
 col="#d83737", 
 xlab="Mean", 
 main=paste("Means of 1000 bootstrap samples from myData")) 
 
hist(myData, 
col="#37aad8", 
 xlab="Value", 
 main=paste("Distribution of myData"))
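
The standard deviation of the bootstrap means (about 0.102 above) is an estimate of the standard error of the mean. As a rough check, it can be compared with the analytic estimate sd(x)/sqrt(n), which for the values reported earlier is 4.590852/sqrt(2000), or about 0.103. The sketch below recomputes both, and adds a percentile interval as one common way to summarize the bootstrap distribution. The names boot.means, se.boot, se.analytic and ci.percentile are illustrative, not part of the original analysis.

```r
# Recreate the data and the bootstrap means so the sketch is self-contained.
set.seed(300)
myData <- rnorm(2000, 20, 4.5)

set.seed(200)
n.samples <- 1000
boot.means <- numeric(n.samples)
for (i in seq_len(n.samples)) {
  obs <- sample(seq_along(myData), replace = TRUE)  # indices drawn with replacement
  boot.means[i] <- mean(myData[obs])
}

# Bootstrap estimate of the standard error of the mean.
se.boot <- sd(boot.means)

# Analytic estimate for comparison: s / sqrt(n).
se.analytic <- sd(myData) / sqrt(length(myData))

# 95% percentile interval for the mean (assumed summary, not in the original).
ci.percentile <- quantile(boot.means, probs = c(0.025, 0.975))

print(c(bootstrap = se.boot, analytic = se.analytic))
print(ci.percentile)
```

The two standard errors should agree closely, which is why the histogram of the means is so much narrower than the histogram of myData.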

Resampling 1000 times from the actual data-generating process (DGP) using for(i in x)

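set.seed(200)
sample.size <- 2000 # observations per sample
n.samples <- 1000 # number of samples drawn from the DGP
bootstrap.results <- numeric(n.samples) # preallocate the vector of means
for (i in 1:n.samples)
{
 bootstrap.results[i] <- mean(rnorm(sample.size,20,4.5)) # fresh draw from the DGP
 }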
length(bootstrap.results)
## [1] 1000
summary(bootstrap.results) 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   19.64   19.93   20.00   20.00   20.07   20.32
sd(bootstrap.results)
## [1] 0.1041927
par(mfrow=c(2,1), pin=c(5.8,0.98)) 
hist(bootstrap.results, 
 col="#d83737", 
 xlab="Mean", 
 main=paste("Means of 1000 bootstrap samples from the DGP")) 

hist(myData, 
 col="#37aad8", 
 xlab="Value", 
 main=paste("Distribution of myData"))
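
The loop over the DGP can also be written with replicate(), which evaluates an expression a fixed number of times and collects the results. The sketch below checks that both forms give the same means under the same seed, and computes the theoretical standard error sigma/sqrt(n) = 4.5/sqrt(2000), about 0.101, which is close to the 0.104 reported above. The names loop.means, replicate.means and se.theory are illustrative.

```r
n.samples <- 1000

# Version 1: explicit for loop, as in the section above.
set.seed(200)
loop.means <- numeric(n.samples)
for (i in seq_len(n.samples)) {
  loop.means[i] <- mean(rnorm(2000, 20, 4.5))
}

# Version 2: replicate() repeats the expression n.samples times.
set.seed(200)
replicate.means <- replicate(n.samples, mean(rnorm(2000, 20, 4.5)))

# Compare the two vectors of means.
print(all.equal(loop.means, replicate.means))

# Theoretical standard error of the mean under the DGP.
se.theory <- 4.5 / sqrt(2000)
print(c(simulated = sd(loop.means), theory = se.theory))
```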

EXERCISE

Exercise 1. Set your seed to 150. Generate a random normal distribution of 1000 observations, with a mean of 30 and a standard deviation of 2.5. Compute the means of 50 samples of 1000 observations each, drawn from that data set. Store your results in a vector. Relevant functions: set.seed(), rnorm(), for(i in x), sample().

set.seed(150)
data <- rnorm(1000,30,2.5)
length(data)
## [1] 1000
mean(data)
## [1] 29.92068
sd(data)
## [1] 2.475175
hist(data, breaks = 30, col = "lightblue", main = "Distribution of data", xlab = "Value", ylab = "Frequency") # Draw the histogram
abline(v = mean(data), col = "red", lwd = 2, lty = 2)
legend("topright", legend = paste("Mean =", round(mean(data), 5)), col = "red", lty = 2, lwd = 2)

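set.seed(150)
n.samples <- 50 # number of samples requested by the exercise
bootstrap.results <- c() 
for (i in 1:n.samples)
{
 # Each iteration draws a fresh sample from the DGP with rnorm(); it does not resample `data`
 bootstrap.results[i] <- mean(rnorm(1000,30,2.5)) 
 }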
length(bootstrap.results)
## [1] 50
summary(bootstrap.results)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   29.84   29.97   30.03   30.02   30.08   30.18
sd(bootstrap.results)
## [1] 0.08389592
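
The exercise text lists sample() among the relevant functions, while the loop above draws new values from rnorm() on every iteration. A variant that instead resamples the existing data vector with replacement might look like the sketch below. The name resample.means is illustrative, and no output is shown because this variant was not run in the original.

```r
# Recreate the data set from the exercise.
set.seed(150)
data <- rnorm(1000, 30, 2.5)

# Draw 50 bootstrap samples of 1000 observations each from `data`.
n.samples <- 50
resample.means <- numeric(n.samples)
for (i in seq_len(n.samples)) {
  resample <- sample(data, size = length(data), replace = TRUE)
  resample.means[i] <- mean(resample)
}

length(resample.means)
summary(resample.means)
sd(resample.means)
```

Means obtained this way are centered on mean(data) rather than on the DGP mean of 30, which is the same pattern seen in the two myData sections.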

Exercise 2. Produce two histograms that display the distribution of the means obtained in Exercise 1 and the values of the 1000 observations in your original data set. Combine these histograms into a single overall graphic. Relevant functions: par(), hist().

par(mfrow=c(2,1), pin=c(5.8,0.98)) 
hist(bootstrap.results, 
 col="pink", 
 xlab="Mean", 
 main=paste("Means of 50 samples from the DGP")) 

hist(data, 
 col="#37aad8", 
 xlab="Value", 
 main=paste("Distribution of data"))