The content of this webpage belongs to Lecture 1, activity 2.
First, draw and print 10 random numbers from a normal distribution
with mean 20 and standard deviation 2. Setting the seed makes the draw
reproducible.
set.seed(1)
sample1= rnorm(n= 10, mean= 20, sd= 2)
sample1
[1] 18.74709 20.36729 18.32874 23.19056 20.65902 18.35906 20.97486 21.47665
[9] 21.15156 19.38922
Calculate the mean of these 10 numbers and round it to 3 decimal
places:
round (mean (sample1), 3)
[1] 20.264
Generate a second sample, again creating 10 random normally
distributed numbers with the same mean and standard deviation as in
sample1.
set.seed(2)
sample2= rnorm(n= 10, mean= 20, sd= 2)
sample2
[1] 18.20617 20.36970 23.17569 17.73925 19.83950 20.26484 21.41591 19.52060
[9] 23.96895 19.72243
Calculate the mean of this second sample and round it to 3 decimal
places:
round (mean (sample2), 3)
[1] 20.422
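The two means differ, even though both samples come from the same
normal distribution with mean 20 and standard deviation 2: the sample
mean is itself a random variable. Because the population is normal, the
sample mean is normally distributed around 20 with standard deviation
2/sqrt(10), about 0.63, so the means of many such samples would scatter
around 20 in this way. For non-normal populations, the Central Limit
Theorem gives the same result approximately once the sample size is
large.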
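The distribution of the sample mean can be checked with a quick simulation. The following is a minimal sketch, not part of the original activity: the 10,000 replications, the seed 123, and the name `sim_means` are arbitrary choices made for illustration.

```r
# Assumptions: 10,000 replications and seed 123 are arbitrary illustration choices.
set.seed(123)
n_reps <- 10000
sim_means <- replicate(n_reps, mean(rnorm(n = 10, mean = 20, sd = 2)))

# Empirical centre and spread of the simulated sample means
mean(sim_means)
sd(sim_means)

# Theoretical standard deviation of the sample mean: sigma / sqrt(n)
2 / sqrt(10)

# Histogram of the simulated means with the theoretical normal density overlaid
hist(sim_means, breaks = 40, freq = FALSE, xlab = "Xbar", main = "")
curve(dnorm(x, mean = 20, sd = 2 / sqrt(10)), add = TRUE)
```

If the reasoning holds, the empirical mean should be close to 20 and the empirical standard deviation close to 2/sqrt(10).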
Next, generate 100 samples of size 10 with mean 20 and standard
deviation 2, using the seeds 1 to 100 (so the first two reproduce
sample1 and sample2). Calculate each sample mean and round it to 2
decimal places.
sample_means= vector(mode= "numeric")

for(i in 1:100){
  set.seed(i)
  sample_means= c(sample_means, round (mean(rnorm(n= 10, mean= 20, sd= 2)), 2))
}
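The loop above grows the result vector by one element per iteration. As an alternative, here is a minimal sketch that should give the same 100 values using `vapply`. The helper name `sample_mean_for_seed` and the result name `sample_means_alt` are hypothetical and do not appear in the original activity.

```r
# Hypothetical helper: rounded mean of one sample of size n drawn with a given seed
sample_mean_for_seed <- function(seed, n = 10) {
  set.seed(seed)
  round(mean(rnorm(n = n, mean = 20, sd = 2)), 2)
}

# vapply returns a numeric vector of length 100 without growing it step by step
sample_means_alt <- vapply(1:100, sample_mean_for_seed, numeric(1))
head(sample_means_alt)
```

Wrapping the seed and the draw in one function keeps each sample reproducible on its own, and `numeric(1)` makes `vapply` check that every call returns exactly one number.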
To investigate the claim above, plot all the sample means in a single
scatter plot, with a horizontal line at the population mean 20:
plot (sample_means, xlab= "Sample", ylab= "Xbar", yaxt = "n")
axis(2, at=seq(18,22,0.5), labels=seq(18,22,0.5))
abline(h= 20)

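The spread of the sample means around 20 stays roughly constant across
the sample index. This is what we would expect, because every sample is
drawn independently from the same distribution with the same sample
size.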
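The visual impression of a constant spread can also be checked numerically. The sketch below is one possible check, not part of the original activity: it recomputes the 100 means (unrounded, under the hypothetical name `sample_means_chk`) and compares the standard deviation of the first 50 with that of the last 50.

```r
# Recompute the 100 sample means (seeds 1 to 100, sample size 10), unrounded
sample_means_chk <- vapply(1:100, function(i) {
  set.seed(i)
  mean(rnorm(n = 10, mean = 20, sd = 2))
}, numeric(1))

# Spread of the first 50 and the last 50 sample means
sd_first <- sd(sample_means_chk[1:50])
sd_last  <- sd(sample_means_chk[51:100])

# Both halves should be of similar size and near the theoretical value 2 / sqrt(10)
c(first_half = sd_first, second_half = sd_last, theory = 2 / sqrt(10))
```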
For the last observation, change the sample size from 10 to 50. So,
for each sample, we generate 50 values instead of 10. Again, we consider
100 samples and determine the mean of each individual sample.
sample_means2= vector(mode= "numeric")

for(i in 1:100){
  set.seed(i)
  sample_means2= c(sample_means2, round (mean(rnorm(n= 50, mean= 20, sd= 2)), 2))
}
Just as for sample_means, we plot all 100 sample means in one scatter
plot:
plot (sample_means2, xlab= "Sample", ylab= "Xbar", yaxt = "n")
axis(2, at=seq(18,22,0.5), labels=seq(18,22,0.5))
abline(h= 20)

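Comparing the two graphs, we find that increasing the sample size makes
the sample means cluster more tightly around the population mean 20.
The standard deviation of the sample mean drops from 2/sqrt(10), about
0.63, to 2/sqrt(50), about 0.28.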
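The comparison of the two graphs can be backed up with numbers. This is a minimal sketch under the same setup (seeds 1 to 100, mean 20, standard deviation 2). The helper `means_for_size` and the names `sd_n10` and `sd_n50` are hypothetical, and the means are left unrounded here.

```r
# Hypothetical helper: unrounded means of 100 samples of size n (seeds 1 to 100)
means_for_size <- function(n) {
  vapply(1:100, function(i) {
    set.seed(i)
    mean(rnorm(n = n, mean = 20, sd = 2))
  }, numeric(1))
}

# Empirical spread of the sample means for both sample sizes
sd_n10 <- sd(means_for_size(10))
sd_n50 <- sd(means_for_size(50))

# Compare with the theoretical standard error sigma / sqrt(n)
data.frame(
  n = c(10, 50),
  empirical_sd = c(sd_n10, sd_n50),
  theoretical_se = 2 / sqrt(c(10, 50))
)
```

The empirical values should lie near the theoretical standard errors, with the one for n = 50 clearly smaller.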