Exercise 01

Load “Demographics_Uzbekistan_Worldbank2.csv” into R and convert the first column to rownames.

setwd("F:/IAMO_B/2022_DSIK/eLearning/Module_0_R-Basics/data/")
data <- read.csv("Demographics_Uzbekistan_Worldbank2.csv", sep=";", header=TRUE)
rownames(data) <- data[,1]
data <- data[,-1]

head(data)
X1960 X1961 X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973 X1974 X1975 X1976 X1977 X1978 X1979 X1980 X1981 X1982 X1983 X1984 X1985 X1986 X1987 X1988 X1989 X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009 X2010 X2011 X2012 X2013 X2014 X2015 X2016
Population ages 40-44, male (% of male population) 2.802203 2.896451 3.069046 3.336136 3.690905 4.101962 4.590526 5.059065 5.379471 5.496317 5.466242 5.254891 4.969459 4.751594 4.665228 4.655549 4.750850 4.865208 4.886789 4.770856 4.560047 4.269012 3.948933 3.661139 3.435415 3.242697 3.077853 2.972073 2.973244 3.105117 3.349855 3.715974 4.138677 4.529740 4.840484 5.078634 5.250469 5.377954 5.495599 5.621172 5.762577 5.902289 6.020272 6.087027 6.088726 6.028875 5.911667 5.774145 5.668374 5.622695 5.611772 5.664594 5.755805 5.848225 5.922188 5.988523 6.044512
Population ages 40-44, female (% of female population) 3.822331 3.982644 4.192144 4.410170 4.609609 4.781583 4.959838 5.115214 5.201543 5.194364 5.147529 4.994136 4.800338 4.650012 4.576971 4.545546 4.579625 4.613422 4.567457 4.417975 4.223169 3.962299 3.678375 3.426918 3.238143 3.094232 2.992777 2.952878 3.002819 3.154991 3.398435 3.737865 4.125107 4.493590 4.805055 5.062565 5.270672 5.447141 5.619374 5.799410 5.981864 6.166992 6.330853 6.435402 6.460404 6.406166 6.277317 6.117453 5.987465 5.920420 5.886418 5.917532 5.989325 6.063903 6.120755 6.172542 6.213516
Population ages 35-39, male (% of male population) 5.068229 5.680684 6.247044 6.623900 6.751899 6.647504 6.345956 5.962686 5.661149 5.535142 5.551201 5.683749 5.832172 5.863054 5.719383 5.428033 5.042695 4.631868 4.277190 4.018055 3.809305 3.641951 3.543927 3.561232 3.716860 3.997206 4.410961 4.886901 5.324493 5.666861 5.907056 6.062704 6.161391 6.248383 6.348410 6.452520 6.570107 6.674770 6.727213 6.709495 6.651315 6.528360 6.382762 6.271256 6.222817 6.227487 6.285753 6.376802 6.468611 6.543308 6.593019 6.642547 6.697085 6.768352 6.864493 6.989650 7.130977
Population ages 35-39, female (% of female population) 5.821614 6.025183 6.178266 6.258219 6.250714 6.132714 5.919202 5.659269 5.437076 5.310146 5.323403 5.406589 5.479677 5.442876 5.261683 4.965847 4.602250 4.240978 3.953282 3.760947 3.608704 3.503818 3.466747 3.527378 3.698605 3.969144 4.343000 4.767395 5.168218 5.502434 5.766902 5.967794 6.124475 6.270773 6.424605 6.588602 6.762677 6.916353 7.006158 7.012098 6.957745 6.827064 6.663755 6.529739 6.459250 6.441669 6.478105 6.548501 6.621664 6.680331 6.713866 6.749273 6.792030 6.854483 6.945114 7.069720 7.212531
Population ages 30-34, male (% of male population) 8.245330 7.852095 7.316644 6.854373 6.584949 6.479938 6.579324 6.780373 6.899371 6.816811 6.458001 5.942376 5.381791 4.908180 4.580652 4.344821 4.194960 4.139638 4.209644 4.429236 4.762383 5.237618 5.775239 6.263017 6.637779 6.908633 7.078251 7.186157 7.285387 7.394526 7.471382 7.556395 7.620322 7.623810 7.552469 7.425810 7.244411 7.049147 6.897504 6.821205 6.833832 6.908167 7.020846 7.132666 7.218315 7.291266 7.343232 7.389954 7.455972 7.554622 7.663350 7.802630 7.964620 8.140442 8.324214 8.526776 8.732077
Population ages 30-34, female (% of female population) 7.484621 7.253412 6.944133 6.660939 6.468239 6.352704 6.373839 6.448928 6.433025 6.242900 5.869036 5.357310 4.832392 4.430377 4.198384 4.070369 4.027605 4.049673 4.141750 4.326083 4.639296 5.061803 5.538733 5.989117 6.360052 6.653069 6.870051 7.035156 7.188262 7.350081 7.490770 7.638435 7.758649 7.803858 7.754286 7.641860 7.454868 7.241276 7.068332 6.973337 6.959592 7.011847 7.106726 7.203510 7.276410 7.335002 7.373669 7.409420 7.468848 7.565715 7.673258 7.815668 7.981538 8.156203 8.330237 8.516460 8.698230
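
The two-step conversion above can also be done while reading the file, because read.csv() accepts a row.names argument. A minimal sketch, assuming a small made-up stand-in for the real file (the names csv_text and demo and all values are hypothetical):

```r
# Hypothetical two-row stand-in for the semicolon-separated World Bank file
csv_text <- "Indicator;X1960;X1961
Population ages 0-4, male;10.5;10.7
Population ages 0-4, female;10.1;10.2"

# row.names = 1 uses the first column as row names and removes it from the data
demo <- read.csv(text = csv_text, sep = ";", header = TRUE, row.names = 1)

rownames(demo)
names(demo)
```

With the real file, the text argument would be replaced by the file name used in the solution.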

Exercise 02

Select only the entries corresponding to the years 2000 to 2010.

names(data)
##  [1] "X1960" "X1961" "X1962" "X1963" "X1964" "X1965" "X1966" "X1967" "X1968"
## [10] "X1969" "X1970" "X1971" "X1972" "X1973" "X1974" "X1975" "X1976" "X1977"
## [19] "X1978" "X1979" "X1980" "X1981" "X1982" "X1983" "X1984" "X1985" "X1986"
## [28] "X1987" "X1988" "X1989" "X1990" "X1991" "X1992" "X1993" "X1994" "X1995"
## [37] "X1996" "X1997" "X1998" "X1999" "X2000" "X2001" "X2002" "X2003" "X2004"
## [46] "X2005" "X2006" "X2007" "X2008" "X2009" "X2010" "X2011" "X2012" "X2013"
## [55] "X2014" "X2015" "X2016"
data_sel <- data[,41:51]
data_sel
X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009 X2010
Population ages 40-44, male (% of male population) 5.762577e+00 5.902289e+00 6.020272e+00 6.087027e+00 6.088726e+00 6.028875e+00 5.911667e+00 5.774145e+00 5.668374e+00 5.622695e+00 5.611772e+00
Population ages 40-44, female (% of female population) 5.981864e+00 6.166992e+00 6.330853e+00 6.435402e+00 6.460404e+00 6.406166e+00 6.277317e+00 6.117453e+00 5.987465e+00 5.920420e+00 5.886418e+00
Population ages 35-39, male (% of male population) 6.651315e+00 6.528360e+00 6.382762e+00 6.271256e+00 6.222817e+00 6.227487e+00 6.285753e+00 6.376802e+00 6.468611e+00 6.543308e+00 6.593019e+00
Population ages 35-39, female (% of female population) 6.957745e+00 6.827064e+00 6.663755e+00 6.529739e+00 6.459250e+00 6.441669e+00 6.478105e+00 6.548501e+00 6.621664e+00 6.680331e+00 6.713866e+00
Population ages 30-34, male (% of male population) 6.833832e+00 6.908167e+00 7.020846e+00 7.132666e+00 7.218315e+00 7.291266e+00 7.343232e+00 7.389954e+00 7.455972e+00 7.554622e+00 7.663350e+00
Population ages 30-34, female (% of female population) 6.959592e+00 7.011847e+00 7.106726e+00 7.203510e+00 7.276410e+00 7.335002e+00 7.373669e+00 7.409420e+00 7.468848e+00 7.565715e+00 7.673258e+00
Population ages 25-29, male (% of male population) 8.020528e+00 8.087925e+00 8.146005e+00 8.217777e+00 8.315383e+00 8.452951e+00 8.601684e+00 8.764026e+00 8.943727e+00 9.139615e+00 9.325499e+00
Population ages 25-29, female (% of female population) 7.944221e+00 8.002587e+00 8.057768e+00 8.131390e+00 8.234766e+00 8.375513e+00 8.530318e+00 8.697937e+00 8.877428e+00 9.064741e+00 9.233348e+00
Population ages 20-24, male (% of male population) 9.322579e+00 9.473200e+00 9.640012e+00 9.826196e+00 1.002867e+01 1.025542e+01 1.047710e+01 1.067921e+01 1.084407e+01 1.095944e+01 1.099881e+01
Population ages 20-24, female (% of female population) 9.106619e+00 9.273912e+00 9.459578e+00 9.657137e+00 9.857714e+00 1.006867e+01 1.026646e+01 1.043683e+01 1.056266e+01 1.063254e+01 1.062081e+01
Population ages 15-64 (% of total) 5.896779e+01 5.983194e+01 6.072071e+01 6.163776e+01 6.256140e+01 6.344965e+01 6.427497e+01 6.503368e+01 6.572899e+01 6.635761e+01 6.672003e+01
Population ages 15-64, total 1.472099e+07 1.512064e+07 1.552486e+07 1.594221e+07 1.637044e+07 1.680670e+07 1.726940e+07 1.775602e+07 1.825123e+07 1.895333e+07 1.957526e+07
Population ages 15-64, male (% of total) 5.877059e+01 5.960738e+01 6.046599e+01 6.136212e+01 6.227816e+01 6.316269e+01 6.399530e+01 6.476767e+01 6.547526e+01 6.611085e+01 6.647054e+01
Population ages 15-64, male 7.294513e+06 7.487792e+06 7.683002e+06 7.886239e+06 8.097208e+06 8.313447e+06 8.545025e+06 8.790071e+06 9.039822e+06 9.391832e+06 9.703133e+06
Population ages 15-64, female (% of total) 5.916270e+01 6.005386e+01 6.097237e+01 6.191006e+01 6.284114e+01 6.373316e+01 6.455142e+01 6.529666e+01 6.597994e+01 6.660181e+01 6.696704e+01
Population ages 15-64, female 7.426462e+06 7.632843e+06 7.841858e+06 8.055975e+06 8.273236e+06 8.493267e+06 8.724394e+06 8.965953e+06 9.211413e+06 9.561496e+06 9.872118e+06
Population ages 15-19, male (% of male population) 1.122094e+01 1.144487e+01 1.166122e+01 1.184738e+01 1.198269e+01 1.205458e+01 1.206638e+01 1.200389e+01 1.184740e+01 1.158972e+01 1.120364e+01
Population ages 15-19, female (% of female population) 1.090845e+01 1.111925e+01 1.131272e+01 1.146169e+01 1.154564e+01 1.156455e+01 1.151282e+01 1.138848e+01 1.118981e+01 1.091903e+01 1.055011e+01
Population ages 10-14, male (% of male population) 1.302027e+01 1.302936e+01 1.298148e+01 1.285302e+01 1.260701e+01 1.222476e+01 1.173672e+01 1.118685e+01 1.063899e+01 1.013833e+01 9.653681e+00
Population ages 10-14, female (% of female population) 1.244343e+01 1.238944e+01 1.227633e+01 1.209432e+01 1.182299e+01 1.145913e+01 1.101485e+01 1.052513e+01 1.003789e+01 9.585812e+00 9.143348e+00
Population ages 5-9, male (% of male population) 1.317894e+01 1.273059e+01 1.216944e+01 1.155742e+01 1.098422e+01 1.048905e+01 1.005015e+01 9.685773e+00 9.404078e+00 9.202041e+00 9.089541e+00
Population ages 5-9, female (% of female population) 1.232941e+01 1.190460e+01 1.140234e+01 1.086505e+01 1.035541e+01 9.910157e+00 9.505080e+00 9.158054e+00 8.883284e+00 8.687207e+00 8.579245e+00
Population ages 0-14 (% of total) 3.634240e+01 3.541057e+01 3.446377e+01 3.351523e+01 3.259638e+01 3.172937e+01 3.095429e+01 3.026875e+01 2.966257e+01 2.913205e+01 2.886115e+01
Population ages 0-14, total 9.072681e+06 8.948906e+06 8.811575e+06 8.668497e+06 8.529495e+06 8.404556e+06 8.316798e+06 8.264217e+06 8.236524e+06 8.320813e+06 8.467690e+06
Population ages 0-14, male (% of total) 3.735575e+01 3.642683e+01 3.548707e+01 3.453782e+01 3.360585e+01 3.271785e+01 3.191018e+01 3.118728e+01 3.055033e+01 3.000167e+01 2.972236e+01
Population ages 0-14, male 4.636537e+06 4.575885e+06 4.509100e+06 4.438790e+06 4.369325e+06 4.306309e+06 4.260833e+06 4.232643e+06 4.217922e+06 4.262094e+06 4.338764e+06
Population ages 0-14, female (% of total) 3.534074e+01 3.440633e+01 3.345267e+01 3.250502e+01 3.159943e+01 3.075273e+01 3.000944e+01 2.936062e+01 2.878452e+01 2.827154e+01 2.800841e+01
Population ages 0-14, female 4.436184e+06 4.373043e+06 4.302459e+06 4.229677e+06 4.160166e+06 4.098199e+06 4.055902e+06 4.031538e+06 4.018587e+06 4.058722e+06 4.128931e+06
Population ages 0-4, male (% of male population) 1.115654e+01 1.066688e+01 1.033615e+01 1.012739e+01 1.001461e+01 1.000403e+01 1.012331e+01 1.031466e+01 1.050726e+01 1.066130e+01 1.097914e+01
Population ages 0-4, female (% of female population) 1.056789e+01 1.011228e+01 9.774001e+00 9.545643e+00 9.421040e+00 9.383441e+00 9.489514e+00 9.677438e+00 9.863345e+00 9.998519e+00 1.028581e+01
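
Selecting by position (41:51) only works as long as the column order does not change. A minimal sketch of selecting the same years by column name instead, assuming a one-row stand-in table (demo and its values are made up for illustration):

```r
# Hypothetical stand-in with one row and the columns X1998 ... X2012
demo <- as.data.frame(matrix(1:15, nrow = 1,
                             dimnames = list("row1", paste0("X", 1998:2012))))

# Build the wanted column names and select by name instead of by position
years <- paste0("X", 2000:2010)
demo_sel <- demo[, years]

names(demo_sel)
```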

Exercise 03

From that, create a table that only contains information about females.

data_female <- data_sel[c(2,4,6,8,10,15,16,18,20,22,27,28,30),]
head(data_female)
X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009 X2010
Population ages 40-44, female (% of female population) 5.981864 6.166992 6.330853 6.435402 6.460404 6.406166 6.277317 6.117453 5.987465 5.920420 5.886418
Population ages 35-39, female (% of female population) 6.957745 6.827064 6.663755 6.529739 6.459250 6.441669 6.478105 6.548501 6.621664 6.680331 6.713866
Population ages 30-34, female (% of female population) 6.959592 7.011847 7.106726 7.203510 7.276410 7.335002 7.373669 7.409420 7.468848 7.565715 7.673258
Population ages 25-29, female (% of female population) 7.944221 8.002587 8.057768 8.131390 8.234766 8.375513 8.530318 8.697937 8.877428 9.064741 9.233348
Population ages 20-24, female (% of female population) 9.106619 9.273912 9.459578 9.657137 9.857714 10.068667 10.266459 10.436828 10.562662 10.632545 10.620811
Population ages 15-64, female (% of total) 59.162702 60.053857 60.972368 61.910062 62.841136 63.733156 64.551419 65.296661 65.979943 66.601810 66.967035
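
Instead of typing the row numbers by hand, the female rows can be found by searching the row names. A minimal sketch, assuming a four-row stand-in table (demo and its values are made up). Note that searching for “male” would not work the same way, because “female” contains “male”.

```r
# Hypothetical stand-in with row names in the style of the World Bank table
demo <- data.frame(X2000 = c(1, 2, 3, 4),
                   row.names = c("Population ages 40-44, male (% of male population)",
                                 "Population ages 40-44, female (% of female population)",
                                 "Population ages 15-64, male",
                                 "Population ages 15-64, female"))

# grepl() returns TRUE for every row name that contains "female";
# drop = FALSE keeps the result a data frame even with a single column
demo_female <- demo[grepl("female", rownames(demo), fixed = TRUE), , drop = FALSE]

rownames(demo_female)
```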

Exercise 04

From that, delete everything that is not percentage information, and the rows that say “% of total”.

data_female <- data_female[-c(6,7,11,12),]
head(data_female)
X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009 X2010
Population ages 40-44, female (% of female population) 5.981864 6.166992 6.330853 6.435402 6.460404 6.406166 6.277317 6.117453 5.987465 5.920420 5.886418
Population ages 35-39, female (% of female population) 6.957745 6.827064 6.663755 6.529739 6.459250 6.441669 6.478105 6.548501 6.621664 6.680331 6.713866
Population ages 30-34, female (% of female population) 6.959592 7.011847 7.106726 7.203510 7.276410 7.335002 7.373669 7.409420 7.468848 7.565715 7.673258
Population ages 25-29, female (% of female population) 7.944221 8.002587 8.057768 8.131390 8.234766 8.375513 8.530318 8.697937 8.877428 9.064741 9.233348
Population ages 20-24, female (% of female population) 9.106619 9.273912 9.459578 9.657137 9.857714 10.068667 10.266459 10.436828 10.562662 10.632545 10.620811
Population ages 15-19, female (% of female population) 10.908450 11.119250 11.312716 11.461694 11.545645 11.564548 11.512824 11.388480 11.189811 10.919026 10.550105
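
The rows to keep all end in “(% of female population)”, so they can also be selected by that suffix rather than by deleting row numbers. A minimal sketch, assuming a four-row stand-in table (labels, demo and the rounded values are made up for illustration):

```r
# Hypothetical stand-in containing percentage rows, a "% of total" row and a count row
labels <- c("Population ages 20-24, female (% of female population)",
            "Population ages 15-64, female (% of total)",
            "Population ages 15-64, female",
            "Population ages 15-19, female (% of female population)")
demo <- data.frame(X2000 = c(9.1, 59.2, 7426462, 10.9), row.names = labels)

# endsWith() compares plain text, so the parentheses need no escaping
keep <- endsWith(rownames(demo), "(% of female population)")
demo_pct <- demo[keep, , drop = FALSE]

rownames(demo_pct)
```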

Exercise 05

Create a table with the lower and upper age limits of the remaining age classes.

limits <- data.frame(lower=rev(seq(from=0, to=40, by=5)), upper=rev(seq(from=4, to=44, by=5)))   # rev reverses the elements of a vector!
limits
lower upper
40 44
35 39
30 34
25 29
20 24
15 19
10 14
5 9
0 4
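
Because every age class spans five years, the upper limit can also be derived from the lower one. A minimal sketch of this alternative construction (limits_alt is a hypothetical name, chosen so that it does not overwrite limits):

```r
# A negative step makes seq() count downwards, so rev() is not needed
lower <- seq(from = 40, to = 0, by = -5)

# Each class covers five years: lower, lower + 1, ..., lower + 4
limits_alt <- data.frame(lower = lower, upper = lower + 4)

limits_alt
```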

Exercise 06

Transpose the table about female population classes and shorten the column names.

data_female_t <- as.data.frame(t(data_female))
names(data_female_t) <- paste("female", paste(limits[,1], limits[,2], sep="_"), sep="_")
data_female_t
female_40_44 female_35_39 female_30_34 female_25_29 female_20_24 female_15_19 female_10_14 female_5_9 female_0_4
X2000 5.981864 6.957745 6.959592 7.944221 9.106619 10.90845 12.443434 12.329414 10.567889
X2001 6.166992 6.827064 7.011847 8.002587 9.273912 11.11925 12.389444 11.904599 10.112283
X2002 6.330853 6.663755 7.106726 8.057768 9.459578 11.31272 12.276334 11.402337 9.774001
X2003 6.435402 6.529739 7.203510 8.131390 9.657137 11.46169 12.094322 10.865050 9.545643
X2004 6.460404 6.459250 7.276410 8.234766 9.857714 11.54564 11.822990 10.355405 9.421040
X2005 6.406166 6.441669 7.335002 8.375513 10.068667 11.56455 11.459134 9.910157 9.383441
X2006 6.277317 6.478105 7.373669 8.530318 10.266459 11.51282 11.014852 9.505080 9.489514
X2007 6.117453 6.548501 7.409420 8.697937 10.436828 11.38848 10.525128 9.158054 9.677438
X2008 5.987465 6.621664 7.468848 8.877428 10.562662 11.18981 10.037894 8.883284 9.863345
X2009 5.920420 6.680331 7.565715 9.064741 10.632545 10.91903 9.585812 8.687207 9.998519
X2010 5.886418 6.713866 7.673258 9.233348 10.620811 10.55011 9.143348 8.579245 10.285814
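
The call to as.data.frame() is needed because t() returns a matrix, not a data frame. A minimal sketch on a two-by-two stand-in table (demo, demo_t and the rounded values are made up for illustration):

```r
# Hypothetical stand-in: two age classes (rows) and two years (columns)
demo <- data.frame(X2000 = c(5.98, 6.96), X2001 = c(6.17, 6.83),
                   row.names = c("ages 40-44", "ages 35-39"))

is.matrix(t(demo))                  # t() on a data frame yields a matrix

# Convert back to a data frame; the years become the row names
demo_t <- as.data.frame(t(demo))
names(demo_t) <- c("female_40_44", "female_35_39")

demo_t
```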

Exercise 07

What is the percentage of females older than 44 years for the years 2000 to 2010?

100-apply(data_female_t, 1, sum)
##    X2000    X2001    X2002    X2003    X2004    X2005    X2006    X2007 
## 16.80077 17.19202 17.61593 18.07611 18.56638 19.05570 19.55186 20.04076 
##    X2008    X2009    X2010 
## 20.50760 20.94568 21.31379
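
The row-wise sum computed with apply() can also be written with the base function rowSums(). A minimal sketch comparing both on a two-row stand-in table (demo_t here is a made-up miniature, not the table from Exercise 06):

```r
# Hypothetical miniature of the transposed table: years in rows, age classes in columns
demo_t <- data.frame(female_5_9 = c(12.3, 11.9), female_0_4 = c(10.6, 10.1),
                     row.names = c("X2000", "X2001"))

rest_apply   <- 100 - apply(demo_t, 1, sum)   # approach used in the solution
rest_rowsums <- 100 - rowSums(demo_t)         # equivalent vectorized form

all.equal(rest_apply, rest_rowsums)
```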

Exercise 08

What was the mean percentage of females aged 35-39 between 2000 and 2005, and between 2000 and 2010?

mean(data_female_t[c(1:6), 2])
## [1] 6.646537
mean(data_female_t$female_35_39)
## [1] 6.629244
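
Rows and columns can also be addressed by name, which makes the selected years visible in the code. A minimal sketch, assuming a one-column stand-in that reuses the female_35_39 values printed in Exercise 06 (the object name demo_t is hypothetical):

```r
# Values of the column female_35_39 for 2000 to 2010, copied from the table above
female_35_39 <- c(6.957745, 6.827064, 6.663755, 6.529739, 6.459250, 6.441669,
                  6.478105, 6.548501, 6.621664, 6.680331, 6.713866)
demo_t <- data.frame(female_35_39 = female_35_39,
                     row.names = paste0("X", 2000:2010))

# Select the rows X2000 ... X2005 by name, then average
mean_2000_2005 <- mean(demo_t[paste0("X", 2000:2005), "female_35_39"])
mean_2000_2005
```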

Exercise 09

Go back to the first table and create a new table with all information on people aged 0-14.

data_0_14 <- data[which(substr(rownames(data), 1, 20) == "Population ages 0-14"),]
data_0_14
X1960 X1961 X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973 X1974 X1975 X1976 X1977 X1978 X1979 X1980 X1981 X1982 X1983 X1984 X1985 X1986 X1987 X1988 X1989 X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009 X2010 X2011 X2012 X2013 X2014 X2015 X2016
Population ages 0-14 (% of total) 4.071612e+01 4.186784e+01 4.298322e+01 4.388336e+01 4.449996e+01 4.504029e+01 4.529597e+01 4.534851e+01 4.530255e+01 4.518551e+01 4.482240e+01 4.447456e+01 4.410435e+01 4.365193e+01 4.310739e+01 4.270729e+01 4.218003e+01 4.160947e+01 4.111059e+01 4.073476e+01 4.048721e+01 4.037297e+01 4.035358e+01 4.035898e+01 4.034875e+01 4.048204e+01 4.057987e+01 4.064106e+01 4.067171e+01 4.067156e+01 4.073321e+01 4.071987e+01 4.063673e+01 4.047158e+01 4.020406e+01 3.982813e+01 3.933266e+01 3.872872e+01 3.803474e+01 3.726417e+01 3.634240e+01 3.541057e+01 3.446377e+01 3.351523e+01 3.259638e+01 3.172937e+01 3.095429e+01 3.026875e+01 2.966257e+01 2.913205e+01 2.886115e+01 2.856638e+01 2.830386e+01 2.813108e+01 2.805466e+01 2.797840e+01 2.798451e+01
Population ages 0-14, total 3.598226e+06 3.825924e+06 4.063741e+06 4.295736e+06 4.513960e+06 4.738634e+06 4.945163e+06 5.134855e+06 5.311201e+06 5.471978e+06 5.592517e+06 5.705474e+06 5.810126e+06 5.903964e+06 5.988753e+06 6.098225e+06 6.192644e+06 6.281371e+06 6.379074e+06 6.493017e+06 6.625149e+06 6.778649e+06 6.949375e+06 7.127385e+06 7.306472e+06 7.515683e+06 7.722300e+06 7.925505e+06 8.125853e+06 8.341737e+06 8.534422e+06 8.734006e+06 8.916512e+06 9.056327e+06 9.160496e+06 9.250084e+06 9.308862e+06 9.314644e+06 9.246872e+06 9.185767e+06 9.072681e+06 8.948906e+06 8.811575e+06 8.668497e+06 8.529495e+06 8.404556e+06 8.316798e+06 8.264217e+06 8.236524e+06 8.320813e+06 8.467690e+06 8.505498e+06 8.559994e+06 8.652474e+06 8.780801e+06 8.910534e+06 9.063398e+06
Population ages 0-14, male (% of total) 4.272459e+01 4.382624e+01 4.493039e+01 4.584365e+01 4.648068e+01 4.701742e+01 4.728065e+01 4.733719e+01 4.727930e+01 4.712993e+01 4.666728e+01 4.621012e+01 4.573872e+01 4.520412e+01 4.459384e+01 4.413426e+01 4.355289e+01 4.292959e+01 4.237862e+01 4.195376e+01 4.167187e+01 4.152444e+01 4.147693e+01 4.146128e+01 4.143465e+01 4.155961e+01 4.165014e+01 4.170190e+01 4.172058e+01 4.170854e+01 4.175783e+01 4.173396e+01 4.164178e+01 4.146618e+01 4.118420e+01 4.082848e+01 4.034172e+01 3.974018e+01 3.904989e+01 3.828701e+01 3.735575e+01 3.642683e+01 3.548707e+01 3.453782e+01 3.360585e+01 3.271785e+01 3.191018e+01 3.118728e+01 3.055033e+01 3.000167e+01 2.972236e+01 2.942955e+01 2.917704e+01 2.901647e+01 2.895058e+01 2.889303e+01 2.891223e+01
Population ages 0-14, male 1.815435e+06 1.928130e+06 2.047913e+06 2.166533e+06 2.279349e+06 2.394552e+06 2.501921e+06 2.601182e+06 2.693137e+06 2.776234e+06 2.835444e+06 2.889894e+06 2.940447e+06 2.986753e+06 3.029652e+06 3.085006e+06 3.133295e+06 3.178787e+06 3.228598e+06 3.286465e+06 3.354377e+06 3.432882e+06 3.520159e+06 3.611331e+06 3.703050e+06 3.809833e+06 3.915014e+06 4.018354e+06 4.120709e+06 4.232072e+06 4.332720e+06 4.438241e+06 4.535734e+06 4.610902e+06 4.666392e+06 4.717204e+06 4.750181e+06 4.754844e+06 4.722015e+06 4.693326e+06 4.636537e+06 4.575885e+06 4.509100e+06 4.438790e+06 4.369325e+06 4.306309e+06 4.260833e+06 4.232643e+06 4.217922e+06 4.262094e+06 4.338764e+06 4.361333e+06 4.393616e+06 4.445335e+06 4.514702e+06 4.585999e+06 4.667807e+06
Population ages 0-14, female (% of total) 3.885587e+01 4.004961e+01 4.117077e+01 4.205366e+01 4.264616e+01 4.318524e+01 4.342894e+01 4.347294e+01 4.343403e+01 4.334387e+01 4.307071e+01 4.282309e+01 4.254608e+01 4.216921e+01 4.168474e+01 4.133878e+01 4.086070e+01 4.033824e+01 3.988732e+01 3.955680e+01 3.934011e+01 3.925613e+01 3.926229e+01 3.928646e+01 3.929056e+01 3.943098e+01 3.953493e+01 3.960436e+01 3.964574e+01 3.965623e+01 3.972783e+01 3.972264e+01 3.964621e+01 3.948941e+01 3.923452e+01 3.883798e+01 3.833386e+01 3.772782e+01 3.703048e+01 3.625239e+01 3.534074e+01 3.440633e+01 3.345267e+01 3.250502e+01 3.159943e+01 3.075273e+01 3.000944e+01 2.936062e+01 2.878452e+01 2.827154e+01 2.800841e+01 2.771110e+01 2.743799e+01 2.725245e+01 2.716502e+01 2.706964e+01 2.706237e+01
Population ages 0-14, female 1.782782e+06 1.897795e+06 2.015837e+06 2.129206e+06 2.234607e+06 2.344084e+06 2.443229e+06 2.533641e+06 2.618036e+06 2.695742e+06 2.757036e+06 2.815537e+06 2.869647e+06 2.917197e+06 2.959098e+06 3.013213e+06 3.059334e+06 3.102558e+06 3.150461e+06 3.206554e+06 3.270762e+06 3.345773e+06 3.429241e+06 3.516078e+06 3.603421e+06 3.705851e+06 3.807259e+06 3.907099e+06 4.005094e+06 4.109664e+06 4.201685e+06 4.295763e+06 4.380799e+06 4.445453e+06 4.494103e+06 4.532893e+06 4.558717e+06 4.559849e+06 4.524898e+06 4.492442e+06 4.436184e+06 4.373043e+06 4.302459e+06 4.229677e+06 4.160166e+06 4.098199e+06 4.055902e+06 4.031538e+06 4.018587e+06 4.058722e+06 4.128931e+06 4.144176e+06 4.166385e+06 4.207140e+06 4.266101e+06 4.324528e+06 4.395593e+06
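
The substr() comparison depends on counting the 20 characters of the search text correctly. A minimal sketch of the same selection with startsWith(), which needs no character count (labels is a made-up vector of row names in the style of the table):

```r
# Hypothetical row names, including two similar labels that must not match
labels <- c("Population ages 0-14, total",
            "Population ages 0-4, male (% of male population)",
            "Population ages 10-14, female (% of female population)",
            "Population ages 0-14, female")

# TRUE where the label begins with the given text
is_0_14 <- startsWith(labels, "Population ages 0-14")

labels[is_0_14]
```

On the real table, the same logical vector could be used as the row index, for example data[startsWith(rownames(data), "Population ages 0-14"), ].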

Exercise 10

What was the net change in numbers of male and female children between 1960 and 2016?

dim(data_0_14)
## [1]  6 57
(data_0_14[,57]-data_0_14[,1])[c(4,6)]
## [1] 2852372 2612811
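
Selecting by position (columns 57 and 1, rows 4 and 6) hides which years and groups are compared. A minimal sketch of the same subtraction using names, assuming a two-row stand-in that reuses the 1960 and 2016 counts printed above (demo and net_change are hypothetical names):

```r
# Counts of male and female children in 1960 and 2016, copied from the table above
demo <- data.frame(X1960 = c(1815435, 1782782),
                   X2016 = c(4667807, 4395593),
                   row.names = c("Population ages 0-14, male",
                                 "Population ages 0-14, female"))

# Net change per row, labelled with the row names
net_change <- demo$X2016 - demo$X1960
names(net_change) <- rownames(demo)

net_change
```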

Exercise 11

Create a vector x that consists of the numbers from 1 to 20. Run a FOR-loop for i in 1 to 20, and at each step, print the product of the value i and value i+1. Then, avoid the last printed value, which is “NA”, by transforming the previous FOR-loop into a WHILE-loop.

x <- c(1:20)

for (i in 1:20)
{ print(x[i]*x[i+1]) }
## [1] 2
## [1] 6
## [1] 12
## [1] 20
## [1] 30
## [1] 42
## [1] 56
## [1] 72
## [1] 90
## [1] 110
## [1] 132
## [1] 156
## [1] 182
## [1] 210
## [1] 240
## [1] 272
## [1] 306
## [1] 342
## [1] 380
## [1] NA

i <- 1
while (i <= 19)
{ print(x[i]*x[i+1]) 
  i <- i+1}
## [1] 2
## [1] 6
## [1] 12
## [1] 20
## [1] 30
## [1] 42
## [1] 56
## [1] 72
## [1] 90
## [1] 110
## [1] 132
## [1] 156
## [1] 182
## [1] 210
## [1] 240
## [1] 272
## [1] 306
## [1] 342
## [1] 380
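
The NA appears because x[21] does not exist. Besides the WHILE-loop, the same result can be obtained by shortening the FOR-loop range or by multiplying two shifted copies of the vector. A minimal sketch (products is a hypothetical name):

```r
x <- 1:20

# FOR-loop that stops one element early, so x[i + 1] always exists
for (i in seq_len(length(x) - 1)) {
  print(x[i] * x[i + 1])
}

# Vectorized form: all elements but the last, times all elements but the first
products <- x[-length(x)] * x[-1]
products
```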

Exercise 12

Create a vector y that is a sequence from 10 to 200, consisting of values divisible by 10. Run a FOR-loop for i in 1 to 20, and at each iteration, print y[i] if y[i] is divisible by 50, and print “not divisible by 50” if it is not. Do that by including an IF-statement in the FOR-loop. Then, change the IF-statement so only values that are above 50 and below 150 are printed.

y <- seq(from=10, to=200, by=10)

for (i in 1:20)
{ if (y[i]%%50==0)
  print(y[i])
  else print("not divisible by 50")}
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 50
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 100
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 150
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 200

for (i in 1:20)
{ if ((y[i]>50) && (y[i]<150))      # && is the scalar AND intended for IF-conditions
  print(y[i])
}
## [1] 60
## [1] 70
## [1] 80
## [1] 90
## [1] 100
## [1] 110
## [1] 120
## [1] 130
## [1] 140
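
Both loops can also be written without explicit iteration, because comparisons and ifelse() work on whole vectors. A minimal sketch (labelled and in_range are hypothetical names):

```r
y <- seq(from = 10, to = 200, by = 10)

# ifelse() evaluates the condition for every element at once
labelled <- ifelse(y %% 50 == 0, as.character(y), "not divisible by 50")

# Logical indexing keeps only the values above 50 and below 150;
# the element-wise & is the right operator here, since y has 20 elements
in_range <- y[y > 50 & y < 150]

labelled
in_range
```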

Exercise 13

Create a function “sum_mean” that takes two vectors a and b as arguments, calculates their element-wise sum “vectors_sum”, and returns the mean of “vectors_sum”. Run the function with the vectors x and y from the previous two exercises.

sum_mean <- function(a, b) 
  { vectors_sum <- a+b
    return(mean(vectors_sum))
  }

sum_mean(a=x, b=y)
## [1] 115.5
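
If a and b differ in length, R recycles the shorter vector, which can silently change the result. A minimal sketch of a variant that checks the inputs first (sum_mean_checked is a hypothetical name, not part of the exercise):

```r
sum_mean_checked <- function(a, b) {
  # Stop early if the inputs are not numeric or do not have the same length
  stopifnot(is.numeric(a), is.numeric(b), length(a) == length(b))
  mean(a + b)
}

sum_mean_checked(1:20, seq(from = 10, to = 200, by = 10))
```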

Exercise 14

Write a FOR-loop that computes a vector of all prime numbers between 3 and 100.

prime_numbers <- NULL         # initiate an empty prime_number vector
for (i in 3:100)
{ division <- NULL            # initiate an empty division vector
  for (j in (2:(i-1)))
  { if (i%%j == 0)            # check if i is divisible by j
    division[j] <- 1          # if it is divisible, paste the number "1" in the division vector
  }
 if (!(1 %in% division))        # only append the prime_number vector if there is no number "1" in the division vector
 prime_numbers <- c(prime_numbers, i)
}
prime_numbers
##  [1]  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

# Try to see how long it takes to compute all prime numbers below 1000, 10000, 100000 ...
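
For the timing experiment suggested above, system.time() reports how long an expression takes. The following is a minimal sketch, not the solution's method: it wraps a variant of the loop in a function (primes_between is a hypothetical name) and only tests divisors up to the square root of i, which is sufficient because any larger divisor has a partner below the square root.

```r
primes_between <- function(from, to) {
  primes <- integer(0)
  for (i in from:to) {
    # Candidate divisors 2 ... floor(sqrt(i)); empty for i < 4
    divisors <- seq_len(floor(sqrt(i)))[-1]
    # all() of an empty vector is TRUE, so 2 and 3 are kept as primes
    if (all(i %% divisors != 0)) primes <- c(primes, i)
  }
  primes
}

primes_between(3, 100)

# Elapsed time for a larger range; the numbers depend on the machine
system.time(primes_between(3, 10000))
```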

Exercise 15

Write a function that computes the binomial coefficient of two values n and k, and test it for n=49 and k=6.

bin_coeff <- function(n, k)
{ n_factorial <- 1
  for (i in seq_len(n))             # seq_len(0) is empty, so 0! correctly stays 1 (1:0 would count down)
  { n_factorial <- n_factorial*i }

  k_factorial <- 1
  for (i in seq_len(k))
  { k_factorial <- k_factorial*i }

  nk_factorial <- 1
  for (i in seq_len(n-k))
  { nk_factorial <- nk_factorial*i }

  return(n_factorial/(k_factorial*nk_factorial))
}

bin_coeff(49, 6)
## [1] 13983816
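
The result can be checked against base R, which provides choose() for binomial coefficients and factorial() for factorials. A minimal sketch of such a check (by_choose and by_factorial are hypothetical names):

```r
# Built-in binomial coefficient
by_choose <- choose(49, 6)

# Same quantity from the definition n! / (k! * (n-k)!), using the built-in factorial()
by_factorial <- factorial(49) / (factorial(6) * factorial(43))

by_choose
all.equal(by_choose, by_factorial)
```

Because 49! is far too large to be stored exactly as a floating-point number, the comparison uses all.equal() rather than ==.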

Exercise 16

For the following exercise, we will work with the built-in dataset iris. You can view a description of this dataset [here](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html) and have a look at the data by executing “head(iris)”.

Create a copy of “iris” named “data_iris”. Add a new column to “data_iris” and name it “description”. Characterize each entry according to whether “Sepal.Length” is less than or equal to, or greater than, 5.8, and whether “Petal.Length” is less than or equal to, or greater than, 4.35, by writing one of the following four statements into the new column:

  • “short sepals, short petals”
  • “short sepals, long petals”
  • “long sepals, short petals”
  • “long sepals, long petals”

data_iris <- iris
data_iris$description <- NA_character_     # empty character column, filled in by the loop
for (i in seq_len(nrow(data_iris)))
{ if (data_iris$Sepal.Length[i] <= 5.8)
  { if (data_iris$Petal.Length[i] <= 4.35)
    data_iris$description[i] <- "short sepals, short petals"
    else data_iris$description[i] <- "short sepals, long petals"
  }
  else if (data_iris$Petal.Length[i] <= 4.35)
    data_iris$description[i] <- "long sepals, short petals"
    else data_iris$description[i] <- "long sepals, long petals"
}

head(data_iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species description
5.1 3.5 1.4 0.2 setosa short sepals, short petals
4.9 3.0 1.4 0.2 setosa short sepals, short petals
4.7 3.2 1.3 0.2 setosa short sepals, short petals
4.6 3.1 1.5 0.2 setosa short sepals, short petals
5.0 3.6 1.4 0.2 setosa short sepals, short petals
5.4 3.9 1.7 0.4 setosa short sepals, short petals
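
The same classification can be written without a loop, since ifelse() and paste() operate on whole columns. A minimal sketch using the thresholds from the exercise (data_iris_vec, sepal and petal are hypothetical names, chosen so that data_iris is not overwritten):

```r
data_iris_vec <- iris

# One label per row for each of the two measurements
sepal <- ifelse(data_iris_vec$Sepal.Length <= 5.8,  "short sepals", "long sepals")
petal <- ifelse(data_iris_vec$Petal.Length <= 4.35, "short petals", "long petals")

# Combine both labels into the description column
data_iris_vec$description <- paste(sepal, petal, sep = ", ")

head(data_iris_vec)
table(data_iris_vec$description)
```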

Exercise 17

Write a function that takes (1) a table with the same structure as the iris data and (2) a threshold value as inputs, and computes the mean “Sepal.Length” of all entries whose “Sepal.Width” exceeds the threshold. Run this function on “data_iris”, setting the threshold to 3.

my.function <- function(table, threshold)
{ return(mean(subset(table, Sepal.Width > threshold)$Sepal.Length))
}

my.function(data_iris, 3)
## [1] 5.683582
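
subset() looks up Sepal.Width inside the table by name, which is convenient interactively. A minimal sketch of an equivalent function that uses explicit logical indexing instead (mean_sepal_length_above and the argument name tbl are hypothetical; tbl also avoids reusing the name of the base function table()):

```r
mean_sepal_length_above <- function(tbl, threshold) {
  # TRUE for every row whose sepal width exceeds the threshold
  wide <- tbl$Sepal.Width > threshold
  mean(tbl$Sepal.Length[wide])
}

mean_sepal_length_above(iris, 3)
```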