Load “Demographics_Uzbekistan_Worldbank2.csv” into R and convert the first column to rownames.
setwd("F:/IAMO_B/2022_DSIK/eLearning/Module_0_R-Basics/data/")
data <- read.csv("Demographics_Uzbekistan_Worldbank2.csv", sep=";", header=TRUE)
rownames(data) <- data[,1]
data <- data[,-1]
head(data)
| | X1960 | X1961 | X1962 | X1963 | X1964 | X1965 | X1966 | X1967 | X1968 | X1969 | X1970 | X1971 | X1972 | X1973 | X1974 | X1975 | X1976 | X1977 | X1978 | X1979 | X1980 | X1981 | X1982 | X1983 | X1984 | X1985 | X1986 | X1987 | X1988 | X1989 | X1990 | X1991 | X1992 | X1993 | X1994 | X1995 | X1996 | X1997 | X1998 | X1999 | X2000 | X2001 | X2002 | X2003 | X2004 | X2005 | X2006 | X2007 | X2008 | X2009 | X2010 | X2011 | X2012 | X2013 | X2014 | X2015 | X2016 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Population ages 40-44, male (% of male population) | 2.802203 | 2.896451 | 3.069046 | 3.336136 | 3.690905 | 4.101962 | 4.590526 | 5.059065 | 5.379471 | 5.496317 | 5.466242 | 5.254891 | 4.969459 | 4.751594 | 4.665228 | 4.655549 | 4.750850 | 4.865208 | 4.886789 | 4.770856 | 4.560047 | 4.269012 | 3.948933 | 3.661139 | 3.435415 | 3.242697 | 3.077853 | 2.972073 | 2.973244 | 3.105117 | 3.349855 | 3.715974 | 4.138677 | 4.529740 | 4.840484 | 5.078634 | 5.250469 | 5.377954 | 5.495599 | 5.621172 | 5.762577 | 5.902289 | 6.020272 | 6.087027 | 6.088726 | 6.028875 | 5.911667 | 5.774145 | 5.668374 | 5.622695 | 5.611772 | 5.664594 | 5.755805 | 5.848225 | 5.922188 | 5.988523 | 6.044512 |
| Population ages 40-44, female (% of female population) | 3.822331 | 3.982644 | 4.192144 | 4.410170 | 4.609609 | 4.781583 | 4.959838 | 5.115214 | 5.201543 | 5.194364 | 5.147529 | 4.994136 | 4.800338 | 4.650012 | 4.576971 | 4.545546 | 4.579625 | 4.613422 | 4.567457 | 4.417975 | 4.223169 | 3.962299 | 3.678375 | 3.426918 | 3.238143 | 3.094232 | 2.992777 | 2.952878 | 3.002819 | 3.154991 | 3.398435 | 3.737865 | 4.125107 | 4.493590 | 4.805055 | 5.062565 | 5.270672 | 5.447141 | 5.619374 | 5.799410 | 5.981864 | 6.166992 | 6.330853 | 6.435402 | 6.460404 | 6.406166 | 6.277317 | 6.117453 | 5.987465 | 5.920420 | 5.886418 | 5.917532 | 5.989325 | 6.063903 | 6.120755 | 6.172542 | 6.213516 |
| Population ages 35-39, male (% of male population) | 5.068229 | 5.680684 | 6.247044 | 6.623900 | 6.751899 | 6.647504 | 6.345956 | 5.962686 | 5.661149 | 5.535142 | 5.551201 | 5.683749 | 5.832172 | 5.863054 | 5.719383 | 5.428033 | 5.042695 | 4.631868 | 4.277190 | 4.018055 | 3.809305 | 3.641951 | 3.543927 | 3.561232 | 3.716860 | 3.997206 | 4.410961 | 4.886901 | 5.324493 | 5.666861 | 5.907056 | 6.062704 | 6.161391 | 6.248383 | 6.348410 | 6.452520 | 6.570107 | 6.674770 | 6.727213 | 6.709495 | 6.651315 | 6.528360 | 6.382762 | 6.271256 | 6.222817 | 6.227487 | 6.285753 | 6.376802 | 6.468611 | 6.543308 | 6.593019 | 6.642547 | 6.697085 | 6.768352 | 6.864493 | 6.989650 | 7.130977 |
| Population ages 35-39, female (% of female population) | 5.821614 | 6.025183 | 6.178266 | 6.258219 | 6.250714 | 6.132714 | 5.919202 | 5.659269 | 5.437076 | 5.310146 | 5.323403 | 5.406589 | 5.479677 | 5.442876 | 5.261683 | 4.965847 | 4.602250 | 4.240978 | 3.953282 | 3.760947 | 3.608704 | 3.503818 | 3.466747 | 3.527378 | 3.698605 | 3.969144 | 4.343000 | 4.767395 | 5.168218 | 5.502434 | 5.766902 | 5.967794 | 6.124475 | 6.270773 | 6.424605 | 6.588602 | 6.762677 | 6.916353 | 7.006158 | 7.012098 | 6.957745 | 6.827064 | 6.663755 | 6.529739 | 6.459250 | 6.441669 | 6.478105 | 6.548501 | 6.621664 | 6.680331 | 6.713866 | 6.749273 | 6.792030 | 6.854483 | 6.945114 | 7.069720 | 7.212531 |
| Population ages 30-34, male (% of male population) | 8.245330 | 7.852095 | 7.316644 | 6.854373 | 6.584949 | 6.479938 | 6.579324 | 6.780373 | 6.899371 | 6.816811 | 6.458001 | 5.942376 | 5.381791 | 4.908180 | 4.580652 | 4.344821 | 4.194960 | 4.139638 | 4.209644 | 4.429236 | 4.762383 | 5.237618 | 5.775239 | 6.263017 | 6.637779 | 6.908633 | 7.078251 | 7.186157 | 7.285387 | 7.394526 | 7.471382 | 7.556395 | 7.620322 | 7.623810 | 7.552469 | 7.425810 | 7.244411 | 7.049147 | 6.897504 | 6.821205 | 6.833832 | 6.908167 | 7.020846 | 7.132666 | 7.218315 | 7.291266 | 7.343232 | 7.389954 | 7.455972 | 7.554622 | 7.663350 | 7.802630 | 7.964620 | 8.140442 | 8.324214 | 8.526776 | 8.732077 |
| Population ages 30-34, female (% of female population) | 7.484621 | 7.253412 | 6.944133 | 6.660939 | 6.468239 | 6.352704 | 6.373839 | 6.448928 | 6.433025 | 6.242900 | 5.869036 | 5.357310 | 4.832392 | 4.430377 | 4.198384 | 4.070369 | 4.027605 | 4.049673 | 4.141750 | 4.326083 | 4.639296 | 5.061803 | 5.538733 | 5.989117 | 6.360052 | 6.653069 | 6.870051 | 7.035156 | 7.188262 | 7.350081 | 7.490770 | 7.638435 | 7.758649 | 7.803858 | 7.754286 | 7.641860 | 7.454868 | 7.241276 | 7.068332 | 6.973337 | 6.959592 | 7.011847 | 7.106726 | 7.203510 | 7.276410 | 7.335002 | 7.373669 | 7.409420 | 7.468848 | 7.565715 | 7.673258 | 7.815668 | 7.981538 | 8.156203 | 8.330237 | 8.516460 | 8.698230 |
Select only the entries corresponding to the years 2000 to 2010.
names(data)
## [1] "X1960" "X1961" "X1962" "X1963" "X1964" "X1965" "X1966" "X1967" "X1968"
## [10] "X1969" "X1970" "X1971" "X1972" "X1973" "X1974" "X1975" "X1976" "X1977"
## [19] "X1978" "X1979" "X1980" "X1981" "X1982" "X1983" "X1984" "X1985" "X1986"
## [28] "X1987" "X1988" "X1989" "X1990" "X1991" "X1992" "X1993" "X1994" "X1995"
## [37] "X1996" "X1997" "X1998" "X1999" "X2000" "X2001" "X2002" "X2003" "X2004"
## [46] "X2005" "X2006" "X2007" "X2008" "X2009" "X2010" "X2011" "X2012" "X2013"
## [55] "X2014" "X2015" "X2016"
data_sel <- data[,41:51]
data_sel
| | X2000 | X2001 | X2002 | X2003 | X2004 | X2005 | X2006 | X2007 | X2008 | X2009 | X2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Population ages 40-44, male (% of male population) | 5.762577e+00 | 5.902289e+00 | 6.020272e+00 | 6.087027e+00 | 6.088726e+00 | 6.028875e+00 | 5.911667e+00 | 5.774145e+00 | 5.668374e+00 | 5.622695e+00 | 5.611772e+00 |
| Population ages 40-44, female (% of female population) | 5.981864e+00 | 6.166992e+00 | 6.330853e+00 | 6.435402e+00 | 6.460404e+00 | 6.406166e+00 | 6.277317e+00 | 6.117453e+00 | 5.987465e+00 | 5.920420e+00 | 5.886418e+00 |
| Population ages 35-39, male (% of male population) | 6.651315e+00 | 6.528360e+00 | 6.382762e+00 | 6.271256e+00 | 6.222817e+00 | 6.227487e+00 | 6.285753e+00 | 6.376802e+00 | 6.468611e+00 | 6.543308e+00 | 6.593019e+00 |
| Population ages 35-39, female (% of female population) | 6.957745e+00 | 6.827064e+00 | 6.663755e+00 | 6.529739e+00 | 6.459250e+00 | 6.441669e+00 | 6.478105e+00 | 6.548501e+00 | 6.621664e+00 | 6.680331e+00 | 6.713866e+00 |
| Population ages 30-34, male (% of male population) | 6.833832e+00 | 6.908167e+00 | 7.020846e+00 | 7.132666e+00 | 7.218315e+00 | 7.291266e+00 | 7.343232e+00 | 7.389954e+00 | 7.455972e+00 | 7.554622e+00 | 7.663350e+00 |
| Population ages 30-34, female (% of female population) | 6.959592e+00 | 7.011847e+00 | 7.106726e+00 | 7.203510e+00 | 7.276410e+00 | 7.335002e+00 | 7.373669e+00 | 7.409420e+00 | 7.468848e+00 | 7.565715e+00 | 7.673258e+00 |
| Population ages 25-29, male (% of male population) | 8.020528e+00 | 8.087925e+00 | 8.146005e+00 | 8.217777e+00 | 8.315383e+00 | 8.452951e+00 | 8.601684e+00 | 8.764026e+00 | 8.943727e+00 | 9.139615e+00 | 9.325499e+00 |
| Population ages 25-29, female (% of female population) | 7.944221e+00 | 8.002587e+00 | 8.057768e+00 | 8.131390e+00 | 8.234766e+00 | 8.375513e+00 | 8.530318e+00 | 8.697937e+00 | 8.877428e+00 | 9.064741e+00 | 9.233348e+00 |
| Population ages 20-24, male (% of male population) | 9.322579e+00 | 9.473200e+00 | 9.640012e+00 | 9.826196e+00 | 1.002867e+01 | 1.025542e+01 | 1.047710e+01 | 1.067921e+01 | 1.084407e+01 | 1.095944e+01 | 1.099881e+01 |
| Population ages 20-24, female (% of female population) | 9.106619e+00 | 9.273912e+00 | 9.459578e+00 | 9.657137e+00 | 9.857714e+00 | 1.006867e+01 | 1.026646e+01 | 1.043683e+01 | 1.056266e+01 | 1.063254e+01 | 1.062081e+01 |
| Population ages 15-64 (% of total) | 5.896779e+01 | 5.983194e+01 | 6.072071e+01 | 6.163776e+01 | 6.256140e+01 | 6.344965e+01 | 6.427497e+01 | 6.503368e+01 | 6.572899e+01 | 6.635761e+01 | 6.672003e+01 |
| Population ages 15-64, total | 1.472099e+07 | 1.512064e+07 | 1.552486e+07 | 1.594221e+07 | 1.637044e+07 | 1.680670e+07 | 1.726940e+07 | 1.775602e+07 | 1.825123e+07 | 1.895333e+07 | 1.957526e+07 |
| Population ages 15-64, male (% of total) | 5.877059e+01 | 5.960738e+01 | 6.046599e+01 | 6.136212e+01 | 6.227816e+01 | 6.316269e+01 | 6.399530e+01 | 6.476767e+01 | 6.547526e+01 | 6.611085e+01 | 6.647054e+01 |
| Population ages 15-64, male | 7.294513e+06 | 7.487792e+06 | 7.683002e+06 | 7.886239e+06 | 8.097208e+06 | 8.313447e+06 | 8.545025e+06 | 8.790071e+06 | 9.039822e+06 | 9.391832e+06 | 9.703133e+06 |
| Population ages 15-64, female (% of total) | 5.916270e+01 | 6.005386e+01 | 6.097237e+01 | 6.191006e+01 | 6.284114e+01 | 6.373316e+01 | 6.455142e+01 | 6.529666e+01 | 6.597994e+01 | 6.660181e+01 | 6.696704e+01 |
| Population ages 15-64, female | 7.426462e+06 | 7.632843e+06 | 7.841858e+06 | 8.055975e+06 | 8.273236e+06 | 8.493267e+06 | 8.724394e+06 | 8.965953e+06 | 9.211413e+06 | 9.561496e+06 | 9.872118e+06 |
| Population ages 15-19, male (% of male population) | 1.122094e+01 | 1.144487e+01 | 1.166122e+01 | 1.184738e+01 | 1.198269e+01 | 1.205458e+01 | 1.206638e+01 | 1.200389e+01 | 1.184740e+01 | 1.158972e+01 | 1.120364e+01 |
| Population ages 15-19, female (% of female population) | 1.090845e+01 | 1.111925e+01 | 1.131272e+01 | 1.146169e+01 | 1.154564e+01 | 1.156455e+01 | 1.151282e+01 | 1.138848e+01 | 1.118981e+01 | 1.091903e+01 | 1.055011e+01 |
| Population ages 10-14, male (% of male population) | 1.302027e+01 | 1.302936e+01 | 1.298148e+01 | 1.285302e+01 | 1.260701e+01 | 1.222476e+01 | 1.173672e+01 | 1.118685e+01 | 1.063899e+01 | 1.013833e+01 | 9.653681e+00 |
| Population ages 10-14, female (% of female population) | 1.244343e+01 | 1.238944e+01 | 1.227633e+01 | 1.209432e+01 | 1.182299e+01 | 1.145913e+01 | 1.101485e+01 | 1.052513e+01 | 1.003789e+01 | 9.585812e+00 | 9.143348e+00 |
| Population ages 5-9, male (% of male population) | 1.317894e+01 | 1.273059e+01 | 1.216944e+01 | 1.155742e+01 | 1.098422e+01 | 1.048905e+01 | 1.005015e+01 | 9.685773e+00 | 9.404078e+00 | 9.202041e+00 | 9.089541e+00 |
| Population ages 5-9, female (% of female population) | 1.232941e+01 | 1.190460e+01 | 1.140234e+01 | 1.086505e+01 | 1.035541e+01 | 9.910157e+00 | 9.505080e+00 | 9.158054e+00 | 8.883284e+00 | 8.687207e+00 | 8.579245e+00 |
| Population ages 0-14 (% of total) | 3.634240e+01 | 3.541057e+01 | 3.446377e+01 | 3.351523e+01 | 3.259638e+01 | 3.172937e+01 | 3.095429e+01 | 3.026875e+01 | 2.966257e+01 | 2.913205e+01 | 2.886115e+01 |
| Population ages 0-14, total | 9.072681e+06 | 8.948906e+06 | 8.811575e+06 | 8.668497e+06 | 8.529495e+06 | 8.404556e+06 | 8.316798e+06 | 8.264217e+06 | 8.236524e+06 | 8.320813e+06 | 8.467690e+06 |
| Population ages 0-14, male (% of total) | 3.735575e+01 | 3.642683e+01 | 3.548707e+01 | 3.453782e+01 | 3.360585e+01 | 3.271785e+01 | 3.191018e+01 | 3.118728e+01 | 3.055033e+01 | 3.000167e+01 | 2.972236e+01 |
| Population ages 0-14, male | 4.636537e+06 | 4.575885e+06 | 4.509100e+06 | 4.438790e+06 | 4.369325e+06 | 4.306309e+06 | 4.260833e+06 | 4.232643e+06 | 4.217922e+06 | 4.262094e+06 | 4.338764e+06 |
| Population ages 0-14, female (% of total) | 3.534074e+01 | 3.440633e+01 | 3.345267e+01 | 3.250502e+01 | 3.159943e+01 | 3.075273e+01 | 3.000944e+01 | 2.936062e+01 | 2.878452e+01 | 2.827154e+01 | 2.800841e+01 |
| Population ages 0-14, female | 4.436184e+06 | 4.373043e+06 | 4.302459e+06 | 4.229677e+06 | 4.160166e+06 | 4.098199e+06 | 4.055902e+06 | 4.031538e+06 | 4.018587e+06 | 4.058722e+06 | 4.128931e+06 |
| Population ages 0-4, male (% of male population) | 1.115654e+01 | 1.066688e+01 | 1.033615e+01 | 1.012739e+01 | 1.001461e+01 | 1.000403e+01 | 1.012331e+01 | 1.031466e+01 | 1.050726e+01 | 1.066130e+01 | 1.097914e+01 |
| Population ages 0-4, female (% of female population) | 1.056789e+01 | 1.011228e+01 | 9.774001e+00 | 9.545643e+00 | 9.421040e+00 | 9.383441e+00 | 9.489514e+00 | 9.677438e+00 | 9.863345e+00 | 9.998519e+00 | 1.028581e+01 |
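
Positional indices such as `41:51` silently select the wrong years if the column order ever changes. As a hedged alternative, the columns can be matched by name. The sketch below uses a small made-up table, `toy`, in place of the World Bank file; `toy`, `wanted` and `toy_sel` are illustrative names only.

```r
# Hypothetical toy table standing in for the World Bank data
toy <- data.frame(X1999 = c(1, 2), X2000 = c(3, 4),
                  X2001 = c(5, 6), X2011 = c(7, 8))
rownames(toy) <- c("indicator A", "indicator B")

# Build the column names for 2000 to 2010 and keep those that exist
wanted <- paste0("X", 2000:2010)
toy_sel <- toy[, names(toy) %in% wanted, drop = FALSE]
names(toy_sel)
```

With the real table, the same pattern would be `data[, names(data) %in% wanted]`.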
From that, create a table that only contains information about females.
data_female <- data_sel[c(2,4,6,8,10,15,16,18,20,22,27,28,30),]
head(data_female)
| | X2000 | X2001 | X2002 | X2003 | X2004 | X2005 | X2006 | X2007 | X2008 | X2009 | X2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Population ages 40-44, female (% of female population) | 5.981864 | 6.166992 | 6.330853 | 6.435402 | 6.460404 | 6.406166 | 6.277317 | 6.117453 | 5.987465 | 5.920420 | 5.886418 |
| Population ages 35-39, female (% of female population) | 6.957745 | 6.827064 | 6.663755 | 6.529739 | 6.459250 | 6.441669 | 6.478105 | 6.548501 | 6.621664 | 6.680331 | 6.713866 |
| Population ages 30-34, female (% of female population) | 6.959592 | 7.011847 | 7.106726 | 7.203510 | 7.276410 | 7.335002 | 7.373669 | 7.409420 | 7.468848 | 7.565715 | 7.673258 |
| Population ages 25-29, female (% of female population) | 7.944221 | 8.002587 | 8.057768 | 8.131390 | 8.234766 | 8.375513 | 8.530318 | 8.697937 | 8.877428 | 9.064741 | 9.233348 |
| Population ages 20-24, female (% of female population) | 9.106619 | 9.273912 | 9.459578 | 9.657137 | 9.857714 | 10.068667 | 10.266459 | 10.436828 | 10.562662 | 10.632545 | 10.620811 |
| Population ages 15-64, female (% of total) | 59.162702 | 60.053857 | 60.972368 | 61.910062 | 62.841136 | 63.733156 | 64.551419 | 65.296661 | 65.979943 | 66.601810 | 66.967035 |
From that, delete everything that is not percentage information, and the rows that say “% of total”.
data_female <- data_female[-c(6,7,11,12),]
head(data_female)
| | X2000 | X2001 | X2002 | X2003 | X2004 | X2005 | X2006 | X2007 | X2008 | X2009 | X2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Population ages 40-44, female (% of female population) | 5.981864 | 6.166992 | 6.330853 | 6.435402 | 6.460404 | 6.406166 | 6.277317 | 6.117453 | 5.987465 | 5.920420 | 5.886418 |
| Population ages 35-39, female (% of female population) | 6.957745 | 6.827064 | 6.663755 | 6.529739 | 6.459250 | 6.441669 | 6.478105 | 6.548501 | 6.621664 | 6.680331 | 6.713866 |
| Population ages 30-34, female (% of female population) | 6.959592 | 7.011847 | 7.106726 | 7.203510 | 7.276410 | 7.335002 | 7.373669 | 7.409420 | 7.468848 | 7.565715 | 7.673258 |
| Population ages 25-29, female (% of female population) | 7.944221 | 8.002587 | 8.057768 | 8.131390 | 8.234766 | 8.375513 | 8.530318 | 8.697937 | 8.877428 | 9.064741 | 9.233348 |
| Population ages 20-24, female (% of female population) | 9.106619 | 9.273912 | 9.459578 | 9.657137 | 9.857714 | 10.068667 | 10.266459 | 10.436828 | 10.562662 | 10.632545 | 10.620811 |
| Population ages 15-19, female (% of female population) | 10.908450 | 11.119250 | 11.312716 | 11.461694 | 11.545645 | 11.564548 | 11.512824 | 11.388480 | 11.189811 | 10.919026 | 10.550105 |
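
The two selection steps above rely on hand-counted row numbers. A possible alternative is to match the row labels themselves. The following is only a sketch on a short, hand-typed vector of labels (`labels` and `keep` are illustrative names); `fixed = TRUE` makes `grepl()` treat the parentheses and the percent sign literally.

```r
# A few row labels copied from the table, used here as stand-alone test data
labels <- c("Population ages 40-44, male (% of male population)",
            "Population ages 40-44, female (% of female population)",
            "Population ages 15-64, female (% of total)",
            "Population ages 15-64, female",
            "Population ages 0-4, female (% of female population)")

# TRUE only for female age classes expressed as a share of the female population
keep <- grepl("female (% of female population)", labels, fixed = TRUE)
which(keep)
```

Applied to the real table, `data_sel[grepl(..., rownames(data_sel), fixed = TRUE), ]` would perform both filtering steps in one go.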
Create a table with the lower and upper age limits of the remaining age classes.
limits <- data.frame(lower=rev(seq(from=0, to=40, by=5)), upper=rev(seq(from=4, to=44, by=5))) # rev reverses the elements of a vector!
limits
| lower | upper |
|---|---|
| 40 | 44 |
| 35 | 39 |
| 30 | 34 |
| 25 | 29 |
| 20 | 24 |
| 15 | 19 |
| 10 | 14 |
| 5 | 9 |
| 0 | 4 |
Transpose the table about female population classes and shorten the column names.
data_female_t <- as.data.frame(t(data_female))
names(data_female_t) <- paste("female", paste(limits[,1], limits[,2], sep="_"), sep="_")
data_female_t
| | female_40_44 | female_35_39 | female_30_34 | female_25_29 | female_20_24 | female_15_19 | female_10_14 | female_5_9 | female_0_4 |
|---|---|---|---|---|---|---|---|---|---|
| X2000 | 5.981864 | 6.957745 | 6.959592 | 7.944221 | 9.106619 | 10.90845 | 12.443434 | 12.329414 | 10.567889 |
| X2001 | 6.166992 | 6.827064 | 7.011847 | 8.002587 | 9.273912 | 11.11925 | 12.389444 | 11.904599 | 10.112283 |
| X2002 | 6.330853 | 6.663755 | 7.106726 | 8.057768 | 9.459578 | 11.31272 | 12.276334 | 11.402337 | 9.774001 |
| X2003 | 6.435402 | 6.529739 | 7.203510 | 8.131390 | 9.657137 | 11.46169 | 12.094322 | 10.865050 | 9.545643 |
| X2004 | 6.460404 | 6.459250 | 7.276410 | 8.234766 | 9.857714 | 11.54564 | 11.822990 | 10.355405 | 9.421040 |
| X2005 | 6.406166 | 6.441669 | 7.335002 | 8.375513 | 10.068667 | 11.56455 | 11.459134 | 9.910157 | 9.383441 |
| X2006 | 6.277317 | 6.478105 | 7.373669 | 8.530318 | 10.266459 | 11.51282 | 11.014852 | 9.505080 | 9.489514 |
| X2007 | 6.117453 | 6.548501 | 7.409420 | 8.697937 | 10.436828 | 11.38848 | 10.525128 | 9.158054 | 9.677438 |
| X2008 | 5.987465 | 6.621664 | 7.468848 | 8.877428 | 10.562662 | 11.18981 | 10.037894 | 8.883284 | 9.863345 |
| X2009 | 5.920420 | 6.680331 | 7.565715 | 9.064741 | 10.632545 | 10.91903 | 9.585812 | 8.687207 | 9.998519 |
| X2010 | 5.886418 | 6.713866 | 7.673258 | 9.233348 | 10.620811 | 10.55011 | 9.143348 | 8.579245 | 10.285814 |
What is the percentage of females older than 44 years for the years 2000 to 2010?
100-apply(data_female_t, 1, sum)
## X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007
## 16.80077 17.19202 17.61593 18.07611 18.56638 19.05570 19.55186 20.04076
## X2008 X2009 X2010
## 20.50760 20.94568 21.31379
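
Because the nine age classes cover ages 0 to 44, the remainder up to 100 is the share of older females. The same calculation can be written with `rowSums()`, which is the usual shortcut for `apply(..., 1, sum)`. A minimal sketch on made-up numbers (`shares` and `remainder` are illustrative names):

```r
# Hypothetical shares (in percent) for two years and two age classes
shares <- data.frame(class_a = c(10, 20), class_b = c(30, 40))
rownames(shares) <- c("X2000", "X2001")

# Share not covered by the listed classes, one value per row (year)
remainder <- 100 - rowSums(shares)
remainder
```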
What was the mean percentage of females aged 35-39 between 2000 and 2005, and between 2000 and 2010?
mean(data_female_t[c(1:6), 2])
## [1] 6.646537
mean(data_female_t$female_35_39)
## [1] 6.629244
Go back to the first table and create a new table with all information on people aged 0-14.
data_0_14 <- data[which(substr(rownames(data), 1, 20) == "Population ages 0-14"),]
data_0_14
| | X1960 | X1961 | X1962 | X1963 | X1964 | X1965 | X1966 | X1967 | X1968 | X1969 | X1970 | X1971 | X1972 | X1973 | X1974 | X1975 | X1976 | X1977 | X1978 | X1979 | X1980 | X1981 | X1982 | X1983 | X1984 | X1985 | X1986 | X1987 | X1988 | X1989 | X1990 | X1991 | X1992 | X1993 | X1994 | X1995 | X1996 | X1997 | X1998 | X1999 | X2000 | X2001 | X2002 | X2003 | X2004 | X2005 | X2006 | X2007 | X2008 | X2009 | X2010 | X2011 | X2012 | X2013 | X2014 | X2015 | X2016 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Population ages 0-14 (% of total) | 4.071612e+01 | 4.186784e+01 | 4.298322e+01 | 4.388336e+01 | 4.449996e+01 | 4.504029e+01 | 4.529597e+01 | 4.534851e+01 | 4.530255e+01 | 4.518551e+01 | 4.482240e+01 | 4.447456e+01 | 4.410435e+01 | 4.365193e+01 | 4.310739e+01 | 4.270729e+01 | 4.218003e+01 | 4.160947e+01 | 4.111059e+01 | 4.073476e+01 | 4.048721e+01 | 4.037297e+01 | 4.035358e+01 | 4.035898e+01 | 4.034875e+01 | 4.048204e+01 | 4.057987e+01 | 4.064106e+01 | 4.067171e+01 | 4.067156e+01 | 4.073321e+01 | 4.071987e+01 | 4.063673e+01 | 4.047158e+01 | 4.020406e+01 | 3.982813e+01 | 3.933266e+01 | 3.872872e+01 | 3.803474e+01 | 3.726417e+01 | 3.634240e+01 | 3.541057e+01 | 3.446377e+01 | 3.351523e+01 | 3.259638e+01 | 3.172937e+01 | 3.095429e+01 | 3.026875e+01 | 2.966257e+01 | 2.913205e+01 | 2.886115e+01 | 2.856638e+01 | 2.830386e+01 | 2.813108e+01 | 2.805466e+01 | 2.797840e+01 | 2.798451e+01 |
| Population ages 0-14, total | 3.598226e+06 | 3.825924e+06 | 4.063741e+06 | 4.295736e+06 | 4.513960e+06 | 4.738634e+06 | 4.945163e+06 | 5.134855e+06 | 5.311201e+06 | 5.471978e+06 | 5.592517e+06 | 5.705474e+06 | 5.810126e+06 | 5.903964e+06 | 5.988753e+06 | 6.098225e+06 | 6.192644e+06 | 6.281371e+06 | 6.379074e+06 | 6.493017e+06 | 6.625149e+06 | 6.778649e+06 | 6.949375e+06 | 7.127385e+06 | 7.306472e+06 | 7.515683e+06 | 7.722300e+06 | 7.925505e+06 | 8.125853e+06 | 8.341737e+06 | 8.534422e+06 | 8.734006e+06 | 8.916512e+06 | 9.056327e+06 | 9.160496e+06 | 9.250084e+06 | 9.308862e+06 | 9.314644e+06 | 9.246872e+06 | 9.185767e+06 | 9.072681e+06 | 8.948906e+06 | 8.811575e+06 | 8.668497e+06 | 8.529495e+06 | 8.404556e+06 | 8.316798e+06 | 8.264217e+06 | 8.236524e+06 | 8.320813e+06 | 8.467690e+06 | 8.505498e+06 | 8.559994e+06 | 8.652474e+06 | 8.780801e+06 | 8.910534e+06 | 9.063398e+06 |
| Population ages 0-14, male (% of total) | 4.272459e+01 | 4.382624e+01 | 4.493039e+01 | 4.584365e+01 | 4.648068e+01 | 4.701742e+01 | 4.728065e+01 | 4.733719e+01 | 4.727930e+01 | 4.712993e+01 | 4.666728e+01 | 4.621012e+01 | 4.573872e+01 | 4.520412e+01 | 4.459384e+01 | 4.413426e+01 | 4.355289e+01 | 4.292959e+01 | 4.237862e+01 | 4.195376e+01 | 4.167187e+01 | 4.152444e+01 | 4.147693e+01 | 4.146128e+01 | 4.143465e+01 | 4.155961e+01 | 4.165014e+01 | 4.170190e+01 | 4.172058e+01 | 4.170854e+01 | 4.175783e+01 | 4.173396e+01 | 4.164178e+01 | 4.146618e+01 | 4.118420e+01 | 4.082848e+01 | 4.034172e+01 | 3.974018e+01 | 3.904989e+01 | 3.828701e+01 | 3.735575e+01 | 3.642683e+01 | 3.548707e+01 | 3.453782e+01 | 3.360585e+01 | 3.271785e+01 | 3.191018e+01 | 3.118728e+01 | 3.055033e+01 | 3.000167e+01 | 2.972236e+01 | 2.942955e+01 | 2.917704e+01 | 2.901647e+01 | 2.895058e+01 | 2.889303e+01 | 2.891223e+01 |
| Population ages 0-14, male | 1.815435e+06 | 1.928130e+06 | 2.047913e+06 | 2.166533e+06 | 2.279349e+06 | 2.394552e+06 | 2.501921e+06 | 2.601182e+06 | 2.693137e+06 | 2.776234e+06 | 2.835444e+06 | 2.889894e+06 | 2.940447e+06 | 2.986753e+06 | 3.029652e+06 | 3.085006e+06 | 3.133295e+06 | 3.178787e+06 | 3.228598e+06 | 3.286465e+06 | 3.354377e+06 | 3.432882e+06 | 3.520159e+06 | 3.611331e+06 | 3.703050e+06 | 3.809833e+06 | 3.915014e+06 | 4.018354e+06 | 4.120709e+06 | 4.232072e+06 | 4.332720e+06 | 4.438241e+06 | 4.535734e+06 | 4.610902e+06 | 4.666392e+06 | 4.717204e+06 | 4.750181e+06 | 4.754844e+06 | 4.722015e+06 | 4.693326e+06 | 4.636537e+06 | 4.575885e+06 | 4.509100e+06 | 4.438790e+06 | 4.369325e+06 | 4.306309e+06 | 4.260833e+06 | 4.232643e+06 | 4.217922e+06 | 4.262094e+06 | 4.338764e+06 | 4.361333e+06 | 4.393616e+06 | 4.445335e+06 | 4.514702e+06 | 4.585999e+06 | 4.667807e+06 |
| Population ages 0-14, female (% of total) | 3.885587e+01 | 4.004961e+01 | 4.117077e+01 | 4.205366e+01 | 4.264616e+01 | 4.318524e+01 | 4.342894e+01 | 4.347294e+01 | 4.343403e+01 | 4.334387e+01 | 4.307071e+01 | 4.282309e+01 | 4.254608e+01 | 4.216921e+01 | 4.168474e+01 | 4.133878e+01 | 4.086070e+01 | 4.033824e+01 | 3.988732e+01 | 3.955680e+01 | 3.934011e+01 | 3.925613e+01 | 3.926229e+01 | 3.928646e+01 | 3.929056e+01 | 3.943098e+01 | 3.953493e+01 | 3.960436e+01 | 3.964574e+01 | 3.965623e+01 | 3.972783e+01 | 3.972264e+01 | 3.964621e+01 | 3.948941e+01 | 3.923452e+01 | 3.883798e+01 | 3.833386e+01 | 3.772782e+01 | 3.703048e+01 | 3.625239e+01 | 3.534074e+01 | 3.440633e+01 | 3.345267e+01 | 3.250502e+01 | 3.159943e+01 | 3.075273e+01 | 3.000944e+01 | 2.936062e+01 | 2.878452e+01 | 2.827154e+01 | 2.800841e+01 | 2.771110e+01 | 2.743799e+01 | 2.725245e+01 | 2.716502e+01 | 2.706964e+01 | 2.706237e+01 |
| Population ages 0-14, female | 1.782782e+06 | 1.897795e+06 | 2.015837e+06 | 2.129206e+06 | 2.234607e+06 | 2.344084e+06 | 2.443229e+06 | 2.533641e+06 | 2.618036e+06 | 2.695742e+06 | 2.757036e+06 | 2.815537e+06 | 2.869647e+06 | 2.917197e+06 | 2.959098e+06 | 3.013213e+06 | 3.059334e+06 | 3.102558e+06 | 3.150461e+06 | 3.206554e+06 | 3.270762e+06 | 3.345773e+06 | 3.429241e+06 | 3.516078e+06 | 3.603421e+06 | 3.705851e+06 | 3.807259e+06 | 3.907099e+06 | 4.005094e+06 | 4.109664e+06 | 4.201685e+06 | 4.295763e+06 | 4.380799e+06 | 4.445453e+06 | 4.494103e+06 | 4.532893e+06 | 4.558717e+06 | 4.559849e+06 | 4.524898e+06 | 4.492442e+06 | 4.436184e+06 | 4.373043e+06 | 4.302459e+06 | 4.229677e+06 | 4.160166e+06 | 4.098199e+06 | 4.055902e+06 | 4.031538e+06 | 4.018587e+06 | 4.058722e+06 | 4.128931e+06 | 4.144176e+06 | 4.166385e+06 | 4.207140e+06 | 4.266101e+06 | 4.324528e+06 | 4.395593e+06 |
What was the net change in numbers of male and female children between 1960 and 2016?
dim(data_0_14)
## [1] 6 57
(data_0_14[,57]-data_0_14[,1])[c(4,6)]
## [1] 2852372 2612811
Create a vector x that consists of the numbers from 1 to 20. Run a FOR-loop for i in 1 to 20, and at each step, print the product of the value i and value i+1. Then, avoid the last printed value, which is “NA”, by transforming the previous FOR-loop into a WHILE-loop.
x <- c(1:20)
for (i in 1:20)
{ print(x[i]*x[i+1]) }
## [1] 2
## [1] 6
## [1] 12
## [1] 20
## [1] 30
## [1] 42
## [1] 56
## [1] 72
## [1] 90
## [1] 110
## [1] 132
## [1] 156
## [1] 182
## [1] 210
## [1] 240
## [1] 272
## [1] 306
## [1] 342
## [1] 380
## [1] NA
i <- 1
while (i <= 19)
{ print(x[i]*x[i+1])
i <- i+1}
## [1] 2
## [1] 6
## [1] 12
## [1] 20
## [1] 30
## [1] 42
## [1] 56
## [1] 72
## [1] 90
## [1] 110
## [1] 132
## [1] 156
## [1] 182
## [1] 210
## [1] 240
## [1] 272
## [1] 306
## [1] 342
## [1] 380
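
The `NA` appears because `x[21]` does not exist. For comparison, the same 19 products can be computed without any loop by multiplying the vector without its last element by the vector without its first element. This is only an illustrative alternative to the WHILE-loop, and `products` is a name chosen for this sketch.

```r
x <- 1:20

# x[-length(x)] is 1..19, x[-1] is 2..20; multiplication is element-wise
products <- x[-length(x)] * x[-1]
products
```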
Create a vector y that is a sequence from 10 to 200, consisting of values divisible by 10. Run a FOR-loop for i in 1 to 20, and at each iteration, print y[i] if y[i] is divisible by 50, and print “not divisible by 50” if it is not. Do that by including an IF-statement in the FOR-loop. Then, change the IF-statement so only values that are above 50 and below 150 are printed.
y <- seq(from=10, to=200, by=10)
for (i in 1:20)
{ if (y[i]%%50==0)
print(y[i])
else print("not divisible by 50")}
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 50
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 100
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 150
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] "not divisible by 50"
## [1] 200
for (i in 1:20)
{ if ((y[i]>50) && (y[i]<150))
print(y[i])
}
## [1] 60
## [1] 70
## [1] 80
## [1] 90
## [1] 100
## [1] 110
## [1] 120
## [1] 130
## [1] 140
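
Inside `if()`, the condition is a single TRUE or FALSE. The element-wise operator `&` becomes useful when a whole vector is tested at once, as in the following sketch, which returns the same values as the loop without printing them one by one (`selected` is an illustrative name).

```r
y <- seq(from = 10, to = 200, by = 10)

# Logical indexing: keep the elements for which both comparisons are TRUE
selected <- y[y > 50 & y < 150]
selected
```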
Create a function “sum_mean” that uses two vectors a and b as arguments, calculates the sum of both vectors “vectors_sum” and returns the mean value of “vectors_sum”. Run the function with the vectors x and y from the previous two exercises.
sum_mean <- function(a, b)
{ vectors_sum <- a+b
return(mean(vectors_sum))
}
sum_mean(a=x, b=y)
## [1] 115.5
Write a FOR-loop that computes a vector of all prime numbers between 3 and 100.
prime_numbers <- NULL # initiate an empty prime_number vector
for (i in 3:100)
{ division <- NULL # initiate an empty division vector
for (j in (2:(i-1)))
{ if (i%%j == 0) # check if i is divisible by j
division[j] <- 1 # if it is divisible, paste the number "1" in the division vector
}
if (!(1 %in% division)) # only append the prime_numbers vector if there is no number "1" in the division vector
prime_numbers <- c(prime_numbers, i)
}
prime_numbers
## [1] 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
# Try to see how long it takes to compute all prime numbers below 1000, 10000, 100000 ...
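
The nested loop tests every candidate against every smaller number, so its running time grows roughly with the square of the upper limit. One well-known faster approach is the sieve of Eratosthenes. The function below is a sketch of that technique, not part of the original exercise; `sieve_primes` and its argument `limit` are names introduced here.

```r
sieve_primes <- function(limit) {
  # is_prime[k] records whether k is still a prime candidate
  is_prime <- rep(TRUE, limit)
  is_prime[1] <- FALSE
  if (limit >= 4) {
    for (p in 2:floor(sqrt(limit))) {
      if (is_prime[p]) {
        # cross out all multiples of p, starting at p squared
        is_prime[seq(p * p, limit, by = p)] <- FALSE
      }
    }
  }
  which(is_prime)
}

primes <- sieve_primes(100)
primes[primes >= 3] # drop 2 to match the range 3 to 100
```

Wrapping either version in `system.time()` is one way to compare them for larger limits.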
Write a function that computes the binomial coefficient of two values n and k, and test it for n=49 and k=6.
bin_coeff <- function(n, k)
{ n_factorial <- 1
for (i in seq_len(n)) # seq_len(0) is empty, so 0! stays 1 (1:0 would count down to 0)
{ n_factorial <- n_factorial*i }
k_factorial <- 1
for (i in seq_len(k))
{ k_factorial <- k_factorial*i }
nk_factorial <- 1
for (i in seq_len(n-k))
{ nk_factorial <- nk_factorial*i }
return(n_factorial/(k_factorial*nk_factorial))
}
bin_coeff(49, 6)
## [1] 13983816
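
Computing three full factorials works for n=49, but `factorial()`-style products exceed the range of double-precision numbers for n above 170. A possible way around this is to build the coefficient step by step, so that intermediate values stay small. The function below is a sketch under that assumption; `bin_coeff_mult` is a name introduced here, and the result is cross-checked against base R's `choose()`.

```r
bin_coeff_mult <- function(n, k) {
  k <- min(k, n - k) # symmetry: choosing k equals choosing n-k
  result <- 1
  for (i in seq_len(k)) {
    # after step i, result equals the binomial coefficient of (n-k+i) over i
    result <- result * (n - k + i) / i
  }
  round(result) # guard against tiny floating-point drift
}

bin_coeff_mult(49, 6)
bin_coeff_mult(49, 6) == choose(49, 6)
```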
For the following exercise, we will work with the built-in dataset iris. You can view a description of this dataset [here](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html) and have a look at the data by executing “head(iris)”.
Create a copy of “iris” that you name “data_iris”. Create a new column in “data_iris” and name it “description”. Characterize each entry according to whether “Sepal.Length” is less than or equal to, or greater than, 5.8, and whether “Petal.Length” is less than or equal to, or greater than, 4.35, by writing one of the following four statements into the new column: “short sepals, short petals”, “short sepals, long petals”, “long sepals, short petals”, or “long sepals, long petals”.
data_iris <- iris
data_iris$description <- NA_character_ # start with an empty character column
for (i in seq_len(nrow(data_iris)))
{ if (data_iris$Sepal.Length[i] <= 5.8)
{ if (data_iris$Petal.Length[i] <= 4.35)
data_iris$description[i] <- "short sepals, short petals"
else data_iris$description[i] <- "short sepals, long petals"
}
else if (data_iris$Petal.Length[i] <= 4.35)
data_iris$description[i] <- "long sepals, short petals"
else data_iris$description[i] <- "long sepals, long petals"
}
head(data_iris)
| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species | description |
|---|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa | short sepals, short petals |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa | short sepals, short petals |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa | short sepals, short petals |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa | short sepals, short petals |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa | short sepals, short petals |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa | short sepals, short petals |
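
The loop classifies one row at a time. Since both comparisons are independent of each other, the same labels can also be built for all rows at once with `ifelse()` and `paste()`. This is a hedged alternative sketch; `iris_vec`, `sepal` and `petal` are names introduced here.

```r
iris_vec <- iris

# One label per row for each of the two thresholds
sepal <- ifelse(iris_vec$Sepal.Length <= 5.8, "short sepals", "long sepals")
petal <- ifelse(iris_vec$Petal.Length <= 4.35, "short petals", "long petals")

# Combine both labels into the description column
iris_vec$description <- paste(sepal, petal, sep = ", ")
table(iris_vec$description)
```

`table()` then shows how many flowers fall into each of the four groups.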
Write a function that takes (1) a table with the same structure as the iris data and (2) a threshold value as inputs and computes the mean value for “Sepal.Length” of all entries whose value for “Sepal.Width” exceeds the threshold value. Run this function on “data_iris”, setting the threshold to 3.
my.function <- function(table, threshold)
{ return(mean(subset(table, Sepal.Width > threshold)$Sepal.Length))
}
my.function(data_iris, 3)
## [1] 5.683582