Stephen C. Loftus Division of Science, Technology, Engineering and Math Sweet Briar College Sweet Briar, VA, United States

Chapter 2 : An introduction to R

2.5 Vectors

Creating a numeric vector

height = c(75, 74, 67, 83, 75)

height + 5

## [1] 80 79 72 88 80

x = height/12
x

## [1] 6.250000 6.166667 5.583333 6.916667 6.250000
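The arithmetic above is applied to every element of the vector in turn. A minimal sketch of the same idea follows, assuming we also want the heights in centimetres; the name height_cm and the 2.54 conversion factor are additions for illustration and do not come from the book.

```r
# Heights in inches, as defined above
height <- c(75, 74, 67, 83, 75)

# Arithmetic on a vector is carried out element by element
height_cm <- height * 2.54   # hypothetical name; 1 inch = 2.54 cm
height_cm

# Summary functions collapse the whole vector to a single value
mean(height)     # 74.8
length(height)   # 5
```

Because the operation is vectorized, no explicit loop over the five values is needed.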

Creating a character vector

name = c("Alex Ovechkin", "Mike Trout", "Lionel Messi",
         "Giannis Antetokounmpo", "Patrick Mahomes")

Combining the vectors into a data frame

athlete=data.frame(name, height)

Accessing the name variable with the dollar sign ($)

athlete$name

## [1] "Alex Ovechkin"         "Mike Trout"            "Lionel Messi"         
## [4] "Giannis Antetokounmpo" "Patrick Mahomes"
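The dollar sign is one of several ways to reach a column. The sketch below rebuilds the athlete data frame and shows that double brackets return the same vector, and that `$` can also create a column; the column name height_ft is a hypothetical addition.

```r
# Rebuild the data frame from the two vectors defined above
name <- c("Alex Ovechkin", "Mike Trout", "Lionel Messi",
          "Giannis Antetokounmpo", "Patrick Mahomes")
height <- c(75, 74, 67, 83, 75)
athlete <- data.frame(name, height)

# Both forms extract the same column as a plain vector
athlete$height
athlete[["height"]]

# Assigning to a new name with $ adds a column (height_ft is hypothetical)
athlete$height_ft <- athlete$height / 12
names(athlete)
```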

2.7 Practice Problems

Consider the following set of attributes about the American Film Institute’s top five movies ever from their 2007 list.

1. What code would you use to create a vector named Movie with the values Citizen Kane, The Godfather, Casablanca, Raging Bull, and Singing in the Rain?

Movie = c("Citizen Kane", "The Godfather", "Casablanca", "Raging Bull", "Singing in the Rain")

2. What code would you use to create a vector—giving the year that the movies in Problem 1 were made—named Year with the values 1941, 1972, 1942, 1980, and 1952?

Year = c(1941, 1972, 1942,
         1980, 1952)

3. What code would you use to create a vector—giving the run times in minutes of the movies in Problem 1—named RunTime with the values 119, 177, 102, 129, and 103?

RunTime = c(119, 177, 102,
            129, 103)

4. What code would you use to find the run times of the movies in hours and save them in a vector called RunTimeHours?

RunTimeHours = RunTime/60
RunTimeHours

## [1] 1.983333 2.950000 1.700000 2.150000 1.716667

5. What code would you use to create a data frame named MovieInfo containing the vectors created in Problem 1, Problem 2, and Problem 3?

MovieInfo=data.frame(Movie, Year, RunTime)
MovieInfo

##                 Movie Year RunTime
## 1        Citizen Kane 1941     119
## 2       The Godfather 1972     177
## 3          Casablanca 1942     102
## 4         Raging Bull 1980     129
## 5 Singing in the Rain 1952     103

Consider the following set of attributes about a series of video games from LucasArts, an early video game company under the umbrella of George Lucas’s Lucasfilm company.

6. What code would you use to create a vector named Title with the values The Secret of Monkey Island, Indiana Jones and the Fate of Atlantis, Day of the Tentacle, and Grim Fandango?

Title = c("The Secret of Monkey Island", "Indiana Jones and the Fate of Atlantis", "Day of the Tentacle", "Grim Fandango")
Title

## [1] "The Secret of Monkey Island"           
## [2] "Indiana Jones and the Fate of Atlantis"
## [3] "Day of the Tentacle"                   
## [4] "Grim Fandango"

7. What code would you use to create a vector—giving the year that the games in Problem 6 were released—named Release with the values 1990, 1992, 1993, and 1998?

Release = c(1990, 1992, 1993, 1998)
Release

## [1] 1990 1992 1993 1998

8. LucasArts was founded in 1982. What code would you use to calculate how many years after the founding of the company each game was released?

Release-1982

## [1]  8 10 11 16

9. Each of these games falls under the genre of adventure games. In 2011, adventuregamers.com created a ranking of the top 100 adventure games of all time. Create a vector named Rank, containing the rankings of the aforementioned games, with the values 14, 11, 6, and 1 [6].

Rank = c(14, 11, 6, 1)
Rank

## [1] 14 11  6  1

10. What code would you use to create a data frame called AdventureGames containing the vectors created in Problem 6, Problem 7, and Problem 9?

AdventureGames=data.frame(Title, Release, Rank)
AdventureGames

##                                    Title Release Rank
## 1            The Secret of Monkey Island    1990   14
## 2 Indiana Jones and the Fate of Atlantis    1992   11
## 3                    Day of the Tentacle    1993    6
## 4                          Grim Fandango    1998    1

Chapter 4 : Subsetting data, random numbers, and selecting a random sample

4.2 Subsetting vectors

hours=c(8.84, 3.26, 2.81, 0.64, 0.60, 0.53, 0.37, 0.35, 0.31, 0.24)
hours[1]

## [1] 8.84

hours[c(1,3,9)]

## [1] 8.84 2.81 0.31

hours[hours>1]

## [1] 8.84 3.26 2.81

hours[hours>=0]

##  [1] 8.84 3.26 2.81 0.64 0.60 0.53 0.37 0.35 0.31 0.24
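An expression such as hours>1 inside the brackets first produces a logical vector, and the brackets then keep only the positions marked TRUE. The sketch below makes that intermediate step visible; the name over_one is introduced here for illustration only.

```r
hours <- c(8.84, 3.26, 2.81, 0.64, 0.60, 0.53, 0.37, 0.35, 0.31, 0.24)

# The comparison returns one TRUE/FALSE per element
over_one <- hours > 1   # hypothetical name
over_one

# which() gives the positions that are TRUE; sum() counts them
which(over_one)   # 1 2 3
sum(over_one)     # 3

# Indexing with the logical vector keeps only the TRUE positions
hours[over_one]   # 8.84 3.26 2.81
```

This also explains why hours[hours>=0] returns the whole vector: every comparison is TRUE.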

4.3 Subsetting data frames

Creating the data frame

DailyACT = c("Sleeping", "Working", "Watching Television", "Socializing", "Food Preparation", "Housework", "Childcare", "Consumer Goods Purchase", "Participating in Recreation", "Attending Class")
AverageHours = c(8.84, 3.26, 2.81, 0.64, 0.60, 0.53, 0.37, 0.35, 0.31, 0.24)
Category = c("Personal Care", "Work-Related", "Leisure", "Leisure", "Household", "Household", "Caring for Household", "Purchasing", "Leisure", "Education")
Aktivitas = data.frame(DailyACT, AverageHours, Category)

Viewing the data frame

head(Aktivitas)

##              DailyACT AverageHours      Category
## 1            Sleeping         8.84 Personal Care
## 2             Working         3.26  Work-Related
## 3 Watching Television         2.81       Leisure
## 4         Socializing         0.64       Leisure
## 5    Food Preparation         0.60     Household
## 6           Housework         0.53     Household

Aktivitas[5, 3]

## [1] "Household"

Aktivitas[9,1]

## [1] "Participating in Recreation"

Aktivitas[10,]

##           DailyACT AverageHours  Category
## 10 Attending Class         0.24 Education

Aktivitas$Category

##  [1] "Personal Care"        "Work-Related"         "Leisure"             
##  [4] "Leisure"              "Household"            "Household"           
##  [7] "Caring for Household" "Purchasing"           "Leisure"             
## [10] "Education"

Aktivitas[Aktivitas$AverageHours>1,]

##              DailyACT AverageHours      Category
## 1            Sleeping         8.84 Personal Care
## 2             Working         3.26  Work-Related
## 3 Watching Television         2.81       Leisure
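Row conditions can be combined with & and paired with a column selection in the second position of the brackets. The following sketch rebuilds the Aktivitas data frame and applies both ideas; the names leisure and short_leisure are hypothetical.

```r
DailyACT <- c("Sleeping", "Working", "Watching Television", "Socializing",
              "Food Preparation", "Housework", "Childcare",
              "Consumer Goods Purchase", "Participating in Recreation",
              "Attending Class")
AverageHours <- c(8.84, 3.26, 2.81, 0.64, 0.60, 0.53, 0.37, 0.35, 0.31, 0.24)
Category <- c("Personal Care", "Work-Related", "Leisure", "Leisure",
              "Household", "Household", "Caring for Household", "Purchasing",
              "Leisure", "Education")
Aktivitas <- data.frame(DailyACT, AverageHours, Category)

# Rows where the category is Leisure, all columns kept
leisure <- Aktivitas[Aktivitas$Category == "Leisure", ]
leisure

# Leisure rows under one hour, keeping only two named columns
short_leisure <- Aktivitas[Aktivitas$Category == "Leisure" &
                             Aktivitas$AverageHours < 1,
                           c("DailyACT", "AverageHours")]
short_leisure
```

The comma inside the brackets matters: the condition before it selects rows, and whatever follows it selects columns.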

4.4 Random numbers

set.seed(8)
rnorm(10)

##  [1] -0.08458607  0.84040013 -0.46348277 -0.55083500  0.73604043 -0.10788140
##  [7] -0.17028915 -1.08833171 -3.01105168 -0.59317433
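The call to set.seed() is what makes the draws above repeatable. A short sketch of that behaviour follows; the variable names are introduced here for illustration.

```r
# Same seed, same sequence of random numbers
set.seed(8)
first_draw <- rnorm(10)

set.seed(8)
second_draw <- rnorm(10)

identical(first_draw, second_draw)   # TRUE

# Without resetting the seed, the generator simply continues,
# so the next ten values differ from the first ten
third_draw <- rnorm(10)
identical(first_draw, third_draw)    # FALSE
```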

4.5 Select a random sample

Set the number of subjects in the sampling frame

n <- 100

Set the number of subjects wanted in the sample

size <- 10

Generate a random sample of row numbers

random_sample <- sample.int(n = n, size = size)
random_sample

##  [1] 68  9 76 62  7 40 19 63 70 96
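The sampled row numbers become useful once they are used to pull rows out of a data frame. The sketch below assumes a made-up sampling frame of 100 subjects; the data frame sampling_frame, its columns, and the seed value are all hypothetical and only serve to show the indexing step.

```r
n <- 100     # number of subjects in the sampling frame
size <- 10   # number of subjects wanted in the sample

# Hypothetical sampling frame: one row per subject
sampling_frame <- data.frame(id = 1:n, value = (1:n) * 2)

# Arbitrary seed so the sketch gives the same rows on every run
set.seed(1)
random_sample <- sample.int(n = n, size = size)

# Use the sampled row numbers in the row position of the brackets
selected <- sampling_frame[random_sample, ]
selected
```

By default sample.int() draws without replacement, so no subject can appear twice in the sample.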

Another example from the book

sample.int(n=8812107, size= 1000)

##    [1] 4271829 7883242  516158 3425146 4074800 5509424 7817189  350986 2031459
##   [10] 1615368 6491507 3482055 7622768 7634721  240647  171314 6417297 6414028
##   [19] 5418383 3032930 7321369 1996852 2302789 8744108 3385074 2330105 2156020
##   [28] 5107968 1726049 4071228  194802  829172  840589 8096716 1383227 3345306
##   [37] 8080102 1042726 8090272 6585146 5327243  310393 1399965 4921543 4061119
##   [46] 5561438 7955717 1134578 7469032  829744 8307534 1141845 5745019 8003636
##   [55] 7209261 1193529 2764216 7770812 3359599 6699308 4049959 4475573 5290976
##   [64] 1452193 4032128 5169260 2329492 5958923 8492609 1419064 1559107 4751100
##   [73] 5987273 6395466  725013 1935126 3294068 4498272 3346226 7582313 6352515
##   [82] 4677659 4566833  599740 2564571 2261270 3172226 4445036 4918309 6380517
##   [91] 3435313 2180726 5060162 6270426  476831 3470488 5344196 8706977 6876999
##  [100] 3687628  815852 3357356 4888526 5259193 6221265 7987635 3927789 6507797
##  [109] 1698549 1976527 2936567 5317581 5006013 7792187 5123571 3448014 2585142
##  [118] 2686020  357978 3712576 4866754  452818 4712681 7215346 7846210 1690122
##  [127] 3195453 2412602 1236611 1913410 3723712 2713912 8679455 4433481 2036552
##  [136] 7966383 4522605 8399907 1667993 6513255  274748 8212602 7745306 6446066
##  [145] 3422094 4402091 3301379 2292327 8475306 7256278 3259276  318545 8428836
##  [154] 4560964 1517730 8683323 2935348 6689299 8067570 4156207 8558373 2024017
##  [163] 1088663 1509896 4153615 6842766 1969503 2840411 8151934 1333516 7894294
##  [172] 2878913 5425831 7104480 4252883 7620686 3542420 7854367 1791106 2181795
##  [181]   29843 6465553 1344981  490836 1884936 4156913 1855823 4382364 6390413
##  [190] 6658943 7931012 1681660 7317272 3869699 6166125 7058752 6491809 4620144
##  [199] 5878988 4197218  705484 7384535 5705224 2131567 2256468 4233301 8646768
##  [208] 4260877 5269940  505741 2997424 8544427 6987870 3815575 3850376 1190619
##  [217] 6565280 8331079  788628 8198881 2992068 1777572 6712366 8792100 3973709
##  [226] 5690806 3373373 7062895 4248562 3056871 6441274 5539274 6986615  830912
##  [235] 3390843  603422 5954848 7097158 1839672 4749556 5064247  986196 2614119
##  [244] 2849146 7433018 6906465 7685802 8592452 5276141  158100 4919966  398693
##  [253] 1198931 7723355  281618 4776576 3655783 3222734 2131839 6716590 5122825
##  [262] 7258874 6403856 8749029  257271 2458266 7725561 5323939 6389986 5907399
##  [271] 3120192 2969170 3615067 2180496 2013330 3270700 5786467 3832364 5453654
##  [280] 5921920 7419364 3802582 5682970 6875093 2519466 5039047 8765753 2892967
##  [289] 1117175 7951061 2541857  834084 5567109 1042082  964050 4069229 1686913
##  [298] 6376018 8788537 8395798 5781658 3125514 3484678 4053922 4513939 2206960
##  [307] 1918627  904562 1349855 4664091 5802168 2623742 8075472 1220713 1219778
##  [316] 7991722 8352875 1416256 7286641 7200875 3909410 5610079 5410986  982654
##  [325] 8044399 2440294 2447845 1966998 2104295 1469505 3959361 6385763 4109851
##  [334] 8241812 8243477 7109170 6841713 8286160 2196442 8508759 5414838 2699696
##  [343] 5503315 2513118 8418946 8645092 2153109 1089953 8784453 3508969 2664910
##  [352] 3390839  840495  818225  758820 1243583 6608497 4461838 4581465 1349550
##  [361] 2123729 4705242 7465880  634046  219760 3905795 6562998 8698434 5976274
##  [370] 4929822   37768 6769244 5366105 7614102 5871604 1556159 8718481 1005031
##  [379] 2039829 7650946  385563 2177774 4346146 7556131 3110981 6821135   82195
##  [388] 6849722 6949009 5145052 6809496 5119805  826873 6201404 8475223 7635805
##  [397] 1326707 4840086 2767169 4350911  216319 4274609 8448048 7044993 5661050
##  [406] 5005584 8108684 1111038  979657 1140874 2803458 8599015   24494 2609536
##  [415] 4260577 1948026 6316762 2370561 6084719 5732608 4080046 8807132 6127735
##  [424] 4036531 5056167  677212 3050528  252655 1317620  897679 6457463  626973
##  [433] 1611854 5436198 7799677 2273965 8761538 3832644 7016967 7618006  136467
##  [442] 8477209 3195245  654857 7039752 3854249 6304046 8743295 2857353  703012
##  [451] 5485863 5157479 8508922 8071148 3081753 5048658 8584572 6802095  545178
##  [460] 7092494 2923931 5176867 3475613 3456004 1553567 7878361 1327858   23368
##  [469] 3242227 7212672 1794369 7687232 4838119 5134135 2218901 1189753 3364172
##  [478]  199398 5645429 7156737 7088329 8631182 4084487 4997931 4903745 2188707
##  [487] 1492828 5673210 5053869 5006377 2882899  500481 6880247 7726401 8403430
##  [496]  938941 6989647 1867960 4810006 1470617  662691 5287722 3806818 3975445
##  [505] 5590025 8218759 4115023 1109832 5778089 3334258  706239 8657537 1032991
##  [514] 7195506 7677726 4242373 5630659 7925215 7828070 5471394  207054 1559204
##  [523] 8259476 3707097 7949407 7107488 4898739 1294191 3075815 6404163 6008811
##  [532] 6770657 7751427 1004215 6268859 2407488 6628042 3694341 4994346 4937975
##  [541] 7073399  514447 8187332 2132697 7390789  804356 3581088 2289842 1213310
##  [550] 1410823 4922246 4365189 6912688 7899740 4675616 6169184 4041444 5080312
##  [559] 8428300 1721368  795490 6227779 1657035 1318247 4776478  825973 1196123
##  [568] 5960082 3647493 8382297 1369619 7206608 4359356 6158495 5802888 2224129
##  [577] 6287457 4356388 8028274 2921177 4326213 7205113 5043105 7627632 3783198
##  [586] 4994446 8343175 5885847 5028496 5589867 5800194 8757510 2244932 2612454
##  [595] 6283216 5615711  919661 1284863 1066514  914581 3277724 6192598 8414409
##  [604]  173462 2470682 4284129 3772647 1806560 3449675  951466  610867 4890999
##  [613] 8380345 6871391 5278872 8560138 2861468 2292382 5453061 2245549 7258181
##  [622] 5464338 6511616 5287992 6193316 5547729 6258618 1818292 1970915 7258424
##  [631] 2872897 2509675 6887492 3777237  594220 4732813 5468087 6012645 3472312
##  [640] 3494996 1914378 3829207 1791209 2132008 1357482 7305407 5035536 2318257
##  [649] 3102210 7041324 7962822 4962871 5824485  849631 2916267  596991   97704
##  [658] 2453778 8785615 8441573 6931931 8539985 3406351 6459109 3958335 8681646
##  [667] 6472499 3068353 5033495 4547001 5213115 1966962 5331646 8785496 4535285
##  [676] 6918776 4522721 6023006 5655951 2836522 6441896 3344375 1234265 3981976
##  [685] 4197800 3371195 8546271  692737 8482824 7132996 3886607 2963511 2982561
##  [694] 3284349 4200410 1338334 6762452 1584456  632394 5462110 2451333 6455272
##  [703] 1547168 8476020 1058029 3380453 2733584  578629 7280143  819991 7421858
##  [712] 2482893 7114116  152708 2396031 7532251 1087428 6406019 3550991 7872612
##  [721] 2617871  707917  380503 6263710 3140857  764303 8346650 8599097 8663015
##  [730] 6547207 5568531 1915540 8027846 1926865  900268 8672559 4992294 5806163
##  [739] 3793178 7205484 4757134 7806621 5584813 1000820 7510249 4204365 4192058
##  [748]  948981 4711187 2682968 2850203 4216776 6503033 7586717 3311050 1979135
##  [757] 6862373 4573186 7991855 1155097 2144900 7867598 8586399 4086509 5156881
##  [766] 5851656 4893898 3089270 1898110 3706104 2643923 3689325 7460163 6192967
##  [775] 6013003 3701662 4946382 3071865 1771439 2846275 4162456 1009057 3525193
##  [784] 1072425 5697445 4955385 4249996  454371   75964 2289365 5677856 2953445
##  [793] 7580983 6518456  250138 1490788 7515317  699913 1097007 2276087  207507
##  [802] 5039603 1290683 2925938 7501871 2385114 3568495 1346528 7987012 3911732
##  [811] 1460070 7934729 5140754 7406497 2212524 7587068 7059891 5383838 5084112
##  [820] 5652358 5808575 5601537 8479206  855704 8556738 5112787 1969785 8079967
##  [829] 4047807 3161380  472271 1961678 5631073 5221636 3938526 3692792 7085156
##  [838] 1554011 5773951 2625575 7401493  684239 3071030 4348576 6027912 7442619
##  [847] 1398527 4327069 4188800 4440148 1072751 3156371 5410950 1166643  419507
##  [856] 2534076 6605876 6628227 3306892 4083384 3951014  369995 1099649 2989525
##  [865] 4013080 6448580 3904626 4787271 8394699 1057363 6061389 6012091 8256692
##  [874] 8083551 5527872 3570343  282233 5097626 1389435  357518   64889 2243807
##  [883] 4408119 1824229 6674336  619359 4306968 6654700 4301621  137846 7323687
##  [892] 3409875 6483143 5197112 8257413 7034795 8158537 5455055 7200663 2744274
##  [901] 7507768 8513005 1975820 7077859 8407775 4548109 7152996 1620364 5996381
##  [910] 7547893 4288650 3583014 5249479 4346253 7745196  360886 8476491 3280398
##  [919] 2666970 5832461 2769776 2482379 1657026 7521232 4997715  917340 8447607
##  [928] 5513599 7556848 2542471 6355370 6850719 4606323 1863898 2970483 2732960
##  [937] 1367279  725549 3341106 8222665 2819914 2506230 5480257 7091635 3968474
##  [946] 7874365 7435954 6729905 4555060  760982 7288108 7733542 3564377 2595193
##  [955] 6756859 5694450 4190194 4933048 4304363 6030377 6314690 8040268 3841491
##  [964] 8639102 4889119 4288549 1845260 3656536 4741212 4043956 3131241 6394714
##  [973] 8047918 6265813 7431930 1780495 2413862 3722822  972645 4929768 5025095
##  [982] 1814356 4141355 2785073 5350992 6833487 6833012 3627276 5919997 6926338
##  [991] 4349979 7579935 5594703 7689142 6298751 2216152 2580806 8383231 8132424
## [1000] 6208310

4.6 Getting help in R

help(sample.int)

## starting httpd help server ... done

?sample.int

4.7 Practice problems

1. What code would you use to select the first, third, tenth, and twelfth entries in the TopSalary vector from the Colleges data frame?

College = c("William and Mary", "Christopher Newport", "George Mason", "James Madison", "Longwood", "Norfolk State", "Old Dominion", "Radford", "Mary Washington", "Virginia", "Virginia Commonwealth", "Virginia Military Institute", "Virginia Tech", "Virginia State")
Employees = c(2104, 922, 4043, 2833, 746, 919, 2369, 1273, 721, 7431, 5825, 550, 7303, 761)
TopSalary = c(425000, 381486, 536714, 428400, 322868, 295000, 448272, 312080, 449865, 561099, 503154, 364269,  500000, 356524)
MedianSalary = c(56496, 47895, 63029, 53080, 52000, 49605, 54416, 51000, 53045, 60048, 55000, 44999, 51656, 55925)
Colleges <- data.frame(College,Employees,TopSalary,MedianSalary)
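The code above builds the Colleges data frame but stops short of the selection that Problem 1 asks for. One possible way to finish it is sketched below, rebuilding the data frame so the block runs on its own; the name top_selected is an assumption.

```r
College <- c("William and Mary", "Christopher Newport", "George Mason",
             "James Madison", "Longwood", "Norfolk State", "Old Dominion",
             "Radford", "Mary Washington", "Virginia", "Virginia Commonwealth",
             "Virginia Military Institute", "Virginia Tech", "Virginia State")
Employees <- c(2104, 922, 4043, 2833, 746, 919, 2369, 1273, 721, 7431, 5825,
               550, 7303, 761)
TopSalary <- c(425000, 381486, 536714, 428400, 322868, 295000, 448272, 312080,
               449865, 561099, 503154, 364269, 500000, 356524)
MedianSalary <- c(56496, 47895, 63029, 53080, 52000, 49605, 54416, 51000,
                  53045, 60048, 55000, 44999, 51656, 55925)
Colleges <- data.frame(College, Employees, TopSalary, MedianSalary)

# First, third, tenth, and twelfth entries of the TopSalary column
top_selected <- Colleges$TopSalary[c(1, 3, 10, 12)]   # hypothetical name
top_selected   # 425000 536714 561099 364269
```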

2. What code would you use to select the elements of the MedianSalary vector where the TopSalary is greater than $400,000?

selected_median_salaries <- Colleges$MedianSalary[Colleges$TopSalary > 400000]

print(selected_median_salaries)

## [1] 56496 63029 53080 54416 53045 60048 55000 51656

3. What code would you use to select the rows of the data frame for colleges with less than or equal to 1000 employees?

selected_colleges <- Colleges[Colleges$Employees <= 1000, ]

print(selected_colleges)

##                        College Employees TopSalary MedianSalary
## 2          Christopher Newport       922    381486        47895
## 5                     Longwood       746    322868        52000
## 6                Norfolk State       919    295000        49605
## 9              Mary Washington       721    449865        53045
## 12 Virginia Military Institute       550    364269        44999
## 14              Virginia State       761    356524        55925

4. What code would you use to select a sample of 5 colleges from this data frame (there are 14 rows)?

sampled_colleges <- Colleges[sample(1:nrow(Colleges), size = 5), ]

print(sampled_colleges)

##               College Employees TopSalary MedianSalary
## 2 Christopher Newport       922    381486        47895
## 7        Old Dominion      2369    448272        54416
## 4       James Madison      2833    428400        53080
## 3        George Mason      4043    536714        63029
## 9     Mary Washington       721    449865        53045

Suppose we have the following data frame named Countries:

# Create the 'Countries' data frame
Countries <- data.frame(
  Nation = c("China", "India", "United States", "Indonesia", "Brazil", 
             "Pakistan", "Nigeria", "Bangladesh", "Russia", "Mexico"),
  Region = c("Asia", "Asia", "North America", "Asia", "South America", 
             "Asia", "Africa", "Asia", "Europe", "North America"),
  Population = c(1409517397, 1339180127, 324459463, 263991379, 209288278,
                 197015955, 190886311, 164669751, 143989754, 129163276),
  PctIncrease = c(0.40, 1.10, 0.70, 1.10, 0.80, 
                  2.00, 2.60, 1.10, 0.00, 1.30),
  GDPcapita = c(8582, 1852, 57467, 3895, 10309, 
                1629, 2640, 1524, 10248, 8562)
)

# Display the data frame
print(Countries)

##           Nation        Region Population PctIncrease GDPcapita
## 1          China          Asia 1409517397         0.4      8582
## 2          India          Asia 1339180127         1.1      1852
## 3  United States North America  324459463         0.7     57467
## 4      Indonesia          Asia  263991379         1.1      3895
## 5         Brazil South America  209288278         0.8     10309
## 6       Pakistan          Asia  197015955         2.0      1629
## 7        Nigeria        Africa  190886311         2.6      2640
## 8     Bangladesh          Asia  164669751         1.1      1524
## 9         Russia        Europe  143989754         0.0     10248
## 10        Mexico North America  129163276         1.3      8562

5. What code would you use to select the rows of the data frame that have GDP per capita less than 10000 and are not in the Asia region?

selected_countries <- Countries[Countries$GDPcapita < 10000 & Countries$Region != "Asia", ]

print(selected_countries)

##     Nation        Region Population PctIncrease GDPcapita
## 7  Nigeria        Africa  190886311         2.6      2640
## 10  Mexico North America  129163276         1.3      8562

6. What code would you use to select a sample of three nations from this data frame (There are 10 rows)?

# Set seed for reproducibility (optional)
set.seed(123)

# Select a random sample of 3 nations
sample_countries <- Countries[sample(1:nrow(Countries), 3), ]

# Display the sampled nations
print(sample_countries)

##           Nation        Region Population PctIncrease GDPcapita
## 3  United States North America  324459463         0.7     57467
## 10        Mexico North America  129163276         1.3      8562
## 2          India          Asia 1339180127         1.1      1852

7. What code would you use to select which nations saw a population percent increase greater than 1.5%?

# Filter rows where the population percent increase is greater than 1.5%
high_increase_nations <- Countries[Countries$PctIncrease > 1.5, ]

# Display the nations with a population percent increase greater than 1.5%
print(high_increase_nations)

##     Nation Region Population PctIncrease GDPcapita
## 6 Pakistan   Asia  197015955         2.0      1629
## 7  Nigeria Africa  190886311         2.6      2640

Suppose we have the following data frame named Olympics:

Year = c(1992, 1992, 1994, 1996, 1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018)
Type = c("Summer", "Winter", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter")
Host = c("Spain", "France", "Norway", "United States", "Japan", "Australia", "United States", "Greece", "Italy", "China", "Canada", "United Kingdom", "Russia", "Brazil", "South Korea")
Competitors = c(9356, 1801, 1737, 10318, 2176, 10651, 2399, 10625, 2508, 10942, 2566, 10768, 2873, 11238, 2922)
Events = c(257, 57, 61, 271, 68, 300, 78, 301, 84, 302, 86, 302, 98, 306, 102)
Nations = c(169, 64, 67, 197, 72, 199, 77, 201, 80, 204, 82, 204, 88, 207, 92)
Leader = c("Unified Team", "Germany", "Russia", "United States", "Germany", "United States", "Norway", "United States", "Germany", "China", "Canada", "United States", "Russia", "United States", "Norway")
Olympics <- data.frame(Year,Type,Host,Competitors,Events,Nations,Leader)

8. What code would you use to select the rows of the data frame where the host nation was also the medal leader?

# Filter rows where the host nation is also the medal leader
host_leader_match <- Olympics[Olympics$Host == Olympics$Leader, ]

# Display the filtered data
print(host_leader_match)

##    Year   Type          Host Competitors Events Nations        Leader
## 4  1996 Summer United States       10318    271     197 United States
## 10 2008 Summer         China       10942    302     204         China
## 11 2010 Winter        Canada        2566     86      82        Canada
## 13 2014 Winter        Russia        2873     98      88        Russia

9. What code would you use to select the rows of the data frame where the number of competitors per event is greater than 35?

# Filter rows where the number of competitors per event is greater than 35
competitors_per_event <- Olympics[Olympics$Competitors / Olympics$Events > 35, ]

# Display the filtered data
print(competitors_per_event)

##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 4  1996 Summer  United States       10318    271     197 United States
## 6  2000 Summer      Australia       10651    300     199 United States
## 8  2004 Summer         Greece       10625    301     201 United States
## 10 2008 Summer          China       10942    302     204         China
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 14 2016 Summer         Brazil       11238    306     207 United States

10. What code would you use to select the rows of the data frame where the number of competing nations in the Winter Olympics is at least 80?

# Filter rows where the Olympic type is Winter and the number of competing nations is at least 80
winter_nations_80 <- Olympics[Olympics$Type == "Winter" & Olympics$Nations >= 80, ]

# Display the filtered data
print(winter_nations_80)

##    Year   Type        Host Competitors Events Nations  Leader
## 9  2006 Winter       Italy        2508     84      80 Germany
## 11 2010 Winter      Canada        2566     86      82  Canada
## 13 2014 Winter      Russia        2873     98      88  Russia
## 15 2018 Winter South Korea        2922    102      92  Norway

Practicum H1: Statistical Computing

Muhamad Abdul Qodir Dani

2024-09-06

Basic Statistics With R Reaching Decisions With Data
