Exercise 2.7 Practice problems

1. What code would you use to create a vector named Movie with the values Citizen Kane, The Godfather, Casablanca, Raging Bull, and Singing in the Rain?

# Each title is its own quoted string, so the vector has five elements
Movie <- c("Citizen Kane", "The Godfather", "Casablanca", "Raging Bull", "Singing in the Rain")
Movie
## [1] "Citizen Kane"        "The Godfather"       "Casablanca"
## [4] "Raging Bull"         "Singing in the Rain"

2. What code would you use to create a vector named Year, giving the year that the movies in Problem 1 were made, with the values 1941, 1972, 1942, 1980, and 1952?

Year <- c(1941, 1972, 1942, 1980, 1952)
Year
## [1] 1941 1972 1942 1980 1952

3. What code would you use to create a vector named RunTime, giving the run times in minutes of the movies in Problem 1, with the values 119, 177, 102, 129, and 103?

RunTime <- c(119, 177, 102, 129, 103)

4. What code would you use to find the run times of the movies in hours and save them in a vector called RunTimeHours?

RunTimeHours <- RunTime / 60
RunTimeHours
## [1] 1.983333 2.950000 1.700000 2.150000 1.716667

5. What code would you use to create a data frame named MovieInfo containing the vectors created in Problem 1, Problem 2, and Problem 3?

MovieInfo <- data.frame(Movie, Year, RunTime)
MovieInfo
##                 Movie Year RunTime
## 1        Citizen Kane 1941     119
## 2       The Godfather 1972     177
## 3          Casablanca 1942     102
## 4         Raging Bull 1980     129
## 5 Singing in the Rain 1952     103
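
A quick way to confirm that a character vector holds one element per value is to check it with length(). The sketch below is only an illustration; the names one_string and five_strings are hypothetical and not part of the exercise.

```r
# A single quoted string is ONE element, however many commas it contains
one_string <- c("Citizen Kane, The Godfather, Casablanca")
length(one_string)

# Quoting each title separately gives one element per movie
five_strings <- c("Citizen Kane", "The Godfather", "Casablanca",
                  "Raging Bull", "Singing in the Rain")
length(five_strings)

# data.frame() recycles a length-1 vector, which is why a one-string
# vector repeats the same text on every row instead of raising an error
Year <- c(1941, 1972, 1942, 1980, 1952)
nrow(data.frame(five_strings, Year))
```

Checking length() before calling data.frame() catches this kind of mistake early, because recycling hides it.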

6. What code would you use to create a vector named Title with the values The Secret of Monkey Island, Indiana Jones and the Fate of Atlantis, Day of the Tentacle, and Grim Fandango?

# Four games, so four separately quoted strings
Title <- c("The Secret of Monkey Island", "Indiana Jones and the Fate of Atlantis", "Day of the Tentacle", "Grim Fandango")

7. What code would you use to create a vector named Release, giving the year that the games in Problem 6 were released, with the values 1990, 1992, 1993, and 1998?

Release <- c(1990, 1992, 1993, 1998)
Release
## [1] 1990 1992 1993 1998

8. LucasArts was founded in 1982. What code would you use to calculate how many years after the founding of the company each game was released?

Lucas <- 1982
Release - Lucas
## [1]  8 10 11 16

9. Each of these games falls under the genre of adventure games. In 2011, adventuregamers.com created a ranking of the top 100 adventure games of all time. Create a vector named Rank, containing the rankings of the aforementioned games, with the values 14, 11, 6, and 1 [6].

Rank <- c(14, 11, 6, 1)

10. What code would you use to create a data frame called AdventureGames containing the vectors created in Problem 6, Problem 7, and Problem 9?

AdventureGames <- data.frame(Title, Release, Rank)
AdventureGames
##                                    Title Release Rank
## 1            The Secret of Monkey Island    1990   14
## 2 Indiana Jones and the Fate of Atlantis    1992   11
## 3                    Day of the Tentacle    1993    6
## 4                          Grim Fandango    1998    1
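
The subtraction in Problem 8 works because R recycles the single founding year across every element of Release. A minimal sketch of that idea follows; Founded and YearsAfter are hypothetical names used only here.

```r
# Release years taken from Problem 7
Release <- c(1990, 1992, 1993, 1998)
Founded <- 1982  # hypothetical name for the founding year

# The scalar is recycled, so the subtraction runs element by element
YearsAfter <- Release - Founded
YearsAfter

# The result has the same length as Release, so it can sit beside it
data.frame(Release, YearsAfter)
```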

Exercise 4.7 Practice problems

colleges <- data.frame(
  College = c("William and Mary", "Christopher Newport", "George Mason", "James Madison", 
             "Longwood", "Norfolk State", "Old Dominion", "Radford", "Mary Washington",
             "Virginia", "Virginia Commonwealth", "Virginia Military Institute", "Virginia Tech", "Virginia State"),
  Employees = c(2104, 922, 4043, 2833, 746, 919, 2369, 1273, 721, 7431, 5825, 550, 7303, 761),
  TopSalary = c(425000, 381486, 536714, 428400, 328268, 295000, 448272, 312080, 449865, 561099, 503154, 364269, 500000, 356524),
  MedianSalary = c(56496, 47895, 63029, 53080, 52000, 49605, 54416, 51000, 53045, 60048, 55000, 44999, 51656, 55925)
)

1. What code would you use to select the first, third, tenth, and twelfth entries in the TopSalary vector from the Colleges data frame?

selected_entries <- colleges$TopSalary[c(1, 3, 10, 12)]
selected_entries
## [1] 425000 536714 561099 364269

2. What code would you use to select the elements of the MedianSalary vector where the TopSalary is greater than $400,000?

# Selecting MedianSalary elements where TopSalary > 400000
selected_median_salary <- colleges$MedianSalary[colleges$TopSalary > 400000]
selected_median_salary
## [1] 56496 63029 53080 54416 53045 60048 55000 51656
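
It may help to look at the logical vector that the comparison produces before it is used as an index. The sketch below uses only the first four colleges from the table; mask is a hypothetical name.

```r
TopSalary    <- c(425000, 381486, 536714, 428400)  # first four colleges
MedianSalary <- c(56496, 47895, 63029, 53080)

# The comparison produces one TRUE/FALSE per element
mask <- TopSalary > 400000
mask

# which() converts the mask to positions; both forms select the same values
MedianSalary[mask]
MedianSalary[which(mask)]
```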

3. Selecting rows where Employees <= 1000

colleges_few_employees <- colleges[colleges$Employees <= 1000, ]
colleges_few_employees
##                        College Employees TopSalary MedianSalary
## 2          Christopher Newport       922    381486        47895
## 5                     Longwood       746    328268        52000
## 6                Norfolk State       919    295000        49605
## 9              Mary Washington       721    449865        53045
## 12 Virginia Military Institute       550    364269        44999
## 14              Virginia State       761    356524        55925
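
Base R's subset() is an alternative to bracket indexing for row filters like this one. The sketch below compares the two on a three-row excerpt; small, by_bracket, and by_subset are hypothetical names.

```r
small <- data.frame(
  College   = c("William and Mary", "Christopher Newport", "Longwood"),
  Employees = c(2104, 922, 746)
)

# Bracket indexing: the trailing comma means "keep all columns"
by_bracket <- small[small$Employees <= 1000, ]

# subset() evaluates the condition inside the data frame,
# so the column can be written without the small$ prefix
by_subset <- subset(small, Employees <= 1000)

by_bracket$College
by_subset$College
```

Bracket indexing is the form used throughout these exercises; subset() is mainly a convenience for interactive work.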

4. Selecting a random sample of 5 colleges

# No seed is set, so the sampled rows differ between runs
sampled_colleges <- colleges[sample(nrow(colleges), 5), ]
sampled_colleges
##                  College Employees TopSalary MedianSalary
## 4          James Madison      2833    428400        53080
## 9        Mary Washington       721    449865        53045
## 7           Old Dominion      2369    448272        54416
## 11 Virginia Commonwealth      5825    503154        55000
## 6          Norfolk State       919    295000        49605
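
Because sample() draws at random, rerunning the code generally gives different rows. If a repeatable draw is wanted, one option is to call set.seed() first. The sketch below assumes an arbitrary seed of 42 and uses the hypothetical names first_draw and second_draw.

```r
Employees <- c(2104, 922, 4043, 2833, 746, 919, 2369)  # first seven colleges

# Fixing the seed before sampling makes the draw repeatable
set.seed(42)  # 42 is an arbitrary choice
first_draw <- sample(length(Employees), 3)

set.seed(42)
second_draw <- sample(length(Employees), 3)

# Same seed, same call, same positions
identical(first_draw, second_draw)
Employees[first_draw]
```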

Creating a data frame

# Columns are named with `=`; using `<-` here would mangle the column names
Countries <- data.frame(
  National = c("China", "India", "United States", "Indonesia", "Brazil", "Pakistan", "Nigeria", "Bangladesh", "Russia", "Mexico"),
  Region = c("Asia", "Asia", "North America", "Asia", "South America", "Asia", "Africa", "Asia", "Europe", "North America"),
  Populasi = c(1409517397, 1339180127, 324459463, 263991379, 209288278, 197015955, 190886311, 164669751, 143989754, 129163276),
  Pct_increase = c(0.4, 1.1, 0.7, 1.1, 0.8, 2.0, 2.6, 1.1, 0.0, 1.3),
  GDP_capita = c(8582, 1852, 57467, 3895, 10309, 1629, 2640, 1524, 10248, 8562)
)

Countries
##         National        Region   Populasi Pct_increase GDP_capita
## 1          China          Asia 1409517397          0.4       8582
## 2          India          Asia 1339180127          1.1       1852
## 3  United States North America  324459463          0.7      57467
## 4      Indonesia          Asia  263991379          1.1       3895
## 5         Brazil South America  209288278          0.8      10309
## 6       Pakistan          Asia  197015955          2.0       1629
## 7        Nigeria        Africa  190886311          2.6       2640
## 8     Bangladesh          Asia  164669751          1.1       1524
## 9         Russia        Europe  143989754          0.0      10248
## 10        Mexico North America  129163276          1.3       8562

5. Selecting rows where GDP per capita is less than 10000 and Region is not ‘Asia’

filtered_countries <- Countries[Countries$GDP_capita < 10000 & Countries$Region != "Asia", ]
filtered_countries
##    National        Region  Populasi Pct_increase GDP_capita
## 7   Nigeria        Africa 190886311          2.6       2640
## 10   Mexico North America 129163276          1.3       8562

6. Selecting a random sample of 3 nations

# No seed is set, so the sampled rows differ between runs
sampled_nations <- Countries[sample(nrow(Countries), 3), ]
sampled_nations
##      National        Region  Populasi Pct_increase GDP_capita
## 8  Bangladesh          Asia 164669751          1.1       1524
## 10     Mexico North America 129163276          1.3       8562
## 9      Russia        Europe 143989754          0.0      10248
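
Column names in data.frame() come from the argument names, so the choice between `=` and `<-` inside the call matters. The sketch below contrasts the two on a tiny example; good and bad are hypothetical names.

```r
# `=` names the column
good <- data.frame(Region = c("Asia", "Africa"))
names(good)

# `<-` creates a variable called Region in the workspace and passes an
# unnamed argument, so data.frame() builds a name from the expression text
bad <- data.frame(Region <- c("Asia", "Africa"))
names(bad)

# Exact matching with [[ ]] finds no column called "Region" ...
is.null(bad[["Region"]])

# ... while `$` still returns the data, but only through partial matching
bad$Region
```

The partial matching of `$` is why filters written against a mangled data frame can still appear to work.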

7. Selecting nations with a population percent increase greater than 1.5%

nations_increase <- Countries$National[Countries$Pct_increase > 1.5]

Creating data frame 2

Olympics <- data.frame(
  Year = c(1992, 1992, 1994, 1996, 1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018),
  Type = c("Summer", "Winter", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter"),
  Host = c("Spain", "France", "Norway", "United States", "Japan", "Australia", "United States", "Greece", "Italy", "China", "Canada", "United Kingdom", "Russia", "Brazil", "South Korea"),
  Competitors = c(9356, 1801, 1737, 10318, 2176, 10651, 2399, 10625, 2508, 10942, 2566, 10768, 2873, 11238, 2922),
  Events = c(257, 57, 61, 271, 68, 300, 78, 301, 84, 302, 86, 302, 98, 306, 102),
  Nations = c(169, 64, 67, 197, 72, 199, 78, 201, 80, 204, 82, 204, 88, 207, 92),
  Leader = c("Unified Team", "Germany", "Russia", "United States", "Germany", "United States", "Norway", "United States", "Germany", "China", "Canada", "United States", "Russia", "United States", "Norway")
)
Olympics
##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 2  1992 Winter         France        1801     57      64       Germany
## 3  1994 Winter         Norway        1737     61      67        Russia
## 4  1996 Summer  United States       10318    271     197 United States
## 5  1998 Winter          Japan        2176     68      72       Germany
## 6  2000 Summer      Australia       10651    300     199 United States
## 7  2002 Winter  United States        2399     78      78        Norway
## 8  2004 Summer         Greece       10625    301     201 United States
## 9  2006 Winter          Italy        2508     84      80       Germany
## 10 2008 Summer          China       10942    302     204         China
## 11 2010 Winter         Canada        2566     86      82        Canada
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 13 2014 Winter         Russia        2873     98      88        Russia
## 14 2016 Summer         Brazil       11238    306     207 United States
## 15 2018 Winter    South Korea        2922    102      92        Norway

8. Selecting rows where Host is also the Leader

host_medal_leader <- Olympics[Olympics$Host == Olympics$Leader, ]
host_medal_leader
##    Year   Type          Host Competitors Events Nations        Leader
## 4  1996 Summer United States       10318    271     197 United States
## 10 2008 Summer         China       10942    302     204         China
## 11 2010 Winter        Canada        2566     86      82        Canada
## 13 2014 Winter        Russia        2873     98      88        Russia
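
Comparing two columns with == works element by element, pairing each Host with the Leader on the same row. A minimal sketch on four rows from the table (1992, 1996, 2004, and 2008 Summer); same is a hypothetical name.

```r
Host   <- c("Spain", "United States", "Greece", "China")
Leader <- c("Unified Team", "United States", "United States", "China")

# One TRUE/FALSE per row: does the host match the medal leader?
same <- Host == Leader
same

# TRUE counts as 1, so sum() gives the number of matching rows
sum(same)
Host[same]
```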

9. Adding a new column for Competitors per Event

Olympics$CompetitorsPerEvent <- Olympics$Competitors / Olympics$Events
Olympics
##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 2  1992 Winter         France        1801     57      64       Germany
## 3  1994 Winter         Norway        1737     61      67        Russia
## 4  1996 Summer  United States       10318    271     197 United States
## 5  1998 Winter          Japan        2176     68      72       Germany
## 6  2000 Summer      Australia       10651    300     199 United States
## 7  2002 Winter  United States        2399     78      78        Norway
## 8  2004 Summer         Greece       10625    301     201 United States
## 9  2006 Winter          Italy        2508     84      80       Germany
## 10 2008 Summer          China       10942    302     204         China
## 11 2010 Winter         Canada        2566     86      82        Canada
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 13 2014 Winter         Russia        2873     98      88        Russia
## 14 2016 Summer         Brazil       11238    306     207 United States
## 15 2018 Winter    South Korea        2922    102      92        Norway
##    CompetitorsPerEvent
## 1             36.40467
## 2             31.59649
## 3             28.47541
## 4             38.07380
## 5             32.00000
## 6             35.50333
## 7             30.75641
## 8             35.29900
## 9             29.85714
## 10            36.23179
## 11            29.83721
## 12            35.65563
## 13            29.31633
## 14            36.72549
## 15            28.64706
# Selecting rows where Competitors per Event is greater than 35
competitors_per_event <- Olympics[Olympics$CompetitorsPerEvent > 35, ]
competitors_per_event
##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 4  1996 Summer  United States       10318    271     197 United States
## 6  2000 Summer      Australia       10651    300     199 United States
## 8  2004 Summer         Greece       10625    301     201 United States
## 10 2008 Summer          China       10942    302     204         China
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 14 2016 Summer         Brazil       11238    306     207 United States
##    CompetitorsPerEvent
## 1             36.40467
## 4             38.07380
## 6             35.50333
## 8             35.29900
## 10            36.23179
## 12            35.65563
## 14            36.72549
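
When only a few columns of the filtered result are of interest, the column position of the bracket can take a vector of names. The sketch below uses three rows from the Olympics table and rounds the ratio to one decimal for display; games and PerEvent are hypothetical names.

```r
games <- data.frame(  # hypothetical excerpt: 1992 Summer, 1996, 1998
  Year        = c(1992, 1996, 1998),
  Competitors = c(9356, 10318, 2176),
  Events      = c(257, 271, 68)
)

# Derived column, computed row by row
games$PerEvent <- round(games$Competitors / games$Events, 1)

# Rows are chosen by the condition, columns by name
games[games$PerEvent > 35, c("Year", "PerEvent")]
```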

10. Selecting rows where Type is ‘Winter’ and Nations is at least 80

winter_olympics <- Olympics[Olympics$Type == "Winter" & Olympics$Nations >= 80, ]
winter_olympics
##    Year   Type        Host Competitors Events Nations  Leader
## 9  2006 Winter       Italy        2508     84      80 Germany
## 11 2010 Winter      Canada        2566     86      82  Canada
## 13 2014 Winter      Russia        2873     98      88  Russia
## 15 2018 Winter South Korea        2922    102      92  Norway
##    CompetitorsPerEvent
## 9             29.85714
## 11            29.83721
## 13            29.31633
## 15            28.64706
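
The & operator combines two conditions row by row, keeping a row only when both are TRUE. The sketch below breaks the filter into its parts on four rows from the table (1998, 2006, 2008, 2018); winter, big, and both are hypothetical names.

```r
Type    <- c("Winter", "Winter", "Summer", "Winter")
Nations <- c(72, 80, 204, 92)

winter <- Type == "Winter"   # first condition
big    <- Nations >= 80      # second condition; >= includes exactly 80

# A position is TRUE only where both conditions hold
both <- winter & big
both
Nations[both]
```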