Exercise 2.7 Practice problems

1. What code would you use to create a vector named Movie with the values Citizen Kane, The Godfather, Casablanca, Raging Bull, and Singing in the Rain?

# Quote each title separately so the vector has five elements
Movie <- c("Citizen Kane", "The Godfather", "Casablanca", "Raging Bull", "Singing in the Rain")
Movie

## [1] "Citizen Kane"        "The Godfather"       "Casablanca"
## [4] "Raging Bull"         "Singing in the Rain"

2. What code would you use to create a vector—giving the year that the movies in Problem 1 were made—named Year with the values 1941, 1972, 1942, 1980, and 1952?

Year <- c(1941, 1972, 1942, 1980, 1952)
Year

## [1] 1941 1972 1942 1980 1952

3. What code would you use to create a vector—giving the run times in minutes of the movies in Problem 1—named RunTime with the values 119, 177, 102, 129, and 103?

RunTime <- c(119, 177, 102, 129, 103)

4. What code would you use to find the run times of the movies in hours and save them in a vector called RunTimeHours?

RunTimeHours <- RunTime / 60
RunTimeHours

## [1] 1.983333 2.950000 1.700000 2.150000 1.716667

5. What code would you use to create a data frame named MovieInfo containing the vectors created in Problem 1, Problem 2, and Problem 3?

MovieInfo <- data.frame(Movie, Year, RunTime)
MovieInfo

##                 Movie Year RunTime
## 1        Citizen Kane 1941     119
## 2       The Godfather 1972     177
## 3          Casablanca 1942     102
## 4         Raging Bull 1980     129
## 5 Singing in the Rain 1952     103
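Checking a vector's length before building a data frame is a quick way to confirm that it holds one element per movie rather than a single long string. The sketch below is only illustrative; the names `one_string` and `five_strings` are assumptions introduced here and are not part of the exercise.

```r
# One quoted string that happens to contain commas is a single element
one_string <- c("Citizen Kane, The Godfather, Casablanca")
length(one_string)     # 1

# Separately quoted strings are separate elements
five_strings <- c("Citizen Kane", "The Godfather", "Casablanca",
                  "Raging Bull", "Singing in the Rain")
length(five_strings)   # 5

# data.frame() recycles a length-1 vector to match the other columns,
# so the mistake does not raise an error
Year <- c(1941, 1972, 1942, 1980, 1952)
nrow(data.frame(one_string, Year))   # 5 rows, the same string in each
```

Because the recycling is silent, `length()` (or `str()` on the finished data frame) is worth running whenever a character column looks suspicious.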

6. What code would you use to create a vector named Title with the values The Secret of Monkey Island, Indiana Jones and the Fate of Atlantis, Day of the Tentacle, and Grim Fandango?

# Four games, so four separately quoted strings
Title <- c("The Secret of Monkey Island", "Indiana Jones and the Fate of Atlantis", "Day of the Tentacle", "Grim Fandango")

7. What code would you use to create a vector—giving the year that the games in Problem 6 were released—named Release with the values 1990, 1992, 1993, and 1998?

Release <- c(1990, 1992, 1993, 1998)
Release

## [1] 1990 1992 1993 1998

8. LucasArts was founded in 1982. What code would you use to calculate how many years after the founding of the company each game was released?

Lucas <- 1982
Release - Lucas

## [1]  8 10 11 16

9. Each of these games falls under the genre of adventure games. In 2011, adventuregamers.com created a ranking of the top 100 adventure games of all time. Create a vector—containing the rankings of the aforementioned games—named Rank with the values 14, 11, 6, and 1 [6].

Rank <- c(14, 11, 6, 1)

10. What code would you use to create a data frame called AdventureGames containing the vectors created in Problem 6, Problem 7, and Problem 9?

AdventureGames <- data.frame(Title, Release, Rank)
AdventureGames

##                                    Title Release Rank
## 1            The Secret of Monkey Island    1990   14
## 2 Indiana Jones and the Fate of Atlantis    1992   11
## 3                    Day of the Tentacle    1993    6
## 4                          Grim Fandango    1998    1
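Problems 4 and 8 both rely on R applying arithmetic to every element of a vector at once, with a single number recycled across the whole vector. The following is a minimal sketch of that behaviour; `founded` and `years_after` are hypothetical names used only for this illustration.

```r
Release <- c(1990, 1992, 1993, 1998)
founded <- 1982            # hypothetical name for the founding year

# The length-1 value is recycled against every element of Release
years_after <- Release - founded
years_after                # 8 10 11 16

# The same idea converts minutes to hours element by element
RunTime <- c(119, 177, 102, 129, 103)
round(RunTime / 60, 2)     # 1.98 2.95 1.70 2.15 1.72
```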

Exercise 4.7 Practice problems

colleges <- data.frame(
  College = c("William and Mary", "Christopher Newport", "George Mason", "James Madison", 
             "Longwood", "Norfolk State", "Old Dominion", "Radford", "Mary Washington",
             "Virginia", "Virginia Commonwealth", "Virginia Military Institute", "Virginia Tech", "Virginia State"),
  Employees = c(2104, 922, 4043, 2833, 746, 919, 2369, 1273, 721, 7431, 5825, 550, 7303, 761),
  TopSalary = c(425000, 381486, 536714, 428400, 328268, 295000, 448272, 312080, 449865, 561099, 503154, 364269, 500000, 356524),
  MedianSalary = c(56496, 47895, 63029, 53080, 52000, 49605, 54416, 51000, 53045, 60048, 55000, 44999, 51656, 55925)
)

1. What code would you use to select the first, third, tenth, and twelfth entries in the TopSalary vector from the colleges data frame?

selected_entries <- colleges$TopSalary[c(1, 3, 10, 12)]
selected_entries

## [1] 425000 536714 561099 364269

2. What code would you use to select the elements of the MedianSalary vector where the TopSalary is greater than $400,000?

# Selecting MedianSalary elements where TopSalary > 400000
selected_median_salary <- colleges$MedianSalary[colleges$TopSalary > 400000]
selected_median_salary

## [1] 56496 63029 53080 54416 53045 60048 55000 51656
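The selection in Problem 2 works because the comparison produces a logical vector, and indexing with a logical vector keeps only the positions that are TRUE. Here is a small sketch using the first four colleges' values from the data frame above; the name `keep` is an assumption made for the illustration.

```r
TopSalary    <- c(425000, 381486, 536714, 428400)
MedianSalary <- c(56496, 47895, 63029, 53080)

keep <- TopSalary > 400000   # one TRUE/FALSE per element
keep                         # TRUE FALSE TRUE TRUE
which(keep)                  # positions of the TRUE values: 1 3 4
sum(keep)                    # number of matches: 3

MedianSalary[keep]           # 56496 63029 53080
```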

3. What code would you use to select the rows of the colleges data frame where Employees is less than or equal to 1000?

colleges_few_employees <- colleges[colleges$Employees <= 1000, ]
colleges_few_employees

##                        College Employees TopSalary MedianSalary
## 2          Christopher Newport       922    381486        47895
## 5                     Longwood       746    328268        52000
## 6                Norfolk State       919    295000        49605
## 9              Mary Washington       721    449865        53045
## 12 Virginia Military Institute       550    364269        44999
## 14              Virginia State       761    356524        55925
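Base R's `subset()` offers an alternative to the bracket form for row selection. The sketch below uses a three-row stand-in data frame, `colleges_small`, which is an assumption made to keep the example short; it is not the full data set.

```r
colleges_small <- data.frame(
  College   = c("William and Mary", "Christopher Newport", "Longwood"),
  Employees = c(2104, 922, 746)
)

# Bracket form: logical row index, empty column index keeps all columns
by_bracket <- colleges_small[colleges_small$Employees <= 1000, ]

# subset() form: column names can be used directly in the condition
by_subset <- subset(colleges_small, Employees <= 1000)

by_bracket$College   # "Christopher Newport" "Longwood"
by_subset$College    # "Christopher Newport" "Longwood"
```

The bracket form is the more explicit of the two; `subset()` is mainly a convenience for interactive work.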

4. What code would you use to select a random sample of 5 colleges from the colleges data frame?

sampled_colleges <- colleges[sample(nrow(colleges), 5), ]
sampled_colleges

##                  College Employees TopSalary MedianSalary
## 4          James Madison      2833    428400        53080
## 9        Mary Washington       721    449865        53045
## 7           Old Dominion      2369    448272        54416
## 11 Virginia Commonwealth      5825    503154        55000
## 6          Norfolk State       919    295000        49605
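Because `sample()` draws at random, the rows shown above will differ from run to run. Calling `set.seed()` first makes the draw repeatable. This is a minimal sketch: the seed value 42 is arbitrary, and `n_rows` is a hypothetical stand-in for `nrow(colleges)`.

```r
n_rows <- 14                 # number of rows in the colleges data frame

set.seed(42)                 # 42 is an arbitrary choice
first_draw <- sample(n_rows, 5)

set.seed(42)                 # resetting the seed repeats the draw
second_draw <- sample(n_rows, 5)

identical(first_draw, second_draw)   # TRUE
```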

Creating a data frame

# Use `=` (not `<-`) inside data.frame() so each argument becomes a named column
Countries <- data.frame(
  National = c("China", "India", "United States", "Indonesia", "Brazil", "Pakistan", "Nigeria", "Bangladesh", "Russia", "Mexico"),
  Region = c("Asia", "Asia", "North America", "Asia", "South America", "Asia", "Africa", "Asia", "Europe", "North America"),
  Populasi = c(1409517397, 1339180127, 324459463, 263991379, 209288278, 197015955, 190886311, 164669751, 143989754, 129163276),
  Pct_increase = c(0.4, 1.1, 0.7, 1.1, 0.8, 2.0, 2.6, 1.1, 0.0, 1.3),
  GDP_capita = c(8582, 1852, 57467, 3895, 10309, 1629, 2640, 1524, 10248, 8562)
)

Countries

##         National        Region   Populasi Pct_increase GDP_capita
## 1          China          Asia 1409517397          0.4       8582
## 2          India          Asia 1339180127          1.1       1852
## 3  United States North America  324459463          0.7      57467
## 4      Indonesia          Asia  263991379          1.1       3895
## 5         Brazil South America  209288278          0.8      10309
## 6       Pakistan          Asia  197015955          2.0       1629
## 7        Nigeria        Africa  190886311          2.6       2640
## 8     Bangladesh          Asia  164669751          1.1       1524
## 9         Russia        Europe  143989754          0.0      10248
## 10        Mexico North America  129163276          1.3       8562

5. What code would you use to select the rows where GDP per capita is less than 10000 and the Region is not "Asia"?

filtered_countries <- Countries[Countries$GDP_capita < 10000 & Countries$Region != "Asia", ]
filtered_countries

##    National        Region  Populasi Pct_increase GDP_capita
## 7   Nigeria        Africa 190886311          2.6       2640
## 10   Mexico North America 129163276          1.3       8562

6. What code would you use to select a random sample of 3 nations?

sampled_nations <- Countries[sample(nrow(Countries), 3), ]
sampled_nations

##      National        Region  Populasi Pct_increase GDP_capita
## 8  Bangladesh          Asia 164669751          1.1       1524
## 10     Mexico North America 129163276          1.3       8562
## 9      Russia        Europe 143989754          0.0      10248

7. What code would you use to select the nations with a population percent increase greater than 1.5%?

nations_increase <- Countries$National[Countries$Pct_increase > 1.5]
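Inside a call to `data.frame()`, `=` and `<-` behave differently: `=` names the column, while `<-` performs an ordinary assignment and passes the value along unnamed. The sketch below illustrates the difference; `good`, `bad`, and `Region2` are hypothetical names chosen for the example.

```r
# With `=`, the left-hand side becomes the column name
good <- data.frame(Region = c("Asia", "Europe"))
names(good)                # "Region"

# With `<-`, R assigns a variable called Region2 in the workspace and
# passes the value to data.frame() unnamed, so the column name is
# derived from the text of the whole expression
bad <- data.frame(Region2 <- c("Asia", "Europe"))
names(bad) == "Region2"    # FALSE
exists("Region2")          # TRUE: a stray variable was created
```

Checking `names()` right after creating a data frame catches this kind of slip before later selections depend on the column names.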

Creating data frame 2

Olympics <- data.frame(
  Year = c(1992, 1992, 1994, 1996, 1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018),
  Type = c("Summer", "Winter", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter"),
  Host = c("Spain", "France", "Norway", "United States", "Japan", "Australia", "United States", "Greece", "Italy", "China", "Canada", "United Kingdom", "Russia", "Brazil", "South Korea"),
  Competitors = c(9356, 1801, 1737, 10318, 2176, 10651, 2399, 10625, 2508, 10942, 2566, 10768, 2873, 11238, 2922),
  Events = c(257, 57, 61, 271, 68, 300, 78, 301, 84, 302, 86, 302, 98, 306, 102),
  Nations = c(169, 64, 67, 197, 72, 199, 78, 201, 80, 204, 82, 204, 88, 207, 92),
  Leader = c("Unified Team", "Germany", "Russia", "United States", "Germany", "United States", "Norway", "United States", "Germany", "China", "Canada", "United States", "Russia", "United States", "Norway")
)
Olympics

##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 2  1992 Winter         France        1801     57      64       Germany
## 3  1994 Winter         Norway        1737     61      67        Russia
## 4  1996 Summer  United States       10318    271     197 United States
## 5  1998 Winter          Japan        2176     68      72       Germany
## 6  2000 Summer      Australia       10651    300     199 United States
## 7  2002 Winter  United States        2399     78      78        Norway
## 8  2004 Summer         Greece       10625    301     201 United States
## 9  2006 Winter          Italy        2508     84      80       Germany
## 10 2008 Summer          China       10942    302     204         China
## 11 2010 Winter         Canada        2566     86      82        Canada
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 13 2014 Winter         Russia        2873     98      88        Russia
## 14 2016 Summer         Brazil       11238    306     207 United States
## 15 2018 Winter    South Korea        2922    102      92        Norway

8. What code would you use to select the rows where the host nation was also the medal leader?

host_medal_leader <- Olympics[Olympics$Host == Olympics$Leader, ]
host_medal_leader

##    Year   Type          Host Competitors Events Nations        Leader
## 4  1996 Summer United States       10318    271     197 United States
## 10 2008 Summer         China       10942    302     204         China
## 11 2010 Winter        Canada        2566     86      82        Canada
## 13 2014 Winter        Russia        2873     98      88        Russia

9. What code would you use to add a column for the number of competitors per event, and then select the rows where that value is greater than 35?

Olympics$CompetitorsPerEvent <- Olympics$Competitors / Olympics$Events
Olympics

##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 2  1992 Winter         France        1801     57      64       Germany
## 3  1994 Winter         Norway        1737     61      67        Russia
## 4  1996 Summer  United States       10318    271     197 United States
## 5  1998 Winter          Japan        2176     68      72       Germany
## 6  2000 Summer      Australia       10651    300     199 United States
## 7  2002 Winter  United States        2399     78      78        Norway
## 8  2004 Summer         Greece       10625    301     201 United States
## 9  2006 Winter          Italy        2508     84      80       Germany
## 10 2008 Summer          China       10942    302     204         China
## 11 2010 Winter         Canada        2566     86      82        Canada
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 13 2014 Winter         Russia        2873     98      88        Russia
## 14 2016 Summer         Brazil       11238    306     207 United States
## 15 2018 Winter    South Korea        2922    102      92        Norway
##    CompetitorsPerEvent
## 1             36.40467
## 2             31.59649
## 3             28.47541
## 4             38.07380
## 5             32.00000
## 6             35.50333
## 7             30.75641
## 8             35.29900
## 9             29.85714
## 10            36.23179
## 11            29.83721
## 12            35.65563
## 13            29.31633
## 14            36.72549
## 15            28.64706

# Select rows where CompetitorsPerEvent is greater than 35
competitors_per_event <- Olympics[Olympics$CompetitorsPerEvent > 35, ]
competitors_per_event

##    Year   Type           Host Competitors Events Nations        Leader
## 1  1992 Summer          Spain        9356    257     169  Unified Team
## 4  1996 Summer  United States       10318    271     197 United States
## 6  2000 Summer      Australia       10651    300     199 United States
## 8  2004 Summer         Greece       10625    301     201 United States
## 10 2008 Summer          China       10942    302     204         China
## 12 2012 Summer United Kingdom       10768    302     204 United States
## 14 2016 Summer         Brazil       11238    306     207 United States
##    CompetitorsPerEvent
## 1             36.40467
## 4             38.07380
## 6             35.50333
## 8             35.29900
## 10            36.23179
## 12            35.65563
## 14            36.72549

10. What code would you use to select the rows for Winter Olympics in which at least 80 nations competed?

winter_olympics <- Olympics[Olympics$Type == "Winter" & Olympics$Nations >= 80, ]
winter_olympics

##    Year   Type        Host Competitors Events Nations  Leader
## 9  2006 Winter       Italy        2508     84      80 Germany
## 11 2010 Winter      Canada        2566     86      82  Canada
## 13 2014 Winter      Russia        2873     98      88  Russia
## 15 2018 Winter South Korea        2922    102      92  Norway
##    CompetitorsPerEvent
## 9             29.85714
## 11            29.83721
## 13            29.31633
## 15            28.64706
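The condition in Problem 10 combines two comparisons with `&`, which works element by element on logical vectors. Here is a minimal sketch with three made-up rows; `is_winter` and `is_large` are hypothetical names used for the illustration.

```r
Type    <- c("Summer", "Winter", "Winter")
Nations <- c(207, 78, 92)

is_winter <- Type == "Winter"   # FALSE TRUE TRUE
is_large  <- Nations >= 80      # TRUE FALSE TRUE

# `&` compares the two logical vectors element by element
is_winter & is_large            # FALSE FALSE TRUE
```

The doubled form `&&` is meant for single TRUE/FALSE values, as in an `if` condition, so the single `&` is the one to use when filtering rows.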

Exercises

Farid

2024-09-12
