Exercise 2.7 Practice problems
1. What code would you use to create a vector named Movie with the
values Citizen Kane, The Godfather, Casablanca, Raging Bull, and Singing
in the Rain?
# Each title is quoted separately so that Movie has five elements.
Movie <- c("Citizen Kane", "The Godfather", "Casablanca", "Raging Bull", "Singing in the Rain")
Movie
## [1] "Citizen Kane"        "The Godfather"       "Casablanca"
## [4] "Raging Bull"         "Singing in the Rain"
2. What code would you use to create a vector named Year, giving the
year that the movies in Problem 1 were made, with the values 1941, 1972,
1942, 1980, and 1952?
Year <- c(1941, 1972, 1942, 1980, 1952)
Year
## [1] 1941 1972 1942 1980 1952
3. What code would you use to create a vector named RunTime, giving the
run times in minutes of the movies in Problem 1, with the values 119,
177, 102, 129, and 103?
RunTime <- c(119, 177, 102, 129, 103)
4. What code would you use to find the run times of the movies in
hours and save them in a vector called RunTimeHours?
RunTimeHours <- RunTime / 60
RunTimeHours
## [1] 1.983333 2.950000 1.700000 2.150000 1.716667
5. What code would you use to create a data frame named MovieInfo
containing the vectors created in Problem 1, Problem 2, and Problem
3?
MovieInfo <- data.frame(Movie, Year, RunTime)
MovieInfo
##                 Movie Year RunTime
## 1        Citizen Kane 1941     119
## 2       The Godfather 1972     177
## 3          Casablanca 1942     102
## 4         Raging Bull 1980     129
## 5 Singing in the Rain 1952     103
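The difference between one quoted string and several separately quoted strings is easy to miss, because data.frame() silently recycles a length-one vector. The following is a minimal sketch of how length() can reveal the problem; the names one_string, three_strings, and recycled are illustrative only and are not part of the exercise.

```r
# One quoted string that merely contains commas is a single element ...
one_string <- c("Citizen Kane, The Godfather, Casablanca")
# ... whereas separately quoted values are separate elements.
three_strings <- c("Citizen Kane", "The Godfather", "Casablanca")

length(one_string)     # 1
length(three_strings)  # 3

# data.frame() recycles the length-one vector to match Year, so every
# row ends up with the same Movie value and no error is raised.
Year <- c(1941, 1972, 1942)
recycled <- data.frame(Movie = one_string, Year = Year)
nrow(recycled)                  # 3
length(unique(recycled$Movie))  # 1
```

Checking length() on each vector before building a data frame is one simple way to confirm that every column has the intended number of entries.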
6. What code would you use to create a vector named Title with the
values The Secret of Monkey Island, Indiana Jones and the Fate of
Atlantis, Day of the Tentacle, and Grim Fandango?
# Four games, each quoted separately, to match the four release years below.
Title <- c("The Secret of Monkey Island", "Indiana Jones and the Fate of Atlantis", "Day of the Tentacle", "Grim Fandango")
7. What code would you use to create a vector named Release, giving the
year that the games in Problem 6 were released, with the values 1990,
1992, 1993, and 1998?
Release <- c(1990, 1992, 1993, 1998)
Release
## [1] 1990 1992 1993 1998
8. LucasArts was founded in 1982. What code would you use to
calculate how many years after the founding of the company each game
was released?
Lucas <- 1982
Release - Lucas
## [1]  8 10 11 16
10. What code would you use to create a data frame called
AdventureGames containing the vectors created in Problem 6, Problem 7,
and Problem 9?
# Rank is the vector from Problem 9; its values are those shown in the output below.
Rank <- c(14, 11, 6, 1)
AdventureGames <- data.frame(Title, Release, Rank)
AdventureGames
##                                    Title Release Rank
## 1            The Secret of Monkey Island    1990   14
## 2 Indiana Jones and the Fate of Atlantis    1992   11
## 3                    Day of the Tentacle    1993    6
## 4                          Grim Fandango    1998    1
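Problem 8 relies on vectorized arithmetic: subtracting a single number from a vector applies the subtraction to every element. A small sketch of that idea follows; the names Founded, YearsAfter, and Games are hypothetical and chosen only for this illustration.

```r
Release <- c(1990, 1992, 1993, 1998)
Founded <- 1982  # hypothetical name for the founding year

# The single value is recycled across all four release years.
YearsAfter <- Release - Founded
YearsAfter  # 8 10 11 16

# The result has the same length as Release, so it can sit beside it
# as a column of a data frame.
Games <- data.frame(Release, YearsAfter)
str(Games)
```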
Exercise 4.7 Practice problems
colleges <- data.frame(
College = c("William and Mary", "Christopher Newport", "George Mason", "James Madison",
"Longwood", "Norfolk State", "Old Dominion", "Radford", "Mary Washington",
"Virginia", "Virginia Commonwealth", "Virginia Military Institute", "Virginia Tech", "Virginia State"),
Employees = c(2104, 922, 4043, 2833, 746, 919, 2369, 1273, 721, 7431, 5825, 550, 7303, 761),
TopSalary = c(425000, 381486, 536714, 428400, 328268, 295000, 448272, 312080, 449865, 561099, 503154, 364269, 500000, 356524),
MedianSalary = c(56496, 47895, 63029, 53080, 52000, 49605, 54416, 51000, 53045, 60048, 55000, 44999, 51656, 55925)
)
1. What code would you use to select the first, third, tenth, and
twelfth entries in the TopSalary vector from the colleges data
frame?
selected_entries <- colleges$TopSalary[c(1, 3, 10, 12)]
selected_entries
## [1] 425000 536714 561099 364269
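Indexing with a vector of positions can also be turned around: negative positions drop entries rather than keep them. The sketch below uses only the first five TopSalary values from the data frame above; the names kept and dropped are illustrative.

```r
# First five TopSalary values from the colleges data frame.
TopSalary <- c(425000, 381486, 536714, 428400, 328268)

# Positive indices keep the listed positions.
kept <- TopSalary[c(1, 3)]

# Negative indices drop the listed positions instead.
dropped <- TopSalary[-c(1, 3)]

kept     # 425000 536714
dropped  # 381486 428400 328268
```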
3. Selecting rows where Employees <= 1000
colleges_few_employees <- colleges[colleges$Employees <= 1000, ]
colleges_few_employees
## College Employees TopSalary MedianSalary
## 2 Christopher Newport 922 381486 47895
## 5 Longwood 746 328268 52000
## 6 Norfolk State 919 295000 49605
## 9 Mary Washington 721 449865 53045
## 12 Virginia Military Institute 550 364269 44999
## 14 Virginia State 761 356524 55925
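The row selection above works because the comparison first produces a logical vector, which is then used as the row index. As a sketch, the same steps can be spelled out on a three-row excerpt of the data; the names small, keep, by_logical, and by_subset are illustrative, and subset() is shown as one base R alternative rather than as the intended answer.

```r
# Three rows taken from the colleges data frame.
small <- data.frame(
  College   = c("William and Mary", "Christopher Newport", "Longwood"),
  Employees = c(2104, 922, 746)
)

# Step 1: the comparison yields one TRUE/FALSE per row.
keep <- small$Employees <= 1000
keep  # FALSE TRUE TRUE

# Step 2: the logical vector selects the rows where it is TRUE.
by_logical <- small[keep, ]

# subset() evaluates the condition inside the data frame directly.
by_subset <- subset(small, Employees <= 1000)

by_logical$College  # "Christopher Newport" "Longwood"
```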
4. Selecting a random sample of 5 colleges
sampled_colleges <- colleges[sample(nrow(colleges), 5), ]
sampled_colleges
## College Employees TopSalary MedianSalary
## 4 James Madison 2833 428400 53080
## 9 Mary Washington 721 449865 53045
## 7 Old Dominion 2369 448272 54416
## 11 Virginia Commonwealth 5825 503154 55000
## 6 Norfolk State 919 295000 49605
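Because sample() draws at random, rerunning the code above will usually return different rows. If a repeatable result is wanted, set.seed() can be called first. This is a minimal sketch on a six-row excerpt; the seed value 123 is arbitrary and the names colleges_small, first, and second are illustrative.

```r
# Six rows taken from the colleges data frame.
colleges_small <- data.frame(
  College   = c("William and Mary", "Christopher Newport", "George Mason",
                "James Madison", "Longwood", "Norfolk State"),
  Employees = c(2104, 922, 4043, 2833, 746, 919)
)

# Setting the same seed before each draw gives the same sample.
set.seed(123)
first <- colleges_small[sample(nrow(colleges_small), 3), ]

set.seed(123)
second <- colleges_small[sample(nrow(colleges_small), 3), ]

identical(first, second)  # TRUE
```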
Creating the data frame
# Columns are named with = inside data.frame(); using <- there would
# produce column names built from the whole expression.
Countries <- data.frame(
  National = c("China", "India", "United States", "Indonesia", "Brazil", "Pakistan", "Nigeria", "Bangladesh", "Russia", "Mexico"),
  Region = c("Asia", "Asia", "North America", "Asia", "South America", "Asia", "Africa", "Asia", "Europe", "North America"),
  Populasi = c(1409517397, 1339180127, 324459463, 263991379, 209288278, 197015955, 190886311, 164669751, 143989754, 129163276),
  Pct_increase = c(0.4, 1.1, 0.7, 1.1, 0.8, 2.0, 2.6, 1.1, 0.0, 1.3),
  GDP_capita = c(8582, 1852, 57467, 3895, 10309, 1629, 2640, 1524, 10248, 8562)
)
Countries
##         National        Region   Populasi Pct_increase GDP_capita
## 1          China          Asia 1409517397          0.4       8582
## 2          India          Asia 1339180127          1.1       1852
## 3  United States North America  324459463          0.7      57467
## 4      Indonesia          Asia  263991379          1.1       3895
## 5         Brazil South America  209288278          0.8      10309
## 6       Pakistan          Asia  197015955          2.0       1629
## 7        Nigeria        Africa  190886311          2.6       2640
## 8     Bangladesh          Asia  164669751          1.1       1524
## 9         Russia        Europe  143989754          0.0      10248
## 10        Mexico North America  129163276          1.3       8562
5. Selecting rows where GDP per capita is less than 10000 and Region
is not "Asia"
filtered_countries <- Countries[Countries$GDP_capita < 10000 & Countries$Region != "Asia", ]
filtered_countries
##    National        Region  Populasi Pct_increase GDP_capita
## 7   Nigeria        Africa 190886311          2.6       2640
## 10   Mexico North America 129163276          1.3       8562
6. Selecting a random sample of 3 nations
sampled_nations <- Countries[sample(nrow(Countries), 3), ]
sampled_nations
##      National        Region  Populasi Pct_increase GDP_capita
## 8  Bangladesh          Asia 164669751          1.1       1524
## 10     Mexico North America 129163276          1.3       8562
## 9      Russia        Europe 143989754          0.0      10248
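How columns are named inside data.frame() depends on whether = or <- is used for each argument. The sketch below contrasts the two on a two-row example; the names named, assigned, Region2, and GDP2 are hypothetical and exist only for this comparison.

```r
# With =, the left-hand side becomes the column name.
named <- data.frame(
  Region     = c("Asia", "Africa"),
  GDP_capita = c(8582, 2640)
)
names(named)  # "Region" "GDP_capita"

# With <-, each argument is an unnamed expression. R builds the column
# name from the text of the whole expression, and the assignment also
# creates Region2 and GDP2 as ordinary variables as a side effect.
assigned <- data.frame(
  Region2 <- c("Asia", "Africa"),
  GDP2 <- c(8582, 2640)
)
names(assigned)    # long names that only start with Region2 and GDP2
exists("Region2")  # TRUE

# Exact lookup with [[ ]] does not find a column called "Region2".
is.null(assigned[["Region2"]])  # TRUE
```

The $ operator on a base data frame accepts partial name matches, which is why code such as assigned$Region2 can still return the column even though no column has exactly that name.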
7. Selecting nations with a population percent increase greater than 1.5%
nations_increase <- Countries$National[Countries$Pct_increase > 1.5]
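The result of this selection is not printed above. As a self-contained sketch, the same condition can be applied to two plain vectors holding the nation names and percent increases from the data frame; the name fast_growing is illustrative.

```r
National <- c("China", "India", "United States", "Indonesia", "Brazil",
              "Pakistan", "Nigeria", "Bangladesh", "Russia", "Mexico")
Pct_increase <- c(0.4, 1.1, 0.7, 1.1, 0.8, 2.0, 2.6, 1.1, 0.0, 1.3)

# Keep the names whose percent increase exceeds 1.5.
fast_growing <- National[Pct_increase > 1.5]
fast_growing  # "Pakistan" "Nigeria"
```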
Creating the second data frame
Olympics <- data.frame(
Year = c(1992, 1992, 1994, 1996, 1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018),
Type = c("Summer", "Winter", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter"),
Host = c("Spain", "France", "Norway", "United States", "Japan", "Australia", "United States", "Greece", "Italy", "China", "Canada", "United Kingdom", "Russia", "Brazil", "South Korea"),
Competitors = c(9356, 1801, 1737, 10318, 2176, 10651, 2399, 10625, 2508, 10942, 2566, 10768, 2873, 11238, 2922),
Events = c(257, 57, 61, 271, 68, 300, 78, 301, 84, 302, 86, 302, 98, 306, 102),
Nations = c(169, 64, 67, 197, 72, 199, 78, 201, 80, 204, 82, 204, 88, 207, 92),
Leader = c("Unified Team", "Germany", "Russia", "United States", "Germany", "United States", "Norway", "United States", "Germany", "China", "Canada", "United States", "Russia", "United States", "Norway")
)
Olympics
## Year Type Host Competitors Events Nations Leader
## 1 1992 Summer Spain 9356 257 169 Unified Team
## 2 1992 Winter France 1801 57 64 Germany
## 3 1994 Winter Norway 1737 61 67 Russia
## 4 1996 Summer United States 10318 271 197 United States
## 5 1998 Winter Japan 2176 68 72 Germany
## 6 2000 Summer Australia 10651 300 199 United States
## 7 2002 Winter United States 2399 78 78 Norway
## 8 2004 Summer Greece 10625 301 201 United States
## 9 2006 Winter Italy 2508 84 80 Germany
## 10 2008 Summer China 10942 302 204 China
## 11 2010 Winter Canada 2566 86 82 Canada
## 12 2012 Summer United Kingdom 10768 302 204 United States
## 13 2014 Winter Russia 2873 98 88 Russia
## 14 2016 Summer Brazil 11238 306 207 United States
## 15 2018 Winter South Korea 2922 102 92 Norway
8. Selecting rows where Host is also the Leader
host_medal_leader <- Olympics[Olympics$Host == Olympics$Leader, ]
host_medal_leader
## Year Type Host Competitors Events Nations Leader
## 4 1996 Summer United States 10318 271 197 United States
## 10 2008 Summer China 10942 302 204 China
## 11 2010 Winter Canada 2566 86 82 Canada
## 13 2014 Winter Russia 2873 98 88 Russia
9. Adding a new column for Competitors per Event
Olympics$CompetitorsPerEvent <- Olympics$Competitors / Olympics$Events
Olympics
## Year Type Host Competitors Events Nations Leader
## 1 1992 Summer Spain 9356 257 169 Unified Team
## 2 1992 Winter France 1801 57 64 Germany
## 3 1994 Winter Norway 1737 61 67 Russia
## 4 1996 Summer United States 10318 271 197 United States
## 5 1998 Winter Japan 2176 68 72 Germany
## 6 2000 Summer Australia 10651 300 199 United States
## 7 2002 Winter United States 2399 78 78 Norway
## 8 2004 Summer Greece 10625 301 201 United States
## 9 2006 Winter Italy 2508 84 80 Germany
## 10 2008 Summer China 10942 302 204 China
## 11 2010 Winter Canada 2566 86 82 Canada
## 12 2012 Summer United Kingdom 10768 302 204 United States
## 13 2014 Winter Russia 2873 98 88 Russia
## 14 2016 Summer Brazil 11238 306 207 United States
## 15 2018 Winter South Korea 2922 102 92 Norway
## CompetitorsPerEvent
## 1 36.40467
## 2 31.59649
## 3 28.47541
## 4 38.07380
## 5 32.00000
## 6 35.50333
## 7 30.75641
## 8 35.29900
## 9 29.85714
## 10 36.23179
## 11 29.83721
## 12 35.65563
## 13 29.31633
## 14 36.72549
## 15 28.64706
# Selecting rows where Competitors per Event is greater than 35
competitors_per_event <- Olympics[Olympics$CompetitorsPerEvent > 35, ]
competitors_per_event
## Year Type Host Competitors Events Nations Leader
## 1 1992 Summer Spain 9356 257 169 Unified Team
## 4 1996 Summer United States 10318 271 197 United States
## 6 2000 Summer Australia 10651 300 199 United States
## 8 2004 Summer Greece 10625 301 201 United States
## 10 2008 Summer China 10942 302 204 China
## 12 2012 Summer United Kingdom 10768 302 204 United States
## 14 2016 Summer Brazil 11238 306 207 United States
## CompetitorsPerEvent
## 1 36.40467
## 4 38.07380
## 6 35.50333
## 8 35.29900
## 10 36.23179
## 12 35.65563
## 14 36.72549
10. Selecting rows where Type is "Winter" and Nations is at least 80
winter_olympics <- Olympics[Olympics$Type == "Winter" & Olympics$Nations >= 80, ]
winter_olympics
## Year Type Host Competitors Events Nations Leader
## 9 2006 Winter Italy 2508 84 80 Germany
## 11 2010 Winter Canada 2566 86 82 Canada
## 13 2014 Winter Russia 2873 98 88 Russia
## 15 2018 Winter South Korea 2922 102 92 Norway
## CompetitorsPerEvent
## 9 29.85714
## 11 29.83721
## 13 29.31633
## 15 28.64706
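Combining two conditions with & compares them element by element, so a row is kept only when both are TRUE for that row. A short sketch on four values taken from the Olympics data (2008, 2010, 1994, and 2018) follows; the names is_winter, is_large, and both are illustrative.

```r
Type    <- c("Summer", "Winter", "Winter", "Winter")
Nations <- c(204, 82, 67, 92)

is_winter <- Type == "Winter"  # FALSE TRUE TRUE TRUE
is_large  <- Nations >= 80     # TRUE TRUE FALSE TRUE

# & is TRUE only where both conditions hold.
both <- is_winter & is_large
both         # FALSE TRUE FALSE TRUE

sum(both)    # 2, the number of matching rows
which(both)  # 2 4, their positions
```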