Import libraries
library(plyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Read the matches file using the read.csv function
matches_data <- read.csv("C:\\Users\\HP\\Documents\\R\\Project_2\\Rscripts\\IPL_DATA\\matches.CSV")
Read the deliveries file using the read.csv function
deliveries_data <- read.csv("C:\\Users\\HP\\Documents\\R\\Project_2\\Rscripts\\IPL_DATA\\deliveries.CSV")
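The hard-coded Windows paths above only resolve on the original machine. A minimal sketch of a more portable variant follows, assuming a hypothetical `data_dir` folder variable and a hypothetical helper `read_ipl_csv` (neither appears in the original script). It is demonstrated on a throwaway file with made-up rows so that it runs anywhere.

```r
# Hypothetical helper (not part of the original script): build the path with
# file.path() so the same code works on Windows, macOS and Linux.
read_ipl_csv <- function(data_dir, file_name) {
  path <- file.path(data_dir, file_name)
  if (!file.exists(path)) stop("File not found: ", path)
  # stringsAsFactors = FALSE keeps text columns as character on every R version
  read.csv(path, stringsAsFactors = FALSE)
}

# Demonstration on a temporary file; the rows and the file name are illustrative.
data_dir <- tempdir()
write.csv(
  data.frame(id = 1:2, season = c(2017, 2017), city = c("Hyderabad", "Pune")),
  file.path(data_dir, "matches.csv"),
  row.names = FALSE
)
demo_matches <- read_ipl_csv(data_dir, "matches.csv")
head(demo_matches)
```

To use it on the real data, point `data_dir` at the folder that holds the IPL files and pass the actual file names.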
View the top rows of matches_data
head(matches_data)
## id season city date team1
## 1 1 2017 Hyderabad 2017-04-05 Sunrisers Hyderabad
## 2 2 2017 Pune 2017-04-06 Mumbai Indians
## 3 3 2017 Rajkot 2017-04-07 Gujarat Lions
## 4 4 2017 Indore 2017-04-08 Rising Pune Supergiant
## 5 5 2017 Bangalore 2017-04-08 Royal Challengers Bangalore
## 6 6 2017 Hyderabad 2017-04-09 Gujarat Lions
## team2 toss_winner toss_decision result
## 1 Royal Challengers Bangalore Royal Challengers Bangalore field normal
## 2 Rising Pune Supergiant Rising Pune Supergiant field normal
## 3 Kolkata Knight Riders Kolkata Knight Riders field normal
## 4 Kings XI Punjab Kings XI Punjab field normal
## 5 Delhi Daredevils Royal Challengers Bangalore bat normal
## 6 Sunrisers Hyderabad Sunrisers Hyderabad field normal
## dl_applied winner win_by_runs win_by_wickets
## 1 0 Sunrisers Hyderabad 35 0
## 2 0 Rising Pune Supergiant 0 7
## 3 0 Kolkata Knight Riders 0 10
## 4 0 Kings XI Punjab 0 6
## 5 0 Royal Challengers Bangalore 15 0
## 6 0 Sunrisers Hyderabad 0 9
## player_of_match venue umpire1
## 1 Yuvraj Singh Rajiv Gandhi International Stadium, Uppal AY Dandekar
## 2 SPD Smith Maharashtra Cricket Association Stadium A Nand Kishore
## 3 CA Lynn Saurashtra Cricket Association Stadium Nitin Menon
## 4 GJ Maxwell Holkar Cricket Stadium AK Chaudhary
## 5 KM Jadhav M Chinnaswamy Stadium
## 6 Rashid Khan Rajiv Gandhi International Stadium, Uppal A Deshmukh
## umpire2 umpire3
## 1 NJ Llong NA
## 2 S Ravi NA
## 3 CK Nandan NA
## 4 C Shamshuddin NA
## 5 NA
## 6 NJ Llong NA
View the top rows of deliveries_data
head(deliveries_data)
## match_id inning batting_team bowling_team over ball
## 1 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 1
## 2 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 2
## 3 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 3
## 4 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 4
## 5 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 5
## 6 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 6
## batsman non_striker bowler is_super_over wide_runs bye_runs legbye_runs
## 1 DA Warner S Dhawan TS Mills 0 0 0 0
## 2 DA Warner S Dhawan TS Mills 0 0 0 0
## 3 DA Warner S Dhawan TS Mills 0 0 0 0
## 4 DA Warner S Dhawan TS Mills 0 0 0 0
## 5 DA Warner S Dhawan TS Mills 0 2 0 0
## 6 S Dhawan DA Warner TS Mills 0 0 0 0
## noball_runs penalty_runs batsman_runs extra_runs total_runs player_dismissed
## 1 0 0 0 0 0
## 2 0 0 0 0 0
## 3 0 0 4 0 4
## 4 0 0 0 0 0
## 5 0 0 0 2 2
## 6 0 0 0 0 0
## dismissal_kind fielder
## 1
## 2
## 3
## 4
## 5
## 6
Which team dominates at a given location (city)?
matches_data %>%
  filter(result != 'no result') %>% group_by(winner, city) %>%
  summarise(win = n()) %>% arrange(desc(win)) # n() counts the rows in each group
## `summarise()` regrouping output by 'winner' (override with `.groups` argument)
## # A tibble: 196 x 3
## # Groups: winner [14]
## winner city win
## <chr> <chr> <int>
## 1 Mumbai Indians Mumbai 45
## 2 Kolkata Knight Riders Kolkata 37
## 3 Chennai Super Kings Chennai 33
## 4 Royal Challengers Bangalore Bangalore 30
## 5 Rajasthan Royals Jaipur 24
## 6 Delhi Daredevils Delhi 23
## 7 Kings XI Punjab Chandigarh 22
## 8 Sunrisers Hyderabad Hyderabad 20
## 9 Mumbai Indians Kolkata 9
## 10 Chennai Super Kings Mumbai 8
## # ... with 186 more rows
Insight: Mumbai Indians dominate in Mumbai, with 45 wins there, the most by any team in a single city.
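The table above ranks every winner and city pair together, so answering "who dominates each city" still means scanning 196 rows. One possible refinement is to keep only the leading team per city. The sketch below assumes dplyr 1.0.0 or later (for `slice_max`) and uses a made-up `toy_matches` data frame in place of `matches_data`; the team names and counts are illustrative only.

```r
library(dplyr)

# Toy data standing in for matches_data (values are made up for illustration)
toy_matches <- data.frame(
  winner = c("Team A", "Team A", "Team B", "Team B", "Team B", "Team A", ""),
  city   = c("Mumbai", "Mumbai", "Mumbai", "Delhi", "Delhi", "Delhi", "Delhi"),
  result = c("normal", "normal", "normal", "normal", "normal", "normal", "no result"),
  stringsAsFactors = FALSE
)

top_by_city <- toy_matches %>%
  filter(result != "no result") %>%
  count(city, winner, name = "win") %>%          # wins per team in each city
  group_by(city) %>%
  slice_max(win, n = 1, with_ties = TRUE) %>%    # keep the leading team(s) per city
  ungroup() %>%
  arrange(desc(win))

print(top_by_city)
```

Swapping `toy_matches` for `matches_data` should give one row per city (more when teams tie), which makes the home-ground pattern easier to read.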
Who are the best IPL batsmen to date?
matches_data %>%
  filter(result != 'no result') %>% group_by(player_of_match) %>%
  summarise(win = n()) %>% arrange(desc(win)) # n() counts the rows in each group
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 201 x 2
## player_of_match win
## <chr> <int>
## 1 CH Gayle 18
## 2 YK Pathan 16
## 3 AB de Villiers 15
## 4 DA Warner 15
## 5 RG Sharma 14
## 6 SK Raina 14
## 7 G Gambhir 13
## 8 MS Dhoni 13
## 9 AM Rahane 12
## 10 MEK Hussey 12
## # ... with 191 more rows
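Insight: CH Gayle has won the player-of-the-match title the most times (18), so is ‘CH Gayle’ the best batsman? The award can also go to bowlers and all-rounders, so this count is only a rough proxy for batting performance.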
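A more direct way to compare batsmen is to aggregate the ball-by-ball data rather than match awards. The sketch below is one possible approach, not the original analysis: it uses the `batsman`, `batsman_runs` and `wide_runs` columns shown in `head(deliveries_data)`, runs on a made-up `toy_deliveries` data frame, and assumes that wides are the only deliveries that do not count as balls faced.

```r
library(dplyr)

# Toy ball-by-ball data standing in for deliveries_data (made-up values)
toy_deliveries <- data.frame(
  batsman      = c("Batter X", "Batter X", "Batter Y", "Batter Y", "Batter Y", "Batter X"),
  batsman_runs = c(4, 6, 1, 0, 2, 1),
  wide_runs    = c(0, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

batting_summary <- toy_deliveries %>%
  group_by(batsman) %>%
  summarise(
    runs        = sum(batsman_runs),
    balls_faced = sum(wide_runs == 0),        # assumption: wides are not balls faced
    strike_rate = 100 * runs / balls_faced,   # runs per 100 balls faced
    .groups = "drop"
  ) %>%
  arrange(desc(runs))

print(batting_summary)
```

Total runs and strike rate measure different things, so ranking on either one alone can still be misleading for players with very few innings.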
Who is the best IPL bowler to date?
deliveries_data %>% group_by(bowler) %>%
  summarise(total_runs = sum(total_runs)) %>% arrange(total_runs) # sum() adds up the runs conceded off each bowler
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 356 x 2
## bowler total_runs
## <chr> <int>
## 1 AC Gilchrist 0
## 2 N Rana 3
## 3 AM Rahane 5
## 4 SPD Smith 5
## 5 LA Carseldine 6
## 6 SS Mundhe 6
## 7 KS Williamson 7
## 8 Y Gnaneswara Rao 7
## 9 RS Gavaskar 8
## 10 SA Yadav 8
## # ... with 346 more rows
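Insight: Not available. Sorting by the fewest total runs conceded puts AC Gilchrist (0 runs) and other players with very low totals at the top, most likely because they bowled only a handful of deliveries, so this ranking alone does not identify the best bowler.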
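Total runs conceded rewards bowlers who have hardly bowled. A fairer comparison might use economy rate and wickets, restricted to bowlers with a minimum workload. The sketch below is an illustration under stated assumptions: `min_balls` is a hypothetical threshold, byes and leg-byes are not charged to the bowler, wides and no-balls are not counted as legal balls, and run outs, retirements and obstructing the field are not credited as bowler wickets. It runs on a made-up `toy_balls` data frame whose column names follow `deliveries_data`.

```r
library(dplyr)

# Toy ball-by-ball data standing in for deliveries_data (made-up values)
toy_balls <- data.frame(
  bowler         = c(rep("Bowler P", 6), "Bowler Q"),
  total_runs     = c(0, 4, 1, 0, 1, 0, 0),
  bye_runs       = c(0, 0, 0, 0, 0, 0, 0),
  legbye_runs    = c(0, 0, 1, 0, 0, 0, 0),
  wide_runs      = c(0, 0, 0, 0, 1, 0, 0),
  noball_runs    = 0,
  dismissal_kind = c("", "", "", "bowled", "", "run out", ""),
  stringsAsFactors = FALSE
)

min_balls <- 3  # hypothetical minimum workload; pick a larger value for real data
not_bowler_wickets <- c("run out", "retired hurt", "obstructing the field")

bowling_summary <- toy_balls %>%
  group_by(bowler) %>%
  summarise(
    runs_conceded = sum(total_runs - bye_runs - legbye_runs),
    legal_balls   = sum(wide_runs == 0 & noball_runs == 0),
    wickets       = sum(dismissal_kind != "" &
                          !(dismissal_kind %in% not_bowler_wickets)),
    .groups = "drop"
  ) %>%
  filter(legal_balls >= min_balls) %>%                # drop occasional bowlers
  mutate(economy = 6 * runs_conceded / legal_balls) %>%  # runs per six legal balls
  arrange(economy)

print(bowling_summary)
```

The comparison `dismissal_kind != ""` assumes blank dismissals are read as empty strings, as in the `head(deliveries_data)` output; if they arrive as `NA`, that condition would need adjusting.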
Who is the best allrounder in the IPL?