1. Load the required libraries and datasets

library(tidyverse)
library(nycflights13)

2. dplyr basics

The five key dplyr verbs are:

  1. filter()
  2. arrange()
  3. select()
  4. mutate()
  5. summarise()

All of these can be used in conjunction with group_by(). For an index of every documented dplyr function, run help(package = "dplyr").

2.1 filter rows with filter() :

By default, filter() combines all the conditions you pass to it with a logical “and”: a row is returned only if every condition evaluates to TRUE. For any other combination, such as “or” or “not”, write the logical operators explicitly.

Logical and comparison operators:

  1. & (and)
  2. | (or)
  3. ! (not)
  4. == (equal to)
  5. != (not equal to)

# Flights that departed on January 1
jan1 <- filter(flights, month == 1, day == 1)
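# Flights in November or December, written two equivalent ways
filter(flights, month == 11 | month == 12)
nov_dec <- filter(flights, month %in% c(11, 12))
# Flights delayed by no more than two hours on both arrival and departure, written two equivalent ways
filter(flights, !(arr_delay > 120 | dep_delay > 120))
filter(flights, arr_delay <= 120, dep_delay <= 120)
# Flights to either Houston airport
filter(flights, dest == "IAH" | dest == "HOU")

# Flights that left exactly on time and arrived early or under two minutes late
filter(flights, arr_delay < 2 & dep_delay == 0)
# Flights that departed between 6:00 and midnight (dep_time is stored in HHMM format)
filter(flights, dep_time >= 600 & dep_time <= 2400)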
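
As a small sketch of the rules above, the toy data frame below (hypothetical values, not taken from flights) suggests three things: comma-separated conditions behave like &, !(x | y) returns the same rows as the two separate <= conditions, and rows where a condition evaluates to NA are dropped unless you ask for them with is.na():

```r
library(dplyr)

# Hypothetical toy data standing in for flights; the values are made up.
toy <- tibble(
  id        = 1:5,
  arr_delay = c(10, 130, NA, 90, 200),
  dep_delay = c(5, 20, 15, 150, NA)
)

# Comma-separated conditions are combined with "and" ...
by_comma <- filter(toy, arr_delay <= 120, dep_delay <= 120)
# ... so this is the same as writing & explicitly.
by_and <- filter(toy, arr_delay <= 120 & dep_delay <= 120)
# De Morgan's law: !(x | y) is equivalent to !x & !y.
by_not_or <- filter(toy, !(arr_delay > 120 | dep_delay > 120))

by_comma$id                              # 1
identical(by_comma$id, by_and$id)        # TRUE
identical(by_comma$id, by_not_or$id)     # TRUE

# A condition that evaluates to NA drops the row, so row 3 is missing above.
# To keep rows with a missing arr_delay, ask for them explicitly.
with_missing <- filter(toy, is.na(arr_delay) | arr_delay <= 120)
with_missing$id                          # 1 3 4
```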

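Also check the documentation for between() with ?between. between(x, left, right) is a shortcut for x >= left & x <= right, so the last filter above can also be written as filter(flights, between(dep_time, 600, 2400)).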
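
To see what between() does, here is a minimal sketch on a made-up column of departure times (the toy data frame is an assumption for illustration, not part of flights). Both bounds are inclusive, and missing values are dropped just as with the explicit comparison:

```r
library(dplyr)

# Hypothetical departure times in HHMM format; the values are made up.
toy <- tibble(dep_time = c(545, 600, 1230, 2400, NA))

# The explicit comparison and the between() shortcut keep the same rows.
explicit <- filter(toy, dep_time >= 600 & dep_time <= 2400)
shortcut <- filter(toy, between(dep_time, 600, 2400))

shortcut$dep_time                                 # 600 1230 2400
identical(explicit$dep_time, shortcut$dep_time)   # TRUE
```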

2.2 arrange() :

arrange() works similarly to filter() except that instead of selecting rows, it changes their order. It takes a data frame and a set of column names (or more complicated expressions) to order by. If you provide more than one column name, each additional column will be used to break ties in the values of preceding columns:

arrange(flights, year, month, day)
arrange(flights, desc(arr_delay))
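
The tie-breaking rule is easier to see on a few rows. The sketch below uses a made-up data frame (an assumption for illustration): month is sorted first, day only decides the order within a month, and missing values end up last whether the sort is ascending or descending:

```r
library(dplyr)

# Hypothetical toy data; the values are made up.
toy <- tibble(
  month = c(2, 1, 1, 2, 1),
  day   = c(1, 3, 1, 1, 2),
  delay = c(15, NA, 40, -5, 40)
)

# month is sorted first; day breaks ties within each month.
by_date <- arrange(toy, month, day)
by_date$month   # 1 1 1 2 2
by_date$day     # 1 2 3 1 1

# desc() sorts in descending order; the NA still goes to the end.
by_delay <- arrange(toy, desc(delay))
by_delay$delay  # 40 40 15 -5 NA
```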

2.3 select() :

It’s not uncommon to get datasets with hundreds or even thousands of variables. In this case, the first challenge is often narrowing in on the variables you’re actually interested in. select() allows you to rapidly zoom in on a useful subset using operations based on the names of the variables.

select() is not terribly useful with the flights data because we only have 19 variables, but you can still get the general idea:

# Select columns by name
select(flights, year, month, day)
# Select all columns between year and day (inclusive)
select(flights, year:day)
# Select all columns except those from year to day (inclusive)
select(flights, -(year:day))

  • There are a number of helper functions you can use within select():
    • starts_with("abc"): matches names that begin with "abc".
    • ends_with("xyz"): matches names that end with "xyz".
    • contains("ijk"): matches names that contain "ijk".
    • matches("(.)\\1"): selects variables whose names match a regular expression. This one matches any name that contains a repeated character.
    • num_range("x", 1:3): matches x1, x2 and x3.

See ?select for more details. Also check ?rename and ?everything.
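
The helpers are easiest to understand by looking at which column names they pick. The one-row data frame below is hypothetical and exists only for its column names:

```r
library(dplyr)

# Hypothetical one-row data frame; only the column names matter here.
toy <- tibble(
  id = 1, dep_delay = 2, arr_delay = 3, dep_time = 4,
  x1 = 5, x2 = 6, x3 = 7, carrier = "AA"
)

names(select(toy, starts_with("dep")))    # "dep_delay" "dep_time"
names(select(toy, ends_with("delay")))    # "dep_delay" "arr_delay"
names(select(toy, contains("time")))      # "dep_time"
# "(.)\\1" matches a doubled character, here the "rr" in both names.
names(select(toy, matches("(.)\\1")))     # "arr_delay" "carrier"
names(select(toy, num_range("x", 1:3)))   # "x1" "x2" "x3"

# rename() keeps every column and changes only the names you list.
renamed <- rename(toy, flight_id = id)
names(renamed)[1]                         # "flight_id"

# everything() selects all remaining columns, which is handy
# for moving a few columns to the front.
reordered <- select(toy, carrier, everything())
names(reordered)[1:2]                     # "carrier" "id"
```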

2.4 mutate() :

Besides selecting sets of existing columns, it’s often useful to add new columns that are functions of existing columns. That’s the job of mutate().

# Create a narrower dataset so the new columns are easy to see
flights_sml <- select(flights, 
  year:day, 
  ends_with("delay"), 
  distance, 
  air_time
)
# mutate() adds the new columns at the end of the dataset
mutate(flights_sml,
  gain = arr_delay - dep_delay,
  speed = distance / air_time * 60
)
# You can refer to columns that you have just created (gain and hours)
mutate(flights_sml,
  gain = arr_delay - dep_delay,
  hours = air_time / 60,
  gain_per_hour = gain / hours
)
# If you only want to keep the new variables, use transmute():
transmute(flights,
  gain = arr_delay - dep_delay,
  hours = air_time / 60,
  gain_per_hour = gain / hours
)
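
To check the arithmetic by hand, here is a minimal sketch on two made-up flights (the toy data frame and its values are assumptions for illustration). It shows the new columns landing after the existing ones, a column reusing columns created in the same call, and transmute() keeping only what it creates:

```r
library(dplyr)

# Hypothetical toy data; the values are made up.
toy <- tibble(
  distance  = c(300, 1200),   # miles
  air_time  = c(60, 180),     # minutes
  dep_delay = c(10, -5),
  arr_delay = c(4, 15)
)

added <- mutate(toy,
  gain          = arr_delay - dep_delay,
  hours         = air_time / 60,
  gain_per_hour = gain / hours   # uses the two columns created just above
)
names(added)   # the four original columns, then gain, hours, gain_per_hour
added$gain     # -6 20
added$hours    # 1 3

# transmute() returns only the columns it creates.
only_new <- transmute(toy, speed = distance / air_time * 60)
names(only_new)   # "speed"
```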

2.5 summarise() :

The last key verb is summarise(). It collapses a data frame to a single row:

summarise(flights, delay = mean(dep_delay, na.rm = TRUE))

summarise() is not terribly useful unless we pair it with group_by(). This changes the unit of analysis from the complete dataset to individual groups. Then, when you use the dplyr verbs on a grouped data frame, they’ll be automatically applied “by group”. For example, if we apply exactly the same code to a data frame grouped by date, we get the average delay per date:

by_day <- group_by(flights, year, month, day)
summarise(by_day, delay = mean(dep_delay, na.rm = TRUE))
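
Grouping affects the other verbs too, not only summarise(). As a rough sketch on made-up data (the toy data frame is an assumption for illustration), a grouped mutate() computes the mean within each carrier, and a grouped filter() keeps the largest delay of each carrier:

```r
library(dplyr)

# Hypothetical toy data; the values are made up.
toy <- tibble(
  carrier = c("AA", "AA", "AA", "UA", "UA"),
  delay   = c(10, 30, 20, 5, 45)
)
by_carrier <- group_by(toy, carrier)

# Grouped mutate(): mean(delay) is 20 for AA and 25 for UA.
centered <- mutate(by_carrier, diff_from_mean = delay - mean(delay))
centered$diff_from_mean   # -10 10 0 -20 20

# Grouped filter(): max(delay) is evaluated separately for each carrier.
worst <- filter(by_carrier, delay == max(delay))
worst$delay               # 30 45

# ungroup() removes the grouping when you no longer need it.
worst_plain <- ungroup(worst)
is_grouped_df(worst_plain)   # FALSE
```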

This is not the only thing summarise() can do; check ?summarise for more.
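
For instance, a single summarise() call can compute several summaries per group. The sketch below uses made-up data (the toy data frame and the output column names are assumptions for illustration) and counts rows with n() alongside the mean and the maximum:

```r
library(dplyr)

# Hypothetical toy data; the values are made up.
toy <- tibble(
  carrier = c("AA", "AA", "AA", "UA", "UA"),
  delay   = c(10, NA, 20, 5, 45)
)

summary_tbl <- summarise(
  group_by(toy, carrier),
  flights    = n(),                       # rows in the group
  n_missing  = sum(is.na(delay)),         # rows with a missing delay
  mean_delay = mean(delay, na.rm = TRUE),
  max_delay  = max(delay, na.rm = TRUE)
)

summary_tbl$carrier      # "AA" "UA"
summary_tbl$flights      # 3 2
summary_tbl$n_missing    # 1 0
summary_tbl$mean_delay   # 15 25
summary_tbl$max_delay    # 20 45
```

Counting rows and missing values next to a mean is a useful habit: it tells you how many observations each average is based on.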

For exhaustive documentation of dplyr, visit:

https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

https://cran.r-project.org/web/packages/dplyr/dplyr.pdf
