The tidyverse function I am going to look at is between() from dplyr.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
data(flights)
I chose between() because it is a more concise way to write x >= left & x <= right, which I tend to do a lot.
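To make the equivalence concrete, here is a minimal sketch on a small made-up vector (the values in x are hypothetical and not part of the flights data). It suggests that between() includes both endpoints and treats missing values the same way the longhand comparison does.

```r
library(dplyr)

# Hypothetical toy vector, chosen only for illustration
x <- c(5, 10, 15, 20, NA)

# Shortcut form: TRUE where 10 <= x <= 20, endpoints included
short_form <- between(x, 10, 20)

# Longhand form that between() stands in for
long_form <- x >= 10 & x <= 20

short_form
long_form

# The two results should match element by element, NA included
identical(short_form, long_form)
```

Note that the missing value comes back as NA rather than FALSE in both forms.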
Let’s test out the between() function using the flights data from nycflights13:
filter(flights, between(distance, 800, 1500))
## # A tibble: 100,171 × 19
## year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
## <int> <int> <int> <int> <int> <dbl> <int> <int>
## 1 2013 1 1 517 515 2 830 819
## 2 2013 1 1 533 529 4 850 830
## 3 2013 1 1 542 540 2 923 850
## 4 2013 1 1 555 600 -5 913 854
## 5 2013 1 1 557 600 -3 838 846
## 6 2013 1 1 558 600 -2 849 851
## 7 2013 1 1 558 600 -2 853 856
## 8 2013 1 1 559 600 -1 941 910
## 9 2013 1 1 600 600 0 851 858
## 10 2013 1 1 601 600 1 844 850
## # ℹ 100,161 more rows
## # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
## # hour <dbl>, minute <dbl>, time_hour <dttm>
Now we can easily see the flights with a distance between 800 and 1,500 miles (inclusive) without typing out distance >= 800 & distance <= 1500.
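As a quick check of that claim, the sketch below runs both versions of the filter and compares the results. The object names short_form and long_form are introduced here for illustration only.

```r
library(dplyr)
library(nycflights13)

# Filter with the between() shortcut
short_form <- filter(flights, between(distance, 800, 1500))

# Same filter written out with two comparisons
long_form <- filter(flights, distance >= 800 & distance <= 1500)

# Both approaches should keep the same number of rows
nrow(short_form)
nrow(long_form)

# And the same rows, in the same order
identical(short_form, long_form)
```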
Addition by Maxfield Raynolds:
In the interest of using between(), we can also invert it by putting the logical NOT operator (!) in front of it.
Using !between(), we can keep only the flights that are shorter than 800 miles or longer than 1,500 miles.
filter(flights, !between(distance, 800, 1500))
## # A tibble: 236,605 × 19
## year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
## <int> <int> <int> <int> <int> <dbl> <int> <int>
## 1 2013 1 1 544 545 -1 1004 1022
## 2 2013 1 1 554 600 -6 812 837
## 3 2013 1 1 554 558 -4 740 728
## 4 2013 1 1 557 600 -3 709 723
## 5 2013 1 1 558 600 -2 753 745
## 6 2013 1 1 558 600 -2 924 917
## 7 2013 1 1 558 600 -2 923 937
## 8 2013 1 1 559 559 0 702 706
## 9 2013 1 1 559 600 -1 854 902
## 10 2013 1 1 600 600 0 837 825
## # ℹ 236,595 more rows
## # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
## # hour <dbl>, minute <dbl>, time_hour <dttm>
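The negated filter can be sanity-checked in the same spirit. The sketch below compares !between() with its longhand equivalent and confirms that the "inside" and "outside" subsets together account for every row. The toy tibble at the end is hypothetical and is only there to show how a missing value behaves: filter() drops rows where the condition is NA, so a row with a missing distance would appear in neither subset.

```r
library(dplyr)
library(nycflights13)

inside  <- filter(flights, between(distance, 800, 1500))
outside <- filter(flights, !between(distance, 800, 1500))

# Longhand equivalent of the negated call
outside_long <- filter(flights, distance < 800 | distance > 1500)
nrow(outside) == nrow(outside_long)

# The two subsets together cover every row of flights
nrow(inside) + nrow(outside) == nrow(flights)

# Hypothetical toy data with one missing value
toy <- tibble(d = c(500, 1000, NA))

# The NA row is dropped by both filters
toy_inside  <- filter(toy, between(d, 800, 1500))
toy_outside <- filter(toy, !between(d, 800, 1500))
nrow(toy_inside) + nrow(toy_outside)
nrow(toy)
```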