lubridate

Here I am going to learn about data wrangling. Data wrangling is the work of getting the data I have into a form that is usable for visualization and modelling. In this part I will learn about date and time variables using lubridate.

Note: I was too lazy to write and type everything in my own words, so much of the text below is copied.

library(tidyverse)
library(lubridate)
library(nycflights13)

Creating Date/Times

today()
## [1] "2021-02-04"
now()
## [1] "2021-02-04 01:43:31 +07"

From Strings

ymd("2017-01-31")
## [1] "2017-01-31"
mdy("January 31st, 2017")
## [1] "2017-01-31"
dmy("31-Jan-2017")
## [1] "2017-01-31"
ymd(20170131)
## [1] "2017-01-31"
ymd_hms("2017-01-31 20:11:59")
## [1] "2017-01-31 20:11:59 UTC"
mdy_hm("01/31/2017 08:01")
## [1] "2017-01-31 08:01:00 UTC"
ymd(20170131, tz = "UTC")
## [1] "2017-01-31 UTC"
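
As a small check of how these parsers behave, here is a sketch of my own (not part of the copied material). It assumes only that lubridate is installed; the variable names are made up for the illustration. The letters in the function name give the order of the components in the input, and a string that cannot be parsed becomes NA.

```r
library(lubridate)

# The same calendar date written in three different component orders.
a <- ymd("2017-01-31")
b <- mdy("01/31/2017")
c_date <- dmy("31.01.2017")

# All three parse to the same Date value.
a == b && b == c_date

# A string that is not a date cannot be parsed; the result is NA
# (lubridate also emits a warning, suppressed here).
bad <- suppressWarnings(ymd("not a date"))
is.na(bad)
```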

From Individual Components

Instead of a single string, sometimes you’ll have the individual components of the date-time spread across multiple columns. This is what we have in the flights data:

flights %>%
  select(year, month, day, hour, minute)
## # A tibble: 336,776 x 5
##     year month   day  hour minute
##    <int> <int> <int> <dbl>  <dbl>
##  1  2013     1     1     5     15
##  2  2013     1     1     5     29
##  3  2013     1     1     5     40
##  4  2013     1     1     5     45
##  5  2013     1     1     6      0
##  6  2013     1     1     5     58
##  7  2013     1     1     6      0
##  8  2013     1     1     6      0
##  9  2013     1     1     6      0
## 10  2013     1     1     6      0
## # ... with 336,766 more rows
flights
## # A tibble: 336,776 x 19
##     year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
##    <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
##  1  2013     1     1      517            515         2      830            819
##  2  2013     1     1      533            529         4      850            830
##  3  2013     1     1      542            540         2      923            850
##  4  2013     1     1      544            545        -1     1004           1022
##  5  2013     1     1      554            600        -6      812            837
##  6  2013     1     1      554            558        -4      740            728
##  7  2013     1     1      555            600        -5      913            854
##  8  2013     1     1      557            600        -3      709            723
##  9  2013     1     1      557            600        -3      838            846
## 10  2013     1     1      558            600        -2      753            745
## # ... with 336,766 more rows, and 11 more variables: arr_delay <dbl>,
## #   carrier <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
## #   air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>

To create a date/time from this sort of input, use make_date() for dates, or make_datetime() for date-times:

flights %>%
  select(year, month, day, hour, minute) %>%
  mutate(
    departure = make_datetime(year, month, day, hour, minute)
  )
## # A tibble: 336,776 x 6
##     year month   day  hour minute departure          
##    <int> <int> <int> <dbl>  <dbl> <dttm>             
##  1  2013     1     1     5     15 2013-01-01 05:15:00
##  2  2013     1     1     5     29 2013-01-01 05:29:00
##  3  2013     1     1     5     40 2013-01-01 05:40:00
##  4  2013     1     1     5     45 2013-01-01 05:45:00
##  5  2013     1     1     6      0 2013-01-01 06:00:00
##  6  2013     1     1     5     58 2013-01-01 05:58:00
##  7  2013     1     1     6      0 2013-01-01 06:00:00
##  8  2013     1     1     6      0 2013-01-01 06:00:00
##  9  2013     1     1     6      0 2013-01-01 06:00:00
## 10  2013     1     1     6      0 2013-01-01 06:00:00
## # ... with 336,766 more rows

The times in the flights dataset are represented in a slightly odd format (for example, 517 means 05:17), so we use modulus arithmetic to pull out the hour and minute components. Once I’ve created the date-time variables, I keep only the columns used in the rest of this section:

make_datetime_100 <- function(year, month, day, time) {
  make_datetime(year, month, day, time %/% 100, time %% 100)
}
flights_dt <- flights %>%
  filter(!is.na(dep_time), !is.na(arr_time)) %>%
  mutate(
    dep_time = make_datetime_100(year, month, day, dep_time),
    arr_time = make_datetime_100(year, month, day, arr_time),
    sched_dep_time = make_datetime_100(
      year, month, day, sched_dep_time
    ),
    sched_arr_time = make_datetime_100(
      year, month, day, sched_arr_time
    )
  ) %>%
  select(origin, dest, ends_with("delay"), ends_with("time"))
flights_dt
## # A tibble: 328,063 x 9
##    origin dest  dep_delay arr_delay dep_time            sched_dep_time     
##    <chr>  <chr>     <dbl>     <dbl> <dttm>              <dttm>             
##  1 EWR    IAH           2        11 2013-01-01 05:17:00 2013-01-01 05:15:00
##  2 LGA    IAH           4        20 2013-01-01 05:33:00 2013-01-01 05:29:00
##  3 JFK    MIA           2        33 2013-01-01 05:42:00 2013-01-01 05:40:00
##  4 JFK    BQN          -1       -18 2013-01-01 05:44:00 2013-01-01 05:45:00
##  5 LGA    ATL          -6       -25 2013-01-01 05:54:00 2013-01-01 06:00:00
##  6 EWR    ORD          -4        12 2013-01-01 05:54:00 2013-01-01 05:58:00
##  7 EWR    FLL          -5        19 2013-01-01 05:55:00 2013-01-01 06:00:00
##  8 LGA    IAD          -3       -14 2013-01-01 05:57:00 2013-01-01 06:00:00
##  9 JFK    MCO          -3        -8 2013-01-01 05:57:00 2013-01-01 06:00:00
## 10 LGA    ORD          -2         8 2013-01-01 05:58:00 2013-01-01 06:00:00
## # ... with 328,053 more rows, and 3 more variables: arr_time <dttm>,
## #   sched_arr_time <dttm>, air_time <dbl>
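
To make the modulus arithmetic concrete, here is a minimal sketch on three sample values of my own choosing (517, 1004 and 2359 are just illustrative clock times in the same integer format as dep_time). Integer division by 100 keeps the hours and the remainder keeps the minutes.

```r
library(lubridate)

# Clock times stored as integers, e.g. 517 means 05:17.
time <- c(517L, 1004L, 2359L)

hour_part <- time %/% 100    # integer division keeps the hours: 5, 10, 23
minute_part <- time %% 100   # remainder keeps the minutes: 17, 4, 59

# Combine with a fixed date to get proper date-times (UTC by default).
res <- make_datetime(2013, 1, 1, hour_part, minute_part)
res
```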

In my opinion, using make_datetime_100() this way is not quite right for dep_time: for some flights, the scheduled time plus the delay goes past 24:00, in other words the flight actually left on the following day. Take the first row of the output below: it was scheduled for 09:00 and delayed by 1301 minutes (21 hours 41 minutes), so it really departed at 06:41 on the next day, yet dep_time still carries the scheduled date.

flights_dt%>%
  arrange(desc(dep_delay))
## # A tibble: 328,063 x 9
##    origin dest  dep_delay arr_delay dep_time            sched_dep_time     
##    <chr>  <chr>     <dbl>     <dbl> <dttm>              <dttm>             
##  1 JFK    HNL        1301      1272 2013-01-09 06:41:00 2013-01-09 09:00:00
##  2 JFK    CMH        1137      1127 2013-06-15 14:32:00 2013-06-15 19:35:00
##  3 EWR    ORD        1126      1109 2013-01-10 11:21:00 2013-01-10 16:35:00
##  4 JFK    SFO        1014      1007 2013-09-20 11:39:00 2013-09-20 18:45:00
##  5 JFK    CVG        1005       989 2013-07-22 08:45:00 2013-07-22 16:00:00
##  6 JFK    TPA         960       931 2013-04-10 11:00:00 2013-04-10 19:00:00
##  7 LGA    MSP         911       915 2013-03-17 23:21:00 2013-03-17 08:10:00
##  8 JFK    PDX         899       850 2013-06-27 09:59:00 2013-06-27 19:00:00
##  9 LGA    ATL         898       895 2013-07-22 22:57:00 2013-07-22 07:59:00
## 10 EWR    MIA         896       878 2013-12-05 07:56:00 2013-12-05 17:00:00
## # ... with 328,053 more rows, and 3 more variables: arr_time <dttm>,
## #   sched_arr_time <dttm>, air_time <dbl>
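
One possible way around this, as a sketch of my own rather than anything from the copied material, is to derive the actual departure from the scheduled departure plus the delay, so the date rolls over by itself. The values below are taken from the first row above; the variable name actual_dep_time is made up for the example.

```r
library(lubridate)

# Scheduled departure and delay of the most delayed flight shown above.
sched_dep_time <- ymd_hm("2013-01-09 09:00")
dep_delay <- 1301  # minutes

# Adding the delay as a period lets the date roll over to the next day.
actual_dep_time <- sched_dep_time + minutes(dep_delay)
actual_dep_time
## [1] "2013-01-10 06:41:00 UTC"
```

On the full data the same idea would be a mutate() with sched_dep_time + minutes(dep_delay), but I have not checked every row that way.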

With that data, I can visualize the distribution of departure times across the year:

flights_dt %>%
 ggplot(aes(dep_time)) +
  geom_freqpoly(binwidth = 86400) # 86400 seconds = 1 day

Or within a single day (the number of flights in each 10-minute bin over one day):

flights_dt %>%
  # compare against a date-time rather than a Date so both sides have the same class
  filter(dep_time < ymd(20130102, tz = "UTC")) %>%
  ggplot(aes(dep_time)) +
  geom_freqpoly(binwidth = 600) # 600 s = 10 minutes


From Other Types

as_datetime(today())
## [1] "2021-02-04 UTC"
as_date(now())
## [1] "2021-02-04"

Sometimes you’ll get date/times as numeric offsets from the “Unix Epoch,” 1970-01-01. If the offset is in seconds, use as_datetime(); if it’s in days, use as_date():

as_datetime(60 * 60 * 10)
## [1] "1970-01-01 10:00:00 UTC"
as_date(365 * 10 + 2)
## [1] "1980-01-01"

Date-Time Components

Getting Components

You can pull out individual parts of the date with the accessor functions year(), month(), mday() (day of the month), yday() (day of the year), wday() (day of the week), hour(), minute(), and second():

datetime <- ymd_hms("2016-07-08 12:34:56")
datetime
## [1] "2016-07-08 12:34:56 UTC"
year(datetime)
## [1] 2016
month(datetime)
## [1] 7
mday(datetime)
## [1] 8
yday(datetime)
## [1] 190
wday(datetime)
## [1] 6

For month() and wday() you can set label = TRUE to return the abbreviated name of the month or day of the week. Set abbr = FALSE to return the full name:

month(datetime, label = TRUE)
## [1] Jul
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
wday(datetime, label = TRUE, abbr = FALSE)
## [1] Friday
## 7 Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < ... < Saturday

We can use wday() to see how the number of departures varies by day of the week:

flights_dt %>%
  mutate(wday = wday(dep_time, label = TRUE)) %>%
  ggplot(aes(x = wday)) +
  geom_bar() +
  ylab("number of departures") +
  labs(title = "Number of Departures by Day of the Week") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("day")


There’s an interesting pattern if we look at the average departure delay by minute within the hour (a plot of the average departure delay against the minute of the hour at which the flight actually left):

flights_dt %>%
  mutate(minute = minute(dep_time)) %>%
  group_by(minute) %>%
  summarize(
    # use dep_delay so the plot matches the "departure delay" described in the text
    avg_delay = mean(dep_delay, na.rm = TRUE),
    n = n()) %>%
  ggplot(aes(minute, avg_delay)) +
  geom_line() +
  geom_point() +
  ylab("average delay (minutes)") +
  labs(title = "Average departure delay by minute within the hour") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("departure minute")
## `summarise()` ungrouping output (override with `.groups` argument)


Interestingly, if we look at the scheduled departure time we don’t see such a strong pattern:

sched_dep <- flights_dt %>%
  mutate(minute = minute(sched_dep_time)) %>%
  group_by(minute) %>%
  summarize(
    # departure delay, grouped by the scheduled minute this time
    avg_delay = mean(dep_delay, na.rm = TRUE),
    n = n())
## `summarise()` ungrouping output (override with `.groups` argument)
ggplot(sched_dep, aes(minute, avg_delay)) +
  geom_line() +
  geom_point() +
  ylab("average delay (minutes)") +
  xlab("scheduled departure minute")

So why do we see that pattern with the actual departure times? Well, like much data collected by humans, there’s a strong bias toward flights leaving at “nice” departure times. Always be alert for this sort of pattern whenever you work with data that involves human judgment!

ggplot(sched_dep, aes(minute, n)) +
  geom_line()+
  geom_point()

The actual departure times, by contrast, look like this:

flights_dt %>%
  mutate(minute = minute(dep_time)) %>%
  group_by(minute) %>%
  summarize(
    avg_delay = mean(arr_delay, na.rm = TRUE),
    n = n()) %>%
  ggplot(aes(minute, n)) +
  geom_line() +
  geom_point()
## `summarise()` ungrouping output (override with `.groups` argument)


Rounding

An alternative approach to plotting individual components is to round the date to a nearby unit of time, with floor_date(), round_date(), and ceiling_date().

Examples of round_date(), floor_date(), and ceiling_date():

x <- ymd_hms("2020-08-11 12:01:59.23")
round_date(x, "second")
## [1] "2020-08-11 12:01:59 UTC"
round_date(x, "minute")
## [1] "2020-08-11 12:02:00 UTC"
round_date(x, "5 mins")
## [1] "2020-08-11 12:00:00 UTC"
round_date(x, "hour")
## [1] "2020-08-11 12:00:00 UTC"
round_date(x, "2 hours")
## [1] "2020-08-11 12:00:00 UTC"
round_date(x, "day")
## [1] "2020-08-12 UTC"
round_date(x, "week")
## [1] "2020-08-09 UTC"
round_date(x, "month")
## [1] "2020-08-01 UTC"
round_date(x, "bimonth")
## [1] "2020-09-01 UTC"
round_date(x, "quarter") #== round_date(x, "3 months")
## [1] "2020-07-01 UTC"
round_date(x, "halfyear")
## [1] "2020-07-01 UTC"
round_date(x, "year")
## [1] "2021-01-01 UTC"

floor_date(x, "second")
## [1] "2020-08-11 12:01:59 UTC"
floor_date(x, "minute")
## [1] "2020-08-11 12:01:00 UTC"
floor_date(x, "hour")
## [1] "2020-08-11 12:00:00 UTC"
floor_date(x, "day")
## [1] "2020-08-11 UTC"
floor_date(x, "week")
## [1] "2020-08-09 UTC"
floor_date(x, "month")
## [1] "2020-08-01 UTC"
floor_date(x, "bimonth")
## [1] "2020-07-01 UTC"
floor_date(x, "quarter")
## [1] "2020-07-01 UTC"
floor_date(x, "season")
## [1] "2020-06-01 UTC"
floor_date(x, "halfyear")
## [1] "2020-07-01 UTC"
floor_date(x, "year")
## [1] "2020-01-01 UTC"

ceiling_date(x, "second")
## [1] "2020-08-11 12:02:00 UTC"
ceiling_date(x, "minute")
## [1] "2020-08-11 12:02:00 UTC"
ceiling_date(x, "5 mins")
## [1] "2020-08-11 12:05:00 UTC"
ceiling_date(x, "hour")
## [1] "2020-08-11 13:00:00 UTC"
ceiling_date(x, "day")
## [1] "2020-08-12 UTC"
ceiling_date(x, "week")
## [1] "2020-08-16 UTC"
ceiling_date(x, "month")
## [1] "2020-09-01 UTC"
ceiling_date(x, "bimonth") == ceiling_date(x, "2 months")
## [1] TRUE
ceiling_date(x, "quarter")
## [1] "2020-10-01 UTC"
ceiling_date(x, "season")
## [1] "2020-09-01 UTC"
ceiling_date(x, "halfyear")
## [1] "2021-01-01 UTC"
ceiling_date(x, "year")
## [1] "2021-01-01 UTC"

This, for example, allows us to plot the number of flights per week:

flights_dt %>%
  count(week = floor_date(dep_time, "week")) %>%
  ggplot(aes(week, n)) +
  geom_line()+
  geom_point()


Setting Components

You can also use each accessor function to set the components of a date/time:

(datetime <- ymd_hms("2016-07-08 12:34:56"))
## [1] "2016-07-08 12:34:56 UTC"
# change the year
year(datetime) <- 2020
datetime
## [1] "2020-07-08 12:34:56 UTC"
# change the month
month(datetime) <- 01
datetime
## [1] "2020-01-08 12:34:56 UTC"
# change the hour
hour(datetime) <- hour(datetime) + 1
datetime
## [1] "2020-01-08 13:34:56 UTC"

Alternatively, rather than modifying in place, you can create a new date-time with update(). This also allows you to set multiple values at once:

update(datetime, year = 2025, month = 3, mday = 2, hour = 13)
## [1] "2025-03-02 13:34:56 UTC"

This also works; values that are too large roll over into the next unit:

ymd("2020-02-01") %>%
  update(mday = 30)
## [1] "2020-03-01"
ymd("2020-02-01") %>%
  update(hour = 400)
## [1] "2020-02-17 16:00:00 UTC"
ymd("2020-02-10") %>%
  update(yday = 1)
## [1] "2020-01-01"

Time Spans

Durations

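In R, when you subtract two dates, you get a difftime object. as.duration() converts it to a lubridate duration, which always measures time in seconds, and the d*() constructors build durations directly: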

h_age <- today() - ymd(19971028)
h_age
## Time difference of 8500 days
as.duration(h_age)
## [1] "734400000s (~23.27 years)"
dseconds(15)
## [1] "15s"
dminutes(10)
## [1] "600s (~10 minutes)"
dhours(c(12, 24))
## [1] "43200s (~12 hours)" "86400s (~1 days)"
ddays(0:7)
## [1] "0s"                 "86400s (~1 days)"   "172800s (~2 days)" 
## [4] "259200s (~3 days)"  "345600s (~4 days)"  "432000s (~5 days)" 
## [7] "518400s (~6 days)"  "604800s (~1 weeks)"
dweeks(3)
## [1] "1814400s (~3 weeks)"
dyears(1)
## [1] "31557600s (~1 years)"
2 * dyears(1)
## [1] "63115200s (~2 years)"
dyears(1) + dweeks(12) + dhours(15)
## [1] "38869200s (~1.23 years)"
tomorrow <- today() + ddays(1)
tomorrow
## [1] "2021-02-05"
last_year <- today() - dyears(1)
last_year
## [1] "2020-02-04 18:00:00 UTC"
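
The 18:00:00 in the last result looks odd at first. A small sketch of my own, using a fixed date instead of today() so it is reproducible, shows where it comes from: a duration year is 365.25 days' worth of seconds, so subtracting it from a date also shifts the clock by a quarter of a day.

```r
library(lubridate)

# A duration year expressed in days: 31557600 seconds / 86400.
days_in_dyear <- as.numeric(dyears(1)) / 86400
days_in_dyear

# Fixed date so the result does not depend on when this is run.
d <- ymd("2021-02-04")

# Duration: exactly 365.25 days earlier, which lands at 18:00 on 2020-02-04.
by_duration <- d - dyears(1)
by_duration

# Period: same calendar date one year earlier.
by_period <- d - years(1)
by_period
```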

However, because durations represent an exact number of seconds, sometimes you might get an unexpected result:

one_pm <- ymd_hms(
  "2016-03-12 13:00:00",
  tz = "America/New_York"
)
one_pm
## [1] "2016-03-12 13:00:00 EST"
one_pm + ddays(1)
## [1] "2016-03-13 14:00:00 EDT"

Why is one day after 1 p.m. on March 12, 2 p.m. on March 13? If you look carefully at the date you might also notice that the time zone has changed from EST to EDT. Because of DST, the clocks jumped forward one hour during the night, so March 13 only has 23 hours, and if we add a full day’s worth of seconds we end up with a different clock time.

(I still don’t fully understand this DST business and why that day only has 23 hours.)
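
One way to convince myself is to measure the day directly. In 2016, daylight saving time in the US started on Sunday, March 13, when clocks in New York jumped from 02:00 to 03:00. The sketch below (my own, with made-up variable names) counts the hours between two consecutive midnights.

```r
library(lubridate)

# Midnight at the start and at the end of 13 March 2016 in New York.
day_start <- ymd_hms("2016-03-13 00:00:00", tz = "America/New_York")
day_end   <- ymd_hms("2016-03-14 00:00:00", tz = "America/New_York")

# The hour from 02:00 to 03:00 never happens, so the day is one hour short.
hours_in_day <- as.numeric(difftime(day_end, day_start, units = "hours"))
hours_in_day
## [1] 23
```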


Periods

To solve this problem, lubridate provides periods. Periods are time spans but don’t have a fixed length in seconds; instead they work with “human” times, like days and months. That allows them to work in a more intuitive way:

one_pm
## [1] "2016-03-12 13:00:00 EST"
one_pm + days(1)
## [1] "2016-03-13 13:00:00 EDT"
seconds(15)
## [1] "15S"
minutes(10)
## [1] "10M 0S"
hours(c(12, 24))
## [1] "12H 0M 0S" "24H 0M 0S"
days(7)
## [1] "7d 0H 0M 0S"
months(1:6)
## [1] "1m 0d 0H 0M 0S" "2m 0d 0H 0M 0S" "3m 0d 0H 0M 0S" "4m 0d 0H 0M 0S"
## [5] "5m 0d 0H 0M 0S" "6m 0d 0H 0M 0S"
weeks(3)
## [1] "21d 0H 0M 0S"
years(1)
## [1] "1y 0m 0d 0H 0M 0S"
10 * (months(6) + days(1))
## [1] "60m 10d 0H 0M 0S"
days(50) + hours(25) + minutes(2)
## [1] "50d 25H 2M 0S"
ymd("2020-01-01") + dyears(1)
## [1] "2020-12-31 06:00:00 UTC"
ymd("2020-01-01") + years(1)
## [1] "2021-01-01"
one_pm + ddays(1)
## [1] "2016-03-13 14:00:00 EDT"
one_pm + days(1)
## [1] "2016-03-13 13:00:00 EDT"

Let’s use periods to fix an oddity related to our flight dates. Some planes appear to have arrived at their destination before they departed from New York City:

flights_dt %>%
  filter(arr_time < dep_time)
## # A tibble: 10,633 x 9
##    origin dest  dep_delay arr_delay dep_time            sched_dep_time     
##    <chr>  <chr>     <dbl>     <dbl> <dttm>              <dttm>             
##  1 EWR    BQN           9        -4 2013-01-01 19:29:00 2013-01-01 19:20:00
##  2 JFK    DFW          59        NA 2013-01-01 19:39:00 2013-01-01 18:40:00
##  3 EWR    TPA          -2         9 2013-01-01 20:58:00 2013-01-01 21:00:00
##  4 EWR    SJU          -6       -12 2013-01-01 21:02:00 2013-01-01 21:08:00
##  5 EWR    SFO          11       -14 2013-01-01 21:08:00 2013-01-01 20:57:00
##  6 LGA    FLL         -10        -2 2013-01-01 21:20:00 2013-01-01 21:30:00
##  7 EWR    MCO          41        43 2013-01-01 21:21:00 2013-01-01 20:40:00
##  8 JFK    LAX          -7       -24 2013-01-01 21:28:00 2013-01-01 21:35:00
##  9 EWR    FLL          49        28 2013-01-01 21:34:00 2013-01-01 20:45:00
## 10 EWR    FLL          -9       -14 2013-01-01 21:36:00 2013-01-01 21:45:00
## # ... with 10,623 more rows, and 3 more variables: arr_time <dttm>,
## #   sched_arr_time <dttm>, air_time <dbl>

These are overnight flights. We used the same date information for both the departure and the arrival times, but these flights arrived on the following day. We can fix this by adding days(1) to the arrival time of each overnight flight:

flights_dt1 <- flights_dt %>%
  mutate(
    overnight = arr_time < dep_time,
    arr_time = arr_time + days(overnight * 1),
    sched_arr_time = sched_arr_time + days(overnight * 1)
  )

flights_dt1
## # A tibble: 328,063 x 10
##    origin dest  dep_delay arr_delay dep_time            sched_dep_time     
##    <chr>  <chr>     <dbl>     <dbl> <dttm>              <dttm>             
##  1 EWR    IAH           2        11 2013-01-01 05:17:00 2013-01-01 05:15:00
##  2 LGA    IAH           4        20 2013-01-01 05:33:00 2013-01-01 05:29:00
##  3 JFK    MIA           2        33 2013-01-01 05:42:00 2013-01-01 05:40:00
##  4 JFK    BQN          -1       -18 2013-01-01 05:44:00 2013-01-01 05:45:00
##  5 LGA    ATL          -6       -25 2013-01-01 05:54:00 2013-01-01 06:00:00
##  6 EWR    ORD          -4        12 2013-01-01 05:54:00 2013-01-01 05:58:00
##  7 EWR    FLL          -5        19 2013-01-01 05:55:00 2013-01-01 06:00:00
##  8 LGA    IAD          -3       -14 2013-01-01 05:57:00 2013-01-01 06:00:00
##  9 JFK    MCO          -3        -8 2013-01-01 05:57:00 2013-01-01 06:00:00
## 10 LGA    ORD          -2         8 2013-01-01 05:58:00 2013-01-01 06:00:00
## # ... with 328,053 more rows, and 4 more variables: arr_time <dttm>,
## #   sched_arr_time <dttm>, air_time <dbl>, overnight <lgl>
flights_dt1 %>%
  filter(overnight, arr_time < dep_time)
## # A tibble: 0 x 10
## # ... with 10 variables: origin <chr>, dest <chr>, dep_delay <dbl>,
## #   arr_delay <dbl>, dep_time <dttm>, sched_dep_time <dttm>, arr_time <dttm>,
## #   sched_arr_time <dttm>, air_time <dbl>, overnight <lgl>
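
The days(overnight * 1) trick took me a moment, so here is a tiny sketch of it. The two departure/arrival pairs are hypothetical values I made up, not rows from flights. Multiplying a logical by 1 turns TRUE/FALSE into 1/0, so only overnight rows get one day added.

```r
library(lubridate)

# Hypothetical departure/arrival pairs, both built with the departure date.
dep_time <- ymd_hm(c("2013-01-01 19:29", "2013-01-01 05:17"))
arr_time <- ymd_hm(c("2013-01-01 00:03", "2013-01-01 08:30"))

overnight <- arr_time < dep_time   # TRUE for the first pair only
overnight * 1                      # 1 0

# Add one day where overnight is TRUE and zero days otherwise.
arr_fixed <- arr_time + days(overnight * 1)
arr_fixed
```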

Intervals

I don’t fully understand this part, especially what the “%--%” operator is.

years(1) / days(1)
## [1] 365.25
next_year <- today() + years(1)
(today() %--% next_year) / ddays(1)
## [1] 365
(today() %--% next_year) %/% days(1)
## [1] 365
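
As far as I can tell, %--% is not a pipe: it builds an interval, which is a time span anchored to a specific start and end. Because the endpoints are known, its length is exact, whereas years(1) / days(1) can only give the 365.25 average. The sketch below is my own and uses a leap year to make the difference visible; the names from, to and span are made up.

```r
library(lubridate)

from <- ymd("2020-01-01")
to <- ymd("2021-01-01")

# %--% creates an interval between the two dates.
span <- from %--% to

# Dividing by a one-day duration gives the exact number of days.
# 2020 is a leap year, so the answer is 366 rather than 365.
n_days <- span / ddays(1)
n_days
## [1] 366
```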

Time Zones

You can find out what R thinks your current time zone is with Sys.timezone():

Sys.timezone()
## [1] "Asia/Bangkok"

And you can see the complete list of all time zone names with OlsonNames():

length(OlsonNames())
## [1] 594
head(OlsonNames())
## [1] "Africa/Abidjan"     "Africa/Accra"       "Africa/Addis_Ababa"
## [4] "Africa/Algiers"     "Africa/Asmara"      "Africa/Asmera"

In R, the time zone is an attribute of the date-time that only controls printing. For example, these three objects represent the same instant in time:

(x1 <- ymd_hms("2015-06-01 12:00:00", tz = "America/New_York"))
## [1] "2015-06-01 12:00:00 EDT"
(x2 <- ymd_hms("2015-06-01 18:00:00", tz = "Europe/Copenhagen"))
## [1] "2015-06-01 18:00:00 CEST"
(x3 <- ymd_hms("2015-06-02 04:00:00", tz = "Pacific/Auckland"))
## [1] "2015-06-02 04:00:00 NZST"

You can verify that they’re the same time using subtraction:

x1 - x2
## Time difference of 0 secs
x1 - x3
## Time difference of 0 secs

Operations that combine date-times, like c(), will often drop the time zone of some of the inputs. In the output below, the combined vector is displayed entirely in the time zone of its first element (America/New_York):

x4 <- c(x1, x2, x3)
x4
## [1] "2015-06-01 12:00:00 EDT" "2015-06-01 12:00:00 EDT"
## [3] "2015-06-01 12:00:00 EDT"

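You can change the time zone in two ways. with_tz() keeps the instant in time and only changes how it is displayed, while force_tz() keeps the clock reading and changes the underlying instant: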

x4a <- with_tz(x4, tzone = "Australia/Lord_Howe")
x4a
## [1] "2015-06-02 02:30:00 +1030" "2015-06-02 02:30:00 +1030"
## [3] "2015-06-02 02:30:00 +1030"
x4a-x4
## Time differences in secs
## [1] 0 0 0
x4b <- force_tz(x4, tzone = "Australia/Lord_Howe")
x4b
## [1] "2015-06-01 12:00:00 +1030" "2015-06-01 12:00:00 +1030"
## [3] "2015-06-01 12:00:00 +1030"
x4b-x4
## Time differences in hours
## [1] -14.5 -14.5 -14.5
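
The same difference on a single value, as a sketch of my own (the variable names are made up). New York is four hours behind UTC in June, so with_tz() shows 16:00 UTC for the same instant, while force_tz() relabels 12:00 as UTC and ends up four hours earlier.

```r
library(lubridate)

noon_ny <- ymd_hms("2015-06-01 12:00:00", tz = "America/New_York")

# with_tz(): same instant, displayed on a different clock.
shown_utc <- with_tz(noon_ny, tzone = "UTC")
shown_utc
shown_utc - noon_ny   # zero difference

# force_tz(): same clock reading, but a different instant.
forced_utc <- force_tz(noon_ny, tzone = "UTC")
forced_utc
shift_hours <- as.numeric(difftime(forced_utc, noon_ny, units = "hours"))
shift_hours
## [1] -4
```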

Hope this is useful.