This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.

plot(cars)

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.

install.packages("tidyverse")
library(tidyverse)
Warning: package ‘tidyverse’ was built under R version 4.3.3
Warning: package ‘ggplot2’ was built under R version 4.3.3
Warning: package ‘tibble’ was built under R version 4.3.3
Warning: package ‘tidyr’ was built under R version 4.3.3
Warning: package ‘readr’ was built under R version 4.3.3
Warning: package ‘purrr’ was built under R version 4.3.3
Warning: package ‘dplyr’ was built under R version 4.3.3
Warning: package ‘stringr’ was built under R version 4.3.3
Warning: package ‘forcats’ was built under R version 4.3.3
Warning: package ‘lubridate’ was built under R version 4.3.3
── Attaching core tidyverse packages ─────────────────
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ──────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

The Conflicts message above means that, from this point on, a call to filter() uses the version from dplyr rather than the one from the stats package (part of base R), and likewise for lag(). (dplyr::filter() and dplyr::lag() do not conflict with each other.)

This is usually not an issue - the vast majority of code will work just fine.

If you need the base R version of either function, call it explicitly with its namespace: stats::filter(…) or stats::lag(…).
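As a minimal sketch of calling the base versions explicitly, the chunk below uses only base R, so it behaves the same whether or not dplyr is attached. The vector x and the 3-point moving average are illustrative choices, not part of the original notebook.

```{r}
# Illustrative data (an assumption for this sketch, not from the notebook).
x <- c(2, 4, 6, 8, 10)

# stats::filter() applies a linear filter to a series;
# equal weights of 1/3 give a centered 3-point moving average.
moving_avg <- stats::filter(x, rep(1/3, 3))
print(as.numeric(moving_avg))

# stats::lag() shifts the time base of a time series object by k steps.
series <- ts(x, start = 1)
shifted <- stats::lag(series, k = 1)
print(tsp(shifted))
```

The stats:: prefix names the package directly, so the call does not depend on which packages were attached last.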

library(palmerpenguins)
Warning: package ‘palmerpenguins’ was built under R version 4.3.3
ggplot(data=penguins)+
  geom_point(mapping=aes(x=flipper_length_mm, y=body_mass_g, color=species))
Warning: Removed 2 rows containing missing values or values
outside the scale range (`geom_point()`).

ggsave() saves the last plot you displayed. Pass it the name of the file to write:

ggsave("three penguins species.png")

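There are other ways to save a plot without using ggsave(). For example, you can open a graphics device such as png(), draw the plot, and close the device with dev.off(). This way you can also specify the color and dimensions: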

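# Temperature is not defined elsewhere in this notebook;
# airquality$Temp is assumed here as example data.
Temperature <- airquality$Temp
png(file="C:/Datamentor/R-tutorial/saving_plot2.png",
    width=600, height=350)       # size in pixels
hist(Temperature, col="gold")    # draw into the open device
dev.off()                        # close the device to write the file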
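The device pattern above (open a device, draw, close it) can be sketched in a form that runs on any machine. This is a minimal sketch with two assumptions that are not in the original: Temperature is taken from the built-in airquality dataset, and the file is written to a tempfile() path rather than a fixed C:/ folder.

```{r}
# Assumption: airquality$Temp stands in for the Temperature vector.
Temperature <- airquality$Temp

# Assumption: a temporary path is used so the chunk works without a C:/ drive.
out_file <- tempfile(fileext = ".png")

png(filename = out_file, width = 600, height = 350)  # size in pixels
hist(Temperature, col = "gold")                       # draw into the device
invisible(dev.off())                                  # closing the device writes the file

# Confirm that the file was actually created.
print(file.exists(out_file))
```

Nothing is written to disk until dev.off() is called, so a missing dev.off() is a common reason for an empty or absent file.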
hotel_bookings <- read.csv("C:/Users/Teng/Downloads/hotel_bookings.csv")
colnames(hotel_bookings)
head(hotel_bookings)

Imagine you want to analyze the fuel efficiency across different car classes in the mpg dataset. facet_wrap makes this straightforward:

library(ggplot2)
ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() + facet_wrap(~class)
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel)

ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  labs(title = "Comparison of market segments by hotel type for hotel bookings")

You also want to add another detail about what time period this data covers. To do this, you need to find out when the data is from.

You can use the min() and max() functions on the year column to find the earliest and latest years in the data:

min(hotel_bookings$arrival_date_year)
max(hotel_bookings$arrival_date_year)

Then save these values as variables:

mindate <- min(hotel_bookings$arrival_date_year)
maxdate <- max(hotel_bookings$arrival_date_year)

Now, you will add a caption using caption= in the labs() function. You can use the paste0() function to build the label from your newly created variables. This is handy because, if the data is updated with more recent records, you don't have to change the code below, since the variables are computed from the data:

ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  labs(title = "Comparison of market segments by hotel type for hotel bookings",
       caption = paste0("Data from: ", mindate, " to ", maxdate),
       x = "Market Segment",
       y = "Number of Bookings")
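To see why the label updates on its own, the paste0() step can be tried in isolation. This is a minimal sketch: the bookings data frame and its years are made-up stand-ins for hotel_bookings, not values from the real file.

```{r}
# Hypothetical stand-in for hotel_bookings; only the year column matters here.
bookings <- data.frame(arrival_date_year = c(2015, 2016, 2016, 2017))

mindate <- min(bookings$arrival_date_year)
maxdate <- max(bookings$arrival_date_year)

# paste0() joins its arguments into one string with no separator.
caption_text <- paste0("Data from: ", mindate, " to ", maxdate)
print(caption_text)

# Add a later year and rebuild the label with the same paste0() pattern.
bookings <- rbind(bookings, data.frame(arrival_date_year = 2018))
updated_caption <- paste0("Data from: ", min(bookings$arrival_date_year),
                          " to ", max(bookings$arrival_date_year))
print(updated_caption)
```

Because the label is computed from the data, rerunning the notebook on an updated file refreshes the caption without any edits to the plotting code.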

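Save the chart with ggsave(). By default it uses the size of the current graphics device; set width and height to control the dimensions: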

ggsave('hotel_booking_chart.png')
ggsave('hotel_booking_chart.png', 
       width=16,
       height = 8)