This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown, see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
library(readxl)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
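# ggplot2 is already attached by library(tidyverse), so no separate library(ggplot2) call is needed.
# Read the insurance data from a local Excel file (Windows path; adjust for your machine).
ins <- read_excel("C:\\Users\\hntn\\Documents\\Insurance dataset.xlsx")
# Bin age into groups. cut() intervals are right-closed, so (29, 39] is "30-39"; ages above 69 become NA.
ins <- ins %>% mutate(ageg = cut(age, breaks = c(-Inf, 29, 39, 49, 59, 69), labels = c("<30", "30-39", "40-49", "50-59", "60-69")))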
head(ins)
## # A tibble: 6 x 8
##     age sex      bmi children smoker region    charge ageg
##   <dbl> <chr>  <dbl>    <dbl> <chr>  <chr>      <dbl> <fct>
## 1    19 female  27.9        0 yes    southwest 16885. <30
## 2    18 male    33.8        1 no     southeast  1726. <30
## 3    28 male    33          3 no     southeast  4449. <30
## 4    33 male    22.7        0 no     northwest 21984. 30-39
## 5    32 male    28.9        0 no     northwest  3867. 30-39
## 6    31 female  25.7        0 no     southeast  3757. 30-39
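The age grouping relies on `cut()` treating its intervals as right-closed by default, so a value equal to a break point falls into the lower group. A minimal sketch, using made-up ages rather than the insurance data, shows where boundary values land (the names `ages` and `age_group` are illustrative only):

```r
# Hypothetical ages chosen to sit on and around the break points.
ages <- c(18, 29, 30, 39, 40, 64, 69, 70)

# Same breaks and labels as the ageg column.
age_group <- cut(
  ages,
  breaks = c(-Inf, 29, 39, 49, 59, 69),
  labels = c("<30", "30-39", "40-49", "50-59", "60-69")
)

# Pair each age with its group; 70 lies beyond the last break and becomes NA.
data.frame(age = ages, ageg = age_group)
```

If the data could contain ages above 69, replacing the final break with `Inf` (and relabelling the last group) would avoid the `NA`.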
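# Bin BMI with the conventional cut-offs. right = FALSE makes intervals left-closed, so [18.5, 25) is "Normal"
# and values such as 24.95 are no longer pushed into "Overweight" by a 24.9 break.
ins <- ins %>% mutate(bmig = cut(bmi, breaks = c(-Inf, 18.5, 25, 30, Inf), right = FALSE, labels = c("Underweight", "Normal", "Overweight", "Obese")))
# Encode sex as a numeric indicator: male = 1, female = 0.
ins <- ins %>% mutate(gender = recode(sex, "male" = 1, "female" = 0))
# Filter to male policyholders and keep only the columns of interest.
male <- ins %>% filter(sex == "male") %>% dplyr::select(age, sex, bmi, bmig, charge)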
dim(male)
## [1] 676 5
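To see how the recode, filter, and select steps combine without needing the Excel file, here is a small sketch on a toy tibble. The data frame `toy` and its values are hypothetical (loosely modelled on the rows printed by `head(ins)`), and `toy_male` is an illustrative name:

```r
library(dplyr)

# Hypothetical toy data standing in for the insurance table.
toy <- tibble(
  age    = c(19, 18, 33, 31),
  sex    = c("female", "male", "male", "female"),
  bmi    = c(27.9, 33.8, 22.7, 25.7),
  charge = c(16885, 1726, 21984, 3757)
)

toy_male <- toy %>%
  mutate(gender = recode(sex, "male" = 1, "female" = 0)) %>%  # numeric indicator
  filter(sex == "male") %>%                                   # keep male rows only
  dplyr::select(age, sex, bmi, charge)                        # keep chosen columns

# Rows x columns of the filtered subset.
dim(toy_male)
```

Writing `dplyr::select()` with the namespace prefix, as the original code does, guards against another attached package masking `select()`.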
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the
plot.