Note: Install the leaflet package first, e.g. with install.packages("leaflet")
library(leaflet)
my_map <- addMarkers(addTiles(leaflet()), lat=33.72148, lng=73.0433, popup ="Hello from Islamabad!!")
my_map
ggplot(mpg, aes(cty,hwy))+geom_point() creates a scatter plot of highway miles per gallon against city miles per gallon; each point represents a car. The chart has no annotations and the points are drawn in the default black.
ggplot(diamonds, aes(carat, price)) + geom_point() creates a scatter plot of price against carat; each point represents a diamond and is drawn in black. Because the dataset is large, the points overlap heavily, so the chart is crowded and not aesthetically pleasing.
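One common way to reduce the crowding described above is to make the points smaller and semi-transparent. The following is a minimal sketch; the alpha and size values are illustrative choices, not tuned settings.

```r
library(ggplot2)

# Semi-transparent, smaller points make dense regions visible as darker areas.
# alpha = 0.1 and size = 0.5 are illustrative values (assumptions).
p_alpha <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 0.1, size = 0.5)
p_alpha
```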
ggplot(economics, aes(date, unemploy)) + geom_line() creates a line chart of unemployment over time. The line is drawn in the default color (black) with no annotations.
ggplot(mpg, aes(cty)) + geom_histogram() creates a histogram of the number of cars in equal-sized bins of city miles per gallon. Since no binwidth is given, ggplot2 falls back to its default of 30 bins and prints a message saying so; the bars are drawn in the default grey.
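The default of 30 bins is rarely the best choice, so it can help to set the bin width explicitly. The sketch below assumes a width of 1 mpg, which suits cty because it is recorded in whole numbers; the fill and outline colors are arbitrary.

```r
library(ggplot2)

# binwidth = 1 gives one bar per whole-number mpg value and
# suppresses the "Pick better value with `binwidth`" message.
p_hist <- ggplot(mpg, aes(cty)) +
  geom_histogram(binwidth = 1, fill = "steelblue", color = "white")
p_hist
```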
We cannot map a continuous variable to shape; ggplot2 stops with an error such as “Error: A continuous variable can not be mapped to shape”
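The behavior can be reproduced, and worked around, with a short sketch. It uses displ as the continuous variable and cut_number() to bin it into three groups; both choices are assumptions made for illustration. The exact wording of the error message depends on the installed ggplot2 version.

```r
library(ggplot2)

# Mapping the continuous variable displ to shape fails when the plot is built.
p_bad <- ggplot(mpg, aes(cty, hwy, shape = displ)) + geom_point()
shape_error <- tryCatch(
  {
    ggplot_build(p_bad)
    NULL
  },
  error = function(e) conditionMessage(e)
)
shape_error  # the error text reported by ggplot2

# Workaround: bin displ into three groups of roughly equal size,
# then map the resulting discrete variable to shape.
p_ok <- ggplot(mpg, aes(cty, hwy, shape = cut_number(displ, 3))) +
  geom_point() +
  labs(shape = "displ (binned)")
p_ok
```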
In facet_wrap(), the arguments nrow and ncol control how many rows and columns of panels appear in the output
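As a small illustration of nrow and ncol, the sketch below facets the mpg data by class (seven panels) and lays the panels out two ways. The choice of variables is an assumption made for the example.

```r
library(ggplot2)

# Seven vehicle classes arranged in 2 rows; ggplot2 works out the column count.
p_rows <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, nrow = 2)

# The same panels arranged in 2 columns instead.
p_cols <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, ncol = 2)

p_rows
p_cols
```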
library(ggplot2)
ggplot(mpg, aes(cty,hwy))+geom_point()
ggplot(diamonds, aes(carat, price)) + geom_point()
ggplot(economics, aes(date, unemploy)) + geom_line()
ggplot(mpg, aes(cty)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Basic setup:
options(scipen = 999)
library(ggplot2)
data("midwest", package="ggplot2")
3(i) Simple Scatterplot with title and axis labels
ggplot(midwest,aes(x=area, y=poptotal)) + geom_point() + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area")
3(ii) Only 4 counties have a total population above 1 million. We can show them as annotations, or remove them from the chart so that the scatter plot shows more variation at the lower end. Here we remove them from the main chart and show those 4 counties in a separate chart.
pop_lt1mill <- subset(midwest, poptotal<1000000)
ggplot(pop_lt1mill,aes(x=area, y=poptotal)) + geom_point() + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area (counties with pop < 1 million)")
pop_gt1mill <- subset(midwest, poptotal>1000000)
ggplot(pop_gt1mill,aes(x=area, y=poptotal)) + geom_point() + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area (counties with pop > 1 million)")
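The annotation option mentioned in 3(ii) could look like the sketch below, which keeps all counties on one chart and labels only the largest ones. The names big_counties and p_labeled, and the vjust and size values, are chosen here for illustration.

```r
library(ggplot2)
data("midwest", package = "ggplot2")

# Counties with a total population above 1 million
big_counties <- subset(midwest, poptotal > 1000000)

# Plot every county, then add text labels only for the largest ones.
p_labeled <- ggplot(midwest, aes(x = area, y = poptotal)) +
  geom_point() +
  geom_text(data = big_counties, aes(label = county), vjust = -0.7, size = 3) +
  labs(x = "Area", y = "Total Population") +
  ggtitle("Total Population vs Area (largest counties labeled)")
p_labeled
```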
3(iii) Changing the color and size of points: we pass arguments to geom_point
ggplot(pop_lt1mill,aes(x=area, y=poptotal)) + geom_point(size=2, alpha=.6, color="blue") + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area (counties with pop < 1 million)")
3(iv) We add color = state to aes so that the points are colored by state
ggplot(pop_lt1mill,aes(x=area, y=poptotal, color=state)) + geom_point(size=2, alpha=.6) + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area (counties with pop < 1 million)")
3(v) We are already using custom labels for the x and y axes through the labs function; we use theme to align them:
ggplot(pop_lt1mill,aes(x=area, y=poptotal, color=state)) + geom_point(size=2, alpha=.6) + labs(x="Area", y="Total Population") + ggtitle("Total Population vs Area (counties with pop < 1 million)") +theme(axis.title.x =element_text(hjust=0), axis.title.y = element_text(hjust= 0))
3(vi)
Bubble charts are good for identifying relationships between continuous variables: two variables are mapped to the x and y axes and a third is mapped to the size of the points, so the chart can show the relation and any existing trend between them effectively.
Basic setup:
library(ggplot2)
data(mpg, package="ggplot2")
mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"),]
part i
ggplot(mpg_select, aes(x=displ, y=cty, size=hwy))+geom_point()
part ii
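# stat="summary" with fun="mean" plots the average cty per manufacturer; stat="identity" would stack (sum) the values
ggplot(mpg, aes(x=manufacturer, y=cty)) + geom_bar(stat="summary", fun="mean") + labs(x="make", y="Mileage") + ggtitle("Make vs Avg Mileage") +theme(axis.text.x = element_text(angle=45, hjust=1))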
part iii
ggplot(mpg, aes(x=displ,fill=class,color=class)) + geom_histogram(binwidth = 2) + labs(x="displ", y="count") + ggtitle("Engine Displacement across Vehicle Classes")
part iv
ggplot(mpg, aes(x=cty,fill=factor(cyl))) + geom_density(alpha=.5) + labs(x="City Mileage", y="density", fill="cyl") + ggtitle("City Mileage Grouped by Number of Cylinders")+facet_grid(cyl ~ .)
part v
ggplot(mpg, aes(x=class, y=cty)) + geom_boxplot() + labs(x="Class of Vehicle", y="City Mileage") + ggtitle("City Mileage Grouped by Class of Vehicle")