# Mindanao State University
# General Santos City
# Math 108
# ggplot2 Tutorial in R
# Submitted by: Angga, Princess Joy & Dongosa, Davy
# The main task is to complete the ggplot2 tutorials, Parts 1 to 3.
# The Complete ggplot2 Tutorial - Part1 | Introduction To ggplot2 (Full R code)
# This tutorial is primarily geared towards those who have some basic knowledge of the R programming language and want to make complex, nice-looking charts with R ggplot2.
# Part 1: Introduction to ggplot2, covers the basic knowledge about constructing simple ggplots and modifying the components and aesthetics.
# 1. Understanding the Ggplot Syntax
# The syntax for constructing ggplots could be puzzling if you are a beginner or work primarily with base graphics.
# The main difference is that, unlike base graphics, ggplot works with dataframes and not individual vectors. All the data needed to make the plot is typically contained within the dataframe supplied to ggplot() itself, or it can be supplied to the respective geoms.
# The second noticeable feature is that you can keep enhancing the plot by adding more layers (and themes) to an existing plot created using the ggplot() function.
# Let’s initialize a basic ggplot based on the midwest dataset.
# Setup
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2") # install only if missing
options(scipen=999) # turn off scientific notation like 1e+06
library(ggplot2)
data("midwest", package = "ggplot2") # load the data
# midwest <- read.csv("http://goo.gl/G1K41K") # alt source
# Init Ggplot
ggplot(midwest, aes(x=area, y=poptotal)) # area and poptotal are columns in 'midwest'

# A blank ggplot is drawn.
# Even though the x and y are specified, there are no points or lines in it.
# This is because ggplot doesn’t assume that you meant a scatterplot or a line chart to be drawn.
# Also note that the aes() function is used to specify the X and Y axes. That’s because any information that is part of the source dataframe has to be specified inside the aes() function.
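# A minimal sketch (not part of the original tutorial) of the two points above.
# The object names p_blank and p_geom are illustrative assumptions.
# First, a ggplot() call on its own carries data and aesthetic mappings but no layers,
# which is why the plot is blank. Second, the dataframe and aes() mapping can
# instead be supplied to the geom itself.
library(ggplot2)
data("midwest", package = "ggplot2")
p_blank <- ggplot(midwest, aes(x=area, y=poptotal)) # data and mapping only
length(p_blank$layers) # counts the geom layers added so far
p_geom <- ggplot() + geom_point(data=midwest, aes(x=area, y=poptotal)) # data supplied to the geom
print(p_geom)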
# 2. How to Make a Simple Scatterplot
# Let’s make a scatterplot on top of the blank ggplot by adding points using a geom layer called geom_point.
library(ggplot2)
ggplot(midwest, aes(x=area, y=poptotal)) + geom_point()

# We got a basic scatterplot, where each point represents a county. However, it lacks some basic components such as a plot title and meaningful axis labels. Moreover, most of the points are concentrated in the bottom portion of the plot, which makes them hard to tell apart.
# Like geom_point(), there are many such geom layers, which we will see in a subsequent part of this tutorial series. For now, let’s just add a smoothing layer using geom_smooth(method='lm'). Since the method is set as lm (short for linear model), it draws the line of best fit.
library(ggplot2)
g <- ggplot(midwest, aes(x=area, y=poptotal)) + geom_point() + geom_smooth(method="lm") # set se=FALSE to turn off the confidence bands
plot(g)
## `geom_smooth()` using formula = 'y ~ x'

# The line of best fit is in blue.
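# A short sketch of the se=FALSE option mentioned in the code comment above.
# The name g_no_band is an illustrative assumption, not from the original tutorial.
# With se=FALSE, geom_smooth() draws only the fitted line, without the shaded confidence band.
library(ggplot2)
data("midwest", package = "ggplot2")
g_no_band <- ggplot(midwest, aes(x=area, y=poptotal)) + geom_point() + geom_smooth(method="lm", se=FALSE)
plot(g_no_band)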
# 3. Adjusting the X and Y axis limits
# The X and Y axis limits can be controlled in 2 ways.
# Method 1: By deleting the points outside the range
# Points outside the range are dropped before the smoothing line is fitted, so the line of best fit will differ from the one computed on the full data.
# This is done with xlim() and ylim().
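# A hedged sketch of why deleting points changes the line of best fit.
# It fits the same linear model with lm() on the full data and on a subset.
# The cutoffs 0.1 (area) and 1000000 (poptotal) are illustrative assumptions,
# and the names fit_full, midwest_subset and fit_subset are hypothetical.
data("midwest", package = "ggplot2")
fit_full <- lm(poptotal ~ area, data=midwest) # fit on all counties
midwest_subset <- subset(midwest, area <= 0.1 & poptotal <= 1000000) # keep only points inside the range
fit_subset <- lm(poptotal ~ area, data=midwest_subset) # fit on the remaining counties
coef(fit_full) # intercept and slope from the full data
coef(fit_subset) # intercept and slope after deleting points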
library(ggplot2)
g <- ggplot(midwest, aes(x=area, y=poptotal)) + geom_point() + geom_smooth(method="lm") # set se=FALSE to turn off the confidence bands