#1.    What is data visualization? This is a big area, so try and give an overview. Data visualization is the visual representation of a dataset through graphs, tables, or charts.
#2.   List the two main graphics systems of R. Standard graphics and grid graphics
#3.    List the tools for visualizing: the graphics and ggplot2 packages
#4.    Explain faceting/facets: it partitions a plot into a matrix of panels, each showing the same type of graph for a different subset of the dataset (one panel per value of the faceting variable)
# 5.   (a) What are the two basic problems of barplots or barcharts? The labels of the nominal values may be too long, and a few bars may be indistinguishable

#      (b) What are the solutions of each of the problems in (a) above? You could use coord_flip() to swap the x-y coordinates so that long labels fit. There is really no issue here
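# A minimal sketch of the coord_flip() fix for long labels. The data frame
# `survey` and its labels are made up for illustration; they are not part of
# the original assignment.

```r
library(ggplot2)

# Hypothetical data: long nominal labels that would overlap on the x-axis
survey <- data.frame(
  answer = c("Strongly agree with the statement",
             "Neither agree nor disagree",
             "Strongly disagree with the statement"),
  n = c(12, 7, 3)
)

# geom_col() draws bars whose heights are taken from n;
# coord_flip() swaps the axes so the labels read horizontally
p_flip <- ggplot(survey, aes(x = answer, y = n)) +
  geom_col() +
  coord_flip()
p_flip
```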

#6.   It is generally a good idea to use the information provided by barcharts to create, or show, a pie chart.  False

#7.    When analyzing data, explain what barcharts would be used for?
# Barcharts are used to analyze the frequency distribution of the values of a variable in a dataset
#8.    (a) What does the package, GGally, provide? (e.g., what functions does it provide, etc.) It extends ggplot2 with additional plotting functions, such as ggpairs() for scatterplot matrices

#    (b) What does the function, ggpairs, do? It creates a matrix of plots comparing each pair of variables

#9.    (a)  Explain the function, facet_wrap(). This function allows you to indicate a nominal variable whose values will create a set of subplots that will be presented sequentially with reasonable wrapping around the screen space.

#     (b)  Explain the function, facet_grid(). Allows us to set up a matrix of plots with each dimension of the matrix getting as many plots as there are values of the respective variable. For each cell of this matrix the graph specified before the facet is shown using only the subset of rows that have the respective values on the variables defining the grid.
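# A short sketch contrasting the two faceting functions. The built-in mtcars
# data and the choice of cyl and gear as faceting variables are assumptions
# made for illustration only.

```r
library(ggplot2)

# Base scatterplot that both faceting calls will split into panels
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# facet_wrap(): one panel per value of cyl, wrapped into rows as needed
p_wrap <- base_plot + facet_wrap(~ cyl)

# facet_grid(): rows defined by gear, columns defined by cyl;
# each cell shows only the rows with that gear/cyl combination
p_grid <- base_plot + facet_grid(gear ~ cyl)

p_wrap
p_grid
```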

           

# 10.   Explain the following argument:
#>   aes(x = Sepal.Length, y = Sepal.Width)
# The argument maps Sepal.Length to the x-axis and Sepal.Width to the y-axis
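# As a sketch, the mapping can be seen in action by pairing it with a geom.
# The choice of geom_point() here is an assumption; aes() by itself only
# declares the mapping and draws nothing.

```r
library(ggplot2)

# aes() declares which columns go on which axis;
# geom_point() then draws one point per row of iris
p_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()
p_scatter
```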

#11.    Give an interpretation of the symbol, “ ~ ”?  It separates the response and predictor variables in the specification of a model
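# A small sketch of the model-formula reading of ~, using base R only. The
# choice of Sepal.Length as response and Petal.Length as predictor is an
# assumption for illustration.

```r
# Response on the left of ~, predictor on the right
f <- Sepal.Length ~ Petal.Length

# lm() reads the formula to decide what to model against what
fit <- lm(f, data = iris)
coef(fit)
```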

#12.    What are the aesthetics in a plot?  The visual properties of the plotted objects that data can be mapped to (color, shape, size, etc.)

#13.    (a) What are layers? Layers are the levels of the R graphics architecture. At one end
# are the concrete graphics devices (the most common being the computer screen) where the plots
# are shown. At the other end are the graphics functions used to produce concrete statistical plots.

# (b) What are the five components of layers? (1) Data, (2) Aesthetic mappings, (3) A statistical transformation (stat), (4) A geometric object (geom), (5) A position adjustment
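# The five components listed above can be spelled out one by one with
# ggplot2's lower-level layer() function. This is only a sketch; in everyday
# code a shortcut such as geom_point() fills in the stat and position for you.

```r
library(ggplot2)

p_layer <- ggplot() +
  layer(
    data     = iris,                                   # (1) data
    mapping  = aes(x = Sepal.Length, y = Sepal.Width), # (2) aesthetic mappings
    stat     = "identity",                             # (3) statistical transformation
    geom     = "point",                                # (4) geometric object
    position = "identity"                              # (5) position adjustment
  )
p_layer
```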

#14.    What is scaling? Scaling is used for standardizing data, primarily by centering (subtracting the mean) and scaling (dividing by the standard deviation).
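# A minimal sketch of that standardization in base R, comparing the manual
# calculation with the built-in scale() function. Using Sepal.Length from
# iris is an assumption for illustration.

```r
x <- iris$Sepal.Length

# Manual version: subtract the mean, then divide by the standard deviation
z_manual <- (x - mean(x)) / sd(x)

# scale() centers and scales by default; it returns a one-column matrix,
# so as.numeric() flattens it back to a plain vector
z_scale <- as.numeric(scale(x))

# Check that the two approaches agree
all.equal(z_manual, z_scale)
```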

#15.    Explain layers and what they are used for. At one end we have concrete graphics devices (the most common being the computer screen) where the plots will be shown. On the other end we have the graphics functions we will use to produce concrete statistical plots.

#16.    What are themes? Themes are a powerful way to customize the non-data components of your plots: i.e. titles, labels, fonts, background, gridlines, and legends. Themes can be used to give plots a consistent customized look
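# A short sketch of how a theme changes only the non-data parts of a plot.
# The title text and the particular theme settings are assumptions chosen
# for illustration.

```r
library(ggplot2)

p_theme <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  labs(title = "Iris sepal measurements") +
  theme_minimal() +                                  # a complete built-in theme
  theme(legend.position = "bottom",                  # move the legend
        plot.title = element_text(face = "bold"))    # restyle the title
p_theme
```

# The points themselves are unchanged; only the background, legend placement,
# and title styling differ from the default look.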

#17.    What are APIs? An API (Application Programming Interface) is a set of rules and tools that allows different software applications to communicate and exchange data.

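#18.   Give two examples of “geom,” and explain what they do. geom_point() draws one point per observation (a scatterplot); geom_bar() draws one bar per category, with bar height equal to the number of observations in that category by default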
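# As a sketch, the two geoms can be tried on built-in datasets. The datasets
# and columns used here are assumptions for illustration.

```r
library(ggplot2)

# geom_point(): one point per row of iris
p_point <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point()

# geom_bar(): one bar per value of cyl, height = number of rows with that value
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

p_point
p_bar
```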

#19.    Explain the following function and all of its arguments, etc.

#       >  ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_boxplot( )
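# ggplot() takes the iris dataset; aes() maps Species to the x-axis and Sepal.Length to the y-axis; geom_boxplot() draws one boxplot of Sepal.Length for each species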

 
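#20.   Explain the R graphics layered architecture? Graphics devices (e.g., the computer screen) are at one end, the graphics systems (standard graphics and grid graphics) sit above them, and graphics packages such as ggplot2 (built on grid) are at the other end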

#21.    In “ggplot” what does “gg” stand for? Grammar of Graphics

#22.    (a) List some aesthetic attributes. x and y position, line color, point shapes, etc

#       (b) List some geometric objects that are defined by the grammar for graphics. A point (geom_point()), a bar (geom_bar()), a polygon (geom_polygon()), a histogram (geom_histogram()), a boxplot (geom_boxplot()), a map (geom_map()), etc.

           

#23. What statistical plot can we use to explore the distribution of the values of a
#     nominal variable? A barplot

 

#24.   Use the ggplot2 package to write an algorithm, or a chunk of code, that will create a plot of the distribution of the values of a continuous variable (use any geom except the histogram). Choose the correct geom.  Use the iris dataset.

#(a)        Then, explain why you chose your algorithm, A boxplot, because it gives the summary statistics and also identifies outliers

#(b)       explain why you chose the functions you used, Sepal.Length is a continuous variable

#(c)        explain why you chose your geom, geom_boxplot(), because it draws the boxplot

#(d)       show your plot.
library(ggplot2)
data(iris)
ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_boxplot()

#25. Use the ggplot2 package to write an algorithm, or a chunk of code, that will create a plot of the distribution of the values of a continuous variable using a histogram. Use the iris dataset.

#(a) Then, explain why you chose to use your aes( ) function, Because I want to plot petal length on the x-axis

#(b) explain why you chose to use your particular geom, Petal.Length is a continuous variable

#(c) show your plot.

rm(list=ls())
library(ggplot2)
data(iris)

# bins = 30 is the default; setting it explicitly avoids the stat_bin() message
ggplot(iris, aes(x = Petal.Length)) + geom_histogram(bins = 30) + xlab("Petal Length")