November 8, 2018

Different Types of Statistical Plots

Statistics uses many types of plots. Here are a few:

  • Histogram
  • Scatter Plot
  • Box Plot
  • 3D Scatter Plot
  • 3D Surface Plot

Histogram

A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data.
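
To make the idea of a frequency distribution concrete, here is a minimal sketch in Python that counts how many values land in each equal-width bin and prints a text histogram. The function name `histogram_counts`, the sample data, and the choice of four bins are illustrative assumptions, not part of the original slides.

```python
def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals.

    Hypothetical helper for illustration; the last bin includes the maximum.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical, so everything goes into the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # made-up data
for i, count in enumerate(histogram_counts(sample, 4)):
    print(f"bin {i}: " + "#" * count)
```

The bar lengths are the bin counts, which is the shape a histogram draws.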

Scatter Plot

A scatter plot is the most direct representation of a data set: each observation is drawn as a point, so it shows how the data really looks.

Boxplot

A boxplot gives us the five-number summary. More on this later.

Scatter 3D Graph

A 3D scatter plot is used to show how one variable varies with two other variables.

Surface 3D

Sometimes a surface plot helps us visualize multivariate regression.

Graph of z=x+4y

Boxplot In Detail

  • John Tukey introduced the boxplot in 1969.
  • According to him, "Boxplots use robust summary statistics that are always located at actual data points, are quickly computable, and have no tuning parameters."
  • A boxplot is a graphical representation of the five-number summary of a data set:
  • Minimum
  • First Quartile
  • Median (Second Quartile)
  • Third Quartile
  • Maximum
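
The five values above can be computed directly. The following is a minimal sketch in Python using the standard `statistics` module; the function name `five_number_summary` and the sample data are illustrative assumptions. Quartile conventions differ between textbooks and software, and this sketch assumes the `"inclusive"` method of `statistics.quantiles`.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: quartiles use the "inclusive" interpolation method;
    other conventions can give slightly different Q1 and Q3.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
print(five_number_summary(data))
```

Here the minimum and maximum are the raw extremes of the data.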

Boxplot Figure

Boxplot in R

Boxplot Terminologies

  • Maximum: This is not necessarily the largest value in the data set; it is the largest value that is still considered relevant, that is, not an outlier.
  • Statisticians use a rule to compute this value.
  • Median: The middle of the data points.
  • It is also called the 2nd quartile.
  • 1st Quartile: The 25th percentile of the data set.
  • It is the middle of the part of the data to the left of the median.
  • 3rd Quartile: The 75th percentile of the data set, and the middle of the part of the data to the right of the median.
  • Minimum: The smallest value that is still considered relevant, that is, not an outlier.
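
As a rough illustration of how the "relevant" minimum and maximum can be found, the sketch below assumes the common 1.5 × IQR rule: values beyond Q1 - 1.5 × IQR or Q3 + 1.5 × IQR are treated as outliers, and the whiskers stop at the most extreme values inside those fences. The function name `whiskers_and_outliers`, the parameter `k`, and the sample data are hypothetical, and the quartiles again assume the `"inclusive"` method of Python's `statistics.quantiles`.

```python
from statistics import quantiles


def whiskers_and_outliers(values, k=1.5):
    """Return (lower whisker, upper whisker, outliers) under the k * IQR rule.

    Assumption: k = 1.5 is the usual convention, not a law of nature.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme data points that are inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return min(inside), max(inside), outliers


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up data with one extreme value
print(whiskers_and_outliers(data))
```

In this made-up sample the value 30 falls outside the upper fence, so the boxplot "Maximum" is 9 rather than 30.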

Explanation

Look at the board!

Boxplot and Normal Distribution

  • The boxplot is closely related to the normal distribution because both convey essentially the same message about how the data set is spread.

  • From the figure we can see that the "Minimum" is generally taken as Q1 - 1.5 × IQR and, similarly, the "Maximum" as Q3 + 1.5 × IQR, where IQR = Q3 - Q1 is the interquartile range.
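  • The probability density of the normal distribution with mean $\mu$ and standard deviation $\sigma$ is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$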

  • For the standard normal curve, with mean $\mu = 0$ and standard deviation $\sigma = 1$, the density is $f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}}$

  • To find the IQR, integrate the density over the central interval that contains 50% of the area; its endpoints are Q1 and Q3.

  • To get the area without outliers, integrate the density between Q1 - 1.5 × IQR and Q3 + 1.5 × IQR, which covers about 99.3% of the area.
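
The two areas can be checked numerically without doing the integrals by hand. This is a minimal sketch in Python, assuming the standard normal distribution and using `statistics.NormalDist` (Python 3.8+), whose `cdf` gives the area under the density to the left of a point. The function name `boxplot_landmarks` and the dictionary keys are illustrative.

```python
from statistics import NormalDist


def boxplot_landmarks(dist):
    """Quartiles, 1.5 * IQR fences, and the areas they enclose for `dist`."""
    q1 = dist.inv_cdf(0.25)  # 25th percentile
    q3 = dist.inv_cdf(0.75)  # 75th percentile
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "q3": q3,
        "lower": lower,
        "upper": upper,
        # Area between two points = difference of the cumulative areas.
        "box_area": dist.cdf(q3) - dist.cdf(q1),
        "fence_area": dist.cdf(upper) - dist.cdf(lower),
    }


for name, value in boxplot_landmarks(NormalDist(mu=0, sigma=1)).items():
    print(f"{name}: {value:.4f}")
```

For the standard normal curve the quartiles sit at roughly ±0.6745 and the fences at roughly ±2.698, so about 0.7% of normally distributed data would be flagged as outliers.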

Boxplot and Skewness of the PDFs

  • A boxplot also tells us what a unimodal density function looks like.
  • There are three types of unimodal distribution:
  • Symmetric
  • Left Skewed
  • Right Skewed

Symmetric Distribution

  • In this distribution, the left-hand and right-hand sides are roughly balanced around the mean.
  • The mean of the data set is approximately equal to the median.

  • Boxplot for the symmetric distribution is

  • Another property of this distribution is that its median (second quartile) lies midway between the 1st and 3rd quartiles.

Right Skewed

  • A right-skewed distribution, also known as positively skewed, is shown in the figure below.

  • Now the distribution is no longer symmetric around the mean.
  • For a right-skewed distribution, the mean is greater than the median. The boxplot of this distribution will look like

Left Skewed

  • Similarly, for a left-skewed (negatively skewed) distribution, the mean is less than the median.
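
The mean-versus-median comparison and the position of the median inside the box can both be turned into quick checks. The sketch below, in Python, is only illustrative: the function names, the tolerance `tol`, and the sample data are assumptions, and the second function uses the quartile-based (Bowley) skewness coefficient, which is a swapped-in measure rather than something defined in these slides. It is zero when the median sits midway between Q1 and Q3, positive when the median is closer to Q1, and negative when it is closer to Q3.

```python
from statistics import mean, median, quantiles


def skew_direction(values, tol=1e-9):
    """Classify skew by comparing the mean with the median."""
    diff = mean(values) - median(values)
    if diff > tol:
        return "right"
    if diff < -tol:
        return "left"
    return "symmetric"


def quartile_skewness(values):
    """Bowley coefficient: (Q3 + Q1 - 2 * median) / (Q3 - Q1).

    Assumes Q3 > Q1; a zero IQR would cause a division by zero.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 + q1 - 2 * q2) / (q3 - q1)


right_tail = [1, 2, 2, 3, 3, 4, 10, 20]  # made-up data with a long right tail
print(skew_direction(right_tail), quartile_skewness(right_tail))
```

The comparison of mean and median is a rule of thumb that works for typical unimodal data; it is not a formal test of skewness.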

Simulation in R

Just for Fun

References

  • 1. Statistics for Dummies
  • 2. Khan Academy
  • 3. Towards Data Science
  • 4. Chrito.com