Pop Quiz March 16th

Erin Dane

2022-03-16

> Housing <- read.table("/Users/erindane/Desktop/R Studios /Table2.1HousePrices-NoID.csv", 
+   header=TRUE, stringsAsFactors=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)

Determining Distribution of House Prices Using Histogram

From this histogram, the distribution appears to be approximately normal, given that the histogram is centred with no obvious skewness. A histogram alone is not conclusive, so other methods should be used for further analysis.

> with(Housing, Hist(Price, scale="frequency", breaks="Sturges", col="darkgray"))
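
As a numeric complement to the visual check, sample skewness can be computed directly. The sketch below is illustrative only: `sample_skewness` is a hypothetical helper (base R has no built-in skewness function), and the two vectors are made-up values, not the Housing data. A value near 0 suggests a symmetric distribution, while a positive value suggests a longer right tail.

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2)
sample_skewness <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  d <- x - mean(x)           # deviations from the mean
  mean(d^3) / mean(d^2)^(3/2)
}

# Made-up vectors for illustration only (NOT the Housing data)
symmetric    <- c(1, 2, 3, 4, 5)
right_skewed <- c(1, 1, 2, 2, 3, 10)

print(sample_skewness(symmetric))     # zero for a perfectly symmetric sample
print(sample_skewness(right_skewed))  # positive for a long right tail
```

With the data loaded as above, `sample_skewness(Housing$Price)` would give the corresponding figure for the house prices.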

Determining Distribution of House Prices Using QQ Plot

From this QQ plot we can see that the prices of homes appear to be normally distributed. This conclusion is based on the following:

  • All prices fall within the 95% confidence envelope
  • All prices fall very close to the QQ line
  • The points follow a mostly straight line rather than a curve

> with(Housing, qqPlot(Price, dist="norm", id=list(method="y", n=2, labels=rownames(Housing))))

[1] 104 117

Boxplot Analysing Price

From this graph we can determine:

  • the max is approx 200,000
  • the min is approx 60,000
  • the median is 130,000
  • the upper quartile is 130,000
  • the lower quartile is 110,000
  • IQR is 20,000

Given that the whiskers are about the same length and that the median is relatively centred within the box, it can be concluded that there is no obvious skewness, which is consistent with the price being normally distributed.

> Boxplot( ~ Price, data=Housing, id=list(method="y"))

[1] "104"
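
The values read off the boxplot can also be checked numerically. The following is a minimal sketch using a hypothetical `prices` vector (made-up values, not the Housing data); `fivenum()`, `IQR()` and `quantile()` are base R functions. It also computes the conventional 1.5 * IQR fences, the usual rule by which a boxplot flags individual points as outliers.

```r
# Hypothetical prices for illustration only; these are NOT the Housing data
prices <- c(70000, 95000, 110000, 115000, 120000,
            125000, 130000, 150000, 200000)

# Tukey five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(prices)
names(five) <- c("min", "lower", "median", "upper", "max")
print(five)

# Interquartile range: upper quartile minus lower quartile
iqr <- IQR(prices)
print(iqr)

# Conventional 1.5 * IQR fences; points outside them count as outliers
lower_fence <- unname(quantile(prices, 0.25)) - 1.5 * iqr
upper_fence <- unname(quantile(prices, 0.75)) + 1.5 * iqr
outliers <- prices[prices < lower_fence | prices > upper_fence]
print(outliers)
```

With the data loaded as above, `fivenum(Housing$Price)` and `IQR(Housing$Price)` would return the exact figures rather than values estimated by eye.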

Determining Distribution of House Price Using Normality Test

We fail to reject the null hypothesis because the p-value (0.05836) is greater than 0.05. This means there is no significant evidence against the price being normally distributed, although the p-value is only slightly above the 0.05 threshold.

> normalityTest(~Price, test="shapiro.test", data=Housing)

    Shapiro-Wilk normality test

data:  Price
W = 0.98023, p-value = 0.05836
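
The decision rule applied to this output can be sketched in code. This is an illustration under stated assumptions: `decide` is a hypothetical helper, the 0.05 significance level is the one assumed in the text, and the vector `x` is simulated data standing in for the prices, not the Housing data. `shapiro.test()` is the base R Shapiro-Wilk test.

```r
# Hypothetical helper: decision rule at significance level alpha
decide <- function(p_value, alpha = 0.05) {
  if (p_value > alpha) "fail to reject H0" else "reject H0"
}

# Simulated data for illustration only (NOT the Housing data)
set.seed(42)
x <- rnorm(100, mean = 130000, sd = 25000)

res <- shapiro.test(x)       # base R Shapiro-Wilk normality test
print(res$statistic)         # the W statistic
print(decide(res$p.value))   # decision for the simulated sample

# The same rule applied to the p-value reported in the output above
print(decide(0.05836))
```

Failing to reject the null hypothesis does not prove normality; it only means the test found no significant departure from it at the chosen level.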

Finding Correlation Between Price and Other Variables

First, a correlation matrix can be used to find initial correlation coefficients between price and the other variables. The strongest correlations with Price are:

  • Positive correlation between Price and Bedrooms (0.526)
  • Positive correlation between Price and Square Feet (0.553)
  • Negative correlation between Price and Offers (-0.314)

> cor(Housing[,c("Bedrooms","Offers","Price","SqFt")], use="complete")
          Bedrooms     Offers      Price      SqFt
Bedrooms 1.0000000  0.1142706  0.5259261 0.4838071
Offers   0.1142706  1.0000000 -0.3136359 0.3369234
Price    0.5259261 -0.3136359  1.0000000 0.5529822
SqFt     0.4838071  0.3369234  0.5529822 1.0000000
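
To rank the variables by the strength of their correlation with Price, the Price row of the matrix can be sorted by absolute value. The sketch below re-enters the matrix printed above by hand so that it runs on its own; the names `cm`, `price_cor` and `ranked` are illustrative choices, not part of the original analysis.

```r
# Correlation matrix copied from the output above
vars <- c("Bedrooms", "Offers", "Price", "SqFt")
cm <- matrix(c(
  1.0000000,  0.1142706,  0.5259261, 0.4838071,
  0.1142706,  1.0000000, -0.3136359, 0.3369234,
  0.5259261, -0.3136359,  1.0000000, 0.5529822,
  0.4838071,  0.3369234,  0.5529822, 1.0000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Correlations of Price with every other variable
price_cor <- cm["Price", setdiff(vars, "Price")]

# Sort by absolute strength, strongest first, keeping the sign
ranked <- price_cor[order(abs(price_cor), decreasing = TRUE)]
print(round(ranked, 3))
```

With the data loaded, the same ranking could be obtained by replacing `cm` with the result of the `cor()` call shown above.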

Price and Square Feet

From this scatter plot we can see that there is a positive correlation. This means that as the square footage of the house increases, the price tends to increase.

> scatterplot(SqFt~Price, regLine=FALSE, smooth=FALSE, boxplots=FALSE, data=Housing)