This is my first Markdown document.

Let’s load the houseprices data.
For this practice, I will use the houseprices dataset. This dataset includes the floor area (area), the number of bedrooms (bedrooms), and the sale price (sale.price) for a sample of houses sold in Aranda in 1999. Aranda is a suburb of Canberra, Australia.

list.of.packages <- c("Stat2Data", "datasets", "boot", "HistData")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)

library(HistData)
library(Stat2Data)

houseprices <- read.csv("houseprices.csv")

dim(houseprices)
## [1] 15  3

This dataset is at the house level (1 observation = 1 house). It has 15 observations and 3 variables. I computed these numbers with inline R code (i.e., by indexing the result of dim(houseprices)), but in case you don’t believe them, you can verify them with a call to str().
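As a minimal sketch of how such inline values can be computed, the snippet below builds the sentence from the dimensions of a data frame. The data frame `toy` and its values are hypothetical stand-ins, not the real houseprices data, and `sprintf()` is used here only to mimic what inline R code would produce in the knitted text.

```r
# Hypothetical stand-in for houseprices (illustrative values, NOT the real data)
toy <- data.frame(
  area       = c(700, 850, 900, 1200),
  bedrooms   = c(4, 4, 5, 6),
  sale.price = c(150, 210, 230, 320)
)

n_obs  <- nrow(toy)  # same value as dim(toy)[1]
n_vars <- ncol(toy)  # same value as dim(toy)[2]

# In an R Markdown source file, these expressions would sit inside inline
# code chunks; here the sentence is assembled explicitly instead.
sentence <- sprintf("The dataset has %d observations and %d variables.",
                    n_obs, n_vars)
print(sentence)
```

Computing the counts from the data, rather than typing them by hand, keeps the text in step with the dataset if it changes.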

str(houseprices)
## 'data.frame':    15 obs. of  3 variables:
##  $ area      : int  694 905 802 1366 716 963 821 714 1018 887 ...
##  $ bedrooms  : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ sale.price: num  192 215 215 274 113 ...

As shown in the output above, this dataset contains the variables needed to examine the determinants of house prices, although with only 15 observations any conclusions should be treated as tentative.

summary(houseprices)
##       area           bedrooms       sale.price   
##  Min.   : 694.0   Min.   :4.000   Min.   :112.7  
##  1st Qu.: 743.5   1st Qu.:4.000   1st Qu.:213.5  
##  Median : 821.0   Median :4.000   Median :221.5  
##  Mean   : 889.3   Mean   :4.333   Mean   :237.7  
##  3rd Qu.: 984.5   3rd Qu.:4.500   3rd Qu.:267.0  
##  Max.   :1366.0   Max.   :6.000   Max.   :375.0

I use inline R syntax to compute the minimum, median, and maximum of “sale.price”: 112.7, 221.5, and 375.0, respectively.
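One way these three statistics might be computed is sketched below. The vector `toy_price` is a hypothetical stand-in with made-up values, so its results will not match the summary output for the real sale.price column.

```r
# Hypothetical sale prices (illustrative values, NOT the real data)
toy_price <- c(150, 210, 230, 320)

# Collect the three statistics in one named vector
price_stats <- c(
  min    = min(toy_price),
  median = median(toy_price),
  max    = max(toy_price)
)
print(price_stats)
```

With the real data, replacing `toy_price` with `houseprices$sale.price` should reproduce the Min., Median, and Max. entries shown by summary().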

Here’s a pairs plot of the data.

pairs(houseprices)

Here’s a regression model of sale price on area and bedrooms.

fit <- lm(sale.price ~ area + bedrooms, data = houseprices)
summary(fit)
## 
## Call:
## lm(formula = sale.price ~ area + bedrooms, data = houseprices)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -80.897  -4.247   1.539  13.249  42.027 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -141.76132   67.87204  -2.089  0.05872 . 
## area           0.14255    0.04697   3.035  0.01038 * 
## bedrooms      58.32375   14.75962   3.952  0.00192 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 33.06 on 12 degrees of freedom
## Multiple R-squared:  0.731,  Adjusted R-squared:  0.6861 
## F-statistic:  16.3 on 2 and 12 DF,  p-value: 0.0003792
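To make the coefficient table concrete, the sketch below plugs the reported estimates into the fitted equation by hand. The helper `predict_price()` is a hypothetical name introduced for illustration, and the example house (area 821, 4 bedrooms) is simply the median area and median bedroom count from the summary above. The result is in whatever units sale.price is recorded in.

```r
# Coefficient estimates copied from the regression output above
coefs <- c(intercept = -141.76132, area = 0.14255, bedrooms = 58.32375)

# Hypothetical helper: fitted sale price for a given area and bedroom count
predict_price <- function(area, bedrooms) {
  unname(coefs["intercept"] +
           coefs["area"] * area +
           coefs["bedrooms"] * bedrooms)
}

# Fitted value for a house with the median area and 4 bedrooms
fitted_median <- predict_price(area = 821, bedrooms = 4)
print(fitted_median)

# Holding area fixed, one extra bedroom shifts the fitted price
# by exactly the bedrooms coefficient
bedroom_effect <- predict_price(821, 5) - predict_price(821, 4)
print(bedroom_effect)
```

With the fitted model object available, `predict(fit, newdata = data.frame(area = 821, bedrooms = 4))` should give the same fitted value up to rounding of the printed coefficients.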

Both the number of bedrooms and the floor area are statistically significant, at the 1% and 5% levels, respectively.
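The significance statement can be checked directly against the p-values in the coefficient table. The sketch below copies the two reported p-values by hand and compares them with the conventional 5% and 1% thresholds; the variable names are illustrative.

```r
# p-values copied from the Pr(>|t|) column of the regression output
p_values <- c(area = 0.01038, bedrooms = 0.00192)

sig_5pct <- p_values < 0.05  # significant at the 5% level?
sig_1pct <- p_values < 0.01  # significant at the 1% level?

print(rbind(p_values, sig_5pct, sig_1pct))
```

With the model object in memory, the same p-values can be extracted rather than retyped, for example with `coef(summary(fit))[, "Pr(>|t|)"]`.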