R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

Homes <- read.csv('D:/DataSet/Homes.csv')
class(Homes)
## [1] "data.frame"
str(Homes)
## 'data.frame':    492 obs. of  8 variables:
##  $ in_sf         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ beds          : num  2 2 2 1 0 0 1 1 1 2 ...
##  $ bath          : num  1 2 2 1 1 1 1 1 1 1 ...
##  $ price         : int  999000 2750000 1350000 629000 439000 439000 475000 975000 975000 1895000 ...
##  $ year_built    : int  1960 2006 1900 1903 1930 1930 1920 1930 1930 1921 ...
##  $ sqft          : int  1000 1418 2150 500 500 500 500 900 900 1000 ...
##  $ price_per_sqft: int  999 1939 628 1258 878 878 950 1083 1083 1895 ...
##  $ elevation     : int  10 0 9 9 10 10 10 10 12 12 ...
summary(Homes)
##      in_sf             beds             bath            price         
##  Min.   :0.0000   Min.   : 0.000   Min.   : 1.000   Min.   :  187518  
##  1st Qu.:0.0000   1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.:  749000  
##  Median :1.0000   Median : 2.000   Median : 2.000   Median : 1145000  
##  Mean   :0.5447   Mean   : 2.155   Mean   : 1.906   Mean   : 2020696  
##  3rd Qu.:1.0000   3rd Qu.: 3.000   3rd Qu.: 2.000   3rd Qu.: 1908750  
##  Max.   :1.0000   Max.   :10.000   Max.   :10.000   Max.   :27500000  
##    year_built        sqft        price_per_sqft     elevation     
##  Min.   :1880   Min.   : 310.0   Min.   : 270.0   Min.   :  0.00  
##  1st Qu.:1924   1st Qu.: 832.8   1st Qu.: 730.5   1st Qu.: 10.00  
##  Median :1960   Median :1312.0   Median : 960.0   Median : 18.50  
##  Mean   :1959   Mean   :1523.0   Mean   :1195.6   Mean   : 39.85  
##  3rd Qu.:2001   3rd Qu.:1809.0   3rd Qu.:1419.0   3rd Qu.: 61.00  
##  Max.   :2016   Max.   :7800.0   Max.   :4601.0   Max.   :238.00

#data documentation

The data set describes 492 homes using 8 attributes:

- in_sf: whether the home is in San Francisco (1) or not (0)
- beds: number of beds
- bath: number of bathrooms
- price: house price
- year_built: year in which the home was built
- sqft: area in square feet
- price_per_sqft: price per square foot
- elevation: elevation of the building
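The attribute descriptions suggest that price_per_sqft is simply price divided by sqft. A minimal sketch of checking this, using only the first ten rows printed by str() above; the names `first_ten`, `recomputed`, and `matches` are illustrative, and rounding to a whole number is an inference from those rows rather than something the data source states:

```r
# First ten rows of price, sqft and price_per_sqft, copied from the str() output.
first_ten <- data.frame(
  price = c(999000, 2750000, 1350000, 629000, 439000,
            439000, 475000, 975000, 975000, 1895000),
  sqft = c(1000, 1418, 2150, 500, 500, 500, 500, 900, 900, 1000),
  price_per_sqft = c(999, 1939, 628, 1258, 878, 878, 950, 1083, 1083, 1895)
)

# Recompute price per square foot and compare it with the stored column.
recomputed <- round(first_ten$price / first_ten$sqft)
matches <- recomputed == first_ten$price_per_sqft

# TRUE if every recomputed value equals the stored one.
print(all(matches))
```

The same comparison could be run on the full Homes data frame to see whether the relationship holds for all 492 rows.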

#goals/purpose

The data is used to determine whether a home is in San Francisco or New York. We compute the standard deviation of price, the variance of price per square foot, and the total number of beds. A scatter plot shows price per square foot against square footage, and a box plot summarizes price per square foot. Finally, because San Francisco is relatively hilly, the elevation of a home may be a good way to distinguish the two cities, so we look at how mean square footage varies with elevation.
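Since the stated goal is to tell the two cities apart by elevation, one way to look at it directly is to group elevation by in_sf. The sketch below assumes that in_sf is 1 for San Francisco and 0 for New York, and it runs on a small made-up data frame (`toy`) so that it is self-contained; none of its values come from Homes.csv.

```r
# Hypothetical toy data: these values are made up for illustration only.
toy <- data.frame(
  in_sf = c(0, 0, 0, 1, 1, 1),
  elevation = c(5, 10, 12, 40, 75, 110)
)

# Assumption: in_sf is 1 for San Francisco and 0 for New York.
toy$city <- factor(toy$in_sf, levels = c(0, 1),
                   labels = c("New York", "San Francisco"))

# Mean elevation per city.
elev_by_city <- aggregate(elevation ~ city, data = toy, FUN = mean)
print(elev_by_city)

# Side-by-side box plots of elevation per city.
boxplot(elevation ~ city, data = toy,
        main = "Elevation by city (toy data)",
        xlab = "city", ylab = "elevation")
```

With the real data, `toy` would be replaced by `Homes`; whether elevation separates the cities well is something only the actual values can show.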

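standard_deviation <- sd(Homes$price, na.rm = TRUE)      # standard deviation of price
variation <- var(Homes$price_per_sqft, na.rm = TRUE)     # variance of price per square foot
summ <- sum(Homes$beds)                                  # total number of beds across all homes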

print(standard_deviation)
## [1] 2824055
print(variation)
## [1] 538412
print(summ)
## [1] 1060.5
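The standard deviation of price and the variance of price_per_sqft are not directly comparable: a variance is expressed in squared units, and the two variables sit on very different scales. A minimal sketch of putting them on a common footing is shown here, using the printed values above and the means from the summary() output. The coefficient of variation is one common option and is not part of the original analysis; the names `sd_ppsf`, `cv_price`, and `cv_ppsf` are illustrative.

```r
# Values copied from the printed output above (already rounded by print()).
sd_price <- 2824055   # standard deviation of price
var_ppsf <- 538412    # variance of price_per_sqft

# The standard deviation is the square root of the variance, which brings
# the spread back to the units of the variable itself.
sd_ppsf <- sqrt(var_ppsf)
print(sd_ppsf)

# Coefficient of variation = sd / mean; means taken from the summary() output.
cv_price <- sd_price / 2020696
cv_ppsf <- sd_ppsf / 1195.6
print(c(price = cv_price, price_per_sqft = cv_ppsf))
```

Because the inputs are rounded printed values, the results are approximate; on the real data the same quantities can be computed with sd() and mean() directly.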
plot(Homes$sqft, Homes$price_per_sqft, main="Scatter Plot of price_per_sqft vs sqft", xlab="sqft", ylab="price_per_sqft")

hist(Homes$beds, main="Histogram of beds", xlab="beds")

boxplot(Homes$price_per_sqft, main="Box Plot of price per sqft")

pie(table(Homes$bath), main="Pie Chart of bath counts")

result <- aggregate(Homes$sqft,by=list(Homes$elevation), mean)
result
##     Group.1         x
## 1         0 1503.5556
## 2         1 1805.5000
## 3         2 1333.6667
## 4         3 1990.3000
## 5         4 1558.5556
## 6         5 1272.4000
## 7         6  854.8750
## 8         7 1063.5714
## 9         8 1116.1538
## 10        9 1949.6250
## 11       10 1386.9600
## 12       11 1075.1000
## 13       12 1591.9286
## 14       13 1312.5556
## 15       14 1107.2500
## 16       15 2484.6429
## 17       16  939.1667
## 18       17  935.6000
## 19       18 1097.4286
## 20       19  965.1429
## 21       20 1869.0000
## 22       21 2826.2000
## 23       22 1766.6667
## 24       23 1431.9000
## 25       24 1137.1818
## 26       25  938.0000
## 27       26 2378.3333
## 28       27  759.7500
## 29       29 1271.0000
## 30       30 1140.0000
## 31       31  998.0000
## 32       32  720.0000
## 33       33 1389.0000
## 34       34 1457.5000
## 35       35 1142.1667
## 36       36 1030.0000
## 37       38 2100.0000
## 38       39 1049.0000
## 39       41 1264.3333
## 40       42 1500.0000
## 41       43 1144.0000
## 42       44 1316.0000
## 43       46  988.0000
## 44       48 1462.0000
## 45       49 2708.0000
## 46       50 1106.0000
## 47       51 1317.3333
## 48       52 1790.7500
## 49       53  950.0000
## 50       54 1552.6000
## 51       55 1984.7500
## 52       56 1110.0000
## 53       57 1877.0000
## 54       58 1970.0000
## 55       59 2141.0000
## 56       60 1051.5000
## 57       61 1374.0000
## 58       62 3479.0000
## 59       63 1256.0000
## 60       64 1050.0000
## 61       65 1772.0000
## 62       66 1506.2500
## 63       67 4628.5000
## 64       68 1282.5000
## 65       69 1234.6667
## 66       70  832.3333
## 67       71 1145.0000
## 68       72 1012.5000
## 69       73 1782.6000
## 70       74  685.0000
## 71       75 2601.0000
## 72       76 2520.5000
## 73       77 1675.0000
## 74       79 1760.0000
## 75       80  667.0000
## 76       81 2345.0000
## 77       83 2575.0000
## 78       84 1084.3333
## 79       86 1415.0000
## 80       87 2347.0000
## 81       88 1915.0000
## 82       89 2458.0000
## 83       90 2157.0000
## 84       91 1006.6667
## 85       92 1350.0000
## 86       94 1625.0000
## 87       95 1513.0000
## 88       97 2159.5000
## 89       98 3406.5000
## 90      102 1806.5000
## 91      103 1515.0000
## 92      105 1520.0000
## 93      106 1543.7500
## 94      108 2528.0000
## 95      110 1450.0000
## 96      112 1635.0000
## 97      118 1310.0000
## 98      119 2330.0000
## 99      121 1284.0000
## 100     123 1254.0000
## 101     125 2168.0000
## 102     127 3258.0000
## 103     131 2050.0000
## 104     136 1595.0000
## 105     139 1925.6667
## 106     140 1752.0000
## 107     141 1752.0000
## 108     143 1785.5000
## 109     153 2001.0000
## 110     160 1905.0000
## 111     163 1483.5000
## 112     174 3729.0000
## 113     176 2769.0000
## 114     179 1626.0000
## 115     181 2376.0000
## 116     185 1524.0000
## 117     187 1868.0000
## 118     189  990.0000
## 119     216 1305.0000
## 120     227 1603.0000
## 121     238 4813.0000
barplot(result$x, names.arg=result$Group.1, xlab="elevation", ylab="mean sqft",
        main="Mean sqft by elevation",border="black")
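Grouping by every distinct elevation value gives 121 bars, many of them based on only a few homes. A minimal sketch of an alternative is to bin elevation with cut() before aggregating. The data frame `toy`, its values, and the break points are all hypothetical and chosen only to keep the example self-contained.

```r
# Hypothetical toy data: values are made up for illustration only.
toy <- data.frame(
  elevation = c(0, 8, 15, 42, 60, 95, 130, 238),
  sqft = c(1500, 1100, 2400, 1500, 1000, 1500, 2000, 4800)
)

# Group elevation into bins; the break points are an arbitrary choice.
toy$elev_bin <- cut(toy$elevation, breaks = c(0, 25, 50, 100, 250),
                    include.lowest = TRUE)

# Mean sqft within each elevation bin.
binned <- aggregate(sqft ~ elev_bin, data = toy, FUN = mean)
print(binned)

# One bar per bin instead of one bar per distinct elevation value.
barplot(binned$sqft, names.arg = as.character(binned$elev_bin),
        xlab = "elevation bin", ylab = "mean sqft",
        main = "Mean sqft by elevation bin (toy data)")
```

Applied to Homes, the break points would need to be chosen with the actual elevation range (0 to 238) in mind.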