Previously, in Part 0, I cleaned up the data file.
Now that all the data is in the right format, it would be good to explore some initial questions through data visualization.
First, we read in the cleaned CSV file created in Part 0.

library(tidyverse) #provides read_csv(), the dplyr verbs, drop_na() and ggplot()

tenders<-read_csv("cleaned_tenders.csv")

1) Is there a relation between area and price?

A scatterplot of price against area would give a rough sense of any relationship between these variables.

Area and price do not appear to be strongly related, as low prices occur at every area size. Within a specific size range (5-10), however, there are also moderately and highly priced stores.

temp <- data.frame(area=tenders$area,price=tenders$bidNum) 
ggplot(temp,aes(area,price,color=price))+
  geom_point()+
  scale_color_continuous(trans='reverse')+
  geom_smooth(method="glm")+
  labs(title="Exploration Question 1", subtitle = "Plot of price against area", y = "Price", x = "Area", color = "Price")

#Zooming in: limiting x-axis to 5-10 and y-axis to 0-5000
ggplot(temp,aes(area,price,color=price))+
  geom_point()+
  coord_cartesian(xlim=c(5,10),ylim=c(0,5000))+
  scale_color_continuous(trans='reverse')+
  geom_smooth(method="glm")+
  labs(title="Exploration Question 1", subtitle = "Plot of price against area (zoomed in)", y = "Price", x = "Area", color = "Price")
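
To put a number on the visual impression from the scatterplots, a correlation coefficient can complement them. The sketch below uses a small made-up data frame (`toy`) rather than the real tenders data, so the values are illustrative only. With the real data one would presumably pass `tenders$area` and `tenders$bidNum` instead.

```r
# Illustrative data only (not taken from the tenders file)
toy <- data.frame(
  area  = c(5.0, 5.5, 6.0, 7.5, 9.0, 12.0, NA),
  price = c(800, 2500, 1200, 4000, 900, 1500, 1000)
)

# Pearson correlation; use = "complete.obs" drops rows containing NA
r <- cor(toy$area, toy$price, use = "complete.obs")

# Spearman rank correlation is less sensitive to a few very high bids
rho <- cor(toy$area, toy$price, method = "spearman", use = "complete.obs")

print(c(pearson = r, spearman = rho))
```

A coefficient close to 0 would support the reading that area alone says little about price.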

2) What is the average price per m2 per type and trade?

The average price-per-m2 per type is shown in the first bar chart,
and the average price-per-m2 per trade is shown in the second bar chart.

tenders %>% #avg price per type
  drop_na(bidNum) %>%
  drop_na(area) %>%
  group_by(type) %>%
  summarise(price = mean(priceM2)) %>%
  ggplot(aes(factor(type),price,fill=factor(type)))+geom_bar(stat = "identity")+
  labs(title="Exploration Question 2", subtitle = "Avg price-per-m2 per type", y = "Price", x = "Type")+ 
  theme(legend.position="none")

tenders %>% #avg price per trade
  drop_na(bidNum) %>%
  drop_na(area) %>%
  group_by(trade) %>%
  summarise(price = mean(priceM2))%>%
  ggplot(aes(factor(trade),price,fill=price))+geom_bar(stat = "identity")+
  scale_fill_continuous(trans='reverse')+
  labs(title="Exploration Question 2", subtitle = "Avg price-per-m2 per trade", y = "Price", x = "Trade")+ 
  theme(legend.position="none")+
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
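
As a side note, `geom_col()` is ggplot2's shorthand for `geom_bar(stat = "identity")`, and `reorder()` can sort the bars by value, which may make a chart with many trades easier to read. A minimal sketch on made-up summary data (`toy_avg` and its trade names are hypothetical):

```r
library(ggplot2)

# Made-up averages, for illustration only
toy_avg <- data.frame(
  trade = c("Cooked Food", "Drinks", "Cut Fruits"),
  price = c(320, 410, 150)
)

# reorder(trade, price) orders the bars from lowest to highest price
p <- ggplot(toy_avg, aes(reorder(trade, price), price, fill = price)) +
  geom_col() +
  labs(x = "Trade", y = "Price") +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 60, hjust = 1))
```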

3) What is the average price per m2 per hawker centre?

tenders %>% #avg price per centre
  group_by(centre) %>%
  summarise(price = mean(priceM2)) %>%
  ggplot(aes(centre,price,fill=price))+
  geom_bar(stat="identity")+
  scale_fill_continuous(trans='reverse')+
  labs(title="Exploration Question 3", subtitle = "Average price for each hawker center\n", x = "Hawker Center", y = "Price", fill = "Price")+
  theme(plot.title = element_text(size=25),plot.subtitle = element_text(size=20),axis.title = element_text(size=20),axis.text.y = element_text(size=10),axis.text.x = element_text(size=15),legend.text = element_text(size=15),legend.title = element_text(size=20))+
  coord_flip()

#I'm sorry Prof, I tried to increase the size of the chart but could not find a good way to do so. The best way I could think of to reduce the clutter was to lower the font size :(
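
One possible way to give a tall chart like the one above more room is to set the output size explicitly instead of shrinking the fonts. The sketch below saves a plot with `ggsave()` at a chosen width and height. The data frame `toy_centres`, the file name, and the dimensions are all placeholders. If the document is knitted with R Markdown, the `fig.width` and `fig.height` chunk options serve the same purpose.

```r
library(ggplot2)

# Placeholder data: 40 hypothetical centres
toy_centres <- data.frame(
  centre = paste("Centre", 1:40),
  price  = seq(50, 245, by = 5)
)

p <- ggplot(toy_centres, aes(centre, price, fill = price)) +
  geom_col() +
  coord_flip()

# A taller canvas leaves more vertical space for each centre label
out_file <- file.path(tempdir(), "centres.png")
ggsave(out_file, plot = p, width = 8, height = 12, units = "in", dpi = 150)
```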

4) How many bids are there for each hawker center?

To show this, we first group the data by centre, then count the number of bids (rows) in each group.
We then plot a bar chart showing the number of bids for each hawker center.

tenders %>% #number of bids per centre
  group_by(centre) %>%
  summarise(bids = n()) %>%
  ggplot(aes(centre,bids,fill=bids))+
  geom_bar(stat="identity")+
  scale_fill_continuous(trans='reverse')+
  labs(title="Exploration Question 4", subtitle = "Number of bids for each hawker center\n", x = "Hawker Center", y = "Bids", fill = "Bids")+
  theme(plot.title = element_text(size=25),plot.subtitle = element_text(size=20),axis.title = element_text(size=20),axis.text.y = element_text(size=10),axis.text.x = element_text(size=15),legend.text = element_text(size=15),legend.title = element_text(size=20))+
  coord_flip()
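
The group-then-count step used above can also be written with `dplyr::count()`, which combines `group_by()` and `summarise(n())` in one call. The sketch below checks on a made-up data frame (`toy_bids`) that both forms give the same counts:

```r
library(dplyr)

# Made-up bids: one row per bid, as assumed for the tenders data
toy_bids <- data.frame(centre = c("A", "B", "A", "C", "A", "B"))

# Long form, as in the pipeline above
by_hand <- toy_bids %>%
  group_by(centre) %>%
  summarise(bids = n())

# Equivalent shortcut
shortcut <- toy_bids %>% count(centre, name = "bids")

print(shortcut)
```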

5) Does the number of bids change over time?

From the graph below, we can see that the number of bids varies from year to year.
There are spikes in the number of bids at certain points in each year, but these periods do not seem to follow a pattern.
Overall, the number of bids increases over the years, as seen from the positive slope of the regression line.

tenders %>% 
  group_by(date) %>%
  summarise(bids = n()) %>%
  ggplot(aes(date,bids))+geom_line()+
  geom_smooth(method="glm")+
  labs(title="Exploration Question 5", subtitle = "Change in bids over time\n", x = "Time", y = "Bids")

#I am not sure how to scale the x-axis. I tried adding scale_x_continuous, but it throws errors regardless of the input, perhaps because the x-axis is in date format. Adding as.date did not help either, so I left the default scaling.
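
A likely cause of the errors described in the note above, assuming `date` is stored as a `Date` column, is that ggplot2 expects `scale_x_date()` for such an axis rather than `scale_x_continuous()`. Note also that the base R conversion function is spelled `as.Date()`, with a capital D. The sketch below uses made-up monthly data (`toy_ts`); the break interval and label format are example choices.

```r
library(ggplot2)

# Made-up monthly series, for illustration only
toy_ts <- data.frame(
  date = seq(as.Date("2012-01-01"), by = "month", length.out = 36),
  bids = rep(c(10, 14, 9, 20, 12, 16), times = 6)
)

p <- ggplot(toy_ts, aes(date, bids)) +
  geom_line() +
  # Date axes are controlled with scale_x_date()
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  labs(x = "Time", y = "Bids")
```

If `date` were read in as text instead, it would need converting with `as.Date()` before plotting.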

6) Does the average price per m2 change over time?

From the graph, we see that the average price per m2 was anomalously high at $175 in early 2012, after which it fell to $75.
The average price per m2 varies throughout the year with sharp increases and decreases, seemingly at random, but it could be a response to the demand (number of bids), or vice versa.
Overall the prices per m2 have increased slightly over the years, as seen by the regression line.

tenders %>% 
  group_by(date) %>%
  summarise(price = mean(priceM2)) %>%
  ggplot(aes(date,price))+geom_line()+
  geom_smooth(method="glm")+
  labs(title="Exploration Question 6", subtitle = "Change in average price per m2 over time\n", x = "Time", y = "Price per M2")
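
To probe whether price moves with demand, the number of bids and the average price per m2 can be summarised per date in one pass and then compared. This is only a sketch on made-up data (`toy`), with the column names `date` and `priceM2` borrowed from the pipelines above. The `na.rm = TRUE` argument is an added assumption that missing prices should be ignored.

```r
library(dplyr)

# Made-up bids, for illustration only
toy <- data.frame(
  date    = as.Date(c("2012-01-01", "2012-01-01", "2012-02-01",
                      "2012-02-01", "2012-02-01", "2012-03-01")),
  priceM2 = c(100, 120, 80, 90, NA, 150)
)

# One row per date: number of bids and mean price per m2
per_date <- toy %>%
  group_by(date) %>%
  summarise(bids  = n(),
            price = mean(priceM2, na.rm = TRUE))

print(per_date)

# Correlation between demand and price across dates
r <- cor(per_date$bids, per_date$price)
print(r)
```

A correlation computed this way describes co-movement only and would not show which of the two drives the other.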


PART 0 - Preparation
PART 1 - Data Manipulation
PART 2 - Spatial Data
PART 3 - Spatial Point Patterns