Introduction

This is the assignment for the second week of the Developing Data Products course on Coursera. Many thanks to Arthur Charpentier, who combined data from the French INSEE (Institut National de la Statistique et des Études Économiques) into a data set that lists the population per Région, Département or Commune. I used the lowest level (Commune), which gives a list of over 36,000 areas with their registered population. Holidaymakers in France can use it to pick an area to visit, whether busy or quiet.

Data Processing

Data loading

knitr::opts_chunk$set(verbose = FALSE, message = FALSE, warning = FALSE)

library(ggplot2)
library(leaflet)
setwd("~/Datasciencecoursera/Module 9 Developing Data Products")
chart_data <- read.csv("http://freakonometrics.free.fr/popfr19752010.csv", header=TRUE)

Data preprocessing

Keep only the columns needed: longitude and latitude for mapping, the commune name (column com_nom, renamed common_name below) and the population in the year 2010.

chart_df <- chart_data[,c("long", "lat","com_nom", "pop_2010")]
colnames(chart_df) <- c("lng", "lat", "common_name", "pop_2010")
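
The selection and renaming can be tried on a small stand-in data frame before downloading the full file. This is a minimal sketch: the three rows and the extra column pop_other are invented for illustration, and dropping rows with missing coordinates is an additional step that is not part of the original code.

```r
# Toy stand-in for chart_data; all values are invented for illustration.
toy_data <- data.frame(
  com_nom   = c("Ville A", "Ville B", "Ville C"),
  long      = c(2.35, NA, 5.37),
  lat       = c(48.85, 45.76, 43.30),
  pop_2010  = c(1000, 2000, 3000),
  pop_other = c(900, 1800, 2500)
)

# Keep only the mapping columns and rename them to the lng/lat names
# that leaflet recognises.
toy_df <- toy_data[, c("long", "lat", "com_nom", "pop_2010")]
colnames(toy_df) <- c("lng", "lat", "common_name", "pop_2010")

# Assumption: rows without coordinates cannot be mapped, so drop them.
toy_df <- toy_df[complete.cases(toy_df[, c("lng", "lat")]), ]
print(toy_df)
```

Whether the real file contains rows with missing coordinates is not checked here; the filter is harmless if it does not.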

With more than 36,000 data points, individual markers would overlap, so the markers are grouped with clusterOptions, which merges nearby markers into clusters that split apart as you zoom in.

chart_df %>%
  leaflet() %>%
  addTiles() %>%
  addProviderTiles(providers$OpenStreetMap) %>%
  addMarkers(
    popup = paste("The population of ", chart_df$common_name, " is ", round(chart_df$pop_2010)),
    clusterOptions = markerClusterOptions()
  )
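
The popup text relies on paste() being vectorised, so one string is produced per commune. The sketch below uses base R only, with made-up names and populations; the paste0/formatC variant is an optional alternative, not what the map call above uses.

```r
# Made-up communes and populations, for illustration only.
common_name <- c("Ville A", "Ville B")
pop_2010 <- c(1234.4, 56789.6)

# Same construction as in the map call: one popup string per row.
# paste() inserts its own separator, so the spaces are doubled.
popup_plain <- paste("The population of ", common_name, " is ", round(pop_2010))

# Optional variant: paste0() adds no separator, and formatC() inserts a
# thousands separator into the rounded population.
popup_pretty <- paste0(
  "The population of ", common_name, " is ",
  formatC(round(pop_2010), format = "d", big.mark = ",")
)
print(popup_pretty)
```

Browsers collapse repeated spaces when rendering the popup, so the doubled spaces in the first version are mostly a cosmetic issue in the underlying string.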

Another way to represent the data is with circles whose size depends on the population.

chart_df %>%
  leaflet() %>%
  addTiles() %>%
  addProviderTiles(providers$OpenStreetMap) %>%
  addCircles(weight = 1, radius = sqrt(round(chart_df$pop_2010)) / 1000)
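
Because addCircles() interprets radius in metres, it can help to check what the formula produces before plotting. The sketch below uses base R only; the populations are round illustrative numbers, and the helper circle_radius with its scale factor of 20 is a hypothetical choice, not part of the original analysis.

```r
# Illustrative populations (not taken from the data set).
pop <- c(100, 10000, 1000000)

# Radius as computed in the call above: square root of population over 1000.
radius_original <- sqrt(round(pop)) / 1000
print(radius_original)  # 0.01, 0.1 and 1 metres

# Hypothetical alternative: scale the square root up, so that circle areas
# stay proportional to population while large communes become visibly bigger.
circle_radius <- function(population, scale = 20) {
  sqrt(population) * scale
}
print(circle_radius(pop))  # 200, 2000 and 20000 metres
```

With the original formula even a commune of a million inhabitants gets a radius of about one metre, so the map mainly shows where communes are located; a larger scale factor would be needed to make circle size itself readable.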

This gives a nice picture of the population density. One can clearly spot the sparsely inhabited areas of the Pyrenees and the Alps, and the densely populated areas around Paris.