August 12, 2017

Synopsis

This is the second project assignment for the Developing Data Products course in Coursera's Data Science specialization track. The purpose of this project is to create an interactive plot with the plotly R package. I will create an interactive plot of data on artisanal companies in France (number of companies, sales revenue, net added value) per department. The original data was downloaded from data.gouv.fr.

Checking for required packages, installing them if necessary, then loading them

if (!require("plotly")) {
  install.packages("plotly")
}
## Loading required package: plotly
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
if (!require("readxl")) {
  install.packages("readxl")
}
## Loading required package: readxl
library(plotly)
library(readxl)
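
The same check can be written once for any number of packages. The following is only a minimal sketch, not part of the original assignment: `ensure_packages` and its `install` argument are names made up here, and it uses base R's `requireNamespace()`, which tests whether a package can be loaded without attaching it.

```r
# Hypothetical helper (not from the original report): returns the names of
# the packages that are not available, optionally installing them first.
ensure_packages <- function(pkgs, install = FALSE) {
  # requireNamespace() returns TRUE/FALSE and does not attach the package
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!available]
  if (install && length(missing_pkgs) > 0) {
    install.packages(missing_pkgs)
  }
  invisible(missing_pkgs)
}

# Report which of the two packages used in this document are missing
missing_pkgs <- ensure_packages(c("plotly", "readxl"))
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

With `install = TRUE` the helper would try to install whatever is missing. It is left off by default here so that the sketch has no side effects.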

Loading data

Downloading the data into the DataProduct2 folder

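# Create the target folder if it does not exist yet
if (!dir.exists("./DataProduct2")) {
  dir.create("./DataProduct2")
}
# Download the workbook only once; mode = "wb" keeps the binary .xls file intact on Windows
if (!file.exists("./DataProduct2/artisans_etalab.xls")) {
  fileUrl1 <- "https://www.data.gouv.fr/storage/f/2014-01-13T18-09-10/artisans_etalab.xls"
  download.file(fileUrl1, destfile = "./DataProduct2/artisans_etalab.xls", mode = "wb")
}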

Loading the data

Artisan <- read_xls("./DataProduct2/artisans_etalab.xls")

Quick exploration of the dataset

dim(Artisan)
## [1] 73  4
head(Artisan)
## # A tibble: 6 x 4
##      Departement entreprises_artisanales chiffre_affaires Valeur_ajoutee
##            <chr>                   <dbl>            <dbl>          <dbl>
## 1     Guadeloupe                   11150        1442724.0       447836.2
## 2     Martinique                   10032        1179135.3       411015.0
## 3         Guyane                    3977         673351.6       212134.0
## 4     La Réunion                   14253        2658120.2       917440.4
## 5          Paris                   41415       10366280.5      4119777.4
## 6 Seine-et-Marne                   20120        5260904.4      1967502.3
str(Artisan)
## Classes 'tbl_df', 'tbl' and 'data.frame':    73 obs. of  4 variables:
##  $ Departement            : chr  "Guadeloupe" "Martinique" "Guyane" "La Réunion" ...
##  $ entreprises_artisanales: num  11150 10032 3977 14253 41415 ...
##  $ chiffre_affaires       : num  1442724 1179135 673352 2658120 10366281 ...
##  $ Valeur_ajoutee         : num  447836 411015 212134 917440 4119777 ...
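
A few extra sanity checks can complement `dim()`, `head()` and `str()`. The sketch below is illustrative only: it retypes the six rows printed by `head(Artisan)` into a small data frame called `artisan_head` (a name introduced here), so the results describe those six rows and not the full 73-row dataset.

```r
# The six rows shown by head(Artisan), retyped by hand for illustration
artisan_head <- data.frame(
  Departement = c("Guadeloupe", "Martinique", "Guyane",
                  "La Réunion", "Paris", "Seine-et-Marne"),
  entreprises_artisanales = c(11150, 10032, 3977, 14253, 41415, 20120),
  chiffre_affaires = c(1442724.0, 1179135.3, 673351.6,
                       2658120.2, 10366280.5, 5260904.4),
  Valeur_ajoutee = c(447836.2, 411015.0, 212134.0,
                     917440.4, 4119777.4, 1967502.3),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(artisan_head))

# Index of the first duplicated department name, or 0 if there is none
first_duplicate <- anyDuplicated(artisan_head$Departement)

# Minimum (row 1) and maximum (row 2) of each numeric column
value_ranges <- sapply(artisan_head[-1], range)

print(na_counts)
print(first_duplicate)
print(value_ranges)
```

Comparing the ranges shows that the company counts are far smaller numbers than the two monetary columns, which is why the plot in the next section puts the monetary series on a second y axis.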

The interactive plot

plot_ly(data = Artisan) %>%
  add_trace(x = ~Departement, y = ~entreprises_artisanales, type = "bar", color = I("green"), name = "Artisanal Company Numbers") %>%
  add_trace(x = ~Departement, y = ~chiffre_affaires, yaxis = "y2", type = "bar", color = I("blue"), name = "Sales Revenue in Euros") %>%
  add_trace(x = ~Departement, y = ~Valeur_ajoutee, yaxis = "y2", type = "bar", color = I("purple"), name = "Net Added Value in Euros") %>%
  # The secondary axis y2 must overlay the primary axis "y" (it cannot overlay itself)
  layout(title = "Artisanal companies in France per department",
         yaxis2 = list(overlaying = "y", side = "right"),
         yaxis = list(title = FALSE),
         xaxis = list(title = FALSE))
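
The two-axis layout can be tried in isolation on a smaller example. This is a sketch under stated assumptions, not the original plot: it uses only the six rows printed by `head(Artisan)`, stored in a data frame named `artisan_head` for this example, and it draws the revenue series as a line so that it does not sit on top of the bars. The key setting is `overlaying = "y"` in `yaxis2`, which places the secondary axis over the primary one.

```r
library(plotly)

# Six rows retyped from the head(Artisan) output, for illustration only
artisan_head <- data.frame(
  Departement = c("Guadeloupe", "Martinique", "Guyane",
                  "La Réunion", "Paris", "Seine-et-Marne"),
  entreprises_artisanales = c(11150, 10032, 3977, 14253, 41415, 20120),
  chiffre_affaires = c(1442724.0, 1179135.3, 673351.6,
                       2658120.2, 10366280.5, 5260904.4),
  stringsAsFactors = FALSE
)

p <- plot_ly(data = artisan_head) %>%
  # Counts go on the primary (left) axis
  add_trace(x = ~Departement, y = ~entreprises_artisanales,
            type = "bar", name = "Artisanal Company Numbers") %>%
  # Revenue goes on the secondary axis; a line is a choice of this sketch
  add_trace(x = ~Departement, y = ~chiffre_affaires, yaxis = "y2",
            type = "scatter", mode = "lines+markers",
            name = "Sales Revenue") %>%
  layout(
    yaxis  = list(title = "Number of companies"),
    # y2 is drawn on the right-hand side, on top of the primary axis "y"
    yaxis2 = list(title = "Sales revenue", overlaying = "y", side = "right")
  )

p
```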
