Data Visualization Homework 1

Antanas Kaminskas

Task

  1. Choose a data set (with more than 5 attributes) and explain why it is important or interesting to you.
  2. Formulate research questions that you expect the data to answer.
  3. Make visualizations for the formulated questions. Prepare a presentation (explaining the data, questions, problems, and results) and upload it.

Questions

  1. How does DMC affect ISI?
  2. How does RH depend on FFMC on different days of the week?
  3. How are wind and burned area related in different months?
  4. How does temperature depend on FFMC in different months?
  5. How does mean temperature depend on FFMC?

getwd()
## [1] "C:/Users/antanas.kaminskas/Desktop"
duom <- read.csv2("C:/Users/antanas.kaminskas/Desktop/forestfires.csv",
                      header = TRUE, sep = ";", dec = ".")

Data set attributes explained

  1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
  2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
  3. month - month of the year: “jan” to “dec”
  4. day - day of the week: “mon” to “sun”
  5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
  6. DMC - DMC index from the FWI system: 1.1 to 291.3
  7. DC - DC index from the FWI system: 7.9 to 860.6
  8. ISI - ISI index from the FWI system: 0.0 to 56.10
  9. temp - temperature in Celsius degrees: 2.2 to 33.30
  10. RH - relative humidity in %: 15.0 to 100
  11. wind - wind speed in km/h: 0.40 to 9.40
  12. rain - outside rain in mm/m²: 0.0 to 6.4
  13. area - the burned area of the forest (in ha): 0.00 to 1090.84 (this output variable is very skewed towards 0.0, thus it may make sense to model with the logarithm transform).
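
The note on `area` suggests a logarithm transform. A minimal sketch of one common choice, `log1p()`, is given here; the vector `area_example` is hypothetical (only 0 and 1090.84 come from the range listed above, the middle values are made up for illustration).

```r
# Illustrative burned areas in hectares. 0 and 1090.84 are the range limits
# from the attribute list; 0.5 and 10 are invented for this example.
area_example <- c(0, 0.5, 10, 1090.84)

# log1p(x) computes log(1 + x), so fires with zero burned area map to 0
# instead of -Inf, which plain log() would give.
log_area <- log1p(area_example)

# expm1() is the inverse and recovers the original hectares.
back <- expm1(log_area)

print(round(log_area, 3))
```

On the full data the same idea would be `duom$log_area <- log1p(duom$area)`, assuming the data frame is loaded as `duom`.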

Data structure: str()

str(duom)
## 'data.frame':    517 obs. of  13 variables:
##  $ X    : int  7 7 7 8 8 8 8 8 8 7 ...
##  $ Y    : int  5 4 4 6 6 6 6 6 6 5 ...
##  $ month: chr  "mar" "oct" "oct" "mar" ...
##  $ day  : chr  "fri" "tue" "sat" "fri" ...
##  $ FFMC : num  86.2 90.6 90.6 91.7 89.3 92.3 92.3 91.5 91 92.5 ...
##  $ DMC  : num  26.2 35.4 43.7 33.3 51.3 ...
##  $ DC   : num  94.3 669.1 686.9 77.5 102.2 ...
##  $ ISI  : num  5.1 6.7 6.7 9 9.6 14.7 8.5 10.7 7 7.1 ...
##  $ temp : num  8.2 18 14.6 8.3 11.4 22.2 24.1 8 13.1 22.8 ...
##  $ RH   : int  51 33 33 97 99 29 27 86 63 40 ...
##  $ wind : num  6.7 0.9 1.3 4 1.8 5.4 3.1 2.2 5.4 4 ...
##  $ rain : num  0 0 0 0.2 0 0 0 0 0 0 ...
##  $ area : num  0 0 0 0 0 0 0 0 0 0 ...
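
The `str()` output shows `month` and `day` stored as plain character columns, so plots and sorts would order them alphabetically. One possible remedy, sketched here on the first four values printed above, is to convert them to factors with calendar-ordered levels. The names `sample_rows`, `month_levels`, and `day_levels` are assumptions introduced for this example.

```r
# First four values of each column, copied from the str() output.
sample_rows <- data.frame(
  month = c("mar", "oct", "oct", "mar"),
  day   = c("fri", "tue", "sat", "fri"),
  stringsAsFactors = FALSE
)

# Calendar order for the level labels used in the data set.
month_levels <- c("jan", "feb", "mar", "apr", "may", "jun",
                  "jul", "aug", "sep", "oct", "nov", "dec")
day_levels <- c("mon", "tue", "wed", "thu", "fri", "sat", "sun")

sample_rows$month <- factor(sample_rows$month, levels = month_levels)
sample_rows$day   <- factor(sample_rows$day, levels = day_levels)

# Sorting now follows the week rather than the alphabet.
sorted_days <- as.character(sort(sample_rows$day))
print(sorted_days)
```

Applied to `duom`, the same two `factor()` calls would make legends and axes appear in calendar order.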

Library used for the visualizations

#install.packages("ggpubr")
library(ggpubr)
## Loading required package: ggplot2
ggplot(data = duom, mapping = aes(x = DMC , y = ISI )) + 
  geom_point() + 
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
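
A scatter plot with a smoother shows the shape of the DMC and ISI relationship; a rank correlation could put a number on it. The sketch below uses only the first five rows visible in the `str()` output, so its result says nothing about the full 517 observations. `first_rows` and `rho` are names introduced for this example.

```r
# First five DMC and ISI values, copied from the str() output.
first_rows <- data.frame(
  DMC = c(26.2, 35.4, 43.7, 33.3, 51.3),
  ISI = c(5.1, 6.7, 6.7, 9.0, 9.6)
)

# Spearman correlation works on ranks, so it does not assume a linear
# relationship and is less sensitive to outliers than Pearson.
rho <- cor(first_rows$DMC, first_rows$ISI, method = "spearman")
print(rho)
```

For the whole data set the call would be `cor(duom$DMC, duom$ISI, method = "spearman")`.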

ggplot(duom, aes(x = FFMC, y = RH, color = day)) +
  geom_boxplot()
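
A box plot usually expects a discrete x-axis, while FFMC is continuous. One option, sketched here with base R's `cut()`, is to bin FFMC first. The bin edges 0, 90, 92, and 100 are hypothetical and chosen only so that the ten rows from the `str()` output spread over several bins.

```r
# First ten FFMC and RH values, copied from the str() output.
first_rows <- data.frame(
  FFMC = c(86.2, 90.6, 90.6, 91.7, 89.3, 92.3, 92.3, 91.5, 91.0, 92.5),
  RH   = c(51, 33, 33, 97, 99, 29, 27, 86, 63, 40)
)

# cut() turns the continuous index into a factor of intervals.
# By default each interval is open on the left and closed on the right.
first_rows$FFMC_bin <- cut(first_rows$FFMC, breaks = c(0, 90, 92, 100))

# Number of observations that fall in each bin.
bin_counts <- table(first_rows$FFMC_bin)
print(bin_counts)
```

With the full data, mapping `aes(x = FFMC_bin, y = RH, color = day)` in `geom_boxplot()` would then draw one box per bin and day.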

ggplot(data = duom) + 
  geom_point(mapping = aes(x = wind,
                           y = area , color = month))

ggplot(data = duom) + 
  stat_summary(
    mapping = aes(x = FFMC, y = temp),
    fun.min = min,
    fun.max = max,
    fun = median)
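
`stat_summary()` with `fun.min`, `fun`, and `fun.max` summarises `temp` at each distinct FFMC value. To inspect the same numbers as a table, a base R equivalent could look like the sketch below, again limited to the first ten rows from the `str()` output; `temp_summary` is a name introduced here.

```r
# First ten FFMC and temp values, copied from the str() output.
first_rows <- data.frame(
  FFMC = c(86.2, 90.6, 90.6, 91.7, 89.3, 92.3, 92.3, 91.5, 91.0, 92.5),
  temp = c(8.2, 18.0, 14.6, 8.3, 11.4, 22.2, 24.1, 8.0, 13.1, 22.8)
)

# One row per distinct FFMC value with the minimum, median, and maximum
# temperature. The result stores the three statistics as a matrix column.
temp_summary <- aggregate(
  temp ~ FFMC,
  data = first_rows,
  FUN = function(v) c(min = min(v), median = median(v), max = max(v))
)
print(temp_summary)
```

Replacing `median` with `mean` in both the plot and the table would match the question about mean temperature more directly.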

ggplot(data = duom) + 
  geom_point(mapping = aes(x = ISI , y = temp , color = month))