In this activity, we build a histogram and a descriptive analysis (mean, median, standard deviation) of one quantitative variable from the dataset "df_pokemon.RData".
load("/Users/eduar/Base_de_dados-master/df_pokemon.RData")
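Because `load()` creates objects silently, it can help to check which names it restored before using `df`. The sketch below is only an illustration: it writes a small hypothetical data frame to a temporary file instead of reading the real "df_pokemon.RData", and the names `path`, `env`, and `loaded` are assumptions made for the example.

```r
# Save a hypothetical data frame to a temporary .RData file (stand-in for df_pokemon.RData).
path <- tempfile(fileext = ".RData")
df <- data.frame(id = 1:3, pokemon = c("a", "b", "c"))
save(df, file = path)
rm(df)

# load() returns, invisibly, the names of the objects it restored.
# Loading into a separate environment avoids overwriting objects in the workspace.
env <- new.env()
loaded <- load(path, envir = env)
print(loaded)           # names of the restored objects
str(env[[loaded[1]]])   # structure (column names and types) of the first one
```

With the real file, the same `load()` call on the original path would show whether the data frame is in fact named `df`.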
class(df$id)
## [1] "numeric"
class(df$pokemon)
## [1] "character"
class(df$species_id)
## [1] "integer"
class(df$height)
## [1] "integer"
class(df$weight)
## [1] "integer"
class(df$base_experience)
## [1] "integer"
class(df$type_1)
## [1] "character"
class(df$type_2)
## [1] "character"
class(df$attack)
## [1] "integer"
class(df$defense)
## [1] "integer"
class(df$hp)
## [1] "integer"
class(df$special_attack)
## [1] "integer"
class(df$special_defense)
## [1] "integer"
class(df$speed)
## [1] "integer"
class(df$color_1)
## [1] "character"
class(df$color_2)
## [1] "character"
class(df$color_f)
## [1] "character"
class(df$egg_group_1)
## [1] "character"
class(df$egg_group_2)
## [1] "character"
class(df$url_image)
## [1] "character"
class(df$x)
## [1] "numeric"
class(df$y)
## [1] "numeric"
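The column-by-column `class()` calls above can also be written as a single call. This is a minimal sketch that uses a small hypothetical data frame (`toy`) in place of the real `df`; with the real data, `sapply(df, class)` should list the same classes shown above.

```r
# Hypothetical stand-in for the loaded data frame; the real `df` comes from df_pokemon.RData.
toy <- data.frame(
  pokemon = c("a", "b", "c"),
  height = c(7L, 10L, 20L),
  x = c(0.5, -1.2, 3.3),
  stringsAsFactors = FALSE
)

# Class of every column in one call, instead of one class() call per column.
col_classes <- sapply(toy, class)
print(col_classes)

# Names of the columns stored as numbers (integer or numeric).
numeric_cols <- names(toy)[sapply(toy, is.numeric)]
print(numeric_cols)
```

A numeric storage type does not by itself make a variable quantitative: columns such as `id` and `species_id` are stored as numbers but work as identifiers.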
After checking the class of each variable, I identified which ones were quantitative and which were qualitative, and chose one to analyze: special_attack.
mean(df$special_attack)
## [1] 68.46797
The mean special attack was 68.5.
median(df$special_attack)
## [1] 65
The median special attack was 65.
sd(df$special_attack)
## [1] 28.53129
The standard deviation of the special attack was 28.5.
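There were no outliers: with quartiles of 45 and 90, the 1.5 × IQR fences are -22.5 and 157.5, and every value lies between 10 and 154. There was therefore no need to prefer a more robust measure (such as the median) over the mean for this variable.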
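The outlier check can be made explicit in code. The sketch below assumes the usual 1.5 × IQR convention; the helper `iqr_outliers` and the vector `example` are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Return the values of x that fall outside the 1.5 * IQR fences.
# Note: boxplot() computes its box from hinges (see fivenum()), which can
# differ slightly from the quantile() quartiles used here.
iqr_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]
}

# Illustrative vector (not the Pokémon data): 100 lies far above the rest.
example <- c(10, 12, 13, 14, 15, 16, 100)
print(iqr_outliers(example))

# With the real data this would be: iqr_outliers(df$special_attack)
```

An empty result for `df$special_attack` would agree with the conclusion that the variable has no outliers.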
summary(df)
##       id              pokemon          species_id         height
##  Min.   :  1.0   Length:718         Min.   :  1.0   Min.   :  1.00
##  1st Qu.:180.2   Class :character   1st Qu.:180.2   1st Qu.:  6.00
##  Median :359.5   Mode  :character   Median :359.5   Median : 10.00
##  Mean   :359.5                      Mean   :359.5   Mean   : 11.41
##  3rd Qu.:538.8                      3rd Qu.:538.8   3rd Qu.: 14.00
##  Max.   :718.0                      Max.   :718.0   Max.   :145.00
##      weight       base_experience        type_1             type_2
##  Min.   :   1.0   Min.   : 36.00    Length:718         Length:718
##  1st Qu.:  95.0   1st Qu.: 65.25    Class :character   Class :character
##  Median : 280.0   Median :147.00    Mode  :character   Mode  :character
##  Mean   : 568.2   Mean   :141.55
##  3rd Qu.: 609.5   3rd Qu.:177.00
##  Max.   :9500.0   Max.   :608.00
##      attack          defense             hp         special_attack
##  Min.   :  5.00   Min.   :  5.00   Min.   :  1.00   Min.   : 10.00
##  1st Qu.: 53.00   1st Qu.: 50.00   1st Qu.: 50.00   1st Qu.: 45.00
##  Median : 73.00   Median : 65.00   Median : 65.00   Median : 65.00
##  Mean   : 74.85   Mean   : 70.67   Mean   : 68.37   Mean   : 68.47
##  3rd Qu.: 95.00   3rd Qu.: 85.00   3rd Qu.: 80.00   3rd Qu.: 90.00
##  Max.   :165.00   Max.   :230.00   Max.   :255.00   Max.   :154.00
##  special_defense       speed            color_1            color_2
##  Min.   : 20.00    Min.   :  5.00   Length:718         Length:718
##  1st Qu.: 50.00    1st Qu.: 45.00   Class :character   Class :character
##  Median : 65.00    Median : 65.00   Mode  :character   Mode  :character
##  Mean   : 69.09    Mean   : 65.72
##  3rd Qu.: 85.00    3rd Qu.: 85.00
##  Max.   :230.00    Max.   :160.00
##      color_f          egg_group_1        egg_group_2         url_image
##  Length:718         Length:718         Length:718         Length:718
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##         x                 y
##  Min.   :-49.152   Min.   :-45.793
##  1st Qu.:-17.695   1st Qu.:-17.293
##  Median :  0.705   Median : -0.628
##  Mean   :  0.000   Mean   :  0.000
##  3rd Qu.: 15.905   3rd Qu.: 18.155
##  Max.   : 53.142   Max.   : 46.593
boxplot(df$special_attack)
boxplot(df$special_attack, col = "violet",
        main = "Special Attack Boxplot",
        horizontal = TRUE)
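Boxplot analysis:
1st quartile = 45
2nd quartile (median) = 65
3rd quartile = 90
minimum = 10
maximum = 154
The values are more spread out between the median and the maximum (a range of 89) than between the minimum and the median (a range of 55). Since each half holds 50% of the observations, the data are more concentrated below the median, and the upper tail is longer.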
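The comparison between the two halves of the boxplot is simple arithmetic on the values reported by `summary(df)` for special_attack. The sketch below only restates those reported numbers; the variable names are chosen for illustration.

```r
# Five-number summary of special_attack, as reported by summary(df).
s <- c(min = 10, q1 = 45, median = 65, q3 = 90, max = 154)

# Distance covered by each half of the data (each half holds 50% of the observations).
lower_range <- s[["median"]] - s[["min"]]   # minimum to median
upper_range <- s[["max"]] - s[["median"]]   # median to maximum

# Width of each half of the box (each holds 25% of the observations).
lower_box <- s[["median"]] - s[["q1"]]
upper_box <- s[["q3"]] - s[["median"]]

print(c(lower_range = lower_range, upper_range = upper_range,
        lower_box = lower_box, upper_box = upper_box))
```

The upper half spans 89 points against 55 for the lower half, and the upper part of the box is also wider (25 against 20), which is what a longer right tail looks like in a boxplot.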
hist(df$special_attack,
     col = c("springgreen", "blue", "red", "red", "pink", "red",
             "sienna3", "sienna3", "sienna3", "purple3", "rosybrown",
             "rosybrown", "peachpuff1", "peachpuff1", "orange1"),
     main = "Special Attack Histogram",
     ylab = "Frequency",
     xlab = "Special Attack")
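Histogram analysis:
The histogram is right-skewed (positively skewed): most values sit on the left and the tail stretches to the right, which is consistent with the mean (68.47) being larger than the median (65).
For special attack values between 50 and 100, the bar frequencies range from 10 to 100; between 100 and 150, they fall from 65 to 1.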
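Bar frequencies like the ones quoted above can be read from the object that `hist()` returns instead of being estimated from the plot. The sketch below uses a small illustrative vector, not the Pokémon data, and the break points are an assumption made for the example.

```r
# Illustrative values only (not the Pokémon data).
example <- c(12, 25, 31, 38, 44, 47, 52, 55, 58, 63, 71, 96)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(example, breaks = seq(0, 100, by = 20), plot = FALSE)

# One row per bar: interval limits (right-closed by default) and the count in each.
bins <- data.frame(
  from = head(h$breaks, -1),
  to = tail(h$breaks, -1),
  count = h$counts
)
print(bins)

# With the real data this would be: hist(df$special_attack, plot = FALSE)
```

Calling it on `df$special_attack` would give the exact count for every bar, which makes the frequency ranges in the analysis easy to verify.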