Load the dataset

arte_MOMA <- read.csv2("C:/Estatistica/Base_de_dados-master/arte_MOMA.csv")

Question 1

How many paintings are there in the MoMA collection? How many variables does the dataset have?

str(arte_MOMA)
## 'data.frame':    2253 obs. of  24 variables:
##  $ X                : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ title            : chr  "Rope and People, I" "Fire in the Evening" "Portrait of an Equilibrist" "Guitar" ...
##  $ artist           : chr  "Joan Miró" "Paul Klee" "Paul Klee" "Pablo Picasso" ...
##  $ artist_bio       : chr  "(Spanish, 1893-1983)" "(German, born Switzerland. 1879-1940)" "(German, born Switzerland. 1879-1940)" "(Spanish, 1881-1973)" ...
##  $ artist_birth_year: int  1893 1879 1879 1881 1880 1879 1943 1880 1839 1894 ...
##  $ artist_death_year: int  1983 1940 1940 1973 1946 1953 1977 1950 1906 1956 ...
##  $ num_artists      : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ n_female_artists : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ n_male_artists   : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ artist_gender    : chr  "Male" "Male" "Male" "Male" ...
##  $ year_acquired    : int  1936 1970 1966 1955 1939 1968 1997 1931 1934 1941 ...
##  $ year_created     : int  1935 1929 1927 1919 1925 1919 1970 1929 1885 1930 ...
##  $ circumference_cm : logi  NA NA NA NA NA NA ...
##  $ depth_cm         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ diameter_cm      : logi  NA NA NA NA NA NA ...
##  $ height_cm        : num  104.8 33.8 60.3 215.9 50.8 ...
##  $ length_cm        : logi  NA NA NA NA NA NA ...
##  $ width_cm         : num  74.6 33.3 36.8 78.7 54 ...
##  $ seat_height_cm   : logi  NA NA NA NA NA NA ...
##  $ purchase         : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ gift             : logi  TRUE FALSE FALSE TRUE TRUE FALSE ...
##  $ exchange         : logi  FALSE FALSE FALSE FALSE TRUE FALSE ...
##  $ classification   : chr  "Painting" "Painting" "Painting" "Painting" ...
##  $ department       : chr  "Painting & Sculpture" "Painting & Sculpture" "Painting & Sculpture" "Painting & Sculpture" ...

A: There are 2253 paintings (one per row) and 24 variables.

Question 2

Which painting was the first one acquired by MoMA? In which year? By which artist? What is its title?

A: House by the Railroad, by Edward Hopper, acquired in 1930.
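The answer above is given without the code that produced it. One possible way to get it is sketched below on a four-row sample copied from the str() output in Question 1 (the names `sample_moma`, `first_year` and `first_acquired` are hypothetical). The sketch keeps every row whose year_acquired equals the minimum, so ties would all be shown. Applied to the full arte_MOMA data frame, while year_acquired is still an integer column, the same filter should return the row reported in the answer.

```r
# Four-row sample copied from the str() output above; `sample_moma` is a
# hypothetical name used only for illustration.
sample_moma <- data.frame(
  title = c("Rope and People, I", "Fire in the Evening",
            "Portrait of an Equilibrist", "Guitar"),
  artist = c("Joan Miró", "Paul Klee", "Paul Klee", "Pablo Picasso"),
  year_acquired = c(1936L, 1970L, 1966L, 1955L),
  year_created = c(1935L, 1929L, 1927L, 1919L)
)

# Earliest acquisition year, ignoring missing values.
first_year <- min(sample_moma$year_acquired, na.rm = TRUE)

# Keep every row tied for that year; which() drops rows where the year is NA.
first_acquired <- sample_moma[which(sample_moma$year_acquired == first_year),
                              c("title", "artist", "year_acquired")]
print(first_acquired)
```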

Question 3

Which is the oldest painting in the collection? From which year? By which artist? What is its title?

A: Landscape at Daybreak, by Odilon Redon, created in 1872.
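No code accompanies this answer either. A minimal sketch, assuming year_created is still an integer column, is to sort the rows by year_created and look at the first one. It is shown here on a four-row sample copied from the str() output in Question 1; `sample_moma` and `oldest` are hypothetical names, and the result for the sample is not the result for the full data.

```r
# Four-row sample copied from the str() output above (hypothetical name).
sample_moma <- data.frame(
  title = c("Rope and People, I", "Fire in the Evening",
            "Portrait of an Equilibrist", "Guitar"),
  artist = c("Joan Miró", "Paul Klee", "Paul Klee", "Pablo Picasso"),
  year_created = c(1935L, 1929L, 1927L, 1919L)
)

# order() sorts ascending and places NA years last by default, so the
# first row after sorting is the oldest painting with a known year.
oldest <- head(sample_moma[order(sample_moma$year_created), ], 1)
print(oldest)
```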

Question 4

How many distinct artists are there?

arte_MOMA$artist <- as.factor(arte_MOMA$artist)
str(arte_MOMA$artist)
##  Factor w/ 989 levels "A. E. Gallatin",..: 452 728 728 712 94 278 131 752 723 244 ...

A: There are 989 distinct artists.
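Counting factor levels works, but it changes the type of the artist column for the rest of the script. An alternative that leaves the column untouched is to count the unique values directly. The sketch below uses the artist names of the first four paintings from the str() output in Question 1; `sample_artists` and `n_artists` are hypothetical names.

```r
# Artist column for the first four paintings, copied from the str() output
# above; `sample_artists` is a hypothetical name.
sample_artists <- c("Joan Miró", "Paul Klee", "Paul Klee", "Pablo Picasso")

# unique() drops repeated names, so its length is the number of distinct
# artists. An NA entry, if present, would count as one extra value.
n_artists <- length(unique(sample_artists))
print(n_artists)
```

On the full data the equivalent call would be length(unique(arte_MOMA$artist)).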

Question 5

Which artist has the most paintings in the collection?

library(dlookr)
## Loading required package: mice
## 
## Attaching package: 'mice'
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## 
## Attaching package: 'dlookr'
## The following object is masked from 'package:base':
## 
##     transform
diagnose(arte_MOMA)
## # A tibble: 24 x 6
##    variables      types   missing_count missing_percent unique_count unique_rate
##    <chr>          <chr>           <int>           <dbl>        <int>       <dbl>
##  1 X              integer             0          0              2253     1      
##  2 title          charac~             0          0              2015     0.894  
##  3 artist         factor              0          0               989     0.439  
##  4 artist_bio     charac~             1          0.0444          859     0.381  
##  5 artist_birth_~ integer             6          0.266           132     0.0586 
##  6 artist_death_~ integer           629         27.9             102     0.0453 
##  7 num_artists    integer             1          0.0444            5     0.00222
##  8 n_female_arti~ integer             0          0                 3     0.00133
##  9 n_male_artists integer             0          0                 5     0.00222
## 10 artist_gender  charac~            10          0.444             3     0.00133
## # ... with 14 more rows
diagnose_category(arte_MOMA, artist)
## # A tibble: 10 x 6
##    variables levels               N  freq ratio  rank
##  * <chr>     <fct>            <int> <int> <dbl> <int>
##  1 artist    Pablo Picasso     2253    55  2.44     1
##  2 artist    Henri Matisse     2253    32  1.42     2
##  3 artist    On Kawara         2253    32  1.42     3
##  4 artist    Jacob Lawrence    2253    30  1.33     4
##  5 artist    Batiste Madalena  2253    25  1.11     5
##  6 artist    Jean Dubuffet     2253    25  1.11     6
##  7 artist    Odilon Redon      2253    25  1.11     7
##  8 artist    Ben Vautier       2253    24  1.07     8
##  9 artist    Frank Stella      2253    23  1.02     9
## 10 artist    Philip Guston     2253    23  1.02    10

A: Pablo Picasso, with 55 paintings.
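The same ranking can be obtained without dlookr by tabulating the artist column and sorting the counts. This is only a sketch in base R, shown on the artist names of the first four paintings from the str() output in Question 1; the names `sample_artists`, `paintings_per_artist` and `top_artist` are hypothetical.

```r
# Artist column for the first four paintings, copied from the str() output
# above; `sample_artists` is a hypothetical name.
sample_artists <- c("Joan Miró", "Paul Klee", "Paul Klee", "Pablo Picasso")

# table() counts paintings per artist; sort() puts the largest count first.
paintings_per_artist <- sort(table(sample_artists), decreasing = TRUE)

# First entry only: if several artists were tied for first place, the
# remaining ones would need to be checked separately.
top_artist <- paintings_per_artist[1]
print(top_artist)
```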

Question 6

How many paintings by this artist are in the collection?

A: 55 paintings.

Question 7

How many paintings are by male artists, and how many by female artists?

diagnose_category(arte_MOMA, artist_gender)
## # A tibble: 3 x 6
##   variables     levels     N  freq  ratio  rank
## * <chr>         <chr>  <int> <int>  <dbl> <int>
## 1 artist_gender Male    2253  1991 88.4       1
## 2 artist_gender Female  2253   252 11.2       2
## 3 artist_gender <NA>    2253    10  0.444     3

A: 1991 paintings are by male artists, 252 are by female artists, and 10 have no artist gender recorded. These are counts of paintings, not of artists.
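For comparison, a base R tabulation gives the same kind of count per painting, as long as missing values are requested explicitly. The sketch below uses a toy vector with made-up values (not taken from the dataset); `toy_gender` and `paintings_by_gender` are hypothetical names.

```r
# Toy vector with made-up values, one entry per painting. It is NOT taken
# from the MoMA data and only illustrates the call.
toy_gender <- c("Male", "Male", "Female", NA)

# useNA = "ifany" adds a cell for missing values; by default table()
# silently leaves them out.
paintings_by_gender <- table(toy_gender, useNA = "ifany")
print(paintings_by_gender)
```

On the full data the equivalent call would be table(arte_MOMA$artist_gender, useNA = "ifany").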

Question 8

How many artists of each gender are there?

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
xxx<- arte_MOMA
xxx %>%
  count(artist_gender, artist) %>%
  count(artist_gender) %>%
  mutate(n = as.character(paste(n, "art"))) %>%
  table()
##              n
## artist_gender 143 art 837 art 9 art
##        Female       1       0     0
##        Male         0       1     0

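A: There are 837 male artists and 143 female artists. The remaining 9 artists have no gender recorded; table() drops their row, which is why the "9 art" column above contains only zeros.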
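A more direct route to the same counts is to reduce the data to one row per artist and then tabulate gender, keeping missing values visible. The sketch below does this in base R on the first four paintings from the str() output in Question 1, where Paul Klee appears twice; `sample_moma`, `artists` and `artists_by_gender` are hypothetical names.

```r
# Artist and gender for the first four paintings, copied from the str()
# output above; `sample_moma` is a hypothetical name.
sample_moma <- data.frame(
  artist = c("Joan Miró", "Paul Klee", "Paul Klee", "Pablo Picasso"),
  artist_gender = c("Male", "Male", "Male", "Male")
)

# One row per distinct (artist, gender) pair, so each artist counts once
# no matter how many paintings they have.
artists <- unique(sample_moma[, c("artist", "artist_gender")])

# Count artists per gender; useNA = "ifany" keeps artists with no
# recorded gender in their own cell instead of dropping them.
artists_by_gender <- table(artists$artist_gender, useNA = "ifany")
print(artists_by_gender)
```

The sample has four paintings but only three artists, so the table reports 3 rather than 4, which is the distinction between this question and Question 7.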

Question 9

In which year were the most paintings acquired?

arte_MOMA$year_acquired <- as.factor(arte_MOMA$year_acquired)
diagnose_category(arte_MOMA, year_acquired)
## # A tibble: 10 x 6
##    variables     levels     N  freq ratio  rank
##  * <chr>         <fct>  <int> <int> <dbl> <int>
##  1 year_acquired 1985    2253    86  3.82     1
##  2 year_acquired 1942    2253    71  3.15     2
##  3 year_acquired 1979    2253    71  3.15     3
##  4 year_acquired 1991    2253    67  2.97     4
##  5 year_acquired 2005    2253    67  2.97     5
##  6 year_acquired 1967    2253    65  2.89     6
##  7 year_acquired 2008    2253    55  2.44     7
##  8 year_acquired 1961    2253    45  2.00     8
##  9 year_acquired 1969    2253    45  2.00     9
## 10 year_acquired 1956    2253    42  1.86    10

A: 1985, with 86 paintings acquired.

Question 10

In which year were the most paintings created?

arte_MOMA$year_created <- as.factor(arte_MOMA$year_created)
diagnose_category(arte_MOMA, year_created)
## # A tibble: 11 x 6
##    variables    levels     N  freq ratio  rank
##  * <chr>        <fct>  <int> <int> <dbl> <int>
##  1 year_created 1977    2253    57  2.53     1
##  2 year_created 1940    2253    56  2.49     2
##  3 year_created 1964    2253    56  2.49     3
##  4 year_created 1961    2253    50  2.22     4
##  5 year_created 1962    2253    49  2.17     5
##  6 year_created 1963    2253    44  1.95     6
##  7 year_created 1959    2253    42  1.86     7
##  8 year_created 1968    2253    40  1.78     8
##  9 year_created 1960    2253    39  1.73     9
## 10 year_created 1914    2253    37  1.64    10
## 11 year_created 1950    2253    37  1.64    11

A: 1977, with 57 paintings created.
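The most frequent year can also be found without converting the column to a factor, which keeps it numeric for later use. The sketch below tabulates the first ten year_created values shown in the str() output in Question 1 and returns every year tied for the highest count; `first_ten_years`, `counts` and `modal_years` are hypothetical names. The same approach should apply to year_acquired in Question 9.

```r
# First ten year_created values, copied from the str() output above;
# `first_ten_years` is a hypothetical name.
first_ten_years <- c(1935L, 1929L, 1927L, 1919L, 1925L,
                     1919L, 1970L, 1929L, 1885L, 1930L)

# Count paintings per year; the table is ordered by year.
counts <- table(first_ten_years)

# Keep every year that reaches the maximum count, so ties are not lost.
modal_years <- as.integer(names(counts)[counts == max(counts)])
print(modal_years)
```

In this ten-row sample two years are tied, which a single "top row" lookup would hide.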