Loading packages and dataset

pacotes <- c("readxl", "dplyr")
lapply(pacotes, library, character.only = TRUE)
cars_train <- read_excel("cars_train.xlsx")

Analyses

Using table to look at the frequencies of the variable “garantia_de_fábrica”( factory warranty)

table(cars_train$garantia_de_fábrica)
## 
## Garantia de fábrica 
##                4365

. ## Creating a new dataset only with cars that still have factory warranty

garantia <- filter(cars_train, garantia_de_fábrica %in% c("Garantia de fábrica"))

Looking at the volume of observations for each “estado_vendedor”(selling state)

vendas_por_estado <- garantia %>%
  group_by(estado_vendedor) %>%
  summarize(volume_vendas = n()) %>%
  arrange(desc(volume_vendas))
print(vendas_por_estado, n = Inf)
## # A tibble: 21 × 2
##    estado_vendedor          volume_vendas
##    <chr>                            <int>
##  1 São Paulo (SP)                    2307
##  2 Rio de Janeiro (RJ)                412
##  3 Paraná (PR)                        389
##  4 Santa Catarina (SC)                330
##  5 Rio Grande do Sul (RS)             281
##  6 Minas Gerais (MG)                  264
##  7 Goiás (GO)                         186
##  8 Alagoas (AL)                        58
##  9 Bahia (BA)                          51
## 10 Pernambuco (PE)                     20
## 11 Pará (PA)                           17
## 12 Mato Grosso do Sul (MS)             13
## 13 Amazonas (AM)                        9
## 14 Espírito Santo (ES)                  9
## 15 Rio Grande do Norte (RN)             9
## 16 Sergipe (SE)                         3
## 17 Acre (AC)                            2
## 18 Mato Grosso (MT)                     2
## 19 Ceará (CE)                           1
## 20 Paraíba (PB)                         1
## 21 Tocantins (TO)                       1

Getting the lowest price for each “estado_vendedor”

menor_preco_por_estado <- garantia %>%
  group_by(estado_vendedor) %>%
  summarize(menor_preco = min(preco)) %>%
arrange(menor_preco)
print(menor_preco_por_estado, n = Inf)
## # A tibble: 21 × 2
##    estado_vendedor          menor_preco
##    <chr>                          <dbl>
##  1 Paraná (PR)                   29328.
##  2 Minas Gerais (MG)             29907.
##  3 São Paulo (SP)                31763.
##  4 Bahia (BA)                    33101.
##  5 Rio de Janeiro (RJ)           39556.
##  6 Pará (PA)                     43907.
##  7 Santa Catarina (SC)           44005.
##  8 Goiás (GO)                    45812.
##  9 Espírito Santo (ES)           49920.
## 10 Rio Grande do Sul (RS)        54743.
## 11 Rio Grande do Norte (RN)      54849.
## 12 Amazonas (AM)                 54908.
## 13 Alagoas (AL)                  55178.
## 14 Pernambuco (PE)               55244.
## 15 Acre (AC)                     72812.
## 16 Mato Grosso do Sul (MS)       85731.
## 17 Paraíba (PB)                  95763.
## 18 Ceará (CE)                   123940.
## 19 Mato Grosso (MT)             178934.
## 20 Tocantins (TO)               243002.
## 21 Sergipe (SE)                 308357.

Calculating the standard deviation for each “estado_vendedor”

desvio_padrao_por_estado <- garantia %>%
  group_by(estado_vendedor) %>%
  summarize(desvio_padrao = sd(preco)) %>%
arrange(desc(desvio_padrao))
print(desvio_padrao_por_estado, n = Inf)
## # A tibble: 21 × 2
##    estado_vendedor          desvio_padrao
##    <chr>                            <dbl>
##  1 Acre (AC)                      109750.
##  2 Santa Catarina (SC)             93082.
##  3 Goiás (GO)                      90810.
##  4 Rio Grande do Sul (RS)          89532.
##  5 Bahia (BA)                      86729.
##  6 Rio de Janeiro (RJ)             86162.
##  7 Sergipe (SE)                    85150.
##  8 São Paulo (SP)                  83010.
##  9 Paraná (PR)                     82211.
## 10 Rio Grande do Norte (RN)        77499.
## 11 Minas Gerais (MG)               76857.
## 12 Alagoas (AL)                    76039.
## 13 Pernambuco (PE)                 65277.
## 14 Pará (PA)                       40698.
## 15 Mato Grosso do Sul (MS)         37919.
## 16 Espírito Santo (ES)             33633.
## 17 Amazonas (AM)                   28311.
## 18 Mato Grosso (MT)                26478.
## 19 Ceará (CE)                         NA 
## 20 Paraíba (PB)                       NA 
## 21 Tocantins (TO)                     NA

Calculating the average selling price for each “estado_vendedor”

preco_medio_por_estado <- garantia %>%
  group_by(estado_vendedor) %>%
  summarize(preco_medio = mean(preco)) %>%
  arrange(preco_medio)
print(preco_medio_por_estado, n = Inf)
## # A tibble: 21 × 2
##    estado_vendedor          preco_medio
##    <chr>                          <dbl>
##  1 Paraíba (PB)                  95763.
##  2 Pará (PA)                     98157.
##  3 Amazonas (AM)                 99617.
##  4 Espírito Santo (ES)          104030.
##  5 Mato Grosso do Sul (MS)      121710.
##  6 Ceará (CE)                   123940.
##  7 Rio Grande do Norte (RN)     133120.
##  8 Pernambuco (PE)              149898.
##  9 Acre (AC)                    150417.
## 10 Alagoas (AL)                 154269.
## 11 Minas Gerais (MG)            161206.
## 12 Bahia (BA)                   165221.
## 13 São Paulo (SP)               166751.
## 14 Santa Catarina (SC)          173798.
## 15 Goiás (GO)                   174521.
## 16 Paraná (PR)                  175578.
## 17 Rio Grande do Sul (RS)       176442.
## 18 Rio de Janeiro (RJ)          179110.
## 19 Mato Grosso (MT)             197657.
## 20 Tocantins (TO)               243002.
## 21 Sergipe (SE)                 367138.

Once again, the state of São Paulo demonstrates, in general, the most interesting metrics for the goal at hand. Besides having the largest volume of observations, São Paulo also presents one of the cars with a warranty at one of the lowest prices.

In the ranking of the average prices of cars with a warranty by state, São Paulo occupies the thirteenth position. However, if we consider only the five states with the highest number of cars with a warranty, São Paulo would be in first place. As for the standard deviation, it indicates that São Paulo has a moderate variability in prices compared to other states.

These metrics indicate that, based on the analyzed data, São Paulo is the most suitable state for purchasing factory warranty cars. It’s where you would find the largest number of available options, as well as having a reasonable average price. Additionally, the high number of observations and a reasonably wide standard deviation suggest that the buyer would have good chances of finding opportunities at lower prices, provided they were willing to search for them.

It’s important to note that this analysis was conducted only with observations that had the category “Garantia de Fábrica” in the qualitative variable “garantia_de_fábrica.”