Report 1

# Semicolon-separated file with "." as the decimal mark; text columns are read as factors
coffeData <- read.table("coffeData.csv", header = TRUE, sep = ";", dec = ".", stringsAsFactors = TRUE)

summary(coffeData)

##        X           Species                  Country.of.Origin Fragrance...Aroma
##  Min.   :   1   Arabica:1309   Mexico                :237     Min.   :5.080    
##  1st Qu.: 335   Robusta:  28   Colombia              :183     1st Qu.:7.420    
##  Median : 669                  Guatemala             :180     Median :7.580    
##  Mean   : 669                  Brazil                :132     Mean   :7.572    
##  3rd Qu.:1003                  Taiwan                : 75     3rd Qu.:7.750    
##  Max.   :1337                  United States (Hawaii): 73     Max.   :8.750    
##                                (Other)               :457                      
##      Flavor        Aftertaste     Salt...Acid      Mouthfeel    
##  Min.   :6.080   Min.   :6.170   Min.   :5.250   Min.   :5.080  
##  1st Qu.:7.330   1st Qu.:7.250   1st Qu.:7.330   1st Qu.:7.330  
##  Median :7.580   Median :7.420   Median :7.580   Median :7.500  
##  Mean   :7.527   Mean   :7.407   Mean   :7.541   Mean   :7.524  
##  3rd Qu.:7.750   3rd Qu.:7.580   3rd Qu.:7.750   3rd Qu.:7.750  
##  Max.   :8.830   Max.   :8.670   Max.   :8.750   Max.   :8.750  
##                                                                 
##     Balance       Bitter...Sweet   Uniform.Cup       Clean.Cup     
##  Min.   : 5.250   Min.   :5.250   Min.   : 6.000   Min.   : 0.000  
##  1st Qu.:10.000   1st Qu.:7.330   1st Qu.:10.000   1st Qu.:10.000  
##  Median :10.000   Median :7.500   Median :10.000   Median :10.000  
##  Mean   : 9.868   Mean   :7.527   Mean   : 9.844   Mean   : 9.849  
##  3rd Qu.:10.000   3rd Qu.:7.670   3rd Qu.:10.000   3rd Qu.:10.000  
##  Max.   :10.000   Max.   :8.580   Max.   :10.000   Max.   :10.000  
##                                                                    
##  Cupper.Points   quality_score  
##  Min.   : 5.17   Min.   :63.08  
##  1st Qu.: 7.25   1st Qu.:81.17  
##  Median : 7.50   Median :82.50  
##  Mean   : 7.51   Mean   :82.17  
##  3rd Qu.: 7.75   3rd Qu.:83.67  
##  Max.   :10.00   Max.   :90.58  
##

Histograms

library(ggplot2)

g1 <- ggplot(coffeData, aes(x=Flavor))+
  geom_histogram(fill="#F08787")+
  labs(title="Coffee flavor histogram")

g1

g2 <- ggplot(coffeData, aes(x=quality_score))+
  geom_histogram(fill="#6D94C5")+
  labs(title="Coffee quality score histogram")

g2

Boxplots

library(gridExtra)

g3 <- ggplot(coffeData, aes(x=Flavor))+
  geom_boxplot(fill="#F08787")+
  labs(title="Coffee flavor boxplot")

g4 <- ggplot(coffeData, aes(x=quality_score))+
  geom_boxplot(fill="#6D94C5")+
  labs(title="Coffee quality score boxplot")

grid.arrange(g3, g4)

Plot panel

grid.arrange(g1, g2)

## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
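
The `stat_bin()` message above appears because no bin setting was passed to `geom_histogram()`. A minimal sketch of one way to pick a value follows, assuming the Freedman-Diaconis rule is acceptable for this data. The helper `fd_binwidth` and the synthetic `flavor_demo` data frame are illustrative names and values, not part of the original report or of ggplot2.

```r
# Freedman-Diaconis rule: binwidth = 2 * IQR(x) / n^(1/3).
# fd_binwidth is a hypothetical helper, not a ggplot2 function.
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]
  2 * IQR(x) / length(x)^(1 / 3)
}

# Synthetic stand-in for coffeData$Flavor (NOT the real data).
set.seed(42)
flavor_demo <- data.frame(Flavor = rnorm(500, mean = 7.5, sd = 0.35))

bw <- fd_binwidth(flavor_demo$Flavor)

# With binwidth given explicitly, stat_bin() does not fall back to bins = 30.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g_demo <- ggplot(flavor_demo, aes(x = Flavor)) +
    geom_histogram(binwidth = bw, fill = "#F08787") +
    labs(title = "Coffee flavor histogram")
  # Evaluate g_demo (or print(g_demo)) to draw the plot.
}
```

With the real data, the same call would be `fd_binwidth(coffeData$Flavor)`. Other rules, or a hand-picked width, may suit the scores better.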

Analysis with two numeric variables

ggplot(coffeData, aes(x=Flavor, y=quality_score))+
  geom_jitter(colour="#748873")+
  geom_smooth(method="lm", colour="#F08787")

## `geom_smooth()` using formula = 'y ~ x'

cor(x=coffeData$Flavor, y=coffeData$quality_score, method='pearson')

## [1] 0.8348271

Interpretation

The correlation coefficient is \(r = 0.83\), which indicates a strong positive linear relationship between the two variables. Higher flavor scores therefore tend to go with higher coffee quality scores.
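
The point estimate alone says nothing about uncertainty. As a sketch, `cor.test()` from base R reports a confidence interval and p-value for Pearson's \(r\), and in a simple linear regression the coefficient of determination equals \(r^2\) (with \(r = 0.83\), roughly 0.70). The data frame `demo` below is simulated for illustration only and does not reproduce the report's values.

```r
# Simulated stand-in for coffeData (NOT the real data): a noisy linear relation.
set.seed(1)
flavor <- rnorm(200, mean = 7.5, sd = 0.35)
demo <- data.frame(
  Flavor = flavor,
  quality_score = 60 + 3 * flavor + rnorm(200, mean = 0, sd = 0.7)
)

# Pearson correlation with a 95% confidence interval and a p-value.
ct <- cor.test(demo$Flavor, demo$quality_score, method = "pearson")
ct$estimate   # sample correlation r
ct$conf.int   # 95% confidence interval for r
ct$p.value    # test of H0: true correlation is 0

# In a one-predictor linear model, R-squared equals r squared.
fit <- lm(quality_score ~ Flavor, data = demo)
summary(fit)$r.squared
```

On the real data the call would be `cor.test(coffeData$Flavor, coffeData$quality_score)`.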

Analysis with one numeric variable and one categorical variable

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:gridExtra':
## 
##     combine

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

SurAmericaCountries <- filter(coffeData,
  Country.of.Origin %in% c("Colombia","Brazil","Ecuador","Peru"))

unique(coffeData$Country.of.Origin)

##  [1] Uganda                       India                       
##  [3] United States                Ecuador                     
##  [5] Vietnam                      Ethiopia                    
##  [7] Guatemala                    Brazil                      
##  [9] Peru                         United States (Hawaii)      
## [11] Indonesia                    China                       
## [13] Costa Rica                   Mexico                      
## [15] Honduras                     Taiwan                      
## [17] Nicaragua                    Tanzania, United Republic Of
## [19] Kenya                        Thailand                    
## [21] Colombia                     Panama                      
## [23] Papua New Guinea             El Salvador                 
## [25] Japan                        United States (Puerto Rico) 
## [27] Haiti                        Burundi                     
## [29] Philippines                  Rwanda                      
## [31] Malawi                       Laos                        
## [33] Zambia                       Myanmar                     
## [35] Mauritius                    Cote d'Ivoire               
## 36 Levels: Brazil Burundi China Colombia Costa Rica Cote d'Ivoire ... Zambia

ggplot(SurAmericaCountries, aes(x=Country.of.Origin, y=quality_score, fill=Country.of.Origin))+
  geom_boxplot()+
  labs(title="Boxplots of quality score by country", y="Quality score")

Interpretation

Colombia shows higher coffee quality than the other South American countries compared.
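
One way to check the visual impression from the boxplots is to compare per-country counts and medians and then run a rank-based test. The sketch below uses a small made-up data frame (`demo`, with invented scores) in place of `coffeData` and only base R functions. The choice of the Kruskal-Wallis test is an assumption here, not a method taken from the report.

```r
# Made-up scores for illustration only (NOT the real data).
demo <- data.frame(
  Country.of.Origin = factor(c("Colombia", "Colombia", "Colombia",
                               "Brazil", "Brazil", "Brazil",
                               "Peru", "Peru", "Mexico")),
  quality_score = c(83.5, 84.0, 82.9, 82.0, 82.6, 81.5, 81.0, 82.2, 80.9)
)

# Keep the South American countries and drop the now-unused factor levels.
south <- droplevels(subset(
  demo,
  Country.of.Origin %in% c("Colombia", "Brazil", "Ecuador", "Peru")
))

# Group sizes and medians: the quantities a boxplot comparison rests on.
n_by_country <- table(south$Country.of.Origin)
median_by_country <- tapply(south$quality_score, south$Country.of.Origin, median)

# Rank-based test of whether the score distributions differ between countries.
kw <- kruskal.test(quality_score ~ Country.of.Origin, data = south)
kw$p.value
```

Group sizes matter when reading the result: a country with only a handful of samples gives a much less stable median than one with hundreds.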

Dendrobates truncatus:

Conclusions

Colombia shows higher coffee quality than the other South American countries compared (Brazil, Ecuador, Peru).
Flavor and coffee quality have a strong positive linear relationship (\(r = 0.83\)).
Both flavor and coffee quality contain outliers.
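
The outliers noted above can be listed explicitly with the 1.5 × IQR rule, which is the default whisker convention in `geom_boxplot()`. From the `summary()` output, for example, the quality score has quartiles 81.17 and 83.67, so the fences fall at 77.42 and 87.42, and both the minimum (63.08) and the maximum (90.58) lie outside them. A minimal sketch follows; `iqr_outliers` is a hypothetical helper and the vector `x` is a toy example, not the real data.

```r
# Return the values lying beyond 1.5 * IQR from the quartiles.
# iqr_outliers is a hypothetical helper written for this example.
iqr_outliers <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- coef * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]
}

# Toy scores (NOT the real data): one value sits far below the rest.
x <- c(7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 5.1)
iqr_outliers(x)
```

On the real data the calls would be `iqr_outliers(coffeData$Flavor)` and `iqr_outliers(coffeData$quality_score)`.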
