knitr::opts_chunk$set(echo = TRUE)

###P1 Complete all the data work from Activity 2 up to and including question 3. Also load the ggplot2 package, which will be used in this activity. Use the A3.csv dataset.

1.1

library(data.table)
app <- fread("A3.csv", fill = TRUE)

1.2

class(app)
## [1] "data.table" "data.frame"
head(app)
##                                                        app       category
## 1:          Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN
## 2:                                     Coloring book moana ART_AND_DESIGN
## 3:       U Launcher Lite – FREE Live Cool Themes, Hide Apps ART_AND_DESIGN
## 4:                                   Sketch - Draw & Paint ART_AND_DESIGN
## 5:                   Pixel Draw - Number Art Coloring Book ART_AND_DESIGN
## 6:                              Paper flowers instructions ART_AND_DESIGN
##    rating reviews size    installs type price contentrating
## 1:    4.1     159  19M     10,000+ Free     0      Everyone
## 2:    3.9     967  14M    500,000+ Free     0      Everyone
## 3:    4.7   87510 8.7M  5,000,000+ Free     0      Everyone
## 4:    4.5  215644  25M 50,000,000+ Free     0          Teen
## 5:    4.3     967 2.8M    100,000+ Free     0      Everyone
## 6:    4.4     167 5.6M     50,000+ Free     0      Everyone
##                       genres      lastupdated         currentver
## 1:              Art & Design  January 7, 2018              1.0.0
## 2: Art & Design;Pretend Play January 15, 2018              2.0.0
## 3:              Art & Design   August 1, 2018              1.2.4
## 4:              Art & Design     June 8, 2018 Varies with device
## 5:   Art & Design;Creativity    June 20, 2018                1.1
## 6:              Art & Design   March 26, 2017                1.0
##      androidver
## 1: 4.0.3 and up
## 2: 4.0.3 and up
## 3: 4.0.3 and up
## 4:   4.2 and up
## 5:   4.4 and up
## 6:   2.3 and up
names(app)
##  [1] "app"           "category"      "rating"        "reviews"      
##  [5] "size"          "installs"      "type"          "price"        
##  [9] "contentrating" "genres"        "lastupdated"   "currentver"   
## [13] "androidver"
app<-app[!duplicated(app)]

1.3

names(app)
##  [1] "app"           "category"      "rating"        "reviews"      
##  [5] "size"          "installs"      "type"          "price"        
##  [9] "contentrating" "genres"        "lastupdated"   "currentver"   
## [13] "androidver"
app1 <- app[, .(app, category, rating, reviews, installs, type, price, contentrating)]
library(ggplot2)

###P2 Create a bar chart with ggplot showing the count of each category in the dataset you built in question 1. Hint: it is normal for the chart to look crowded given the number of categories.

# Count apps per category: one row per category, count stored in column N
categorias <- app1[, .N, by = category]
app1[,.N,by=category]
##                category    N
##  1:      ART_AND_DESIGN   61
##  2:   AUTO_AND_VEHICLES   73
##  3:              BEAUTY   42
##  4: BOOKS_AND_REFERENCE  169
##  5:            BUSINESS  263
##  6:              COMICS   54
##  7:       COMMUNICATION  256
##  8:              DATING  134
##  9:           EDUCATION  118
## 10:       ENTERTAINMENT  102
## 11:              EVENTS   45
## 12:             FINANCE  302
## 13:      FOOD_AND_DRINK   94
## 14:  HEALTH_AND_FITNESS  244
## 15:      HOUSE_AND_HOME   62
## 16:  LIBRARIES_AND_DEMO   64
## 17:           LIFESTYLE  301
## 18:                GAME  912
## 19:              FAMILY 1608
## 20:             MEDICAL  290
## 21:              SOCIAL  203
## 22:            SHOPPING  180
## 23:         PHOTOGRAPHY  263
## 24:              SPORTS  260
## 25:    TRAVEL_AND_LOCAL  187
## 26:               TOOLS  718
## 27:     PERSONALIZATION  298
## 28:        PRODUCTIVITY  301
## 29:           PARENTING   50
## 30:             WEATHER   72
## 31:       VIDEO_PLAYERS  148
## 32:  NEWS_AND_MAGAZINES  204
## 33: MAPS_AND_NAVIGATION  118
##                category    N
# The aesthetic is `weight` (singular); with it, geom_bar() sums N instead of counting rows
ggplot(data = categorias, aes(x = category, weight = N)) + geom_bar()
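The crowding mentioned in the hint can be reduced by sorting the bars by count and flipping the axes so the category names sit on the vertical axis. A minimal sketch, assuming a small table `conteo_demo` (a hypothetical name) holding four rows copied from the count output above rather than the full result:

```r
library(data.table)
library(ggplot2)

# Four rows copied from the per-category counts printed above (illustration only)
conteo_demo <- data.table(
  category = c("FAMILY", "GAME", "TOOLS", "BEAUTY"),
  N = c(1608L, 912L, 718L, 42L)
)

# reorder() sorts categories by N; coord_flip() moves the labels to the vertical axis
p_barras <- ggplot(conteo_demo, aes(x = reorder(category, N), y = N)) +
  geom_col() +
  coord_flip() +
  labs(x = "category", y = "count")

print(p_barras)
```

The same two calls should apply unchanged to the full `categorias` table.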

###P3 Create a scatter plot with ggplot showing the relationship between the number of reviews (x axis) and the rating (y axis). Hint: review the hint for question 6 of Activity 2.

class(app1[,rating])
## [1] "numeric"
ggplot(data=app1,aes( x=reviews ,y = rating)) + geom_point()

P3<-ggplot(data=app1,aes( x=reviews ,y = rating))+ geom_point()
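Review counts span several orders of magnitude (159 to 215644 in the first six rows alone), so most points may pile up near the left edge. One possible refinement, not required by the question, is a logarithmic x axis. A minimal sketch, assuming a table `demo` (a hypothetical name) built from the six rows shown by head(app) above:

```r
library(data.table)
library(ggplot2)

# reviews and rating for the first six rows, as printed by head(app)
demo <- data.table(
  reviews = c(159, 967, 87510, 215644, 967, 167),
  rating  = c(4.1, 3.9, 4.7, 4.5, 4.3, 4.4)
)

# scale_x_log10() spreads out the small review counts; alpha softens overlapping points
p_log <- ggplot(demo, aes(x = reviews, y = rating)) +
  geom_point(alpha = 0.5) +
  scale_x_log10()

print(p_log)
```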

###P4 Build a double histogram with ggplot showing the distribution of price for apps in the SOCIAL and PHOTOGRAPHY categories. That is, one histogram per category, both in the same figure, as in the example diagram. Hint: review question 5 of Activity 2 and recall the facet_wrap() function covered in the last in-class workshop.

app_final<-app1[category%in% c("SOCIAL","PHOTOGRAPHY")]
# Price goes on x, since the question asks for its distribution (assumes price is numeric)
ggplot(data = app_final, aes(x = price, fill = category)) + geom_histogram() + facet_wrap(~category)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
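The `stat_bin()` message is a reminder that the default of 30 bins is arbitrary. A minimal sketch of setting the bin width explicitly, assuming a small table `precios_demo` whose name and prices are hypothetical and not taken from A3.csv:

```r
library(data.table)
library(ggplot2)

# Hypothetical prices, for illustration only; not taken from A3.csv
precios_demo <- data.table(
  category = rep(c("SOCIAL", "PHOTOGRAPHY"), each = 5),
  price = c(0, 0, 0, 0.99, 2.99, 0, 0, 1.99, 2.99, 4.99)
)

# binwidth = 1 makes each bar cover one unit of price; facet_wrap() gives one panel per category
p_hist <- ggplot(precios_demo, aes(x = price, fill = category)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~category)

print(p_hist)
```

A suitable bin width for the real data depends on the price range, so the value 1 here is only a starting point.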