Carry out all the data work from Activity 3 up to and including question 4. Also load the ggplot2 package, which is used throughout this activity. Use the A4 dataset.
library(data.table)
# Read the file as UTF-8 so non-ASCII characters in app names print correctly
app<-fread("A4.csv",fill = TRUE,encoding = "UTF-8")
class(app)
## [1] "data.table" "data.frame"
head(app)
## app
## 1: Photo Editor & Candy Camera & Grid & ScrapBook
## 2: Coloring book moana
## 3: U Launcher Lite – FREE Live Cool Themes, Hide Apps
## 4: Sketch - Draw & Paint
## 5: Pixel Draw - Number Art Coloring Book
## 6: Paper flowers instructions
## category rating reviews size installs type price contentrating
## 1: ART_AND_DESIGN 4.1 159 19M 10,000+ Free 0 Everyone
## 2: ART_AND_DESIGN 3.9 967 14M 500,000+ Free 0 Everyone
## 3: ART_AND_DESIGN 4.7 87510 8.7M 5,000,000+ Free 0 Everyone
## 4: ART_AND_DESIGN 4.5 215644 25M 50,000,000+ Free 0 Teen
## 5: ART_AND_DESIGN 4.3 967 2.8M 100,000+ Free 0 Everyone
## 6: ART_AND_DESIGN 4.4 167 5.6M 50,000+ Free 0 Everyone
## genres lastupdated currentver androidver
## 1: Art & Design January 7, 2018 1.0.0 4.0.3 and up
## 2: Art & Design;Pretend Play January 15, 2018 2.0.0 4.0.3 and up
## 3: Art & Design August 1, 2018 1.2.4 4.0.3 and up
## 4: Art & Design June 8, 2018 Varies with device 4.2 and up
## 5: Art & Design;Creativity June 20, 2018 1.1 4.4 and up
## 6: Art & Design March 26, 2017 1.0 2.3 and up
names(app)
## [1] "app" "category" "rating" "reviews"
## [5] "size" "installs" "type" "price"
## [9] "contentrating" "genres" "lastupdated" "currentver"
## [13] "androidver"
app<-app[!duplicated(app)]
names(app)
## [1] "app" "category" "rating" "reviews"
## [5] "size" "installs" "type" "price"
## [9] "contentrating" "genres" "lastupdated" "currentver"
## [13] "androidver"
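# Keep only the variables of interest
varint<-app[,.(app,category,rating,reviews,installs,type,price,contentrating)]
# Drop the row whose category field holds the stray value "1.9"
varint<-varint[category!="1.9"]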
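The cleaning steps above (dropping exact duplicates, then filtering out the row with a stray category value) can be sketched on a small table. This is only an illustration: the `toy` table and its values are hypothetical and do not come from A4.csv.

```r
library(data.table)

# Hypothetical stand-in for the apps table (values invented for illustration)
toy <- data.table(
  app      = c("A", "A", "B", "C"),
  category = c("SOCIAL", "SOCIAL", "1.9", "PHOTOGRAPHY"),
  rating   = c(4.1, 4.1, 19, 4.5)
)

# Drop rows that are exact duplicates across all columns
dedup <- toy[!duplicated(toy)]

# unique() on a data.table should give the same rows
same_as_unique <- isTRUE(all.equal(dedup, unique(toy)))

# Remove the row whose category holds a number instead of a name
clean <- dedup[category != "1.9"]
print(clean)
```

Checking `table(varint$category)` on the real data before and after the filter is a quick way to confirm that only the stray row was removed.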
library(ggplot2)
ggplot()
Create a bar chart with ggplot showing the count of each category in the dataset you generated in question 1.
GRAFICO = ggplot(data=varint, aes(x=category)) +
geom_bar(stat="count")
GRAFICO
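As a sanity check on the bar chart, the heights that `geom_bar()` computes can be compared with a direct tabulation in data.table. The sketch below uses a hypothetical `toy` table rather than `varint`, and assumes the default alphabetical ordering of a character x-axis.

```r
library(data.table)
library(ggplot2)

# Hypothetical categories (not taken from A4.csv)
toy <- data.table(
  category = c("SOCIAL", "SOCIAL", "PHOTOGRAPHY", "GAME", "GAME", "GAME")
)

# Rows per category; .N is data.table's per-group row counter
counts <- toy[, .N, by = category][order(category)]

# geom_bar() uses stat = "count" by default, so it tabulates the same values
p <- ggplot(toy, aes(x = category)) + geom_bar()

# Bar heights as computed by ggplot2, in x-axis order
bar_heights <- layer_data(p)$count

print(counts)
print(bar_heights)
```

On the real data, `varint[, .N, by = category]` gives the numbers that the bars in `GRAFICO` should display.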
Add axis labels, a title, a subtitle, and a source caption to the previous chart. Also make the x-axis categories legible.
# Every label must be chained with + so that it attaches to the plot
GRAFICO = GRAFICO +
theme (text = element_text(size=10)) +
ggtitle ("CATEGORY DISTRIBUTION") +
theme (plot.title = element_text(family="ARIAL",
color="violet") ) +
labs(x = "CATEGORIES",y = "COUNT",
subtitle = "By category",
caption = "Source: Google Play Store apps dataset") +
# Rotate the x-axis labels so the category names do not overlap
theme (axis.text.x = element_text(angle=90, hjust=1, vjust=0.5))
GRAFICO
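With this many categories, long names on the x-axis tend to overlap. Besides rotating the tick labels, one common option is to flip the axes with `coord_flip()` so the names read horizontally. The following is a minimal sketch on a hypothetical `toy` data frame, not the chart required by the exercise.

```r
library(ggplot2)

# Hypothetical categories (not taken from A4.csv)
toy <- data.frame(
  category = c("ART_AND_DESIGN", "SOCIAL", "SOCIAL", "PHOTOGRAPHY")
)

p <- ggplot(toy, aes(x = category)) +
  geom_bar() +
  labs(
    title    = "Category distribution",
    subtitle = "By category",
    caption  = "Source: hypothetical toy data",
    x        = "Category",
    y        = "Count"
  ) +
  # Swap the axes so category names are drawn horizontally
  coord_flip()

p
```

Flipping keeps the default font size, whereas rotation keeps the conventional bar orientation; which reads better depends on how many categories there are.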
Create a scatter plot with ggplot showing the relationship between the number of reviews (x-axis) and the rating (y-axis).
GRAFICO2 = ggplot(data=varint, aes(x= reviews, y = rating)) + geom_point()
GRAFICO2
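Review counts in this dataset range from single digits to tens of millions, so on a linear axis most points pile up near zero. A logarithmic x-axis is one way to spread them out. The sketch below is an optional variant, not part of the exercise; it also assumes, hypothetically, that `reviews` might have been read as text and converts it first. The four rows reuse values visible in the printed output.

```r
library(ggplot2)

toy <- data.frame(
  reviews = c("8", "26252", "1519671", "78158306"),  # stored as text on purpose
  rating  = c(4.6, 4.4, 4.5, 4.1)
)

# Convert to numeric in case the column was read as character
toy$reviews <- as.numeric(toy$reviews)

p2 <- ggplot(toy, aes(x = reviews, y = rating)) +
  geom_point(alpha = 0.5) +   # partial transparency helps with overplotting
  scale_x_log10()             # log scale for the heavily skewed review counts

p2
```

Note that `scale_x_log10()` cannot place apps with zero reviews; ggplot2 drops them with a warning.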
Build a double histogram with ggplot showing the distribution of price for the SOCIAL and PHOTOGRAPHY categories. That is, one histogram per category, both in the same figure, as in the example diagram.
Social<-varint[category=="SOCIAL"]
Photography<-varint[category=="PHOTOGRAPHY"]
syp<-rbind(Social,Photography)
syp
## app category rating reviews
## 1: Facebook SOCIAL 4.1 78158306
## 2: Instagram SOCIAL 4.5 66577313
## 3: Facebook Lite SOCIAL 4.3 8606259
## 4: Messages, Text and Video Chat for Messenger SOCIAL 4.4 49173
## 5: Tumblr SOCIAL 4.4 2955326
## ---
## 462: PIP-Camera FN Photo Effect PHOTOGRAPHY 4.6 8
## 463: Photo Editor Collage Maker Pro PHOTOGRAPHY 4.5 1519671
## 464: Free Slideshow Maker & Video Editor PHOTOGRAPHY 4.2 162564
## 465: Thumbnail Maker PHOTOGRAPHY 4.4 26252
## 466: PhotoFunia PHOTOGRAPHY 4.3 316378
## installs type price contentrating
## 1: 1,000,000,000+ Free 0 Teen
## 2: 1,000,000,000+ Free 0 Teen
## 3: 500,000,000+ Free 0 Teen
## 4: 10,000,000+ Free 0 Everyone
## 5: 100,000,000+ Free 0 Mature 17+
## ---
## 462: 1,000+ Free 0 Everyone
## 463: 100,000,000+ Free 0 Everyone
## 464: 10,000,000+ Free 0 Everyone
## 465: 1,000,000+ Free 0 Everyone
## 466: 10,000,000+ Free 0 Everyone
ggplot(data = syp, aes(x = price)) + geom_histogram(bins = 5) + facet_wrap(~ category)
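If the example diagram overlays both histograms on the same axes rather than showing side-by-side panels, mapping `fill` to the category is a possible alternative to faceting. This is a sketch under that assumption; the `toy` prices are hypothetical and not from A4.csv.

```r
library(ggplot2)

# Hypothetical prices for the two categories
toy <- data.frame(
  category = c(rep("SOCIAL", 4), rep("PHOTOGRAPHY", 4)),
  price    = c(0, 0, 0, 1.99, 0, 0, 2.99, 4.99)
)

p3 <- ggplot(toy, aes(x = price, fill = category)) +
  # position = "identity" overlays the two histograms instead of stacking them;
  # alpha makes the overlapping bars visible
  geom_histogram(bins = 5, position = "identity", alpha = 0.5)

p3
```

Both versions need `price` to be numeric; if it was read as text, it has to be converted before plotting.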