Introduction

The dataset used here describes apps in the Google Play Store and comes from Kaggle. It contains each app's name, category, rating, review count, and other fields. It also records whether an app is paid or free and the minimum Android version it supports.

DD

Structure

'data.frame':   300 obs. of  13 variables:
 $ App           : chr  "Photo Editor & Candy Camera & Grid & ScrapBook" "Coloring book moana" "U Launcher Lite – FREE Live Cool Themes, Hide Apps" "Sketch - Draw & Paint" ...
 $ Category      : chr  "ART_AND_DESIGN" "ART_AND_DESIGN" "ART_AND_DESIGN" "ART_AND_DESIGN" ...
 $ Rating        : num  4.1 3.9 4.7 4.5 4.3 4.4 3.8 4.1 4.4 4.7 ...
 $ Reviews       : chr  "159" "967" "87510" "215644" ...
 $ Size          : chr  "19M" "14M" "8.7M" "25M" ...
 $ Installs      : chr  "10,000+" "500,000+" "5,000,000+" "50,000,000+" ...
 $ Type          : chr  "Free" "Free" "Free" "Free" ...
 $ Price         : chr  "0" "0" "0" "0" ...
 $ Content.Rating: chr  "Everyone" "Everyone" "Everyone" "Teen" ...
 $ Genres        : chr  "Art & Design" "Art & Design;Pretend Play" "Art & Design" "Art & Design" ...
 $ Last.Updated  : chr  "07-Jan-18" "15-Jan-18" "01-Aug-18" "08-Jun-18" ...
 $ Current.Ver   : chr  "1.0.0" "2.0.0" "1.2.4" "Varies with device" ...
 $ Android.Ver   : chr  "4.0.3 and up" "4.0.3 and up" "4.0.3 and up" "4.2 and up" ...

The output above shows the structure of the dataset, including whether each column is numeric or character.

Columns

 [1] "App"            "Category"       "Rating"         "Reviews"       
 [5] "Size"           "Installs"       "Type"           "Price"         
 [9] "Content.Rating" "Genres"         "Last.Updated"   "Current.Ver"   
[13] "Android.Ver"   

This displays the column names of the dataset.

Summary

     App              Category             Rating        Reviews         
 Length:300         Length:300         Min.   :3.100   Length:300        
 Class :character   Class :character   1st Qu.:4.100   Class :character  
 Mode  :character   Mode  :character   Median :4.300   Mode  :character  
                                       Mean   :4.304                     
                                       3rd Qu.:4.500                     
                                       Max.   :4.900                     
                                       NA's   :11                        
     Size             Installs             Type              Price          
 Length:300         Length:300         Length:300         Length:300        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
                                                                            
 Content.Rating        Genres          Last.Updated       Current.Ver       
 Length:300         Length:300         Length:300         Length:300        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
                                                                            
 Android.Ver       
 Length:300        
 Class :character  
 Mode  :character  
                   
                   
                   
                   

The summary function displays the minimum, quartiles, median, mean, and number of NA values for numeric columns, and the length and class for character columns. These values are used for the plots later.

Change the type and remove Null value

                                                 App       Category Rating
1     Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN    4.1
2                                Coloring book moana ART_AND_DESIGN    3.9
3 U Launcher Lite – FREE Live Cool Themes, Hide Apps ART_AND_DESIGN    4.7
4                              Sketch - Draw & Paint ART_AND_DESIGN    4.5
5              Pixel Draw - Number Art Coloring Book ART_AND_DESIGN    4.3
6                         Paper flowers instructions ART_AND_DESIGN    4.4
  Reviews Size    Installs Type Price Content.Rating                    Genres
1     159 19.0     10,000+ Free     0       Everyone              Art & Design
2     967 14.0    500,000+ Free     0       Everyone Art & Design;Pretend Play
3   87510  8.7  5,000,000+ Free     0       Everyone              Art & Design
4  215644 25.0 50,000,000+ Free     0           Teen              Art & Design
5     967  2.8    100,000+ Free     0       Everyone   Art & Design;Creativity
6     167  5.6     50,000+ Free     0       Everyone              Art & Design
  Last.Updated        Current.Ver  Android.Ver
1    07-Jan-18              1.0.0 4.0.3 and up
2    15-Jan-18              2.0.0 4.0.3 and up
3    01-Aug-18              1.2.4 4.0.3 and up
4    08-Jun-18 Varies with device   4.2 and up
5    20-Jun-18                1.1   4.4 and up
6    26-Mar-17                  1   2.3 and up

Top apps

DD

Top installed apps under Beauty

The top apps in the Beauty category are plotted in a bar plot of installs. The most installed app shown is Hush - Beauty for Everyone, with a rating of 4.7, and the least installed is ipsy: Makeup, Beauty, and Tips, with a rating of 4.9.

Top installed apps under Art and Design

The top apps in the Art and Design category are plotted in a bar plot of installs. Four apps share the highest install count, with a rating of 4.7. The least installed app shown is Canva, which also has a rating of 4.7.

Category wise

DD

Rating with category

This bar plot shows rating by app category, split by app type (free or paid). Only the Business category has a paid app, rated 4.7; every other category has only free apps. All categories show nearly equal ratings, between 4.7 and 4.9.

Content Rate with category

This bar chart shows content rating by app category, split by whether the rating is above 4.5. Only in the Comics category do all apps have a rating above 4.5; the other categories are mixed. Business is the only category in which every app has the content rating Everyone.

Size with category

This bar chart shows app size by category, colored by content rating. Only in the Business category is every app rated Everyone; the other categories include other content ratings as well. The largest app is in Auto and Vehicles, at nearly 200 MB.

Rating with category

This plot shows the distribution of app ratings within each category. In the Comics category apps are rated only 4.7, in Books and Reference 40% of apps have a rating of 4.5, and in the Beauty category ratings range from 3.7 to 4.9.

Histogram

DD

Histogram of size

This histogram shows the frequency of app sizes. Most apps are between 0 and 20 MB. Only one app is 200 MB, and a few are between 50 and 70 MB. The mean app size is approximately 20 MB.

Size with rating

Most apps are between 10 and 20 MB, with 53 apps rated above 4.5 and 15 apps rated below 4.5. One app is 75 MB and one is 200 MB; these two are outliers.

Scatterplot

DD

Rating with Reviews

This plot shows rating against reviews, colored by app category. The Books and Reference category has one outlier in the number of reviews. Apps with low ratings also tend to have fewer reviews, and vice versa.

Boxplot

DD

Category with version

This box plot shows rating by Android version for apps in the Art and Design, Auto and Vehicles, and Beauty categories. In Art and Design, most apps require version 4.2 and up, with a median rating of 3.8. The Auto and Vehicles category has one outlier app with the lowest rating.

Content rating

DD

The bar chart shows the percentage of each content rating within each app category. In the Business category 100% of apps are rated Everyone. In Art and Design 90% are rated Everyone, 8% Teen, and 2% 10+. In the Comics category 50% of apps are rated for adults and 50% for teens.

Inference

DD

[1] "There are 6 categories in total in the dataset."
[1] "In the Play Store, most apps are in the Family category and the fewest are in the Comics category."
[1] "The Everyone content rating has the highest number of apps."
[1] "Most apps in the Google Play Store are rated between 3.5 and 4.8."
[1] "Apps in the Family category are installed the most."
[1] "The top-rated app in the Beauty category is ipsy: Makeup, Beauty, and Tips."

Download

---
title: "Google_playstore_app"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    social: menu
    theme: united
    storyboard: TRUE
    source_code: embed
---

```{r setup, include=FALSE}
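library(flexdashboard)
library(plyr)       # load plyr before dplyr/tidyverse so it does not mask dplyr verbs
library(tidyverse)  # attaches dplyr, ggplot2, and related packages
library(dplyr)
library(magrittr)
library(lattice)
library(ggplot2)
library(DT)
# Read the Kaggle Google Play Store data and keep the first 300 rows
gplay0 <- read.csv("C:/Users/HP/Downloads/googleplaystore.csv")
gplay <- head(gplay0, 300)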
```


-----------------------------------------------------------------------

# Introduction{.tabset}

The dataset used here describes apps in the Google Play Store and comes from Kaggle. It contains each app's name, category, rating, review count, and other fields. It also records whether an app is paid or free and the minimum Android version it supports.

## DD{.tabset}

### Structure

```{r}
str(gplay)
```

The output above shows the structure of the dataset, including whether each column is numeric or character.

### Columns

```{r}
names(gplay)
```

This displays the column names of the dataset.

### Summary

```{r}
summary(gplay)
```

The summary function displays the minimum, quartiles, median, mean, and number of NA values for numeric columns, and the length and class for character columns. These values are used for the plots later.
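
Base R's `summary()` does not report the standard deviation, so it has to be computed separately. A minimal sketch, using made-up ratings rather than the real data (the vector `ratings` is hypothetical):

```r
# Toy ratings, made up for illustration, including one missing value
ratings <- c(4.1, 3.9, 4.7, NA, 4.3)

# Quartiles, mean, and NA count
summary(ratings)

# Standard deviation and number of non-missing values, ignoring the NA
rating_sd <- sd(ratings, na.rm = TRUE)
rating_n <- sum(!is.na(ratings))
```

On the dashboard data the same calls could be applied to `gplay$Rating`.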

### Change the type and remove Null value

```{r}
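# Drop the trailing unit letter (e.g. "19M" -> "19") and convert to numeric;
# non-numeric entries such as "Varies with device" become NA
gplay$Size <- substr(gplay$Size, 1, nchar(gplay$Size) - 1)
gplay$Size <- as.numeric(gplay$Size)

# Reviews is read as character, so convert it as well
gplay$Reviews <- as.numeric(gplay$Reviews)

# Remove rows that contain missing values
gplay <- na.omit(gplay)

head(gplay)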
```
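
Stripping the last character assumes every size is given in megabytes. As a hedged sketch, the hypothetical helper `parse_size_mb()` below also handles sizes given in kilobytes (a `k` suffix, which is assumed here to occur in the full file) and returns `NA` for entries such as "Varies with device". The sample values are illustrative.

```r
# Hypothetical helper: convert size strings such as "19M" or "512k" to megabytes
parse_size_mb <- function(size) {
  # Remove a trailing M or k; anything else non-numeric becomes NA
  value <- suppressWarnings(as.numeric(sub("[Mk]$", "", size)))
  # Kilobyte entries are rescaled to megabytes
  ifelse(grepl("k$", size), value / 1024, value)
}

sizes_mb <- parse_size_mb(c("19M", "8.7M", "512k", "Varies with device"))
sizes_mb
```

Whether the extra branch matters depends on which rows are kept; it changes nothing when all sizes end in `M`.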

# Top apps

## DD{.tabset}

### Top installed apps under Beauty

```{r}
gp1 <- subset(gplay, Category == "BEAUTY", select = c(App, Installs, Rating))
gp1 <- slice_max(gp1, order_by = Rating, n = 5)  # five highest-rated Beauty apps
ggplot(gp1, aes(x = App, y = Installs, fill = Rating)) + geom_bar(stat = "identity") +
  labs(title = "Most installed Apps under Beauty Category", x = "Apps", y = "Installs")
```

The top apps in the Beauty category are plotted in a bar plot of installs. The most installed app shown is Hush - Beauty for Everyone, with a rating of 4.7, and the least installed is ipsy: Makeup, Beauty, and Tips, with a rating of 4.9.
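
`Installs` is a character column (for example "10,000+"), so ggplot2 treats it as a discrete axis and orders the values as text rather than by magnitude. One possible way to compare install counts numerically is to strip the `+` and `,` characters first. The helper name `parse_installs()` is an assumption introduced for this sketch:

```r
# Hypothetical helper: turn strings like "5,000,000+" into numbers
parse_installs <- function(installs) {
  as.numeric(gsub("[+,]", "", installs))
}

installs_num <- parse_installs(c("10,000+", "500,000+", "5,000,000+"))
installs_num
```

The resulting numeric column could then be mapped to the y axis so that bar heights reflect the lower bound of each install bracket.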

### Top installed apps under Art and Design

```{r}
gp2 <- subset(gplay, Category == "ART_AND_DESIGN", select = c(App, Installs, Rating))
gp2 <- slice_max(gp2, order_by = Rating, n = 10)  # ten highest-rated Art and Design apps
ggplot(gp2, aes(x = App, y = Installs, fill = Rating)) + geom_bar(stat = "identity")
```

The top apps in the Art and Design category are plotted in a bar plot of installs. Four apps share the highest install count, with a rating of 4.7. The least installed app shown is Canva, which also has a rating of 4.7.

# Category wise

## DD{.tabset}

### Rating with category

```{r}
ggplot(gplay, aes(x= Category, y= Rating, fill = Type)) +
  geom_bar(position='dodge',stat = "identity")
```

This bar plot shows rating by app category, split by app type (free or paid). Only the Business category has a paid app, rated 4.7; every other category has only free apps. All categories show nearly equal ratings, between 4.7 and 4.9.

### Content Rate with category

```{r}
ggplot(gplay, aes(x= Category, y=Content.Rating , fill = Rating>4.5)) +
  geom_bar(position='dodge',stat='identity')
```

This bar chart shows content rating by app category, split by whether the rating is above 4.5. Only in the Comics category do all apps have a rating above 4.5; the other categories are mixed. Business is the only category in which every app has the content rating Everyone.

### Size with category

```{r}
ggplot(gplay, aes(x= Category, y= Size, fill = Content.Rating)) +
  geom_bar(position='dodge',stat='identity')
```

This bar chart shows app size by category, colored by content rating. Only in the Business category is every app rated Everyone; the other categories include other content ratings as well. The largest app is in Auto and Vehicles, at nearly 200 MB.

### Rating with category

```{r}
histogram(~ Rating | Category, data = gplay, col = c(1, 3))
```

This plot shows the distribution of app ratings within each category. In the Comics category apps are rated only 4.7, in Books and Reference 40% of apps have a rating of 4.5, and in the Beauty category ratings range from 3.7 to 4.9.

# Histogram

## DD{.tabset}

### Histogram of size

```{r}
hist(gplay$Size,col = "green")
abline(v=mean(gplay$Size),col="black",lwd=2)
```

This histogram shows the frequency of app sizes. Most apps are between 0 and 20 MB. Only one app is 200 MB, and a few are between 50 and 70 MB. The mean app size is approximately 20 MB.
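
The bin counts and the centre of the distribution can also be checked numerically. The sketch below uses a handful of made-up sizes (not the dataset) with `cut()` and `table()`; the break points are an arbitrary choice for illustration.

```r
# Made-up app sizes in MB, with two large values in the right tail
sizes <- c(19, 14, 8.7, 25, 2.8, 5.6, 75, 200)

# Count apps per size bracket: (0,20], (20,50], (50,100], (100,200]
bins <- table(cut(sizes, breaks = c(0, 20, 50, 100, 200)))
bins

# With a long right tail the mean is pulled above the median
size_mean <- mean(sizes)
size_median <- median(sizes)
```

Comparing the mean with the median in this way indicates how strongly a few large apps influence the vertical line drawn by `abline()`.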

### Size with rating

```{r}
qplot(Size,data=gplay,geom="histogram",fill=Rating>4)
```

Most apps are between 10 and 20 MB, with 53 apps rated above 4.5 and 15 apps rated below 4.5. One app is 75 MB and one is 200 MB; these two are outliers.

# Scatterplot

## DD{.tabset}

### Rating with Reviews

```{r}
ggplot(gplay, aes(x=Rating, y=Reviews)) +
  geom_point(aes(color = Category))+geom_smooth()
```

This plot shows rating against reviews, colored by app category. The Books and Reference category has one outlier in the number of reviews. Apps with low ratings also tend to have fewer reviews, and vice versa.
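
The visual impression that reviews rise with rating can be quantified with a rank correlation. A minimal sketch on made-up values (the vectors below are illustrative, not taken from the dataset), using Spearman's method because review counts are heavily skewed:

```r
# Illustrative values only: ratings and review counts for five apps
rating <- c(3.5, 4.0, 4.2, 4.5, 4.7)
reviews <- c(120, 900, 15000, 87000, 215000)

# Spearman's rho compares ranks, so a single huge review count
# does not dominate the result
rho <- cor(rating, reviews, method = "spearman")
rho
```

On the real data the value would be expected to be lower than in this perfectly ordered toy example.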

# Boxplot

## DD{.tabset}

### Category with version

```{r}
# %in% tests membership; == would recycle the vector and silently drop rows
gp3 <- subset(gplay, Category %in% c("ART_AND_DESIGN", "AUTO_AND_VEHICLES", "BEAUTY"))
ggplot(gp3,aes(x=Category,y=Rating))+geom_boxplot(aes(fill=Android.Ver),outlier.shape = 2,outlier.color = "red")+theme_classic()+coord_flip()
```

This box plot shows rating by Android version for apps in the Art and Design, Auto and Vehicles, and Beauty categories. In Art and Design, most apps require version 4.2 and up, with a median rating of 3.8. The Auto and Vehicles category has one outlier app with the lowest rating.
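
When filtering on several categories, `%in%` tests membership, whereas `==` against a vector compares element by element and recycles the shorter vector. The toy vector below (made up for illustration) shows how the two can select different rows:

```r
# Made-up category column with six rows
category <- c("BEAUTY", "ART_AND_DESIGN", "BEAUTY",
              "COMICS", "AUTO_AND_VEHICLES", "ART_AND_DESIGN")
wanted <- c("ART_AND_DESIGN", "AUTO_AND_VEHICLES", "BEAUTY")

# Element-wise comparison: `wanted` is recycled along `category`
by_equals <- category == wanted

# Membership test: TRUE wherever the category is one of the wanted values
by_in <- category %in% wanted

sum(by_equals)
sum(by_in)
```

With `==`, a row is kept only if it happens to line up with the matching position in the recycled vector, so most matching rows are lost.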

# Content rating

## DD{.tabset}

```{r}
ggplot(gplay,aes(x=Category,fill=Content.Rating))+geom_bar(position = "fill")
```

The bar chart shows the percentage of each content rating within each app category. In the Business category 100% of apps are rated Everyone. In Art and Design 90% are rated Everyone, 8% Teen, and 2% 10+. In the Comics category 50% of apps are rated for adults and 50% for teens.
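
The percentages read off a `position = "fill"` chart can be computed directly with `prop.table()`. A small sketch on a made-up data frame (the rows below are illustrative and the name `apps` is hypothetical):

```r
# Made-up rows: two Business apps and two Comics apps
apps <- data.frame(
  Category = c("BUSINESS", "BUSINESS", "COMICS", "COMICS"),
  Content.Rating = c("Everyone", "Everyone", "Teen", "Adults only 18+")
)

# Row-wise proportions (margin = 1), expressed as percentages per category
shares <- prop.table(table(apps$Category, apps$Content.Rating), margin = 1) * 100
shares
```

Applying the same call to `gplay$Category` and `gplay$Content.Rating` would give the exact shares behind each bar.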

# Inference

## DD{.tabset}

```{r}
"There are 6 categories in total in the dataset."

"In the Play Store, most apps are in the Family category and the fewest are in the Comics category."

"The Everyone content rating has the highest number of apps."

"Most apps in the Google Play Store are rated between 3.5 and 4.8."

"Apps in the Family category are installed the most."

"The top-rated app in the Beauty category is ipsy: Makeup, Beauty, and Tips."
```
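
Statements like these can also be derived from the data instead of being typed by hand, which keeps them in step with whichever rows are loaded. A minimal sketch on a made-up category column (the data frame `apps` is hypothetical and does not reproduce the dataset):

```r
# Made-up category column for illustration
apps <- data.frame(
  Category = c("FAMILY", "FAMILY", "COMICS", "BEAUTY", "FAMILY", "BEAUTY")
)

# Number of distinct categories
n_categories <- length(unique(apps$Category))

# Categories with the most and the fewest apps
counts <- table(apps$Category)
most_common <- names(counts)[which.max(counts)]
least_common <- names(counts)[which.min(counts)]

paste0("There are ", n_categories, " categories; most apps are in ",
       most_common, " and the fewest are in ", least_common, ".")
```

In the dashboard the same expressions could be run on `gplay` so that the printed sentences reflect the current subset.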

# Download

```{r}
datatable(gplay,extensions='Buttons',options=list(dom="Bftrip",buttons=c('copy','print','csv','pdf')))
```