Data analysis using Google Trends
Google searches are one of the most important datasets ever collected. Google is not only a tool for searching and getting answers but also a great means of understanding people around the world. It is a digital gold mine that can reveal much previously unknown information.


“How to …” search analysis (USA)

We frequently type “how to …” queries into Google, so here is an analysis of which “how to” queries we ask the most.

To get the “how to” queries, we use the gtrendsR R package. We query the term “how to” for each year from 2012 to 2017 and show the most searched related queries as a word cloud.

# Packages used in this analysis
library(gtrendsR)    # gtrends()
library(wordcloud2)  # wordcloud2()
library(data.table)  # setDT()
library(knitr)       # kable()
library(kableExtra)  # kable_styling() and %>%
library(plotly)      # plot_geo()

# Fetch the queries related to "How to" in the US for the given time span
getYearTrends <- function(timeline)
{
  trends <- gtrends("How to", geo = "US", time = timeline)
  return(trends$related_queries)
}

# Draw a word cloud of the related queries for the given time span
gTrendswordcloud <- function(timeline)
{
  queries <- getYearTrends(timeline)

  # Turn the score column into numbers: "<1" becomes 0, "Breakout" becomes 9999,
  # and "%" and "," are stripped before conversion
  score <- queries$subject
  score[score == "<1"] <- "0"
  score[score == "Breakout"] <- "9999"
  queries$subjectNew <- as.numeric(gsub("[%,]", "", score))

  # Rescale the "rising" scores to 0-100 so they are comparable with the "top" scores
  rising <- subset(queries, related_queries == "rising")
  rising$subjectNew <- rising$subjectNew * 100 / max(rising$subjectNew)
  top <- subset(queries, related_queries == "top")
  queries <- rbind(rising, top)

  # Keep queries with a score above 1 and plot them, largest first
  queries$subjectNew <- as.integer(queries$subjectNew)
  cloud <- subset(queries, subjectNew > 1, select = c("value", "subjectNew"))
  colnames(cloud)[2] <- "freq"
  cloud <- cloud[order(-cloud$freq), ]
  wordcloud2(data = cloud)
}

# Example call:
# gTrendswordcloud("2017-01-01 2017-12-31")
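
The score clean-up inside gTrendswordcloud can be tried on its own, without calling Google Trends. The sketch below is only an illustration: parseScore is a hypothetical helper name, and the sample strings are made up to resemble the kinds of values the code above handles ("<1", "Breakout", and percentages with thousands separators).

```r
# Convert Google Trends score strings to numbers.
# "<1" -> 0 and "Breakout" -> 9999 follow the mapping used in gTrendswordcloud.
# parseScore is a hypothetical helper name, not part of gtrendsR.
parseScore <- function(subject) {
  subject[subject == "<1"] <- "0"
  subject[subject == "Breakout"] <- "9999"
  as.numeric(gsub("[%,]", "", subject))
}

# Made-up sample values for illustration only
rawScores <- c("100", "<1", "Breakout", "+4,950%", "+250%")
scores <- parseScore(rawScores)
print(scores)

# Rescale so that the largest score becomes 100, as done for the "rising" queries
rescaled <- scores * 100 / max(scores)
print(round(rescaled, 1))
```

Mapping "Breakout" to a large constant such as 9999 is a design choice: it guarantees that breakout queries dominate the rescaled scores, at the cost of compressing every other rising query towards zero.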
[Word clouds for 2012, 2013, 2014, 2015, 2016 and 2017]

The yearly word clouds can be combined into a GIF image using ezgif.com.

The result shows the most searched “How to …” queries on Google. The clouds give greater prominence to queries that are searched more frequently.

tr2012<-getYearTrends("2012-01-01 2012-12-31")
tr2013<-getYearTrends("2013-01-01 2013-12-31")
tr2014<-getYearTrends("2014-01-01 2014-12-31")
tr2015<-getYearTrends("2015-01-01 2015-12-31")
tr2016<-getYearTrends("2016-01-01 2016-12-31")
tr2017<-getYearTrends("2017-01-01 2017-12-31")

all<-rbind(tr2012,tr2013,tr2014,tr2015,tr2016,tr2017)
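
The six calls above differ only in the year, so they could also be generated in a loop. This is a minimal sketch under stated assumptions: buildTimelines and fetchAllYears are hypothetical helper names, and fakeFetch is a stand-in for getYearTrends so that the example runs without contacting Google Trends.

```r
# Build one "YYYY-01-01 YYYY-12-31" time span per year, named by year.
buildTimelines <- function(years) {
  timelines <- sprintf("%d-01-01 %d-12-31", years, years)
  names(timelines) <- years
  timelines
}

# Fetch every year with the supplied function and stack the results.
# `fetch` is a placeholder argument: pass getYearTrends to query Google Trends.
fetchAllYears <- function(years, fetch) {
  do.call(rbind, lapply(buildTimelines(years), fetch))
}

# Stand-in for getYearTrends so the sketch runs offline (hypothetical data)
fakeFetch <- function(timeline) {
  data.frame(value = "how to tie a tie", timeline = timeline,
             stringsAsFactors = FALSE)
}

timelines <- buildTimelines(2012:2017)
allYears <- fetchAllYears(2012:2017, fakeFetch)
print(timelines[["2012"]])
print(nrow(allYears))
```

With the real function, fetchAllYears(2012:2017, getYearTrends) would play the role of the rbind call above.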

#plot_ly(x = df$Var1,y = df$Freq,name = "SF Zoo",type = "bar")

“How to” queries asked in more than four years

df<-as.data.frame(table(all$value))
df<-subset(df, df$Freq > 4)
df[order(-df$Freq),] 
##                               Var1 Freq
## 8                 how to boil eggs    6
## 70                how to tie a tie    6
## 25 how to delete instagram account    5
## 27                     how to draw    5
## 32     how to get away with murder    5
## 46              how to lose weight    5
## 78            how to write a check    5
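
Note that table(all$value) counts rows, so a query that is listed as both "top" and "rising" in the same year would be counted twice. A possible refinement is to count distinct years instead. The sketch below assumes each yearly table has been tagged with a year column (an assumption, the tables above carry no such column) and uses made-up toy rows.

```r
# Toy data: one row per (year, query) listing; the year column is an assumption.
# "how to tie a tie" appears twice in 2012 (for example as "top" and "rising").
toy <- data.frame(
  year  = c(2012, 2012, 2013, 2013, 2014, 2014),
  value = c("how to tie a tie", "how to tie a tie", "how to tie a tie",
            "how to draw", "how to draw", "how to boil eggs"),
  stringsAsFactors = FALSE
)

# Counting rows: duplicates within a year inflate the count
rowCounts <- table(toy$value)

# Counting distinct years: drop duplicate (year, query) pairs first
distinctPairs <- unique(toy[, c("year", "value")])
yearsPerQuery <- as.data.frame(table(distinctPairs$value),
                               stringsAsFactors = FALSE)
colnames(yearsPerQuery) <- c("query", "years")
yearsPerQuery <- yearsPerQuery[order(-yearsPerQuery$years), ]
print(yearsPerQuery)
```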

How to queries asked only in the year 2012

# Queries that appear in 2012 but in none of the other years (anti-join on "value")
get2012uniqueQueries <- function() {
  allExcept2012 <- rbind(tr2013, tr2014, tr2015, tr2016, tr2017)
  only2012 <- setDT(tr2012)[!allExcept2012, on = "value"]
  only2012df <- as.data.frame(only2012$value)
  colnames(only2012df) <- "Year 2012 Unique queries"
  only2012df <- unique(only2012df)
  kable(only2012df, "html") %>%
    kable_styling(bootstrap_options = "striped", full_width = F)
}

How to queries asked only in the year 2016

# Queries that appear in 2016 but in none of the other years (anti-join on "value")
get2016uniqueQueries <- function() {
  allExcept2016 <- rbind(tr2012, tr2013, tr2014, tr2015, tr2017)
  only2016 <- setDT(tr2016)[!allExcept2016, on = "value"]
  only2016df <- as.data.frame(only2016$value)
  colnames(only2016df) <- "Year 2016 Unique queries"
  only2016df <- unique(only2016df)
  kable(only2016df, "html") %>%
    kable_styling(bootstrap_options = "striped", full_width = F)
}

How to queries asked only in the year 2017

# Queries that appear in 2017 but in none of the other years (anti-join on "value")
get2017uniqueQueries <- function() {
  allExcept2017 <- rbind(tr2012, tr2013, tr2014, tr2015, tr2016)
  only2017 <- setDT(tr2017)[!allExcept2017, on = "value"]
  only2017df <- as.data.frame(only2017$value)
  colnames(only2017df) <- "Year 2017 Unique queries"
  only2017df <- unique(only2017df)
  kable(only2017df, "html") %>%
    kable_styling(bootstrap_options = "striped", full_width = F)
}
Year 2012 Unique queries
how to make out
how to rock
how to make french toast
how to tie a bow tie
how to unlock iphone 4
how to get married in skyrim
how to make pancakes
how to deactivate facebook
how to get rid of blackheads
how to get followers on instagram
how to make scrambled eggs
how to breed a rainbow dragon

Year 2016 Unique queries
how to pronounce
how to make money
how to be single
how to register to vote
how to delete instagram
how to use snapchat
how to play pokemon go
how to delete snapchat
how to cook chicken breast
how to play powerball
how to draw a dog

Year 2017 Unique queries
how to be a latin lover
how long to bake chicken
how to make slime without glue
how to make fluffy slime
how to buy bitcoin
how to cook spaghetti squash
how to get rid of fruit flies
how to cook quinoa
how to delete apps on iphone 7
how to watch mayweather vs mcgregor
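
The three get…uniqueQueries functions differ only in the year they single out, so the same idea can be written once. The following is a sketch in base R rather than the data.table anti-join used above: getUniqueQueries is a hypothetical helper, and it assumes the queries of each year are held in a named list of character vectors (for example the value column of each yearly table). The toy list is made up from a few queries in the lists above.

```r
# Return the queries that appear in `year` and in no other year.
# trendsByYear: named list of character vectors, one per year (an assumption
# about how the data is organised; not the structure used in the code above).
getUniqueQueries <- function(trendsByYear, year) {
  target <- trendsByYear[[year]]
  others <- unlist(trendsByYear[names(trendsByYear) != year], use.names = FALSE)
  setdiff(target, others)  # setdiff also removes duplicates
}

# Toy data for illustration only
toyTrends <- list(
  "2012" = c("how to tie a tie", "how to unlock iphone 4"),
  "2016" = c("how to tie a tie", "how to play pokemon go"),
  "2017" = c("how to tie a tie", "how to buy bitcoin")
)

uniqueQueries2016 <- getUniqueQueries(toyTrends, "2016")
print(uniqueQueries2016)
```

Passing the year as an argument avoids keeping three near-identical functions in sync when another year is added.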

Summary

  • It is good to see “how to vote” and “how to register to vote” in 2016, since that was an election year.
  • We keep asking “how to tie a tie”, “how to boil eggs” and “how to draw” year after year.
  • People are now searching more about Snapchat than about Facebook.
  • We have moved from the iPhone 4 to the iPhone 7 (obviously).

Unique “How to …” searches (World): editor’s choice

Here are some interesting queries from the last 12 months.

howToData <- read.csv("https://raw.githubusercontent.com/chirag-vithlani/Capstone/master/data/How_to_Interesting.csv")
howToDataSubSet<-subset(howToData, select = c(1, 4))
colnames(howToDataSubSet)[2] <- "Country"

kable(howToDataSubSet, "html") %>%
  kable_styling(bootstrap_options = "striped", full_width = F)
Topic                                     Country
how to make paper flowers                 Bhutan
how to take pictures of northern lights   Iceland
how to become good teacher                India
how to get twins                          Kenya
how to hack facebook                      Myanmar
how to make carrot oil                    Nigeria
how to handle wife                        Pakistan
how to identify AIDS                      Sri Lanka
how to delete telegram account            Uzbekistan
how to measure infiltration rate          Zimbabwe
# Build the data frame for the map from the downloaded CSV
LAND_ISO <- howToData$Country
value <- howToData$val
topic <- howToData$Topic
data <- data.frame(LAND_ISO, value, topic)

# Draw a world map; the hover text of each country is its query
g <- list(scope = 'world')

plot_geo(data) %>%
  add_trace(
    z = ~value, locations = ~LAND_ISO,
    colors = c(Pass="yellow", High="red", Low= "cyan", Sigma= "magenta", Mean='limegreen', Fail="blue", Median="violet"),
    text = ~topic
  ) %>%
  layout(geo = g) %>%
  hide_colorbar()

Todo: hovering over a country on the map shows a value that is wrong. I am working on it.

How to handle wife

Of the unique “How to” queries above, I found “how to handle wife” both funny and serious. It hints at gender inequality, and I would expect locations where this query is common to have high gender inequality. So here we find the top five such countries.

howToHandleWifeSearch<-gtrends("how to handle wife", time = "today 12-m")

howToHandleWifeSearchHead<-head(howToHandleWifeSearch$interest_by_country,5)
howToHandleWifeSearchHead<-subset(howToHandleWifeSearchHead, select = c(1, 2))
colnames(howToHandleWifeSearchHead)[2] <- "Percentage of Hits"


kable(howToHandleWifeSearchHead, "html") %>%
  kable_styling(bootstrap_options = "striped", full_width = F)
location               Percentage of Hits
Pakistan               100
Sri Lanka              69
United Arab Emirates   54
India                  42
Bangladesh             34
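
Because head() simply keeps the first five rows, the table above is a top five only if interest_by_country is already sorted by hits. A minimal sketch that sorts explicitly is shown below. It assumes a numeric column named hits (an assumption: the real column may hold text such as "<1" and need converting first). The toy rows reuse the figures from the table above, in shuffled order.

```r
# Toy stand-in for interest_by_country; the column name `hits` is an assumption
interest <- data.frame(
  location = c("India", "Pakistan", "Bangladesh", "Sri Lanka",
               "United Arab Emirates"),
  hits = c(42, 100, 34, 69, 54),
  stringsAsFactors = FALSE
)

# Sort by hits, largest first, before taking the top five
topFive <- head(interest[order(-interest$hits), ], 5)
print(topFive)
```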

Northern Lights

The northern lights are a natural light display in the Earth’s sky, seen predominantly in high-latitude regions such as Iceland. That is why people in Iceland search for “how to take pictures of northern lights”. This was the most amazing thing I learned while working on this project.


Source: Wikipedia