Google News Clusters: A deeper look at the traits which help make each segment unique

This briefing is a working document to help guide our collective thinking as we consider “who” these clusters are and how best we can introduce them to the world.

It may first help to refresh ourselves on the overall cluster distribution. As the table below shows, we technically have 7 segment groups, though one of them, the “Not interested” group, will be routinely disregarded throughout the rest of this analysis. This means that, at a maximum, there are six segments which we are trying to describe.

Furthermore, not all clusters are created (or at least analyzed) equally, so to speak. We have shown greater interest in clusters which meet some combination of the following criteria:

-Size of the cluster: All things being equal, larger groups are preferred to small ones, though there are exceptions. The “holding others accountable” cluster represents just 4% of the overall sample, and reaches a maximum size of 6% in the United States and the United Kingdom. Still, the group is of interest due to other characteristics we currently know about it.
-General media interest of the cluster, as well as interest in specific forms of news and/or topics: Clusters who, broadly speaking, report little interest in the news, or have no discernible interest in the information they do consume, are also of lower priority, all other things being equal.
-Use of digital platforms to get the news, especially those platforms where Google has a presence (news aggregators, online videos, search engines, etc.): The Google team’s interest in this domain probably requires no further explanation. It is worth noting that in recent discussions our attention has been focusing on the use of digital platforms for news (either specific platforms or in general) on a “multiple times a week,” “daily,” or “more than daily” basis. As these variables were used only as split variables in the analysis, they represent potentially interesting descriptors.
-When people get their news: This is relatively unexplored thus far, and these items were not actually used in the segmentation. However, there is the possibility these items could create actionable insights.
-Subscription/pay for content: Our team is interested in identifying those news consumers who pay (or have paid) for news, as well as the type of content they pay for.
-Notable demographic or personal characteristics, broadly speaking: While this topic has been explored in previous analyses, it is worth digging deeper for interesting demographic patterns that stand out with respect to certain clusters, especially characteristics which match previous research (or general conventional wisdom) about certain issues, such as young people being more tech-oriented. Examples of important demographics include:
-Age: (especially the 18-24 year old cohort, which was oversampled in all countries)
-Education: (note: we have this on a country by country basis, though ABD recoded it into a three-category general variable. We may rethink the recoding, but a challenge is simply that the sample for any given country is far more educated than the general public, though our sample is supposed to be representative of internet users, not the general public)
-Gender: Our sample has a collective (slight) male bias for reasons related to the disproportionate share of men sampled in India, in accordance with the natural gender skew of the internet-using population in that country.
-Type of device a person owns: Though this has not been super helpful thus far.
-Socio-economic status.
-Country of cluster: This would never be used to describe a cluster, but is important all the same to note. In theory, a given cluster group should have a similar country of residence distribution as the overall sample, which is about 12.5% per country (weights applied). As we will see, not all countries follow this pattern.

This does not represent an exhaustive list of all the attributes which might make for an interesting cluster description, but provides, at the very least, the more important themes to consider.

This document is broken down into several major sections, including:

  1. Examination of overall cluster distribution by country and tests of association
  2. Digital platform usage of the news – the Q1 series, focusing on the “multiple times a week” category, which we have not previously explored, at least using this definition.
  3. Time of day
  4. Topics (these were used in segmentation)
  5. Demographic analysis/News-interest items (essentially the first 7 questions of the survey). We have mostly looked at these before.
  6. TBD

In general, this analysis will ignore the “not interested” category, including in statistical analyses comparing one cluster group to all other respondents, as we are attempting to distinguish between the segments of interest. As this group represents 5% of respondents altogether (on a weighted basis), and up to 10% in some countries (the United States), it seemed sensible not to include them in any reference category for this analysis.

Cluster Results: Overall & Country Distribution

The table below shows the overall results of the clusters (using an abbreviated version of the names for each segment, due to space limitations), including the “not interested” group (though we will soon forget about them). “Single topic users” and “No celebrities, community or government” collectively represent over half of all respondents (53%), and constitute around 60% of respondents in 6 of the 8 countries (with only Indonesia and India breaking from this trend).

“Experts/celebs/community,” also makes up a significant share of news users, at 19%. At the country level, this group is as large as 43% of internet users in Indonesia and as small as 7% in the United States.

“Super subscribers” make up 13% overall, but are the plurality segment in India, constituting 43% of internet users there. India, though, is the exception, as this group falls within a narrow range of 5%-13% in the other countries.

“Non-Google,” and “Not interested,” will be skipped over.

“Holding others accountable” is a group we find intriguing, despite its small size. Overall, 4% of internet users across the 8 countries are classified in the group, with the highest share being 6% (in both the USA and UK).

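| Segment group (%) | All countries | USA | UK | Germany | France | Brazil | Indonesia | India | Japan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Single topic users | 28 | 29 | 27 | 25 | 34 | 42 | 28 | 19 | 17 |
| No celeb/community/gov | 25 | 28 | 37 | 37 | 29 | 17 | 11 | 6 | 40 |
| Experts/celeb/community | 19 | 7 | 11 | 13 | 15 | 26 | 43 | 26 | 13 |
| Super subscribers | 13 | 13 | 8 | 10 | 5 | 10 | 13 | 43 | 6 |
| Non-Google | 6 | 7 | 6 | 6 | 6 | 1 | 2 | 1 | 18 |
| Not interested | 5 | 10 | 6 | 5 | 6 | 2 | 2 | 1 | 3 |
| Holding others accountable | 4 | 6 | 6 | 4 | 5 | 2 | 2 | 4 | 2 |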

Given this country variation, what can be said about the relationship between the country of residence of an internet user and the cluster group they fall in? Is there even a relationship?

As a basic first step, a Pearson’s chi-square test for independence was performed on the two variables: country and news cluster group (the “not interested” group was excluded from the analysis). The weighted counts were assessed, and the chi-square statistic computed (see below).

Weighted counts of cluster by country
| Country | Experts/celeb/community | No celeb/community/gov | Holding others accountable | Non-Google | Super subscribers | Single topic users |
| --- | --- | --- | --- | --- | --- | --- |
| USA | 101.64 | 397.33 | 90.31 | 99.49 | 186.16 | 420.88 |
| UK | 153.14 | 519.66 | 85.28 | 77.57 | 106.60 | 376.40 |
| Germany | 182.27 | 519.38 | 56.80 | 77.92 | 136.96 | 353.75 |
| France | 208.32 | 404.16 | 75.31 | 86.20 | 64.79 | 475.06 |
| Brazil | 365.55 | 234.90 | 27.99 | 14.99 | 143.56 | 599.87 |
| Indonesia | 611.05 | 159.37 | 29.03 | 23.57 | 183.27 | 392.97 |
| India | 361.12 | 78.12 | 62.05 | 14.36 | 613.72 | 268.88 |
| Japan | 186.71 | 556.53 | 23.96 | 254.34 | 84.19 | 235.43 |
## 
##  Pearson's Chi-squared test
## 
## data:  chi.table
## X-squared = 3284.6, df = 35, p-value < 2.2e-16

This test shows, then, that country and news cluster group are not independent of each other: they have a statistically significant relationship. We can see this more clearly by mapping the Pearson residuals, which tell us which cells (the intersection of a given row and column) contribute the most to the total chi-square statistic. This is plotted below. Looking at the graph, it confirms a few observations:

  • There is a strong positive association between India and super subscribers.
  • A nearly as strong relationship is apparent with respect to Indonesia and “experts, celebrities and community news”.
  • Japan and “Non-Google” exhibit a positive relationship as well.
  • Though harder to see in the chart, there is a notable negative relationship with respect to Indonesia and India for “no celebrities, community or government.”
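The test and the residuals discussed above can be reproduced directly from the weighted counts table. The following is a minimal sketch, assuming the counts are typed in by hand from that table (rounded to two decimals, so the statistic may differ slightly from the printed output); the object name `chi.table` is taken from the test output, while `chi.res`, `resid.tab` and `top.cell` are illustrative names.

```r
# Weighted counts copied from the table above (rows: countries, columns: clusters).
clusters <- c("Experts/celeb/community", "No celeb/community/gov",
              "Holding others accountable", "Non-Google",
              "Super subscribers", "Single topic users")
countries <- c("USA", "UK", "Germany", "France",
               "Brazil", "Indonesia", "India", "Japan")

chi.table <- matrix(
  c(101.64, 397.33, 90.31,  99.49, 186.16, 420.88,
    153.14, 519.66, 85.28,  77.57, 106.60, 376.40,
    182.27, 519.38, 56.80,  77.92, 136.96, 353.75,
    208.32, 404.16, 75.31,  86.20,  64.79, 475.06,
    365.55, 234.90, 27.99,  14.99, 143.56, 599.87,
    611.05, 159.37, 29.03,  23.57, 183.27, 392.97,
    361.12,  78.12, 62.05,  14.36, 613.72, 268.88,
    186.71, 556.53, 23.96, 254.34,  84.19, 235.43),
  nrow = 8, byrow = TRUE,
  dimnames = list(countries, clusters)
)

# Pearson chi-square test of independence on the 8 x 6 table.
chi.res <- chisq.test(chi.table)
print(chi.res)

# Pearson residuals: (observed - expected) / sqrt(expected).
# Large positive values mark cells over-represented relative to independence.
resid.tab <- round(chi.res$residuals, 1)
print(resid.tab)

# Locate the cell with the largest positive residual.
top.cell <- which(chi.res$residuals == max(chi.res$residuals), arr.ind = TRUE)
cat("Largest positive residual:",
    rownames(chi.table)[top.cell[1, 1]], "x",
    colnames(chi.table)[top.cell[1, 2]], "\n")
```

Working from residuals rather than raw percentages has the advantage of accounting for both the size of the country sample and the size of the cluster, so a large cluster does not look "associated" with every country simply because it is large.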

Digital platform usage

This section focuses on the Q1 series of the survey, which reads:

Q1. How often do you do each of the following to get the news? For the purpose of this survey, by “news” we mean any information you read, watch, or listen to that helps you stay informed about your interests, your community, and the world around you. This can include things like current events, local news, weather, sports, politics, entertainment news, celebrity news, etc.

Response options: 1) Never 2) Less than once a month 3) At least once per month 4) Once per week 5) A few times per week 6) Daily 7) Several times a day

Individuals were then asked about 13 types of platforms, either “traditional” in nature (television, radio, and print newspapers or magazines) or digital (social media, search engines, news websites, online video platforms, news apps, news aggregators, text messaging, email newsletters, podcasts and virtual assistants). Two other platforms featured on the survey are not included in this analysis: “friends and family,” due to the difficulty of placing it in the traditional/digital paradigm, and “messaging apps,” which was not asked in every country.

Turning to how often the clusters use either ANY digital or traditional platforms (on at least a more than weekly basis), it is apparent all groups have high, bordering on universal rates of using digital news sources. Use of traditional sources is high, but in all cases lags behind digital usage.

“Experts/celebrities/community news” and “super subscribers” record the highest overall digital usage rate, with 98% using these platforms at least multiple times a week. However, “super subscribers” are the more balanced of the two in their platform usage, as 94% of this group also uses traditional platforms (within four points of its digital usage).

| Cluster | digital_user_weekly_plus (%) | traditional_user_weekly_plus (%) | digital_divide (pts) |
| --- | --- | --- | --- |
| Experts/celeb/community | 98 | 90 | 8 |
| Super subscribers | 98 | 94 | 4 |
| Non-Google | 93 | 86 | 7 |
| Holding others accountable | 91 | 81 | 10 |
| Single topic users | 88 | 79 | 9 |
| No celeb/community/gov | 84 | 83 | 1 |

Still, these figures provide relatively limited information in terms of our main goal: finding meaningful, useful differences between the clusters. Below we look at how many digital and/or traditional platforms each of the Google News clusters used on a more than weekly basis, again excluding the “not interested” group.

As a reminder, the survey asked about a total of 10 digital platforms and 3 traditional platforms, at least for the purpose of this analysis. The table below provides the mean number of digital platforms used by each segment on a more than weekly basis, as well as the corresponding figure for traditional methods of getting the news. The “mean_centered” columns simply divide the averages for digital and traditional platforms by the respective total number of platforms a respondent could use (i.e. turn the average into a proportion) and then multiply that figure by 100. This allows for a more straightforward comparison of digital platforms to traditional ones (though it does not correct for the imbalance between the number of digital and traditional sources asked about on the survey).
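The rescaling described above is a simple percent-of-maximum calculation. A minimal sketch, assuming the rounded means reported in this section's table are used as inputs (so the results can differ from the reported columns in the second decimal, since those were presumably computed from unrounded means); `platform.means`, `n.digital` and `n.trad` are illustrative names.

```r
# Mean number of platforms used more than weekly, per cluster (rounded values
# from the report's table).
platform.means <- data.frame(
  cluster = c("Experts/celeb/community", "No celeb/community/gov",
              "Holding others accountable", "Non-Google",
              "Super subscribers", "Single topic users"),
  digital_mean = c(6.57, 2.83, 5.03, 4.26, 7.99, 4.11),
  trad_mean    = c(1.88, 1.47, 1.66, 1.59, 2.36, 1.43)
)

# Number of platforms of each type asked about in the Q1 series.
n.digital <- 10
n.trad    <- 3

# Express each mean as a percentage of the maximum possible count.
platform.means$digital_mean_centered <-
  round(platform.means$digital_mean / n.digital * 100, 1)
platform.means$trad_mean_centered <-
  round(platform.means$trad_mean / n.trad * 100, 1)

print(platform.means)
```

Note that, despite the column name, nothing is being mean-centered here: the figure is the share of available platforms used, which is what makes the 10-platform and 3-platform counts comparable.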

Here we see the super subscribers’ ravenous desire for information come to the surface. This group uses an average of 7.99 (let’s say 8) of the 10 digital platforms on a more than weekly basis to get the news; this group uses 2.36 of the 3 traditional sources this often as well. For both types of platforms, super subscribers are using roughly four-fifths of the mediums asked about on the survey. No other group uses such a wide-ranging number of platforms.

Experts, celebrities and community news: This group logs the second highest rate of digital platform usage, at 6.57. This is well above all other groups (in a statistically significant manner), with the exception of the super subscribers. Their usage of traditional platforms, at 1.88, is also the second highest of the groups, though this figure is more in line with the other cluster groups than their digital platform usage is.

Holding others accountable: This small group uses 5.03 digital platforms on a more than weekly basis, and 1.66 traditional platforms.

Non-Google: Note that for all three groups highlighted thus far, there is relative parity in terms of the proportion of digital and traditional platforms used. For “Non-Google,” however, we see that they were more likely to say they were using a traditional platform on a more than weekly basis than a digital one (compare “trad_mean_centered” to “digital_mean_centered”).

Single topic users: This group uses an average of 4.11 digital platforms and 1.43 traditional platforms on a more than weekly basis.

| Cluster | digital_mean | trad_mean | digital_mean_centered | trad_mean_centered |
| --- | --- | --- | --- | --- |
| Experts/celeb/community | 6.57 | 1.88 | 65.74 | 62.76 |
| No celeb/community/gov | 2.83 | 1.47 | 28.26 | 48.96 |
| Holding others accountable | 5.03 | 1.66 | 50.27 | 55.32 |
| Non-Google | 4.26 | 1.59 | 42.61 | 52.87 |
| Super subscribers | 7.99 | 2.36 | 79.88 | 78.74 |
| Single topic users | 4.11 | 1.43 | 41.13 | 47.61 |

The next obvious question is which platforms are more popular with each group. Answering this question depends, to some degree, on how platform usage is measured – we have, in the past, focused on the percentage who said daily or more than daily and, more recently, the percentage who said “multiple times a week,” or more frequently.

However, this analysis requires a more nuanced approach in order to see the differences between the groups. As a result, the focus now turns to the average frequency rating each cluster collectively gave to each platform, on a 1 to 7 scale, with 1 representing “never” and 7 representing “several times a day.” While this is not the most intuitive way of looking at the data, it guards against the loss of information often caused by categorization, necessary as that process can be to ease interpretation.
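To illustrate the difference between the two measures, here is a minimal sketch using hypothetical data: five made-up respondents with ratings on the 1 to 7 Q1 scale and made-up survey weights. The weight column is named `WGT` to match the weight variable used elsewhere in this analysis; everything else (`toy`, `mean.rating`, `weekly.plus`) is illustrative only.

```r
# Hypothetical respondents: Q1 frequency rating (1 = never, 7 = several times
# a day) for one platform, plus a survey weight.
toy <- data.frame(
  rating = c(7, 6, 5, 3, 1),
  WGT    = c(1.2, 0.8, 1.0, 1.1, 0.9)
)

# Measure 1: weighted average frequency rating (keeps the full 1-7 scale).
mean.rating <- weighted.mean(toy$rating, toy$WGT)

# Measure 2: weighted percentage using the platform "a few times per week"
# or more often (rating of 5 or higher), i.e. the categorized version.
weekly.plus <- weighted.mean(toy$rating >= 5, toy$WGT) * 100

cat("Weighted mean rating:", round(mean.rating, 2), "\n")
cat("Weighted % weekly-plus:", round(weekly.plus, 0), "\n")
```

In the categorized measure, a respondent who answers "a few times per week" counts exactly the same as one who answers "several times a day," which is the loss of information the average rating avoids.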

Below the average frequency rating for each platform is shown on a cluster by cluster basis, but first a few general comments:

-Television is the most frequently used source across all groups, when ranked by the average frequency rating (indeed, television reigns supreme even if the focus shifts to the percentage who said “daily/more than daily” or “more than weekly”). This may seem surprising, given how digitally connected several of the Google News cluster groups are. However, the digital platform space is somewhat more fragmented than the traditional news mediums, especially television. This is not just an artifact of the survey; it reflects reality.
-Search engines were typically the second most frequently used source, ranking second in 4 of the 6 clusters. Interestingly, the exceptions were “Holding others accountable” and “Super subscribers,” both of whom had social media rank second (though the difference with search engines is not statistically significant, so we are speaking purely in nominal terms).
-The third most commonly used platform varied across the segments, with social media taking this place for 2 groups, search engines for another 2 groups, news websites for 1 and news aggregators for another 1. “No celebrities, community or government” ranked news websites at number 3, while the “Non-Google” group ranked news aggregators third.

For reference: the overall rating score for each platform among respondents who are in one of the six clusters of interest (i.e. excluding the “not interested”) is:

-Television: 5.21
-Search engines: 4.67
-Social Media: 4.47
-News websites: 4.47
-Online videos: 4.18
-News aggregators: 4.07
-News apps: 4.05
-Radio: 3.97
-Print newspapers: 3.79
-Text messaging: 3.55
-Virtual Assistants: 3.51
-Email newsletters: 3.27
-Podcasts: 3.13

Average platform usage: Experts, celebrities and community news

Looking at each segment, we first turn to “Experts, celebrities and community news.” As can be seen below, TV was, just marginally (though the difference is statistically significant), the most common platform for this group. From there, a number of digital sources are used about as frequently: search engines (5.43 average rating), social media (5.42), online videos (5.29) and news websites (5.2). News apps and news aggregators represent the next tier of platforms for this group, averaging 4.88 and 4.82, indicating most respondents turn to these platforms on a weekly or “multiple times a week” basis.

All other platforms are used less often, though most still receive a rating above “4,” again signaling the most common response was some sort of weekly usage. Podcasts and email newsletters trail behind the rest.

The median aggregate frequency rating for this group is 4.82, higher than all other groups except “super subscribers,” whom we turn to next.

Average Platform Usage: Super Subscribers

The chart below emphasizes how in tune with the news this group is, or at least how many sources they consult and how frequently. While television is the top source, six sources in total have an average rating above 5.5, indicating that most individuals said they used these platforms on a “more than weekly” or “daily” basis. Besides television, these platforms are social media (5.64), search engines (5.61), online videos (5.58), news websites (5.57) and news apps (5.55).

Only radio and podcasts fall below the average rating of 5.

The median aggregate rating for this group is 5.43, the highest of all groups.

Average Platform Usage: Holding Others Accountable

The ratings for this group, for most platforms, tend to fall between “4” and just below “5,” indicating some level of weekly usage. Television is the most used, at 4.93; next is social media at 4.67. Search engines, news websites, news apps and online videos follow until we hit the next “traditional” platform, radio (4.15). Virtual assistants and podcasts are at the bottom.

Average Platform Usage: Other Clusters

Below are the platform rating charts for the remaining clusters, ranked as above. For these groups, general usage falls, with the median frequency rating falling below 4.0. Television of course stays in front, for all groups – though it has a clear lead with the “no celebrities, community or government,” group relative to the other platforms. “Single topic users,” favor search engines and social media after television.

“Non-Google,” interestingly, uses search engines about as often as television, though they simply do not use Google search. Yahoo has the lead among the specified search engines asked about on the survey. Also, “Non-Google” has several platforms near the top of its rankings which were less prominent among other groups, notably “virtual assistants,” which places 4th, at least on a nominal basis, with a score of 3.98.

Time of Day Analysis

The survey asks respondents who said that they used any of the given news platforms at least once a month the following question:

-Q8: In a typical day, do you get news from [NEWS PLATFORM] while doing each of the following (yes or no):

  1. While getting ready for the day
  2. While commuting to or from work
  3. While working or at school
  4. While taking a break from work or school
  5. While exercising
  6. When you are passing time (e.g. waiting in line, waiting for an appointment, waiting for a meeting to begin)
  7. During a meal
  8. While relaxing
  9. Before going to sleep

In total, respondents could say “yes” to up to 9 instances (activities) during which they use a given platform on a typical day.

These items can be examined in several ways – not all of which will be examined here. We have already determined which platforms are broadly popular – television, search engines, social media, news websites and online videos tend to be the most popular platforms to receive news from, though there is important variation by cluster.

Are the most popular platforms also the ones used the most frequently on a daily basis? The answer is both “yes,” and “no.”

Television, the most popular medium across the six segments of interest, registers a lower average number of ‘check-ins’ over the day compared to other platforms. Among all respondents in the six clusters, the number of instances people watched television (on average and out of nine possible opportunities) was 4.2, a full unit below social media (5.2 instances), and also below online videos (4.7), text messaging (4.6), search engines (4.4) and news aggregators (4.3). As the table below shows (in particular the “6-Cluster Average” column), the traditional platforms lag behind most of their digital counterparts in terms of how often individuals are checking them. Newspapers, for instance, are at the bottom of the list in terms of how many times people, on average, checked the news during the day.

| Platform | 6-Cluster Average | Super.subscribers | Experts.celeb.community | No.celeb.community.gov | Holding.others.accountable | Non.Google | Single.topic.users |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Social media | 5.2 | 6.6 | 5.8 | 4.1 | 5.3 | 4.3 | 5.1 |
| Online videos | 4.7 | 6.3 | 5.1 | 3.4 | 4.9 | 3.9 | 4.6 |
| Text messaging | 4.6 | 6.2 | 4.9 | 3.2 | 5.1 | 3.9 | 4.4 |
| Search engine | 4.4 | 6.2 | 4.9 | 3.0 | 4.8 | 3.4 | 4.0 |
| News aggregator | 4.3 | 6.2 | 4.4 | 2.8 | 4.7 | 3.8 | 3.8 |
| News apps | 4.3 | 6.2 | 4.6 | 2.8 | 4.9 | 3.5 | 3.8 |
| News websites | 4.2 | 6.1 | 4.6 | 2.7 | 4.4 | 3.6 | 3.7 |
| Television | 4.2 | 6.0 | 4.4 | 2.9 | 4.6 | 3.8 | 3.6 |
| Virtual assistant | 4.2 | 6.0 | 4.3 | 2.8 | 4.9 | 3.7 | 3.7 |
| Radio | 4.1 | 5.9 | 4.2 | 2.7 | 4.7 | 3.5 | 3.5 |
| Email newsletter | 4.0 | 5.9 | 3.9 | 2.5 | 4.8 | 3.7 | 3.5 |
| Newspapers | 3.8 | 5.8 | 3.8 | 2.2 | 4.6 | 3.3 | 3.2 |
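For readers who want to see how a ‘check-in’ count of this kind can be derived, the following is a minimal sketch on hypothetical data: three made-up monthly users of one platform, their yes/no answers to the nine Q8 activities coded 1/0, and made-up weights. It assumes the reported averages are weighted means of per-respondent counts, which the text implies but does not state outright; all object names are illustrative.

```r
# Hypothetical 1/0 answers to the nine Q8 activities (columns) for three
# monthly users of a single platform (rows).
q8.toy <- matrix(
  c(1, 1, 0, 1, 0, 1, 1, 1, 1,
    0, 0, 0, 1, 0, 1, 0, 1, 0,
    1, 0, 0, 0, 0, 0, 0, 1, 1),
  nrow = 3, byrow = TRUE
)

# Hypothetical survey weights for the same three respondents.
wgt.toy <- c(1.0, 1.5, 0.5)

# Number of activities (out of 9) during which each respondent uses the platform.
checkins <- rowSums(q8.toy)

# Weighted average number of check-ins among monthly users of the platform.
avg.checkins <- weighted.mean(checkins, wgt.toy)

print(checkins)
print(round(avg.checkins, 1))
```

Because only respondents who use a platform at least monthly are asked Q8, each platform's average rests on a different base, which is worth keeping in mind when comparing rows of the table.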

If the platforms which are most frequently used do not precisely match those which are used most commonly throughout the day, there is a closer relationship between how often the Google news clusters check (overall and on average) the different platforms on any given day and their more general platform usage.

More specifically – and as can be seen in the above table – the super subscribers are frequently checking virtually all platforms (though keep in mind this is only among people who said they used the platform at least monthly). Social media is the most frequently checked source of news for this group – as it is by every other group; though online videos, text messaging, search engines and news aggregators are checked nearly as often.

“No celebrities, community or government” is notable for how much more frequently they check social media on any given day (4.1 times on average) compared to any other platform (3.4 times is the next highest level). However, “experts, celebrities and community news” exhibit somewhat similar behavior: monthly users of social media in this cluster check social media to get the news an average of 5.8 times per day, compared to 5.1 times for online videos, the next most checked source.

“Holding others accountable” is notable for the relative lack of variation in terms of how many times, on average, people in this cluster check each platform. All values hover around 5 instances, out of a possible 9.

Percentage who use given platforms during time of day by Google News Cluster

This section examines the times of day at which each cluster group turns to the most salient (i.e. popular) platforms.

While there is variation as to when a platform is most used for each cluster, some general trends stand out:

-“Super subscribers” are more likely to get their news “while getting ready for the day” than at any other time though, as we have seen, this group tends to check in on the news quite a bit.
-The time of day most other groups prefer to get their news is “while relaxing,” when social media is especially used.
-More generally, “while relaxing” is the most common time a person gets their news.

First, we turn to social media. The table below shows the percentage of people who said they used social media to get the news during each of the following activities. For 5 of the 6 cluster groups, the most common time to get news from social media is “while relaxing,” including “experts, celebrities and community” (78%), “single topic users” (75%), and “holding others accountable” (69%, tied with “while getting ready for the day”). All three of these groups were among the most frequent users of social media. The outright most frequent users of social media, the super subscribers, use this medium most frequently “while getting ready for the day” (85%).

In a typical day do you get news from SOCIAL MEDIA while doing each of the following (% yes, among monthly users)
| Wording | Super.subscribers | Experts.celeb.community | No.celeb.community.gov | Holding.others.accountable | Non.Google | Single.topic.users |
| --- | --- | --- | --- | --- | --- | --- |
| While getting ready for the day | 85 | 69 | 48 | 69 | 48 | 64 |
| While commuting | 71 | 58 | 31 | 54 | 43 | 47 |
| While working or at school | 68 | 53 | 31 | 51 | 39 | 47 |
| While taking a break | 75 | 73 | 55 | 65 | 55 | 65 |
| While exercising | 62 | 42 | 17 | 48 | 34 | 30 |
| When passing the time | 78 | 77 | 63 | 65 | 56 | 71 |
| During a meal | 69 | 59 | 38 | 50 | 46 | 53 |
| While relaxing | 80 | 78 | 73 | 69 | 65 | 75 |
| Before going to sleep | 81 | 73 | 57 | 66 | 56 | 68 |
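The peak activity for each cluster can be read off the social media table programmatically, which also surfaces ties. This is a minimal sketch, assuming the percentages are typed in from the table above; `social` and `peak` are illustrative names.

```r
# Percent "yes" for social media, by activity (rows) and cluster (columns),
# copied from the table above.
activities <- c("While getting ready for the day", "While commuting",
                "While working or at school", "While taking a break",
                "While exercising", "When passing the time",
                "During a meal", "While relaxing", "Before going to sleep")
clusters <- c("Super.subscribers", "Experts.celeb.community",
              "No.celeb.community.gov", "Holding.others.accountable",
              "Non.Google", "Single.topic.users")

social <- matrix(
  c(85, 69, 48, 69, 48, 64,
    71, 58, 31, 54, 43, 47,
    68, 53, 31, 51, 39, 47,
    75, 73, 55, 65, 55, 65,
    62, 42, 17, 48, 34, 30,
    78, 77, 63, 65, 56, 71,
    69, 59, 38, 50, 46, 53,
    80, 78, 73, 69, 65, 75,
    81, 73, 57, 66, 56, 68),
  nrow = 9, byrow = TRUE,
  dimnames = list(activities, clusters)
)

# For each cluster, list the activity (or tied activities) with the highest
# percentage of "yes" answers.
peak <- apply(social, 2, function(x) {
  paste(rownames(social)[x == max(x)], collapse = " / ")
})

print(peak)
```

The same approach applies unchanged to the search engine, television and online video tables by swapping in their values.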

Next, we turn to search engines. The most frequent users of this platform (as measured by average ratings) include the “Super subscribers,” “Experts, celebrities and community news,” and “Holding others accountable.”

The below table shows when users of search engines use that platform on a typical day. The “Super subscribers,” have two favorite (or most frequent) times of day to check the news on search engines: while getting ready for the day and while relaxing.

For all other groups, “relaxing,” is the most frequent time to consume the news.

In a typical day do you get news from SEARCH ENGINES while doing each of the following (% yes, among monthly users)
| Wording | Super.subscribers | Experts.celeb.community | No.celeb.community.gov | Holding.others.accountable | Non.Google | Single.topic.users |
| --- | --- | --- | --- | --- | --- | --- |
| While getting ready for the day | 78 | 57 | 29 | 60 | 37 | 45 |
| While commuting | 65 | 47 | 20 | 47 | 30 | 35 |
| While working or at school | 63 | 47 | 24 | 51 | 29 | 39 |
| While taking a break | 72 | 64 | 43 | 53 | 45 | 54 |
| While exercising | 58 | 32 | 10 | 44 | 20 | 23 |
| When passing the time | 73 | 68 | 48 | 61 | 47 | 57 |
| During a meal | 63 | 45 | 23 | 51 | 33 | 39 |
| While relaxing | 78 | 73 | 65 | 63 | 58 | 64 |
| Before going to sleep | 73 | 61 | 42 | 60 | 43 | 52 |

Next, we turn to television to see to what extent “time of day” preferences vary when looking at a traditional, rather than digital, platform. Television is the most heavily used source across all groups though, as we have seen, it is not checked as frequently throughout the day as many digital sources.
Still, we see a similar pattern: “while relaxing” is the most common time people consume news via television, though this is not true for the super subscribers. This group is again most likely to say they get the news “while getting ready for the day.”

In a typical day do you get news from TELEVISION while doing each of the following (% yes, among monthly users)
| Wording | Super.subscribers | Experts.celeb.community | No.celeb.community.gov | Holding.others.accountable | Non.Google | Single.topic.users |
| --- | --- | --- | --- | --- | --- | --- |
| While getting ready for the day | 85 | 66 | 47 | 66 | 59 | 56 |
| While commuting | 58 | 31 | 11 | 44 | 26 | 20 |
| While working or at school | 54 | 29 | 9 | 42 | 23 | 19 |
| While taking a break | 63 | 44 | 20 | 44 | 34 | 32 |
| While exercising | 57 | 28 | 9 | 42 | 27 | 20 |
| When passing the time | 67 | 50 | 24 | 50 | 43 | 37 |
| During a meal | 72 | 61 | 55 | 58 | 59 | 58 |
| While relaxing | 78 | 76 | 72 | 68 | 68 | 69 |
| Before going to sleep | 75 | 60 | 47 | 58 | 52 | 52 |

Next, we look at online videos. Again, “while relaxing” is the main time of day people use this platform to get the news, except for “super subscribers,” who once more peak “while getting ready for the day” (80%).

In a typical day do you get news from ONLINE VIDEO PLATFORMS while doing each of the following (% yes, among monthly users)
| Wording | Super.subscribers | Experts.celeb.community | No.celeb.community.gov | Holding.others.accountable | Non.Google | Single.topic.users |
| --- | --- | --- | --- | --- | --- | --- |
| While getting ready for the day | 80 | 58 | 38 | 63 | 45 | 55 |
| While commuting | 66 | 46 | 24 | 47 | 37 | 39 |
| While working or at school | 62 | 43 | 20 | 47 | 34 | 37 |
| While taking a break | 73 | 66 | 42 | 53 | 46 | 57 |
| While exercising | 61 | 36 | 16 | 50 | 34 | 30 |
| When passing the time | 75 | 68 | 47 | 58 | 45 | 60 |
| During a meal | 67 | 53 | 35 | 53 | 42 | 51 |
| While relaxing | 77 | 78 | 70 | 68 | 64 | 71 |
| Before going to sleep | 77 | 70 | 56 | 63 | 53 | 62 |

Topic of Interest

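# Packages used below: dplyr (pipes, filter, joins), tidyr (drop_na),
# forcats (fct_relevel) and ggplot2 (charts).
library(dplyr)
library(tidyr)
library(forcats)
library(ggplot2)

# Lookup table mapping each topic variable (QTAG) to a readable label (q.label).
topic.labs<-read.csv("topic_var_labels.csv", header=TRUE, stringsAsFactors = FALSE)%>%
  dplyr::as_tibble()

# The topic items sit in columns 806-820 (15 variables) of the survey data frame `df`.
topic.vars<-colnames(df[806:820])

# Keep respondents assigned to a cluster, excluding the "Not interested" group.
df.reduced<-df%>%
  drop_na(cluster_short_labels)%>%
  dplyr::filter(cluster_short_labels != "Not interested")

# two_tab_long_looper_dep() is a project helper that is not defined in this document;
# here it cross-tabulates each topic item against the 0/1 segment flag, weighted by WGT.
# Keep only the "Selected" rows and attach the readable topic labels.
topic.celeb<-two_tab_long_looper_dep(df.reduced, "seg_experts_celebs", topic.vars, "WGT")%>%
  dplyr::mutate(pct=round(pct,0))%>%
  dplyr::select(-unweighted_n)%>%
  dplyr::filter(dep_category == "Selected")%>%
  left_join(topic.labs, by=c("dep_var"="QTAG"))%>%
  dplyr::select(-Wording)%>%
  relocate(q.label, .after="dep_var")

# Label the segment flag and put the cluster of interest first in the legend.
topic.celeb$ind_category<-factor(topic.celeb$ind_category, levels=c(0,1), labels=c("All other clusters", "Celeb/expert/community"))

topic.celeb$ind_category<-fct_relevel(topic.celeb$ind_category, "Celeb/expert/community")

# Grouped bar chart: % selecting each topic, cluster vs. all other clusters.
ggplot(topic.celeb, aes(x=q.label, y=pct, fill=ind_category, label=pct))+
  geom_bar(stat="identity", color="black", position=position_dodge())+
  theme_minimal()+scale_fill_brewer(palette="Blues")+
  geom_text(aes(label=pct), vjust=1.6, color="white",
            position = position_dodge(0.9), size=3.5)+
  labs(x="", title="Experts, celebrities and community news")+
  theme(legend.position = "top")+
  theme(axis.text.x=element_text(size=10, angle=90))+
  theme(legend.title=element_blank())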

# Same tabulation for the super subscribers segment flag.
topic.tab<-two_tab_long_looper_dep(df.reduced, "seg_super_subscribers", topic.vars, "WGT")%>%
  dplyr::mutate(pct=round(pct,0))%>%
  dplyr::select(-unweighted_n)%>%
  dplyr::filter(dep_category == "Selected")%>%
  left_join(topic.labs, by=c("dep_var"="QTAG"))%>%
  dplyr::select(-Wording)%>%
  relocate(q.label, .after="dep_var")

# Label the segment flag and put the cluster of interest first in the legend.
topic.tab$ind_category<-factor(topic.tab$ind_category, levels=c(0,1), labels=c("All other clusters", "Super subscribers"))

topic.tab$ind_category<-fct_relevel(topic.tab$ind_category, "Super subscribers")

# Grouped bar chart: % selecting each topic, super subscribers vs. all other clusters.
ggplot(topic.tab, aes(x=q.label, y=pct, fill=ind_category, label=pct))+
  geom_bar(stat="identity", color="black", position=position_dodge())+
  theme_minimal()+scale_fill_brewer(palette="Blues")+
  geom_text(aes(label=pct), vjust=1.6, color="white",
            position = position_dodge(0.9), size=3.5)+
  labs(x="", title="Super subscribers")+
  theme(legend.position = "top")+
  theme(axis.text.x=element_text(size=10, angle=90))+
  theme(legend.title=element_blank())