Owning the agenda:
using machine learning to observe dynamics in issue salience

Professor Andrea Carson

La Trobe University

Professor Simon Jackman

University of Sydney

13 December 2022

Coverage of 2022 campaign widely criticised

  • light on substance and over-reliant on standard tropes of election coverage:
    • live vision of the ceremonial visit to the GG
    • televised “shouty” leader debates
    • the campaign “launches”
    • daily briefings, appearances & stunts by leaders
    • resulting “gotchas” & gaffes keenly sought by a media starved for compelling content.

  • #this_is_not_journalism Twitter hashtag

  • Adam Bandt, “Google it, mate”

Agenda setting and the media

  • Mixed findings over 100 years of research

  • Walter Lippmann (1922) reflected on the power of the new media to present images to the public

  • Katz and Lazarsfeld (1955); Cohen (1963); McCombs and Shaw (1972).

  • Digital age (Norris 2004; Blumler 2016)

Our analysis

  • Latest AES 2022 (McAllister et al.) shows that most Australians follow election news online.

  • This allows us to better understand salience during election campaigns by seeing what the audience engages with (issue salience) rather than the past focus on what the media reports (media salience)

Data: Facebook posts by Australian media entities

  • Meta’s CrowdTangle API used to collect 98,388 Facebook posts by media entities

  • span 242 media organisations

  • FB posts published between the calling of the federal election on Sunday 10 April 2022 and the close of polls at 6pm AEST on Saturday 21 May 2022.

  • At the time of data collection shortly after the election, posts had attracted 34,898,854 user interactions.

Top 10 of 242 media entities ranked by total user interactions with FB content over the 2022 election campaign

                                                                 Interactions/post
Publisher                   Posts  Subscribers  Interactions  (median)  (per 1K subs)
news.com.au                 2,359    2,356,732     5,977,219     321.0           0.14
Sky News Australia          7,627    1,221,534     2,964,189     167.0           0.14
ABC News                    2,746    4,561,406     2,002,949     295.5           0.06
9 News                      1,256    2,945,316     1,656,813     561.5           0.19
7NEWS Australia             2,115    2,441,651     1,324,896     293.0           0.12
7NEWS Sydney                2,774    2,445,236     1,055,390     132.0           0.05
The Sydney Morning Herald   2,008    1,230,593     1,043,701     240.0           0.20
9 News Melbourne            1,187    1,037,112       835,109     196.0           0.19
7NEWS Melbourne             2,010    1,857,843       806,824     171.0           0.09
Courier Mail                1,119      620,473       705,275     302.0           0.49

Sample of 10 from middle tercile of 242 media entities ranked by total user interactions with content over the 2022 election campaign

                                                                 Interactions/post
Publisher                   Posts  Subscribers  Interactions  (median)  (per 1K subs)
7NEWS Mackay                  642       83,700        32,278        13           0.15
The Morning Bulletin          488       40,987        26,615        18           0.44
ABC Illawarra                 166       68,746        25,339        68           0.98
The Bendigo Advertiser        608       61,098        18,682         9           0.15
South Western Times           353       33,301        13,395        10           0.30
ABC South East NSW            103       59,749        10,781        48           0.80
Warwick Daily News            106       21,518         9,005        38           1.77
Wangaratta Chronicle          211       13,499         5,156         8           0.59
The Maitland Mercury          254       48,623         4,811        12           0.25
Pilbara News                   80       42,659         3,993        28           0.66

Sample of 10 from lowest tercile of 242 media entities ranked by total user interactions with content over the 2022 election campaign

                                                                 Interactions/post
Publisher                   Posts  Subscribers  Interactions  (median)  (per 1K subs)
The Condobolin Argus          193        8,025         3,837       5.0           0.62
Blacktown Advocate            114       42,409         3,615       8.0           0.19
The Dubbo News                114       12,142         2,775       9.5           0.78
Merimbula News Weekly          64       17,611         2,393      30.5           1.73
Parkes Champion Post           36       12,504         1,770      33.5           2.68
North West Telegraph           39       11,585         1,537      14.0           1.21
Murray Valley Standard         46       16,271         1,211      16.5           1.01
Macleay Argus                  65       11,596         1,136       7.0           0.60
The Ararat Advertiser          95       11,055         1,064       7.0           0.63
Southern Highlands Express      8        1,020            63       6.0           5.89

Classification of documents into topics, pre-2019

  • “bag of words” model + latent Dirichlet allocation long-standing approach in polisci:
    • remove stop words,
    • stem,
    • count word co-occurrences in documents
    • form latent classes, interpret as topics
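
The preprocessing steps listed above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in any particular study: the stop-word list is tiny, `crude_stem` is a hypothetical stand-in for a real stemmer such as Porter's, and the final step of fitting latent classes (e.g. latent Dirichlet allocation) is not shown.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop-word list; real pipelines use a much longer one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "is", "are", "as", "by", "at"}


def crude_stem(word: str) -> str:
    """Strip a couple of common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(doc: str) -> Counter:
    """Lower-case, tokenise, drop stop words, stem, and count terms."""
    tokens = re.findall(r"[a-z]+", doc.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


def cooccurrences(docs: list[str]) -> Counter:
    """Count the number of documents in which each unordered pair of terms co-occurs."""
    pairs = Counter()
    for doc in docs:
        terms = sorted(bag_of_words(doc))  # unique terms, in a stable order
        pairs.update(combinations(terms, 2))
    return pairs


docs = [
    "Soaring inflation puts interest rate hike on the cards",
    "Inflation soars as interest rates rise",
]
bags = [bag_of_words(d) for d in docs]
pairs = cooccurrences(docs)
```

Word order is discarded at the first step, which is the limitation that sentence-embedding models address.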

Post-2019

  • we use sentence-embedding models that preserve the semantic structure of sentences/documents.

  • Sentences are represented (“embedded”) as positions in a high-dimensional vector space.

  • the mapping from sentences to vectors uses pre-trained large language models (LLMs), built using Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al. 2019) or Generative Pre-trained Transformer (GPT) models (Brown et al. 2020)

  • transformers: deep-learning models that process the entire input stream at once and learn to weight different elements of the input stream differentially; optimised on tasks such as predicting missing elements, including entire sentences (Wolf et al. 2020). Transformer-based models have sat behind Google search since 2020.

  • we use all-mpnet-base-v2, derived from an LLM developed at Microsoft; produces 768-dimensional embeddings of sentences/documents; freely available at huggingface.co
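
As a rough sketch of the embedding step, the function below loads the model through the sentence-transformers Python package. That package, the function name `embed_posts`, and the Hugging Face model identifier used as the default are assumptions made for illustration; the slides name only the model, not the tooling used to run it.

```python
def embed_posts(texts, model_name="sentence-transformers/all-mpnet-base-v2"):
    """Return one embedding row per text (768 columns for all-mpnet-base-v2)."""
    # Imported lazily so the sketch can be loaded without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # downloads the weights on first use
    return model.encode(list(texts))  # NumPy array of shape (len(texts), 768)
```

Calling `embed_posts(["Soaring inflation puts pre-election interest rate hike on the cards."])` would return a single 768-element row, provided the package is installed and the model can be downloaded.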

Example: embeddings of 5 typical FB posts

  1. Sky News, 8 May 4:07pm: Albanese ‘untroubled’ by Howard’s scathing comments. Opposition Leader Anthony Albanese says he is untroubled by comments made by former prime minister John Howard, who labelled him a “left-wing inner-city bomb thrower”. (2,004 interactions)

  2. The Age, 10 April 10:08pm: How Anthony Albanese emerged from the shadows to become a leader in the spotlight. The Opposition Leader has built a reputation for discipline and toughness within the Labor Party, but the question is whether he can win enough public recognition. (732 interactions).

  3. 9 News, 28 April 9:14am: Soaring inflation puts pre-election interest rate hike on the cards. Prime Minister Scott Morrison says international challenges are to blame for Australia’s cost of living crisis, as inflation soars to a 21-year high. (3,986 interactions).

  4. Crikey, 19 May 2022, 4:15pm: Your Say: Crikey readers share their thoughts in the final week of the campaign. “The Coalition appears to be united in making an art form out of being dishonest, with Prime Minister Scott Morrison leading the way.” Crikey readers share their thoughts in the final week of the election campaign. (93 interactions).

  5. ABC News, 15 May 2022, 8:30am: Australia became a world leader in COVID infections this week. And the experts say this is why we need to care. With the country in middle of an election campaign, a war raging in Ukraine and cost of living rising, COVID-19 has dropped off the radar for some Australians. But experts say the number of daily cases and deaths is “extraordinary”. (5,299 interactions)

1st 20 elements of 768-length vector representation of each post:

Sky News -0.025 0.082 0.000 0.017 0.010 0.008 0.009 0.033 -0.043 -0.035 -0.028 0.037 -0.059 -0.072 -0.001 0.034 0.010 0.031 0.095 -0.029
The Age 0.004 0.110 0.006 0.033 0.007 -0.046 -0.041 0.009 -0.072 0.027 0.025 0.025 -0.027 -0.011 0.046 -0.077 -0.011 0.026 0.023 -0.001
9 News -0.036 0.024 0.019 -0.043 0.011 0.051 -0.050 -0.005 -0.006 0.020 0.024 0.008 -0.023 0.093 0.024 -0.004 0.025 -0.045 0.078 -0.032
Crikey -0.005 0.084 0.005 -0.017 -0.065 -0.001 -0.075 -0.019 -0.010 0.016 0.069 -0.018 -0.015 0.025 -0.019 -0.062 0.012 0.009 0.027 -0.008
ABC News -0.017 0.038 0.012 -0.066 -0.021 -0.044 0.004 -0.049 0.021 0.010 0.066 0.004 -0.014 0.102 0.023 -0.042 -0.001 -0.018 0.041 0.027

Cosine similarities, vector representations of 5 posts

          Sky News  The Age  9 News  Crikey  ABC News
Sky News     1.000    0.534   0.182   0.283     0.156
The Age      0.534    1.000   0.249   0.403     0.257
9 News       0.182    0.249   1.000   0.400     0.417
Crikey       0.283    0.403   0.400   1.000     0.345
ABC News     0.156    0.257   0.417   0.345     1.000
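
For readers unfamiliar with the measure, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below applies it to the first 20 elements of the Sky News and The Age vectors listed earlier. Because only 20 of the 768 elements are used, the result will not reproduce the 0.534 reported in the matrix; the function name and variable names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v: (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# First 20 of 768 elements only, copied from the listing of embeddings.
sky_news = [-0.025, 0.082, 0.000, 0.017, 0.010, 0.008, 0.009, 0.033, -0.043, -0.035,
            -0.028, 0.037, -0.059, -0.072, -0.001, 0.034, 0.010, 0.031, 0.095, -0.029]
the_age = [0.004, 0.110, 0.006, 0.033, 0.007, -0.046, -0.041, 0.009, -0.072, 0.027,
           0.025, 0.025, -0.027, -0.011, 0.046, -0.077, -0.011, 0.026, 0.023, -0.001]

sim = cosine_similarity(sky_news, the_age)
```

A value of 1 means the two vectors point in the same direction, and 0 means they are orthogonal. In the matrix, the two Albanese-focused posts (Sky News and The Age) are the most similar pair.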

Classification after embedding

  • use BERTopic modeling pipeline (Grootendorst 2022)

  • once sentences/documents embedded in a vector space, wide array of clustering/classification/dimension-reduction tools available

  • a two-step process, designed to deal with non-smoothness of embeddings and balance between imposing structure and over-stuffing documents into clusters/topics (our approach for initial unsupervised passes through the data)

    • Uniform Manifold Approximation and Projection for Dimension Reduction, or UMAP (McInnes, Healy, and Melville 2020)

    • hierarchical, density-based clustering, or HDBSCAN (McInnes, Healy, and Astels 2017)

  • alternatives would be PCA or k-means, but we find these too restrictive in the unsupervised/exploratory case.
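
The two-step process can be made explicit when assembling a BERTopic model, as in the sketch below. It assumes the bertopic, umap-learn and hdbscan packages are installed. The function name `build_topic_model` is hypothetical, and the parameter values are illustrative choices, not settings reported in these slides.

```python
def build_topic_model(embedding_model="all-mpnet-base-v2"):
    """Assemble a BERTopic model with explicit UMAP and HDBSCAN steps."""
    # Lazy imports: requires the bertopic, umap-learn and hdbscan packages.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from umap import UMAP

    # Step 1: reduce the 768-dimensional embeddings to a handful of dimensions.
    umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine")
    # Step 2: density-based clustering; points in sparse regions are labelled -1.
    hdbscan_model = HDBSCAN(min_cluster_size=10, metric="euclidean",
                            cluster_selection_method="eom", prediction_data=True)
    return BERTopic(embedding_model=embedding_model,
                    umap_model=umap_model,
                    hdbscan_model=hdbscan_model)


# Usage (not run here): topics, probs = build_topic_model().fit_transform(docs)
```

Unlike k-means, HDBSCAN does not require the number of clusters in advance and can leave documents unassigned, which suits an exploratory first pass.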

Unsupervised model

  • defaults in BERTopic

  • produces over 2,000 topics

  • but the hierarchical structure makes it easy to inspect branches of the topic tree and identify higher-order topics

Semi-supervised model fitting

  • collapse topic branches from the unsupervised model based on

    • our substantive interest in election campaign
    • prior expectations about topics based on inspection of daily print media front pages, TV news, etc.
  • no need to distinguish celebrity news from lifestyle

  • force model to extract election-related topics (ERTs)

  • “hard code” a small set of documents based on keywords and phrases found in documents assigned to topics thought to exist a priori and well-extracted by unsupervised approach
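
One way to picture the "hard coding" step is as a vector of partial labels: documents matching a keyword list receive a topic index, and everything else is left unlabelled. The sketch below is an assumption about how this could be done, not a description of the authors' code. The topic names, keywords and the function `partial_labels` are all hypothetical.

```python
def partial_labels(docs, topic_keywords):
    """Return one label per document: a topic index if a keyword matches, else -1."""
    topic_names = list(topic_keywords)
    labels = []
    for doc in docs:
        text = doc.lower()
        label = -1  # -1 marks a document left for the model to classify
        for index, name in enumerate(topic_names):
            if any(keyword in text for keyword in topic_keywords[name]):
                label = index
                break  # first matching topic wins
        labels.append(label)
    return topic_names, labels


# Hypothetical topics and keywords, for illustration only.
topic_keywords = {
    "cost of living": ["inflation", "cost of living", "interest rate"],
    "covid": ["covid"],
}
docs = [
    "Soaring inflation puts pre-election interest rate hike on the cards.",
    "Australia became a world leader in COVID infections this week.",
    "A gorilla celebrates its 65th birthday.",
]
topic_names, labels = partial_labels(docs, topic_keywords)
```

BERTopic's `fit_transform` accepts partial labels of this form through its `y` argument, with -1 for unlabelled documents, so a vector like `labels` could be used to steer the model toward the expected topics.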

Results

  • 22 mutually exclusive and exhaustive topics, 15 of them election-related topics (ERTs)

  • a “miscellaneous” topic is the single largest topic, spanning lifestyle, pets and animals and shopping to Elon Musk’s proposed takeover of Twitter.

  • Celebrity and entertainment news, sports and crime follow in both prevalence and user interactions.

  • even during an election campaign, FB posts featuring trees with water gushing from their trunks (882,000 interactions), a gorilla’s 65th birthday (235,000 interactions), sheep, dogs, babies, chickens (and combinations thereof) generate orders of magnitude more FB interactions than election news.

  • Top 50 posts by user interactions account for 5.25M or 15% of all user interactions with almost 100,000 media FB posts. None are related to the election.

Interactions with FB posts by media entities, by topic and day, for ERTs ranked 1 through 5 by total interactions, during the 2022 Australian federal election campaign.

Interactions with FB posts by media entities, by topic and day, for ERTs ranked 6 through 10 by total interactions, during the 2022 Australian federal election campaign.

Interactions with FB posts by media entities, by topic and day, for ERTs ranked 11 through 15 by total interactions, during the 2022 Australian federal election campaign.

Interactions versus posts, engagement versus supply

  • FB user interactions a function of supply of posts + number of subscribers + interest in topic
  • we factor out these components of supply
  • normalised interaction count (NIC) = interactions per post per thousand subscribers
  • median NIC within topic an indicator of topic salience
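
The NIC calculation can be sketched as follows. The posts in the example are hypothetical and the function names are illustrative; only the definition (interactions per post per thousand subscribers, then the median within topic) comes from the slide.

```python
from collections import defaultdict
from statistics import median


def nic(interactions, subscribers):
    """Interactions on one post per thousand subscribers of the posting account."""
    return interactions / (subscribers / 1000)


def median_nic_by_topic(posts):
    """posts: iterable of (topic, interactions, subscribers) tuples, one per post."""
    by_topic = defaultdict(list)
    for topic, interactions, subscribers in posts:
        by_topic[topic].append(nic(interactions, subscribers))
    return {topic: median(values) for topic, values in by_topic.items()}


# Hypothetical posts, for illustration only.
posts = [
    ("cost of living", 200, 100_000),  # NIC 2.0
    ("cost of living", 50, 50_000),    # NIC 1.0
    ("cost of living", 900, 300_000),  # NIC 3.0
    ("leaders", 400, 100_000),         # NIC 4.0
]
salience = median_nic_by_topic(posts)
```

Using the median rather than the mean keeps a single viral post from dominating a topic's salience score.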

Interactions per post per thousand FB account subscribers, median within each topic (normalised interaction count, or NIC), plotted against volume of posts per topic

Our discoveries and contributions

  • media roundly criticised for 2022 campaign

  • but hip-pocket issues (housing affordability, cost of living, inflation, wages and interest rates) constituted the single largest source of content in the media’s coverage of the campaign.

  • “gotcha” and “gaffe” focus on Albanese small in volume when set against entirety of campaign content

  • analysing FB interactions as a measure of the public’s reaction to media content is a novel contribution to the study of Australian campaigns and media: it measures issue salience from the consumer side, not the supply side

  • topics centred on personalities — the leaders, upstart independents or controversial candidates — engage audiences disproportionately.

  • consistent with other analyses of 2022 election (e.g., AES), highlighting Morrison’s unpopularity

  • rather than lambast media for focus on personalities etc — and given political economy of contemporary news media — the surprise is that there isn’t more focus on these more vivid elements of campaigns

References

Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.” arXiv. https://doi.org/10.48550/arXiv.2005.14165.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv. https://arxiv.org/abs/1810.04805.
Grootendorst, Maarten. 2022. “BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure.” arXiv. https://arxiv.org/abs/2203.05794.
McInnes, Leland, John Healy, and Steve Astels. 2017. “Hdbscan: Hierarchical Density Based Clustering.” The Journal of Open Source Software 2 (11): 205. https://doi.org/10.21105/joss.00205.
McInnes, Leland, John Healy, and James Melville. 2020. “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.” arXiv. https://doi.org/10.48550/arXiv.1802.03426.
Wolf, Thomas, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, et al. 2020. “Transformers: State-of-the-Art Natural Language Processing.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 38–45. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-demos.6.