Basic bag-of-words approaches measure the overall sentiment of tweets by identifying the proportion of negative versus positive words surrounding keywords. The PeRspective API allows a more nuanced analysis of the sentiment of text: using several trained machine learning models, we can measure the perceived impact of a comment (in this case, a tweet) on a conversation. It has been specifically designed to analyze “comments”, in other words “a single post to a web page’s comments section, a forum post, a message to a mailing list, a chat message…”. The models used to score comments are Convolutional Neural Networks (CNNs) trained with GloVe word embeddings. They were created using thousands of comments from online sources such as Wikipedia and The New York Times, each of which was human-coded to train the models.
The PeRspective API has several models, including alpha models and experimental models. For the purpose of this analysis, we will be using the alpha ‘TOXICITY’ model along with two experimental models: ‘IDENTITY_ATTACK’ and ‘THREAT’. Additionally, the ‘SEXUALLY_EXPLICIT’ and ‘FLIRTATION’ experimental models will be explored specifically in relation to tweets that mention women candidates.
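To make the model selection concrete, here is a minimal sketch of how a scoring request for these attributes could be assembled and how the returned scores could be read back. It is written in Python for illustration only and is not the code used in this analysis. The helper names `build_request` and `extract_scores` are hypothetical, and the example response contains made-up values rather than real scores.

```python
import json

# Attributes named in the text; the API takes them as keys of requestedAttributes.
ATTRIBUTES = ["TOXICITY", "IDENTITY_ATTACK", "THREAT"]


def build_request(text, language, attributes=ATTRIBUTES):
    """Build the JSON body for one comment (hypothetical helper)."""
    return {
        "comment": {"text": text},
        "languages": [language],
        "requestedAttributes": {name: {} for name in attributes},
    }


def extract_scores(response):
    """Return the summary score for each attribute in a response dict."""
    return {
        name: result["summaryScore"]["value"]
        for name, result in response["attributeScores"].items()
    }


# Made-up response for illustration; real values come from the API.
example_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.82, "type": "PROBABILITY"}},
        "THREAT": {"summaryScore": {"value": 0.10, "type": "PROBABILITY"}},
    }
}

body = build_request("example tweet text", "en")
print(json.dumps(body, indent=2))
print(extract_scores(example_response))
```

Adding ‘SEXUALLY_EXPLICIT’ and ‘FLIRTATION’ for the tweets that mention women candidates would only require passing a longer `attributes` list to the same helper.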
The models mentioned above are specified below:
Each of these models scores individual comments on a continuous probability scale from 0 to 1. The models return “model attribute scores” for each tweet, where a higher value indicates a greater likelihood that a reader will perceive the tweet as having that attribute. For example, the TOXICITY score is the predicted probability that a tweet will be perceived as rude, disrespectful, or unreasonable. Using these scores, we can investigate which topics elicit higher TOXICITY, THREAT, and IDENTITY_ATTACK scores and from whom, whether men or women are more likely to engage in toxic language, which parties have the most candidates using toxic language, and more. Below I investigate each of the core issues outlined in the key issue sentiment section: the economy, immigration, the environment, Indigenous issues, and gender and feminism. Tweets have been grouped by topic according to the following custom topic dictionary (note that French and English tweets are both scored using PeRspective):