Distinct Platform Identity: A Cross-Platform Analysis of Linguistic Norms

Part I: Sentiment and emotional language on Tumblr

Author

Ellie Gómez Tovar

Do platforms have personalities?

Social media users often describe platforms as though they each have a distinct voice. Tumblr is seen as emotionally expressive and community-driven. Reddit is often described as discussion-heavy and argumentative. Threads tends to feel more public-facing and reactive. This project asks whether those impressions can be traced through language itself.

This first stage focuses on Tumblr as a case study. I collected a sample of fandom-related posts through the Tumblr API, cleaned it, and applied lexicon-based sentiment analysis to see which kinds of emotional language appear most often.

Research Question

What does sentiment analysis reveal about the emotional tone of Tumblr posts, and what might that suggest about Tumblr’s platform-specific voice?

Why Tumblr?

Tumblr is a useful place to begin because it has a strong reputation for affect-heavy, highly stylized posting. Users often write in ways that are emotionally intensified, playful, and communal. That makes Tumblr a strong test case for asking whether a platform’s “personality” can be observed through sentiment patterns.

Collecting the Data

The dataset for this stage was collected through Tumblr’s API by requesting posts tagged with fandom or niche-topic discussion tags. Because Tumblr’s tag endpoint returns only a limited number of posts per request, I paginated backward through multiple batches and then deduplicated the results. The final cleaned dataset contains 499 posts.
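The backward pagination and deduplication described above can be sketched roughly as follows. This is a minimal illustration, not the actual collection script: `collect_posts` and `fetch_page` are hypothetical names, and the sketch assumes that each post is a dictionary with an `id` and a `timestamp`, and that the tag endpoint can be asked for posts older than a given timestamp. The network call itself is left to whatever function is passed in as `fetch_page`.

```python
from typing import Callable, Dict, List, Optional


def collect_posts(
    fetch_page: Callable[[Optional[int]], List[Dict]],
    max_batches: int = 50,
) -> List[Dict]:
    """Page backward through tagged posts and drop duplicates by post id.

    fetch_page(before) is a hypothetical callable that returns one batch of
    posts older than the `before` timestamp (or the newest batch if None).
    """
    seen_ids = set()
    posts = []
    before = None  # None means "start from the newest posts"

    for _ in range(max_batches):
        batch = fetch_page(before)
        if not batch:
            break  # nothing older is available

        for post in batch:
            # Batches can overlap at their boundaries, so keep each id once.
            if post["id"] not in seen_ids:
                seen_ids.add(post["id"])
                posts.append(post)

        oldest = min(post["timestamp"] for post in batch)
        if before is not None and oldest >= before:
            break  # no progress backward in time; stop instead of looping
        before = oldest

    return posts
```

Passing the fetcher in as an argument keeps the paging logic separate from the HTTP details, which also makes it easy to check against a small fake data source before running it on the real API.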

Cleaning the Corpus

The raw Tumblr data required significant cleaning before analysis. Several fields contained nested structures, HTML-formatted text, and raw metadata that were not suitable for direct analysis. I reduced the dataset to the most relevant variables, cleaned the text fields, standardized the tags column, and created a more concise CSV for sentiment analysis.
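As a rough sketch of what this kind of cleaning can look like, the example below strips HTML markup from a post body and standardizes a list of tags using only Python's standard library. The function names `strip_html` and `normalize_tags` are hypothetical, and the specific choices (lowercasing tags, collapsing whitespace, dropping repeated tags) are assumptions for illustration rather than a record of the exact rules applied to this dataset.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(raw):
    """Return the visible text of an HTML-formatted post body."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()  # flush any text still buffered by the parser
    # Join with spaces so adjacent paragraphs do not run together,
    # then collapse the repeated whitespace that this introduces.
    text = " ".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()


def normalize_tags(tags):
    """Lowercase and trim tags, dropping empties and repeats (order kept)."""
    seen = set()
    cleaned = []
    for tag in tags:
        tag = re.sub(r"\s+", " ", tag).strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)
    return cleaned
```

Joining text fragments with a space is a trade-off: it keeps separate paragraphs apart, but it can split a word that has inline formatting in the middle of it.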

A Quick Look at the Sentiment Analysis

The most common emotion detected in these posts is “trust.”
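To make the idea of lexicon-based emotion counting concrete, here is a minimal sketch of the general technique: each word in a post is looked up in a word-to-emotion lexicon, and the matching emotion labels are tallied across the corpus. The `TOY_LEXICON` below is a made-up placeholder with a handful of entries, and `emotion_counts` is a hypothetical helper. Neither represents the lexicon or code behind the result above.

```python
import re
from collections import Counter

# Toy lexicon for illustration only. A real analysis would load a
# published emotion lexicon with thousands of entries.
TOY_LEXICON = {
    "love": {"joy", "trust"},
    "friend": {"trust"},
    "promise": {"trust"},
    "hate": {"anger"},
    "cry": {"sadness"},
}


def emotion_counts(posts, lexicon):
    """Count how often each emotion label is triggered across all posts."""
    counts = Counter()
    for text in posts:
        # Simple lowercase word tokenizer; apostrophes are kept inside words.
        for token in re.findall(r"[a-z']+", text.lower()):
            for emotion in lexicon.get(token, ()):
                counts[emotion] += 1
    return counts


sample_posts = ["I love my friend", "I promise I will not cry"]
counts = emotion_counts(sample_posts, TOY_LEXICON)
top_emotion, top_count = counts.most_common(1)[0]
```

Because a single word can carry more than one emotion label, the totals count label matches rather than words, so they can add up to more than the number of matched words.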