When a “breaking news” story seems consequential (which, though it’s not easy to judge, seems to be happening with increasing frequency), it’s interesting to see what priority and context popular news outlets across the political spectrum (CNN, NPR, Fox, MSNBC, Breitbart, Huffington Post) give to it.
Many of the most interesting questions might require a more subtle data analysis (and more data overall) than is available to me at the moment. Still, I’d like to use the skills we’ve learned in this class to look into the similarities, differences, and trends in what several popular sites cover over the next week or two. The plan is to pull the headline and story text of main, featured, and top articles at regular intervals (perhaps every several minutes) throughout the day and then look into what’s in the data.
Since I will only be collecting data for about two weeks before finishing the project, I’m guessing the initial analysis will have some inherent limitations. Also, I think if I search more deeply, I might find some tools online that cover a similar area. Still, I’d like to use this final project as a way to think about what kind of tool can be built on this data and to focus further learning on related skills over the next few months, perhaps by making something like a real-time web app that gives some feedback on this data. Just creating a tool that stores this data and lets you pull up the main and featured headlines on the given sites for a given timeframe seems like a fun resource to put together. If there happens to be a “big” news event at a certain time, the content of which is more favorable to one political persuasion or the other, do the news sites across the political spectrum give equal weight to it, or do some sites try to blunt the emphasis by prioritizing other, unrelated stories at that time?
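To make the storage-and-lookup idea a bit more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. Everything in it is an assumption for illustration: the table name, the column names, the “slot” labels (main, featured, top), and the helper functions open_store, record, and headlines_between are all hypothetical, and the scraping step that would actually feed it is left out.

```python
import sqlite3
from datetime import datetime


def open_store(path=":memory:"):
    """Open (or create) the snapshot database. The schema is a hypothetical one."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               site TEXT NOT NULL,
               captured_at TEXT NOT NULL,
               slot TEXT NOT NULL,
               url TEXT NOT NULL,
               headline TEXT NOT NULL
           )"""
    )
    return conn


def record(conn, site, captured_at, slot, url, headline):
    """Store one headline as it appeared on a site at one moment."""
    # Timestamps are stored as ISO 8601 strings, which sort chronologically
    # as plain text when they all use the same format.
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)",
        (site, captured_at.isoformat(timespec="seconds"), slot, url, headline),
    )
    conn.commit()


def headlines_between(conn, start, end, slot=None):
    """Return (site, captured_at, slot, headline) rows with start <= time < end."""
    query = (
        "SELECT site, captured_at, slot, headline FROM snapshots "
        "WHERE captured_at >= ? AND captured_at < ?"
    )
    params = [start.isoformat(timespec="seconds"), end.isoformat(timespec="seconds")]
    if slot is not None:
        # Optionally restrict to one position on the page, e.g. 'main'.
        query += " AND slot = ?"
        params.append(slot)
    query += " ORDER BY captured_at, site"
    return conn.execute(query, params).fetchall()
```

With something like this in place, comparing what each site led with during a big event would come down to calling headlines_between with the event’s time window and slot="main", then grouping the rows by site.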
Over the entire corpus, it seems like some sentiment analysis can be done broadly on what is covered. Based on one of the articles below, I thought one interesting analysis would be to see how the word associations differ on a given topic for each site.
Are there other features of the text on sites that would be interesting to compare (average story length, reading level, % change in story text over time)?
What variety of stories does each site cover? Do some seem to cover a broader range of topics generally?
What is the turnover rate of stories on each site? Is it similar between sites?
Are there general trends in the arc of stories presented through the day? Obviously, this and the previous item have a large dependency on the news cycles themselves, but perhaps there are other emotional arcs that would become apparent.
For a given story on a site, I believe the URL may stay consistent. If this is the case, it would be interesting to see how headlines for the same story change over time, and whether any trend can be discerned.
This is really more of a long-term project, but I hear “news cycle” mentioned a lot. Is this metric something that can be quantified? Is it possible to track news cycles on different sites?
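Two of the questions above, turnover rate and headline changes for the same story, can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: “turnover” is defined here (my choice, not an established metric) as the fraction of URLs in one snapshot that are gone by the next, and the headline tracking assumes that a story’s URL really does stay the same, which I haven’t verified. The function names turnover_rates and headline_history are hypothetical.

```python
from collections import defaultdict


def turnover_rates(snapshots):
    """Fraction of story URLs dropped between consecutive snapshots of one site.

    snapshots: list of (timestamp, set_of_urls) in chronological order.
    Returns a list of (timestamp, fraction_of_previous_urls_now_gone).
    """
    rates = []
    for (_, prev_urls), (ts, curr_urls) in zip(snapshots, snapshots[1:]):
        if not prev_urls:
            # Nothing was on the page last time, so turnover is undefined.
            continue
        dropped = prev_urls - curr_urls
        rates.append((ts, len(dropped) / len(prev_urls)))
    return rates


def headline_history(observations):
    """Distinct headlines seen for each URL, in the order they appeared.

    observations: iterable of (timestamp, url, headline) in chronological order.
    Assumes (unverified) that a story keeps the same URL when its headline changes.
    Returns {url: [(timestamp_first_seen, headline), ...]}.
    """
    history = defaultdict(list)
    for ts, url, headline in observations:
        versions = history[url]
        # Only record a new entry when the headline differs from the last one seen.
        if not versions or versions[-1][1] != headline:
            versions.append((ts, headline))
    return dict(history)
```

Comparing the turnover series between sites would be one rough way to start on the “news cycle” question, and any URL with more than one entry in headline_history is a story whose headline was rewritten.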
In addition to other tools online that might give me some direction on how to think about this data, I came across some interesting infographics in the following articles:
https://pudding.cool/2018/01/chyrons/
https://www.theatlantic.com/politics/archive/2016/09/debate-recaps-cable-news-clinton-trump-fox-msnbc-cnn/502223/
https://storify.com/jamesjalandoni/comparison-of-media-coverage-msnbc-vs-fox-news