Attention Plotter: A Tool for Exploring Media Ecosystems | Future of News and Participatory Media

Attention Plotter is a d3.js-based tool for graphing and comparing volumes of content from multiple media sources and word frequencies within that content over a range of dates. It’s now part of the Controversy Mapper project at the MIT Center for Civic Media. And it’s available on GitHub: https://github.com/erhardt/Attention-Plotter.

Live demo of Attention Plotter using Trayvon Martin Data and TF-IDF: http://erhardtgraeff.com/demo/aplotter/trayvon-tfidf.html

This work comes out of a research project pursued by Matt Stempeck, Ethan Zuckerman, and myself, studying the media coverage of the Trayvon Martin story from last spring. We collated a diverse set of media sources involved in the media ecosystem around Trayvon Martin: blog and mainstream media articles (via Media Cloud), newspaper front page percentages (via PageOneX), broadcast TV mentions (via Archive.org), Google searches for ‘Trayvon Martin’ and ‘George Zimmerman’ (via Google Trends), tweets, and Change.org petition signatures. We normalized the amount of content produced by each media source per day against that source’s own peak over the time period. This gives us a relative measure of the attention each source paid to the story over time.
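To make the normalization step concrete, here is a minimal sketch of scaling each source against its own peak. The function name, the data shape, and the numbers are all assumptions made up for illustration; this is not Attention Plotter's actual code or our real data.

```javascript
// Sketch only: the function name and the data shape are assumptions,
// not Attention Plotter's actual API.

// Scale one source's daily counts so that its busiest day equals 1.
function normalizeToPeak(dailyCounts) {
  const peak = Math.max(...dailyCounts);
  // A source with no activity at all stays flat at zero.
  if (peak <= 0) return dailyCounts.map(() => 0);
  return dailyCounts.map((count) => count / peak);
}

// Hypothetical counts for three consecutive days (not real data).
const rawCounts = {
  tweets: [1000, 50000, 25000],
  petitionSignatures: [10, 40, 20],
};

const normalized = {};
for (const [source, counts] of Object.entries(rawCounts)) {
  normalized[source] = normalizeToPeak(counts);
}

console.log(normalized.tweets);             // [ 0.02, 1, 0.5 ]
console.log(normalized.petitionSignatures); // [ 0.25, 1, 0.5 ]
```

Because every source is divided by its own maximum, a source counted in tens and a source counted in tens of thousands end up on the same 0 to 1 scale, which is what lets them share one chart.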

Trayvon Martin Attention Histogram (original, static graph)

Our original, static attention histogram is great for telling the story of the ebbs and flows of attention across the whole ecosystem. However, it’s not ideal for comparing two or three media sources’ attention volumes directly. The graph also falls short as a tool for exploring the data more deeply. To be specific, our original process for analyzing the framings of the Trayvon Martin story was to find the most highly cited (linked to) articles in the Media Cloud corpus and read them, looking for keywords. Two strong examples were ‘Marijuana’ and ‘Drug Dealer.’ These were extracted from a key Daily Mail article citing The Daily Caller and the Wagist blog, which went through Trayvon Martin’s social media profiles and dug up relevant details. This framing of Trayvon Martin as less than innocent was a strategy by the Right to battle against the gun control and racial profiling campaigns coming from the Left. Ideally, though, we should be able to identify keywords like ‘marijuana’ without reading a bunch of articles.

Enter Attention Plotter. The tool allows us to load in a dataset of normalized media volumes as well as a set of the most common words in each media source per day. The interactivity supports viewing the original clustered bar graph or sparklines, which are interpolated line graphs charting the rises and falls of those normalized volumes. By clicking the media source color squares in the legend, you can toggle their bars or lines on and off to compare two sources directly. For instance, leaving just the Google searches up, you can see how ‘George Zimmerman’ searches overtake ‘Trayvon Martin’ searches in April around the time of Zimmerman’s arrest.

'Trayvon Martin' / 'George Zimmerman' Google Search Comparison

By rolling over the dates on the x-axis, you can view popover word clouds in which each word is scaled to the magnitude of its significance. TF-IDF is used to calculate those magnitudes in order to adjust for the expected commonality of words like ‘trayvon’, ‘martin’, or ‘zimmerman’. Here, then, we see that on March 26th the fourth most prominent word is ‘marijuana’. This aligns with the publication of that Daily Mail article.

March 26th Word Cloud
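For readers unfamiliar with TF-IDF, the weighting behind the word clouds can be sketched roughly as follows. This is an illustration under stated assumptions, not the code we ran: it treats each day as one "document," uses the plain unsmoothed formula (a word's share of the day's words times the log of total days over days containing the word), and the function name, data shape, and counts are all hypothetical.

```javascript
// Sketch only: names, data shape, and the unsmoothed TF-IDF variant are
// assumptions, not Attention Plotter's actual implementation.

// wordCountsByDay: { dayLabel: { word: count, ... }, ... }
function tfidfByDay(wordCountsByDay) {
  const days = Object.keys(wordCountsByDay);

  // Document frequency: on how many days does each word appear?
  const daysContaining = new Map();
  for (const day of days) {
    for (const word of Object.keys(wordCountsByDay[day])) {
      daysContaining.set(word, (daysContaining.get(word) || 0) + 1);
    }
  }

  const scores = {};
  for (const day of days) {
    const counts = wordCountsByDay[day];
    const totalWords = Object.values(counts).reduce((sum, n) => sum + n, 0);
    scores[day] = {};
    for (const [word, count] of Object.entries(counts)) {
      const tf = count / totalWords; // share of that day's words
      const idf = Math.log(days.length / daysContaining.get(word)); // rarity across days
      scores[day][word] = tf * idf;
    }
  }
  return scores;
}

// Hypothetical counts for two days (not real data).
const exampleScores = tfidfByDay({
  day1: { trayvon: 8, hoodie: 2 },
  day2: { trayvon: 6, marijuana: 4 },
});

console.log(exampleScores.day1.trayvon);       // 0, because the word appears on every day
console.log(exampleScores.day2.marijuana > 0); // true, because the word is specific to day2
```

The point of the weighting is visible in the toy data: ‘trayvon’ is the most frequent word on both days, yet it scores zero because it shows up everywhere, while a word that appears on only one day rises to the top of that day's cloud.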

Is this a Tool for Journalists?
The goal for this project was and is to serve as an academic research tool allowing us and others to explore media controversies like Trayvon Martin. I can imagine two applications of this work for journalists, though. One would be for media columnists interested in writing about how a story unfolded over multiple media spaces, giving them an ecosystem view of the situation as well as some quantitative data to base their insights on. A second option might be integrating Attention Plotter into the newsroom as an analytical tool for tracking and assessing a news organization’s performance against other media sources, to aid in identifying gaps in coverage around such controversies.

Future Development
A few features will be added over the summer. I want to allow day-by-day timeline text to be imported so that key events can be included right on the visualization for reference. I also want to build the normalization of the media source data and the TF-IDF analysis into the JavaScript library itself, so that less pre-processing of the data is required, and to offer the ability to show data peaks in their raw numbers. Finally, we at the Center for Civic Media are planning to incorporate the visualization directly into the Media Cloud platform for other researchers to use as a native exploration tool on top of the corpus.