A Look at Occupy Boston’s Mailing Lists

Part of an ongoing project by the author to describe Occupy Boston’s mailing lists using network analysis.

Interactive Network Graph of Occupy Boston's Mailing Lists linked by their Shared Users

The Story
Can we learn something about a social movement by looking at the digital tools it uses to organize? The Occupy Movement was defined as much by its highly visible occupation tactic as by its use of new digital media to organize and mobilize. The movement's real success was injecting new language about inequality into our society: think of the 1% and the 99%. This was achieved through a sustained campaign of media activism. Language was developed to describe the inequalities between the common man and the rich, embodied by Wall Street (the perpetrators of the recent global financial crisis), and various forms of media were created to get the message out. The occupations then kept that message in the mainstream media, as they attracted sustained coverage themselves for both good and bad reasons.

We see this play out in the network of mailing lists. Occupy Boston’s general Media list had the most messages posted to it during the period September 2011 – October 2012, consistent with what we would expect from a movement focused on media activism. In terms of breadth of user participation, the Ideas mailing list takes the crown; this is where much of the early intellectual labor on defining Occupy Boston’s mission and direction was hashed out. In the data, we also see a lot of overlap among the mailing lists. All but one list (OB Updates, which was a unidirectional announcement list) share many active users with other lists. The median degree is near 20 out of a possible 21 (each of the 22 lists can connect to at most 21 others), which makes the network almost a full mesh. This suggests that these public mailing lists, although sometimes dedicated to very specific themes or “committees,” enjoyed a lot of interconnection. Between the major mailing lists (seen as an outer ring on the network), which are of more general interest, we see 100+ shared users on their mutual edges. This number drops off for some of the more niche mailing lists, where the shared users could be a few key organizers or overzealous mailing list participants. A more qualitative study is needed to tell the rest of this story.
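To make the "almost a full mesh" reading concrete, here is a minimal Python sketch of how degree and density can be computed from an undirected edge list. The function names and the four-node toy graph are hypothetical and are not the Occupy Boston data; for the real network of 22 lists, the maximum possible degree would be 21 and the number of possible edges would be 22 × 21 / 2 = 231.

```python
from statistics import median


def degrees(nodes, edges):
    """Map each node to its number of distinct neighbors (undirected)."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(found) for node, found in neighbors.items()}


def median_degree(nodes, edges):
    """Median of the per-node degrees."""
    return median(degrees(nodes, edges).values())


def density(nodes, edges):
    """Share of all possible undirected edges that are actually present."""
    n = len(nodes)
    possible = n * (n - 1) // 2
    present = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    return len(present) / possible if possible else 0.0


# Hypothetical toy graph: four lists, every pair connected except c-d.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]
```

A density of 1.0 would be a perfect mesh, so a median degree close to the maximum is one quick indicator of how interconnected the lists are.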

Quick Statistics

  • Mailing Lists: 22
  • Total Messages: 36,303
  • Total Users: 922 (unique email addresses)

Distribution of Total Users and Messages across Mailing Lists (left y-axis is Messages scale; right y-axis is Users scale)

How I Made the Network Graph
I downloaded the Mailman archives from September 2011 to October 2012 from Occupy Boston’s public mailing lists, i.e., those that do not require moderator access to join. I wrote a Python script to parse the archives, which are in the standard mbox format, into an SQLite database. I devised a schema with a standard set of ids for mailing lists and individual users, and used these ids to extract a network of users shared among different mailing lists with a simple SQL query, storing the resulting nodes (mailing lists) and edges (shared user relationships) in CSV files.
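The pipeline described above could look roughly like the following sketch, which uses only Python's standard `mailbox` and `sqlite3` modules. It is not the original script: the table names, column names, and helper functions are all hypothetical. It also assumes each message's From header holds an ordinary email address; public Mailman archives often obfuscate addresses (for example as "user at example.com"), which would need an extra normalization step.

```python
import mailbox
import sqlite3
from email.utils import parseaddr

# Hypothetical schema: ids for lists and users, plus one row per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS lists (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT UNIQUE);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    list_id INTEGER REFERENCES lists(id),
    user_id INTEGER REFERENCES users(id)
);
"""

# One row per (list pair, shared user); a.list_id < b.list_id keeps each
# undirected pair once and drops self-pairs.
SHARED_USERS_SQL = """
SELECT a.list_id AS source, b.list_id AS target, a.user_id
FROM (SELECT DISTINCT list_id, user_id FROM messages) AS a
JOIN (SELECT DISTINCT list_id, user_id FROM messages) AS b
  ON a.user_id = b.user_id AND a.list_id < b.list_id
ORDER BY source, target, a.user_id;
"""


def init_db(path=":memory:"):
    """Open a database and create the (hypothetical) tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def get_id(conn, table, column, value):
    """Return the id for value, inserting a new row if it is not there yet."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,))
    return row.fetchone()[0]


def load_mbox(conn, list_name, mbox_path):
    """Record the sender of every message in one mbox archive."""
    list_id = get_id(conn, "lists", "name", list_name)
    for msg in mailbox.mbox(mbox_path):
        # parseaddr splits "Name <addr>" into (name, addr).
        _, email = parseaddr(str(msg.get("From", "")))
        if not email:
            continue  # skip messages without a usable sender
        user_id = get_id(conn, "users", "email", email.lower())
        conn.execute(
            "INSERT INTO messages (list_id, user_id) VALUES (?, ?)",
            (list_id, user_id),
        )
    conn.commit()


def shared_user_edges(conn):
    """Return (source_list_id, target_list_id, user_id) rows."""
    return conn.execute(SHARED_USERS_SQL).fetchall()
```

Calling `load_mbox` once per downloaded archive and then `shared_user_edges` yields one edge row per shared user, which can be written straight to a CSV file.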

I imported the nodes and edges files into Gephi after hand-editing their column names to conform to Gephi’s standard. Gephi automatically aggregated the edges between nodes to create weighted edges representing the total number of shared users. I adjusted the layout in Gephi to represent the weighted edges using different thicknesses. The nodes were scaled by the total number of users active in each mailing list, an attribute extracted from my database, and their color was scaled on a pale-to-dark red spectrum according to the total number of messages during the period of analysis, also extracted from the database. I used the ForceAtlas 2 layout algorithm, which forces the most central nodes out of the center for easier comprehension, and then hit the graph a few times with the Expansion layout algorithm to give extra space between nodes.
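The aggregation Gephi performed here can also be done before import. The sketch below is a hypothetical alternative to hand-editing, not the workflow described above: it counts repeated list pairs to get a weight and writes an edge table using the `Source`, `Target`, `Type`, and `Weight` column names that Gephi's spreadsheet importer recognizes. The function names are made up for illustration.

```python
import csv
from collections import Counter


def aggregate_edges(pairs):
    """Count how often each unordered (source, target) pair occurs."""
    weights = Counter()
    for source, target in pairs:
        # Sort so that (1, 2) and (2, 1) count as the same undirected edge.
        weights[tuple(sorted((source, target)))] += 1
    return weights


def write_gephi_edges(weights, handle):
    """Write an edge table with Gephi-style column headers."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, "Undirected", weight])
```

Each weight is then the number of shared users between two lists, the same quantity Gephi derived by merging parallel edges.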

Using the Sigmajs Exporter plugin, I exported the network so that it could be viewed on the web as an interactive visualization. I customized the default JavaScript and CSS in several ways to display the network graph more clearly. In the config.json file, I manipulated the graph properties to create greater contrasts between node sizes and edge weights, and adjusted the label threshold under drawing properties to ensure all nodes were labeled. I modified the sigma.js defaults for edge color, forcing edges to be a standard grey rather than the color of their source node. This corrects for what is actually an undirected network (shared user relationships are mutual) being interpreted as directed. Finally, I forced the “Information Pane” to display the edge weights (number of shared users) between the active node and its neighbors, next to their listed names.

This entry was posted in All, Data Stories by Erhardt.

About Erhardt

Erhardt Graeff is a graduate student at the MIT Media Lab and MIT Center for Civic Media, studying information flows across mainstream and social media, and exploring technologies that help entrepreneurs from marginalized groups, especially youth, to be greater agents of change. Erhardt is also a founding trustee of The Awesome Foundation, which gives small grants to awesome projects, and a founding member of the Web Ecology Project, a network of social media and internet culture researchers. He holds an MPhil in Modern Society and Global Transformations from the University of Cambridge and B.S. degrees in Information Technology and International Studies from Rochester Institute of Technology.