We decode the social effects of media to put their power to good use.

Project Ratio

Evolving our news information ecosystems

Democracy depends on a well-informed citizenry that holds its elected representatives accountable for their actions. For this to happen, three conditions must be met: (1) Citizens must have access to, and consume, accurate, reliable information about the world; (2) Citizens must use this information to shape their understanding and opinions; and (3) Citizens must act on their understanding to influence policies and party platforms.

Numerous studies have raised troubling possibilities about our information pipeline—for example, that it is widely contaminated by fake or misleading news, that consumers are being algorithmically sorted, or are self-selecting, into partisan echo chambers, or that a large percentage of citizens are insufficiently engaged to hold any views at all. Generally, these studies have relied on small, one-off data collection efforts that fail to cohere into a complete picture. A major obstacle to improving the information ecosystem is the absence of a comprehensive, longitudinal picture of the production, consumption, and absorption of news.

Project Ratio seeks to improve the information ecosystems that underpin democracy by providing a first-of-its-kind, at-scale, real-time, cross-platform map of news content as it moves through the "information funnel": from news production, through distribution and discovery, to consumption and absorption.

Extending and deepening the work of mass-media tracking initiatives like Media Cloud, Google Trends, Unfiltered.news, the Internet Archive, GDELT, and others, Project Ratio will deploy state-of-the-art natural language processing and machine learning to automatically classify news articles and segments by topic and named entities (persons, places, and organizations), as well as by more elusive dimensions like prominence and bias. The project is starting by building a picture of the news information supply dating back to 2014 across the web (over 2,500 English-language sources scraped daily) and television (in four major urban markets). Supply data will then be related to news consumption using two overlapping panels of over 50,000 news consumers, for the web and television respectively.
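To illustrate the kind of classification involved, here is a deliberately toy sketch in pure Python: a capitalization heuristic standing in for named-entity recognition and a keyword lexicon standing in for topic classification. The lexicons, function names, and example article are all invented for illustration; the project's actual pipeline would use trained NLP models, not rules like these.

```python
import re

# Toy topic lexicon -- illustrative only, not the project's actual taxonomy.
TOPIC_KEYWORDS = {
    "politics": {"senate", "election", "congress", "policy", "vote"},
    "economy": {"inflation", "market", "jobs", "tariff", "trade"},
}

def guess_entities(text):
    """Very rough stand-in for NER: collect runs of capitalized words,
    skipping the first word of each sentence (capitalized by convention)."""
    entities = []
    for sentence in re.split(r"[.!?]\s+", text):
        words = sentence.split()
        i = 1  # skip the sentence-initial word
        while i < len(words):
            if words[i][:1].isupper():
                j = i
                while j < len(words) and words[j][:1].isupper():
                    j += 1
                entities.append(" ".join(w.strip(",.") for w in words[i:j]))
                i = j
            else:
                i += 1
    return entities

def guess_topics(text):
    """Score each topic by how many of its keywords appear in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {t: len(kw & tokens) for t, kw in TOPIC_KEYWORDS.items()}
    return [t for t, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s > 0]

article = ("The Senate passed a trade bill on Tuesday. "
           "Senator Maria Lopez said the vote would ease inflation pressure "
           "on the market.")
print(guess_entities(article))  # ['Senate', 'Tuesday', 'Maria Lopez']
print(guess_topics(article))    # ['economy', 'politics']
```

Even this toy version shows why the problem is hard: the heuristic misses "Senator" (sentence-initial) and mislabels "Tuesday" as an entity, which is exactly the kind of error trained sequence-labeling models are meant to avoid.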