A three-way collaboration among Alberto Cairo, IDSC Director of Visualization, Data Communication, and Information Design (who provided art direction and project management, among other contributions), El Universal (the main partner), and Google (which provided data, funding, and some machine learning expertise) has won one of the most prestigious awards in the field of data journalism: the Sigma Award, for the project “Zones of Silence.”
Formerly known as the Data Journalism Awards, the 2020 Sigma Awards competition named 10 winners and 2 honorable mentions out of 510 entries. The winners and honorable mentions were:
- WINNER: Best Data-Driven Reporting (large newsrooms) “The Troika Laundromat” by OCCRP and partners
- WINNER: Best Data-Driven Reporting (small newsrooms) “Made in France” by DISCLOSE
- WINNER: Best News Application “HOT DISINFO FROM RUSSIA (Topic Radar)” by TEXTY.org.ua
- WINNER: Best Visualization (large newsrooms) “See How the World’s Most Polluted Air Compares With Your City’s” by The New York Times
- WINNER: Best Visualization (small newsrooms) “Danish Scam” by Pointer (KRO-NCRV)
- CO-WINNER: Innovation (large newsrooms) “AP DataKit: An Adaptable Data Project Organization Toolkit” by The Associated Press
- CO-WINNER: Innovation (large newsrooms) “Zones of Silence” by El Universal
- WINNER: Innovation (small newsrooms) “Funes: An Algorithm to Fight Corruption” by OjoPublico
- WINNER: Open Data “TodosLosContratos.mx” by PODER
- WINNER: Young Journalist—Rachel Dottle, FiveThirtyEight.com, IBM Data and AI, freelance
- HONORABLE MENTION: Best Visualization (large newsrooms) “Why Your Smartphone is Causing You ‘Text Neck’ Syndrome” by South China Morning Post
- HONORABLE MENTION: Best Data-Driven Reporting (large newsrooms) “Copy, Paste, Legislate” by USA TODAY, The Center for Public Integrity, and The Arizona Republic
|Zonas de silencio||Zones of Silence|
|El silencio, comparación entre notas y homicidios Elsa Hernández. Para entender el comportamiento de los artículos periodísticos y los homicidios en el periodo 2005-2019 —y conocer de esa forma la tendencia de estas dos variables en México respecto a años anteriores—, se calculó una tasa de variación para cada una de ellas por entidad federativa y una a nivel nacional.||The Silence: a comparison between news stories and homicides, by Elsa Hernández. To understand the behavior of newspaper articles and homicides in the period 2005-2019, and in that way to learn the trend of these two variables in Mexico relative to previous years, a rate of change was calculated for each of them by state, along with one at the national level.|
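The rate of change mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes the rate is the ordinary relative change between consecutive years, which the excerpt does not spell out; the function names and the counts are made up for illustration and are not the project's data or code.

```python
from typing import Dict, Optional


def variation_rate(previous: float, current: float) -> Optional[float]:
    """Relative change from `previous` to `current`; None if `previous` is 0."""
    if previous == 0:
        return None  # the rate is undefined when the base period is zero
    return (current - previous) / previous


def yearly_variation(counts: Dict[int, float]) -> Dict[int, Optional[float]]:
    """Map each year (after the first) to its change versus the prior listed year."""
    years = sorted(counts)
    return {
        year: variation_rate(counts[prev], counts[year])
        for prev, year in zip(years, years[1:])
    }


# Made-up counts for one hypothetical state, for illustration only.
homicides = {2017: 100, 2018: 120, 2019: 90}
articles = {2017: 80, 2018: 60, 2019: 30}

homicide_rates = yearly_variation(homicides)
article_rates = yearly_variation(articles)
for year in sorted(homicide_rates):
    print(year, homicide_rates[year], article_rates[year])
```

Computing the same rate for both variables puts them on a common scale, so a year in which homicides rise while articles fall stands out even when the raw counts differ widely between states.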
Co-winner: Zones of Silence
Organization: El Universal
Organization size: Big
Country: Mexico
Team: Edson Arroyo, Alberto Cairo, Miguel Garnica, Elsa Hernandez, Jenny Lee, Gilberto Leon, Dale Markowitz, Esteban Román, César Saavedra
How do you measure something that isn’t happening? What if the main cause of concern isn’t noise but silence? El Universal asked that question about the falling levels of coverage of homicides in Mexico, working on the hypothesis that journalists have been intimidated and harassed into silence. By comparing murder statistics with news stories over time, they were able to show where, and by how much, the troubling silence was growing.
Violent organized crime is one of the biggest crises facing Mexico. Journalists want to avoid becoming a target, so they choose to stay quiet to save their lives. We set out to measure this silence and its impact on journalism. To do so, we used AI to quantify and visualize news coverage, and to analyze the gaps in coverage across the country. In order to measure the degree of silence in each region of Mexico, we created a formula that allows us to see the evolution of this phenomenon over time.
Impact: Something akin to a code of silence has emerged across the country. We suspected that there were entire regions where journalists were not reporting on the violence, threats, intimidation, and murder that were well known to be part of daily life. This was confirmed by journalists who sought us out after the story was released to tell us they have been facing this problem. In collaboration with them, we are now preparing a second part of this story that focuses on the patterns that lead to aggressions. We hope this will lead to some kind of alert that is triggered when certain conditions (of news coverage and crime) are present in a region of Mexico.
Our first step was to establish a process for detecting the absence of news. We explored articles on violence to understand how they compare with the government’s official registry of homicides. In theory, each murder ought to correspond to at least one local report about the event. If the two diverged, or if the government’s figures were suddenly very different from local news coverage, we could infer that journalists were being silenced.
Early on, sorting through news articles seemed impossible. We knew we needed to find the news archive with the largest possible number of Mexican publications so we could track daily coverage across the country. Google News’ vast collection of local and national news stories across Mexico was a good fit.
The effort required us to identify the difference between the number of homicides officially recorded and the number of news stories about those killings on Google News. This required machine learning algorithms able to identify the first reported story and then pinpoint where the event took place. With that information, we were able to connect the events reported by the media with the government’s homicide reports across more than 2,400 municipalities in Mexico.
Finally, to measure the degree of silence in each region of the country, we created a formula that allows us to see the evolution of this phenomenon over time. The resulting data shows a fascinating mix of falls and peaks in unreported deaths, which coincide with events such as the arrival of new governments or the deaths of drug dealers. Further investigation will allow us to explain these connections.
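The comparison between official homicide counts and reported events can be sketched as follows. The team's actual formula is not given here, so this is only an illustrative stand-in: it assumes a hypothetical "coverage gap" defined as the share of officially recorded homicides with no matching news report, computed per municipality and year. The function names, keys, and numbers are all invented for the example.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (municipality, year)


def coverage_gap(homicides: int, reported_events: int) -> float:
    """Share of official homicides without a matching report, from 0.0 to 1.0.

    Hypothetical measure for illustration; not the project's formula.
    """
    if homicides <= 0:
        return 0.0  # nothing to cover, so no gap by this definition
    covered = min(reported_events, homicides)  # cap coverage at 100%
    return 1.0 - covered / homicides


def gaps_by_region(official: Dict[Key, int],
                   reported: Dict[Key, int]) -> Dict[Key, float]:
    """Compute the gap for every (municipality, year) in the official data."""
    return {
        key: coverage_gap(count, reported.get(key, 0))
        for key, count in official.items()
    }


# Made-up numbers for two placeholder municipalities.
official = {("Municipality A", 2018): 10, ("Municipality A", 2019): 20,
            ("Municipality B", 2019): 5}
reported = {("Municipality A", 2018): 9, ("Municipality A", 2019): 4}

gaps = gaps_by_region(official, reported)
for (place, year), gap in sorted(gaps.items()):
    print(place, year, round(gap, 2))
```

A gap that grows from one year to the next in the same place, as in the placeholder "Municipality A" here, is the kind of signal the text describes. A real measure would also need to account for the confounding factors the team mentions, such as changes in internet penetration.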
The Hardest Part of The Project
The hardest part was creating the “formula for silence” to measure the degree of unreported homicides throughout the country. Many variables could explain why there aren’t as many articles as homicides in a given region. So, to be sure the discrepancy was linked to violence and killings, we had to rule out, or include, segments of data along the way. This was extremely hard to do with machine learning, because the Spanish words usually used in this kind of coverage also have other meanings. We had to manually validate many of the initial reports until we had a well-validated sample of results. This took us half a year. Then, we felt lost because of the number of variables we had in our hands (disparity between events reported and published stories; matching stories reporting one single event by different websites; the uncertainty of internet penetration in all parts of the country and its evolution over time within the 14 years we analyzed . . .). Luckily, the interdisciplinary nature of our team (with economists, programmers, data experts, designers, and journalists) helped us find an answer that we felt was truly accurate.
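One of the variables listed above, several websites reporting the same event, can be illustrated with a simplified sketch. It assumes each story has already been tagged with a municipality and an event date (the team used machine learning for that step, which is not reproduced here) and keeps only the earliest published story per event. The `Story` fields, the function name, and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Story:
    url: str
    municipality: str
    event_date: date   # when the reported killing took place
    published: date    # when the story appeared


def first_reports(stories: List[Story]) -> List[Story]:
    """Keep the earliest published story for each (municipality, event date)."""
    earliest: Dict[Tuple[str, date], Story] = {}
    for story in stories:
        key = (story.municipality, story.event_date)
        kept = earliest.get(key)
        if kept is None or story.published < kept.published:
            earliest[key] = story
    # Sort for stable output: by publication date, then URL.
    return sorted(earliest.values(), key=lambda s: (s.published, s.url))


# Invented stories: two sites cover the same event, a third covers another.
stories = [
    Story("https://example.com/b", "Municipality A", date(2019, 3, 1), date(2019, 3, 3)),
    Story("https://example.com/a", "Municipality A", date(2019, 3, 1), date(2019, 3, 2)),
    Story("https://example.com/c", "Municipality B", date(2019, 3, 5), date(2019, 3, 5)),
]

unique_events = first_reports(stories)
for story in unique_events:
    print(story.url, story.municipality, story.event_date)
```

Grouping on an exact (municipality, date) pair is deliberately naive: two killings in the same place on the same day would collapse into one event, which is part of why the team describes this matching as hard.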
What Others Can Learn From The Project
No matter how hard it is to measure a problem, there is always a way to do it, even if it’s not what you thought you would find in the beginning.