This is a release of the original dataset underlying all analyses in the paper “Knowing Where and How Criminal Organizations Operate Using Web Content”, written together with Viridiana Rios and published in 2012 at the CIKM conference.
The dataset consists of a single CSV file with 13 columns, defined as follows:
- Column #1 “Code”: The zipcode of the municipality in Mexico to which the row refers. The zipcode is compatible with INEGI codes. Note that the data type of this column was incorrectly set to numeric, so 4-digit values should be read as having a leading zero, e.g. “1001” is “01001”.
- Column #2 “State”: The INEGI code of the state in which the municipality is located.
- Column #3 “Year”: The year to which the row refers. Note that data prior to 2004 is still reported, but it is less reliable than data after 2004.
- Columns #4-13: One column per DTO (Drug Trafficking Organization); the column name identifies the organization. The first nine of these columns cover the 9 largest and most important organizations, and the last column groups all mentions of the other DTOs.
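Because the “Code” column lost its leading zeros, it is safest to read it as text and pad it back to five digits before joining with other INEGI data. The following is a minimal sketch using only the Python standard library. The helper names `normalize_code` and `read_rows` are hypothetical, and the sample rows and the `DTO_A` column name are made up for illustration; they are not taken from the released file.

```python
import csv
import io


def normalize_code(raw):
    """Restore the leading zero dropped when the code was stored as a number."""
    return raw.strip().zfill(5)


def read_rows(handle):
    """Yield each CSV row as a dict, with "Code" padded to five digits."""
    for row in csv.DictReader(handle):
        row["Code"] = normalize_code(row["Code"])
        yield row


# Made-up sample standing in for the real file (assumption: the header row
# uses the column names "Code", "State" and "Year" described above;
# "DTO_A" is a placeholder for one of the organization columns).
sample = io.StringIO(
    "Code,State,Year,DTO_A\n"
    "1001,1,2005,0\n"
    "20067,20,2005,1\n"
)

rows = list(read_rows(sample))
print([r["Code"] for r in rows])
```

To use it on the released file, pass an open file handle instead of `sample`, for example `open(path, newline="")` with `path` pointing at the CSV.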
If you find this data useful, please cite the paper originating this data as:
Coscia, Michele and Viridiana Rios (2012). Knowing Where and How Criminal Organizations Operate Using Web Content. CIKM, 12 (October – November).