This is the companion data of the paper “Average is Boring: How Similarity Kills a Meme’s Success”. Please cite the paper if you find the data useful!
The ZIP archive contains two text files.
The first file, “instanceid_generatoridfiltered_votes_text_timestep”, contains the actual data in 5 tab-separated columns. The columns contain:
- ID of the meme implementation: this is a progressive ID. You can use it to retrieve the corresponding meme implementation using the URL http://memegenerator.net/instance/<implementation_id>. For example, implementation “10057023” can be retrieved at the URL http://memegenerator.net/instance/10057023;
- ID of the meme that has been used for the implementation;
- Number of upvotes that the meme implementation had received at the time of crawling;
- Text of the implementation, i.e. the text that the user superimposed on the meme;
- The bimonthly timestep that has been derived from the meme implementation’s ID.
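
As an illustration of the layout above, here is a minimal Python sketch for reading the first file. It assumes the file has no header row, is UTF-8 encoded and uses no quoting; none of this is stated above, so adjust as needed. The names MemeInstance, parse_instances, load_instances and instance_url are hypothetical helpers, not part of the data set. IDs and the timestep are kept as strings because their exact encoding is not specified here.

```python
import csv
from typing import Iterable, Iterator, List, NamedTuple


class MemeInstance(NamedTuple):
    instance_id: str   # column 1: ID of the meme implementation
    generator_id: str  # column 2: ID of the meme used
    upvotes: int       # column 3: upvotes at crawling time
    text: str          # column 4: text superimposed by the user
    timestep: str      # column 5: bimonthly timestep (kept as a string)


def instance_url(instance_id: str) -> str:
    """Build the retrieval URL described in the column list."""
    return f"http://memegenerator.net/instance/{instance_id}"


def parse_instances(lines: Iterable[str]) -> Iterator[MemeInstance]:
    """Yield one MemeInstance per well-formed, 5-column line."""
    # QUOTE_NONE: treat quote characters in the meme text as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 5:
            continue  # assumption: silently skip malformed lines
        yield MemeInstance(row[0], row[1], int(row[2]), row[3], row[4])


def load_instances(path: str) -> List[MemeInstance]:
    """Read the whole file into memory (assumes UTF-8, no header row)."""
    with open(path, encoding="utf-8", newline="") as handle:
        return list(parse_instances(handle))
```

Calling load_instances("instanceid_generatoridfiltered_votes_text_timestep") would then return a list of records, and instance_url(record.instance_id) gives the page of each implementation.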
The second file “generators” contains additional metadata on the memes in 4 tab-separated columns:
- ID of the meme: this column is the primary key of the file and it matches column #2 of the previous file, like a foreign key in a relational database;
- URL slug of the meme;
- Meme name;
- URL of the meme template image.
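
The primary key / foreign key relation between the two files can be sketched as a simple in-memory join. This is only an illustration under the same assumptions as before (no header rows, no quoting); parse_generators and votes_per_meme are hypothetical names, and summing upvotes per meme name is just one example of what the join allows.

```python
import csv
from typing import Dict, Iterable


def parse_generators(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map meme ID -> metadata from the 4-column "generators" file."""
    generators = {}
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 4:
            continue  # assumption: skip malformed lines
        meme_id, slug, name, image_url = row
        generators[meme_id] = {"slug": slug, "name": name, "image_url": image_url}
    return generators


def votes_per_meme(instance_lines: Iterable[str],
                   generators: Dict[str, Dict[str, str]]) -> Dict[str, int]:
    """Sum upvotes per meme name, joining on column #2 of the data file."""
    totals: Dict[str, int] = {}
    for row in csv.reader(instance_lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 5:
            continue
        meta = generators.get(row[1])  # foreign key lookup
        if meta is None:
            continue  # implementation refers to a meme missing from "generators"
        totals[meta["name"]] = totals.get(meta["name"], 0) + int(row[2])
    return totals
```

Both functions accept any iterable of lines, so an open file handle for each of the two files can be passed in directly.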
Once again, happy meme hunting!