## Community Discovery Network

**Download the Algorithm Similarity Network**

This is the companion supplementary material for the paper “Discovering Communities of Community Discovery”, currently under review to the ASONAM conference. The package includes the data & code to replicate the main results of the paper.

If you’re only interested in the main Algorithm Similarity Network, you can find it in the asn.tsv file in the main folder of the ZIP archive. The network’s nodes are algorithm solving the community discovery problem. I ran more than a thousand benchmarks and I counted the number of times in which the two algorithms returned very similar results, which is the basis for the edges. This version of the network has been filtered using the noise corrected backboning technique. The file has four columns: the first two are the algorithms’ names, and the second two are: nij (the number of benchmark in which the two algorithms agree) and score (a measure of significance of the edge weight).

Otherwise, the replication material is divided into two folders.

01_network contains the full non-backboned network (asn_all_complete_top5count.tsv) and various view of it, plus Python scripts to generate figures 1 & 2 in the paper.

02_robustness contains several alternative ways to build ASN: by averaging similarity scores (asn_all_complete_nmiavg.tsv); imposing a hard threshold to count the similarities between two algorithms (asn_all_backbone_hardthresh.tsv); only considering LFR benchmarks (asn_lfr_complete_top5count.tsv); or only considering real world networks as benchmarks (asn_real_complete_top5count.tsv).

Note that the Python scripts require a set of standard Python scientific libraries (numpy, scipy, pandas, scikit-learn, networkx, …). The package also includes compiled binaries to calculate Infomap communities (code from Daniel Edler & Martin Rosvall) and the overlapping normalized mutual information (code from McDaid’s GitHub). I compiled them on Ubuntu 18.04, thus they might not work on other systems and you’re encouraged to re-compile them yourself.

Due to space constraints, I could not cite the authors of all 73 algorithms in my paper. Thus I report here a map on how to find out which implementations I used. The tag I use to identify the algorithm is consistent with the one I use in the figure. (Disclaimer: of course, some of these links are bound to break in the future. I accessed them last time around November 2018, so the Internet Archive could help)

edgebetween, fastgreedy, hrg, labelperc, leadeig, louvain, spinglass, walktrap: igraph implementation in R (https://igraph.org/r/)

mcl: https://www.micans.org/mcl/#source

tabu, extr: [5] and [6] in http://deim.urv.cat/~sergio.gomez/radatools.php

ganet, ganet+, moganet: http://staff.icar.cnr.it/pizzuti/codes.html

ganxis: https://sites.google.com/site/communitydetectionslpa/

conclude: http://www.emilio.ferrara.name/code/conclude/

conga, copra, cliquemod, peacock: http://gregory.org/research/networks/

mlrmcl: https://sites.google.com/site/stochasticflowclustering/

metis: http://glaros.dtc.umn.edu/gkhome/metis/hmetis/download

slpa, fluid, kerlin, kclique: networkx implementation in python (https://networkx.github.io/documentation/stable/)

pmm: http://leitang.net/heterogeneous_network.html

crossass: https://faculty.mccombs.utexas.edu/deepayan.chakrabarti/software.html

demon: http://www.michelecoscia.com/?page_id=42

bigclam, agm: part of the SNAP library (https://snap.stanford.edu/)

hlc: http://barabasilab.neu.edu/projects/linkcommunities/

tiles: https://github.com/GiulioRossetti/TILES

oslom: https://sites.google.com/site/andrealancichinetti/software

kmeans, dbscan, ward, agglomerative, spectral, meanshift, affinity, birch: http://scikit-learn.org/stable/modules/classes.html#module-sklearn.cluster

code-dense: https://link.springer.com/article/10.1007/s10618-014-0373-y

moses, collapsed-sbm: https://sites.google.com/site/aaronmcdaid/downloads

gce: https://sites.google.com/site/greedycliqueexpansion/

ilcd: http://cazabetremy.fr/rRessources/iLCD.html

svinet, mmsb: https://github.com/premgopalan/svinet

bnmtf: http://www.cse.ust.hk/~dyyeung/code/BNMTF.zip

rmcl: https://rdrr.io/github/DavidGilgien/ML.RMCL/man/ML_RMCL.html

OLC: http://www-personal.umich.edu/~mejn/OverlappingLinkCommunities.zip

cme-td, cme-bu: https://github.com/linhongseba/ContentMapEquation

edgeclust: http://homes.sice.indiana.edu/filiradi/Data/radetal_algorithm.tgz

infocentr: https://www.w3.org/People/Massimo/papers/2004/community_pre_04.pdf (my own implementation)

msg, vm: http://www.biochem-caflisch.uzh.ch/node/385

ocg: http://tagc.univ-mrs.fr/tagc/index.php/software/ocg

savi: http://dsec.pku.edu.cn/~tieli/

mixnet: http://www.math-evry.cnrs.fr/logiciels/mixnet/mixnet

vbmod: https://github.com/jhofman/vbmod_python

bridgebound, bagrowLocal, clausetLocal, lwplocal: https://github.com/kleinmind/bridge-bounding

infomap, infomap-overlap: http://www.mapequation.org/code.html#Download-and-compile

graclus: http://www.cs.utexas.edu/users/dml/Software/graclus.html

graclus2stage: https://www.researchgate.net/profile/Padraig_Cunningham/publication/222089361_Community_Finding_in_Large_Social_Networks_Through_Problem_Decomposition/links/09e4150fe534d66b1c000000/Community-Finding-in-Large-Social-Networks-Through-Problem-Decomposition.pdf (my own implementation)

fuzzyclust: https://github.com/ntamas/fuzzyclust

netcarto: http://seeslab.info/downloads/network-cartography-netcarto/

linecomms: https://sites.google.com/site/linegraphs/ (to generate the line graph + igraph’s implementation of Louvain)

The 819 real world networks we use as a benchmark come from the following papers/links:

E. Gabasova, “The Star Wars social network.” Evelina Gabasova’s Blog

T. Snijders, G. van de Bunt, and C. Steglich, “Introduction to stochastic actor-based models for network dynamics.” Social Networks 32(1), 44-60 (2010)

G. Van de Bunt, M. Van Duijn, and T. Snijders. “Friendship networks through time: An actor-oriented dynamic statistical network model.” Computational & Mathematical Organization Theory 5.2. 167-192. (1999)

C. J. Rhodes and P. Jones, “Inferring missing links in partially observed social networks.” Journal of the Operational Research Society 60(10), 1373-1383 (2009)

K. Descormiers, and C. Morselli, “Alliances, conflicts, and contradictions in Montreal’s street gang landscape.” International Criminal Justice Review 21(3), 297-314 (2011)

S. R. Sundaresan et al., “Network metrics reveal differences in social organization between two fission-fusion species, Grevy’s zebra and onager.” Oecologia 151(1), 140-149 (2007)

M. W. Schein and M. H. Fohrman. “Social dominance relationships in a herd of dairy cattle.” The British J. of Animal Behaviour 3(2), 45-55 (1955)

A. G. S. Framis, “Illegal networks or criminal organizations: Power, roles and facilitators in four cocaine trafficking structures.” In ‘Third Annual Illicit Networks Workshop,’ Montreal (2011)

A. Beveridge and J. Shan, “Network of Thrones.” Math Horizons 23(4), 18-22 (2016)

W. De Nooy, “A literary playground: Literary criticism and balance theory.” Poetics 26 (5-6), 385-404 (1999)

http://vlado.fmf.uni-lj.si/pub/networks/data/bio/foodweb/foodweb.htm

http://wwwlovre.appspot.com/support.jsp

D. F Lott, “Dominance relations and breeding rate in mature male American bison.” Zeitschrift für Tierpsychologie 49(4), 418-432 (1979)

J. Kaminski et al., “Moviegalaxies – Social Networks in Movies.” https://doi.org/10.7910/DVN/T4HBA3, Harvard Dataverse, V3 (2018)