Denmark + Italy

This zip file contains the scripts and the data to reproduce the main figures and tables of the paper “Comparing the Italian and Danish Music Industries with Network Analysis”.
Most of the figures can be rendered using Gnuplot with the provided scripts. Python automatically generates the LaTeX code for most tables. The scripts depend on the following Python libraries: numpy, scipy, pandas, torch, torch_geometric, scikit-learn, matplotlib, seaborn, networkx, umap, and adjustText. You’re also going to need the non-US version of the relaimpo R package.
The archive already contains the correct folder structure. All you need to do is navigate to the 01_scripts folder and run the scripts in their numbered order. Beware that:
- Some scripts depend on the output of preceding scripts. Running the scripts out of order will likely result in crashes.
- Folder 02_outputs needs to be there or the scripts will crash. It contains the structure needed to produce the proper outputs.
- We do not include some Cytoscape files for the visualizations of the Italian network as they can be found in the supplementary material of the original paper (cited in our paper).
- Script 09a_figure9.py automatically generates two 09b Gnuplot scripts, which you can then use to produce the two parts of Figure 9.
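
The run order described above can be automated. The following is a minimal sketch, not part of the archive: the function names, the extension-to-interpreter mapping, and the choice to handle only .py and .r files are all assumptions. It relies on every script name starting with a numeric prefix, so that sorting by file name matches the intended order.

```python
import subprocess
from pathlib import Path

# Assumed mapping from file extension to interpreter; adjust to your setup.
RUNNERS = {".py": ["python"], ".r": ["Rscript"]}


def ordered_scripts(folder):
    """Return the numbered Python and R scripts in `folder`, sorted by name."""
    scripts = [
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() in RUNNERS
        and p.name[0].isdigit()  # skip helpers without a numeric prefix
    ]
    return sorted(scripts, key=lambda p: p.name)


def run_in_order(folder):
    """Run each script in turn, stopping at the first failure."""
    for script in ordered_scripts(folder):
        command = RUNNERS[script.suffix.lower()] + [script.name]
        print("Running", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later scripts never start with missing inputs.
        subprocess.run(command, cwd=folder, check=True)
```

Calling run_in_order("01_scripts") from the root of the archive would then run the scripts one after another. Any Gnuplot scripts, including the generated 09b ones, still have to be run separately.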

Italy

This is the data and code necessary to reproduce the results in the paper “Node Attribute Analysis for Cultural Data Analytics: a Case Study on Italian XX-XXI Century Music.”
Running the code requires Python and R. To also generate the figures, you can use the provided scripts with Gnuplot.
To run the code, you’re going to need the following Python libraries: numpy, scipy, pandas, networkx, torch, torch_geometric, scikit-learn, pandarallel, seaborn, and matplotlib.
You’re also going to need the following libraries for R: relaimpo and stargazer.
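
Before running anything, it can help to check that the Python libraries are importable. The sketch below is not part of the archive and the function name is hypothetical. Note that scikit-learn is imported under the module name sklearn; the other names in the list are assumed to match their import names.

```python
import importlib.util

# Import names for the libraries listed above (scikit-learn imports as sklearn).
REQUIRED = [
    "numpy", "scipy", "pandas", "networkx", "torch", "torch_geometric",
    "sklearn", "pandarallel", "seaborn", "matplotlib",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Example: print(missing_modules(REQUIRED)) lists whatever still needs installing.
```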
The archive contains four folders.
- “data”. This folder contains the raw data. The dataset is stored in five files. They are all tab-delimited.
  - “genres.tsv”. The first column is a band id. The following columns contain the count of the number of records the given band has released for the given genre.
  - “ids.tsv”. A two-column file connecting a band id with its name.
  - “network.tsv”. The temporal bipartite network. Three columns. Each row is an artist (first column) playing for a band (second column) in a given year (third column).
  - “regions.tsv”. The first column is a band id. The following columns contain a binary value, equal to one if the band originated from a given region.
  - “years.tsv”. The first column is a band id. The following columns contain a binary value, equal to one if the band released a record in the given year.
- “figures”. This folder contains three gnuplot scripts to generate some of the figures of the paper. You can call them with the command “gnuplot scriptname.gp”, provided you have run the Python scripts that generate their inputs beforehand.
- “outputs”. The outputs of the Python scripts will be put here. The folder is pre-populated with the Cytoscape session files generating the network visualizations of the paper, along with the unipartite projections of the bipartite network.
- “scripts”. This folder contains the code to reproduce the results. It has a “lib” subfolder with custom Python libraries necessary to run the code. All scripts can be run by calling “python scriptname.py” or “Rscript scriptname.r” and none of them requires any parameter setting. Be sure to run them in order and before calling the scripts in the “figures” folder.
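
To illustrate the three-column layout of “network.tsv” described above, here is a minimal sketch that groups artists by band and year using only the standard library. It is not taken from the provided scripts. The function name is hypothetical, and whether the file starts with a header row, as well as its text encoding, are assumptions you should check against the actual file.

```python
import csv
from collections import defaultdict


def read_network(lines, has_header=False):
    """Group artists by band and year from tab-delimited rows.

    `lines` is any iterable of text lines, such as an open file object.
    Set `has_header=True` if the first row holds column names (assumption).
    """
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)
    members = defaultdict(lambda: defaultdict(set))
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        artist, band, year = row[0], row[1], int(row[2])
        members[band][year].add(artist)
    return members

# Possible usage from the root of the archive (UTF-8 is an assumption):
# with open("data/network.tsv", newline="", encoding="utf-8") as handle:
#     members = read_network(handle)
```

The result maps each band id to a dictionary from year to the set of artists playing in that band that year, which is one convenient way to slice the temporal bipartite network.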
