Project Ideas

Below are some ideas for Master’s theses, bachelor projects and other projects for students at the IT University of Copenhagen. Contact me (mcos@itu.dk) to get started.

  • Cultural Data Analysis and Networks: build networks about the production of culture. Examples are artist-band connections using Discogs data, movie networks from IMDb, citation networks from paintings or books, character networks from books and/or graphic novels, notable people networks from Wikipedia, … The idea is to have networks with rich node metadata to study some aspect of cultural production. Examples are new genre classifications, temporal analysis of eras of cultural production, geographical analysis, and more.
  • Estimation of Multipolar Polarization: I have developed a measure to estimate how polarized the political discourse is on Facebook and Twitter. However, the measure can only be calculated for two opposing opinions. How can we deal with scenarios where users have multiple opinions, such as supporting multiple different political parties? This project can benefit from NLP techniques to generate word embeddings (GPT, W2V, …) and from machine learning techniques (PCA, NNMF, t-SNE, …), so it is not only for network enthusiasts!
  • Build Urban Networks from OpenStreetMap Data: by using rich data about points of interest, we can characterize cities by how diverse their amenities are, potentially leading to innovative livability scores.
  • Human Mobility and Development: Using a new dataset estimating the development level of places in developing countries, we can estimate the link between wealth and the place where one lives. Then we can investigate the effect of lockdowns on the livelihood of people: if human mobility gets much harder, are all people equally affected or is there a disproportionate negative impact for people living in low-income communities?
  • Network Analysis Library using Torch: PyTorch Geometric is a nice library implementing many graph learning functions for GPU processing. There are some GPU bindings for NetworkX, but it would be great to develop from the ground up a more complete library with GPU computing at its core. You can help me build it by implementing and testing a handful of network analysis functions.
  • Network Generating Model: I have an algorithm that, given a network, outputs the connection rules underlying it by exploring frequent patterns. If we are given a set of rules, can we generate a new network from scratch that looks like the one we obtained the rules from?
  • Some other random keywords (ask me about them!): economic complexity of the Roman Empire, IceCube Neutrino Observatory data, network entropy, node vector clustering, phoneme networks, utilitarian “would you rather?” game, US health insurance data, flywire.ai …
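
To give a flavor of the urban networks idea above, here is a minimal sketch of one possible way to score how diverse the amenities of a place are. It assumes the diversity is measured as the Shannon entropy of amenity categories, which is only one candidate among many and not a method fixed by the project. The function name amenity_diversity and the toy category lists are hypothetical and do not come from any real dataset.

```python
import math
from collections import Counter


def amenity_diversity(categories):
    """Shannon entropy (in bits) of a list of amenity category labels.

    Assumption: entropy is used as the diversity measure. Higher values
    mean the amenities are spread more evenly over more categories.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no amenities, no diversity
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical points of interest for two neighbourhoods (toy data).
mixed = ["cafe", "school", "pharmacy", "park"]
only_cafes = ["cafe", "cafe", "cafe", "cafe"]

mixed_score = amenity_diversity(mixed)            # four equally common categories
only_cafes_score = amenity_diversity(only_cafes)  # a single category
```

In this toy case the mixed neighbourhood scores higher than the one with only cafes. A real project would read the categories from the points-of-interest data and decide how to aggregate them spatially.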