07 July 2016

Building Data-Driven Development

A few weeks ago I had the honor to speak at my group’s “Global Empowerment Meeting” about my research on data science and economic development. I’m linking here the YouTube video of my talk and my transcript for those who want to follow it. The transcript is not 100% accurate given some last-minute edits, and the fact that I’m a horrible presenter :-), but it should be good enough. Enjoy!


We think that the big question of this decade is about data. Data are the building blocks of our modern society. We think that in development we are not currently using enough of these blocks: we are not exploiting data nearly as much as we should. And we want to fix that.

Many of the fastest growing companies in the world, and definitely the ones that are shaping the progress of humanity, are data-intensive companies. Here at CID we just want to add the entire world to the party.

So how do we do it? To fix the data problem in the development literature, we focus on understanding what the global knowledge building looks like. And we inspect three floors: how does knowledge flow between countries? What lessons can we learn inside these countries? What are the policy implications?

To answer these questions, we were helped by two big data players. The quantity and quality of the data they collect represent a revolution in the economic development literature. You heard them speaking at the event: they are MasterCard – through their Center for Inclusive Growth – and Telefonica.

Let’s start with MasterCard: they help us with the first question, how does knowledge flow between countries? Credit card data answer that. Some of you might have a corporate-issued credit card in your wallet right now. And you are here, offering your knowledge and assimilating the knowledge offered by the people sitting at the table with you. The movements of these cards are movements of brains, ideas and knowledge.

When you aggregate this at the global level you can draw the map of international knowledge exchange. When you have a map, you have a way to know where you are and where you want to be. The map doesn’t tell you why you are where you are. That’s why CID builds something better than a map.

We are developing a method to tell why people are traveling. And the reasons differ from country to country: equity in foreign establishments for a country like the UK, trade partnerships for Saudi Arabia, foreign greenfield investments for Taiwan.

Using this map, it is easy to envision where you want to go. You can find countries who have a profile similar to yours and copy their best practices. For Kenya, Taiwan seems to be the best choice. You can see that, if investments drive more knowledge into a country, then you should attract investments. And we have preliminary results to suggest whom to attract: the people carrying the knowledge you can use.

The Product Space helps here. If you want to attract knowledge, you want to attract the one you can more easily use. The one connected to what you already know. Nobody likes to build cathedrals in a desert. More than having a cool knowledge building, you want your knowledge to be useful. And used.

There are other things you can do with international traveler flows. Like tourism. Tourism is a great export: for many countries it is the first export. See these big portions of the exports of Zimbabwe or Spain? For them tourism would look like this.

Tourism is hard to pin down. But it is easier with our data partners. We can know when, where and which foreigners spend their money in a country. You cannot paint pictures as accurate as these without the unique dataset MasterCard has.

Let’s go to our second question: what lessons can we learn from knowledge flows inside a country? Telefonica data is helping us answer this question. Here we focus on a test country: Colombia. We use anonymized call metadata to paint the knowledge map of Colombia, and we discover that the country has its own knowledge departments. You can see them here, where each square is a municipality, connecting to the ones it talks to. These departments correlate only slightly with the actual political boundaries. But they matter so much more.

In fact, we asked if these boundaries could explain the growth in wages inside the country. And they seem to be able to do it, in surprisingly different ways. If you are a poor municipality in a rich state in Colombia, we see your wage growth penalized. You are on a path of divergence.

However, if you are a poor municipality and you talk to rich ones, we have evidence to show that you are on a path of convergence: you grow faster than you would be expected to. Our preliminary results seem to suggest that being in a rich knowledge state matters.

So, how do you use this data and knowledge? To do so you have to drill down at the city level. We look not only at communication links, but also at mobility ones. We ask if a city like Bogota is really a city, or different cities in the same metropolitan area. With the data you can draw four different “mobility districts”, with a lot of movements inside them, and not so many across them.

The mobility districts matter, because combining mobility and economic activities we can map the potential of a neighborhood, answering the question: if I live here, how productive can I be? A lot in the green areas, not so much in the red ones.

With this data you can reshape urban mobility. You know where the entrance barriers to productivity are, and you can destroy them. You remodel your city to include in its productive structure people who are currently isolated by commuting time and cost. These people have valuable skills and knowhow, but they are relegated to the informal sector.

So, MasterCard data told us how knowledge flows between countries. Telefonica data showed the lessons we can learn inside a country. We are left with the last question: what are the policy implications?

So far we have mapped the landscape of knowledge, at different levels. But to hike through it you need a lot of equipment. And governments provide part of that equipment. Some in better ways than others.

To discover the policy implications, we unleashed a data collector program on the Web. We wanted to know what the structure of the government in the US looks like. Our program returned a picture of the hierarchical organization of government functions. We know how each state structures its own version of this hierarchy. And we know how all those connections fit together in the union, state by state. We are discovering that the way a state government is shaped seems to be the result of two main ingredients: where a state is and what its productive structure looks like.

We want to establish that the way a state expresses its government on the Web reflects the way it actually performs its functions. We seem to find a positive answer: for instance, having your environmental agencies talk with each other seems to work well to improve your environmental indicators, as recorded by the EPA. Wiring organizations together when we see positive feedback, and rethinking them when we see negative feedback, is a direct consequence of this Web investigation.

I hope I was able to communicate to you the enthusiasm CID discovered in the usage of big data. Zooming out to gaze at the big picture, we start to realize what the knowledge building looks like. As the building grows, so does our understanding of the world, development and growth. And here’s the punchline of CID: the building of knowledge grows with data, but the shape it takes is up to what we make of this data. We chose to shape this building with larger doors, so that it can be used to ensure a more inclusive world.


By the way, the other presentations of my session were great, and we had a nice panel after that. You can check out the presentations on the official Center for International Development YouTube channel. I’m embedding the panel’s video below:


09 June 2016

Netsci 2016 Report


Another NetSci edition went by, as interconnected as ever. This year we got to enjoy Northeast Asia, a new scenario for us network scientists, and an appropriate one: many new faces popped up both among speakers and attendees. Seoul was definitely what NetSci needed at this time. I want to spend just a few words about what impressed me the most during this trip — well, second most after what Koreans did with their pizzas: that is unbeatable. Let’s go chronologically, starting with the satellites.

You all know I was co-organizing the one on Networks of networks (you didn’t? Then scroll down a bit and get informed!). I am pleased with how things went: the talks we gathered this year were most excellent. Space constraints don’t allow me to give everyone the attention they deserve, but I want to mention two. First is Yong-Yeol Ahn, who was the star of this year. He gave four talks at the conference (provided I haven’t miscounted) and his plenary one on the analysis of the LinkedIn graph was just breathtaking. At Netonets, he talked about the internal belief network each one of us carries in her own brain, and its relationship with how macro societal behaviors arise in social networks. An original take on networks of networks, and one that spurred the idea: how much are the inner workings of one’s belief network affected by the metabolic and the bio-connectome networks of one’s own body? Should we study networks of networks of networks? Second, Nitesh Chawla showed us how higher-order networks unveil real relationships among nodes. The same node can behave as if it were many different ones, depending on which of its connections we are considering.


Besides the most awesome networks of networks satellite, other ones caught my attention. Again, space is my tyrant here, so I get to award just one slot, and I would like to give it to Hyejin Youn. Her satellite was on the evolution of technological networks. She does amazing things tracking how the patent network evolved from the depths of 1800 until now. The idea is to find viable innovation paths, and to predict which fields will have the largest impact in the future.

When it comes to the plenary sessions, I think Yang-Yu Liu stole the spotlight with a flashy presentation about the microcosmos everybody carries in their guts. The analysis of the human microbiome is a very hot topic right now, and it pleases me to know that there is somebody working on a network perspective of it. Besides scientific merits, whoever extensively quotes Minute Earth videos (bonus points for it being the one about poop transplants) has my eternal admiration. I also want to highlight Ginestra Bianconi’s talk. She has an extraordinary talent for bringing to network science the most cutting-edge aspects of physics. Her line of research combining quantum gravity and network geometry is a dream come true for a physics nerd like myself. I always wished to see advanced physics concepts translated into network terms, but I never had the capacity to do so: now I just have to sit back and wait for Ginestra’s next paper.


What about contributed talks? The race for the second best is very tight. The very best was clearly mine on the link between mobility and communication patterns, about which I showed a scaling relationship connecting them (paper, post). I will be magnanimous and spare you all the praises I could sing of it. Enough joking around, let’s move on. Juyong Park gave two fantastic talks on networks and music. This was a nice breath of fresh air for digital humanities: this NetSci edition was missing the great satellite chaired by Max Schich. Juyong showed how to navigate through collaboration networks on classical music CDs, and through judge biases in music competitions. By the way, Max dominated, as expected, the lightning talk session, showing some new products coming from his digital humanities landmark published last year in Science. Tomomi Kito was also great: she borrowed the tools of economic complexity and shifted her focus from the macro analysis of countries to the micro analysis of networks of multinational corporations. A final mention goes to Roberta Sinatra. Her talk was about her struggle to make PhD committees recognize that what she is doing is actually physics. It resonates with my personal experience, trying to convince hiring committees that what I’m doing is actually computer science. Maybe we should all give up the struggle and just create a network science department.

And so we get to the last treat of the conference: the Erdos-Renyi prize, awarded to the most excellent network researcher under the age of 40. This year it went to Aaron Clauset, and this pleases me for several reasons. First, because Aaron is awesome, and he deserves it. Second, because he is the first computer scientist to be awarded the prize, and this just gives me hope that our work too is getting recognized by the network gurus. His talk was fantastic on two counts.


For starters, he presented his brand new Index of Complex Networks. The interface is pretty clunky, especially on my Ubuntu Firefox, but that does not hinder the usefulness of such an instrument. With his collaborators, Aaron collected the most important papers in the network literature, trying to find a link to a publicly available network. If they were successful, that link went in the index, along with some metadata about the network. This is going to be a prime resource for network scientists, both for starting new projects and for the sorely needed task of replicating previous results.

Replication is the core of the second reason I loved Aaron’s talk. Once he collected all these networks, for fun he took a jab at some of the dogmas of network science. The main one everybody knows is: “Power-laws are everywhere”. You can see where this is going: the impertinent Colorado University boy showed that yes, power-laws are very common… among the 5-10% of networks in which it is possible to find them. Not so much “everywhere” any more, huh? This was especially irreverent given that not so long before Stefan Thurner gave a very nice plenary talk featuring a carousel of power laws. I’m not picking sides in the debate: I feel hardly qualified to do so. I just think that questioning dearly held results is always a good thing, to avoid fooling ourselves into believing we’ve reached an objective truth.


Among the non-scientific merits of the conference, I talked with Vinko Zlatic about the Croatian government on the brink of collapse, spread the word about the search for a new network scientist by the Center for International Development, and discovered that Korean pizzas are topped with almonds (you didn’t really think I was going to let slip that pizza reference at the beginning of the post, did you?). And now I made myself sad: I wish there were another NetSci right away, to shove my brain down into another blender of awesomeness. Oh well, there are going to be plenty of occasions to do so. See you maybe in Dubrovnik, Tel-Aviv or Indianapolis?


20 May 2016

Program of Netonets 2016 is Out!

As announced in the previous post, the symposium on networks of networks is happening in less than two weeks: May 31st @ 9AM, room Dongkang C of the K-Hotel Seoul, South Korea. Przemek Kazienko, Gregorio D’Agostino and I have a fantastic program and set of speakers to keep you entertained on multilayer, interdependent and multislice networks. Take a look for yourself!

Session I

9:00 – 9:15: Room set up
9:15 – 9:30: Welcome from the organizers
9:30 – 10:15: Invited I: Yong-Yeol Ahn: Dynamics of social network of belief networks
10:15 – 11:00: Invited II: Luca Maria Aiello: The Nature of Social Links

11:00 – 11:30: Coffee Break

Session II

11:30 – 12:15: Invited III: Jianxi Gao: Networks of Networks: From Structure to Dynamics
12:15 – 13:00: Invited IV: Tomasz Kajdanowicz: Fusion methods for classification in multiplex networks

13:00 – 14:30: Lunch Break

Session III

14:30 – 15:15: Invited V: Michael Danziger: Beyond interdependent networks
15:15 – 15:35: Contributed I: Bruno Coutinho: Greedy Leaf Removal on Hypergraphs
15:35 – 15:55: Contributed II: Yong Zhuang: Complex Contagions in Clustered Random Multiplex Networks

15:55 – 16:30: Coffee Break

Session IV

16:30 – 17:15: Invited VI: Nitesh Chawla: From complex interactions to networks: the higher-order network representation

17:15 – 18:00: Round table – Open discussion
18:00 – 18:15: Organizers wrap up

Remember to register for the main NetSci conference if you want to attend.

Incidentally, the end of May is going to be a rather busy period for me. Besides co-organizing Netonets and speaking at the main NetSci conference, I’m also going to present at the Core50 conference in Louvain-la-Neuve, Belgium, on the role of social and mobility networks in shaping the economic growth of a country. Thanks to Jean-Charles Delvenne for inviting me!

I hope to see many of you there!


17 March 2016

Networks of Networks @ NetSci 2016

EDIT: Deadlines & speakers updated. Submission deadline is on April 27th, notification on April 29th.

 

Dear readers of this blog (yes, both of you): it’s that time of the year again. As tradition dictates, I’m organizing the Networks of Networks symposium, satellite event of the NetSci conference.

Networks of networks are structures in which the nodes may be connected through different relations. They can represent multifaceted social interaction, critical infrastructure and complex relational data structures. In the symposium, we are looking for a diversity of research contributions revolving around networks of networks of any kind: in social media, in infrastructure, in culture. The call for contributed talks is OPEN, and you can submit your abstract here: https://easychair.org/conferences/?conf=non2016

The deadline for submissions is April 27th, 2016 (extended from April 15th, 2016). We will notify acceptance by April 29th, 2016 (previously April 22nd, 2016).

Here’s my handy guide to a few of the many reasons to come:

  • Networks of networks are awesome, a hot topic in network science and a lot of super smart people work on them. You wouldn’t pass the opportunity to mingle with them, would you?
  • We have a lineup of outstanding confirmed keynotes this year (truth be told, we have that every year).
  • This year NetSci will take place at the K-Hotel, Seoul, Korea (South, whew…). You really should not miss this occasion to visit such a fascinating place.

The Networks of Networks symposium will be held on May 31st, 2016. The full conference, including all satellites, runs from May 30th to June 3rd. You can find all relevant information for the conference on the official NetSci website. Our symposium has a website too: check it out. In it, you will also find the fundamental information about all the people organizing this event with me: without them none of this would be possible. Here they are:

And also a list of other people, helping with their ideas, time and enthusiasm:

  • Matteo Magnani
  • Ian Dobson
  • Luca Rossi
  • Leonardo Duenas-Osorio
  • Dino Pedreschi
  • Guido Caldarelli
  • Vito Latora

Hope to see many of you in Korea!


10 July 2015

Collective Intelligence 2015 Report

As I wrote previously, this year I missed NetSci, the yearly appointment for everybody who is interested in network analysis. The reason is that I was invited to give a talk at the Collective Intelligence conference, which happened almost at the same time. And once I got an invitation from Lada Adamic, I knew I couldn’t say no to her. Look at the things she did and is doing: she is a superstar scientist! So I packed my bags and went to the West Coast.

The first day was immediately a blast. Jeff Howe chaired the first session with some great insights about crowdsourcing. As you know, crowdsourcing is a super hip thing nowadays. It goes like this: individually, each one of us is pretty terrible at solving a hard problem. But if we put together enough terrible people, their individual errors cancel out on average and we get an almost perfect performance. The term crowdsourcing itself was basically invented by Jeff (and Mark Robinson) when he was writing for Wired. The speakers in Jeff’s session showed us some cool examples of crowdsourcing research. The one that stuck with me the most was from Ágnes Horvát: she and her co-authors were able to analyze the internal communications of a hedge fund about investments and use the features of this communication (frequency of messages, mood, etc.) to predict how the investments would perform. And they got it right much more often than the strategists at the hedge fund itself.
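To make the "errors cancel out" intuition concrete, here is a minimal sketch. It assumes something the talks did not necessarily assume: every guesser is unbiased and their errors are independent Gaussian noise. The function names and the numbers are made up for illustration.

```python
import random
import statistics


def crowd_error(true_value, crowd_size, noise_sd, rng):
    """Absolute error of the average guess of one simulated crowd."""
    # Assumption: each guess is the truth plus independent Gaussian noise.
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(crowd_size)]
    return abs(statistics.fmean(guesses) - true_value)


def mean_crowd_error(true_value, crowd_size, noise_sd, trials=500, seed=0):
    """Average the crowd error over many independent simulated crowds."""
    rng = random.Random(seed)
    errors = [crowd_error(true_value, crowd_size, noise_sd, rng)
              for _ in range(trials)]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Each individual is off by about 20 units on a quantity worth 100.
    for size in (1, 10, 100, 1000):
        print(size, round(mean_crowd_error(100.0, size, 20.0), 2))
```

Under these assumptions the error of the average shrinks roughly with the square root of the crowd size. If the guessers share a common bias, or copy each other, the cancellation no longer works, which is one reason real crowdsourcing is harder than this toy.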


The second day started with the session with my talk in it. A talk about memes of course! The people I got lined up with were spectacular. Jacob Foster talked about the collective intelligence of science. How do scientists make sense of the incredible amount of research out there? And how is it possible to advance knowledge in such hard times, when there are tens of new studies published every day? Dean Eckles gave an insightful talk about how Facebook users react when their stories get “snoped” (Snopes is a website dedicated to debunking hoaxes). Finally, fellow Italian Walter Quattrociocchi also spoke about hoaxes on Facebook: how they spread, how conspiracy believers interact with skeptics, and so on.

In the next session I attended, I particularly liked two talks. First, Ben Green talked about collective intelligence, and what it actually is. It reminded me of community discovery in networks: scientists dove enthusiastically into it, producing hundreds of papers. However, many didn’t realize that “communities” (and “collective intelligence”) are not so easily defined. Green is trying to fix that. Richard Mann’s talk was also very interesting: in his work with Dirk Helbing he designed incentive strategies for getting the best out of the wisdom of crowds.


The lunch keynote was from a superstar in collective intelligence: Regina Dugan. Just to give you an idea about her, her CV sports a position as program manager at DARPA and she currently is vice president of Engineering, Advanced Technology and Projects at Google. Not bad. She shared her experiences in directing and experiencing the process of doing cutting edge research. Her talk was a textbook example of motivational speaking for scientists and entrepreneurs alike.

Finally, I had the pleasure to attend a couple of talks about prediction markets. These communities are basically a stock market for opinions. Given an event, say the 2016 presidential election, people can put money on their prediction of who is going to be the winner. Websites like SciCast put in place some rules about buying and selling opinion “stocks” and eventually the market price converges on people’s best estimate of every candidate’s odds to win. Prediction markets are a favorite of Nate Silver, and he talks quite a lot about them in “The Signal and the Noise”.
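To give a flavor of what "rules about buying and selling" can look like, here is a small sketch of the logarithmic market scoring rule (LMSR), a standard automated market maker for prediction markets. This post does not say which rules SciCast or the speakers actually use, so treat the choice of LMSR, the liquidity parameter `b` and all function names as assumptions for illustration only.

```python
import math


def lmsr_cost(shares, b):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b))."""
    m = max(shares)  # subtract the max before exponentiating, for stability
    return m + b * math.log(sum(math.exp((q - m) / b) for q in shares))


def lmsr_prices(shares, b):
    """Current price of each outcome; prices are positive and sum to 1."""
    m = max(shares)
    weights = [math.exp((q - m) / b) for q in shares]
    total = sum(weights)
    return [w / total for w in weights]


def buy(shares, outcome, amount, b):
    """Return the new share vector and what the trader pays for the purchase."""
    new_shares = list(shares)
    new_shares[outcome] += amount
    cost = lmsr_cost(new_shares, b) - lmsr_cost(shares, b)
    return new_shares, cost


if __name__ == "__main__":
    b = 100.0              # hypothetical liquidity parameter
    shares = [0.0, 0.0]    # two candidates, nobody has traded yet
    print(lmsr_prices(shares, b))
    shares, paid = buy(shares, outcome=0, amount=50.0, b=b)
    print(paid, lmsr_prices(shares, b))
```

The prices can be read as the market's current probabilities: buying shares of a candidate pushes that candidate's price up, so a trader only profits by moving the price toward what they believe the real odds are.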


Unfortunately, my report of the conference ends abruptly here, as I had to miss the last day of the conference. But the experience was well worth the trip, and I am very grateful to Lada Adamic, Scott Page and Deborah Gordon for the invitation. Unfortunately, this also means that I discovered a shiny event that overlaps with NetSci. Next year, I’ll have to face hard decisions when I allocate my conference time in early June.


29 May 2015

Networks of Networks – NetSci 2015

The time has finally come! The NetSci conference, the place to be if you are interested in complex networks, is happening next week, from June 1st to June 5th. The venue is in Zaragoza, Spain. You can get all the information you need about the event from the official website. For the third year, I am co-organizing one of its satellite events: the Multiple Network Modeling, Analysis and Mining symposium, this year held jointly with Networks of Networks. The satellite will take place on June 2nd. As I previously said, unfortunately I am not going to be physically present in Spain, and that makes me sad, because we have a phenomenal program this year.

We have four great invited speakers: Giovanni Sansavini, Rui Carvalho, Arunabha Sen and Katharina Zweig. It is a perfect mix between the infrastructure focus of the networks of networks crowd and the multidisciplinary approach of multiple networks. Sansavini works on reliability and risk engineering, while Carvalho focuses on characterizing and modeling networks in energy. Sen and Zweig provide their outstanding experience in the fields of computer networks and graph theory.

Among the contributed talks I am delighted to see that many interesting names from the network analysis crowd decided to send their work to be presented in our event. Among the highlights we have a contribution from the group of Mason Porter, who won last year’s Erdos-Renyi Prize as one of the most outstanding young network scientists. I am also happy to see contributions from the group of Cellai and Gleeson, with whom I share not only an interest in multiplex networks, but also in internet memes. Contributions from groups led by heavyweights like Schweitzer and Havlin are another sign of the attention that this event has captured.

I hope many of you will attend this seminar. You’ll be in good hands: Gregorio D’Agostino, Przemyslaw Kazienko and Antonio Scala will be much better hosts than I can ever be. I am copying here the full program of the event. Enjoy Spain!

NoN’15 Program

Session I

9.00 – 9.30 Speaker Set Up

9.30 – 9.45 Introduction: Welcome from the organizers, presentation of the program

9.45 – 10.15 Keynote I: Giovanni Sansavini. Systemic risk in critical infrastructures

10.15 – 10.35 Contributed I: Davide Cellai and Ginestra Bianconi. Multiplex networks with heterogeneous activities of the nodes

10.35 – 10.55 Contributed II: Mikko Kivela and Mason Porter. Isomorphisms in Multilayer Networks

10.55 – 11.30 Coffee Break

Session II

11.30 – 12.00 Keynote II: Rui Carvalho, Lubos Buzna, Richard Gibbens and Frank Kelly. Congestion control in charging of electric vehicles

12.00 – 12.20 Contributed III: Saray Shai, Dror Y. Kenett, Yoed N. Kenett, Miriam Faust, Simon Dobson and Shlomo Havlin. A critical tipping point in interconnected networks

12.20 – 12.40 Contributed IV: Adam Hackett, Davide Cellai, Sergio Gomez, Alex Arenas and James Gleeson. Bond percolation on multiplex networks

12.40 – 13.00 Contributed V: Marco Santarelli, Mario Beretta, Giorgio D’Urbano, Lorenzo Spina, Renato De Leone and Emilia Marchitto. Soccer and networks: changing the way of playing soccer through GPS, video analysis and social networks

13.00 – 14.30 Lunch

Session III

14.30 – 15.00 Keynote III: Arunabha Sen. Strategic Analysis and Design of Robust and Resilient Interdependent Power and Communication Networks with a New Model of Interdependency

15.00 – 15.20 Invited I: Alfonso Damiano, Univ. di Cagliari – Electric Market – Italy; Antonio Scala CNR-ICS, IMT, LIMS

15.20 – 15.40 Contributed VI: Rebekka Burkholz, Antonios Garas, Matt V Leduc, Ingo Scholtes and Frank Schweitzer. Cascades on Multiplexes with Threshold Feedback

15.40 – 16.00 Contributed VII: Soumajit Pramanik, Maximilien Danisch, Qinna Wang, Jean-Loup Guillaume and Bivas Mitra. Analyzing the Impact of Mentioning in Twitter

16.00 – 16.30 Coffee Break

Session IV

16.00 – 16.30 Keynote IV: Katharina Zweig. Science-theoretic musings on the analysis of networks (of networks)

16.30 – 16.50 Contributed VIII: Vinko Zlatic, Sebastian Krause, Michael Danziger. Avoidable colors percolation

16.50 – 17.10 Contributed IX: Borut Sluban, Jasmina Smailovic, Igor Mozetic and Stefano Battiston. Sentiment Leaning of Influential Communities in Social Networks

17.10 – 17.30 Invited II: one speaker from the CI2C project (confirmed, yet to be defined)

17.30   Planning Netonets Future Activities


26 June 2014

NetSci 2014 Report

NetSci, the top global conference about network science, never fails to be a tornado of ideas. Now that the dust has settled, I find it a bit easier to put this year’s thoughts into this post. Yes, this is yet another conference report by yours truly.

Let’s first get over the mandatory part of the report: an evaluation of the awesomeness of the Multiple Networks satellite I co-organized with my friends scattered around Europe. As said, this year’s edition was open to submissions and we received 17 of them. I think that, as a start, that is a good figure. Also, the attendance was more than satisfactory, and it appears scattered only because we got the largest room of the conference! Here’s proof!

[Photo of the satellite room]

The overall event was a great success. The talks were very interesting and we had a great unexpected bonus point. One of our keynotes, as you might remember, was Mason Porter. Well, the guy actually got the Erdos-Renyi prize this year! The Erdos-Renyi prize was established in 2012 and it goes to outstanding young researchers in network science. Well, make a note of this: speaking at the Multiple Networks satellite will eventually get you some important awards. After all, everybody knows that correlation = causation.

My favorite satellite (besides the one I organized, obviously) continues to be the Arts, Humanities and Complex Networks symposium. This year it was a little bit tougher than usual, with a lot of qualitative stuff that not everybody can appreciate. However, their keynote by Lada Adamic was nothing short of outstanding. She is currently working at Facebook, a position that gives her a privileged vantage point over memes and viral events. You know that those things tickle my curiosity very strongly, and Lada’s work is really great. She presented her work, where she proves that meme evolution and mutation on Facebook follow very closely the same mechanics of evolution and mutation we find in the biological world. Good news for my old paper, which was heading in the same direction!

Which brings me to the main conference, because one of the best talks I attended was from Jon Kleinberg, who collaborated with Lada on another memes-meet-Facebook work. In that case, there is less good news for me. My research plan is to use meme content to predict virality. However, the Kleinberg-Adamic dream team showed that content is actually a very weak factor! (Here’s a blog post about it).

There is still hope, though. My way to deal with content is fundamentally different than theirs. Plus the problem they are studying is slightly different from mine: they are analyzing memes that are already going viral and they want to know how popular they will get. I’m more focused on knowing if the meme is going to be popular at all, and I’m not that concerned about whether everybody will know it or only a niche group.

Virality of content was a very hot topic this year, because there were two other fantastic talks about it. One was by Sinan Aral, and he talked about how much we are influenced by a post’s popularity when we read it. Controlling for content (and believe me when I say that Sinan is one of the best experiment designers out there), if we know that a post is popular we are more likely to upvote it. This is so true that Reddit itself decided, for some subreddits, to hide the post score for the first few hours, so that real good content will eventually flow to the top once the discussion is settled.

On top of that, James Gleeson also talked about a theoretical model that can account for the popularity distribution of memes. The model sounds simple. You just assume that a person has a box containing all the memes they saw in the past. With some probability, the person will either come up with something new or reshare a meme from their box. When resharing from the box, there is a memory effect for which more recent memes are more likely to be reshared. Whenever you share something, regardless of whether it is new or not, it ends up in your friends’ boxes. Even if it looks so simple, the actual solution of the model isn’t simple at all, and James is so good he defies belief. And, at the end of the day, everything works like a charm. Again, this does not bother me too much, because it only predicts the distribution of popularity, not which memes are going to be popular, a different problem.
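The box mechanics described above are easy to play with in a toy simulation. What follows is only a sketch of my verbal summary, not James's actual model or its analytical solution: the random friendship network, the geometric recency rule and every parameter value (`p_new`, `recency`, and so on) are assumptions made up for illustration.

```python
import random
from collections import Counter


def simulate_memes(n_people=200, n_friends=5, steps=20000,
                   p_new=0.1, recency=0.5, seed=0):
    """Toy meme-sharing simulation; returns how often each meme was shared."""
    rng = random.Random(seed)
    # Assumed network: each person broadcasts to a fixed random set of friends.
    friends = [rng.sample([j for j in range(n_people) if j != i], n_friends)
               for i in range(n_people)]
    boxes = [[] for _ in range(n_people)]  # memes each person saw, oldest first
    shares = Counter()                     # meme id -> number of times shared
    next_meme = 0
    for _ in range(steps):
        person = rng.randrange(n_people)
        box = boxes[person]
        if not box or rng.random() < p_new:
            # Innovation: come up with a brand new meme.
            meme = next_meme
            next_meme += 1
        else:
            # Memory effect: start from the most recent meme in the box and
            # step back in time with probability (1 - recency) at each item.
            idx = len(box) - 1
            while idx > 0 and rng.random() > recency:
                idx -= 1
            meme = box[idx]
        shares[meme] += 1
        # Whatever is shared, new or not, lands in the friends' boxes.
        for friend in friends[person]:
            boxes[friend].append(meme)
    return shares


if __name__ == "__main__":
    popularity = simulate_memes()
    print(len(popularity), "distinct memes")
    print("top five share counts:", [c for _, c in popularity.most_common(5)])
```

Looking at the share counts gives a rough feel for how uneven popularity can get even when all memes are identical in "quality", but a toy like this is not guaranteed to reproduce the distributions shown in the talk.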

Besides all this work on meme popularity, there were other very interesting talks. I mention:

  • The very elegant talk by Chris Moore on community discovery, which also has the by-product of providing witty one-liners for many occasions (for example “Physicists like to minimize functions because, you know, rocks fall”);
  • The nice talk by Frank Schweitzer on the role of active individuals in collaboration networks, who have the side effect of making the networks more unstable and prone to breaking apart (damn you, hyper-active people!);
  • The usual fun of the lightning talks (they could not call them ignite talks because of copyright issues). My favorite for this year was from Max Schich, with a really great panorama of the art market in London, Paris and Amsterdam from the Getty dataset. Aaron Clauset and Roberta Sinatra deserve to be mentioned too, with two great talks about climbing the greasy pole in academia (is it really worth it to shoot for big name universities? Short answer: no).

That’s it! You can see that this year, too, there was a lot to see and to think about. I am already looking forward to next year!


22 May 2014 ~ 0 Comments

The NetSci Multiple Networks Menu

Friends, scientists, network fanatics, lend me your eyes: I come to announce the program of the Multiple Network Modeling, Analysis and Mining symposium, introduced some months ago on these pages. To give you a quick recap: this is a satellite event which will happen at the 2014 edition of NetSci, a major network science event of the year. The symposium will take place on Monday June 2nd, while the conference itself will start on June 4th and will last until the end of the week. Unlike last year, we now have space for contributed talks and I like the program we were able to set up. So, I’ll boast about it here.

You can find the overview of the entire event on the official website, but let me give you the highlights.

We have four invited speakers: Frank Schweitzer, Renaud Lambiotte, Nitesh Chawla and Mason Porter. They come from different backgrounds (Systems Design, Mathematics and Computer Science), which is a great plus for the event. They are going to:

  • Tackle the mathematical foundations of multiple networks;
  • Describe models for multiple networks;
  • Analyse them, both in the flavour of bipartite temporal social networks and in the extension of the classic link prediction problem. Usually in link prediction we are interested in evaluating the likelihood of seeing “a” connection between two nodes. Since in multiple networks there are different types of connections, we are also interested in predicting “which” connection we will observe.

As for the contributed talks, we have a pretty good team, including (but not limited to) works signed by David Lazer from Northeastern University, Juyong Park from KAIST, Eugene Stanley from Boston University and many more. We had such a positive reaction to our call for papers that we had to increase the slots for contributed talks from 5 to 7 and still reject presentations that we really wanted to see. Among my favourite works are:

  • Multiple network applications to study the productivity of countries and predicting their growth;
  • The study of evolution of different relations among almost 2000 students from 14 US universities;
  • A network-based approach for ranking the performances of sport teams;
  • A novel way to classify nodes in complex networks where multiple different relations are present;
  • … and more!

For completeness, here’s the detailed schedule. I hope to see many of you there!

Session I

9.00 – 9.30 Registration / Set Up
9.30 – 9.50 Introduction: Welcome from the organizers, presentation of the program
9.50 – 10.30 Keynote I: Frank Schweitzer, Professor for Systems Design at ETH Zurich
Analysing temporal bipartite social networks
10.30 – 11.00 Coffee Break

Session II

11.00 – 11.40 Keynote II: Renaud Lambiotte, Associate Professor, Department of Mathematics at University of Namur
Non-Markovian Models of Networked Systems
11.40 – 12.00 Daniel Romero, Nina Mishra and Panayiotis Tsaparas
Estimating the Relative Utility of Networks for Predicting User Activities
12.00 – 13.30 Lunch

Session III

13.30 – 14.10 Keynote III: Nitesh Chawla, Associate Professor, Department of Computer Science & Engineering at the University of Notre Dame
Predicting links in heterogeneous social networks
14.10 – 14.30 Katherine Ognyanova, David Lazer, Michael Neblo, Brian Rubineau and William Minozzi
Ties that bind across contexts: personality and the evolution of multiplex networks
14.30 – 14.50 Neave O’Clery
A Multi-slice Approach to Understanding the Evolution of Industrial Complexity and Growth
14.50 – 15.30 Coffee Break

Session IV

15.30 – 16.10 Keynote IV: Mason Porter, Associate Professor at the Oxford Centre for Industrial and Applied Mathematics
Mathematical Formulation of Multilayer Networks
16.10 – 16.30 Seungkyu Shin, Sebastian Ahnert and Juyong Park
Degree-Neutralizing Weighted Random Walk Ranking in Competition Networks
16.30 – 16.50 Tomasz Kajdanowicz, Adrian Popiel, Marcin Kulisiewiecz, Przemysław Kazienko and Bolesław Szymański
Node classification in multiplex networks
16.50 – 17.10 Francesco Sorrentino
Stability of the synchronous solutions for networks with connections of different types
17.10 – 17.30 Andreas Joseph, Irena Vodenska, Eugene Stanley and Guangron Chen
MLR Fit-Networks: Global Balance of Payments

17.30 – 18.00 Conclusion and final announcements


15 July 2013 ~ 0 Comments

ICWSM 2013 Report

The second half of the year, for me, is conference time. This year is no exception and, after enjoying NetSci in June, this month I went to ICWSM: International Conference on Weblogs and Social Media. Those who think little of me (not many, just because nobody knows me) would say that I went there just because it was organized close to home. It’s the first conference to which I traveled not by plane, but by bike (and lovin’ it). But those people are just haters: I was there because I had a glorious paper, the one about internet memes I wrote about a couple of months ago.

In any case, let’s try not to be so self-centered now (a good joke to read on a personal website, with my name in the URL, talking about my work). The first awesome things that come to mind are the two very good keynotes. The first one, by David Lazer, was about bridging the gap between social scientists and computer scientists, which is one of the aims of the conference itself. Actually, I was so overwhelmed by the amount of good work David presented that I was not able to properly digest the message. I was struck with awe by the ability of his team to get great insights from any source of data about politics and society (one of the great works was about who people contact, and how, after a shock like the recent Boston bombings).

For the second keynote, the names speak for themselves: Fernanda Viégas and Martin Wattenberg. They are the creators of ManyEyes, an awesome website where you can upload your data, in almost any form, and visualize it with many easy-to-use tools. They constantly do a great job in infographics, data visualization and scientific design. They had a very easy time pleasing the audience with examples of their works: from the older visualizations of Wikipedia activities to the more recent wind maps that I am including below because they are just mesmerizing (they are also on the cover of an awesome book about data visualization by Isabel Meirelles). Talks like this are the best way to convince you of the importance of good communication in every aspect of your work, whether it is scientific or not.

As you know, I was there to present my work about internet memes, trying to prove that they indeed are proper memes and they are characterized by competition, collaboration, high-order organization and, maybe I’ll be able to prove in the future, mutation and evolution. I knew I was not alone in this and I had the pleasure to meet Christian Bauckhage, who shares with me an interest in the subject and a scientific approach to it. His presentation was a follow-up to his 2011 paper and provides even more insights about how we can model the life-span of an internet meme. Too bad we are up against a very influential person, who recently stated his skepticism about internet memes. Or maybe he didn’t, as the second half of his talk seems to contradict part of the first, and his message goes a bit deeper:

Other great works from the first day include a great insight about how families relate to each other on Facebook, from Adamic’s group. Alice Marwick also treated us to a sociological dive into the world of fashion bloggers, in search of the value and the meaning of authenticity in this community. But I have to say that my personal award for the best presentation of the conference goes to “The Secret Life of Online Moms” by Sarita Yardi Schoenebeck. It is a hilarious exploration of YouBeMom, a discussion platform where moms can talk with each other while preserving their complete anonymity. It is basically a 4chan for moms. For those who know 4chan, I mean that literally. For those who don’t, you can do one of two things to understand it: take a look or just watch this extract from 30 Rock, which is, if anything, too vanilla in representing the reality:

I also really liked the statistical study about emoticon usage in Twitter across different cultures, by Meeyoung Cha’s team. Apparently, horizontal emoticons with a mouth, like “:)”, are very Western, while vertical emoticons without a mouth are very Eastern (like “.\/.”, one of my personal favorites, seen in a South Korean movie). Is it possible that this is a cultural trait due to different face recognition routines of Western and Eastern people? Sadly, the Western emoticon variation that includes a nose “:-)”, and that I particularly like to use, apparently is correlated with age. I’m an old person thrown in a world where young people are so impatient that they can’t waste time pressing a single key to give a nose to their emoticons :(

My other personal honorable mention goes to Morstatter et al.’s work. These guys had the privilege to access the Twitter Firehose API, which let them analyze the entire Twitter stream. They then also crawled Twitter through the free public APIs, which give access to 1% of the Twitter stream. They showed that the sampling of this 1% is not random, is not representative, is not anything. Therefore, all studies that involve data gathering through the public APIs have to focus on phenomena that include less than 1% of the tweets (because in that case even the public APIs return all results), otherwise the results are doomed to be greatly biased.

Workshops and tutorials, held after the conference, were very interesting too. Particularly one, I have to say: Multiple Network Models. Sounds familiar? That would be because it is the tutorial version of the satellite I did with Matteo Magnani, Luca Rossi and others at NetSci. Uooops! This time I am not to blame, I swear! Matteo and Luca organized the thing all by themselves and they did a great job in explaining details about how to deal with these monstrous multiple networks, just like I did in an older post here.

I think this sums up pretty much my best-of-the-best picks from a very interesting conference. Looking forward to trying to be there also next year!


16 June 2013 ~ 0 Comments

NetSci 2013 Report

As I mentioned a couple of months ago, during the first week of June the NetSci conference took place. NetSci is the main venue that brings together all researchers interested and involved in network science. It has always been a gigantic opportunity to put you in contact with the big shots in network analysis and an excellent playground for very interesting discussions. This year was no different.

Of course, for me the most important part of it was the very first day, when the satellite on multiple networks (organized by myself together with Matteo Magnani, Dino Pedreschi, Luca Rossi, Guido Caldarelli and Przemyslaw Kazienko) happened. As I wrote more than once in the past, multiple networks are networks in which the nodes may be connected with different kinds of interactions (friendship, collaboration, and so on).
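If you prefer code to words, a multiple network can be sketched as one set of edges per kind of interaction. The class below is just a minimal illustration I put together: the name `MultipleNetwork` and its methods are hypothetical, not the data structure of any specific library or of the papers presented at the satellite.

```python
from collections import defaultdict


class MultipleNetwork:
    """Minimal undirected multiple network: one edge set per interaction kind."""

    def __init__(self):
        # kind of interaction -> set of undirected edges (stored as frozensets)
        self._layers = defaultdict(set)

    def add_edge(self, u, v, kind):
        """Connect u and v with an interaction of the given kind."""
        self._layers[kind].add(frozenset((u, v)))

    def kinds_between(self, u, v):
        """Return the sorted kinds of interaction connecting u and v."""
        pair = frozenset((u, v))
        return sorted(kind for kind, edges in self._layers.items()
                      if pair in edges)

    def neighbors(self, u, kind=None):
        """Neighbors of u in one kind of interaction, or in any kind if None."""
        if kind is None:
            layers = self._layers.values()
        else:
            layers = [self._layers.get(kind, set())]
        found = set()
        for edges in layers:
            for edge in edges:
                if u in edge:
                    found |= edge - {u}
        return found


if __name__ == "__main__":
    net = MultipleNetwork()
    net.add_edge("ann", "bob", "friendship")
    net.add_edge("ann", "bob", "collaboration")
    net.add_edge("ann", "carl", "collaboration")
    print(net.kinds_between("ann", "bob"))
    print(net.neighbors("ann", kind="friendship"))
```

The point of keeping the kinds separate is that the same pair of nodes can be connected in more than one way, and a plain network would squash all of that into a single edge.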

It was an extremely interesting event; a first step to bring together many researchers working on the topic of multiple networks, most of whom hadn’t spoken to each other up until then. And when I say it was a smooth and successful operation, you don’t have to take my word for it. We have proof of a room full of brilliant minds taking up all the available spots… and beyond:

The talks were very impressive:

  • We learnt how to measure eigenvector centrality on multiple networks (and you can too);
  • We learnt how to extend basic measures from regular complex networks to multiple networks (and you can too);
  • We learnt how to mine networks with heterogeneous information on nodes and edges (and you can too);
  • We learnt how to detect communities on multiple networks (and you can too);
  • We learnt how to infer the latent structure of inter-related networks (and you can too);
  • We learnt how a random walker behaves on dynamic networks (and you can too);
  • We learnt about the structure and dynamics of multiple networks (and you can too);
  • And we learnt how the properties of multiple networks arise when adding one network at a time (and you can too).

But NetSci, of course, was much more than just this satellite. Another event you absolutely didn’t want to miss there was the Arts, Humanities and Complex Networks Symposium, organized by Max Schich and Isabel Meirelles.

They are both great guys, with a gigantic knowledge about art and design. For example, they picked up a great reference for the logo of their symposium, namely one of the best-known infographics made about visual arts, by Alfred Barr:

And besides the usual great lineup of talks (from the Wikidata project to a very cool movie ranking multiple network algorithm) you can learn surprising stuff about basically everything. My favorite: the observation of one of the speakers about the above visualization itself. Apparently, he was the first to realize that there is a bull up there (hint: Cubism lies between the bull’s horns). As Max then puts it:

Then… the rest of the conference. It is impossible to even give a close idea of the overload of ideas and flashes of genius that populated the venue for those three days. I’ll work around the problem and cheat by giving you a laundry list of (a very tight subset of) the things that most impressed me during the conference:

  • The excellent invited talk by Shlomo Havlin about interdependent networks (networks which depend on each other to function, much like a computer network controlling the electric grid). This interests me because he claims that interdependent networks are a more general case of multiple networks (although I personally have an inkling that perhaps they can be reduced to the same model);
  • The usual spectacular presentation style of my friend César Hidalgo, who this time talked about a complex system showing a nested structure: namely, the cultural exports of different countries;
  • A really great contributed talk by Esteban Moro, which in my opinion could have been a keynote speech as well. Dr. Moro highlighted how people have a trade-off between social capacity (how many relationships we can keep alive) and social activity (how many new people we can meet). As a consequence, different social strategies arise;
  • A brilliant mathematical formulation of a network problem by Jure Leskovec, that, in my opinion, could be the final word about the problem itself. It also resembles a formal mathematical formulation of the algorithmic idea behind my DEMON;
  • And the hilarious ignite talks, 5 minutes and 20 slides for each speaker. There was no possibility of interacting, with the presentation automatically jumping to the next slide every 15 seconds. Next year I definitely want to try to do one too.

And, of course, many other things. But you get the idea: blog posts about it are boring, you really have to experience it yourself.
