
15 April 2013

Aid 2.0

After the era of large multinational empires (British, Spanish, Portuguese, French), the number of sovereign states exploded. The international community realized that many states were being left behind in their development efforts. A new problem, international development, was created and nobody really had a clue about how to solve it. Eventually, the effort started by international organizations such as the UN or the World Bank culminated in the Millennium Development Goals (MDGs): a set of general objectives that humanity decided to achieve. The MDGs are obviously very noble. Nobody can argue against eradicating hunger or promoting gender equality. The real problem is that the logic that produced them is quite flawed. Some thousands of people met around 2000 and decided that those eight points were the most important global issues. That was probably even true, but what about particular countries, where none of the eight MDGs is crucial, but a ninth is? More importantly: why the hell am I talking about this?

I am talking about this because, not surprisingly, network science can provide a useful perspective on this topic. And it did, in a paper that I co-authored with Ricardo Hausmann and César Hidalgo, at the Center for International Development in Boston. In the paper we explain that the logic behind MDGs is a classical top-down, or strictly hierarchical, one: there are few centers where all information is collected and these centers direct all efforts towards the most important problems. This implies that (see the above picture):

  1. The information generated at the bottom level passes through several steps to get to the top, in a perverted telephone game where some information is lost and some noise is introduced;
  2. If some organization at the bottom level wants to coordinate with somebody else at the same level, it has to pass through several levels even before starting, instead of just creating a direct link.
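To make the second point concrete, here is a minimal sketch (a toy illustration, not code from the paper). It assumes a perfectly balanced hierarchy with three levels below the top and three subordinates per node, and counts how many links separate two bottom-level organizations before and after they create a direct link. The function names and the tree shape are assumptions made only for this example.

```python
from collections import deque


def build_hierarchy(branching, depth):
    """Return (adjacency dict, bottom-level nodes) for a balanced tree; node 0 is the top."""
    adj = {0: set()}
    frontier = [0]
    next_id = 1
    for _ in range(depth):
        new_frontier = []
        for parent in frontier:
            for _ in range(branching):
                adj[next_id] = {parent}
                adj[parent].add(next_id)
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj, frontier


def hops(adj, source, target):
    """Breadth-first search: number of links on the shortest path, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


adj, bottom = build_hierarchy(branching=3, depth=3)
a, b = bottom[0], bottom[-1]      # two organizations in different branches

before = hops(adj, a, b)          # the message has to climb to the top and back down

adj[a].add(b)                     # the two organizations create a direct (weak) link
adj[b].add(a)
after = hops(adj, a, b)

print(before, after)
```

In this toy hierarchy the two organizations are six links apart until they connect directly, and every one of those links is a chance for the telephone game to lose information.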

In this world, if all funds for health are allocated to fighting HIV and child mortality, countries that do not have these problems but face, say, a cholera or a malaria epidemic are doomed to be left behind.

What is really necessary is a mechanism with which aid organizations can self-organize, by focusing on the issues they are related to and on the places where they are really needed, without broad and inefficient programs. In this world, a small world, everybody can establish a weak link to connect to anybody else, instead of relying on a cumbersome hierarchy. In an editorial in the Financial Times, Ricardo Hausmann used the Encyclopedia Britannica as a metaphor for representing the top-down approach of the MDGs, against the Wikipedia of a self-organized and distributed system.

The question now is: is it really possible to enable the self-organization of international aid? Or: how do we know what country is related to what development issue, and which organization has expertise on it? Well, it is not an easy question to answer, but in our paper we try to address it. In the paper we describe a system, based on web crawling (i.e. systematically downloading web pages), that captures the number of times each aid organization mentions an issue or a country in its public documents. That is no different from what Google does with the entire web: creating a global knowledge index that is at your fingertips.
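The counting step can be sketched in a few lines. This is only an illustration under strong assumptions: the pages are taken to be already downloaded and stored as plain strings, the organization names ("Org A", "Org B") and texts are made up, and a mention is a simple case-insensitive whole-word match. The actual system in the paper is more careful than this.

```python
import re
from collections import Counter


def count_mentions(documents, terms):
    """Count how often each term appears in each organization's documents.

    documents: dict mapping organization name -> list of document texts
    terms:     list of issue or country names to look for
    """
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    counts = {org: Counter() for org in documents}
    for org, texts in documents.items():
        for text in texts:
            for term, pattern in patterns.items():
                counts[org][term] += len(pattern.findall(text))
    return counts


# Hypothetical, hand-written documents standing in for crawled pages.
documents = {
    "Org A": [
        "Our malaria programme in Kenya expanded. Malaria cases fell.",
        "Microfinance pilots in Kenya and Peru.",
    ],
    "Org B": ["Cholera response in Haiti."],
}
terms = ["malaria", "cholera", "Kenya", "Peru", "Haiti"]

mentions = count_mentions(documents, terms)
for org, counter in mentions.items():
    print(org, dict(counter))
```

The resulting organization-by-term table is the raw material for the network maps: an organization is linked to an issue or a country in proportion to how often it talks about it.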

Using this strategy, we can create network maps, like the one above, to understand the current structure of development aid. We are also able to match aid organizations, developing countries and development issues according to how closely they are related to each other. The number of possible combinations is still quite high, so to actually use our results it is necessary to create a nice visualization tool. And that’s another thing we did: the Aid Explorer (developed and designed by yours truly).

In the Aid Explorer you can compare organizations, countries and issues and see if they are coordinating as they should. For example, you can check which issues are related to Nordic Fund. Apparently, Microenterprise is a top priority. So, you can check how Nordic Fund relates to countries, according to how they are related to Microenterprise. That’s a good positive correlation! It means that indeed the Nordic Fund really relates most to the countries that are very related to Microenterprise. If we had found a negative correlation, that would have been bad, because it would have meant that Nordic Fund relates to the wrong countries. A general picture over all issues (or over all countries) of Nordic Fund can also be generated. Summing up these general pictures, we can generate rankings of organizations, countries and issues: the more high relevance and high correlation we observe together, the better.
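The check described above boils down to a correlation between two lists of scores over the same countries. Here is a small sketch with invented numbers (the country labels and scores are hypothetical, and a plain Pearson correlation is an assumption of this example, not necessarily the exact statistic used in the Aid Explorer).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


countries = ["A", "B", "C", "D"]

# Hypothetical scores: how strongly each country is related to the issue...
issue_to_country = {"A": 8, "B": 5, "C": 3, "D": 1}
# ...and how strongly a well-aligned and a badly aligned organization relate to each country.
aligned_org = {"A": 10, "B": 6, "C": 2, "D": 1}
misaligned_org = {"A": 1, "B": 2, "C": 6, "D": 10}

issue_scores = [issue_to_country[c] for c in countries]
r_good = pearson([aligned_org[c] for c in countries], issue_scores)
r_bad = pearson([misaligned_org[c] for c in countries], issue_scores)

print(round(r_good, 2), round(r_bad, 2))
```

A value close to +1 means the organization concentrates on the countries most related to its priority issue; a negative value means it is looking in the wrong places.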

Hopefully, this is the first step toward an ever more powerful Aid Explorer, one that can help organizations get the maximum bang for their buck and countries get more visibility for their particular issues, without being overlooked by the international community because they are not acting in line with the MDG agenda.


04 November 2012

Destroying Drug Traffic, One Query at a Time

in·tel·li·gence NOUN: a. The capacity to acquire and apply knowledge.

The intelligence process, as in Central Intelligence Agency, is the process any person or organization should go through when making important operational decisions. But this is a description of a perfect world. In reality, organizations have to face phenomena that are very complex. When the organization itself is significantly smaller than the complexity it has to face, its members have to rely on intuition, art or decisions that are not solidly grounded.

This is usually the case for the local police when facing organized crime. Large crime organizations like the Italian Camorra or the drug cartels in Mexico are usually international. If you read Roberto Saviano’s Gomorrah, you’ll realize that Camorra operates as far away as Germany or Scotland, while drug cartels usually span from Colombia to the US, passing through Mexico. On the other hand, most of their activities happen at the local level: kidnappings, killings, drug traffic. Their main adversary is not usually a broadly operating institution like the FBI, but the local police. But for the local police, gathering enough information to face them is usually hopeless.

With this problem in mind, I teamed up with a Mexican colleague of mine at Harvard, Viridiana Rios. Our aim was to develop a cheap and effective system for gathering intelligence about criminal activities.

The problem with criminal activities is that they are not only part of the complex organism of organized crime. They are also usually hidden from the public. Of course, no head of a mafia family wants to conduct his business en plein air. However, whether he likes it or not, some of these activities reach the general public anyway. This happens because, “luckily”, bad news sells a lot of newspapers. Criminal activities usually leave a clear footprint in the news. Mexican drug traffic is also peculiar in this respect, because of the tradition of leaving the so-called narcomensajes. These messages are writings painted on walls or on highway billboards. They are used by the criminal organizations to threaten each other or the government and the police. A narcomensaje looks like this:

When we design a system for tracking the activities of crime organizations, we want this system to be as automatic as possible. Therefore, we use some computer science tricks and we rely on the information present on the websites of newspapers. Web knowledge has a lot of problems: it’s big, it’s about many different things and it’s subject to reliability concerns. However, Google News deals with most of these problems by carefully selecting topics and reliable sources. What was left for us to do was to systematically query the system through its APIs and clean the results. The details of this process are in a paper I presented this week at the Conference on Information and Knowledge Management (CIKM 2012).
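The core of the idea can be sketched without touching any real API. In the toy example below, the news results are assumed to be already retrieved and reduced to plain headline strings; the group names ("Alpha", "Beta"), the town names and the headlines are all invented; and the "cleaning" is reduced to dropping exact duplicates. The real pipeline described in the paper is considerably more involved.

```python
from collections import Counter
from itertools import product


def cooccurrences(articles, actors, places):
    """Count, for every (actor, place) pair, the distinct articles mentioning both."""
    # Minimal cleaning step: drop exact duplicates while preserving order.
    unique_articles = list(dict.fromkeys(articles))
    counts = Counter()
    for article in unique_articles:
        text = article.lower()
        for actor, place in product(actors, places):
            if actor.lower() in text and place.lower() in text:
                counts[(actor, place)] += 1
    return counts


# Hypothetical headlines standing in for the results of the news queries.
articles = [
    "Police link Alpha to shooting in Rio Claro",
    "Police link Alpha to shooting in Rio Claro",  # duplicate, should count once
    "Alpha and Beta clash in Monte Azul",
    "Beta banner found in Monte Azul",
    "Local festival opens in Rio Claro",
]
actors = ["Alpha", "Beta"]
places = ["Rio Claro", "Monte Azul"]

presence = cooccurrences(articles, actors, places)
for (actor, place), n in sorted(presence.items()):
    print(actor, place, n)
```

Swapping the two lists for any other set of names is all it takes to point the same counting logic at a different kind of event.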

We did not have any way to tell whether our queries connecting drug traffickers to Mexican municipalities were capturing real connections. For this reason, we performed the very same task using Mexican state governors. To our great surprise, we were able to detect their real patterns of activity with high accuracy. (Not that we are drawing a parallel between organized crime and politics, just to be clear!) This indicates that our method of tracking people’s activities by using Google News data is valid. Here are some maps of some state governors. In red the municipalities where they are detected and with a large black border their state:

What did we find?

Mexican drug traffic follows a fat-tail distribution. The meaning? There is an incredible number of municipalities with a weak drug traffic presence, while a few others are an explosives factory where the employees have to carry flamethrowers. Moreover, it really looks like a hydra: destroying one hub is likely just to generate another hub, or ten smaller hubs.
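A quick way to get a feeling for what a fat tail means is to ask how much of the total activity sits in the top few places. The numbers below are invented purely for illustration (they are not the data from the paper): 100 hypothetical municipalities, most with a single detected event and one with hundreds.

```python
from statistics import mean, median


def top_share(values, fraction):
    """Share of the total held by the top `fraction` of the values."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# Made-up activity counts for 100 municipalities.
activity = [1] * 80 + [5] * 15 + [50] * 4 + [500]

share = top_share(activity, 0.05)
print(f"top 5% of municipalities hold {share:.0%} of the activity")
print("median:", median(activity), "mean:", mean(activity))
```

When the mean is far above the median and a handful of places hold most of the total, the "typical" municipality tells you very little about where the action really is.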

And the system is growing fast, jumping from one order of magnitude to a larger one in less than a decade.

We are also able to classify cartels according to several features, such as how much they like to compete or to explore the territory. In the future, this may be used to predict where and when we will see a spike of activity for a particular drug cartel in a particular municipality (in the picture, the migration pattern of the Los Zetas cartel).

Apart from the insights, our methodology really helps with the intelligence problem whenever there are insufficient resources to perform an actual intelligence task. We used the case of criminal activities, but the system is fairly general: by using a list of something other than drug cartels and something other than Mexican municipalities, you can bend the system to give you information about your favorite events.


11 October 2012

The Product Space and Country Prosperity

As reported in different parts of this website, the group I am currently working in is called “Center for International Development”. The mission of this group, in the words of its head Ricardo Hausmann, is quite trivial and unambitious: to eradicate poverty from the world. Knowing how to do it is far from easy and there are different schools of thought about it. The one that Hausmann chose starts with understanding how production and economic growth work, i.e. why some industries are successful in a country and not in another.

To address this question, Hausmann (together with César A. Hidalgo, Bailey Klinger and Albert-László Barabási) developed the Product Space. The original idea was published in this paper in 2007, long before I joined the group, but my (overestimated) expertise was heavily exploited to generate its current implementation (the Atlas of Economic Complexity, a book freely available in electronic format). For this reason I feel no shame in writing a post about it on this blog (and in sharing the Product Space itself on my Dataset page).

The fundamental assumption of the Product Space is the following: countries are able to have a comparative advantage in exporting a given product because they have the capabilities to export it. In abstract, it means: “I do this, because I can”. Quite reasonable. In pictures, with the awesome “A capability is a piece of LEGO” metaphor created by Hidalgo:

From this assumption, it follows that if a country can export two different products, it is because it has the capabilities to export both. This step is also quite easy. The conclusion is immediate: a country’s development success is led by its capabilities. The more capabilities the country has (meaning that its LEGO box is big and it contains a lot of pieces), the more capabilities it will be able to acquire, and the faster it will grow.

There is a small problem with this conclusion: we can’t observe the capabilities. Of course, if we could, then they would be blatantly obvious, so every country could base its growth strategy on them. (There is actually another problem: capabilities are tacit knowledge, as Nonaka and Takeuchi would say, so you really can’t teach them, but we will come to this problem later.)

What we can observe is simply which countries export what:

But remember: if two products are exported by the same country, then there may be common capabilities needed for their production; while if two countries export the same products, then they share at least part of the same capabilities. In mathematical terms, it means that the observed country export picture is actually the multiplication of the two halves of the second picture of the post, or:

This is nice because it means that we can collapse the country-product relationships into product-product relationships and then map which product is related to which other product, because it requires the same capabilities. The advice for countries is then: if you are exporting product x, then you are likely to have most of the capabilities to export all the products that are connected to x. And this is how the Product Space was born, a single picture expressing all these relationships:
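Collapsing country-product relationships into product-product relationships can be sketched as follows. This is a simplified illustration, not the Atlas implementation: the countries and export baskets are invented, a country either exports a product or it does not (the real data require a proper definition of what counts as a significant export), and the relatedness of two products is taken to be the number of countries exporting both, divided by the number of countries exporting the more widespread of the two.

```python
def proximity(exports):
    """Product-product relatedness from co-exports.

    exports: dict mapping country -> set of products it exports.
    Returns a dict mapping (product_a, product_b), with product_a < product_b,
    to a score between 0 (never co-exported) and 1 (always co-exported).
    """
    products = sorted(set().union(*exports.values()))
    # Ubiquity: how many countries export each product.
    ubiquity = {p: sum(p in basket for basket in exports.values()) for p in products}
    scores = {}
    for i, p in enumerate(products):
        for q in products[i + 1:]:
            both = sum(p in basket and q in basket for basket in exports.values())
            scores[(p, q)] = both / max(ubiquity[p], ubiquity[q])
    return scores


# Hypothetical export baskets.
exports = {
    "Country 1": {"shirts", "trousers"},
    "Country 2": {"shirts", "trousers", "phones"},
    "Country 3": {"phones", "computers"},
    "Country 4": {"copper"},
}

space = proximity(exports)
for pair, score in sorted(space.items(), key=lambda item: -item[1]):
    print(pair, score)
```

In this toy world shirts and trousers are tightly connected, phones bridge garments and computers, and copper is connected to nothing: a country exporting only copper has no nearby product to jump to.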

(click on the picture for a higher resolution, or just browse the Atlas website, that is also dynamic).

In the picture the nodes are colored according to the community they belong to (for more information about communities, see a previous post). The communities make sense because they group products that intuitively require the same extended set of capabilities: in cyan we have the electronic products, light blue is machinery, green is garments and so on and so forth.

Is the structure of the Product Space telling us something reliable? Yes, countries are way more likely to start exporting products that are close, in the Product Space, to the products they already export. Also, it is important to know where the export products of a country are in the Product Space. The more present a country is in the denser cores of the Product Space, the more complex it is said to be. This measure of complexity is a better predictor of GDP growth than classical measures used in political economy like average years of schooling.

Is the structure of the Product Space telling us something interesting? Hell yeah, although it’s not nice to hear. The Product Space has communities, so if you export a product belonging to a community then you have a lot of options to expand inside the community. However, many products are outside the communities, and they are very weakly connected with the rest, often through long chains. The meaning? If you are only exporting those products, you are doomed to not grow, because there is no way that you’ll suddenly start exporting products of a community from nothing (because this would require tacit knowledge that you cannot learn and is not close to what you know). And guess what products the poorest countries are currently exporting.

To conclude, a couple of pictures to provide one proof of the reasoning above. In 1970, Peru had more average years of schooling, more land and twice the GDP per capita of South Korea. Traditional political economy would say that Peru was strong and there was no way that South Korea could catch up. Where South Korea is today is evident. What was the difference between the two countries in terms of Product Space?

(again, click for higher resolution. The black square border indicates which products the country is exporting). There is not a lot of difference in quantities (and this explains why South Korea was poorer). However, South Korea had those two or three products in a very valuable position, while Peru had only products in the branches: a long way to the core. In 2003, this was the result:

Peru is still mainly on the edges, South Korea occupies the center. And that’s all.
