Empirics and Institutions

I just completed a section in class on institutions and growth. This led me to re-read several papers, including Acemoglu, Johnson, and Robinson (2005) on “The Rise of Europe: Atlantic Trade, Institutional Change, and Economic Growth.” The idea in the paper is that the places that grew most quickly in Europe were those that (1) could take advantage of Atlantic trade and (2) had good initial institutions around 1500 AD.

The heart of the paper is Section II, called “Our Hypothesis”. It is this:

In countries with easy access to the Atlantic and without a strong absolutist monarchy, Atlantic trade provided substantial profits and political power for commercial interests outside the royal circle. This group could then demand and obtain significant institutional reforms protecting their property rights. With their newly gained power and property rights, they took advantage of the growth opportunities offered by Atlantic trade, invested more, traded more, and fueled the First Great Divergence.

This is a fine hypothesis. It seems plausible, and my guess is that one could make a solid historical case for it. In essence, Spain and France were not able to take advantage of Atlantic trade because their monarchies were too rigid/absolutist/powerful, and England and the Netherlands did take advantage because their monarchies were already relatively weak.

AJR undertake an empirical study in this paper meant to bolster their hypothesis. But the more I read through it, the less convinced I am that this empirical exercise really tells us anything useful. In particular, it doesn’t do anything to convince me that their hypothesis is right.

To do this empirical exercise, they construct an index of initial institutions around the year 1500, coded from 1 to 3, that is supposed to measure the constraints on the executive. A 1 means few constraints (bad) and a 3 means some constraints (good). They assign a value of 1–3 to each European country. To be clear, I’m not making an Albouy-like claim about the assignment itself. I’m willing to stipulate that “institutions” in the Netherlands at the time were a 3, while those in Spain were a 1, and France was somewhere in between with a 2. That’s not the issue.

Their empirical work is based on comparing the effect of initial institutions on subsequent economic outcomes among Atlantic traders. In practical terms, this means they are looking at variation across the five countries (England, France, the Netherlands, Spain, and Portugal) with meaningful Atlantic trade. This group is small enough that you can immediately see how things are going to come out. The Netherlands has a 3 for institutions in 1500, England a 2, France a 2, Spain a 1.5, and Portugal a 1. Already, this index is closely correlated with income per capita in these countries in the subsequent years they analyze (1600, 1700, 1820). The index is almost exactly collinear with income per capita around 1700. The real issue is that later on (by 1820), England takes a lead over France in income per capita, which doesn’t correlate with their institutional measure. However, if you weight the institutional index by the amount of Atlantic trade a country does (i.e. the number of voyages), then England looks better than France.

One issue is that it doesn’t seem necessary at all to run their regressions. Their Table 7 presents a series of 30(!) regressions showing that having a big value of the initial institutional index (once you weight it by the volume of trade) is correlated with urbanization, GDP per capita, and future values of the institutional index. But a (not even careful) reading of history would lead you to the same correlations. Their Section II, in fact, is just such a reading of history. What does it matter that the coefficient they estimate is 0.21? What does that mean?

Further, the statistical significance and coefficient estimates are arbitrary. As there are no natural units for the index, there is no meaning to going from a 2 to a 3. Are institutions in the Netherlands 33% better than in France? While the ranking is informative, the numbers themselves are not. Imagine that I re-indexed the institutions in the AJR paper as Netherlands = 5, France and England = 4, Spain = 1.5, and Portugal = 1. I haven’t rescaled the entire index, I’ve just added 2 to the Dutch, French, and English scores. This preserves the rank ordering, but changes the spread. This will affect both the slope estimate and the standard error. If I fiddle just right with the index (e.g. adding 3 rather than 2, or giving Portugal a 0) then I can make the slope and standard error come out however I want.
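To make this concrete, here is a minimal sketch in Python. The “income” outcome numbers are entirely made up for illustration, and the index values only loosely follow the rough ranking discussed above; the point is simply that a re-indexing that preserves the rank ordering still changes both the OLS slope and its standard error.

```python
import numpy as np

def ols_slope_se(x, y):
    """Bivariate OLS slope and its classical standard error."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    resid = y - a - b * x
    # s^2 / Sxx, where Sxx = (n-1) * sample variance of x
    se = np.sqrt(resid @ resid / (n - 2) / ((n - 1) * np.var(x, ddof=1)))
    return b, se

# Made-up outcome for NL, England, France, Spain, Portugal (illustrative only)
income = [1.0, 0.9, 0.8, 0.5, 0.4]

original = [3, 2, 2, 1.5, 1]    # AJR-style ordinal coding
reindexed = [5, 4, 4, 1.5, 1]   # same ranking, different spread

b1, se1 = ols_slope_se(original, income)
b2, se2 = ols_slope_se(reindexed, income)
print(f"original:  slope={b1:.3f}, se={se1:.3f}")
print(f"reindexed: slope={b2:.3f}, se={se2:.3f}")
```

Both regressions use the exact same outcome data and the exact same country ranking, yet they deliver different slopes and different standard errors, which is the arbitrariness at issue.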

Last, the regressions do not even necessarily confirm their historical intuition. They may have coded the index based on institutions, but that index picks up anything that varies widely between the Netherlands and Portugal, with England and France as some sort of intermediate case. The propensity for enjoying tulips? The prevalence of dairy farming? The inverse of average February temperatures? Just because you call it an index of institutions doesn’t mean that it will only pick up variation in institutions (And no, country fixed effects do not necessarily wash out my silly examples, for the same reason that FE don’t wash out the institutions index – they’re continuous measures).

The really frustrating part is that this veneer of empirical support is completely unnecessary. AJR have a perfectly plausible explanation for the general historical facts. It’s a pretty compelling story, from my perspective. But there is never going to be a rock-solid empirical identification strategy for this kind of work. It’s worth remembering that lacking identification is not the same thing as rejecting a theory. For this kind of historical hypothesis, we have to get comfortable with ambiguity.

Examples of Institutional Failure

NOTE: The Growth Economics Blog has moved sites. Click here to find this post at the new site.

Miles Kimball posted a link to this article from the WSJ on the mess that is Nigeria’s electricity grid. A key factoid is that Nigeria produces as much electricity as Montana, yet Nigeria has about 170 million people while Montana has only 1 million. The author (Drew Hinshaw) has equally disturbing comparisons for other African countries.

The story focuses on Tony Elumelu, who recently purchased a power plant. His logic here is sound:

Thanks to all-day outages, Nigerians consume scant electricity—less than Puerto Rico. Once electricity flows into their homes, though, tens of millions of people will rush to buy refrigerators, air conditioners, electric kettles, he added—all pulling power from his turbines.

But the real question is whether the institutions in Nigeria will allow this to happen. As mentioned in the article, there is rampant theft of electricity from the grid, the grid itself is old and failing, and money is often extorted from providers in exchange for not blowing their equipment up.

I don’t know how to properly define “institutions”, but when we say that institutions matter, I think this is what we mean. The gains to having regular electricity are obvious, but the question is whether someone like Tony Elumelu will be able to keep enough of the profits from selling electricity to make him stick with his investment.

In a completely unrelated, but to me equally shocking story, Miles Kimball was my professor for Money and Banking at Michigan – 23 years ago.

Defining Development Economics

Trying to precisely define an area of study is impossible, but thinking through the definition of “development economics” is an interesting diversion. As with most fields, the definition is tied closely to the people doing research in that area. So today, “development economics” is the kind of research done by people like Esther Duflo, Ted Miguel, Michael Kremer, and a group of other very smart people who I will offend by not mentioning here. This kind of development economics has several key features. (1) It very consciously takes place in developing countries. These researchers are out collecting surveys or doing studies “in the field”. Perhaps the best way to define a development economist today is as someone whose presentation includes a picture of the village they worked in to collect data. (2) It is intensely concerned with identification of causal effects. Thus the field aspires to run randomized control trials (RCTs) to identify the causal effect of some {X} (e.g. de-worming treatments) on some {Y} (e.g. school attendance), as in Miguel and Kremer (2004). Failing that, some kind of natural experiment featuring quasi-random treatment assignment is examined. (3) It tends to be a-theoretical. The RCTs show reduced-form empirical effects of some kind of treatment on some kind of outcome. The de-worming paper of Miguel and Kremer is purely empirical, for example. This isn’t universally true, as there are papers that explicitly test some theory, but the dominant portion of the literature is purely empirical.

Through some historical inertia in the profession, we call this research “development economics”. But I think that this type of research is more properly called “poverty economics”: the study of individuals living in particularly poor, under-developed countries. Banerjee and Duflo‘s 2011 book is actually called Poor Economics, and the tag-line is A Radical Rethinking of the Way to Fight Global Poverty. The focus is on alleviating the conditions of extreme poverty: poor health, poor nutrition, and low education. The RCTs are evaluations of interventions that aim to improve health, nutrition, or educational attainment. By going out into these developing countries, these researchers are acutely aware of the constraints facing poor people, and are studying ways to alleviate those constraints.

This is all valuable research. It is perhaps more admirable in its motivations than other sub-fields of economics (*cough* finance *cough*). But it is not about “development”.

Economic development is about the transition of whole economies from low-productivity, poor places into high-productivity industrial economies. This transition encompasses several aspects: a move out of agriculture and into manufacturing or services, urbanization, declining fertility rates, integration with global markets. Current research in development economics – the RCTs and their like – does not study the transition. “What will make these people better off today?” is a different question than “What will make this economy develop?”.

If you go back far enough in the development literature, you’ll find that the second question is the dominant one. Lewis, Nurkse, Rosenstein-Rodan, Boserup, Gerschenkron, and Hirschman were all concerned with what drove the transition to high-productivity industrial economies. But while they focused on this broader question, their work also contained assumptions that steered the profession away from it, towards “poverty economics”. An (often unspoken) assumption of much of this early development work was that rural peasants were irrational. That is, they did not respond to prices or incentives the way that people in modern economies did. They were tradition-bound, stuck in their ways. Development meant breaking this resistance to change and educating them to operate in a market-based economy.

The reaction to this, most notably associated with T. W. Schultz’s 1964 book Transforming Traditional Agriculture, was that peasants were in fact rational, but faced a unique set of constraints. If they stuck to traditional means of organizing production, that was because those traditions were solving some concrete problem. Perhaps the best example is share-cropping, which even Alfred Marshall critiqued as inefficient. Under share-cropping, the peasant’s marginal return to additional labor or capital is lowered, and hence less effort or investment is put forth. To the early development economists, share-cropping was an example of a traditional institution that prevented higher output. But once you appreciate the uncertainty involved in family farming, it is possible that share-cropping is the optimal contract precisely because it shares risk between the land-owner and the farmer.
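The risk-sharing logic can be seen in a toy calculation. All the numbers below are my own illustrative assumptions, not from Schultz or the contracting literature: a risk-averse farmer with log utility faces a risky harvest, and a share contract that gives the landlord the same expected payment as a fixed rent nonetheless leaves the farmer with higher expected utility.

```python
import numpy as np

harvest = np.array([4.0, 16.0])   # bad year, good year (assumed values)
prob = np.array([0.5, 0.5])       # equally likely

rent = 3.0                        # fixed rent: farmer bears all the risk
share = 0.7                       # share contract: farmer keeps 70% of output

# The landlord's expected payment is the same under both contracts:
assert abs(rent - (1 - share) * (prob @ harvest)) < 1e-9

eu_rent = prob @ np.log(harvest - rent)     # E[ln(y - r)]
eu_share = prob @ np.log(share * harvest)   # E[ln(s * y)]

print(f"fixed rent: E[u] = {eu_rent:.3f}")
print(f"sharecrop:  E[u] = {eu_share:.3f}")  # higher: harvest risk is shared
```

Because log utility is concave, the farmer prefers the contract that compresses the spread of consumption across good and bad years, even though the expected transfer to the landlord is identical.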

This led development economists on a different heading. The hunt was on for reasonable explanations of the observed behavior in these underdeveloped villages. What were the constraints or conditions that prevented these peasants from making more investments, or adopting better technology? This led to all kinds of seminal insights. Joseph Stiglitz, as one example, cites his time in Kenya in the late 60’s as pivotal for developing his ideas on the economics of information.

However, in pursuing this line of thinking, development economics got so deep into the details of optimizing behavior in under-developed villages that it lost track of the larger question: “What will make this economy develop?” The study of the nuances of under-developed markets became an end in itself. A concrete example is the survey by Otsuka, Chuma, and Hayami (1992, JEL, sorry no link) on “Land and Labor Contracts in Agrarian Economies”. They say, “Through a critical review of the existing studies of agrarian contracts, this essay points towards building a ‘general model’ in which land tenancy, labor employment, and owner cultivation are modeled together as substitutes along a continuous spectrum of contract choice.” And it is a very nice synthesis of the literature in this area to that point. But what implications does it have for development? What does this general model tell us about how or why a country will make the transition into an industrial economy? Knowing that peasants are rational rather than irrational is great, but I still would like to know how those peasants’ kids or grandkids will (or will not) end up as machinists or office workers living in a city one day.

This is where development economics started turning into poverty economics. The focus became purely on understanding the constraints facing poor people in under-developed countries. As these constraints were dire, and led to such bad outcomes (poor health, low education, etc.), alleviating those constraints became the first-order concern. And let’s be clear, it is very much a first-order concern. Millions of people dying from an easily preventable disease is a travesty. Running RCTs to establish the best way to distribute such treatments is incredibly valuable research.

Despite that, current development economics doesn’t address the broader questions, the older questions, of what drives development in the long-run. The field of growth economics has essentially adopted this set of questions as part of its own research agenda. One of the things that this “macro-development” research does is establish the aggregate impact of micro-level features of under-developed economies. Does a given micro-level distortion or constraint incur such costs that it is a material reason for why a country remains relatively poor? Two recent examples are Hsieh and Klenow‘s 2009 paper on the aggregate effects of misallocations across firms in China and India, or Lagakos and Waugh‘s 2013 model of selection and cross-country income differences.

This doesn’t make the growth/macro-development approach better or worse than poverty economics. The two fields are just looking at different questions, with different implications. It’s worth keeping that in mind when evaluating the research in the two fields. In particular, it is not helpful to criticize one literature with the tools of the other (e.g. “But how do you plan on getting identification of this effect?” or “But who cares about the reduced form effect? What’s the mechanism you think is at work here?”). Different questions, different approaches, different techniques.

Is Capital Important?

There is kind of a disconnect in teaching economic growth. We spend a lot of time telling students about the Solow model and capital accumulation, but at the same time the general consensus among growth economists is that total factor productivity is more important to understanding levels of output per worker.

Why do we think that capital isn’t terribly important to levels of output per worker? Basically, because the correlation between capital per worker and output per worker is low – or rather, we assume that it is low. Here’s a way of thinking about this in terms of simple regressions. If I was interested in how important capital per worker was in explaining output per worker, I could run this regression for a sample of countries ({i})

\displaystyle  \ln{y}_i = \beta_0 + \beta_1 \ln{k}_i + \epsilon_i \ \ \ \ \ (1)

where I’ve put output per worker ({y_i}) and capital per worker ({k_i}) in logs. Logs keep countries with very small or very big values of capital per worker from being so influential, and in logs this regression will have an obvious interpretation for the coefficient {\beta_1}.

If I run this regression, I’ll get some estimated coefficient {\hat{\beta}_1}, which is the elasticity of output per worker with respect to capital per worker. Moreover, I could look at the R-squared of this regression. This R-squared will tell me what fraction of the variance of log output per worker ({Var(\ln{y}_i)}) is explained by variation in log capital per worker ({Var(\ln{k}_i)}). The R-squared is really what I want; it’s the answer to the question “How important is capital in explaining differences in output per worker?” The coefficient by itself doesn’t tell us that answer.
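Here is what this regression and its R-squared look like on simulated data (the numbers are invented, not real cross-country data). In this simulation the residual is drawn independently of capital, so OLS happens to recover the true elasticity; the worry raised next is that in real data it does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80
ln_k = rng.normal(9.0, 1.5, n)    # hypothetical log capital per worker
eps = rng.normal(0.0, 0.8, n)     # "everything else": TFP, institutions, ...
ln_y = 2.0 + 0.3 * ln_k + eps     # true elasticity of 0.3

# Bivariate OLS by hand
beta1_hat = np.cov(ln_k, ln_y, ddof=1)[0, 1] / np.var(ln_k, ddof=1)
beta0_hat = ln_y.mean() - beta1_hat * ln_k.mean()
resid = ln_y - beta0_hat - beta1_hat * ln_k
r2 = 1 - np.var(resid) / np.var(ln_y)

print(f"estimated elasticity: {beta1_hat:.3f}")
print(f"R-squared:            {r2:.3f}")
```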

Now, there are some big problems with this regression. Most importantly, it is almost certainly the case that {\ln{k}_i} is correlated with {\epsilon_i}, the residual. The residual captures things like technology levels, institutions, human capital, etc., and capital per worker tends to be large when these things are “big”, meaning that they have a big positive effect on output per worker.

So that means we cannot trust our estimate {\hat{\beta}_1}, and cannot trust our value of R-squared. It’s worth writing out what the “true” R-squared is if we in fact had the right estimate of {\beta_1}. I’ll pre-apologize for the fact that this involves a lot of steps, but I’m writing them all out so it is easier to follow.

\displaystyle  \begin{array}{rcl}  R^2 &=& \frac{{\beta}_1^2 Var(\ln{k}_i)}{Var(\ln{y}_i)} \\ \nonumber &=& \beta_1 \frac{Cov(\ln{k}_i,\ln{y}_i)}{Var(\ln{k}_i)}\frac{Var(\ln{k}_i)}{Var(\ln{y}_i)} \\ \nonumber &=& \beta_1 \frac{Cov(\ln{k}_i,\ln{y}_i)}{Var(\ln{y}_i)} \\ \nonumber &=& \frac{Cov({\beta}_1\ln{k}_i,\ln{y}_i)}{Var(\ln{y}_i)} \\ \nonumber &=& \frac{Cov({\beta}_1\ln{k}_i,{\beta}_1 \ln{k}_i + \epsilon_i)}{Var(\ln{y}_i)} \\ \nonumber &=& \frac{ Var({\beta}_1\ln{k}_i) + Cov({\beta}_1 \ln{k}_i,\epsilon_i)}{Var(\ln{y}_i)}. \nonumber \end{array}

The last line is identical to what Pete Klenow and Andres Rodriguez-Clare (1997, and KRC hereafter) use to evaluate the importance of capital in explaining cross-country output per worker differences. In other words, KRC are just looking for an R-squared. But as they point out, they cannot simply run the regression I proposed above and get the R-squared from that, because almost certainly {\hat{\beta}_1 \neq \beta_1}.

Rather than run the regression, KRC suggest that we use some alternative means of estimating {\beta_1}. They propose using the share of total output that gets paid to capital. Why? Because under perfect competition and constant returns to scale, that share should be precisely equal to {\beta_1}. In data from the U.S., capital’s share of output is usually somewhere between 0.3 and 0.4, and KRC use {\hat{\beta}_1 = 0.3}. The rest of their data ({\ln{k}_i} and {\ln{y}_i}) is exactly the same data that one would use to run the regression. The only thing they are doing differently is plugging in their outside estimate of {\hat{\beta}_1}. What KRC find is that their R-squared is about 0.30, or that only 30% of the variation in log output per worker across countries is accounted for by variation in capital per worker across countries. This is a big reason why growth economists don’t think capital is of primary importance in explaining cross-country differences in output per worker.
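A sketch of the KRC-style calculation on simulated data makes the contrast clear (all numbers here are illustrative, not theirs). Capital is deliberately constructed to respond to unobserved productivity, so the naive OLS R-squared is inflated, while plugging in {\beta_1 = 0.3} and using the last line of the derivation above gives a much smaller answer.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
tfp = rng.normal(0.0, 0.7, n)                    # unobserved productivity
ln_k = 9.0 + 1.2 * tfp + rng.normal(0, 0.5, n)   # capital responds to productivity
ln_y = 0.3 * ln_k + tfp                          # true elasticity 0.3, eps = tfp

beta1 = 0.3                                      # plug-in from capital's share
eps = ln_y - beta1 * ln_k
# R^2 = [Var(b*ln k) + Cov(b*ln k, eps)] / Var(ln y)
r2_krc = (np.var(beta1 * ln_k)
          + np.cov(beta1 * ln_k, eps, ddof=0)[0, 1]) / np.var(ln_y)

# For comparison, the naive OLS R-squared (biased up, since ln k and eps covary):
r2_ols = np.corrcoef(ln_k, ln_y)[0, 1] ** 2

print(f"KRC-style R^2: {r2_krc:.3f}")
print(f"naive OLS R^2: {r2_ols:.3f}")
```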

It’s interesting to consider, though, what could rescue capital as an important explanatory variable. KRC use the idea that capital’s share in output is equal to {\beta_1} under perfect competition and constant returns to scale. But what if there is not perfect competition and/or constant returns to scale? There is a neat little relationship that holds if we assume that firms are cost-minimizers. That is

\displaystyle  s_K = \frac{\beta_1}{\mu} \ \ \ \ \ (2)

where {s_K} is capital’s share in output (which KRC say is about 0.3) and {\mu \geq 1} is the markup over marginal cost for firms. {\mu = 1} only under perfect competition, and if there is imperfect competition or increasing returns to scale then markups are greater than one, meaning that the price charged by firms is greater than their marginal cost. From this we see that capital’s share may understate the value of {\beta_1} if {\mu>1}. In particular, if there are increasing returns to scale at the firm level (i.e. fixed costs) but perfect competition (i.e. free entry/exit) then {s_K} still measures the payments to capital accurately, but {\mu} will be greater than one as firms with increasing returns need to charge more than marginal cost in order to cover the fixed costs.

Practically, if {\beta_1 = 0.55}, meaning that {\mu = 1.83}, or a markup of 83%, then the R-squared for capital goes to one. That is, with {\beta_1 = 0.55}, capital perfectly explains the variation in output per worker. Even with {\beta_1 = 0.45}, the R-squared is 0.67, meaning capital explains 2/3 of the variation in output per worker. So a relatively slight adjustment in the value of {\beta_1} changes the conclusion regarding capital’s importance for output levels.

The issues with this line of thinking are (1) if there are increasing returns to scale at the firm level, why don’t we see increasing returns to scale at the aggregate/country level? (2) even if capital explains most of the variation in output per worker, there isn’t any data showing that savings rates actually vary across countries meaningfully. The differences in capital are probably the result of different technologies/institutions, and so those are the more fundamental source of variation.

Technology and Scale

This is a neat write-up by Matt Ridley regarding some research done by anthropologists Michelle Kline and Rob Boyd (ungated original paper here). They collected information on marine foraging technology used by 10 different Pacific Islander tribes at the time they first met Western explorers/colonizers. According to Ridley they assigned scores not only for the number of tools but also for their complexity. “A stick for prying clams from the reef, for example, counted as one techno-unit, whereas a bamboo crab trap with a baited lever counted as 16, because it comprised 16 working parts, each a technology in its own right.” The actual paper can give you a more detailed idea of the method.

The big take-away is that the higher the population, the more complex the technology being used. Hawaii, with 275,000 people, had seven times as many tools, and tools of twice the complexity, as Malekula, which had only 1,100 people. Further, the size of the network mattered. Island tribes that had more connections with other tribes also tended to have more tools and tools of higher complexity.

This is precisely what goes into our standard models of technological innovation. We tend to say something like \dot{A}/A = \theta L/A, so that the growth rate of technology is increasing in population, as the anthropologists found, but decreasing in the level of technology itself. Moreover, that population L need not be limited to a country, but is really the population of those economies that are integrated enough to share ideas. Regardless, the idea that technological change is positively related to population size can seem counter-intuitive the first time you encounter it. But Kline and Boyd’s study gives a really nice demonstration of the power of scale. Simply put, more people means more chances for someone to have an “Aha!” moment, and more people tinkering around with existing ideas.

A model like this has the implication that long-run technological change is proportional to the rate of population growth (of integrated economies). In other words, long-run living standards depend positively on the population growth rate. Population growth may instill some drag on living standards because of fixed resources and/or lower capital/labor ratios, but ultimately the positive effect of population growth on technology wins out.
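A quick simulation makes that long-run property concrete (the parameter values are arbitrary). With \dot{A} = \theta L and population growing at constant rate n, the growth rate of A starts high and converges to n.

```python
# Discrete-time sketch of Adot/A = theta * L / A, i.e. Adot = theta * L.
theta, n = 0.02, 0.01
L, A = 100.0, 1.0
growth_A = []
for t in range(5000):
    A_next = A + theta * L        # Adot = theta * L
    growth_A.append(A_next / A - 1)
    A = A_next
    L *= 1 + n                    # population grows at constant rate n

print(f"growth of A, early:    {growth_A[0]:.4f}")
print(f"growth of A, long run: {growth_A[-1]:.4f}")  # converges to n
```

The intuition falls out of the algebra: along the long-run path A settles at \theta L / n, so the growth rate \theta L / A settles at n.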

I’m absolutely saving this paper to use next time I teach growth at any level.

Growth in Sea Transportation

I saw these incredibly cool images of shipping routes over time (original source is assistant professor Ben Schmidt from Northeastern). They map actual voyages taken, ship by ship, so you can see not just the routes but the density of usage.

First, one showing routes from the mid-19th century:

1860 shipping

Second, here is the same map done with data from 1980-1997.

1990 shipping

Very nice, quick summaries of the expansion in trade over a century. They show the expansion of Asia, in particular, over this period.

Markets and Industrialization

I just finished reading Karl Polanyi’s The Great Transformation, part of an effort to actually read through some classic texts in economics. He gives an account of the rise of capitalism in order to set himself up to describe the crisis in capitalism that he sees around him (he is writing in 1944 and for obvious reasons things don’t seem so rosy at that point).

His ultimate point regarding the role of the gold standard as a kind of organizing principle around which capitalism revolves is dated, but there is some value in thinking about how the flow of funds and trade between countries does have some distinct social implications. However, I found myself struggling with the description of industrial development he lays out as background for his discussion of the gold standard. He begins by highlighting the lack of money-based transactions in pre-industrial Europe, and how this indicates that markets did not exist.

Polanyi makes a common error, which is confusing money as a unit of account with money as a medium of exchange. When we talk about supply and demand, and equilibrium prices and quantities, we almost always talk in terms in some currency (e.g. the number of pizzas is 15 and they each are sold for $10, for a total value of $150). But this is using currency as a unit of account only. It is convenient to talk about dollars, but not necessary. I could just as easily draw supply and demand diagrams, and find out equilibrium prices and quantities using another commodity as the unit of account (e.g. the number of pizzas is 15 and they each are sold for 3 beers, for a total value of 45 beers).

So while modern economies use money to make exchanges in markets, that does not mean that an absence of money implies the absence of a market. You can have a fully functioning market even though no slips of paper or little metal coins are used in any transaction. Most importantly, the logic of supply and demand is perfectly valid even without money being used as the medium of exchange.

Further, the absence of observed exchanges does not imply the absence of a market either. It is quite possible to have a market in which the equilibrium outcome is for everyone to consume their endowment of goods. In fact, I think this is more likely to occur in very simple economies where the number and types of goods are limited. If we all produce one chicken and four sacks of wheat, and we all have similar preferences, then in the end we’ll all end up eating one chicken and four sacks of wheat. If we’re smart, we’ll make waffles and fry the chicken, but that’s a different topic.

In the chicken/wheat economy, there is an implicit price for chicken and a price for wheat, even though we don’t observe any transactions at this price. The absence of transaction data means it is hard to estimate what demand or supply curves look like, and therefore makes it hard to predict what would happen in the event of some kind of demand or supply shock, but the curves are still there.
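The implicit price can be written down explicitly under an assumed set of preferences (the Cobb-Douglas form and the taste parameter below are my invention, purely for illustration). With identical endowments and identical preferences, everyone eats their own endowment in equilibrium, yet the first-order condition still pins down a relative price.

```python
# Endowment economy: identical agents with u = a*ln(chicken) + (1-a)*ln(wheat).
a = 0.5                      # assumed taste for chicken
chicken, wheat = 1.0, 4.0    # everyone's endowment

# In equilibrium everyone consumes their endowment, and the marginal rate of
# substitution at that point is the relative price:
#   p_chicken / p_wheat = (a / (1 - a)) * (wheat / chicken)
rel_price = (a / (1 - a)) * (wheat / chicken)
print(f"implicit price of a chicken: {rel_price:.1f} sacks of wheat")
```

No transaction ever occurs at this price, which is exactly the point: the price exists and would discipline behavior after any shock, even though we never observe trade.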

Polanyi makes much of the shift from non-money to money exchanges, assuming that this suggests a move from autarkic production (all households acting individually) to market production (households taking prices as given). But the absence of money doesn’t mean autarky and an absence of markets, and so his underlying premise just doesn’t hold up. While there are still some fascinating passages to consider in the book, as a functional story of industrial development it isn’t terribly useful.