More on Mathiness

NOTE: The Growth Economics Blog has moved sites. Click here to find this post at the new site.

I managed to not float away this week in the flood in Houston, so I’m back with everyone’s favorite topic: mathiness. I hit on this last week, and there continues to be an on-going discussion about Paul Romer’s original paper on this.

For me, there are two interesting conversations to have about Romer’s critique. The first is about the actual economics of innovation and growth. Ideas mean increasing returns at some level. Is that at the firm level, or just in the aggregate? Is market power over ideas a more useful way of thinking about innovation, or are there frictions or limits to diffusion of ideas? For now, I’m going to hold off on this conversation, because it deserves a thorough discussion and I have to get that more organized in my head before I write anything down.

The second conversation arising from Romer’s paper is about the use of math and language in economics. This is not specific to growth theory, and I’m going to have another go at this conversation in this post.

One of the issues is that Romer’s original concept of “mathiness” was not entirely clear. Having e-mailed a little with Romer on this, I think he will freely admit that he did not get the concept across as clearly as he’d like. His last post on the subject states this outright, and he tries to clarify his position. You should probably read that first if you really want to get your head around this debate. What follows is me thinking out loud about the concept of mathiness.

Pure snark: Let’s start with this interpretation of mathiness. A common response to Romer’s article was that this is some kind of score-settling or point-scoring exercise by someone who is miffed that others are not using his ideas. Essentially, mathiness just means that the authors didn’t take Romer’s comments seriously.

Even if this were true, I don’t care. I don’t care because the conversation about how we use language and math in economics is still worth having, whatever the original motivation. Do I think Romer is just out to settle personal beefs? No. But I’ve had people tell me that I’m being hopelessly naive about this, and that I’m working too hard to find a reasonable nugget at the core of this whole ball of BS.

Well, yeah, I am. I’m not in junior high. I’ve got better things to do than worry about whatever drama lurks beneath the surface here, because it is irrelevant to the interesting questions.

Let’s actually get to some meaningful ideas about mathiness and what it means.

Math and science: My original post on the subject offered an interpretation of mathiness as confusing math with science. In short, just because you can prove that a certain conclusion follows mathematically from certain assumptions, that does not mean that this is how the world works. And while I think that this is an issue, especially in communicating economic research to the public, this is not what Romer was talking about.

The papers he cites specifically do not make these kinds of claims. And while it is possible to misinterpret their findings, a reader mistaking math for science is not something that you can lay at the feet of the authors in these cases.

Decorative math: Another possible interpretation of mathiness is that it refers to what I think of as “decorative math”. A paper may have a simple model, but there are all these adornments added (endogenous savings rates, endogenous labor supply decisions, heterogeneous agents, etc.) even though they have absolutely nothing to do with the simple model and change none of the conclusions. This decorative math actually makes the paper harder to understand, because now you have to keep track of all this additional notation.

I have a recent paper with Remi Jedwab and Doug Gollin, on urbanization and industrialization, that has a very simple model in it. No dynamics, no endogenous productivity growth, and we don’t even bother to write down a utility function. All the intuition we need for the empirical work we do is in this dead simple model. And yet, throughout our experience of submitting this paper to different journals, we were told repeatedly that we’d have to come up with a fancier model (heterogeneous preferences or productivity levels for individuals in different regions, endogenous productivity growth, dynamic decision-making, explicit congestion and agglomeration technologies for cities, etc.) if we wanted to publish this paper in a top journal. We were supposed to decorate the model, I guess to show that we could?

I think this is a frustrating feature of modern economics, but this “decorative math” is not what Romer had in mind, either.

Extreme abstraction: Perhaps “mathiness” refers to something that is almost the opposite of decorative math: extreme abstraction. Chris House’s post on Romer defines mathiness this way. He uses the example of Mankiw, Romer, and Weil (1992) and their Solow-like model that includes both physical and human capital. House wonders if MRW displays mathiness because they assume technology levels are identical across countries and grow exogenously, that the savings and education rates are exogenously given, etc. But I think House is wrong in saying that MRW is an example of mathiness. Romer isn’t arguing against abstraction, as he makes clear in his latest post. He praises the original Solow model for its clarity, despite incredible levels of abstraction. (As an aside, House is clear that adding all the decorative math back into MRW would make it worse.)

Divorcing words and math: I think this is where Romer is going when he discusses the McGrattan and Prescott (2010) paper. This particular post from Romer probably gives the best explanation, as he digs a little deeper into the MP paper for examples of mathiness.

The point of the MP paper is that by failing to accurately measure intangible capital, the BEA falsely finds that there is a difference between the rate of return earned by foreign subsidiaries of US firms (9.4%) and US subsidiaries of foreign firms (3.2%). Okay, cool. That’s a neat problem to think about. MP claim that about 2/3 of that gap in returns is due to mis-measuring intangible capital.

MP use a model to make this claim, and that model needs a production function with constant returns to scale over intangible capital and physical inputs. And I think that if they had said, “We assume there is a stock of intangible capital, X, and a stock of physical inputs, Z, and there are constant returns to scale with respect to these two inputs into production,” that Romer wouldn’t be as bothered. This is abstract. This is hand-waving. We can argue and disagree about that assumption (why are there declining returns to intangible capital?, for example). But it is a relatively clear statement of what is being abstracted from. The words match the math.

What makes Romer’s head explode is that MP don’t just say this. They have a set-up that involves “technology capital” (M), which is a count of the number of “technologies” that are owned by firms in the economy. I guess a technology is something like a firm, as MP use the example later of a technology being Wal-mart or Home Depot. So technology capital is just the number of firms? But there are locations, which I guess are separate markets, and each technology can be operated in each location. What makes a location distinct from another is not ever defined. Oh, and there is also the production function for each technology at each location, which involves the statement “..where A is parameter determining the level of technology..”, and that is presumably different than the prior term “technology” or the term “technology capital”.

What does all this set-up buy you, by the way? A constant returns to scale function over intangible capital and the stock of physical inputs. In the end, MP’s accounting for the discrepancy in rates of return between types of firms has nothing to do with this location/technology thing. It serves only to confuse the situation.

The “mathiness” of the paper comes from the disconnect of the language from the math. The math does not serve to sharply illuminate a piece of intuition, it sows confusion. One example is the word “technology” being used 3 different ways in 2 pages, for no clear purpose. Another is using the word “location” despite there being absolutely no sense of location in any economic transaction in the paper.

So I’m with Romer on this point: it is completely fair to ask for better writing, even from big names like McGrattan and Prescott. Especially from big names, actually, since they are the ones that are going to be read the most. My guess is that we’re too deferential to big names, and excuse this kind of stuff by assuming that we don’t quite get their really deep insight. But we’ve got to expect better; the burden should be on authors to be clear.

Mathiness versus Science in Growth Economics

Paul Romer created a bit of a firestorm over the last week or so with his paper and posts regarding “Mathiness in the Theory of Economic Growth”. I finally was able to sit down and think harder about his piece (and several reactions to it).

Before I get to the substance, let me make two caveats. First, Romer has been relentless in continuing to publish blog posts and tweets about his paper, so I’m kind of hopelessly behind at this point. I will probably make some points that someone has made in response, or talk about something that Romer has already brought up elsewhere. If so, the lack of links or attribution is not intentional. I just haven’t caught up yet. (See DeLong and Wren-Lewis for two responses.)

Second, one of the papers that Romer discusses is by Bob Lucas and Ben Moll. I know Ben a little, as he gave a talk at UH last year. We recently e-mailed regarding Romer’s criticisms of their paper. Some of what I will write below is based off of notes that Ben is writing up as a response. That isn’t to say that this post is a defense of Lucas and Moll, just a disclaimer.

What’s the issue here? Romer says in his paper:

For the last two decades, growth theory has made no scientific progress toward a consensus. The challenge is how to model the scale effects introduced by non-rival ideas… To accommodate them, many growth theorists have embraced monopolistic competition, but an influential group of traditionalists continues to support price taking with external increasing returns. The question posed here is why the methods of science have failed to resolve the disagreement between these two groups.

One thing we have come to a consensus on is that economic growth is driven by innovation, and not simply accumulating physical or human capital. That innovation, though, involves non-rival ideas. Non-rival ideas (e.g. calculus) can be used by anyone without limiting anyone else’s use of the idea. But modeling a non-rival idea is weird for standard economics, which was built around things that are rival (e.g. a hot dog). In particular, allowing for non-rival ideas in production means that we have increasing returns to scale (if you double all physical inputs *and* the number of ideas then you get more than twice as much output). But if we have increasing returns to scale, why don’t we see growth rates accelerating over time? We should be well on our way to infinite output by now with IRS.
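To make the doubling argument concrete, here is a quick worked example. The Cobb-Douglas form is purely my illustrative assumption (it is not taken from Romer’s paper): let ${A}$ be the stock of non-rival ideas, and ${K}$ and ${L}$ the rival inputs.

$\displaystyle Y = A K^{\beta} L^{1-\beta}, \ \ \ 0 < \beta < 1.$

Doubling only the rival inputs doubles output, which is the usual constant returns case:

$\displaystyle A (2K)^{\beta} (2L)^{1-\beta} = 2 A K^{\beta} L^{1-\beta} = 2Y.$

Doubling the ideas as well gives

$\displaystyle (2A) (2K)^{\beta} (2L)^{1-\beta} = 4 A K^{\beta} L^{1-\beta} = 4Y,$

so output more than doubles. Under this assumed form, that is all “increasing returns to scale” means here.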

To answer that, Romer pioneered the use of monopolistic competition in describing ideas. Basically, even though ideas are non-rival, they *are* excludable. Give someone market power over an idea (i.e. a patent) and this allows them to earn profits on that idea. Because not everyone can actually use the idea for free, this keeps the growth rate from exploding. The profits that an owner of an idea earns from people using it are what incent people to come up with more ideas. So these market power models explain why IRS doesn’t result in infinite output, and explain why people would bother to innovate in the first place.

An alternative is that ideas are non-rival, and non-excludable. That is, anyone is capable of adopting them immediately and for free. To keep the economy from exploding to infinite output, in these models you have to build in some friction to the free flow of ideas between people. Yes, you can freely adopt any idea you find, but it takes you a while to find the new ideas lying around. What you can retain in models like this is the idea of price-taking competition. No market power is necessary for any agent in the model.

Romer’s paper then proposes that the lack of consensus on this is due to one side (the latter, price-taking increasing returns group) making arguments for their side not on the basis of scientific evidence, but on “mathiness”. Let’s hold off for a moment on that term.

Is this really a big disagreement? In one sense, yes. You certainly still have papers from both camps in top journals, by top economists. In another sense, no. When it comes to doing any kind of empirical work in growth, there is no question that firm-level, market-power models of innovation that grew out of Romer’s work are the standard. The problem with the price-taking models is that they say nothing about firm dynamics (e.g. entry and exit), and these dynamics are a huge part of growth. With price-taking, there isn’t a reason for any specific firm to exist, and so things like entry and exit aren’t well-defined.

Are there math mistakes? The latter part of Romer’s paper discusses how several recent examples of price-taking models are sloppy in connecting words and math, and how some of them in fact contain mathematical errors. He discusses, in particular, the Lucas and Moll paper and an issue taking a double limit. This is something that Ben broached with me, and his explanation of the issue seems reasonable, in the sense that Lucas and Moll do not seem wrong. But between Romer, Lucas, Moll, and me, I am the last person you should ask about this.

More important, from the perspective of this “mathiness” question, the math mistakes themselves are irrelevant. Romer’s larger point would be worth discussing even if the math were perfect. Pointing out a flaw doesn’t change his argument, and in fact probably detracts from it. He isn’t asking Lucas and Moll (or the others) to simply correct their paper; he wants to change the way they think about doing research.

Doesn’t everyone make silly assumptions? This was Noah Smith’s initial reaction to the mathiness post. The assumptions made by the market power theories are just as impossible to justify as those made by the price-taking theory. The price-taking theory assumes that people just randomly walk around, bump into each other, and magically new ideas spring into existence. The market power theory assumes that people wander into a lab, and then magically new ideas just spring into existence, perhaps arriving in a Poisson-distributed process to make the math easier. Why is the magical arrival of ideas in the lab less fanciful than the magical arrival of ideas from people meeting each other? In the models, they are both governed by arbitrary statistical processes that bear no resemblance to how research actually works.

At their heart, both of these theories have some kind of arbitrary process involved. But that is not Romer’s point. Every theory is going to make some kind of fanciful abstraction regarding the real world. If it didn’t, it wouldn’t be a model, it would be reality.

Okay, smart guy. What is Romer’s point? I think it is this: math is not science.

Here’s how the science on this would work. Collect data on the growth rate, number of innovations produced, and/or productivity growth. Test whether countries/states/firms that operate with price-taking grow at the same rate as those that operate with market power over ideas. If they do grow at the same rate, then you fail to reject the price-taking theory of innovation. You don’t accept it, you fail to reject it. And then you go on your way scrounging for more data to see if that particular test was just a result of sample noise.

If the price-taking market doesn’t grow or produce innovations, then you reject the price-taking theory. And then you go on your way scrounging for more data to see if that particular test was just a result of sample noise.

Let’s say that you fail to reject the price-taking theory. Now what? Now you start pulling out other predictions from both theories, preferably ones where they differ. And you test those predictions. If you could reject the market power theory predictions, but fail to reject the price-taking predictions, then you’d probably conclude that price-taking is the better explanation of innovative activity (but new data could overturn that). And vice versa. I’m not saying this is easy (how do you identify which economies are price-taking versus market power?), but this is how you’d do it.

That’s the science. That doesn’t mean the math is worthless. You have to have the math – the model – in order to come up with the predictions and hypotheses that we’re going to test with the data. Without the model we don’t know how to interpret what we see. Without the model, we don’t know what tests to run. So a paper like Lucas and Moll is useful in allowing us to distinguish what a price-taking world might look like compared to a market power world.

Here is where we reach the crux of Romer’s argument, to me. Most people, including many inside academia who should know better, assume that math equals science. And rather than remind readers that math is not equal to science, authors often play along with that fiction. They play along by using very complicated math – “mathiness” – making their idea look more “science-ish”. They let people believe their model shows how the world does work, rather than how it might work.

[Update 5/21 6:30pm: That’s not a specific indictment of Lucas/Moll, but a restatement of Romer’s argument. And a valid question here is whether we should expect every paper to actively re-state the concept that models are about how the world might work, not how it does work. Moreover, if someone misuses a theory paper, is that the fault of the authors?]

So all these guys are liars? No, I don’t think that Lucas and Moll, for example, are part of some conspiracy. I know Ben is somewhat flummoxed by what their paper did to come under such fire from Romer. They wrote a theoretical paper that strung out the implications of a certain set of assumptions. I think they are perfectly amenable to, and would support, anyone who could come up with good empirical evidence on their model.

How other people use these theories is a different story. Brad DeLong has suggested that the problem with the “mathiness” of these papers is that they allow people to reverse engineer support for their preferred political position. If we have a price-taking competitive economy, then any interference (i.e. taxes or subsidies) will generate deadweight loss. Are we in a price-taking economy? These papers by smart economists show that we *could* be, and if you confuse math with science then you assert that we *are* in a price-taking economy. Hence no interference is justified.

Why can’t we all just get along? Perhaps we should have different models for different situations. Dani Rodrik has made this point before (H/T Israel Arroyo for the link), urging economists to focus on choosing the right model, not trying to shove everything into one grand unified theory. The market power theory is useful in understanding innovation in pharmaceuticals, for example, or innovation in a leading-edge Western country like the U.S.

But the price-taking theory is useful in a situation where the innovation we are talking about is not actually a brand new idea, but rather an existing idea (even an old one) that people have not adopted yet. Think of something like proper fertilizer application among peasant farmers. Some farmers use the fertilizer properly, some don’t. But this isn’t because some farmers have property rights over the knowledge of how to use fertilizer. How long it takes the good practices to diffuse over the whole population of farmers may well be modeled as a series of interactions between farmers over time, with knowledge getting passed along at each step. Taking the arrival of truly new innovations as exogenous may be a reasonable assumption to make for some developing countries.

If Lucas and Moll had framed their theory this way, would that mitigate the mathiness of the paper? I think it might.

Now what? I think Romer’s paper is right that we are not careful enough about distinguishing math from science in economics. It is easy to slip, and I have no doubt that some people take advantage of this slippage to push their viewpoints.

One thing to insist on (of papers you referee, of speakers, or of one’s own work) is that falsifiable predictions are clearly stated. What could the data show that would make your theory wrong? Force the authors to be clear on how science can be used to evaluate your theory. That isn’t to say that every theoretical paper needs to have an empirical test added to it. I am always in favor of smaller, more concise papers. But the follow-up empirical work for your theory should be obvious. Then hope some grad student doesn’t take the bait and prove you wrong.

Trend, Cycles, and Assumptions about Fluctuations

I am kind of veering off of the typical growth topic in this post (although not as badly as the last post). I was playing around trying to write questions for the first-year macro comp next month, and ended up thinking about how we distinguish trend from cycle in GDP. In particular, how do the assumptions built in to how we decompose GDP into trend and cycle tie into how we conceive of fluctuations?

Let’s take the simplest set-up, where log GDP follows a linear trend

$\displaystyle \ln{GDP}_t = \alpha + g t + u_t, \ \ \ \ \ (1)$

and ${g}$ is the growth rate. ${\alpha}$ is the intercept, and it fixes the level of GDP in period 0. What I’m going to say below is all related to this linear trend assumption, but the concepts would follow even if you allowed for some polynomial in ${t}$ on the right-hand side, or if you did some kind of fancy filtering, like Hodrick-Prescott.

How do I do the decomposition? First, I get the estimated values ${\hat{\alpha}}$ and ${\hat{g}}$ from the data by running an ordinary least squares regression. I get the cycle as the deviations of log GDP from the estimated trend using these values. The cyclical deviations are just the residuals of that regression,

$\displaystyle \hat{u}_t = \ln{GDP}_t - \hat{\alpha} - \hat{g}t. \ \ \ \ \ (2)$

Voila. You have a trend/cycle decomposition of GDP. Now that I’ve extracted the trend, I can write down a model to explain business cycles and try to match the information in ${\hat{u}_t}$.
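For anyone who wants to see the mechanics, here is a minimal sketch of that decomposition in Python. The data are simulated, and every number in it (80 periods, a 0.5% trend growth rate, the size of the shocks) is made up for illustration. With real data you would just swap in your own series for `log_gdp`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 80 periods of log GDP around a linear trend.
T = 80
t = np.arange(T)
true_alpha, true_g = 9.0, 0.005
log_gdp = true_alpha + true_g * t + rng.normal(0.0, 0.02, size=T)

# OLS of log GDP on a constant and t, as in equation (1).
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
g_hat, alpha_hat = np.polyfit(t, log_gdp, deg=1)

# Trend and cycle: the cycle is the residual from equation (2).
trend = alpha_hat + g_hat * t
cycle = log_gdp - trend

print(f"g_hat = {g_hat:.4f}, alpha_hat = {alpha_hat:.3f}")
```

The array `cycle` is the ${\hat{u}_t}$ series that a business cycle model would then try to match.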

This seems really sensible as an approach. But there is a lot embedded in this procedure, and I think it has consequences (intended or unintended) for how people think about cycles.

The first mathematical condition for running an ordinary least squares regression is

$\displaystyle \sum_{t=0}^T \hat{u}_t = 0. \ \ \ \ \ (3)$

It always seems like a bit of a throw-away assumption when you teach econometrics. Yes, you say, make sure the errors add up to zero. If that were not true, then we could just adjust the intercept term until that assumption was true.
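For the record, here is where that condition comes from, in the notation of (1) and (2). OLS picks ${\hat{\alpha}}$ and ${\hat{g}}$ to minimize the sum of squared residuals,

$\displaystyle \min_{\hat{\alpha}, \hat{g}} \sum_{t=0}^T \left( \ln{GDP}_t - \hat{\alpha} - \hat{g} t \right)^2.$

Taking the derivative with respect to ${\hat{\alpha}}$ and setting it to zero gives

$\displaystyle -2 \sum_{t=0}^T \left( \ln{GDP}_t - \hat{\alpha} - \hat{g} t \right) = 0 \ \ \Rightarrow \ \ \sum_{t=0}^T \hat{u}_t = 0.$

So as long as the regression includes an intercept, the residuals sum to zero by construction, no matter what the underlying data look like.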

But this is not an innocuous assumption from an economic standpoint. If you use this when you try to estimate trend GDP, then you are asserting that deviations of GDP from trend must by necessity cancel out over time. That is, after you have estimated trend GDP and recovered your cyclical component (${\hat{u}_t}$), the booms must be exactly offset by busts.

If you build a theory of business cycles around this trend/cycle decomposition, then you are limited to a theory that only admits symmetrical deviations. You are pushed towards using symmetrical, (log)-normally distributed shocks to create cycles, for example.

You are also nudged towards treating “booms” as a pathology similar in every aspect to “busts”, only with the signs reversed. In particular, you are pushed towards the belief that busts are necessary to offset the booms. It suggests that we must “pay for” the excesses of the boom period with lower GDP in some other period.

This is wrong. Statistically, the fact that the best way to fit a line requires deviations to add up to zero does not mean that booms and busts must be perfectly symmetric. The economy does not have a lifetime budget constraint, which is what this symmetry implies. It has a dynamic budget constraint which simply says that real spending in one period has to add up to real production in that period. But the fact that you have a dynamic budget constraint does not mean that you can roll this up into a fixed lifetime budget constraint.

An example is helpful here. I have a dynamic budget constraint relating my real expenditure of calories on a given day to my real supply of calories on that day. The expenditure is all my basic metabolism plus whatever I burn going to the gym. The supply is whatever I eat plus the stock of calories I’ve got stored up (i.e. the flabby parts). The dynamic budget constraint says that the calories I burned today at the gym have to come from somewhere.

But this dynamic budget constraint doesn’t have any implication for how much I can burn over the course of my life. Now that classes are over, I’ve gone to the gym a few extra mornings, and expended more calories than normal. Does that mean I – by necessity – have to exercise less at some point in the future? Of course not. If I have just made a fundamental change in my exercise habits, then I can continue to hit the gym 5 days a week rather than 3 until I die. I can stay “above trend” forever. Similarly, if I decide to say “f*** it” and stop going to the gym, my calorie expenditure will fall below trend. And it can stay there forever. If I don’t exercise today, there is nothing about my dynamic budget constraint that requires me to go exercise tomorrow to make up for it. On this point many an exercise plan has failed.

GDP is like calorie expenditures. Yes, real expenditures must add up to real production in a given period. Great. But that doesn’t mean that GDP must conform to some infinite-period constraint. So if GDP is “above trend” for a while, that does not imply that it must fall “below trend” in order to balance that infinite-period constraint. Similarly, if we fall “below trend” for a while, there is nothing that requires us to necessarily have a boom in order to make up for the lost production. There is no lifetime budget constraint for GDP.

Back to the decomposition. The assumption that

$\displaystyle \sum_{t=0}^T \hat{u}_t = 0 \ \ \ \ \ (4)$

says that there is a lifetime budget constraint. It says that all deviations above trend must be precisely and exactly offset by deviations below trend.

But that need not be the right assumption. Remember, our goal as economists is not to minimize the sum of squared residuals here, but to explain economic fluctuations from trend. So why not assume

$\displaystyle \sum_{t=0}^T \hat{u}_t = -.03 \times T, \ \ \ \ \ (5)$

which would imply that the typical period is 3% below trend. That is, booms are more than offset over time by busts. You could set the summation to a larger number, and get that the economy is continually below trend, and never experiences a boom. Why not? Friedman proposed a “plucking model” of fluctuations, where there are occasional negative deviations from trend/potential GDP, but these are not necessarily offset by symmetrical booms.
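Here is a rough sketch of what that looks like, again in Python with simulated data. I am assuming a crude plucking-style process (my own toy version, not Friedman’s actual specification): log GDP sits at potential in most periods and is occasionally pulled below it, never above. The pluck probability and size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

T = 200
t = np.arange(T)
potential = 9.0 + 0.005 * t  # hypothetical potential (ceiling) for log GDP

# One-sided "plucks": zero in most periods, negative in about 30% of them.
plucks = -rng.exponential(0.04, size=T) * (rng.random(T) < 0.3)
log_gdp = potential + plucks

# Standard OLS trend: residuals sum to zero whether or not the shocks do.
g_hat, alpha_hat = np.polyfit(t, log_gdp, deg=1)
u_ols = log_gdp - (alpha_hat + g_hat * t)

# Alternative normalization: keep the OLS slope, but shift the intercept so the
# average deviation is -3%, in the spirit of equation (5).
target_mean = -0.03
alpha_alt = alpha_hat - target_mean
u_alt = log_gdp - (alpha_alt + g_hat * t)

print(f"mean OLS residual:  {u_ols.mean():.4f}")
print(f"mean alt deviation: {u_alt.mean():.4f}")
print(f"share of periods above OLS trend: {(u_ols > 0).mean():.2f}")
```

In this simulated economy GDP is never above potential, yet the OLS decomposition reports plenty of periods “above trend”. Those booms come entirely from the zero-sum normalization. The alternative intercept is just as arbitrary, which is the point: the data alone do not pin down the right normalization.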

The point here is that the statistical techniques used to recover measures of cyclical deviations embed an assumption that is not true. The GDP of an economy is not subject to a lifetime budget constraint. Therefore, booms and busts are not required to cancel each other out. What is the right assumption to make? I have no idea. Maybe it’s the -3% assumption I mentioned above. Maybe it’s -2%, or +5%.

I am sure that someone can tell me that “there is literature on this!!” already. Which is great. But that literature is not part of the standard toolkit that I am familiar with for first-year graduate macro. And I have downloaded and read a lot of lecture notes from first-year courses. I’ve never seen this discussed. Happy to see or hear of alternatives.

So I Wrote a Book

This is going to be the king of off-topic posts.

I wrote a book. A fiction book for kids. You can find it on Amazon here, in either Kindle or paperback form. I didn’t go through a publisher, and just used Amazon’s self-publishing option. I get enough rejection letters from editors of journals, and I didn’t need one more group of people telling me that what I wrote isn’t what they are looking for.

The book is a mystery/fantasy adventure story, and I wrote it for my girls. They’re 11 and 9 now, and I’d say the book is good for kids 3rd-7th grade. But you could read it to kids as young as 1st grade or kindergarten. I know there are a ton of advanced elementary school students who read this blog regularly, so I figure the publicity can’t hurt.

For my academic co-authors out there, don’t worry. I wrote the book over a couple years, a few hours at a time. (For any UH readers, Friday afternoons post-Treebeard’s were prime time for writing.) I’m not sure if that made it easier or harder. A couple times I lost the thread of what I was writing because I missed a week. But the regular, limited schedule also made it relatively easy to keep going. I knew I only needed to work those few hours at a time. It never ended up feeling like a burden.

Most of the motivation was how much my daughters love books like this. And a little bit of it came from reading a few of their books and thinking, “That’s not very good. I could write that.” After a while, you realize that you’d better actually do it, or just shut up. So I gave it a shot.

I will say that having done this, I have a greatly enhanced opinion of those who write fiction books like this, even those who write the bad ones that motivated me. Whatever the quality, these people finished a story. Coming up with an idea for a story is easy. But finishing the story, filling in all the details, organizing the flow of it, wrapping it up, and keeping it consistent, that is very hard. Finishing is an accomplishment, so I’ve learned to ease up on my criticism of books that do not thrill me. At least those authors did it.

And let me tell you, putting fiction out there is terrifying compared to submitting or presenting research. I can usually fall back on some kind of literature to support my research question or assumptions when I do research, but with fiction it is utterly on me. If this book is stupid, then it is because I wrote a stupid story. There is no hiding from that, and it probably explains why it took me close to a year to publish this book.

But my girls and a few of their friends who have read it enjoyed the book, and that finally got me to put it out there. If you know a kid who might like it, or if (like me), you read a lot of your kid’s books, take a look. If you like it, nice reviews are greatly appreciated. If you don’t like it, then you are an awful person, and probably like to kick puppies or cut people off on the highway.

Do Agricultural Conditions Matter for Institutions?

Last weekend there was a conference at Brown University on long-run determinants of growth (see program here). I unfortunately did not get to attend this year, so I am vicariously attending by reading some of the papers.

One that particularly caught my eye is by Roy Elis and Stephen Haber, on “Climate, Geography, and the Evolution of Economic and Political Institutions”. As you can probably tell from the title, this is not a paper shy about swinging for the fences.

Let’s get one point out of the way first. Yes, of course agricultural conditions matter for institutions. All sorts of household and village institutions are going to be built around agricultural conditions. Share-cropping norms, tenancy concepts, risk-sharing arrangements, etc. are all going to be keyed in to the specifics of agricultural production in different areas. Elis and Haber are, like I said, swinging for the fences. They are talking about broader institutions like the degree of democracy and the overall level of property rights protection. “Institutions” in the big-think historical literature sense.

They do a great job of explaining their ideas in manageable language, so your best bet is to read the paper yourself. But I can give some flavor of what they propose. Their idea is that economic and political institutions depend, to some extent, on the natural level of transactions that are supported by the hinterland surrounding the core area of different economies. Think of the Nile valley, or the Yangtze, or the Paris basin.

The level of transactions increases as the cost of transportation falls, as the hinterland is effectively larger. So places that have relatively flat terrains and lots of waterways will have larger hinterlands. They have a great map showing the difference between New York and Mexico City on this.

Elis and Haber worked out how far you could get from each city using 40 megajoules of energy. This is the amount of energy that it would take to transport a metric ton of goods 50 miles on flat land using a wagon. The exact number 40 isn’t terribly important here. What matters is that they are using a fixed energy budget and then looking out from each city for the distance that energy budget gets you.
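To make the fixed-budget idea concrete, here is a minimal sketch of the calculation. The 40 megajoules and the 50 flat-land wagon miles come from the description above. The terrain multipliers and route segments are made-up numbers for illustration, and are not Elis and Haber’s actual cost surface.

```python
# Numbers taken from the text: 40 MJ moves one metric ton 50 miles by wagon on flat land.
BUDGET_MJ = 40.0
FLAT_WAGON_MILES = 50.0
FLAT_COST_PER_MILE = BUDGET_MJ / FLAT_WAGON_MILES  # MJ per ton-mile on flat land


def reachable_miles(segments, budget_mj=BUDGET_MJ):
    """Distance one metric ton can travel along a route before the budget runs out.

    segments: list of (length_in_miles, cost_multiplier) pairs, ordered outward
    from the city. A multiplier of 1.0 means flat wagon road; the other
    multipliers used below are hypothetical.
    """
    remaining = budget_mj
    travelled = 0.0
    for length, multiplier in segments:
        cost_per_mile = FLAT_COST_PER_MILE * multiplier
        segment_cost = length * cost_per_mile
        if segment_cost >= remaining:
            # The budget runs out partway through this segment.
            return travelled + remaining / cost_per_mile
        remaining -= segment_cost
        travelled += length
    return travelled  # the route ended before the budget did


# Hypothetical routes: one flat, one that hits steep terrain after 10 miles.
flat_route = [(100.0, 1.0)]
mountain_route = [(10.0, 1.0), (100.0, 4.0)]

print("flat route:", reachable_miles(flat_route))
print("mountain route:", reachable_miles(mountain_route))
```

Repeating this along every direction out of a city traces out the edge of its hinterland. Cheap terrain (or waterways, which would get a multiplier below 1.0 in this setup) pushes that edge out, and expensive terrain pulls it in.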

As you can see, the hinterlands of New York are much wider than those of Mexico City (yes, the maps are on the same scale). Effectively, New York could draw on an area from Boston to DC, while in Mexico City the area available for growing crops that could be transported into the city is much narrower. So New York naturally had more transactions available or possible, in that sense. A larger agricultural area was capable of transacting with New York.

The second aspect of the level of transactions is the degree to which shocks are correlated spatially within that hinterland. If when things go bad, *everyone* has it bad, then there is little scope for transacting. Think of Egypt’s Nile Valley. If the monsoons failed in central Africa then there was no flood and no one had any crops to trade. In contrast, places with rain-fed agriculture were more likely to have uncorrelated shocks (it rained on my farm, but not on yours) and so more transactions were possible.

The last aspect of transactions is the ability to store crops. The more storage, the more transacting could take place. And perhaps more importantly, transacting over *time* could take place. In tropical places where crops rot quickly (i.e. in days), you don’t have the opportunity to engage in long-term contracts.

What Elis and Haber propose is that places with more “natural transactions” were more likely to develop democratic or representative institutions, and more likely to develop economic institutions that support growth. Lots of natural transactions lead to “Transactional States” that support specialization, property rights, democracy, etc. Why? Well, that is kind of left in a black box. They don’t explain exactly why anyone in charge in these areas wouldn’t just levy confiscatory taxes on all those transactions, or why they wouldn’t take over the entire hinterland and internalize the transactions.

An alternative is “Insurance States”, which have few natural transactions because they have correlated shocks (e.g. the Nile). The state develops in order to smooth consumption, and that requires relatively heavy-handed taxation, autocracy to enforce that taxation, etc. Again, why it necessarily follows that this is *bad* for growth in the long-run is unclear. Why doesn’t the need to coordinate insurance across time lead to a high-functioning state with the ability to support wider economic activity? It certainly seems possible that a democratic society could choose to organize a deep welfare state. Denmark is still a country, right?

Leave their particular interpretation aside, and just look at the empirics. They use cross-country data, and regress the Polity score of a country (rescaled to 0 for autocracy and 100 for democracy) on several measures of “natural transactions”. They find that having a hinterland (around the main city in a country) that is more productive in cereals is positively related to Polity scores today. That positive effect, though, disappears if the hinterland is in a tropical area where crop storage is not possible. They also find that hinterlands with more severe shocks (as coded by drought or overly wet conditions, and indicating highly correlated shocks within an area) have lower Polity scores.
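One way to write down the kind of regression described here is below. This is my own shorthand, and the variable names are assumptions: the paper’s exact specification, controls, and variable definitions are not reproduced in this post.

```latex
\text{Polity}_i = \alpha
  + \beta_1 \, \text{Cereal}_i
  + \beta_2 \, (\text{Cereal}_i \times \text{Tropical}_i)
  + \beta_3 \, \text{Shocks}_i
  + \varepsilon_i
```

In this notation, the results just described would correspond to $\beta_1 > 0$ (more productive cereal hinterlands go with higher Polity scores), $\beta_1 + \beta_2 \approx 0$ (the effect disappears where storage is not possible), and $\beta_3 < 0$ (more severe shocks go with lower Polity scores).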

If you look at GDP per capita and/or years of schooling regressed on the same variables, then you find that cereal production in the hinterland is highly correlated with both. While the point estimate suggests that this effect is smaller in tropical areas without storage, it is insignificant. The number of severe shocks also has no effect on GDP or schooling. So the most you can say about this result is that there is evidence that more productive hinterlands are associated with more economic development and education today.

I’m not sure that these second results should be taken as evidence that Elis and Haber’s story about institutions is correct. My mental null hypothesis here is that higher agricultural productivity, because of low income elasticities for food, means more industrialization/urbanization/non-ag production. If non-ag production has some inherent advantage, or is subject to some positive agglomeration effects, then this means places with more non-ag workers will be richer. It doesn’t necessarily have to be about institutions at all. It’s just about climate and geographically determined productivity levels. Their results on GDP per capita and education seem entirely consistent with my null hypothesis.

The Polity result is more interesting, in that sense, because there is no obvious connection between agricultural productivity level and the type of government you end up with. The results seem to suggest that there is an association worth exploring. But what is the mechanism through which this works? Elis and Haber, as I mentioned above, don’t give a model of *why* this connection should work this way. It need not be that these agricultural conditions had anything to do with institutions themselves. Perhaps higher ag. productivity led to more industrialization, which in turn leads to higher demand for better institutions.

It need not even be the case that there is any connection of geography/climate and institutions at all, even given their results. It certainly seems plausible that institutions could be spatially correlated, as people and ideas move slowly across space. So a few institutional lucky breaks (the Magna Carta?) that hit in places with certain geographic characteristics will spread most rapidly through other similar areas. And then we end up with good institutions in places that have hinterlands with uncorrelated shocks and high productivity, even though the geography and climate have absolutely nothing to do with institutions themselves.

The Connection of Urbanization with Growth

Paul Romer has a nice post up about how urbanization “passes the Pritchett test” for development. Pritchett’s test is that urbanization (in this case) is related both in the cross-section and the time-series to living standards, and positive shocks to urbanization are associated with higher living standards. So Romer argues that we should be studying urbanization as a route towards higher living standards in developing countries. This jibes to some degree with his charter city concept, which proposes establishing new cities with functioning institutions in developing areas. There isn’t really anything to argue with about the rough correlations in the data. Urbanization is, and has been, associated with higher living standards for a long time.

But there are some subtleties in those relationships that mean simply urging everyone to flood into cities is not necessarily wise. Everything I’m going to talk about now is based on joint work of mine with Remi Jedwab, who studies urbanization in developing countries very deeply.

The first caveat I’ll point out is that the absolute pace of urbanization matters a lot. Moving an extra 10,000 people into a city in a year may improve productivity in that city and overall in the country. Moving 1,000,000 people into the same city in a year will probably generate such awful congestion costs that productivity in that city falls and country-level productivity may be lower. Remi and I lay out a simple model of this in a working paper that we have out (and which is being furiously revised right now). We show that if the absolute growth of city population is too large, then city wages will actually get pushed down due to the overwhelming congestion effects, even if there is some exogenous technological progress. That part isn’t incredibly shocking. What we then do is show that if population growth is endogenous, and rises as wages get lower, then too-rapid city population growth pushes a city into what we call a poor mega-city equilibrium. The city gets stuck with low wages and high population growth, and cannot overcome the congestion costs of that growth. We explain the arrival of poor mega-cities like Dhaka, Lagos, and Karachi as a kind of perverse result of the mortality transition after World War II, as it raised the absolute growth of cities beyond a critical threshold. Cities like these grow by 400,000 or 500,000 residents per year, while historically cities like New York or London only grew – at their peak – by maybe 200,000 per year. Urbanization that happens too rapidly can have counter-productive results.
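A toy version of the first step of that argument can be sketched in a few lines. To be clear, this is not the model from the working paper: the functional form, the 2% technology growth rate, and the congestion parameter are all assumptions I made up for illustration, and population growth is held fixed here rather than responding to wages.

```python
def simulate_city_wage(pop_growth_per_year, years=20, wage=100.0,
                       tech_growth=0.02, congestion_per_100k=0.01):
    """Wage path for a city that adds a fixed number of residents each year.

    Each year wages rise with exogenous technology (tech_growth) and fall
    with a congestion drag proportional to the absolute inflow of people.
    All parameter values are hypothetical.
    """
    path = [wage]
    for _ in range(years):
        congestion_drag = congestion_per_100k * pop_growth_per_year / 100_000
        wage *= 1 + tech_growth - congestion_drag
        path.append(wage)
    return path


# A modest inflow versus a mega-city-sized inflow.
slow = simulate_city_wage(pop_growth_per_year=100_000)
fast = simulate_city_wage(pop_growth_per_year=500_000)

print("wage after 20 years, slow inflow:", round(slow[-1], 1))
print("wage after 20 years, fast inflow:", round(fast[-1], 1))
```

With these made-up numbers, wages rise when the city adds 100,000 people a year and fall when it adds 500,000, even though technology improves at the same rate in both cases. The poor mega-city equilibrium needs the additional feedback from low wages to faster population growth, which this sketch leaves out.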

The second caveat is that what drives urbanization matters. Remi and I, along with Doug Gollin, have a paper on urbanization and natural resources. If you look across countries, as Romer does, then there is a clear relationship of GDP per capita and urbanization rates. However, urbanization rates are not necessarily correlated with industry or tradable service production.

The figure above shows the lack of a firm relationship, and this shows up if you use just manufacturing, just manufacturing and finance, or some other reasonable definition of what constitutes tradable goods and services. There are lots of countries in the world that have high urbanization rates, but are not industrialized, and they tend to be resource exporters. And this isn’t just places like Dubai. Angola – a major oil exporter – has an urbanization rate equal to China’s. We document that natural resource exports are a significant driver of urbanization. We even have a neat little diff-in-diff type specification that looks at discoveries of resources and shows that urbanization rates jump in the decade after the discovery. Perhaps more important, though, we show that cities in places that urbanize because of natural resource booms have very different urban workforces than typical “industrial” urbanizers. Cities in places like Angola have a big percentage of their urban workforce in personal services and small-scale retail trade, and few people in industry or high-value services. This contrasts with China, where the urban workforce has a huge percentage of people in sectors that produce tradable goods or services (e.g. finance). The point is that urbanization is not homogeneous. What drives urbanization matters, in that it determines what sectors people in those urban areas end up working in.

The last caveat kind of takes off from the second. Urbanization has been related to higher living standards over much of history, but that doesn’t mean it always will be. Remi and I did a survey paper on the relationship of urbanization and GDP per capita over time. Yes, they are positively related in every year we look at, going back to 1500. But that doesn’t mean that urbanization rates have increased primarily because countries have gotten richer.

What we see in the data is that urbanization rates have shifted higher at every level of GDP per capita over time. A country with GDP per capita of $1,000 had an urbanization rate of about 10-15% in 1500, but by 2010 a country with the same GDP per capita would be between 35% and 50%. Most urbanization over history has occurred not because of countries getting richer, but simply because urbanization has gone up everywhere. One implication is that the positive relationship of urbanization and living standards can only go down in the future. Rich countries are maxed out at urbanization rates of 100%. So if poorer countries continue to urbanize, then the relationship of GDP per capita and urbanization has to fall.
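The arithmetic behind that last implication is simple enough to sketch. The 100% ceiling and the $1,000 income level come from the paragraph above. The rich-country income level and the specific urbanization rates for the poor country are hypothetical values chosen for illustration (12.5% and 40% sit inside the ranges quoted above, and 60% is an imagined future value).

```python
import math


def urbanization_slope(urb_poor, urb_rich, gdp_poor, gdp_rich):
    """Percentage points of urbanization per log point of GDP per capita,
    measured between one poor and one rich country."""
    return (urb_rich - urb_poor) / (math.log(gdp_rich) - math.log(gdp_poor))


GDP_POOR = 1_000    # from the text
GDP_RICH = 30_000   # hypothetical rich-country income
URB_CAP = 100.0     # rich countries cannot go above 100% urban

slopes = []
for urb_poor in (12.5, 40.0, 60.0):
    slope = urbanization_slope(urb_poor, URB_CAP, GDP_POOR, GDP_RICH)
    slopes.append(slope)
    print(f"poor-country urbanization {urb_poor:>5.1f}% -> slope {slope:.1f}")
```

Holding incomes fixed, every increase in the poor country’s urbanization rate shrinks the gap to the ceiling, so the measured slope between urbanization and GDP per capita falls.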

The over-arching point is that the positive relationship between urbanization and living standards we see in existing data is an equilibrium relationship, not necessarily a causal one. There are plausibly negative impacts of too-rapid urbanization on living standards. And Romer is careful in his post not to make any kind of strong causal claim. He thinks we should be studying urbanization more carefully to try and understand what exactly it is that generates the positive relationships. I’d strongly agree with that. I’d like to think that Remi and Doug and I have given some clues towards an answer, perhaps just by pointing out things that are not responsible for the positive relationship.