Measuring Misallocation across Firms

NOTE: The Growth Economics Blog has moved sites. Click here to find this post at the new site.

One of the most active areas of research in macro development (let’s not call it growth economics, I guess) is misallocation. This is the idea that a major explanation for why some countries are rich, while others are poor, is that rich countries do a good job of allocating factors of production efficiently across firms (or sectors, if you like). Poor countries do a bad job of this, so that unproductive firms use up a lot of labor and capital, while high-productivity firms are too small.

One of the baseline papers in this area is by Diego Restuccia and Richard Rogerson, who showed that you could theoretically generate big losses to aggregate measures of productivity if you introduced firm-level distortions that acted like subsidies to unproductive firms and taxes on productive firms. This demonstrated the possible effects of misallocation. A paper by Hsieh and Klenow (HK) took this idea seriously, and applied the basic logic to data on firms from China, India and the U.S. to see how big of an effect misallocations actually have.

We just went over this paper in my grad class this week, and so I took some time to get more deeply into the paper. The one flaw in the paper, from my perspective, was that it ran backwards. That is, HK start with a detailed model of firm-level activity and then roll this up to find the aggregate implications. Except that I think you can get the intuition of their paper much more easily by thinking about how you measure the aggregate implications, and then asking yourself how you can get the requisite firm-level data to make the aggregate calculation. So let me give you my take on the HK paper, and how to understand what they are doing. If you’re seriously interested in studying growth and development, this is a paper you’ll need to think about at some point, and perhaps this will help you out.

This is dense with math and quite long. You were warned.

What do HK want to do? They want to compare the actual measured level of TFP in sector {s}, {TFP_s}, to a hypothetical level of TFP in sector {s}, {TFP^{\ast}_s}, that we would measure if we allocated all factors efficiently between firms.

Let’s start by asking how we can measure {TFP_s} given observable data on firms. This is

\displaystyle  TFP_s = \frac{Y_s}{K_s^{\alpha}L_s^{1-\alpha}}, \ \ \ \ \ (1)

which is just measuring {TFP_s} for a sector as a Solow residual. {TFP_s} is not a pure measure of “technology”, it is a measure of residual productivity, capturing everything that influences how much output ({Y_s}) we can get from a given bundle of inputs ({K_s^{\alpha}L_s^{1-\alpha}}). It includes not just the physical productivity of individual firms in this sector, but also the efficiency of the distribution of the factors across those firms.

Now, the issue is that we cannot measure {Y_s} directly. For a sector, this is some kind of measure of real output (e.g. units of goods), but there is no data on that. The data we have is on revenues of firms within the sector (e.g. dollars of goods sold). So what HK are going to do is use this revenue data, and then make some assumptions about how firms set prices to try and back out the real output measure. It’s actually easier to see in the math. First, just write {TFP_s} as

\displaystyle  TFP_s =\frac{P_s Y_s}{K_s^{\alpha}L_s^{1-\alpha}}\frac{1}{P_s} = \overline{TFPR}_s \frac{1}{P_s} \ \ \ \ \ (2)

which just multiplies and divides by the price index for sector {s}. The first fraction is revenue productivity, or {\overline{TFPR}_s}, of sector {s}. This is a residual measure as well, but measures how productive sector {s} is at producing dollars, rather than at producing units of goods. The good thing about {\overline{TFPR}_s} is that we can calculate this from the data. Take the revenues of all the firms in sector {s}, and that is equal to total revenues {P_s Y_s}. We can add up the reported capital stocks across all firms, and labor forces across all firms, and get {K_s} and {L_s}, respectively. We can find a value for {\alpha} based on the size of wage payments relative to revenues (which should be close to {1-\alpha}). So all this is conceptually measurable.
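To make that measurement concrete, here is a minimal sketch in Python. The firm numbers are made up purely for illustration, the function names are mine, and backing out {\alpha} from the wage bill is only the rough rule of thumb described above, not HK's actual calibration.

```python
def alpha_from_wage_bill(wages, revenue):
    """Rough capital share: one minus the sector wage bill over sector revenue."""
    return 1.0 - sum(wages) / sum(revenue)


def sector_tfpr(revenue, capital, labor, alpha):
    """Sector revenue productivity: P_s Y_s / (K_s^alpha * L_s^(1 - alpha))."""
    total_revenue = sum(revenue)  # P_s Y_s, total revenues in the sector
    k_s = sum(capital)            # K_s, capital summed over firms
    l_s = sum(labor)              # L_s, labor summed over firms
    return total_revenue / (k_s ** alpha * l_s ** (1.0 - alpha))


# Hypothetical three-firm sector (illustrative numbers, not real data).
revenue = [120.0, 80.0, 50.0]
wages = [80.0, 55.0, 32.5]
capital = [40.0, 35.0, 25.0]
labor = [10.0, 8.0, 7.0]

alpha = alpha_from_wage_bill(wages, revenue)
tfpr_bar = sector_tfpr(revenue, capital, labor, alpha)
print(alpha, tfpr_bar)
```

Nothing here needs prices or physical output, which is the whole point of working with revenue productivity first.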

The second fraction is one over the price index {P_s}. We do not have data on this price index, because we don’t know the individual prices of each firm’s output. So here is where the assumptions regarding firm behavior come in. HK assume a monopolistically competitive structure for firms within each sector. This means that each firm has monopoly power over producing its own brand of good, but people are willing to substitute between those different brands. As long as the brands aren’t perfectly substitutable, then each firm can charge a price a little over the marginal cost of production. We’re going to leave aside the micro-economics of that structure for the time being. For now, just trust me that if these firms are monopolistically competitive, then the price index can be written as

\displaystyle  P_s = \left(\sum_i P_i^{1-\sigma} \right)^{1/(1-\sigma)} \ \ \ \ \ (3)

where {P_i} are the individual prices from each firm, and {\sigma} is the elasticity of substitution between different firms’ goods.
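If it helps to see equation (3) as something you could compute (assuming you somehow had the firm prices), a small Python sketch follows. The function name and the example prices are hypothetical.

```python
def ces_price_index(prices, sigma):
    """CES price index from equation (3): (sum_i P_i^(1 - sigma))^(1 / (1 - sigma))."""
    if sigma == 1.0:
        raise ValueError("sigma must differ from 1 for this formula")
    return sum(p ** (1.0 - sigma) for p in prices) ** (1.0 / (1.0 - sigma))


# Sixteen hypothetical firms, each charging a price of 2, with sigma = 5.
index_many_firms = ces_price_index([2.0] * 16, sigma=5.0)
# A single firm charging the same price.
index_one_firm = ces_price_index([2.0], sigma=5.0)
print(index_many_firms, index_one_firm)
```

With identical prices the index falls as the number of firms rises, which is the usual love-of-variety feature of this kind of aggregator.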

Didn’t I just say that we do not observe those individual firm prices? Yes, I did. But we don’t need to observe them. For any individual firm, we can also think of revenue productivity as opposed to their physical productivity, denoted {A_i}. That is, we can write

\displaystyle  TFPR_i = P_i A_i. \ \ \ \ \ (4)

The firm’s productivity at producing dollars ({TFPR_i}) is the price they can charge ({P_i}) times their physical productivity ({A_i}). We can re-arrange this to be

\displaystyle  P_i = \frac{TFPR_i}{A_i}. \ \ \ \ \ (5)

Put this expression for firm-level prices into the price index {P_s} we found above. You get

\displaystyle  P_s = \left(\sum_i \left[\frac{TFPR_i}{A_i}\right]^{1-\sigma} \right)^{1/(1-\sigma)} \ \ \ \ \ (6)

which depends only on firm-level measures of {TFPR_i} and physical productivity {A_i}. We no longer need prices.

For the sector level {TFP_s}, we now have

\displaystyle  TFP_s = \overline{TFPR}_s \frac{1}{P_s} = \frac{\overline{TFPR}_s}{\left(\sum_i \left[\frac{TFPR_i}{A_i}\right]^{1-\sigma} \right)^{1/(1-\sigma)}}. \ \ \ \ \ (7)

At this point, there is just some slog of algebra to get to the following

\displaystyle  TFP_s = \left(\sum_i \left[A_i \frac{\overline{TFPR}_s}{TFPR_i}\right]^{\sigma-1} \right)^{1/(\sigma-1)}. \ \ \ \ \ (8)

If you’re following along at home, just note that the exponents involving {\sigma} flipped sign, and that can hang you up on the algebra if you’re not careful.
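For anyone who wants the slog spelled out, here is one way to get from (7) to (8). It uses only two facts: {x^{1-\sigma} = (1/x)^{\sigma-1}}, and {-1/(1-\sigma) = 1/(\sigma-1)}.

```latex
\begin{aligned}
TFP_s &= \overline{TFPR}_s \left(\sum_i \left[\frac{TFPR_i}{A_i}\right]^{1-\sigma}\right)^{-1/(1-\sigma)} \\
      &= \overline{TFPR}_s \left(\sum_i \left[\frac{A_i}{TFPR_i}\right]^{\sigma-1}\right)^{1/(\sigma-1)} \\
      &= \left(\overline{TFPR}_s^{\,\sigma-1} \sum_i \left[\frac{A_i}{TFPR_i}\right]^{\sigma-1}\right)^{1/(\sigma-1)} \\
      &= \left(\sum_i \left[A_i \frac{\overline{TFPR}_s}{TFPR_i}\right]^{\sigma-1}\right)^{1/(\sigma-1)}
\end{aligned}
```

The third line just moves {\overline{TFPR}_s} inside the outer parentheses, and the last line moves it inside the sum, which is allowed because it does not depend on {i}.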

Okay, so now I have this description of how to measure {TFP_s}. I need information on four things. (1) Firm-level physical productivities, {A_i}, (2) sector-level revenue productivity, {\overline{TFPR}_s}, (3) firm-level revenue productivities, {TFPR_i}, and (4) a value for {\sigma}. Of these, we can appeal to the literature and assume a value of {\sigma}, say something like a value of 5, which implies goods are fairly substitutable. We can measure sector-level and firm-level revenue productivities directly from the firm-level data we have. The one big piece of information we don’t have is {A_i}, the physical productivity of each firm.

Before describing how we’re going to find {A_i}, just consider this measurement of {TFP_s} for a moment. What this equation says is that {TFP_s} is a weighted sum of the individual firm level physical productivity terms, {A_i}. That makes some sense. Physical productivity of a sector must depend on the productivity of the firms in that sector.

Mechanically, {TFP_s} is a concave function of all the stuff in the parentheses, given that {1/(\sigma-1)} is less than one (which holds whenever {\sigma > 2}, as with a value of 5). Meaning that {TFP_s} goes up as the values in the summation rise, but at a decreasing rate. More importantly, for what HK are doing, this implies that the greater the variation in the individual firm-level terms of the summation, the lower is {TFP_s}. That is, you’d rather have two firms that have similar productivity levels than one firm with a really big productivity level and one firm with a really small one. Why? Because we have imperfect substitution between the output of the firms. Which means that we’d like to consume goods in somewhat rigid proportions (think Leontief perfect complements). For example, I really like to consume one pair of pants and one shirt at the same time. If the pants factory is really, really productive, then I can get lots of pants for really cheap. If the shirt factory is really unproductive, I can only get a few shirts for a high price. To consume pants/shirts in the desired 1:1 ratio I will end up having to shift factors away from the pants factory and towards the shirt factory. This lowers my sector-level productivity.

There is nothing that HK can or will do about variation in {A_i} across firms. That is taken as a given. Some firms are more productive than others. But what they are interested in is the variation driven by the {TFPR_i} terms. Here, we just have the extra funkiness that the summation depends on these inversely. So a firm with a really high {TFPR_i} is like having a really physically unproductive firm. Why? Think in terms of the prices that firms charge for their goods. A high {TFPR_i} means that firms are charging a relatively high price compared to the rest of the sector. Similarly, a firm with a really low {A_i} (like our shirt factory above) would also be charging a relatively high price compared to the rest of the sector. So having variation in {TFPR_i} across firms is like having variation in {A_i}, and this variation lowers {TFP_s}.

However, as HK point out, if markets are operating efficiently then there should be no variation in {TFPR_i} across firms. While a high {TFPR_i} is similar to a low {A_i} in its effect on {TFP_s}, the high {TFPR_i} arises for a fundamentally different reason. The only reason a firm would have a high {TFPR_i} compared to the rest of the sector is if it faced higher input costs and/or higher taxes on revenues than other firms. In other words, firms would only be charging more than expected if they had higher costs than expected or were able to keep less of their revenue.

In the absence of different input costs and/or different taxes on revenues, then we’d expect all firms in the sector to have identical {TFPR_i}. Because if they didn’t, then firms with high {TFPR_i} could bid away factors of production from low {TFPR_i} firms. But as high {TFPR_i} firms get bigger and produce more, the price they can charge will get driven down (and vice versa for low {TFPR_i} firms), and eventually the {TFPR_i} terms should all equate.

For HK, then, the level of {TFP_s} that you could get if all factors were allocated efficiently (meaning that firms didn’t face differential input costs or revenue taxes) is one where {TFPR_i = \overline{TFPR}_s} for all firms. Meaning that

\displaystyle  TFP^{\ast}_s = \left(\sum_i A_i^{\sigma-1} \right)^{1/(\sigma-1)}. \ \ \ \ \ (9)

So what HK do is calculate both {TFP^{\ast}_s} and {TFP_s} (as above), and compare.
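As a sanity check on equations (8) and (9), here is a small Python sketch. The productivity values are hypothetical, and the function names are mine. It only confirms the point just made: when every firm has {TFPR_i = \overline{TFPR}_s}, the actual and efficient measures coincide.

```python
def tfp_actual(a, tfpr, tfpr_bar, sigma):
    """Equation (8): sector TFP given firm A_i, firm TFPR_i and sector TFPR."""
    total = sum(
        (a_i * tfpr_bar / tfpr_i) ** (sigma - 1.0) for a_i, tfpr_i in zip(a, tfpr)
    )
    return total ** (1.0 / (sigma - 1.0))


def tfp_efficient(a, sigma):
    """Equation (9): sector TFP when every firm has TFPR_i equal to the sector TFPR."""
    return sum(a_i ** (sigma - 1.0) for a_i in a) ** (1.0 / (sigma - 1.0))


sigma = 5.0
a = [1.0, 2.0, 3.0]            # hypothetical physical productivities
tfpr_equal = [1.5, 1.5, 1.5]   # no dispersion in revenue productivity

same = tfp_actual(a, tfpr_equal, tfpr_bar=1.5, sigma=sigma)
best = tfp_efficient(a, sigma)
print(same, best)
```

Note that in real data {\overline{TFPR}_s} is not a free number: it is pinned down by the firm-level revenues and inputs, so you cannot pick the {TFPR_i} and the sector value independently.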

To do this, I already mentioned that the one piece of data we are missing is the {A_i} terms. We need to know the actual physical productivity of firms. How do we get that, since we cannot measure physical output at the firm level? HK’s assumption about market structure will allow us to figure that out. So hold on to the results of {TFP_s} and {TFP^{\ast}_s} for a moment, and let’s talk about firms. For those of you comfortable with monopolistic competition models using CES aggregators, this is just textbook stuff. I’m going to present it without lots of derivations, but you can check my work if you want.

For each firm, we assume the production function is

\displaystyle  Y_i = A_i K_i^{\alpha}L_i^{1-\alpha} \ \ \ \ \ (10)

and we’d like to back out {A_i} as

\displaystyle  A_i = \frac{Y_i}{K_i^{\alpha}L_i^{1-\alpha}} \ \ \ \ \ (11)

but we don’t know the value of {Y_i}. So we’ll back it out from revenue data.

Given that the elasticity of substitution across firms’ goods is {\sigma}, and all firms’ goods are weighted the same in the utility function (or final goods production function), then the demand curve facing each firm is

\displaystyle  P_i = Y_i^{(\sigma-1)/\sigma - 1} X_s \ \ \ \ \ (12)

where {X_s} is a demand shifter that depends on the amount of the other goods consumed/produced. We’re going to end up carrying this term around with us, but its exact derivation isn’t necessary for anything. Total revenues of the firm are just

\displaystyle  (P_i Y_i) = Y_i^{(\sigma-1)/\sigma} X_s. \ \ \ \ \ (13)

Solve this for {Y_i}, leaving {(P_i Y_i)} together as revenues. This gives you

\displaystyle  Y_i = \left(\frac{P_i Y_i}{X_s}\right)^{\sigma/(\sigma-1)}. \ \ \ \ \ (14)

Plug this in our equation for {A_i} to get

\displaystyle  A_i = \frac{1}{X_s^{\sigma/(\sigma-1)}}\frac{\left(P_i Y_i\right)^{\sigma/(\sigma-1)}}{K_i^{\alpha}L_i^{1-\alpha}}. \ \ \ \ \ (15)

This last expression gives us a way to back out {A_i} from observable data. We know revenues, {P_i Y_i}, capital, {K_i}, and labor, {L_i}. The only issue is this {X_s} thing. But {X_s} is identical for each firm (it’s a sector-wide demand term), so we don’t need to know it. It just scales up or down all the firms in a sector. Both {TFP_s} and {TFP^{\ast}_s} are scaled by the same power of {X_s}, so when comparing them {X_s} will just cancel out. We don’t need to measure it.

What is our {A_i} measure picking up? Well, under the assumption that firms in fact face a demand curve like we described, then {A_i} is picking up their physical productivity. If physical output, {Y_i}, goes up then so will revenues, {P_i Y_i}. But not proportionally, as with more output the firm will charge a lower price. Remember, the pants factory has to get people to buy all those extra pants, even though they kind of don’t want them because there aren’t many shirts around. So the price falls. Taking revenues to the {\sigma/(\sigma-1)} power captures that effect.

Where are we? We now have a firm-level measure of {A_i}, and we can measure it from observable data on revenues, capital stocks, and labor forces at the firm level. This allows us to measure both actual {TFP_s}, and the hypothetical {TFP^{\ast}_s} when each firm faces identical factor costs and revenue taxes. HK compare these two measures of TFP, and find that in China {TFP^{\ast}_s} is about 86-115% higher than {TFP_s}, or that output would nearly double if firms all faced the same factor costs and revenue taxes. In India, the gain is on the order of 100-120%, and for the U.S. the gain is something like 30-43%. So substantial increases all the way around, but much larger in the developing countries. Hence HK conclude that misallocations (meaning firms facing different costs and/or taxes and hence having different {TFPR_i}) could be an important explanation for why some places are rich and some are poor. Poor countries presumably do a poor job (perhaps through explicit policies or implicit frictions) of allocating resources efficiently between firms, and low-productivity firms use too many inputs.
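To see how the pieces chain together, here is a rough end-to-end sketch in Python: firm revenues, capital and labor go in, and the ratio {TFP^{\ast}_s / TFP_s} comes out. The data are invented, the function name is mine, and this follows the simplified walk-through in this post (one sector, a common {\alpha}, equations (8), (9) and (15)), not HK's actual code or data cleaning.

```python
def misallocation_gain(revenue, capital, labor, alpha, sigma, x_s=1.0):
    """Return TFP*_s / TFP_s computed from firm revenue, capital and labor."""
    power = sigma / (sigma - 1.0)
    # Input bundle K_i^alpha * L_i^(1 - alpha) for each firm.
    bundles = [k ** alpha * l ** (1.0 - alpha) for k, l in zip(capital, labor)]
    # Firm revenue productivity TFPR_i = revenue over the input bundle.
    tfpr = [r / b for r, b in zip(revenue, bundles)]
    # Firm physical productivity A_i from equation (15); x_s is the unobserved shifter.
    a = [(r / x_s) ** power / b for r, b in zip(revenue, bundles)]
    # Sector revenue productivity from sector totals.
    tfpr_bar = sum(revenue) / (sum(capital) ** alpha * sum(labor) ** (1.0 - alpha))
    # Equation (8): actual sector TFP.
    tfp = sum(
        (a_i * tfpr_bar / t_i) ** (sigma - 1.0) for a_i, t_i in zip(a, tfpr)
    ) ** (1.0 / (sigma - 1.0))
    # Equation (9): efficient sector TFP.
    tfp_star = sum(a_i ** (sigma - 1.0) for a_i in a) ** (1.0 / (sigma - 1.0))
    return tfp_star / tfp


# Hypothetical firms that all share the same capital-labor ratio.
capital = [10.0, 20.0, 40.0]
labor = [5.0, 10.0, 20.0]

# Revenues proportional to inputs: every firm has the same TFPR_i.
gain_no_dispersion = misallocation_gain([30.0, 60.0, 120.0], capital, labor, 0.33, 5.0)
# The largest firm earns less per unit of input: TFPR_i now differs across firms.
gain_dispersed = misallocation_gain([30.0, 60.0, 60.0], capital, labor, 0.33, 5.0)
# Same data, different guess for the demand shifter.
gain_other_x = misallocation_gain([30.0, 60.0, 60.0], capital, labor, 0.33, 5.0, x_s=3.0)
print(gain_no_dispersion, gain_dispersed, gain_other_x)
```

The last two numbers are identical, which is the sense in which {X_s} cancels: it rescales every {A_i} by the same factor, so it drops out of the ratio.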

* A note on wedges * For those of you who know this paper, you’ll notice I haven’t said a word about “wedges”, which are the things that generate differences in factor costs or revenues for firms. That’s because from a purely computational standpoint, you don’t need to introduce them to get HK’s results. It’s sufficient just to measure the {TFPR_i} levels. If you wanted to play around with removing just the factor cost wedges or just the revenue wedges, you would then need to incorporate those explicitly. That would require you to follow through on the firm’s profit maximization problem and solve for an explicit expression for {TFPR_i}. In short, that will give you this:

\displaystyle  TFPR_i = \frac{\sigma}{\sigma-1} MC_s \frac{(1+\tau_{Ki})^{\alpha}}{1-\tau_{Yi}}. \ \ \ \ \ (16)

The first fraction, {\sigma/(\sigma-1)}, is the markup charged over marginal cost by the firm. As the elasticity of substitution is assumed to be constant, this markup is identical for each firm, so it generates no variation in {TFPR_i}. The second term, {MC_s}, is the marginal cost of a bundle of inputs (capital and labor). The final fraction contains the “wedges”. {(1+\tau_{Ki})} captures the additional cost (or subsidy if {\tau_{Ki}<0}) of a unit of capital to the firm relative to other firms. {(1-\tau_{Yi})} captures the revenue wedge (think of a sales tax or subsidy) for a firm relative to other firms. If either of those {\tau} terms is not equal to zero, then {TFPR_i} will deviate from the efficient level.
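Equation (16) is easy to play with numerically. The sketch below is only an illustration in Python: the function name, the marginal cost of 1, and the wedge values are all assumptions I made up for the example.

```python
def tfpr_from_wedges(sigma, marginal_cost, tau_k, tau_y, alpha):
    """Equation (16): markup * MC_s * (1 + tau_K)^alpha / (1 - tau_Y)."""
    markup = sigma / (sigma - 1.0)
    return markup * marginal_cost * (1.0 + tau_k) ** alpha / (1.0 - tau_y)


sigma, alpha, mc = 5.0, 0.33, 1.0

undistorted = tfpr_from_wedges(sigma, mc, tau_k=0.0, tau_y=0.0, alpha=alpha)
taxed_revenue = tfpr_from_wedges(sigma, mc, tau_k=0.0, tau_y=0.2, alpha=alpha)
cheap_capital = tfpr_from_wedges(sigma, mc, tau_k=-0.3, tau_y=0.0, alpha=alpha)
print(undistorted, taxed_revenue, cheap_capital)
```

A firm paying a revenue tax shows up with a higher {TFPR_i} than the undistorted firm, and a firm with subsidized capital shows up with a lower one, which is exactly the dispersion that the {TFP_s} calculation penalizes.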

* A note on multiple sectors * HK do this for all manufacturing sectors. That’s not a big change. Do what I said for each separate sector. Assume that each sector has a constant share of total expenditure (as in a Cobb-Douglas utility function). Then

\displaystyle  \frac{TFP^{\ast}_{all}}{TFP_{all}} = \left(\frac{TFP^{\ast}_1}{TFP_1}\right)^{\theta_1} \times \left(\frac{TFP^{\ast}_2}{TFP_2}\right)^{\theta_2} \times ... \ \ \ \ \ (17)

where {\theta_s} is the expenditure share of sector {s}.
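In code, equation (17) is just a share-weighted geometric average of the sector-level gains. A minimal Python sketch, with made-up sector gains and shares and a function name of my own choosing:

```python
import math


def aggregate_gain(sector_gains, expenditure_shares):
    """Equation (17): product over sectors of (TFP*_s / TFP_s) ** theta_s."""
    if len(sector_gains) != len(expenditure_shares):
        raise ValueError("need one expenditure share per sector")
    if abs(sum(expenditure_shares) - 1.0) > 1e-9:
        raise ValueError("expenditure shares should sum to one")
    return math.prod(g ** theta for g, theta in zip(sector_gains, expenditure_shares))


# Two hypothetical sectors with equal expenditure shares:
# one where efficient TFP is four times actual TFP, one with no misallocation.
total = aggregate_gain([4.0, 1.0], [0.5, 0.5])
print(total)
```

Because the shares are exponents, a badly misallocated sector only matters for the aggregate in proportion to how much is spent on it.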

Hsieh and Moretti on Allocations across Cities

Last post on the NBER growth session. Chang-Tai Hsieh (Chicago) and Enrico Moretti (Berkeley) presented a paper on wage dispersion across cities in the U.S. Wage dispersion (New Yorkers earn more than people in Cleveland) either represents compensation for living costs (housing in New York is more expensive than in Cleveland), a real difference in productivity (New Yorkers are more productive than Clevelanders), or some combination of the two.

What Chang and Enrico find is that the increase in wage dispersion across cities in the U.S. over the last thirty-ish years is due almost entirely to rising house prices in six cities: NY, DC, Boston, San Fran, San Jose, and Seattle. Wages have gone up rapidly in those cities, but that is basically just compensating their citizens for the higher costs of living.

Now, given the costs of living, the allocation of population across cities in the U.S. is efficient. That is, there is no reason for someone from Cleveland to move to New York on the margin. Their increased wage in New York only compensates them for the higher housing cost, and so there is no change in their real wage.

However, if we do not take the costs of living as given, then the allocation of population is not efficient: there is a (surprise again!) misallocation. If there were not housing restrictions in NY and San Fran, and housing prices were not so ridiculously high, lots of people would move there because these are high-productivity cities. So one can back out the implied cost of housing restrictions across the whole U.S., and Chang and Enrico find that aggregate output is lower by about 10-14% because of them. That is, by preventing new housing in San Fran, restrictions drive up housing prices, which keep Clevelanders from moving, when in fact Clevelanders would be more productive in San Fran.

The best part of the paper is the implied change in city sizes if you did remove restrictions. Chang and Enrico calculate that New York’s population would rise by 890%(!!) without restrictions on housing.

As an aside, Ed Glaeser (Harvard) gave the discussion of this paper. It was my first exposure to Ed, and all I have to say is that he should do the last discussion of the day at every conference, everywhere. Just when you are checking your phone for messages and thinking about beer, he steps in and gives a great energetic talk that keeps your attention. And the bow tie. The man can actually pull off a bow tie.

Herrendorf and Schoellman on Labor Allocations

The next post on the NBER growth session. Berthold Herrendorf (ASU) and Todd Schoellman (ASU) looked at the (surprise!) misallocation of labor between agriculture and non-agriculture. They look at the wage gap between ag and non-ag in a panel of 39 countries. The question is how much of the gap is due to human capital differences between ag and non-ag workers.

Across their sample, the average wage premium for non-ag is 1.79. That is, non-ag workers earn about 79% more than ag workers per hour. If you control for human capital using a standard Mincerian return to years of schooling of 10%, then this average premium falls to 1.36. The wage per unit of human capital in non-ag is 36% higher than in agriculture. The raw wage premium is 1.79 because non-ag workers have higher average education levels.
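As a back-of-the-envelope illustration of that adjustment, the Python sketch below asks how big an average schooling gap would be needed to move the premium from 1.79 to 1.36. It assumes the simple log-linear Mincer form, where human capital is {\exp(0.10 \times \text{years})}. That functional form and the function names are my assumptions, and the paper itself works from micro data rather than this shortcut.

```python
import math


def implied_schooling_gap(raw_premium, adjusted_premium, mincer_return):
    """Extra years of schooling that reconcile the raw and adjusted wage premiums."""
    return math.log(raw_premium / adjusted_premium) / mincer_return


def premium_per_unit_hc(raw_premium, schooling_gap, mincer_return):
    """Wage premium per unit of human capital, given a schooling gap in years."""
    return raw_premium / math.exp(mincer_return * schooling_gap)


# Premiums quoted above: 1.79 raw, 1.36 after a 10% return per year of schooling.
gap_years = implied_schooling_gap(1.79, 1.36, 0.10)
check = premium_per_unit_hc(1.79, gap_years, 0.10)
print(gap_years, check)
```

Under this assumed form, a non-ag schooling advantage of a bit under three years is enough to account for the drop from 1.79 to 1.36.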

What Todd and Berthold do to advance on this is to consider the possibility that the returns to education are different between sectors. They provide evidence that this is in fact the case. For each year of schooling, agricultural workers get a smaller bump in wage than do non-ag workers. Thus non-ag workers have even higher implied human capital than ag workers. They have more years of schooling, and those years of schooling provide them with more human capital. If you make this adjustment, then the average wage premium for non-ag is 0.92, or non-ag workers earn about 8% less per unit of human capital than in ag. Essentially, Todd and Berthold can account for the entire observed wage gap.

This is intriguing because it suggests that the labor markets in these countries are getting things roughly right. This doesn’t mean ag workers earn the same as non-ag workers, they don’t. But this is because ag workers provide less human capital to the market than non-ag workers, not because ag workers are underpaid for their human capital. I’ll do some self-promotion in that their work complements my own finding that wage gaps between sectors in developing countries are not a big source of aggregate productivity losses.

One conclusion from their work is that movements of workers between sectors are not by themselves a source of growth. With the marginal return to HC being the same across sectors, there is no boost to productivity coming just from moving workers around. If we do observe shifts of labor from ag. to non-ag then that represents shifts due to differential productivity growth in the sectors or to non-homothetic preferences.

Restuccia and Santaeulalia-Llopis on Land Misallocation

Next entry in the NBER growth session recap. Diego Restuccia (Toronto) and Raul Santaeulalia-Llopis (Wash. U.) presented their paper on land misallocation and productivity. Essentially, what Diego and Raul are trying to do is apply the firm-level methodology of Hsieh and Klenow (2009) to farms (you might be starting to sense a theme to the day by now). Specifically, they have detailed data on farm plots from Malawi, and use this to see how much the marginal product of land varies across farming households.

They then ask the counter-factual question: how much would agricultural output rise if we re-allocated land across farmers to equalize the marginal product of land? Essentially, if I could make sure that high-productivity farmers were able to receive more land, and low-productivity farmers less, how much could we raise aggregate output? Their finding is that output would go up by a factor of 3! This is way off the charts compared to anything that people have done with firms (where output might rise by a factor of 1.5).

Big numbers like that lead to questions. The obvious concern is that they are incorrectly measuring the marginal product of land. Working in their favor is that they have clear measures of actual real output (bushels of maize). This means they do not have to make any assumptions about prices to try and deflate revenues, as we often need to do with firms.

You might also be worried that they are attributing land-quality differences to households. That is, household A looks really productive in the data because they have good land, not because they are really good at farming. Diego and Raul have really fine-grained measures of land characteristics. Conceptually they can control for land quality. But how do you construct a single-valued index of land quality from 11 different characteristics (slope, elevation, acidity, etc.)? There is conceivably some kind of “true” agronomic function that transforms these into a single measure of quality, but it is almost certainly highly non-linear and involves all sorts of weird covariances between the measures. So if you want to be skeptical about their results, I think this is the primary worry.

Let’s take their results as true for the moment, though. One explanation for the vast degree of misallocation goes back to subsistence constraints, as I discussed in a different context before. If I need to ensure that all farming households can achieve a minimum output (and markets are not developed enough for people to borrow/lend over time if they can’t), then you’re bound to have big misallocations. You’d in fact have to give bad farmers even more land than good farmers. A very low absolute level of productivity is thus a determinant of misallocation.

On the other hand, if all the land is really highly productive, then small plots are sufficient to ensure subsistence. This means that we can give all the bad farmers small plots, and let the good farmers operate the surplus land. So a high level of absolute productivity will ensure a better allocation. Inherent land productivity thus has two separate influences on outcomes: a direct one through increasing output, and an indirect one in ensuring better allocations of land.

David, Hopenhayn, and Venkateswaran on Misallocation

Next installment of the NBER growth session recap. The second paper was by Joel David (USC), Hugo Hopenhayn (USC), and Venky Venkateswaran (NYU Stern). Their jumping off point is the apparent mis-allocation of factors of production across firms. The standard of comparison here is Hsieh and Klenow (2009), who find that mis-allocation of factors is lowering output by something like 50% in China and India.

So why are factors mis-allocated? David, Hopenhayn, and Venkateswaran propose that this is partly due to informational issues. That is, firms themselves do not know ex-ante (when they are deciding on how much capital or labor to hire) exactly what their productivity will be ex-post. Hence they make mistakes, and part of what we observe in the ex-post data are these mistakes. So rather than explicit taxes, subsidies, or other frictions, poor information about future productivity drives mis-allocation.

To get some quantitative feel for how important this is, they focus on listed companies. These have the advantage of an extra source of information on future prospects, the stock price. They set up a neat little information extraction problem that solves nicely, and it allows them to use the observed productivity of firms and the stock prices to back out the degree of uncertainty firms have ex-ante. With this, they suggest that in the U.S. roughly 40% of the variation in firm productivity is a surprise to firms. In India, about 80% of the variation in productivity is a surprise. Because of the poorer information, Indian firms make bigger mistakes on average, and so there is more ex-post mis-allocation.

It’s a clever explanation for mis-allocation, and is one of those stories that in some sense has to be true to some extent. There is no way firms have perfect information on future productivity (or demand, which is essentially the same thing in these models). The question is how big of an effect it is, and they suggest it’s pretty sizable.

One question that came up in my head afterwards was whether the degree of uncertainty is related to the level of returns. That is, Indian firms have a lot of uncertainty (risk) in their productivity draws, apparently. Is that high risk associated with higher rewards? If it is, then we can’t really say that this is mis-allocation, per se. Firms are making optimal decisions ex-ante, and there happens to be a willingness to tolerate risk in the economy. If, on the other hand, high risk is associated with low rewards, then there really is a mis-allocation in the sense that they are making uninformed decisions.