
SA.510.104.01, Economic Development Fall 2022 Lecture 2

Published: 2022-10-02


SA.510.104.01, Economic Development

Fall 2022

Lecture 2: Measuring Welfare, Inequality, and Poverty

Much of today’s discussion will focus on income or consumption, although it is good to keep in mind that income is only one dimension of well-being.

Measures of Inequality

We often want to characterize a whole distribution of income using only a few parameters, usually to help decide whether one distribution, or the outcome of one policy, is “better” than another. Many people would like a purely technical number to make this judgment for them so that they do not have to make their ethical principles explicit but, as we will see, it’s not that easy. Also, the more precise the measure, the more ethical assumptions are hidden in the calculations. This note outlines all the analytic background you should have. It also flags where value judgments sneak in inadvertently.

Basic concepts. Figure 1 illustrates an income density function (see footnote 2) representing, on the horizontal axis, levels of income and, on the vertical axis, the proportion of people who have that income. It needn’t be income, of course; it could be consumption or wealth or any variable whatsoever. Also, the unit needn’t be a “person”; it could be a family, a household (if different from a family) or a worker. The density is typically shaped something like the one below: a big bulge to the left of the middle and a long tail (skewed) to the right.

Figure 1: Density of Income

Footnote 1: Lecture notes are only meant to aid teaching and are not substitutes for reading the material listed in the syllabus.

Footnote 2: I’m calling this the “density” function of income instead of the distribution of income (as we ordinarily say in normal English) for reasons that will become clear. However, it is good to use this vocabulary since much of the analytics for income distribution comes from probability theory that will come up again in the context of risk.

Figure 2 is derived from figure 1 and is called the Cumulative Distribution Function (CDF) of the density f(y), denoted F(y). At each point the value of the function, F(y), denotes the fraction of the population that earns less than the value of income (y) on the horizontal axis. It is derived from figure 1 because each point represents the area under the curve in figure 1 between zero income and y. It ranges from zero to one (everyone is included) and is steepest at the “bulge” in figure 1. You might have more intuition about figure 1 but it will turn out that figure 2 will be more useful to us for a variety of reasons.

Figure 2: Cumulative Distribution Function of Income

Generally speaking (we’ll come back to this), the less equally income is distributed, the more “stretched out” the CDF; the more equally, the more squished. In a perfectly equal society, the CDF is flat at zero up to the point of everyone’s (equal) income, where it jumps to one and is flat again thereafter.

Some interrelations between inequality measures and welfare. Going back to figure 2, imagine we have just implemented a policy such as reducing tariffs on imported automobiles. The likely outcome for the real income distribution (corrected for all prices adjusting to the new situation) is shown in figure 3.

Figure 3: Comparing Income Distributions before and after a policy change

Now, this looks like a good thing since at every income level there are fewer people who earn less than that amount.  Everybody looks like they were shifted to the right, that is, to a higher income.  This is a case of a “First Order Stochastic Dominant” shift in the distribution and is one of our least ambiguous cases of comparing distributions on their welfare implications.  However, it is not quite as unambiguous as it seems.  For example, our usual criterion for welfare improvement is Pareto-dominance, that is, where no one is worse off in the new situation and someone is better off. This is not necessarily the case in figure 3. For example, in figure 4 (same case as 3) person “a” moves from the 85th percentile in the income distribution (an auto factory foreman?) to the 35th percentile and actually makes less money after the tariff is removed.  So, the change is NOT Pareto improving since she’s worse off.

Figure 4: First Order Stochastic Dominance is not necessarily Pareto optimal

So, one characteristic that we sometimes impose on comparisons of distributions is the principle of “anonymity”, meaning we don’t care where any particular person is in the distribution. This is generally accepted as a reasonable principle but it has to be remembered that it requires us to abandon the Pareto principle and this has real ethical consequences. As with Pareto-optimality, we can fudge this principle by noting that person “a” could be compensated by everyone else such that even she is better off. However, just as with Pareto-optimality, if the compensation doesn’t actually happen, this is cold comfort to person “a”.

A secondary implication of anonymity (not frequently noted) is that we are implicitly ruling out envy as a legitimate consideration. For example, person “b” (an auto assembly line worker in a flexible labor market, say) is better off in an absolute sense but has dropped in the income distribution from about the 70th to the 50th percentile. To say that the new distribution is “better” than the old requires ruling out as morally irrelevant any dissatisfaction that person “b” might feel from the loss of ranking in the distribution (let alone the loss of actual income that person “a” suffered).

Figure 5: Weak FOSD

Another example of a problem that even this relatively straightforward comparison gives some people is illustrated in figure 5. Here, all relatively poor (and middle income) percentiles of the population stay the same up to the point A, after which everybody richer gets even richer. This is called “weakly” FOSD since the CDFs coincide for a while. If everyone stayed at their same ranking (percentile) in the population, this shift is also Pareto improving since some people got better off and no one lost. But the distribution is clearly more unequal in the “after” position. So, this is another case of not allowing envy or relative comparisons of people to dominate absolute changes in income. You can think through whether the change bothers you or not. If it does and you think “before” is better than “after”, how would you feel if things went the other way, so that the society got poorer but more equal?

Most of the time, we’re not so lucky as to be comparing distributions such that one is FOSD to the other. Meaning: the CDF’s often cross. What happens then?

Figure 6: Crossing CDFs

In this case, more people got poorer since the CDF of “after” lies above the CDF of “before” everywhere to the left of the intersection. And more people got richer since the reverse is true to the right of the intersection. So the new distribution is more unequal: more rich and more poor. If the average income were the same (we can’t tell from the diagram) then anyone who has any aversion to inequality would think things got worse. This type of change is called a “mean preserving spread” for obvious reasons.
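The comparisons so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the income samples are made up, the function names `ecdf` and `first_order_dominates` are hypothetical, and the CDF is taken as the share of people at or below each income. Because empirical CDFs are step functions, it is enough to compare them at the observed incomes.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return F, where F(y) is the share of the sample with income <= y."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda y: bisect_right(ordered, y) / n

def first_order_dominates(after, before):
    """True if `after` FOSD `before`: its CDF is never above the CDF of
    `before`, and is strictly below it at some income level."""
    f_after, f_before = ecdf(after), ecdf(before)
    grid = sorted(set(after) | set(before))   # CDFs only change at these points
    gaps = [f_before(y) - f_after(y) for y in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Hypothetical data (not from the lecture).
before = [10, 20, 30, 40, 100]
shifted = [15, 25, 35, 45, 110]   # every rank earns more: an FOSD shift
spread = [5, 20, 30, 40, 105]     # same mean, poorer bottom, richer top

print(first_order_dominates(shifted, before))  # True
print(first_order_dominates(spread, before))   # False: the CDFs cross
print(first_order_dominates(before, spread))   # False: no ranking either way
```

The `spread` sample has the same total income as `before`, so it is a small example of a mean preserving spread: neither distribution first-order dominates the other.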

With CDFs that cross only once (whether they have the same mean or not), there is another term that can be used to compare the distributions. In our case the “before” curve is said to be Lorenz dominant relative to the “after” case. This terminology comes from the “Lorenz curves” derived from these distributions.

Lorenz curves are constructed as follows. You line up everyone in order of their income from poor to rich. On the horizontal axis is each person’s rank in the income distribution. On the vertical axis is the share of the whole society’s total income that is owned by everyone who is as poor as or poorer than that rank. So, for example, if the poorest 20% of the country has only 10% of its income then you would plot the point (20%, 10%). You do that for every percentile and you get the “Lorenz curve” for the distribution. You then put the whole thing in a square box, draw the 45-degree diagonal of the box and you get a Lorenz diagram. It looks like figure 7.

Figure 7: Lorenz Diagram

The Lorenz curve must lie between the diagonal line and the lower and right edges of the box. Perfect equality would have a Lorenz curve that is coincident with the 45-degree line. If there were perfect equality where everyone had exactly the same income then the poorest 20% of the people would have 20% of the income and that would put them on the diagonal. The other extreme would be if one person had all the money and everyone else had nothing. That would be the edge of the box: zero all the way from 0 to .99999999 and then a sudden jump up to 1 when you get up to the rich guy. The more equal the distribution, the closer you are to the diagonal. The less equal the distribution, the closer you are to the edge. (The Lorenz curve shifts outward after a change like the one in figure 6.)

Also, the curves have to be “convex” meaning they have to get steeper and steeper as you move from left to right (though they can be straight line segments for a while if you have a bunch of people with the same income). This is because each person from left to right is richer, must therefore contribute a higher percentage of total income than anyone to her left and must make the curve rise at a faster rate. If that were not true, then you put her in the wrong spot in the first place. Try to be more careful next time.

By far the most common statistic calculated on the basis of Lorenz curves (in fact, the most common statistic related to income distributions generally) is the “Gini coefficient”. Graphically, it is easy to characterize. In Figure 8, the area between the Lorenz curve and the diagonal is called A and the area between the Lorenz curve and the edge of the box is called B. The Gini coefficient is G=A/(A+B). Perfect equality has A=0 and, therefore, G=0, meaning no inequality. Perfect inequality has B=0 and, therefore, G=1, meaning complete inequality. Calculations in between are something of a pain so we won’t go into them.
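For readers who do want to see the calculation, here is a minimal sketch of the Lorenz construction and the area-based Gini described above. The income list and the function names `lorenz_points` and `gini_from_lorenz` are illustrative assumptions. Since the box has area 1, A + B = 1/2, so G = A/(A+B) = 1 - 2B, where B is the area under the Lorenz curve (computed here with trapezoids).

```python
def lorenz_points(incomes):
    """Return the Lorenz curve as (population share, income share) points,
    starting at (0, 0) and ending at (1, 1)."""
    ordered = sorted(incomes)               # line everyone up from poor to rich
    n, total = len(ordered), sum(ordered)
    points, running = [(0.0, 0.0)], 0
    for rank, y in enumerate(ordered, start=1):
        running += y
        points.append((rank / n, running / total))
    return points

def gini_from_lorenz(points):
    """G = A / (A + B) = 1 - 2B, with B the area under the Lorenz curve."""
    area_b = sum((x1 - x0) * (y0 + y1) / 2
                 for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return 1 - 2 * area_b

incomes = [10, 10, 20, 20, 40]              # hypothetical five-person society
curve = lorenz_points(incomes)
print(curve[1])                             # (0.2, 0.1): poorest 20% own 10%
print(round(gini_from_lorenz(curve), 4))    # 0.28

# Convexity: segment slopes never fall as we move right (up to rounding).
slopes = [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(curve, curve[1:])]
is_convex = all(b >= a - 1e-12 for a, b in zip(slopes, slopes[1:]))
print(is_convex)                            # True
```

With equal incomes every point lies on the diagonal and the same function returns (numerically) zero.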

Figure 8: Calculating GINI

It so happens that any time the CDFs of two distributions cross only once, their Lorenz curves don’t cross at all and the “outer” one corresponds to the (unambiguously) more unequal distribution. It is also said to be “Lorenz dominated” by the inner curve. Also, the Gini coefficient of the outer curve is larger than the Gini coefficient of the inner.

Two problems arise. First, if the Lorenz curves themselves cross (implying the CDF’s generating them cross more than once), we can’t even say that one distribution is more or less equal than the other (let alone if one is “better” than the other). We can calculate the Gini coefficients anyway and people do claim that distributions with higher Gini coefficients are more unequal than those with lower. But that requires making implicit (implicit in the calculations of Lorenz curves) value judgments about which parts in the distribution you have more interest in than other parts. Second, if the CDF’s cross more than once, then it’s hard to say anything at all about welfare. Though, again, Gini coefficients are routinely calculated and are used to judge inequality (welfare, too, sometimes) but that is legitimate only if you swallow the value judgments hidden in the math (which are neither obvious nor compelling in any moral sense).

Crossing Lorenz curves with equal Gini coefficients are illustrated in figure 9. Distribution 1 has the very poor somewhat better off than in distribution 2. Its Lorenz curve is inside or above distribution 2’s at the low end. On the other hand, distribution 1 has rich people who are much richer than its (maybe modest income, hard-working) middle class. Its Lorenz curve is much steeper on the right side. Their Gini coefficients could be the same or either one could be bigger than the other (it kind of looks like the Gini of 1 is a little bigger than 2’s but that was an accident). But to call one “better” in a welfare sense clearly imposes very specific welfare weights on one kind of person relative to another.

Figure 9: Lorenz Crossings

Let’s go back to a simpler case. We now know that with a single crossing of the CDFs of two distributions, one is more equal than the other. Does that mean it’s necessarily “better” than the other? “No”. By that I mean, “yes” but only if the average income of the two distributions were the same and we agree that inequality per se is bad (as I mentioned before). But if they have different mean incomes then you have to know more about the distributions and, possibly, make further, explicit, value judgments to know if one is “better” in a welfare sense than another. Equality is not welfare. Take another look at Figure 5. The “after” curve is definitely more unequal than the “before”. But it represents a higher average income and could represent a Pareto-improvement. So it has a strong claim to being “better” even though it is more unequal.

I’m sure it’s been bothering you over the last couple of pages and you’ve wanted to raise your hand to make a comment but you’d feel silly doing that in front of a computer screen: “if there is such a thing as First Order Stochastic Dominance, I’ll bet there is such a thing as Second Order Stochastic Dominance!” Indeed there is. Curiously, the term is very rarely used and most people don’t realize they are making use of this concept even though it is related to the Lorenz curve which is often used.

This concept isn’t used much under this name but it is the same as the “Generalized Lorenz Curve” (GLC), in which each point on the ordinary Lorenz curve is multiplied by the average income of the society. Figure 10 shows examples of SOSD curves (which look just like GLCs, which you are more likely to run across in the literature).

Figure 10: Determining SOSD with distribution C being richer than B (and figure 5’s case added)

As a measure of welfare (not of inequality) I think SOSD has a few things going for it. Even if CDFs cross, the one that is SOSD implies that the poorest people are so much better off that the income of any group of the poorest is always higher than the same group in the dominated distribution no matter where you make the cutoff defining “the poorest”. But this does require explicitly stating the preference for the very poorest and considering their welfare as compensatory enough to counter any reversals of relative income (crossings of the CDFs) that occur higher in the income distribution. (Since that is the clearest way I can say it and it is not obvious that everyone would understand it, let alone agree with it, you can see I’m sticking in value judgments.) In lots of simpler cases, it makes a lot of sense. For example, in Figure 5, the richer society does SOSD the poorer even though it is more unequal.
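A rough sketch of the generalized Lorenz comparison, assuming two hypothetical samples of equal size so that the curves can be compared rank by rank (the names `generalized_lorenz` and `second_order_dominates` are illustrative). Multiplying the Lorenz ordinate at rank k by mean income gives the cumulative income of the poorest k people divided by the population, which is what the code computes.

```python
from itertools import accumulate

def generalized_lorenz(incomes):
    """GLC ordinates at ranks 1/n, 2/n, ..., 1: cumulative income of the
    poorest k people divided by n (Lorenz share times mean income)."""
    ordered = sorted(incomes)
    n = len(ordered)
    return [running / n for running in accumulate(ordered)]

def second_order_dominates(a, b):
    """True if a's GLC is never below b's and is strictly above somewhere."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal population sizes")
    pairs = list(zip(generalized_lorenz(a), generalized_lorenz(b)))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

# Hypothetical data (not from the lecture).
before = [10, 20, 30, 40, 100]
spread = [5, 20, 30, 40, 105]        # mean preserving spread of `before`
richer_top = [10, 20, 30, 60, 150]   # like figure 5: only the top gains

print(second_order_dominates(before, spread))      # True
print(second_order_dominates(spread, before))      # False
print(second_order_dominates(richer_top, before))  # True, though more unequal
```

The last line mirrors the Figure 5 case in the text: the richer society dominates in the second-order sense even though its incomes are less equally distributed.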

Criteria used for evaluating measures of inequality and poverty.

Four criteria are usually considered helpful in deciding which measures of inequality you might prefer. They do not, by themselves, necessarily help in judging welfare. The above discussion should warn us against anyone claiming they have a measure that does both without adding explicit ethical judgments.

1. The anonymity principle. This was discussed above. It says that people are ethically interchangeable and we don’t care where anyone in particular is in an income distribution. This makes sense if we are comparing different societies or the same society at different times (where changes in the ranking of people might happen but which occur by chance or some other ethically uninteresting way).  It is less compelling as a criterion if we are comparing policies before and after a deliberate policy decision since, as discussed above, it can contradict the Pareto principle.

2. Independence of population size. Measures that change if you just duplicate the population without changing the proportion of people at any income level would be hard to interpret over time or across countries. So, most economists think it’s good to avoid them.

3. Independence of average income. Similarly, we’d often like to discuss inequality separately from the level of income. Only relative incomes matter.

4. The “Dalton” or transfer principle. A measure of inequality should go down whenever a transfer is made from a richer person to a poorer person. And up with transfers going the other way.

Some Measures of Inequality used in Practice

Please read Debraj Ray, Chapters 6 and 8 for details. Let there be n individuals with incomes $y_1, y_2, y_3, \dots, y_n$. Let average income be denoted by $\mu = \frac{1}{n} \sum_{i=1}^{n} y_i$.

1. Range: Compares extremes and misses everything in between

(a) $R_1 = \frac{1}{\mu} \left[ \max_i y_i - \min_i y_i \right]$

(b) R2 =

2. Kuznets ratios: Relative share of income owned by bottom x% or top y%.

3. Relative Mean Deviation:

(a) $RMD = \frac{1}{n \mu} \sum_{i=1}^{n} |y_i - \mu|$

4. Coefficient of Variation: Standard deviation divided by the mean.

(a) $COV = \frac{1}{\mu} \left[ \frac{1}{n} \sum_{i=1}^{n} (y_i - \mu)^2 \right]^{1/2}$

5. Variance of Logs: attaches more weight to income differences at the bottom of the distribution. A transfer of $1 from a person with $100 to a person with $97 lowers the VLOG by more than a transfer of $1 from a person with $1000 to a person with $970.

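(a) $VLOG = \frac{1}{n} \sum_{i=1}^{n} \left( \overline{\log y} - \log y_i \right)^2$, where $\overline{\log y} = \frac{1}{n} \sum_{i=1}^{n} \log y_i$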

6. GINI. Order $y_1 \geq y_2 \geq y_3 \geq \dots \geq y_n$.

(a) $GINI = \frac{1}{2 n^2 \mu} \sum_{i=1}^{n} \sum_{j=1}^{n} |y_i - y_j|$
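The measures in this list can be sketched as follows. The formulas coded here are the standard textbook forms (range over the mean, mean absolute deviation over the mean, standard deviation over the mean, variance of log incomes, and the mean-difference form of the Gini); treat them as assumptions about what the list intends, and the function names and data as illustrative. The example also shows why the range "misses everything in between": a transfer between two middle people leaves it unchanged while the Gini falls.

```python
import math

def mean(values):
    return sum(values) / len(values)

def range_ratio(ys):
    """R1: gap between richest and poorest, relative to mean income."""
    return (max(ys) - min(ys)) / mean(ys)

def relative_mean_deviation(ys):
    mu = mean(ys)
    return sum(abs(y - mu) for y in ys) / (len(ys) * mu)

def coefficient_of_variation(ys):
    mu = mean(ys)
    return math.sqrt(sum((y - mu) ** 2 for y in ys) / len(ys)) / mu

def variance_of_logs(ys):
    logs = [math.log(y) for y in ys]      # requires strictly positive incomes
    centre = mean(logs)
    return sum((value - centre) ** 2 for value in logs) / len(logs)

def gini(ys):
    """Sum of all pairwise absolute differences over 2 * n^2 * mean."""
    n, mu = len(ys), mean(ys)
    return sum(abs(a - b) for a in ys for b in ys) / (2 * n * n * mu)

incomes = [10, 10, 20, 20, 40]            # hypothetical data
after_transfer = [10, 15, 15, 20, 40]     # 5 moved from a 20 to a 10

for measure in (range_ratio, relative_mean_deviation,
                coefficient_of_variation, variance_of_logs, gini):
    print(measure.__name__,
          round(measure(incomes), 4), round(measure(after_transfer), 4))
```

Duplicating the population or doubling every income leaves all five numbers unchanged, which is a quick way to check the population-size and average-income criteria by hand.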

Measures of Poverty

● We start with the CDF of income again and draw a poverty line in figure 11. Right away we get the most common measure of poverty:

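● the Head Count Ratio (HCR), which is simply the proportion of people who live below the poverty line z.

$P_0 = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(y_i < z)$, where $\mathbf{1}(\cdot)$ equals one when the condition holds and zero otherwise. It doesn’t matter how poor you are, if you are below the line.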

Figure 11: CDF of Income with Poverty Line

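A more sensitive measure is the poverty gap: $P_1 = \frac{1}{n} \sum_{i=1}^{n} \left(1 - \frac{y_i}{z}\right) \mathbf{1}(y_i < z) = \frac{n_p}{n} \left(1 - \frac{\mu_p}{z}\right) = P_0 \left(1 - \frac{\mu_p}{z}\right)$.

Where $n_p$ is the number of poor people and $\mu_p$ is the mean income of those who are poor.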

● You can think of P1 as a per capita (in the population) measure of the total shortfall of welfare levels below the poverty line.

● Note that transfers between poor people have no effect on P1 .

● We can make this poverty measure sensitive to the distribution of income among the poor. This is Sen’s poverty measure

$P_S = P_0 \left[ 1 - (1 - \gamma_p) \frac{\mu_p}{z} \right]$, where $\gamma_p$ is the GINI among the poor and $\mu_p$ is the mean income of the poor.

● The Sen measure is a GINI weighted average of HCR and Poverty Gap

$P_S = P_0 \gamma_p + P_1 (1 - \gamma_p)$

$P_S = P_1$ if there is no inequality among the poor.
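A small numerical check of these three measures, written as a sketch under assumptions: incomes are strictly positive, "poor" means income strictly below z, the Gini among the poor uses the mean-difference formula, and the data and function names are made up for illustration.

```python
def gini(ys):
    """Mean-difference Gini; assumes a positive mean income."""
    n, mu = len(ys), sum(ys) / len(ys)
    return sum(abs(a - b) for a in ys for b in ys) / (2 * n * n * mu)

def headcount_ratio(incomes, z):
    """P0: share of the population below the poverty line z."""
    return sum(1 for y in incomes if y < z) / len(incomes)

def poverty_gap(incomes, z):
    """P1: average proportional shortfall, counting the non-poor as zero."""
    return sum(1 - y / z for y in incomes if y < z) / len(incomes)

def sen_index(incomes, z):
    """P_S = P0 * [1 - (1 - gini_poor) * mean_poor / z]."""
    poor = [y for y in incomes if y < z]
    if not poor:
        return 0.0
    mean_poor = sum(poor) / len(poor)
    return headcount_ratio(incomes, z) * (1 - (1 - gini(poor)) * mean_poor / z)

incomes = [2, 4, 6, 10, 20, 30]   # hypothetical incomes
z = 8                             # hypothetical poverty line

p0, p1 = headcount_ratio(incomes, z), poverty_gap(incomes, z)
gini_poor = gini([y for y in incomes if y < z])
print(p0, p1)                                             # 0.5 0.25
print(round(sen_index(incomes, z), 4))                    # 0.3056
print(round(p0 * gini_poor + p1 * (1 - gini_poor), 4))    # 0.3056
```

The last two lines agree, which is the weighted-average form of the Sen measure; if all the poor had the same income, `gini_poor` would be zero and the index would collapse to P1.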

● A generalization of the Poverty Gap measure is the Foster, Greer, Thorbecke (FGT) measure.

$P_\alpha = \frac{1}{n} \sum_{i=1}^{n} \left(1 - \frac{y_i}{z}\right)^{\alpha} \mathbf{1}(y_i < z)$

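● Because of the way it is constructed, the Sen measure cannot be decomposed across sectors (say rural, urban), but the FGT measure can be.

$P_\alpha = \sum_{s=1}^{S} \frac{1}{n} \sum_{i=1}^{n_s^p} \left(1 - \frac{y_i}{z}\right)^{\alpha} = \sum_{s=1}^{S} \frac{n_s}{n} P_\alpha(s)$, where the inner sum runs over the $n_s^p$ poor people in sector s, $n_s$ is the population of sector s, and $P_\alpha(s)$ is just the $P_\alpha$ for sector s.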
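The decomposition can be checked numerically. The sketch below assumes two hypothetical sectors, a poverty line of 8, and "poor" meaning income strictly below the line; `fgt` is an illustrative name. The overall measure should equal the population-share weighted sum of the sector measures for every value of alpha.

```python
def fgt(incomes, z, alpha):
    """P_alpha: average over the whole group of (1 - y/z)**alpha,
    where only people with income below z contribute."""
    return sum((1 - y / z) ** alpha for y in incomes if y < z) / len(incomes)

# Hypothetical sector data and poverty line.
sectors = {"rural": [2, 4, 10, 12], "urban": [6, 20, 30, 40, 50, 60]}
z = 8

everyone = [y for group in sectors.values() for y in group]
n = len(everyone)

results = {}
for alpha in (0, 1, 2):
    overall = fgt(everyone, z, alpha)
    # Weight each sector's own P_alpha by its population share n_s / n.
    weighted = sum(len(group) / n * fgt(group, z, alpha)
                   for group in sectors.values())
    results[alpha] = (overall, weighted)
    print(alpha, round(overall, 4), round(weighted, 4))
```

Setting alpha to 0 reproduces the head count ratio and alpha to 1 the poverty gap; larger values of alpha put more weight on the poorest.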

Two biggest problems with poverty measures

1. They are taken way too seriously, e.g. by the World Bank, as we shall see in the next class.

2. Small changes in the poverty line, or small mismeasurements of the income distribution, can have large effects on counting the “poor”.

Figure 12: Poverty Line and HCR

A Short Note on Price Indices

● Price indices can be used to compare one period to another or one location to another. Here, we’ll use them to compare one country to another. A useful place to start is with the Paasche and Laspeyres indexes. Take two countries i and j. Let there be n = 1, 2, ..., N products. $p_{in}$ and $q_{in}$ denote the price and quantity of product n in country i. Further, suppose we want to compare country j to country i (the base country).

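● Then, the Laspeyres price index compares prices using quantities from the base country (i) and the Paasche index compares using quantities from the comparison country (j). Writing $s_{in} = p_{in} q_{in} / \sum_m p_{im} q_{im}$ for the expenditure share of product n in country i:

$P_{ij}^{L} = \frac{\sum_n p_{jn} q_{in}}{\sum_n p_{in} q_{in}} = \sum_n s_{in} \frac{p_{jn}}{p_{in}}$

$P_{ij}^{P} = \frac{\sum_n p_{jn} q_{jn}}{\sum_n p_{in} q_{jn}} = \left[ \sum_n s_{jn} \left( \frac{p_{jn}}{p_{in}} \right)^{-1} \right]^{-1} = \left( P_{ji}^{L} \right)^{-1}$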

● In the problem set you will be asked to show that if the Laspeyres is larger than the Paasche, then the ratio of the value of the bundle of goods in j to that in i is larger at i’s prices than at j’s prices. This is called the Gerschenkron bias. The CIA at some point was overstating the size of the Soviet economy by using US prices. Typically, empirically it turns out that $P_{ij}^{L} > P_{ij}^{P}$, which means that valuing poorer countries at US prices could make them seem larger.

● The Fisher Ideal Index is the geometric mean of the Laspeyres and the Paasche: $P_{ij}^{F} = \sqrt{P_{ij}^{L} P_{ij}^{P}}$

● The Fisher Index satisfies the country reversal property, i.e., $P_{ij}^{F} = 1 / P_{ji}^{F}$.
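A two-country, two-good sketch of the bilateral indices, with made-up prices and quantities and illustrative function names. Country i is the base; each country is assumed to buy relatively more of the good that is relatively cheap at home, which is what produces a Laspeyres above the Paasche.

```python
import math

def value(prices, quantities):
    """Cost of a quantity bundle at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))

def laspeyres(p_i, q_i, p_j):
    """Prices of j relative to base i, weighted by base-country quantities."""
    return value(p_j, q_i) / value(p_i, q_i)

def paasche(p_i, p_j, q_j):
    """Prices of j relative to base i, weighted by comparison-country quantities."""
    return value(p_j, q_j) / value(p_i, q_j)

def fisher(p_i, q_i, p_j, q_j):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p_i, q_i, p_j) * paasche(p_i, p_j, q_j))

# Hypothetical data: good 1 is relatively cheap in i, good 2 in j.
p_i, q_i = [1.0, 4.0], [10.0, 5.0]
p_j, q_j = [2.0, 2.0], [4.0, 8.0]

L_ij = laspeyres(p_i, q_i, p_j)        # 30 / 30 = 1.0
P_ij = paasche(p_i, p_j, q_j)          # 24 / 36 = 0.667
F_ij = fisher(p_i, q_i, p_j, q_j)
F_ji = fisher(p_j, q_j, p_i, q_i)      # roles of the two countries swapped

print(L_ij > P_ij)                     # True
print(math.isclose(F_ij, 1 / F_ji))    # True: country reversal

# Size of j's bundle relative to i's, at each country's prices.
size_at_i_prices = value(p_i, q_j) / value(p_i, q_i)   # 36 / 30 = 1.2
size_at_j_prices = value(p_j, q_j) / value(p_j, q_i)   # 24 / 30 = 0.8
print(size_at_i_prices > size_at_j_prices)             # True
```

The last comparison is the Gerschenkron effect in miniature: valued at the base country's prices, the other country's bundle looks larger than it does at its own prices.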

These are all bilateral indices, comparing two countries. For PPP we need multilateral indices.

● Like exchange rates we want to be able to convert one to another.

● We therefore also want a “no-arbitrage” condition.

$P_{ij} = P_{ik} P_{kj} \quad \forall \, i, j, k$

● One way to achieve this is the GEKS Index:

● Choose a numeraire or benchmark country “1”, say the US. If there are M countries in the world, then the price index of i relative to 1 is:

$P_i = \prod_{j=1}^{M} \left( P_{1j} P_{ji} \right)^{1/M}$,

where Pji is a superlative index like the Fisher Index. (A superlative index is one that can be justified by a utility function).
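A minimal sketch of the GEKS construction using Fisher indices as the bilateral building blocks. The three countries, their prices and quantities, and the function names are all hypothetical. The point of the example is that bilateral Fisher indices generally fail the no-arbitrage condition, while the GEKS indices built from them satisfy it.

```python
import math

def fisher(base, comp):
    """Bilateral Fisher index of `comp` relative to `base`.
    Each argument is a (prices, quantities) pair."""
    (p_i, q_i), (p_j, q_j) = base, comp
    cost = lambda p, q: sum(a * b for a, b in zip(p, q))
    laspeyres = cost(p_j, q_i) / cost(p_i, q_i)
    paasche = cost(p_j, q_j) / cost(p_i, q_j)
    return math.sqrt(laspeyres * paasche)

def geks(data, base, comp):
    """Geometric mean, over every bridge country k, of the indirect
    comparison base -> k -> comp."""
    m = len(data)
    links = (fisher(data[base], data[k]) * fisher(data[k], data[comp])
             for k in data)
    return math.prod(links) ** (1 / m)

# Hypothetical (prices, quantities) for two goods in three countries.
data = {
    "US": ([1.0, 4.0], [10.0, 5.0]),
    "IN": ([2.0, 2.0], [4.0, 8.0]),
    "IT": ([1.5, 3.0], [6.0, 6.0]),
}

fisher_direct = fisher(data["US"], data["IN"])
fisher_chained = fisher(data["US"], data["IT"]) * fisher(data["IT"], data["IN"])
geks_direct = geks(data, "US", "IN")
geks_chained = geks(data, "US", "IT") * geks(data, "IT", "IN")

print(round(fisher_direct, 4), round(fisher_chained, 4))   # differ
print(math.isclose(geks_direct, geks_chained))             # True
```

The transitivity works because each Fisher link satisfies country reversal, so the bridge terms cancel when two GEKS indices are multiplied.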

● This can also be achieved with a regression. Suppose the relative prices of products, $u_n$, are the same across countries but are denominated in local currencies, so we need an “exchange rate”, or country-specific factor $a_i$, to multiply them.

$p_{in} = a_i u_n$

● Take logs and run an OLS regression. We’ll see an application later.

$\ln p_{in} = \ln a_i + \ln u_n + e_{in}$
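A sketch of what this regression delivers in the simplest case. It assumes a complete price table (every product priced in every country) and normalizes the numeraire country's factor to one. Under those assumptions the OLS estimate of each country's log factor relative to the numeraire is just the difference in average log prices, so no regression package is needed. With missing prices this shortcut no longer applies and an actual regression is required. The data and the name `country_factors` are illustrative.

```python
import math

def country_factors(prices, numeraire):
    """Estimate a_i / a_numeraire from ln p_in = ln a_i + ln u_n + e_in.

    `prices` maps each country to its list of product prices, with the
    same products in the same order for every country (complete table).
    """
    mean_log = {country: sum(math.log(p) for p in row) / len(row)
                for country, row in prices.items()}
    return {country: math.exp(mean_log[country] - mean_log[numeraire])
            for country in prices}

# Hypothetical data built from p_in = a_i * u_n with no noise.
true_a = {"US": 1.0, "IN": 20.0, "UK": 0.8}
u = [1.0, 2.0, 5.0]
prices = {country: [a * u_n for u_n in u] for country, a in true_a.items()}

estimates = country_factors(prices, numeraire="US")
print({country: round(a, 4) for country, a in estimates.items()})
```

With noisy prices the same function returns the geometric mean of each country's price relatives against the numeraire.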

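● The Geary-Khamis or GK system. The idea is to find a set of “World Prices”. It is a system of two equations solved simultaneously (k indexes products):

$P_i^{GK} = \frac{\sum_k q_{ik} p_{ik}}{\sum_k u_k q_{ik}}$

$u_k = \frac{\sum_i q_{ik} \, p_{ik} / P_i^{GK}}{\sum_i q_{ik}}$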

here the base is the “world” .

● Note that the weights are plutocratic. Countries that produce more are given higher weights (in the second line). So China and the US could be compared using Italian prices. Not clear if this is a good idea given Gerschenkron bias.

● GK is additive but GEKS is not.

Convert components of GDP using PPP and add up.
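The Geary-Khamis system is usually solved by iterating between its two equations. The sketch below assumes the standard form of the system (country price level equals local-currency expenditure divided by the same bundle valued at world prices; each world price equals the quantity-weighted average of PPP-converted national prices), normalizes the numeraire country's price level to one, and uses made-up data and an illustrative function name.

```python
def geary_khamis(prices, quantities, numeraire, tol=1e-12, max_iter=1000):
    """Iterate the two GK equations until the price levels stop changing.
    Returns (price level per country, world price per product)."""
    countries = list(prices)
    n_goods = len(prices[numeraire])
    ppp = {c: 1.0 for c in countries}          # starting guess
    world = [0.0] * n_goods
    for _ in range(max_iter):
        # World price of good k: quantity-weighted average of converted prices.
        world = [
            sum(prices[c][k] / ppp[c] * quantities[c][k] for c in countries)
            / sum(quantities[c][k] for c in countries)
            for k in range(n_goods)
        ]
        # Price level of c: own expenditure over its bundle at world prices.
        updated = {
            c: sum(p * q for p, q in zip(prices[c], quantities[c]))
            / sum(u * q for u, q in zip(world, quantities[c]))
            for c in countries
        }
        scale = updated[numeraire]
        updated = {c: level / scale for c, level in updated.items()}
        change = max(abs(updated[c] - ppp[c]) for c in countries)
        ppp = updated
        if change < tol:
            break
    return ppp, world

# Hypothetical data: country B's prices are exactly twice country A's.
prices = {"A": [1.0, 3.0], "B": [2.0, 6.0]}
quantities = {"A": [10.0, 5.0], "B": [4.0, 8.0]}

ppp, world = geary_khamis(prices, quantities, numeraire="A")
print({c: round(level, 6) for c, level in ppp.items()})
print([round(u, 6) for u in world])

# Additivity: real GDP at world prices is a plain sum over components.
real_gdp = {c: sum(u * q for u, q in zip(world, quantities[c])) for c in prices}
```

Because every component is valued at the same world prices, the components add up to total real GDP, which is the sense in which GK is additive.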