FNCE 435 – Empirical Finance

Fall 2023

Individual Assignment (Section I)

(Only students from Section I, Wednesdays 9-11h45 AM, should be working on this version of the assignment.)

(Important: This assignment is to be implemented on an individual basis. No sharing of material— including data, results, inferences and write-ups—may take place among members of a group nor across different groups.)

(About the significance level: For all the tests of significance in this assignment, please assume a 5% significance level.)

Examining Inclusions in the S&P 500

We will examine the event of inclusion of a firm in the S&P 500 index. As the name suggests, the S&P 500 index is an aggregation of the value of 500 public firms that are deemed to be a good representation of the universe of publicly traded firms. One can use the evolution of this aggregated value to compute a return, and this return functions as a proxy for the overall return on the market.

From time to time, the composition of the index is altered. For example, if a firm that is part of the index goes bankrupt, is taken over, or goes private (or is not deemed important anymore), it has to be replaced by another firm. We look at the dates when replacements are announced. More specifically, our data point is a pair of firm i and date t, such that it became known by the market on day t that i would become part of the S&P 500 index.

As usual, we examine whether the announcement that a firm will be included in the S&P 500 carries any implication about the value of the firm. A basic tenet in Finance is that the value of a firm is the present value of its future cash flows.  In particular, this argument suggests that the price of a stock would not depend on how many stocks are available for trading and how many people are willing to buy or sell them.

It happens that it is very difficult to test this argument. One could analyze changes in the number of shares available for trading and what happens to the price of the stock, and perhaps observe a negative association between availability of shares and stock price. In fact, it has been shown that sales of large blocks of a firm’s stock are usually followed by a negative price reaction on the firm’s stock. But, again, association does not imply causation. In this particular case, some bad news may be driving both the large block sales and the negative price reactions (e.g., bad news implying lower future cash flows for the firm).

When a firm is included in the S&P 500 index, the index funds (which try to mimic the exact composition of the index) end up buying the firm’s stock. As a result, a substantial portion of the firm’s shares will be taken out of trading. Given the way firms are chosen to be part of the index (what mostly triggers the inclusion of a firm is the removal of another firm), it is reasonable to assume that no new information about a firm triggered its inclusion in the index. We thus have a situation where there is a change in the amount of stocks available for trading and no change in the information environment about the firm. We can then more clearly test the hypothesis that the price of a stock does not depend on the shares available for trading. In particular, if we find abnormal reactions to the announcement of inclusion in the index we conclude that the price of the stock does depend on the supply of stock—which happens when the demand curve for a stock slopes down. This point is extensively discussed in Shleifer (1986) (“Do Demand Curves Slope Down?,” Journal of Finance 41; available on Canvas), which was the first paper to analyze inclusions in the S&P 500.


I have collected for you a sample of inclusions in the S&P 500 index between 1962 and 2000. The file is called “d_sp500.sas7bdat” and is available on Canvas. Each row of the dataset identifies a different announcement of inclusion in the S&P 500. For each announcement, we record: PERMNO (permno of the firm), COMNAM (the firm name), ANNDATE_SAS (the announcement date), and EFFDATE_SAS (the effective date of the inclusion in the S&P 500).

Figure 1 shows some observations in the sample of announcements of inclusion in the S&P 500 index. The first row refers to the announcement on August 18, 1992 that Sun Microsystems would enter the composition of the index. The effective date of Sun Microsystems’ inclusion in the index was August 19, 1992.

     PERMNO   COMNAM                 ANNDATE_SAS   EFFDATE_SAS
1    10078    SUN MICROSYSTEMS INC   8/18/1992     8/19/1992
2    10107    MICROSOFT CORP         5/12/1994     6/6/1994
3    10137    ALLEGHENY ENERGY INC   12/4/2000     12/8/2000
…    …        …                      …             …

Figure 1: A few data points on the announcements dataset

There are three questions that you will examine with respect to announcements of inclusion in the S&P 500:

I.      Do announcements of inclusion in the S&P 500 have valuation implications? You will examine how markets react to such announcements.

II.      What are the determinants of market reactions to the announcement of a firm’s inclusion in the S&P 500 index?

III.      What characteristics seem to drive the decision to add a firm to the S&P 500 index?

Part I: Valuation Implications

We start our investigation by answering question I above:

Do announcements of inclusion in the S&P 500 have valuation implications? You will examine how the price of a firm’s share changes at the announcement that the firm will be part of the S&P 500 index.

In this part of the project, you will examine how markets react to announcements of inclusion in the S&P 500. You will address this question through an event study, employing the technique covered in module 4. For each announcement, you examine the pattern of abnormal and cumulative abnormal returns over the 21-day window (relative days –10 through +10) around the announcement day (variable ANNDATE_SAS).

For the definition of abnormal return, use the constant-mean return model: define abnormal return as the stock raw return (the variable RET in the CRSP dataset DSF, located in “/wrds/crsp/sasdata/a_stock”) minus the average return of the firm’s stock. For an announcement of inclusion in the S&P 500 of firm i at time t, define average return as the average of the variable RET for firm i over the window [t – 365, t – 20], that is, from 365 calendar days before the announcement date up to 20 calendar days before the announcement date. Important: notice that the abnormal return here is different from the abnormal return used in module 4. In module 4 we employed the market-adjusted return as proxy for abnormal return.
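The constant-mean definition above can be sketched in a few lines. This is only an illustration in Python rather than SAS; the function name `abnormal_returns`, the toy return series and the assumption that the announcement falls on a trading day are all hypothetical and not part of the assignment.

```python
from datetime import date, timedelta
from statistics import mean

def abnormal_returns(daily, ann_date, est_start=365, est_end=20, half_window=10):
    """daily: list of (date, RET) tuples for one firm, sorted by date, trading days only.
    Returns {relative_day: abnormal_return} for relative days -half_window..+half_window."""
    # Estimation window is in calendar days: [t - 365, t - 20].
    lo = ann_date - timedelta(days=est_start)
    hi = ann_date - timedelta(days=est_end)
    mean_ret = mean(r for d, r in daily if lo <= d <= hi and r is not None)
    # Event window is in trading days relative to the announcement (day 0).
    dates = [d for d, _ in daily]
    idx0 = dates.index(ann_date)  # assumption: the announcement is a trading day
    out = {}
    for k in range(-half_window, half_window + 1):
        j = idx0 + k
        if 0 <= j < len(daily) and daily[j][1] is not None:
            out[k] = daily[j][1] - mean_ret
    return out

# Toy data (hypothetical): 0.1% return every weekday, 3.1% on the announcement day.
toy, d = [], date(1991, 8, 1)
while d <= date(1992, 9, 30):
    if d.weekday() < 5:
        toy.append((d, 0.031 if d == date(1992, 8, 18) else 0.001))
    d += timedelta(days=1)

ar = abnormal_returns(toy, date(1992, 8, 18))
```

In this toy series the estimation-window mean is 0.001, so the abnormal return is about 0.03 on day 0 and zero on every other relative day.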

Prepare a table showing average abnormal returns, t-statistics and p-values for the 21-day window around announcements. Also create a graph showing the pattern of average cumulative abnormal returns over the same 21-day window. Then discuss how the markets interpret the announcements of inclusion of a firm in the S&P 500 index. Anchor your inferences on formal hypothesis testing.

Finally, examine whether markets respond efficiently to news in such announcements. For this you can assume that some announcements happen after the close of the market, so that market reactions to such announcements could happen up to one day after the event date. Given that you track up to 10 days after the event date, and assuming that day +1 is still part of the event, you can check for market efficiency by analyzing the cumulative abnormal return from day +2 through day +10.
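As a rough sketch of the statistics involved, the snippet below computes the cross-sectional mean abnormal return and its t-statistic for one relative day, and the average cumulative abnormal return over a range of days such as +2 through +10. The function names and the toy events are hypothetical, and the p-value uses a large-sample normal approximation, which is an assumption of this sketch; a t distribution with n – 1 degrees of freedom is the usual choice in small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def day_stats(ar_by_event, k):
    """Mean, t-statistic, two-sided p-value and sample size for relative day k.
    H0: the average abnormal return on day k is zero."""
    xs = [ar[k] for ar in ar_by_event if k in ar]
    m = mean(xs)
    t = m / (stdev(xs) / sqrt(len(xs)))
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # normal approximation (assumption)
    return m, t, p, len(xs)

def average_car(ar_by_event, first, last):
    """Average across events of abnormal returns summed over days first..last."""
    return mean(sum(ar[k] for k in range(first, last + 1)) for ar in ar_by_event)

# Toy events (hypothetical): a reaction on day 0 only.
events = []
for day0 in (0.02, 0.03, 0.04):
    ar = {k: 0.0 for k in range(-10, 11)}
    ar[0] = day0
    events.append(ar)

m, t, p, n = day_stats(events, 0)
post = average_car(events, 2, 10)  # post-event drift check, days +2..+10
```

With real data, a post-event average CAR that is statistically indistinguishable from zero would be consistent with an efficient response.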

Part II: More on Valuation Implications: A Regression Framework

The next step is to understand the market reactions to announcements of inclusion in the S&P 500. This refers to question II above:

What are the determinants of market reactions to the announcement of a firm’s inclusion in the S&P 500 index? In part I, you examined whether markets react to the announcements; now you examine, in a regression framework, the determinants of the market reactions to such announcements.

Two basic control variables are examined. One has to do with the power of the index fund industry. Page 584 of Shleifer (1986) hints at that. It explains that an important implication of the story that the value of a stock should be related to the number of shares available to trade is that “the share price increase on the announcement date should be positively related to the shift of the demand curve.” For that we can look at the pattern of increase in index fund activity.

Here is a picture of the pattern of passive holdings, from Anadu et al. (2018).

 

Figure 2: Index funds holdings: share of total ownership

The graph shows a steady increase in the fraction of stocks held by index (also known as “passive”) funds, from less than 5% in 1995 to almost 40% by 2017. Thus, a stock being included in the index in the later part of the sample would imply a relatively larger fraction of its stocks being taken by index funds.

How to define the strength of an announcement of inclusion in the S&P 500? The perfect measure would be the share of total holdings by index funds. But we do not have access to that measure. Instead we will use a simple time trend. Define the variable LT as the natural logarithm of the number of days since 01/01/1962.
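The definition of LT translates directly into a date difference followed by a logarithm. A minimal sketch in Python (the function name `lt` and the constant `BASE_DATE` are hypothetical):

```python
from datetime import date
from math import log

BASE_DATE = date(1962, 1, 1)

def lt(ann_date):
    """Natural log of the number of days elapsed since 01/01/1962."""
    days = (ann_date - BASE_DATE).days
    return log(days)  # not defined for the base date itself (days == 0)

lt_sun = lt(date(1992, 8, 18))  # the Sun Microsystems announcement in Figure 1
```

Because LT is in logs, its regression coefficient is read in terms of proportional changes in elapsed time rather than in days.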

The second explanatory variable is a measure of buying by index funds, proxied by the excess announcement date volume. As Shleifer (1986) explains, “representatives of Vanguard Index Trust and Wells Fargo Index Fund have suggested to me that these funds buy the necessary shares within a few days of the inclusion.” To gauge the amount of each stock bought by index funds and self-indexing investors, we look at abnormal daily volume on the announcement date, defined as the difference between the daily volume on the announcement date and the average daily volume in the previous six months, both expressed as a fraction of the number of shares outstanding.

More specifically, the measure of abnormal volume, AVOL, is defined as follows. The daily stock dataset DSF (located in “/wrds/crsp/sasdata/a_stock”) contains the variables VOL (number of shares traded on that day) and SHROUT (number of shares outstanding, in thousands of shares). For an announcement that firm i will enter the index, compute for i, in each of the relative days in the window [–180, +1], where day 0 is ANNDATE_SAS, the measure PERC_VOL=100*VOL/(SHROUT*1000). Notice that PERC_VOL is stored in percentage terms (so that PERC_VOL=0.5 means 0.5%). PERC_VOL thus measures the fraction of i’s shares that traded on that day. The announcement’s percentage trading volume is PERC_VOL at day 0 plus PERC_VOL at day +1. We then subtract from the announcement’s volume the average PERC_VOL over the window [–180, –10] to get the firm’s abnormal volume, AVOL.
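The steps above can be sketched as follows. The function names and the toy volume figures are hypothetical; the arithmetic follows the text literally, so the two-day announcement volume (day 0 plus day +1) is compared with a one-day average over [–180, –10].

```python
from statistics import mean

def perc_vol(vol, shrout):
    """Percentage of shares outstanding traded in a day (SHROUT is in thousands)."""
    return 100.0 * vol / (shrout * 1000.0)

def avol(rows):
    """rows: {relative_day: (VOL, SHROUT)} for one announcement, day 0 = ANNDATE_SAS."""
    pv = {k: perc_vol(v, s) for k, (v, s) in rows.items()}
    announcement = pv[0] + pv[1]  # PERC_VOL at day 0 plus PERC_VOL at day +1
    baseline = mean(pv[k] for k in pv if -180 <= k <= -10)
    return announcement - baseline

# Toy data (hypothetical): 1,000,000 shares outstanding, 5,000 shares traded
# on a normal day and 20,000 shares traded on days 0 and +1.
toy_rows = {k: (5000, 1000) for k in range(-180, 2)}
toy_rows[0] = (20000, 1000)
toy_rows[1] = (20000, 1000)
toy_avol = avol(toy_rows)
```

In the toy data PERC_VOL is 0.5 on a normal day and 2.0 on days 0 and +1, so AVOL is 4.0 minus 0.5, or 3.5 percentage points.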

We thus want to examine the relation between the information embedded in the announcement of inclusion in the S&P 500 and market reactions around the announcement. The event study does not address this question because it pools together all announcements, disregarding the information about the announcements’ characteristics.

For that, one needs a regression framework that relates the market reaction with the information in the announcement. The left-hand side variable (named CAR) is defined as the cumulative abnormal return from relative day 0 (the announcement day) to relative day +1 (the day after the announcement). While this variable can be obtained from the data used to analyze your event study in part I, you should use the measure provided by the instructor in the dataset “d_sp500_car.sas7bdat” (available on Canvas). Each row of this dataset has the variables PERMNO, ANNDATE_SAS and CAR, so that you can combine this dataset with the dataset on inclusions in the S&P 500 in order to obtain the CAR measure for each firm entering the index.

(The CAR measure in “d_sp500_car.sas7bdat” is not exactly the measure you may obtain from your event study, and thus should not be used to evaluate whether your event study is correct. However, the CAR measure in this dataset is close enough to the true measure and thus should be employed for your analysis in part II.)

The main explanatory variables in the regression are LT and AVOL. The idea is thus to run a regression as

$$\mathrm{CAR}_i = \beta_0 + \beta_1 \mathrm{LT}_i + \beta_2 \mathrm{AVOL}_i + \varepsilon_i$$

Please collect the results of the regression, then analyze whether LT and AVOL are related to CAR. Treat each variable (and its hypothesis) separately. Examine the magnitude of the effect of LT on the market reaction in the firm’s stock price. Repeat the analysis for AVOL.

However, the problem with inferences from the model above is that we may need to control for other potential determinants of market reactions. Since the model aims at explaining returns, we may need to control for other determinants of returns, such as the CAPM’s beta and firm size. One can quickly build up stories in which leaving these out of the model brings the wrath of the omitted variable bias. For example, perhaps the time trend variable LT is also related to the firm’s beta (say, because firms in general have become riskier over time). If riskier companies have higher returns, then this risk effect might be biasing the results of a regression model that omits beta from the set of explanatory variables.

To avoid the perils of the omitted variable bias, the idea is to run a multiple regression model, as in

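$$\mathrm{CAR}_i = \beta_0 + \beta_1 \mathrm{LT}_i + \beta_2 \mathrm{AVOL}_i + \beta_3 \mathrm{BETA}_i + \beta_4 \mathrm{SIZE}_i + \beta_5 \mathrm{ROA}_i + \varepsilon_i$$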

where the other control variables in the model are described here:

•    BETA: the beta of the firm’s stock. The information is available to you in the dataset “d_sp500.sas7bdat”.

•    SIZE: the natural logarithm of total assets of the firm entering the index, measured in the year before the year of the announcement, that is, as of YEAR(ANNDATE_SAS) – 1. Total assets is the variable AT in Compustat dataset FUNDA, located at “/wrds/comp/sasdata/nam”. We use the log formulation so that our analysis refers to rates of change in total assets being associated with market reactions.

•    ROA: the proxy for firm performance, call it ROA, is defined as the return on assets for the firm in the three years preceding the year of the announcement of the inclusion in the S&P 500. Return on assets is defined as net income (variable NI) divided by total assets (variable AT), both in Compustat dataset FUNDA, located at “/wrds/comp/sasdata/nam”. Notice that ROA is stored in decimals (that is, ROA=0.01 implies a 1% ROA).

Given that many of these variables are new, it is a good idea to create and show a summary statistics table of the variables involved in this study. You can also create and show a correlation table involving them. Having (and showing) the summary statistics and the correlation table, discuss whether the concerns about the omitted variable bias are warranted.

Run the regression model above and analyze its results. You should address, at the minimum, the following items:

•    For each of the control variables in the regression, discuss whether it is related to market reactions to the announcement. If so, please discuss the magnitude of the effect.

•    Discuss the R² of the model. What does the R² represent here? (You should present the adjusted R².)

•    Please make sure that your results are not plagued by heteroscedasticity. First, discuss whether the residuals in the model are homoscedastic or not. If they are not homoscedastic, please adjust the standard errors to account for heteroscedasticity.

In your last step, examine a possible nonlinearity in the multiple regression model above, in that the effect of AVOL on market reactions depends on firm size. That is, add the variable AVOL_SIZE=AVOL*SIZE to the model, rerun the regression, and discuss the inferences you learn from the interaction effect (i.e., reexamine the effect of AVOL on market reactions).
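To see why the interaction term changes how the AVOL effect is read, it may help to write out the augmented model and differentiate it with respect to AVOL. The label β6 for the coefficient on AVOL_SIZE is an assumption made here for illustration; the other coefficients are numbered as in the multiple regression above.

```latex
\mathrm{CAR}_i = \beta_0 + \beta_1 \mathrm{LT}_i + \beta_2 \mathrm{AVOL}_i + \beta_3 \mathrm{BETA}_i
               + \beta_4 \mathrm{SIZE}_i + \beta_5 \mathrm{ROA}_i
               + \beta_6 \,(\mathrm{AVOL}_i \times \mathrm{SIZE}_i) + \varepsilon_i

\frac{\partial\, \mathrm{CAR}_i}{\partial\, \mathrm{AVOL}_i} = \beta_2 + \beta_6 \,\mathrm{SIZE}_i
```

Under this specification β2 alone is the effect of AVOL only for a firm with SIZE = 0, so the effect is usually evaluated at representative values of SIZE (for example its sample mean).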

Part III: Determinants of Inclusion in the S&P 500

We now address question III:

What characteristics seem to drive the decision to add a firm to the S&P 500 index?

Shleifer (1986) discusses that Neubert ("The Ins and Outs of the S & P 500," Market Perspectives 3, Chicago Mercantile Exchange, February, 1985) had listed six criteria for inclusion of a firm in the S&P 500 index: size, industry classification, capitalization, turnover, emerging companies/industries, and responsiveness of the movements of stock price to changes in industry affairs. Let’s try some of these: size, industry membership and turnover, and also add performance.

The idea is to look at a firm that was announced to enter the index and compare it with a firm that was not announced to be part of the index. Thus, each firm entering the index is randomly matched to another firm that did not enter the index at the announcement date. Information about the matched firm also appears in the dataset “d_sp500.sas7bdat”. Examples appear in Figure 3.

We already know about the August 18th, 1992 announcement that Sun Microsystems (PERMNO=10078) was entering the index. The variable MATCHED_PERMNO contains the identifier of a second firm that was not announced to enter the index on that same day. Thus, the firm with permanent number 54173 did not enter the index on August 18th, 1992 (nor in the year surrounding that date).

     PERMNO   COMNAM                 ANNDATE_SAS   …   MATCHED_PERMNO
1    10078    SUN MICROSYSTEMS INC   8/18/1992     …   54173
2    10107    MICROSOFT CORP         5/12/1994     …   79298
3    10137    ALLEGHENY ENERGY INC   12/4/2000     …   10180
…    …        …                      …             …   …

Figure 3: More variables in the announcements dataset

The dataset allows us to analyze the determinants of inclusion in the S&P 500, since we have data on companies that entered the index and companies that did not. Since the decision to include a firm in the index is a binary variable, you should use PROC LOGISTIC to implement your regression model, as

$$\mathrm{Prob}(\mathrm{INC}_i = 1) = f(\beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki})$$

where INC=1 denotes a firm that was included in the index.

Notice that in order to run this regression, you need to have a dataset with a different row for each firm, such that each row of the announcements dataset yields two rows of your regression dataset. Take the first row in Figure 3, for example. From that row, you need to create one row with PERMNO=10078, ANNDATE_SAS=8/18/1992 and INC=1 and one row with PERMNO=54173, ANNDATE_SAS=8/18/1992 and INC=0, as in Figure 4 here:

     INC   PERMNO   ANNDATE_SAS
1    1     10078    8/18/1992
2    0     54173    8/18/1992
3    1     10107    5/12/1994
4    0     79298    5/12/1994
5    1     10137    12/4/2000
6    0     36864    12/4/2000

Figure 4: Reorganizing the announcements dataset for a logistic regression
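The reorganization shown in Figure 4 amounts to emitting two rows per announcement. A minimal sketch in Python, with a hypothetical function name `to_long` and plain dictionaries standing in for dataset rows:

```python
def to_long(announcements):
    """Turn each announcement row into two rows: the included firm (INC=1)
    and its matched firm (INC=0), both carrying the same ANNDATE_SAS."""
    long_rows = []
    for row in announcements:
        long_rows.append({"INC": 1, "PERMNO": row["PERMNO"],
                          "ANNDATE_SAS": row["ANNDATE_SAS"]})
        long_rows.append({"INC": 0, "PERMNO": row["MATCHED_PERMNO"],
                          "ANNDATE_SAS": row["ANNDATE_SAS"]})
    return long_rows

# First two rows of Figure 3.
wide = [
    {"PERMNO": 10078, "MATCHED_PERMNO": 54173, "ANNDATE_SAS": "8/18/1992"},
    {"PERMNO": 10107, "MATCHED_PERMNO": 79298, "ANNDATE_SAS": "5/12/1994"},
]
long_rows = to_long(wide)
```

Keeping ANNDATE_SAS on both rows matters because the explanatory variables for the matched firm are measured relative to the same announcement date.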

The explanatory variables of the logistic regression come from the criteria discussed in Shleifer (1986):

•    Turnover (TURNOVER): companies with high trading volume might be more amenable to becoming part of the index. We measure turnover as the average of PERC_VOL defined over the calendar days [–180, –10], where day 0 is ANNDATE_SAS. Recall that the measure of daily PERC_VOL was defined when discussing the control variable AVOL in part II. (Notice a slightly different definition here: in part II we used PERC_VOL to build AVOL based on trading days. Here we use PERC_VOL to build TURNOVER based on calendar days.)

•    Firm size (SIZE): firm size is as defined in part II. It is the natural logarithm of total assets of the firm, measured in the year before the year of the announcement, that is, as of YEAR(ANNDATE_SAS) – 1. Total assets is the variable AT in Compustat dataset FUNDA, located at “/wrds/comp/sasdata/nam”.

•    Stock performance (PERF): a good past performance may attract the attention of S&P when it decides which firms enter the index. After all, one should not add a firm to the index if the firm is expected to underperform (and face the risk of being delisted). PERF for firm i and announcement date t is defined as the average daily return for i’s stock computed between 730 calendar days prior to t and 365 calendar days prior to t. PERF is measured in decimals (so that PERF=0.01 implies a 1% average daily return).

•    Industry membership (binary variables IND1 through IND11): the dataset “d_sp500” contains the variables IND (for the firm entering the index) and MATCHED_IND (for the matched firm). IND (MATCHED_IND) is a number between 1 and 12, indicating the firm (the matched firm) belongs to one of 12 industries. You will need to define 11 binary variables to identify the industry to which the firm (the matched firm) belongs. For example, you define IND1=1 if IND is equal to 1; IND1=0 if IND is different from 1. Then you define IND2=1 if IND is equal to 2; IND2=0 if IND is different from 2. And similarly for the remaining binary variables.
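The industry indicators described in the last item can be generated mechanically. In the sketch below the function name is hypothetical, and treating industry 12 as the omitted baseline is an inference from the text asking for 11 binary variables for 12 industries.

```python
def industry_dummies(ind, n_industries=12):
    """Return {'IND1': 0/1, ..., 'IND11': 0/1} for an industry code 1..12.
    Industry 12 gets all zeros and serves as the omitted baseline (assumption)."""
    return {f"IND{j}": int(ind == j) for j in range(1, n_industries)}

d3 = industry_dummies(3)    # IND3 = 1, all other indicators 0
d12 = industry_dummies(12)  # every indicator 0
```

For matched firms the same function would be applied to MATCHED_IND instead of IND.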

Prepare and show a summary statistics table of the explanatory variables for the sample with INC=1 vs. the sample with INC=0. Interpret the numbers.

Then run the logistic regression explaining the likelihood of the firm entering the index,

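$$\mathrm{Prob}(\mathrm{INC}_i = 1) = f(\beta_0 + \beta_1 \mathrm{TURNOVER}_i + \beta_2 \mathrm{SIZE}_i + \beta_3 \mathrm{PERF}_i + \beta_4 \mathrm{IND1}_i + \beta_5 \mathrm{IND2}_i + \dots + \beta_{14} \mathrm{IND11}_i)$$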

Report the regression results. Discuss the significance of each coefficient, and interpret the effect of each variable on the likelihood that a firm enters the index. The effect should be based on the change in the odds that the inclusion occurs. You can play with some specific changes; for example, when examining the effect of changes in PERF, you may examine the effect of having the PERF increase by 0.01—say from 0.10 to 0.11.
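In a logit model, raising a regressor by Δ multiplies the odds of inclusion by exp(β·Δ), which is what the interpretation above relies on. A small sketch follows; the coefficient value 50.0 is purely hypothetical and is not an estimate from the assignment data.

```python
from math import exp

def odds_multiplier(beta, delta):
    """Factor by which the odds of inclusion change when a regressor rises by delta."""
    return exp(beta * delta)

beta_perf = 50.0  # hypothetical coefficient on PERF, for illustration only
factor = odds_multiplier(beta_perf, 0.01)  # PERF rising by 0.01
pct_change_in_odds = 100.0 * (factor - 1.0)
```

With this made-up coefficient the odds would be multiplied by exp(0.5); a negative coefficient would give a factor below one, meaning lower odds.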

Finally, compute the predicted probability of a firm entering the index for a firm with the following values: TURNOVER=0.10, AT=60, PERF=0.002, and IND=3 (you can rely on an Excel worksheet to compute this probability).
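The same calculation can be laid out in a few lines of Python instead of a worksheet. The sketch assumes that f is the logistic function, f(x) = 1/(1 + exp(–x)), which is the default link in PROC LOGISTIC. Every coefficient below is a hypothetical placeholder to be replaced by your own estimates, and SIZE is obtained as the natural log of AT.

```python
from math import exp, log

def predicted_probability(coefs, turnover, at, perf, ind):
    """Logistic predicted probability of inclusion for one firm."""
    size = log(at)  # SIZE is the natural log of total assets
    xb = (coefs["Intercept"] + coefs["TURNOVER"] * turnover
          + coefs["SIZE"] * size + coefs["PERF"] * perf)
    xb += coefs.get(f"IND{ind}", 0.0)  # industry 12 is the omitted baseline
    return 1.0 / (1.0 + exp(-xb))

# Hypothetical coefficients, for illustration only.
coefs = {"Intercept": -4.0, "TURNOVER": 1.5, "SIZE": 0.8, "PERF": 40.0, "IND3": 0.3}
prob = predicted_probability(coefs, turnover=0.10, at=60, perf=0.002, ind=3)
```

Only the indicator for the firm’s own industry enters the linear index, since all other industry indicators are zero for that firm.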