Instrumental Variables

Overview

Instrumental variables are briefly described on this page, followed by an annotated list of resources.

Description

Instrumental variable (IV) estimation is used when the model has endogenous X's. IV can therefore be used to address the following major threats to internal validity:

1. Omitted variable bias from a variable that is correlated with X but is unobserved and therefore cannot be included in the regression
2. Errors-in-variables bias (X is measured with error)
3. Simultaneous causality bias (endogenous explanatory variables; X causes Y, Y causes X)

Instrumental variable regression can eliminate bias from these three sources.

  • Sources of bias: omitted variables, measurement error, simultaneous relationships
    Consider the following regression model
    $$y_i = x_{i1}' b_1 + x_{i2}' b_2 + v_i,$$
    which satisfies the standard OLS assumptions. Assume that the variable $x_2$ is not observed. The estimated regression model is therefore
    $$y_i = x_{i1}' b_1 + u_i,$$
    where $u_i = x_{i2}' b_2 + v_i$. The regressors $x_k$ in $x_1$ are therefore correlated with the error term $u$ if they are correlated with the omitted variable $x_2$. If $x_{i1}$ and $x_{i2}$ are scalars, then $\mathrm{cov}(x_{i1}, u_i) = b_2 \, \mathrm{cov}(x_{i1}, x_{i2})$.

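  • Measurement error
    When a regressor is observed only with error, the measurement error ends up in the error term u of the estimated model. The observed regressor is then correlated with u, in the same way that the omitted variable produced a nonzero covariance in the example above.
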
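  • Simultaneous relationship
    Consider, for example, a system of two equations in which each dependent variable appears as a regressor in the other equation:
    $$y_{i1} = b_{12} \, y_{i2} + b_{11} \, z_{i1} + u_{i1}$$
    $$y_{i2} = b_{21} \, y_{i1} + b_{22} \, z_{i2} + u_{i2}$$
    Such a system of equations is also known as reverse causality, since the dependent variable y1 has a feedback effect on the regressor y2. In this example, z2 and z1 are natural instruments for IV estimation of the first and second equations, respectively.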
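The omitted-variable covariance result can be checked numerically. The following is a minimal simulation sketch, not part of the original material: the coefficient values, sample size, and data-generating process are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
b1, b2 = 1.0, 2.0                    # hypothetical true coefficients

x2 = rng.normal(size=n)              # omitted (unobserved) variable
x1 = 0.5 * x2 + rng.normal(size=n)   # observed regressor, correlated with x2
v = rng.normal(size=n)               # error term satisfying the OLS assumptions
y = b1 * x1 + b2 * x2 + v

u = b2 * x2 + v                      # error term of the regression that omits x2

cov_x1_u = np.cov(x1, u)[0, 1]       # covariance between regressor and error
cov_x1_x2 = np.cov(x1, x2)[0, 1]     # covariance between regressor and omitted variable

# OLS slope from regressing y on x1 alone (with an intercept)
ols_slope = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)

print(f"cov(x1, u)      = {cov_x1_u:.3f}")
print(f"b2 * cov(x1,x2) = {b2 * cov_x1_x2:.3f}")
print(f"OLS slope       = {ols_slope:.3f} (true b1 = {b1})")
```

In this setup the two covariances agree up to sampling noise, and the OLS slope is pulled away from b1 by roughly b2 * cov(x1, x2) / var(x1).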

Instrumental Variables: Intuition

  • An instrumental variable Z is uncorrelated with the error term e but correlated with X (e.g. proximity to a college might be correlated with schooling but not with the wage residual)

  • With this new variable, the IV estimator should capture only the effect on Y of shifts in X that are induced by Z, whereas the OLS estimator captures not only the direct effect of X on Y but also the effect of the included measurement error and/or endogeneity

  • IV is not as efficient as OLS (especially when Z is only weakly correlated with X, i.e. when we have so-called 'weak instruments') and has only large-sample properties (consistency)

  • IV yields biased coefficients in finite samples. With weak instruments, the bias can be large
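The intuition above can be illustrated with a small simulation. This is a sketch under assumed values (the coefficients 0.8 and 0.6, the true effect of 1.0, and the sample size are all hypothetical), using the simple one-instrument IV estimator cov(Z, Y) / cov(Z, X).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = 1.0                                   # hypothetical true effect of X on Y

z = rng.normal(size=n)                       # instrument: independent of e
e = rng.normal(size=n)                       # error term of the outcome equation
x = 0.8 * z + 0.6 * e + rng.normal(size=n)   # X is endogenous: it depends on e
y = beta * x + e

# OLS slope: contaminated by the correlation between X and e
ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV slope: uses only the variation in X that is induced by Z
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS estimate: {ols:.3f}")
print(f"IV estimate:  {iv:.3f} (true effect = {beta})")
```

With a large sample and a reasonably strong instrument, the IV estimate lands close to the true effect while OLS does not; shrinking the coefficient on z in the line that generates x makes the IV estimate much noisier.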

Identification and estimation

Compliance status in the potential outcomes framework

  • Suppose an experimenter conducts a randomized experiment in which the participants are preschool children, the treatment is watching the Sesame Street television program, and the outcome of interest is the score on a letter recognition test

  • In this experiment, watching itself cannot be randomly assigned; only encouragement to watch the show can be randomly assigned

  • Using the randomized encouragement, we can estimate a causal effect of watching for at least some of the children in the study

  • As shown in the table below, the children in the study can be categorized according to their compliance status

| Status        | Xi(1) | Xi(0) |
|---------------|-------|-------|
| Always-takers | 1     | 1     |
| Never-takers  | 0     | 0     |
| Compliers     | 1     | 0     |
| Defiers       | 0     | 1     |

  • Compliers are the only children for whom we can draw conclusions about the effect of watching Sesame Street. This effect is known as the Complier Average Causal Effect (CACE).
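The classification by compliance status can be written as a simple lookup. The function below is a hypothetical helper for illustration only; note that in real data only one of the two potential treatment values is observed for each child, so this classification cannot be applied to individuals directly.

```python
def compliance_status(x_if_encouraged: int, x_if_not_encouraged: int) -> str:
    """Map the pair of potential treatments (Xi(1), Xi(0)) to a compliance status."""
    statuses = {
        (1, 1): "always-taker",   # watches regardless of encouragement
        (0, 0): "never-taker",    # never watches, regardless of encouragement
        (1, 0): "complier",       # watches only if encouraged
        (0, 1): "defier",         # watches only if NOT encouraged
    }
    return statuses[(x_if_encouraged, x_if_not_encouraged)]


print(compliance_status(1, 0))
```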

Four important assumptions for IV

  1. Ignorability of the instrument: The instrument must be randomized, or conditionally randomized, with respect to the outcome and treatment variables

  2. Nonzero association between the instrument and the treatment variable: The instrument must have an effect on the treatment

  3. Monotonicity: We assume there are no children who would watch if they were not encouraged but would not watch if they were encouraged (no defiers)

  4. Exclusion restriction: The instrument has no direct effect on the outcome; it affects the outcome only through the treatment

Wald estimator and two-stage least squares estimator: the Sesame Street example

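| Unit | Xi(0) | Xi(1) | Status       | Zi | Yi(0) | Yi(1) | Yi(1)-Yi(0) |
|------|-------|-------|--------------|----|-------|-------|-------------|
| 1    | 0     | 1     | Complier     | 0  | 67    | 76    | 9           |
| 2    | 0     | 1     | Complier     | 0  | 72    | 80    | 8           |
| 3    | 0     | 0     | Never-taker  | 0  | 68    | 68    | 0           |
| 4    | 1     | 1     | Always-taker | 0  | 76    | 76    | 0           |
| 5    | 1     | 1     | Always-taker | 0  | 74    | 74    | 0           |
| 6    | 0     | 1     | Complier     | 1  | 67    | 76    | 9           |
| 7    | 0     | 1     | Complier     | 1  | 72    | 80    | 8           |
| 8    | 0     | 0     | Never-taker  | 1  | 68    | 68    | 0           |
| 9    | 1     | 1     | Always-taker | 1  | 76    | 76    | 0           |
| 10   | 1     | 1     | Always-taker | 1  | 74    | 74    | 0           |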

  • The intent-to-treat (ITT) effect for the 10 observations in the hypothetical table above is the average of the effects for the 4 compliers (the children induced to watch), along with 6 zeros corresponding to the encouragement effects for the always-takers and never-takers:

    ITT = (9 + 8 + 0 + 0 + 0 + 9 + 8 + 0 + 0 + 0) / 10 = 8.5 * (4/10) + 0 * (6/10) = 3.4

  • The effect of watching Sesame Street for compliers is 8.5 points, which equals the intent-to-treat effect (3.4) divided by the proportion of compliers (4/10). This ratio is called the Wald estimate.

  • Two-stage least squares (2SLS) is a more general estimation strategy within a regression framework that allows controlling for covariates. It requires the following steps:

    - Regress the treatment variable on the randomized instrument
    - Plug the predicted values of the treatment into the equation that predicts the outcome
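The Wald estimate and the two 2SLS steps can be reproduced from the hypothetical table. The sketch below assumes that each child's observed treatment and outcome are the potential values corresponding to the encouragement actually received, and that the column of 0s and 1s next to the compliance status is the encouragement indicator; the variable names are illustrative.

```python
import numpy as np

# Observed data implied by the hypothetical table (units 1-10):
# z = encouragement received, x = watched (Xi(z)), y = test score (Yi(z)).
z = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
x = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 1], dtype=float)
y = np.array([67, 72, 68, 76, 74, 76, 80, 68, 76, 74], dtype=float)

# Wald estimate: ITT effect on the outcome divided by ITT effect on the treatment
itt = y[z == 1].mean() - y[z == 0].mean()          # effect of encouragement on scores
first_stage = x[z == 1].mean() - x[z == 0].mean()  # estimated proportion of compliers
wald = itt / first_stage

# 2SLS, step 1: regress the treatment on the instrument (with an intercept)
Z = np.column_stack([np.ones_like(z), z])
gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
x_hat = Z @ gamma                                  # predicted treatment

# 2SLS, step 2: regress the outcome on the predicted treatment
X_hat = np.column_stack([np.ones_like(x_hat), x_hat])
beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
tsls = beta[1]

# Note: standard errors from this manual second stage would be wrong;
# dedicated IV routines correct them.
print(f"ITT = {itt:.1f}, compliers = {first_stage:.1f}, Wald = {wald:.1f}, 2SLS = {tsls:.1f}")
```

With a single binary instrument and no covariates, the 2SLS coefficient coincides with the Wald estimate, which is why both recover the complier effect of 8.5 here.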

Some problems for IV

Where do valid instruments come from?
  • A common way to find instruments is to look for exogenous variation, that is, variation that is 'as if' randomly assigned in a randomized experiment, that affects X.
    - A cigarette sales tax shifts the supply curve for cigarettes but not the demand curve; sales taxes are 'as if' randomly assigned

Weak instruments
  • IV estimates are not unbiased, and the bias tends to be larger with weak instruments (even with very large datasets).
  • Adding more and more instruments to improve asymptotic efficiency does not solve the problem.
  • Recommendation: always test the 'strength' of your instrument(s) by reporting the F-test on the instruments in the first-stage regression.
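A first-stage F-test of this kind can be sketched as follows. The helper `first_stage_f` is a hypothetical function written for this illustration; it assumes a first stage with an intercept and the instruments only (no other covariates), and the simulated instrument strengths are arbitrary. A commonly cited rule of thumb, which is not taken from the material above, treats a first-stage F below about 10 as a sign of weak instruments.

```python
import numpy as np


def first_stage_f(x, instruments):
    """F-statistic for the joint null that all instrument coefficients are zero
    in a first-stage regression of x on an intercept and the instruments."""
    x = np.asarray(x, dtype=float)
    Z = np.asarray(instruments, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    n, q = Z.shape
    full = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(full, x, rcond=None)
    rss_unrestricted = np.sum((x - full @ coef) ** 2)   # with instruments
    rss_restricted = np.sum((x - x.mean()) ** 2)        # intercept only
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - q - 1))


rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)
x_strong = 0.5 * z + rng.normal(size=n)    # instrument strongly related to X
x_weak = 0.02 * z + rng.normal(size=n)     # instrument barely related to X

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
print(f"first-stage F (strong instrument): {f_strong:.1f}")
print(f"first-stage F (weak instrument):   {f_weak:.1f}")
```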

Summary

  • A valid instrument lets us isolate the part of X that is uncorrelated with the error term u, and that part can be used to estimate the effect of a change in X on Y.
  • IV depends on having valid instruments: a valid instrument isolates variation that is 'as if' randomly assigned.

Readings

Textbooks & Chapters

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton, NJ: Princeton University Press.

- One of the canonical textbooks of microeconometrics, covering the main causal inference techniques including IV, difference-in-differences, fixed effects, regression discontinuity, quantile regression, and standard error issues, with important previous applications. Compared to other causal inference books, the IV part of this book is explained in more detail; understanding it fully requires knowledge of OLS and asymptotic theory.

Morgan, Stephen L., and Christopher Winship. 2007. Counterfactuals and Causal Inference: Methods and Principles for Social Research. New York, NY: Cambridge University Press.

- The first comprehensive review of the counterfactual approach to causal inference from a potential outcomes framework, written for a social science audience with a strong emphasis on causal reasoning rather than mathematical derivations. It is now somewhat out of date, and a second edition is due in 2014 or 2015. One chapter is specifically dedicated to IV.

Guo, Shenyang, and Mark W. Fraser. 2010. Propensity Score Analysis: Statistical Methods and Applications. Thousand Oaks, CA: Sage Publications.

- A textbook oriented specifically toward propensity score matching, which also briefly covers IV together with conceptually similar methods such as Heckman's sample selection model and the treatment effect model. Stata code with related examples is provided.

Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

- An excellent results-oriented treatment of modern applied econometrics including the IV method, and a current favorite in advanced econometrics courses. This book focuses more on mathematical details and therefore requires a solid understanding of multivariate analysis.

Methodological articles

Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. 1996. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association 91 (434): 444-455.

- The classic treatment of IV from a counterfactual perspective. The canonical treatment of estimating the local average treatment effect.

Martens, Edwin P., Wiebe R. Pestman, Anthonius de Boer, Svetlana V. Belitser, and Olaf H. Klungel. 2006. Instrumental Variables: Application and Limitations. Epidemiology 17 (3): 260-267.

- An introductory article by epidemiologists.

Hernán, Miguel A., and James M. Robins. 2006. Instruments for Causal Inference: An Epidemiologist's Dream? Epidemiology 17 (4): 360-372.

- Offers four different definitions of IV with some extensions.

Swanson, Sonja A., and Miguel A. Hernán. 2013. How to Report Instrumental Variable Analyses (Suggestions Welcome). Epidemiology 24 (3): 370-374.

- Provides a normative checklist for performing IV analyses.

Bollen, Kenneth A. 2012. Instrumental Variables in Sociology and the Social Sciences. Annual Review of Sociology 38: 37-72.

- A recent overview of IV applications in sociology and the social sciences.

Application articles

Angrist, Joshua D. 1990. Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records. American Economic Review 80 (3): 313-336.

- Perhaps the most famous IV application.

Acemoglu, Daron, Simon Johnson, and James A. Robinson. 2001. The Colonial Origins of Comparative Development: An Empirical Investigation. American Economic Review 91 (5): 1369-1401.

- Another classic IV application, with European settler mortality rates as the instrument.

Kim, Daniel, Christopher F. Baum, Michael L. Ganz, S.V. Subramanian, and Ichiro Kawachi. 2011. The Contextual Effects of Social Capital on Health: A Cross-National Instrumental Variable Analysis. Social Science & Medicine 73: 1689-1697.

- Uses corruption and population density, and religious fractionalization and population density, as instruments for country-level social capital.

Fish, Jason S., Susan Ettner, Alfonso Ang, and Arleen F. Brown. 2010. Association of Perceived Neighborhood Safety on Body Mass Index. American Journal of Public Health 100 (11): 2296-2303.

- Uses household crime and neighborhood collective efficacy as instruments for perceived neighborhood safety.

Davies, Neil, George Davey Smith, Frank Windmeijer, and Richard M. Martin. 2013. COX-2 Selective Nonsteroidal Anti-Inflammatory Drugs and the Risk of Gastrointestinal Complications and Myocardial Infarction: An Instrumental Variable Analysis. Epidemiology 24 (3): 352-362.

- The most careful and comprehensive assessment of the IV assumptions in any application.

Web pages

Workshop on applied microeconometrics (by Guido W. Imbens and Jeffrey M. Wooldridge)
http://www.irp.wisc.edu/newsevents/workshops/appliedmicroeconometrics/schedule1.htm

Cyrus Samii's class website for Quant II (weeks 10-11 on instrumental variables)
http://cyrussamii.com/?page_id=1595

Courses

  • Causal Inference: Methods for Program Evaluation and Policy Research (with Jennifer Hill at NYU Steinhardt; offered in the fall semester)
  • Quantitative Political Analysis II (with Cyrus Samii at NYU Politics; offered in the spring semester)
  • Quantitative Strategies (with Thomas DiPrete at Columbia Sociology; offered in the fall semester)
    - The teaching material is essentially the same as Samii's but with a more sociological research orientation.
