
Propensity Score Analysis








The PS is a probability. Specifically, it is the conditional probability of being exposed given a set of covariates, Pr(E+ | covariates). We can calculate a PS for each subject in an observational study regardless of their actual exposure.

Once we have a PS for each subject, we return to the real world of the exposed and the unexposed. We can match exposed subjects with unexposed subjects who have the same (or a very similar) PS. Within a matched pair, both subjects had the same probability of being exposed, so which of the two was actually exposed is effectively random.


Propensity score analysis (PSA) was developed to achieve interchangeability (often called exchangeability) between exposed and unexposed groups in observational studies without relying on traditional model building. Interchangeability is critical to our causal inference.

In experimental studies (e.g., randomized controlled trials with equal allocation to two arms), the probability of being exposed is 0.5, so the probability of being unexposed is also 0.5. Because the two probabilities are equal, a subject's actual exposure status is random.

This equal probability of exposure allows us to better claim that the exposed and non-exposed groups are the same for all factors except their exposure. Hence we say that we have interchangeability between groups.

One of the greatest challenges in observational studies is that the likelihood of being in the exposed or non-exposed group is not random.

In many cases an experimental study is not feasible or ethical. However, we would still like the interchangeability between groups that randomization provides. PSA helps us mimic an experimental study using data from an observational study.

Implementation of PSA

5 quick steps to PSA
1. Decide on the set of covariates you want to include.
2. Use logistic regression to get a PS for each subject.
3. Match exposed and unexposed subjects on the PS.
4. Check the balance of the covariates in the exposed and unexposed groups after matching on the PS.
5. Calculate the effect estimate and standard errors with this matched population.

1. Decide on the set of covariates you want to include.
This is the critical step in your PSA. We use these covariates to predict the probability of exposure. We want to include all predictors of exposure and none of the effects of exposure. We do not consider the outcome when deciding on our covariates. We may include confounders and interaction variables. If we are in doubt about a covariate, we include it in our covariate set (unless we think it is an effect of the exposure).

2. Use logistic regression to get a PS for each subject.
We use the covariates to predict the probability of exposure (which is the PS). The more true covariates we use, the better our prediction of the probability of exposure. We calculate a PS for all subjects, exposed and unexposed.

With numbers and Greek letters:
PS = exp(β0 + β1X1 + … + βpXp) / (1 + exp(β0 + β1X1 + … + βpXp))
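
As a minimal sketch of step 2, the logistic formula above can be evaluated and fitted in plain Python. The toy data, the two covariates (a scaled age and a smoking indicator), the function names, and the use of simple gradient ascent are all assumptions made for illustration; in practice a statistical package's logistic regression routine would normally be used.

```python
import math

def propensity_score(intercept, coefs, covariates):
    """PS = exp(b0 + b1*x1 + ... + bp*xp) / (1 + exp(b0 + b1*x1 + ... + bp*xp))."""
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    # 1 / (1 + exp(-linear)) is algebraically identical to the formula above.
    return 1.0 / (1.0 + math.exp(-linear))

def fit_logistic(X, exposed, lr=0.1, n_iter=5000):
    """Estimate the intercept and coefficients by gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for row, e in zip(X, exposed):
            resid = e - propensity_score(intercept, coefs, row)
            grad0 += resid
            for j in range(p):
                grad[j] += resid * row[j]
        intercept += lr * grad0 / n
        coefs = [b + lr * g / n for b, g in zip(coefs, grad)]
    return intercept, coefs

# Hypothetical data: [scaled age, smoker (0/1)] and exposure status (1 = exposed).
X = [[0.2, 0], [0.4, 0], [0.6, 0], [0.75, 0], [0.3, 1],
     [0.35, 0], [0.5, 1], [0.7, 1], [0.9, 1], [0.8, 0]]
exposed = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

intercept, coefs = fit_logistic(X, exposed)
# Every subject gets a PS, whether exposed or not.
ps = [propensity_score(intercept, coefs, row) for row in X]
```

Note that the PS is computed for all ten subjects from their covariates alone; the exposure status is used only to fit the model.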

3. Match exposed and unexposed subjects on the PS.
We want to match exposed and unexposed subjects on their probability of exposure (their PS). If we cannot find a suitable match for a subject, that subject is discarded. Discarding subjects can introduce bias into our analysis.

There are several matching methods. The most common is nearest neighbor within calipers. The nearest neighbor is the unexposed subject whose PS is closest to the PS of our exposed subject.

We may not be able to find an exact match, so we accept a PS within a certain range, the caliper. We set the caliper value a priori. It typically ranges from +/-0.01 to +/-0.05. Below 0.01, we may see a lot of variability in the estimate because we struggle to find matches and have to discard those subjects (incomplete matching). Above 0.05, we can be less confident that our exposed and unexposed subjects are truly interchangeable (inexact matching). Typically, 0.01 is chosen as the cutoff.
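
A minimal sketch of nearest-neighbor matching within calipers follows. It assumes greedy 1:1 matching without replacement, processed in the order the exposed subjects are listed; the function name and the PS values are hypothetical.

```python
def match_nearest_within_caliper(exposed_ps, unexposed_ps, caliper=0.01):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (pairs, unmatched), where pairs holds (exposed_index, unexposed_index)
    and unmatched holds the indices of exposed subjects that were discarded.
    """
    available = list(range(len(unexposed_ps)))
    pairs, unmatched = [], []
    for i, ps in enumerate(exposed_ps):
        # Nearest remaining unexposed subject by absolute PS distance.
        best = min(available, key=lambda j: abs(unexposed_ps[j] - ps), default=None)
        if best is not None and abs(unexposed_ps[best] - ps) <= caliper:
            pairs.append((i, best))
            available.remove(best)
        else:
            unmatched.append(i)  # no acceptable match: subject is discarded
    return pairs, unmatched

# Hypothetical propensity scores.
exposed_ps = [0.30, 0.52, 0.90]
unexposed_ps = [0.10, 0.305, 0.50, 0.55]

wide_pairs, wide_unmatched = match_nearest_within_caliper(exposed_ps, unexposed_ps, caliper=0.05)
narrow_pairs, narrow_unmatched = match_nearest_within_caliper(exposed_ps, unexposed_ps, caliper=0.01)
```

With these made-up scores, the wider caliper keeps two of the three exposed subjects, while the narrower caliper keeps only one, which illustrates the trade-off between incomplete and inexact matching.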

The ratio of exposed to unexposed subjects is variable. 1:1 matching can be done, but often matching with replacement is done instead to allow better matches. Matching with replacement returns an unexposed subject who has been matched with an exposed subject to the pool of unexposed subjects available for matching.

There is a trade-off between bias and precision in matching with replacement versus without (1:1). Matching with replacement gives less bias because of better matching between subjects. Matching without replacement gives better precision because more subjects are used.
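
The trade-off can be illustrated with a small sketch that runs nearest-neighbor matching both ways. The caliper is deliberately left out here to isolate the effect of replacement; the function names and PS values are assumptions for illustration only.

```python
def nearest_neighbor_match(exposed_ps, unexposed_ps, with_replacement):
    """Match each exposed subject to the nearest unexposed subject by PS."""
    pool = list(range(len(unexposed_ps)))
    pairs = []
    for i, ps in enumerate(exposed_ps):
        if not pool:
            break  # no unexposed subjects left to match
        j = min(pool, key=lambda k: abs(unexposed_ps[k] - ps))
        pairs.append((i, j))
        if not with_replacement:
            pool.remove(j)  # a used subject does not return to the pool
    return pairs

def mean_ps_distance(pairs, exposed_ps, unexposed_ps):
    """Average absolute PS difference within matched pairs (closeness of matching)."""
    return sum(abs(exposed_ps[i] - unexposed_ps[j]) for i, j in pairs) / len(pairs)

# Hypothetical propensity scores.
exposed_ps = [0.40, 0.42]
unexposed_ps = [0.41, 0.60]

pairs_with = nearest_neighbor_match(exposed_ps, unexposed_ps, with_replacement=True)
pairs_without = nearest_neighbor_match(exposed_ps, unexposed_ps, with_replacement=False)

dist_with = mean_ps_distance(pairs_with, exposed_ps, unexposed_ps)
dist_without = mean_ps_distance(pairs_without, exposed_ps, unexposed_ps)
distinct_with = len({j for _, j in pairs_with})
distinct_without = len({j for _, j in pairs_without})
```

In this toy case, matching with replacement gives closer matches (smaller `dist_with`) but reuses a single unexposed subject, whereas matching without replacement uses two distinct unexposed subjects at the cost of one poor match.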

4. Check the balance of the covariates in the exposed and non-exposed groups after matching for PS.
In order to draw causal conclusions from our data, there must be considerable overlap in the covariates between the exposed and unexposed groups. This is true of all models, but in PSA it becomes visually very apparent. If there is no overlap in the covariates (i.e., no overlap in the propensity scores), then any inference would be made without the support of the data (and thus the conclusions would be model dependent).
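
One simple way to inspect overlap numerically is to compute the range of PS values shared by both groups. This is only a rough sketch under the assumption that a min/max range is an adequate summary; the function name and the scores are hypothetical, and a histogram of the PS in each group gives a fuller picture.

```python
def common_support(exposed_ps, unexposed_ps):
    """Return the (low, high) PS range covered by both groups, or None if there is none."""
    low = max(min(exposed_ps), min(unexposed_ps))
    high = min(max(exposed_ps), max(unexposed_ps))
    return (low, high) if low <= high else None

# Hypothetical propensity scores.
exposed_ps = [0.35, 0.50, 0.80, 0.95]
unexposed_ps = [0.05, 0.20, 0.40, 0.60]

support = common_support(exposed_ps, unexposed_ps)
low, high = support
# Subjects outside the shared range have no comparable counterpart in the other group.
off_support_exposed = [p for p in exposed_ps if not (low <= p <= high)]
off_support_unexposed = [p for p in unexposed_ps if not (low <= p <= high)]
```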

We can use a few tools to assess our covariate balance. First, we can make a histogram of the PS for the exposed and unexposed groups. Second, we can examine the standardized differences. Third, we can assess the bias reduction.

Standardized difference = 100 * (mean(x exposed) - mean(x unexposed)) / sqrt((SD^2 exposed + SD^2 unexposed) / 2)

More than 10% difference is considered bad. Our covariates are distributed too differently between exposed and non-exposed groups for us to feel comfortable assuming interchangeability between groups.
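
The standardized difference formula can be sketched in a few lines of Python. The use of the sample variance (n - 1 denominator) and the ages below are assumptions for illustration.

```python
import math
from statistics import mean, variance

def standardized_difference(x_exposed, x_unexposed):
    """100 * difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(x_exposed) + variance(x_unexposed)) / 2)
    return 100 * (mean(x_exposed) - mean(x_unexposed)) / pooled_sd

# Hypothetical covariate (age in years) in each group.
age_exposed = [60, 62, 64, 66, 68]
age_unexposed = [58, 60, 62, 64, 66]

d = standardized_difference(age_exposed, age_unexposed)
imbalanced = abs(d) > 10  # the 10% rule of thumb from the text
```

Here the means differ by 2 years and both groups have a variance of 10, so the standardized difference is about 63%, well above the 10% threshold.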
We would like to see a substantial reduction in bias from the unmatched to the matched analysis. What counts as substantial is up to you.
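
The text does not give a formula for bias reduction. A commonly used definition, assumed here, is one minus the ratio of the absolute standardized difference after matching to that before matching. The function name and the numbers are hypothetical.

```python
def bias_reduction(std_diff_unmatched, std_diff_matched):
    """Assumed definition: 1 - |standardized difference matched| / |standardized difference unmatched|."""
    return 1 - abs(std_diff_matched) / abs(std_diff_unmatched)

# Hypothetical standardized differences (in percent) for one covariate.
reduction = bias_reduction(std_diff_unmatched=40.0, std_diff_matched=8.0)
```

In this example the covariate's standardized difference falls from 40% to 8%, a reduction of about 0.8 (80%).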

5. Calculate the effect estimate and standard errors with this matched population.
Estimate of the average treatment effect on the treated (ATT) = sum(y exposed - y unexposed) / number of matched pairs
The resulting matched pairs can also be analyzed using standard statistical methods, e.g., Kaplan-Meier or Cox proportional hazards models. You can also include the PS as a continuous measure in the final analysis model, or create quartiles and stratify on them.
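
The ATT formula above can be sketched as follows. The outcome values are hypothetical, and the standard error shown is the simple paired-difference standard error, which is an assumption of this sketch: it ignores the uncertainty from estimating the PS and any reuse of unexposed subjects.

```python
import math
from statistics import stdev

def att_from_pairs(pairs):
    """Mean within-pair outcome difference; pairs holds (y_exposed, y_unexposed)."""
    return sum(y_e - y_u for y_e, y_u in pairs) / len(pairs)

def paired_standard_error(pairs):
    """Standard error of the mean paired difference (simple paired analysis)."""
    diffs = [y_e - y_u for y_e, y_u in pairs]
    return stdev(diffs) / math.sqrt(len(diffs))

# Hypothetical outcomes for four matched pairs.
matched_outcomes = [(12, 10), (15, 14), (9, 10), (20, 16)]

att = att_from_pairs(matched_outcomes)
se = paired_standard_error(matched_outcomes)
```

The pairwise differences are 2, 1, -1 and 4, so the estimated ATT is 6 / 4 = 1.5.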

A few more notes on PSA
Since PSA can only handle measured covariates, the full implementation should include sensitivity analysis to assess unobserved covariates.
Although PSA has traditionally been used in epidemiology and biomedicine, it has also been used in educational testing (Rubin is one of the founders) and ecology (the EPA has a website on PSA!).

Strengths and limitations of PSA

Strengths
PSA uses one score instead of multiple covariates in estimating the effect. This allows an investigator to use dozens of covariates, which is usually not possible in traditional multivariable models because of limited degrees of freedom and the zero-count cells that arise from stratifying on multiple covariates.
The patients included in such a study may be a more representative sample of real-world patients than an RCT would provide.
We avoid drawing conclusions outside the support of the data.
We do not need to know the causes of the outcome to create interchangeability.

Limitations
The group overlap must be substantial (to enable appropriate matching).
PSA works best in large samples, where good covariate balance can be obtained.
PSA does not account for clustering (problematic for neighborhood-level research).


Textbooks & Chapters

Oakes JM and Johnson PJ. 2006. Propensity score matching for social epidemiology, in Methods in Social Epidemiology (eds. JM Oakes and JS Kaufman), Jossey-Bass, San Francisco, CA.
Simple and clear introduction to PSA with a worked example from social epidemiology.

Hirano K and Imbens GW. 2005. The propensity score with continuous treatments, in Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives: An Essential Journey with Donald Rubin’s Statistical Family (eds. A Gelman and XL Meng), John Wiley & Sons, Ltd, Chichester, UK.
Discussion of the use of PSA for continuous treatments.

Methodological articles

Rosenbaum PR and Rubin DB. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika, 70 (1); 41-55.
Seminal article on PSA.

Rosenbaum PR and Rubin DB. 1985. The bias due to incomplete matching. Biometrics, 41 (1); 103-116.
Discussion of the bias due to incomplete matching of subjects in PSA.

D’Agostino RB. 1998. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med, 17; 2265-2281.
Another discussion of PSA with worked examples. Includes calculations of standardized differences and bias reduction.

Joffe MM and Rosenbaum PR. 1999. Invited commentary: propensity scores. Am J Epidemiol, 150 (4); 327-333.
Discussion of the uses and limitations of PSA. Also includes discussion of PSA in case-cohort studies.

Application articles

Kumar S and Vollmer S. 2012. Access to Improved Sanitation Reduces Diarrhea in Rural India. Health Economics. DOI: 10.1002/hec.2809
Applies PSA to sanitation and diarrhea in children in rural India. The paper explains in detail how the PSA was conducted. Good example.

Suh HS, Hay JW, Johnson KA, and Doctor JN. 2012. Comparative effectiveness of statin plus fibrate combination therapy and statin monotherapy in patients with type 2 diabetes: use of propensity-score and instrumental variable methods to adjust for treatment-selection bias. Pharmacoepidemiology and Drug Safety. DOI: 10.1002/pds.3261
Applies PSA to therapies for type 2 diabetes. Also compares PSA with instrumental variables.

Rubin DB. 2001. Using Propensity Scores to Design Observational Studies: Application to the Tobacco Litigation. Health Services & Outcomes Research Methodology, 2; 169-188.
More advanced use of PSA by one of the developers of PSA.

Landrum MB and Ayanian JZ. 2001. Causal effect of ambulatory specialty care on mortality following myocardial infarction: a comparison of propensity score and instrumental variable analyses. Health Services & Outcomes Research Methodology, 2; 221-245.
A good, clear example of PSA applied to mortality after MI. Includes a comparison with IV methods.

Bingenheimer JB, Brennan RT, and Earls FJ. 2005. Firearm violence exposure and serious violent behavior. Science, 308; 1323-1326.
Interesting example of PSA applied to firearm violence exposure and subsequent serious violent behavior.

Websites

Statistical software implementation

For SAS macro:
http://ndc.mayo.edu/mayo/research/biostat/sasmacros.cfm
gmatch: Computerized matching of cases to controls using the greedy matching algorithm with a fixed number of controls per case.
vmatch: Computerized matching of cases to controls using variable optimal matching.

SAS Documentation:

Introduction for Stata:
http://help.pop.psu/edu/help-by-statistical-method/propensity-metching/Intro to P-score_Sp08.pdf

For R program:


General information on PSA

Slides from the ASA presentation by Thomas Love 2003:

Resources (handouts, annotated bibliography) by Thomas Love:

Explanation and example of PSA from ecology:


An online workshop on Propensity Score Matching is available through EPIC

Estimating the impact of mental health interventions in non-experimental settings
Thu-Fri, June 14-15, 2012, 8:30 am - 4:30 pm
