This page briefly describes the False Discovery Rate (FDR) and provides an annotated resource list.
When analyzing the results of genome-wide studies, thousands of hypothesis tests are often run concurrently. Correcting for multiple comparisons with the traditional Bonferroni method is too conservative: protecting so strictly against false positives causes many true results to be overlooked. To identify as many significant comparisons as possible while still maintaining a low false positive rate, the false discovery rate (FDR) and its analogue, the q-value, are used.
Definition of the Problem
When we run a hypothesis test, for example to see whether two means differ significantly, we compute a p-value: the probability of obtaining a test statistic as extreme as, or more extreme than, the observed one, assuming the null hypothesis is true. For example, a p-value of 0.03 would mean that if the null hypothesis were true, there would be a 3% chance of obtaining our observed test statistic or a more extreme one. Since this is a small probability, we reject the null hypothesis and say that the means are significantly different. We usually want to keep this probability below 5%. When we set alpha to 0.05, we are saying that the probability a null result is called significant should be less than 5%. In other words, we want the probability of a Type I error, or false positive, to be less than 5%.
If we perform multiple comparisons (we will call each test a "feature"), we have an increased chance of false positives. The more features we test, the more likely it is that a truly null feature will be called significant. The false positive rate (FPR), or per-comparison error rate (PCER), is the expected proportion of false positives among all hypothesis tests performed. If we control the FPR at an alpha of 0.05, we guarantee that the proportion of false positives (truly null features called significant) among all hypothesis tests is 5% or less. This poses a problem when we perform a large number of hypothesis tests. For example, if we conducted a genome-wide study of the difference in gene expression between tumor tissue and healthy tissue, tested 1000 genes, and controlled the FPR, on average 50 truly null genes would be called significant. This method is too generous; we do not want that many false positives.
Instead, multiple comparison methods typically control the family-wise error rate (FWER), which is the probability of one or more false positives among all hypothesis tests performed. The frequently used Bonferroni correction controls the FWER: by testing each hypothesis at a significance level of (alpha / number of hypothesis tests), we guarantee that the probability of one or more false positives is less than alpha. So if alpha were 0.05 and we tested our 1000 genes, we would test each p-value at a significance level of 0.00005 to guarantee that the probability of one or more false positives is 5% or less. However, protecting against even a single false positive can be too strict for genome-wide studies and leads to many missed findings, especially when we expect many true positives.
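The Bonferroni adjustment just described can be illustrated with a minimal Python sketch; the function name and p-values are invented for illustration, not drawn from any particular study.

```python
# Minimal sketch of the Bonferroni correction: each of m p-values is tested
# at a per-test significance level of alpha / m.

def bonferroni_significant(p_values, alpha=0.05):
    per_test_alpha = alpha / len(p_values)  # e.g. 0.05 / 1000 = 0.00005
    return [p <= per_test_alpha for p in p_values]

# With 1000 tests, only p-values at or below 0.00005 survive the correction.
pvals = [0.00001] + [0.01] * 999
print(sum(bonferroni_significant(pvals)))  # → 1 feature called significant
```

Note how a p-value of 0.01, ordinarily "significant" at alpha = 0.05, is no longer significant after the correction; this is the conservativeness the FDR approach is designed to relax.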
False Discovery Rate (FDR) control is a way to identify as many significant features as possible while getting a relatively low percentage of false positives.
Steps to control the false discovery rate:
Control the FDR at level α* (i.e., the expected proportion of false discoveries out of the total number of discoveries is controlled)
Calculate the p-value for each hypothesis test and order them from smallest to largest, P(1), …, P(m)
For the ith ordered p-value, check whether the following holds:
P(i) ≤ α × i / m
If true, it is significant; the largest i satisfying the inequality sets the cutoff, and all features with smaller p-values are also called significant
* Limitation: if the error rate (α) is very high, this can lead to an increased number of false positives among the significant results
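The steps above (the Benjamini-Hochberg step-up procedure) can be sketched in Python; this is a minimal illustration with made-up p-values, not a production implementation.

```python
# Sketch of the Benjamini-Hochberg step-up procedure described above:
# find the largest rank i with P(i) <= alpha * i / m, then reject the
# hypotheses with the i smallest p-values.

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean list: True where the hypothesis is rejected."""
    m = len(p_values)
    # Sort p-values while remembering their original positions
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank satisfying P(i) <= alpha * i / m
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    # Reject all hypotheses with the k smallest p-values
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))  # → first two hypotheses rejected
```

With these ten p-values and α = 0.05, only P(1) = 0.001 ≤ 0.005 and P(2) = 0.008 ≤ 0.01 pass their thresholds, so two discoveries are made.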
The False Discovery Rate (FDR)
The FDR is the rate at which features called significant are truly null.
FDR = expected (# false discoveries / # total discoveries)
An FDR of 5% means that, among all features called significant, 5% are truly null. Just as we set alpha as a threshold on the p-value to control the FPR, we can set a threshold on the q-value, the FDR analogue of the p-value. A p-value threshold (alpha) of 0.05 yields an FPR of 5% among all truly null features. A q-value threshold of 0.05 yields an FDR of 5% among all features called significant. The q-value of a feature is the expected proportion of false positives among all features as extreme as or more extreme than the one observed.
In our 1000-gene study, suppose gene Y has a p-value of 0.00005 and a q-value of 0.03. The probability that a non-differentially expressed gene would produce a test statistic as extreme as or more extreme than that of gene Y is 0.00005. However, gene Y's test statistic may be very extreme, and that test statistic may be unlikely even for a differentially expressed gene. It is entirely possible that there are truly differentially expressed genes with test statistics less extreme than gene Y's. The q-value of 0.03 means that 3% of the genes as extreme as or more extreme than gene Y (i.e., the genes with smaller p-values) are false positives. Using q-values lets us decide how many false positives we are willing to accept among all the features we call significant. This is especially useful when we want to make a large number of discoveries for later confirmation, e.g., in genome-wide studies where we expect a sizable portion of the features to be truly alternative and do not want to restrict our discovery capacity.
The FDR has a number of useful properties. If all null hypotheses are true (no features are truly alternative), then the FDR equals the FWER. If some hypotheses are truly alternative, then controlling the FWER automatically controls the FDR as well.
The power of the FDR method (recall that power is the probability of rejecting the null hypothesis when the alternative is true) is uniformly greater than that of the Bonferroni method, and the power advantage of the FDR over Bonferroni-type methods increases with the number of hypothesis tests.
(From Storey and Tibshirani, 2003)
Definitions:
t: p-value threshold
V(t): number of false positives (truly null features called significant)
S(t): number of features called significant
m0: number of truly null features
m: total number of hypothesis tests (features)
How do we estimate E[S(t)]? We use the observed number of significant features, S(t) = #{p-values ≤ t}.
How do we estimate E[V(t)]? Because truly null p-values are uniformly distributed, E[V(t)] = m0 × t.
How do we estimate m0? Equivalently, we estimate π0 = m0/m, the proportion of truly null features, as described below.
We assume that the p-values of null features are uniformly distributed (have a flat distribution) on [0,1]. The height of the flat part of the distribution gives a conservative estimate of the proportion of truly null features, π0. For example, the following picture from Storey and Tibshirani (2003) is a density histogram of 3000 p-values for 3000 genes from a gene expression study. The dashed line represents the height of the flat part of the histogram. We expect truly null features to form this flat distribution on [0,1] and truly alternative features to have p-values closer to 0.
π0 is estimated as π̂0 = #{p-values > λ} / (m(1 − λ)), where λ is a tuning parameter (for example, in the picture above we could choose λ = 0.5, since the distribution is fairly flat for p-values above 0.5). The proportion of truly null features is thus the number of p-values greater than λ divided by m(1 − λ). As λ approaches 0 (when most of the distribution is treated as flat), the denominator approaches m, and so does the numerator, since the majority of the p-values will be greater than λ, and π̂0 approaches 1 (all features are null).
The choice of λ is usually automated by statistical software.
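This estimate can be sketched directly in Python; the fixed λ = 0.5 mirrors the illustrative choice in the text, whereas real software (e.g., the qvalue package) tunes λ automatically, and the example p-values are invented.

```python
# Sketch of the pi0 estimate from Storey and Tibshirani (2003):
# pi0_hat(lambda) = #{p_i > lambda} / (m * (1 - lambda))

def estimate_pi0(p_values, lam=0.5):
    m = len(p_values)
    num_above = sum(1 for p in p_values if p > lam)  # p-values in the flat tail
    return num_above / (m * (1 - lam))

# 60 "alternative-looking" small p-values plus 40 p-values above 0.5:
pvals = [0.01] * 60 + [0.6] * 40
print(estimate_pi0(pvals))  # → 0.8, the estimated proportion of null features
```

The estimate 40 / (100 × 0.5) = 0.8 is conservative: some truly null p-values fall below 0.5, so the tail slightly over-counts the null proportion per unit width.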
Now that we have estimated π0, we can estimate FDR(t) as FDR̂(t) = π̂0 × m × t / #{p-values ≤ t}. The numerator is the expected number of false positives at threshold t (a fraction t of the π̂0 × m truly null, uniformly distributed p-values fall below t), and the denominator is the observed number of features called significant.
The q-value for a feature is then the minimum estimated FDR that can be attained when calling that feature significant.
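Putting the two previous formulas together, q-values can be computed as a running minimum of FDR̂(t) over the ordered p-values. The sketch below assumes the plug-in estimate FDR̂(t) = π̂0 × m × t / #{p-values ≤ t}; the function name and example p-values are illustrative.

```python
# Sketch of q-value computation: for the ith smallest p-value,
# FDR_hat = pi0 * m * p / i, and the q-value is the minimum FDR_hat
# over all thresholds at which that feature would be called significant.

def q_values(p_values, pi0=1.0):
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, taking a running minimum of FDR_hat
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        fdr_hat = pi0 * m * p_values[idx] / rank
        running_min = min(running_min, fdr_hat)
        q[idx] = running_min
    return q

print(q_values([0.005, 0.01, 0.03, 0.5]))  # approximately [0.02, 0.02, 0.04, 0.5]
```

Using π0 = 1 (as here) is the conservative default; plugging in the π̂0 estimate from above shrinks every q-value proportionally.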
(Note: The above definitions assume that m is very large, so that S > 0. When S = 0 the FDR is undefined, so in the statistical literature the quantity E[V/S | S > 0] × Pr(S > 0) is used as the FDR. Alternatively, the positive FDR (pFDR), defined as E[V/S | S > 0], is used. See Benjamini and Hochberg (1995) and Storey and Tibshirani (2003) for more information.)
Textbooks & Chapters
RECENT ADVANCES IN BIOSTATISTICS (Volume 4):
Edited by Manish Bhattacharjee (New Jersey Institute of Technology, USA), Sunil K. Dhar (New Jersey Institute of Technology, USA) & Sundarraman Subramanian (New Jersey Institute of Technology, USA).
The first chapter of this book reviews the FDR control methods proposed by leading statisticians in the field and suggests a new adaptive method that controls the FDR when the p-values are independent or positively dependent.
Intuitive Biostatistics: A Non-Mathematical Guide to Statistical Thinking
This is a statistics book written for scientists who lack a complex statistical background. Part E, Challenges in Statistics, explains the problem of multiple comparisons in lay terms, along with the various ways of dealing with it, including basic descriptions of the family-wise error rate and the FDR.
Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction
This book reviews the concept of the FDR and examines its value not only as an estimation method but also as a significance testing procedure. The author also provides an empirical assessment of the accuracy of FDR estimates.
Benjamini, Y. and Y. Hochberg (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society, Series B (Methodological) 57(1): 289-300.
This 1995 paper was the first formal description of the FDR. The authors explain mathematically how the FDR relates to the family-wise error rate (FWER), provide a simple example of using the FDR, and conduct a simulation study demonstrating the power of the FDR method compared to Bonferroni-type methods.
Storey, J.D. and R. Tibshirani (2003). Statistical significance for genome-wide studies. Proceedings of the National Academy of Sciences 100 (16): 9440-9445.
This article explains what the FDR is, why it is important for genome-wide studies, and how the FDR can be estimated. It gives examples of situations in which the FDR would be useful and provides a worked example of how the authors used the FDR to analyze microarray differential gene expression data.
Storey JD. (2010) False discovery rates. In International Encyclopedia of Statistical Science, Lovric M (editor).
A very good article on FDR control, the positive FDR (pFDR), and dependence. Recommended for a concise overview of the FDR and related multiple comparison methods.
Reiner A, Yekutieli D, Benjamini Y: Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 2003, 19(3): 368-375.
This article uses simulated microarray data to compare three resampling-based FDR control methods with the Benjamini-Hochberg method. The test statistics are resampled so that no distribution needs to be assumed for each gene's differential expression test statistic.
Verhoeven KJF, Simonsen KL, McIntyre LM: Implementing false discovery rate control: increasing your power. Oikos 2005, 108(3): 643-647.
This paper explains the Benjamini-Hochberg method, provides a simulation example and discusses recent developments in the FDR area that can deliver more power than the original FDR method.
Stan Pounds and Cheng Cheng (2004) Improving false discovery rate estimation. Bioinformatics Vol. 20 No. 11, pages 1737-1745.
This paper introduces a method called the spacings LOESS histogram (SPLOSH), proposed for estimating the conditional FDR (cFDR), the expected proportion of false positives conditional on having k "significant" findings.
Daniel Yekutieli, Yoav Benjamini (1999) Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. Journal of Statistical Planning and Inference 82(1999): 171-196.
This paper introduces a new FDR control method to deal with correlated test statistics. The method includes calculating a p-value based on resampling. The properties of this method are evaluated with the help of a simulation study.
Yoav Benjamini and Daniel Yekutieli (2001) The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 2001, Vol. 29, No. 4, 1165-1188.
The originally proposed FDR method was intended for use in testing multiple hypotheses with independent test statistics. This paper shows that the original FDR method also controls the FDR when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. An example of dependent test statistics would be testing multiple endpoints between treatment and control groups in a clinical trial.
John D. Storey (2003) The positive false discovery rate: a Bayesian interpretation and the q-value. The Annals of Statistics 2003, Vol. 31, No. 6, 2013-2035.
This paper defines the positive false discovery rate (pFDR), the expected proportion of false positives among all tests called significant, given that at least one test is called significant. The paper also offers a Bayesian interpretation of the pFDR.
Yudi Pawitan, Stefan Michiels, Serge Koscielny, Arief Gusnanto and Alexander Ploner (2005) False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics Vol. 21 No. 13, pages 3017-3024.
This paper describes a method for calculating the sample size for a comparative two-sample study based on FDR control and sensitivity.
Grant GR, Liu J, Stoeckert CJ Jr. (2005) A practical false discovery rate approach to identifying patterns of differential expression in microarray data. Bioinformatics 2005, 21(11): 2684-90.
The authors describe the methods of permutation estimation and discuss questions related to the choice of statistical and data transformation methods by researchers. Performance optimization related to the use of microarray data is also explored.
Jianqing Fan, Frederick L. Moore, Xu Han, Weijie Gu. Estimating false discovery proportion under arbitrary covariance dependence. J Am Stat Assoc. 2012; 107(499): 1019-1035.
This paper proposes and describes a method of controlling FDR based on a principal factor approximation of the covariance matrix of the test statistic.
Han S, Lee K-M, Park SK, Lee JE, Ahn HS, Shin HY, Kang HJ, Koo HH, Seo JJ, Choi JE, et al.: Genome-wide association study of childhood acute lymphoblastic leukemia in Korea. Leukemia Research 2010, 34(10): 1271-1274.
This was a genome-wide association study (GWAS) testing one million single nucleotide polymorphisms (SNPs) for association with childhood acute lymphoblastic leukemia (ALL). The authors controlled the FDR at 0.2 and found that 6 SNPs in 4 different genes were strongly associated with ALL risk.
Pedersen, K. S., Bamlet, W. R., Oberg, A. L., de Andrade, M., Matsumoto, M. E., Tang, H., Thibodeau, S. N., Petersen, G. M. and Wang, L. (2011). The DNA methylation signature of leukocytes differentiates pancreatic cancer patients from healthy controls. PLoS ONE 6, e18223.
This study controlled the FDR at <0.05 when searching for differentially methylated genes between pancreatic cancer patients and healthy controls in order to find epigenetic biomarkers of disease.
Daniel W. Lin, Liesel M. FitzGerald, Rong Fu, Erika M. Kwon, Siqun Lilly Zheng, et al. (2011) Genetic variants in the LEPR, CRY1, RNASEL, IL4, and ARVCF genes are prognostic markers of prostate cancer-specific mortality. Cancer Epidemiol Biomarkers Prev 2011; 20: 1928-1936.
This study examined variation in selected candidate genes associated with prostate cancer occurrence in order to test their prognostic value in high-risk individuals. The FDR was used to rank single nucleotide polymorphisms (SNPs) and identify the top SNPs of interest.
Radom-Aizik S, Zaldivar F, Leu S-Y, Adams GR, Oliver S, Cooper DM: Effects of exercise on microRNA expression in peripheral blood mononuclear cells of young men. Clinical and Translational Science 2012, 5 (1): 32-38.
This study examined changes in microRNA expression before and after exercise using a microarray. The authors used the Benjamini-Hochberg method to control the FDR at 0.05 and found that 34 of 236 microRNAs were differentially expressed. From these 34, the researchers then selected microRNAs to be confirmed with real-time PCR.
Annotated R code for analyzing data in the paper by Storey and Tibshirani (2003), including a link to the data file. This code can be customized to work with any array data.
qvalue package for R.
The R Journal is a peer-reviewed, open-access publication of the R Foundation for Statistical Computing. This volume includes the article 'Sample Size Estimation While Controlling False Discovery Rates for Microarray Experiments' by Megan Orr and Peng Liu. Specific functions and detailed examples are provided.
This website provides a list of R software for FDR analysis with links to their home pages for a description of the package functions.
Description of PROC MULTTEST in SAS, which offers options for controlling the FDR using different methods.
Provides Stata commands for calculating FDR-adjusted q-values for multiple test procedures.
FDR general web resources
Website maintained by the Tel Aviv University statisticians who first formally introduced the FDR.
Many references are available on this FDR website, and a talk on the FDR is available for viewing.
A nice, concise explanation of the FDR, with a useful at-a-glance summary and an example.
A brief overview of false positives and q values.
A false discovery rate control tutorial by Christopher R. Genovese, Department of Statistics, Carnegie Mellon University.
This PowerPoint is a very thorough tutorial for someone interested in learning the mathematical foundations of the FDR and its variations.
Multiple testing, by Joshua Akey, Department of Genome Sciences, University of Washington.
This PowerPoint offers a very intuitive understanding of multiple comparisons and the FDR. The talk is good for those seeking a simple understanding of the FDR without a lot of math.
Estimating the local false discovery rate in the detection of differential expression between two classes.
This video presentation was helpful for learning about the local FDR, i.e., the probability that a particular hypothesis is null given its specific test statistic or p-value.
False discovery rate control procedures for discrete tests
This video presentation was helpful for learning about applying FDR control to discrete data. Several step-up and step-down methods for FDR control with discrete data are discussed, and alternatives that ultimately improve power are reviewed.