This document is a summary of rules, kept for self-reference, from the book Statistical Rules of Thumb by Gerald van Belle. Most of the notes have been taken verbatim. Please refer to the book for a detailed description, without which the notes may be meaningless to the uninitiated reader.
Observation is selection
Replicate to characterize random variation
Variability occurs at multiple levels
Invalid selection is the primary threat to valid inference
Compared with experimental studies, observational studies provide less robust information
Make a sharp distinction between observational and experimental studies
Always look for a physical model underlying the data being analyzed. Assume that a statistical model, such as a linear model, is only a good first start
Keep models as simple as possible but no more simple
Be sure to understand the components and purpose of an omnibus quantity
Do not multiply probabilities more than necessary. Probabilities are bounded by 1; multiplication of enough probabilities will always lead to a small number
The use of one-sided p-values is discouraged. Ordinarily, use two-sided p-values
When designing experiments or observational studies, focus on p-values to calculate sample size; when reporting results, focus on confidence intervals
Use at least 12 observations in constructing a confidence interval
For samples of size \(\geq\) 20, a point estimate \(\pm\) 2 standard errors has approximately 95% coverage for a wide variety of distributions
Always know what the unit of a variable is
Do not let scale of measurement rigidly determine method of analysis
The practical applied statistician uses methods by all three schools (Neyman-Pearson, Likelihood, Bayesian) as appropriate
The basic formula (Lehr’s equation) for sample size is \[ n = 16/\Delta^2\] where \[ \Delta = \frac{\mu_0 - \mu_1}{\sigma} = \frac{\delta}{\sigma}\] is the standardized difference. In the single sample case (where a single sample is compared to a known population value), the numerator is 8 instead of 16
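As a rough illustration of Lehr's equation, the sketch below computes the sample size from two means and a common standard deviation. The function name, the rounding up to a whole number, and the example values are my own assumptions; the constants 16 and 8 correspond to the usual two-sided \(\alpha\)=0.05 and power=0.8 setting.

```python
import math

def lehr_sample_size(mu0, mu1, sigma, one_sample=False):
    """Approximate sample size per group from n = 16 / delta**2 (8 for one sample)."""
    delta = (mu0 - mu1) / sigma           # standardized difference
    numerator = 8 if one_sample else 16   # 8 applies to the single-sample case
    return math.ceil(numerator / delta ** 2)  # rounding up is an assumption

# Hypothetical values: a 10-unit difference with a standard deviation of 20.
print(lehr_sample_size(100, 110, 20))                   # 64
print(lehr_sample_size(100, 110, 20, one_sample=True))  # 32
```

Halving the standardized difference quadruples the required sample size, which is the main thing to remember from this rule.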
The sample size using the coefficient of variation (CV) is given by \[n = \frac{16(CV)^2}{(\ln(\mu_0)-\ln(\mu_1))^2}\]
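The coefficient of variation version can be sketched the same way. The function name and the example numbers (means of 100 and 120, CV of 30%) are hypothetical; the CV is entered as a proportion, not a percentage.

```python
import math

def cv_sample_size(mu0, mu1, cv):
    """Sample size per group from n = 16 * CV**2 / (ln(mu0) - ln(mu1))**2."""
    log_ratio = math.log(mu0) - math.log(mu1)   # difference on the log scale
    return math.ceil(16 * cv ** 2 / log_ratio ** 2)

# Hypothetical example: detect a 20% increase in the mean with CV = 0.3.
print(cv_sample_size(100, 120, 0.3))
```

Only the ratio of the two means matters here, so the result is unchanged if both means are rescaled by the same factor.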
Finite population size correction can be ignored in initial discussions of survey sample size questions
The range of the observations is related to the standard deviation \(s\) as follows: \[ \frac{\text{range}}{\sqrt{2(n-1)}} \leq s \leq \frac{n}{n-1}\frac{\text{range}}{2}\]
Do not formulate objectives for a study solely in terms of effect size
Confidence intervals associated with statistics for two variables can overlap as much as 29% and the statistics can still be significantly different
If \(\theta_1\) and \(\theta_2\) are the means of two Poisson-distributed populations, then the required number of observations per sample is \[ n = \frac{4}{(\sqrt{\theta_1}-\sqrt{\theta_2})^2}\]
The sample size calculation for a Poisson distribution with background rate \(\theta^*\) is given by \[n = \frac{4}{(\sqrt{\theta^* + \theta_1}-\sqrt{\theta^* + \theta_2})^2}\]
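Both Poisson rules fit in one small function if the background rate defaults to zero. This is only a sketch: the function name, the `background` parameter name, and the example rates are assumptions of mine.

```python
import math

def poisson_sample_size(theta1, theta2, background=0.0):
    """Observations per sample from n = 4 / (sqrt(a) - sqrt(b))**2,
    where a and b are the two means plus an optional common background rate."""
    diff = math.sqrt(background + theta1) - math.sqrt(background + theta2)
    return math.ceil(4 / diff ** 2)

# Hypothetical means of 30 and 36 events, without and with a background of 20.
print(poisson_sample_size(30, 36))
print(poisson_sample_size(30, 36, background=20))
```

A common background rate always increases the required sample size, because it adds noise to both groups without adding to the difference between them.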
The sample size calculation for a binomial distribution is given by \[ n = \frac{16\bar{\pi}(1-\bar{\pi})}{(\pi_0 - \pi_1)^2} \] where \[\bar{\pi}=\frac{\pi_0 + \pi_1}{2}\]
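A minimal sketch of the binomial rule, with a hypothetical function name and made-up proportions:

```python
import math

def binomial_sample_size(pi0, pi1):
    """Sample size per group from n = 16 * pbar * (1 - pbar) / (pi0 - pi1)**2."""
    pi_bar = (pi0 + pi1) / 2   # average of the two proportions
    return math.ceil(16 * pi_bar * (1 - pi_bar) / (pi0 - pi1) ** 2)

# Hypothetical example: proportions of 0.25 and 0.50.
print(binomial_sample_size(0.25, 0.5))  # 60
```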
For unequal sample sizes where one group contains \(n_0\) samples and the other group contains \(kn_0\) samples, choose k such that \[k = \frac{n}{2n_0-n}\] to get the same precision as having \(n\) samples in each of two equal groups (this requires \(n_0 > n/2\))
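The allocation ratio follows from equating the variance factors of the two designs: setting \(1/n_0 + 1/(kn_0)\) equal to \(2/n\), where \(n\) is the per-group size of the equal design, and solving gives \(k = n/(2n_0 - n)\). The sketch below uses hypothetical names and numbers and checks that identity numerically.

```python
def allocation_ratio(n_equal, n0):
    """Return k so that groups of n0 and k*n0 match two groups of n_equal each."""
    if 2 * n0 <= n_equal:
        # No finite k can compensate once n0 is at most half of n_equal.
        raise ValueError("n0 must exceed n_equal / 2")
    return n_equal / (2 * n0 - n_equal)

# Hypothetical example: only 6 subjects are available in the first group,
# while an equal design would have used 10 per group.
k = allocation_ratio(n_equal=10, n0=6)
print(k, k * 6)  # 5.0 30.0

# The variance factors of the two designs agree.
assert abs((1 / 6 + 1 / (k * 6)) - 2 / 10) < 1e-12
```

In this example, losing 4 subjects in one group costs 20 extra subjects in the other, which shows how quickly unequal allocation becomes expensive.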
When there are different costs associated with each sample, choose a sample size that is inversely proportional to the square root of the cost of the observations
Given no observed events in \(n\) trials, the 95% upper bound on the rate of occurrence is \(3/n\)
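The \(3/n\) rule can be compared with the exact bound, which solves \((1-p)^n = 0.05\) for \(p\). The function name below is hypothetical and the values of \(n\) are arbitrary; the sketch only shows how close the two bounds are.

```python
def exact_upper_bound(n, confidence=0.95):
    """Exact upper bound on the event rate after n trials with zero events."""
    # Solve (1 - p)**n = 1 - confidence for p.
    return 1 - (1 - confidence) ** (1 / n)

for n in (10, 50, 200):
    print(n, round(exact_upper_bound(n), 4), round(3 / n, 4))
```

The approximation is slightly conservative (a little larger than the exact bound) and the two get closer as \(n\) grows.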
Sample size calculations should be based on the statistics used in the analysis of the data
The model for an observational study is the sample survey
Large sample sizes do not guarantee validity
Good observational studies are designed
To establish cause and effect requires longitudinal data
Make theories elaborate. Consider many alternative explanations for the observed effect
The Hill guidelines are useful in determining causation
Sensitivity analyses assess model uncertainty and missing data
Before choosing a measure of covariation, determine the source of the data, the nature of variables, and the symmetry status of the measure
Do not summarize regression sampling schemes with correlation
Do not correlate rates or ratios indiscriminately
To determine the appropriate sample size to estimate a population correlation \(\rho\), use the following \(\Delta\) in Rule 1 of sample size
\[\Delta=\frac{1}{2}\ln\frac{1+\rho}{1-\rho}\]
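Plugging this \(\Delta\) (Fisher's z-transformation of \(\rho\)) into \(n = 16/\Delta^2\) gives a quick calculation. The function name and the example correlation are assumptions for illustration.

```python
import math

def correlation_sample_size(rho):
    """Sample size from n = 16 / delta**2 with delta = 0.5 * ln((1 + rho) / (1 - rho))."""
    delta = 0.5 * math.log((1 + rho) / (1 - rho))  # identical to math.atanh(rho)
    return math.ceil(16 / delta ** 2)

# Hypothetical example: a population correlation of 0.5.
print(correlation_sample_size(0.5))
```

Small correlations need very large samples, since \(\Delta\) is close to \(\rho\) itself when \(\rho\) is near zero.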
Do not pair unless the correlation between the pairs is \(>\) 0.5
Go beyond correlation in drawing conclusions, particularly in instances where location and scale are relevant
Assess agreement in terms of accuracy, scale differential, and precision
Assess test reliability by means of agreement
The range of the predictor variable determines the precision of the regression
In measuring change, width (i.e. spacing of the observations) is more important than the number of observations
Begin with the lognormal distribution in environmental studies
Differences are more symmetrical
Know the sample space for statements of risk
Beware of pseudo-replication (Hurlbert 1984)
Always consider alternatives to simple random sampling for a potential increase in efficiency, lower costs, and validity
In assessing the importance of an effect, consider the size of the population to which it applies
Models estimating small effects in large populations are particularly sensitive to assumptions. Extensive sensitivity studies are needed in such cases to validate the model
In assessing variation, distinguish between variability and uncertainty
In using a database, first look at the metadata, then look at the data
Always assess the statistical basis for an environmental standard
How a pollutant is measured plays a key role in identification, regulation, enforcement and remediation
Parametric analyses make maximum use of the data
Distinguish between confidence, prediction, and tolerance intervals (Vardeman 1992)
Risk assessment is divided into 5 areas - hazard identification, dose-response evaluation, exposure assessment, risk characterization, and risk management. Statistics plays an important role in the first 4. The last involves policy based on the first 4
Exposure and disease are usually widely separated both in space and time. Retrospective assessment of exposure is very difficult - particularly if the causes and mechanisms are poorly understood
Calibration involves inverse regression, and the error associated with the regression must be assessed
Start with the Poisson distribution to model disease incidence or prevalence
For a rare disease, the odds ratio approximates the relative risk
To detect a relative risk \(R\) in a rare disease cohort study, the number of exposed subjects (or unexposed subjects) \(n\) for \(\alpha\)=0.05 and power = 0.8 is given by \[ n = \frac{4}{\pi_0(\sqrt{R}-1)^2} \] where \(\pi_0\) is the probability of the disease in the unexposed population and \(R\) is the relative risk, assumed to be \(>1\)
The estimate of sample size per group in a cohort study, based on the logarithm of the relative risk \(R\), is given by \[ n= \frac{8(R+1)/R}{\pi_0(\ln R)^2}\] for \(\alpha\)=0.05, power=0.8 and a two-sided alternative
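The two cohort formulas can be compared side by side. The sketch reads the first denominator as \(\pi_0(\sqrt{R}-1)^2\), which is what the Poisson rule gives with means \(\pi_0\) and \(R\pi_0\). Function names and the example inputs are hypothetical.

```python
import math

def cohort_n_sqrt(pi0, R):
    """n per group from n = 4 / (pi0 * (sqrt(R) - 1)**2)."""
    return math.ceil(4 / (pi0 * (math.sqrt(R) - 1) ** 2))

def cohort_n_log(pi0, R):
    """n per group from n = (8 * (R + 1) / R) / (pi0 * ln(R)**2)."""
    return math.ceil(8 * (R + 1) / R / (pi0 * math.log(R) ** 2))

# Hypothetical example: a 1% baseline risk and a relative risk of 2.
print(cohort_n_sqrt(0.01, 2))
print(cohort_n_log(0.01, 2))
```

The two approximations give answers of the same order, and both scale inversely with the baseline risk \(\pi_0\).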
Take no more than 4 to 5 controls per case
In logistic regression situations, about 10 events per variable are necessary in order to get reasonably stable estimates of the regression coefficients
Begin with the exponential distribution to model time to event
Begin with two exponentials for comparing survival times
Be wary of surrogates. Accept substitutes warily
In rare diseases, the prevalence dominates the predictive value of a positive test
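A short Bayes' theorem calculation shows why prevalence dominates. The function name and the test characteristics (99% sensitivity and specificity) are made up for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = prevalence * sensitivity                # diseased and test positive
    false_pos = (1 - prevalence) * (1 - specificity)   # healthy but test positive
    return true_pos / (true_pos + false_pos)

# Same hypothetical test applied at three different prevalences.
for prevalence in (0.001, 0.01, 0.5):
    print(prevalence, round(positive_predictive_value(prevalence, 0.99, 0.99), 3))
```

At a prevalence of 1 in 1000, fewer than 1 in 10 positives are true positives even with this very accurate test, because false positives from the large healthy group outnumber true positives.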
Do not dichotomize unless absolutely necessary
Select an additive or multiplicative model according to the following order: theoretical justification, practical implication, and computer implementation
There are three hierarchies of evidence, each of which depends on the question asked and on the population of interest
The distinction between patient-oriented (POEM) and disease-oriented (DOE) evidence is almost completely the difference between a surrogate endpoint and a clinically relevant endpoint
In comparing two treatment regimens with binary outcome, start with absolute risk reduction
Number needed to treat (NNT) is a very useful clinical statistic but must be handled with care
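Absolute risk reduction (ARR) and NNT are related by NNT = 1/ARR. The sketch below uses hypothetical function names and made-up event risks, and also hints at why NNT needs care: it grows without bound as the ARR approaches zero.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Difference in event risk between control and treated groups."""
    return risk_control - risk_treated

def number_needed_to_treat(risk_control, risk_treated):
    """NNT = 1 / ARR; undefined when the two risks are equal."""
    arr = absolute_risk_reduction(risk_control, risk_treated)
    if arr == 0:
        raise ValueError("NNT is undefined when the risks are equal")
    return 1 / arr

# Hypothetical risks: 25% of controls and 12.5% of treated have the event.
print(absolute_risk_reduction(0.25, 0.125))  # 0.125
print(number_needed_to_treat(0.25, 0.125))   # 8.0
```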
Variability in treatment effect must always be considered over and above the average effect
Evidence for safety is limited
Intent to treat (ITT) is the default strategy for analysis
In EBM, it is more useful to discuss information about the prior than the prior itself
The four key questions for meta-analysis are the same as those in rule 1 of Basics
Randomization puts systematic sources of variability into the error term
Blocking is the key to reducing variability
Factorial design should be used to assess the joint effects of variables
Higher order effects occur rarely. Therefore it is not necessary to design experiments to incorporate higher order effects
Aim for balance in the design of a study
Analysis should follow design
Assess independence, equal variance, and normality in that order
For every analysis, there is an appropriate graphical display
Distinguish between design structure and treatment structure of a study
Plan to do a hierarchical analysis of treatment effects by including all lower order effects associated with a higher order effect
Distinguish between nested and crossed designs. The analysis will be quite different
Plan for missing data
Develop a strategy for dealing with multiple comparisons before starting a study
Know what properties a transformation preserves or does not preserve
Think of bootstrapping instead of the delta method in estimating complex relationships
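As one possible illustration of the bootstrap, the sketch below estimates the standard error of a ratio of means, a statistic for which the delta method would otherwise be the usual analytic route. The data are made up, and the function names, number of resamples, and seed are arbitrary choices of mine.

```python
import random
import statistics

def ratio_of_means(xs, ys):
    return statistics.mean(xs) / statistics.mean(ys)

def bootstrap_se(xs, ys, stat, n_boot=2000, seed=0):
    """Bootstrap standard error of stat(xs, ys), resampling (x, y) pairs."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    pairs = list(zip(xs, ys))
    estimates = []
    for _ in range(n_boot):
        # Draw len(pairs) pairs with replacement and recompute the statistic.
        sample = [rng.choice(pairs) for _ in pairs]
        bx, by = zip(*sample)
        estimates.append(stat(bx, by))
    return statistics.stdev(estimates)

# Made-up paired measurements, for illustration only.
xs = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 12.7, 11.0]
ys = [5.9, 5.1, 6.3, 5.5, 6.8, 5.0, 6.1, 5.7]
print(ratio_of_means(xs, ys), bootstrap_se(xs, ys, ratio_of_means))
```

Swapping in a different `stat` function is all that is needed to handle another statistic, which is the practical appeal over deriving a new delta-method formula each time.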
Agresti, Alan. An Introduction to Categorical Data Analysis. Wiley-Interscience, 2007.
Rosenbaum, Paul R. Observational Studies, 2nd edition. Springer New York, 2002.
Cohen, Jacob. Statistical Power Analysis for the Behavioral Sciences, 2nd edition. Routledge, 1988.
Cameron, Colin and Trivedi, Pravin. Regression Analysis of Count Data, 2nd edition. Cambridge University Press, 2013.
Marcus-Roberts and Roberts. Meaningless Statistics. Journal of Educational Statistics, 1987.
Hill, A. B. The Environment and Disease: Association or Causation? Presidential Address to the Section of Occupational Medicine of the Royal Society of Medicine, 1965.
Vardeman, S. B. What About the Other Intervals? The American Statistician, Vol. 46, No. 3, Aug 1992, pp. 193-197.
Hurlbert, S. H. Pseudoreplication and the Design of Ecological Field Experiments. Ecological Monographs, 1984, pp. 187-211.
Malinas, Gary and Bigelow, John. Simpson’s Paradox. The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), Edward N. Zalta (ed.).
Sandman, Peter. Mass Media and Environmental Risk: 7 Principles. 1997.