This blog post is motivated by articles that I read this month in which matching was used inappropriately in clinical studies.
First, some background. The primary purpose of randomization in clinical trials is to let the stochastic process of group assignment remove, on average, the effects of confounders.
In observational studies, group assignments (e.g., whether a person is a case or a control) are not under the control of the researcher. To remove the effects of confounders (at least those that are known), one could adjust for them in the analysis stage. Alternatively, the groups can be actively matched during the design stage.
The typical matching one thinks of is the 1-to-n case-control study, where for each case, n matched controls are selected. This is termed individual matching. Another often-used method is frequency matching, in which the distribution of the confounding variable among the controls is made to match its distribution among the cases.
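To make frequency matching concrete, here is a minimal sketch in Python. The function name `frequency_match`, the `age_group` field, and the toy records are all hypothetical and invented for illustration; a real study would draw controls from an actual source population and would usually match on more than one variable.

```python
import random
from collections import Counter

def frequency_match(cases, pool, key, ratio=1, seed=0):
    """Sample controls so that each confounder stratum holds `ratio` controls per case.

    `key` maps a record to its stratum (here, an age group). This is an
    illustrative helper, not a function from any library.
    """
    rng = random.Random(seed)
    case_counts = Counter(key(c) for c in cases)

    # Group the candidate controls by stratum.
    by_stratum = {}
    for person in pool:
        by_stratum.setdefault(key(person), []).append(person)

    controls = []
    for stratum, n_cases in case_counts.items():
        candidates = by_stratum.get(stratum, [])
        needed = ratio * n_cases
        if len(candidates) < needed:
            raise ValueError(f"not enough candidate controls in stratum {stratum!r}")
        # Sample without replacement within the stratum.
        controls.extend(rng.sample(candidates, needed))
    return controls

# Toy data, made up for this example: 2 younger cases and 4 older cases.
cases = [{"id": f"case{i}", "age_group": g}
         for i, g in enumerate(["<50"] * 2 + ["50+"] * 4)]
pool = [{"id": f"ctrl{i}", "age_group": g}
        for i, g in enumerate(["<50"] * 10 + ["50+"] * 10)]

controls = frequency_match(cases, pool, key=lambda p: p["age_group"])
print(Counter(c["age_group"] for c in controls))
```

Only the overall age distribution of the controls is forced to agree with that of the cases; no control is tied to a particular case, which is what distinguishes this from individual matching.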
However, matching in the design stage must be accompanied by statistical adjustment in the analysis stage.
We note that for frequency matching, at least a simple stratified analysis should be adopted. To take an extreme example, consider Simpson's paradox. This is the phenomenon in which an effect seen in every subgroup is reversed when all subgroups are pooled together.
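A small numerical sketch may help show why the stratified analysis matters. The counts below are hypothetical, chosen only to produce the reversal, and the Mantel-Haenszel pooled odds ratio stands in here as one simple form of stratified analysis; it is my choice for illustration, not necessarily the method any particular study should use.

```python
# Hypothetical counts, invented purely for illustration.
# Each stratum: (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "young": (16, 184, 2, 48),
    "old": (25, 25, 80, 120),
}

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table: (a*d) / (b*c)."""
    return (a * d) / (b * c)

def crude_odds_ratio(tables):
    """Ignore the strata: add the tables cell by cell, then take the odds ratio."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return odds_ratio(a, b, c, d)

def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel estimate: sum(a*d/n) / sum(b*c/n) over the strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

tables = list(strata.values())
stratum_ors = {name: odds_ratio(*t) for name, t in strata.items()}
crude_or = crude_odds_ratio(tables)
mh_or = mantel_haenszel_odds_ratio(tables)

for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
print(f"crude (pooled, unadjusted): OR = {crude_or:.2f}")
print(f"Mantel-Haenszel (stratified): OR = {mh_or:.2f}")
```

With these made-up numbers, the exposure looks harmful within each age group (odds ratios above 1), yet the crude table suggests it is protective (odds ratio of about 0.40), because most of the exposed subjects sit in the low-risk stratum. The stratified estimate, about 1.59, points in the same direction as the individual strata.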