Confounding is a concept often mentioned in clinical research: the idea that a third variable can distort or confuse (or confound...) the relationship between two other variables.
When confounding is present, it looks like exposure A is associated with increased risk of disease B, but really a third variable X is causing the increased risk of disease B and just happens to also be associated with exposure A.
Clear as mud? Here's a real-world example. If you look at how people recover after hip fracture and compare how women and men do, it may seem that women generally fare worse after they break a hip.
However, there's a confounder in this relationship -- age!
If you think a little more about the characteristics of people who break a hip, you realize that young men have hip fractures (due to high-energy trauma from events like motor vehicle crashes) and older women have hip fractures (reduced bone density, potentially with mobility, balance, or cognition problems leading to a fall). Younger women don't tend to have as many hip fractures as their young male counterparts (due to lifestyle issues etc.), and older men don't tend to have the same incidence of hip fracture as their female counterparts (they don't always have the same severity of bone density changes as older women, and men tend to die sooner).
The female group is then naturally weighted toward older people (who generally heal more slowly and may have other comorbid conditions going on) and the male group is weighted toward young, otherwise healthy people.
So, when you add age to the statistical model and correct for the different age distributions, the difference in outcome by gender goes away -- the women tend to do more poorly because they're older, not because of their gender.
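To make the arithmetic concrete, here is a minimal sketch in Python. The counts are invented purely for illustration (they are not real hip fracture data), and instead of a regression model it uses a simpler stand-in: stratifying by age group and pooling with a Mantel-Haenszel risk ratio. The function names and the two age groups are assumptions of this sketch. Within each age group, women and men are given the same risk of a poor outcome; the only difference is that most of the women are older.

```python
# Hypothetical counts, invented for illustration only (not real data).
# Each entry: (age_group, sex) -> (number of patients, number with a poor outcome)
counts = {
    ("younger", "men"): (800, 80),     # 10% poor outcome
    ("younger", "women"): (200, 20),   # 10% poor outcome
    ("older", "men"): (200, 80),       # 40% poor outcome
    ("older", "women"): (800, 320),    # 40% poor outcome
}


def crude_risk_ratio(counts):
    """Risk of a poor outcome in women divided by the risk in men, ignoring age."""
    totals = {"men": [0, 0], "women": [0, 0]}
    for (_, sex), (n, poor) in counts.items():
        totals[sex][0] += n
        totals[sex][1] += poor
    risk_women = totals["women"][1] / totals["women"][0]
    risk_men = totals["men"][1] / totals["men"][0]
    return risk_women / risk_men


def mantel_haenszel_risk_ratio(counts):
    """Age-adjusted risk ratio (women vs. men), pooled across the age strata."""
    numerator = 0.0
    denominator = 0.0
    for age in sorted({age for age, _ in counts}):
        n_women, poor_women = counts[(age, "women")]
        n_men, poor_men = counts[(age, "men")]
        n_total = n_women + n_men
        numerator += poor_women * n_men / n_total
        denominator += poor_men * n_women / n_total
    return numerator / denominator


crude = crude_risk_ratio(counts)
adjusted = mantel_haenszel_risk_ratio(counts)
print(f"Crude risk ratio (women vs. men): {crude:.3f}")
print(f"Age-adjusted risk ratio:          {adjusted:.3f}")
```

With these made-up numbers, the crude comparison suggests women are roughly twice as likely to do poorly, while the age-adjusted ratio is 1: the apparent gender effect comes entirely from the different age mix of the two groups.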
This is a classic example of confounding and of one way to correct for it. In an observational study you can't randomize, but you can try to measure the "things" that might be affecting your outcome -- age, gender, socioeconomic status, family support, severity of injury, type of operative repair, rehabilitation status, etc. -- so that you can add them to your statistical model.
The Pitt epidemiology supercourse has a great discussion of confounding and ways to correct for it (PowerPoint file), and the Social Sciences Statistics blog has a very interesting post today that comments on issues of confounding in a recent study about running habits in older individuals (this is the study the post talks about).