Steve Schoenbaum writes: In his blog this week, Mark Pendergrast challenges someone/anyone to take on explaining the differences between case-control studies and cohort studies. As an EIS officer, back in late May/early June 1968, I did a case-control study as part of the investigation of a common-source outbreak of hepatitis in Ogemaw County, Michigan, so I will try to pick up the challenge. I believe it was only the second time case-control methods were used in a CDC epidemic investigation. In using this method I learned about the power of comparison: not just that numerators need denominators, but that cases need controls. I also learned about the factors influencing the use of different types of epidemiologic investigation, and I elaborate on those below.
The first use of case-control methods in an EIS investigation I am aware of is described on p. 45 of "Inside the Outbreaks." There was an outbreak of hepatitis in December 1961 among officers at a naval air station in Florida that Don Millar and Paul Joseph investigated. Their investigation involved asking the 22 cases and 116 "non-cases", or "controls", about their preferences for eating the foods which had been on the officers' mess menus. Potato salad was an item that all the ill persons said they ate at some time, whereas 30 percent of the controls said they never ate potato salad. This became the "likely vehicle", and Drs. Millar and Joseph had a theory that the potato salad had been contaminated by one of the food handlers "urinating into the dressing used on the potato salad."
The Ogemaw County epidemic has been well described first in a story that Berton Roueche did for the New Yorker, which I found particularly exciting since I had originally been attracted to epidemic investigations and the EIS by reading some of Roueche's collected stories in "Eleven Blue Men". The description of Ogemaw remains in print in another collection of Roueche stories, "Medical Detectives", and is entitled "The West Branch Study" - West Branch being the largest town in Ogemaw County. It is described more formally/scientifically in the following reference: Schoenbaum SC, Baker O, Jezek Z. "Common-source epidemic of hepatitis due to glazed and iced pastries." American Journal of Epidemiology. 1976; 104:74-79.
Back in 1968, Ogemaw County had a population of about 12,000. A worker in the local bakery and another local resident who happened also to be a food handler at an ice-cream stand had had hepatitis in early April. Then over one month from late April to late May, there were 61 more cases of hepatitis. Since hepatitis A, which in those days was called infectious hepatitis, has an incubation period of about 15-45 days, the whole outbreak was consistent with a common source of exposure back in the early part of April - i.e., sometime in the first two weeks.
Now, cohort studies are based on the idea that the investigator wants to find out if exposure to something causes disease. So, to do a cohort study you start with a group of people who are well but have an exposure of interest - for example, smoking. You find a similar group of people who are well and do not have that exposure - i.e., are non-smokers. Then you follow both groups over time and determine how many in each group develop various illnesses. That allows you to calculate the "relative risk" of developing a specific illness, e.g., lung cancer, heart disease, etc., if you are exposed (a smoker) vs. non-exposed (a non-smoker). An advantage of cohort studies is that they allow you to study the occurrence of multiple illnesses in relation to a single type of exposure. Since you start with people who are well and the illnesses develop over time, cohort studies take a lot of time. They are "prospective" studies. The cohort design is also the model for clinical trials, i.e., for drugs or other medical interventions. One starts with a group of people and gives some of them a drug, or "exposure", and compares what happens to the treated group over time vs. the non-treated or non-exposed group.
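To make the "relative risk" calculation concrete, here is a minimal sketch in Python. The function name and the counts are assumptions invented purely for illustration; they are not data from any study mentioned here.

```python
def relative_risk(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Risk of illness among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_ill / exposed_total
    risk_unexposed = unexposed_ill / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort (invented numbers): 1,000 smokers and 1,000 non-smokers
# followed over time; 30 smokers and 3 non-smokers develop the illness.
rr = relative_risk(exposed_ill=30, exposed_total=1000,
                   unexposed_ill=3, unexposed_total=1000)
print(f"relative risk = {rr:.1f}")
```

Note that the calculation needs the size of each whole group (the denominators), which is exactly what a cohort study supplies and what an investigator arriving after an outbreak often does not have.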
In contrast, case-control studies are "retrospective". They work backwards in time. One starts with a group of people who have an illness, the cases, and a group who do not have it, the controls. Then one determines if the members of each group might have or have had a variety of exposures. By comparing the frequency of each exposure in each group one can calculate a statistic known as the "relative odds" of getting the illness in the face of each specific exposure.
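The "relative odds" (more commonly called the odds ratio today) can be sketched the same way. Again, the function name and the counts below are hypothetical and serve only to show the arithmetic.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


# Hypothetical case-control study (invented numbers):
# 50 cases, of whom 40 report the exposure; 50 controls, of whom 20 do.
relative_odds = odds_ratio(cases_exposed=40, cases_unexposed=10,
                           controls_exposed=20, controls_unexposed=30)
print(f"relative odds = {relative_odds:.1f}")
```

Unlike the relative risk, this statistic does not require knowing how many people in the whole population were exposed; it uses only what the cases and the controls report.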
When I started the EIS course back in July 1967, the first case study was a church supper or pot-luck supper in Oswego County, New York. The epidemic was due to staphylococcal enterotoxin, which has a very short incubation period between the time its victims eat it and the time they become ill. In a church supper in a small town it is possible to determine who was present, who became ill or stayed healthy, and what each person ate. That actually makes it possible to approach the analysis in either of two ways. Taking a cohort approach, one can treat each food on the buffet table as an exposure and calculate the relative risk of illness for those who ate the food compared with those who didn't. Taking a case-control approach, one can start with the people who got ill and those who didn't, calculate how frequently the persons in each group ate each food, and then calculate the relative odds of illness for eaters of each food. The cohort approach to analysis is preferred in this situation; and a study in which all the data about the exposures and illness can be obtained at one time is called a "cross-sectional" study.
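A small sketch may help show how the same supper data can be read both ways. The dish names and all counts below are invented for illustration and are not the Oswego figures; the structure (one 2x2 table per food) is the assumption being illustrated.

```python
# Hypothetical supper with 75 attendees, 45 of whom became ill.
# For each dish: (ate and ill, ate and well, did not eat and ill, did not eat and well)
supper = {
    "dish_a": (40, 10, 5, 20),
    "dish_b": (25, 15, 20, 15),
    "dish_c": (20, 14, 25, 16),
}

results = {}
for dish, (ate_ill, ate_well, no_ill, no_well) in supper.items():
    # Cohort reading: compare attack rates in eaters vs. non-eaters.
    attack_eaters = ate_ill / (ate_ill + ate_well)
    attack_non_eaters = no_ill / (no_ill + no_well)
    rel_risk = attack_eaters / attack_non_eaters
    # Case-control reading: compare odds of having eaten, ill vs. well.
    rel_odds = (ate_ill / no_ill) / (ate_well / no_well)
    results[dish] = (rel_risk, rel_odds)
    print(f"{dish}: relative risk {rel_risk:.2f}, relative odds {rel_odds:.2f}")

# The dish with the highest relative risk is the leading suspect.
suspect = max(results, key=lambda d: results[d][0])
print("leading suspect:", suspect)
```

In this made-up table both readings point to the same dish, but the relative odds come out larger than the relative risk because illness was common among those attending.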
OK - so back to Ogemaw County: Hepatitis has a relatively long incubation period, especially compared to staph enterotoxin. By the time the epidemiologist learns about the outbreak it really isn't possible to know the cohorts of persons who were exposed to each possible source of such outbreaks - foods of various types from various sources, water, etc. But it is easy to know who is sick, the cases. Then one has to pick a suitable group of controls and try to find out from each group what they might have been exposed to. In short, evaluation of such epidemics fits quite naturally into the case-control methodology. There is an issue, however, and it involves selection of an "appropriate" group of controls. This isn't trivial, and sometimes one even picks a couple of different kinds of controls. Also, one can just pick a group of suitable controls for the group of cases as a whole, or one can try to match a specific control to each case or patient. The theory is that one wants the controls to be as similar as possible to the cases except for the exposures that might have been associated with the development of the illness. So if the cases were all young women, one wouldn't pick a group of controls that were all old men. On the other hand, if one matches each ill woman who is a specific age with a healthy woman of exactly the same age, then one can no longer study whether gender or age is a factor in developing the illness. In Ogemaw County we took a chance in picking the controls. Forty-one of the 61 cases were age 10-19. To have a control group that came close to the case group in demographics, we chose all the household members of cases who were age 10-19 and ended up with 56 of them. The chance that we took was that not everyone who gets infected with the hepatitis A virus becomes ill. So, it was possible that by picking household members of cases we were picking some people who actually had had exposure to the hepatitis virus.
In considering them controls we were increasing the likelihood that a high rate of exposure to the culprit would occur in the control group and we wouldn't be able to identify the culprit since there wouldn't be enough difference in reported exposure between cases and controls. Nonetheless, it was convenient to be able to interview the cases and controls about exposures on just one contact with a household. So we took the chance. And, it worked out: 92 percent of the patients 10-19 years old had eaten glazed or iced pastries from the local bakery during the first two weeks of April 1968, the likely time of exposure given the body of the epidemic occurring in late April/May; whereas, only 47 percent of the controls reported a similar exposure. The only other frequent exposure reported by cases was West Branch's municipal water, but the controls reported consuming water from the municipal supply even more frequently than the cases. So, it was just a frequent exposure for anyone living in the Ogemaw County area.
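As a rough illustration of how large that difference is, the relative odds can be approximated directly from the two reported percentages. This is a back-of-the-envelope sketch, not the analysis in the published paper: it uses the rounded percentages quoted above, the function name is hypothetical, and it ignores the fact that the controls came from the cases' own households.

```python
def odds_ratio_from_proportions(p_cases, p_controls):
    """Approximate relative odds from the share of cases and controls exposed."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls


# 92 percent of cases aged 10-19 vs. 47 percent of controls reported eating
# glazed or iced pastries from the bakery (rounded figures from the text).
pastry_or = odds_ratio_from_proportions(0.92, 0.47)
print(f"approximate relative odds for pastries = {pastry_or:.0f}")
```

On these rounded figures the relative odds come out at roughly 13, which is why the pastries stood out even though the choice of household controls could have narrowed the gap.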
There were many other interesting features to the epidemic in Ogemaw County; and since they are well-described in the journal article and story in Roueche's "Medical Detectives" I won't go into them here. What I would like to mention is that, like most epidemic investigations, Ogemaw was a team effort. I was sent out from Atlanta where I had been working on influenza. In line with some of the stories in "Inside the Outbreaks", I needed to learn about the epidemiology of hepatitis A on the flights up to Michigan. I got to work closely with the county health officer, Ophelia Baker; and two other EIS officers were sent up after a few days to help in the investigation - Jim Gardner, the EIS officer in my class who had been assigned to the state of Michigan, and Eugene Page, the EIS officer in Florida. Then one day I got a call from Atlanta. There was a visitor at CDC, a seasoned infectious disease epidemiologist from Czechoslovakia, Dr. Zdenek Jezek. He had had several times as much experience as the three of us EIS officers combined! In fact, at the time he came to visit CDC, he was working for WHO and had just come back from an assignment of a couple of years in Mongolia. Without the team and Zdenek Jezek's experience, I doubt we would have been able to pin down the exposure quite so neatly to the worker in the bakery who had become ill in early April. When we observed the bakery operations directly, we saw that he would dip pastries with his bare hands into a pan of glaze, a thick sugar-water mix, and put icing onto other pastries with his hands. Dr. Baker, in her role as the health officer, instituted some food-handling rules for the local businesses that prepared foods on site, such as the bakery, and proceeded to inspect them over time. Michigan, which used to manufacture immune serum globulin, had a large supply, and Dr. Baker ran clinics to give globulin to anyone who wanted it, which turned out to be the majority of residents of the county.
There was no newly reported case of hepatitis in Ogemaw County for over a year after the epidemic.
Thanks to Steve Schoenbaum for this clear, cogent explanation of cohort and case-control studies, and especially for his recounting of this classic outbreak investigation. I'm particularly glad to have it here, since I couldn't fit it into my book, though I certainly wanted to do so.
One thing Steve didn't comment on is that Alexander Langmuir, the visionary founder of the Epidemic Intelligence Service, had certain foibles as well, including a profound distaste for case-control studies. He thought case-control studies were an inferior, backward-looking methodology. But in cases such as toxic shock syndrome, where the "cohort" would have been all the women in the United States who were experiencing their monthly menstrual cycle, case-control was the only possible option.
It is true, however, that the case-control methodology is fraught with possible error. Epidemiology is a science of probability, not proof, and the way you choose controls, as well as the specific questions you ask, can slant the process. Here's a salient excerpt from INSIDE THE OUTBREAKS on the toxic shock investigation that illustrates this point:
The four members of the TSS Task Force did their own study, asking 52 patients in 20 states to name an age-matched friend as a control. Within a week, they had found and questioned all of them on the telephone, asking intimate questions about marital status, sex frequency, intercourse during menstruation, use of tampons or pads, brands used, menstrual patterns, what medications they had taken, and more. They also contacted the major U.S. tampon manufacturers, asking for materials used, manufacturing practices, and marketing history. They learned that new, more absorbent brands of tampons had replaced the standard rayon or cotton with polyacrylate fibers, polyester foam, and various forms of cellulose.
The case-control study failed to implicate any medication or activity other than tampon use. All of the TSS cases used tampons, compared to 44 out of the 52 controls. Tampons were thus implicated, but not by much.
The second MMWR article appeared on June 27, 1980. "No particular brand of tampon is associated with unusually high risk," Kathy Shands wrote, although she knew that Procter & Gamble's new Rely tampon was over-represented, though not with statistical significance. The article resulted in a media frenzy, and the young officers began to consult directly with CDC Director Bill Foege and the Surgeon General.
Over the Labor Day weekend, the EIS officers conducted a second case-control study, using 50 new TSS cases and requesting three friends as controls for each case. To make sure that brand information was accurate, they asked patients and controls to read the labels from their tampon packages to them over the phone.
In this study, 71 percent of the cases used only one tampon brand, Rely, compared to 26 percent of the controls. Procter & Gamble had rolled out Rely nationally throughout 1978, claiming that it "even absorbs the worry." When the EIS officers presented their pre-publication finds, a P&G executive asked, "You realize what this means to P&G? What if you're wrong?" Bruce Dan shot back, "What if you're wrong? What if it were your daughter?"
The results were published in the MMWR on September 19, 1980. Under pressure from the FDA, on September 22 Procter & Gamble announced that it was "voluntarily" withdrawing Rely from the market. --Mark Pendergrast