OST 512 Epidemiology and Biostatistics

Summer 1999


H.S. Teitelbaum, D.O., Ph.D., M.P.H.

Department of Internal Medicine

B-319 W. Fee Hall

Telephone: 35-53361 (Office) 332-1881 (Home)

Office Hours: Monday 3:00-7:00.

Course Format:

This is a two (2) credit course which consists of:

1. Lectures.

2. Required text readings.

3. Readings from the scientific, medical and other journals.

4. Two examinations.

5. An electronic literature search.

The instructional mode will be primarily lecture with class participation in discussing the various articles as the need arises.

Course Goals:

This is an introductory course. It assumes a working knowledge of arithmetic and elementary algebra and a college reading level. The main emphasis will be on the interpretation of medical literature both from a substantive perspective as well as a statistical perspective. In practice, most computational work of any complex nature is done by computer packages, hence only rudimentary computations will be required. The major emphasis will be placed on the use of the statistics, the appropriateness of the statistic and the understanding of the most commonly appearing statistics used in the professional and popular literature. Published articles will be used to supplement the topics discussed and reference will also be made to portions of the required text.

Required Text:

A Study Guide to Epidemiology and Biostatistics. Morton, Richard F., Hebel, J. Richard, McCarter, Robert J., Fourth Edition, Aspen Publication, 1996.

Evaluation and Grading:

There will be one (1) midterm examination and one (1) final examination. The material will be from lectures and readings. The format of the examinations will be primarily, but not exclusively, multiple choice. There will also be computational problems and short essay questions. The midterm will be weighted 40% and the final weighted 60% of the final grade. An average of 70% will be the minimum passing level for the course. An electronic search must also be submitted to fulfill the course requirements.

Remediation:

For students who do not reach the minimum passing level, remediation can be accomplished by:

1. Taking and passing another examination covering the entire course; or

2. Re-enrolling and subsequently passing OST 512 the next time it is offered; or

3. Enrolling in an independent study course which meets equivalent objectives of OST 512. The determination of equivalency is at the discretion of the instructor of record.

N.B. The date of any remediation examination will be at the discretion of the instructor of record.

Table of Contents

Expectations

Basic Terminology

Sensitivity and Specificity

Levels of Measurement

Prospective Studies

Case-Control Studies

Establishing a Statistical Association or Statistical Relationship

Sample Size

Readings

Supplementary Explanations

Homework Set 1

Homework Set 2

Homework Set 3

Homework Set 4

Suggested Answers to Homework Set 1

Suggested Answers to Homework Set 2

Suggested Answers to Homework Set 3

Suggested Answers to Homework Set 4

Index


Expectations

1. Become facile with the terms risk, incidence (and its cognates), case fatality, survival, and prevalence as well as independent and dependent variables. Discern the correct usage of these terms when presented with articles using these terms. Be able to calculate these epidemiological measures.

2. Be able to define common indices of medical and health care.

3. Be able to compute common measures of central tendency.

4. Identify indices of dispersion and apply them in reading articles.

5. Be able to identify the correct levels of measurement associated with common medical variables.

6. Be able to identify a case-control study when presented with an article or an abstract of an article.

7. Become familiar with the advantages, disadvantages and biases associated with case-control studies.

8. Be able to critique a case-control study.

9. Be knowledgeable about the strict criteria of causation as used in medicine.

10. Be able to compute an odds ratio when presented with a 2x2 table, a description of a medical situation and/or an article.

11. Be able to identify a prospective study when presented with an article or an abstract of an article.

12. Become familiar with the advantages, disadvantages and biases associated with prospective studies.

13. Be able to critique a prospective study.

14. Be able to compute a relative risk for a prospective study.

15. Be able to interpret tables of association from retrospective and prospective studies.

16. Be able to compute and interpret sensitivity, specificity and predictive values from tables, articles or contrived problems.

17. Be able to discern if the sample size a study uses is sufficient to support the conclusions the author proffers.

18. Be able to deduce the null and alternative hypotheses from an article or paper case.

19. Be able to define statistical significance and apply it to articles.

20. Be knowledgeable about the framework for statistical testing.

21. Be able to interpret confidence intervals.

22. Be knowledgeable about the correct use of common statistical tests.

23. Be able to identify the common disease states in the United States and Michigan.

24. Be able to solve elementary probability problems.


What it is!

Basic Terminology

Prevalence (Point)

Generally speaking

1. Existing Cases

2. In a defined population

3. A time period needs to be specified (it is often implied, such as one year or a given day).

4. Formula for computation:

$$P = \frac{C}{N}$$

where C = Existing cases in the time period and N = Number of persons in the population.


Incidence

Generally speaking

1. NEW cases

2. In a defined population

Incidence Rate

1. New Cases

2. In a defined population

3. A time period must be specified.

4. Denominator is PERSON TIME.

5. Formula for Computation:

$$\text{Incidence Rate} = \frac{A}{PT}$$

where A = Number of NEW cases and PT = total number of person-time units (weeks, months, years, etc.) that are observed.

Cumulative Incidence or Risk

1. A probability

2. New Cases

3. In a defined population

4. Formula for computation:

$$\text{Risk} = \frac{A}{N}$$

where A = the number of NEW cases of disease and N = the number of people in the population under study. Unless specifically stated, the time period is usually implied as 1 year. BE CAREFUL to note the time period.


1. How many female medical students are currently pregnant?

2. How many medical students become pregnant during the first year of medical school?

There is a relationship between PREVALENCE and RISK (Cumulative incidence)

Sometimes this will be written as: P = I x D.

Think what this equation means in illness:




Can Prevalence decrease yet Incidence increase?

Characteristics of risk, prevalence and incidence rate

| | Risk | Prevalence | Incidence Rate |
|---|---|---|---|
| What is measured | Probability of disease | Percent of population with disease | Rapidity of disease occurrence |
| Units | None | None | Cases/person-time |
| Time of disease diagnosis | Newly diagnosed | Existing | Newly diagnosed |
| Synonyms | Cumulative Incidence | | Incidence Density |

Example 1:

Consider the following example:

A 60 year old white male refinery worker recently developed shortness of breath and nosebleeds. On physical examination, he was pale and his pulse was elevated at 110 beats per minute. His hematocrit was 20% (low), indicating anemia, his white blood cell count was 20,000/µL (elevated), his platelet count was 15,000/µL (low), and examination of his peripheral blood smear revealed atypical myeloblasts. He was hospitalized for suspected acute myelocytic leukemia. The diagnosis was confirmed and chemotherapy was started. About 3 weeks later his temperature rose abruptly to 102°F and his granulocyte count dropped to 100/µL (abnormally low). Cultures were taken of his blood and urine since no apparent source of infection was evident.


A strategy to answer the above question may be:

1. Are cancer patients prone to develop infections (i.e. bacterial) that are treatable by antibiotics?

2. Are patients with a profile similar to the above patient more or less likely to develop a NOSOCOMIAL infection?

3. If they are, what are the most likely organisms?

4. What are the antibiotics available that are best to treat the most probable organism?

A literature review found a study of over 5,000 cancer patients that examined the development of nosocomial infections. The DEFINITION of a case was a culture-proven infection beginning at least 48 hours after admission and no more than 48 hours after discharge. Results:

5,031 patients

596 cases meeting the definition.

Find the RISK:
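As a minimal sketch of the calculation requested here, risk (cumulative incidence) is the number of new cases divided by the number of people in the population under study. The function name `risk` is only illustrative; the counts are the ones reported for the study above.

```python
def risk(new_cases, population):
    """Cumulative incidence: new cases divided by the population under study."""
    return new_cases / population

# Counts reported for the nosocomial infection study:
# 5,031 patients, 596 cases meeting the definition.
nosocomial_risk = risk(596, 5031)

print(f"Risk = {nosocomial_risk:.3f} ({nosocomial_risk:.1%})")
```

The result is a probability with no units, which is why the time period it refers to has to be stated alongside it.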


Our patient had a fever and granulocytopenia. Thus if we can find a subset of the population studied that has these qualities it may refine our RISK estimate. A study showed the following information:

1,022 cancer patients were studied.

530 developed a clinically documented bacterial infection.


Conclusion relative to treatment with an antibiotic?

Recall your microbiology and your clinical correlations and determine the most likely organism. In the rare event that you can't recall, you can again use the literature to help out. One study showed a group of patients at an outpatient clinic and showed the following:

96 patients were chosen for study, since they had no apparent skin infection.

62 patients were positive for Staphylococcus aureus.


The 5,031 patients remained under observation for a total of 127,859 patient-days (or an average of 25.4 days). A total of 596 patients developed an infection that met the definition for a hospital acquired infection.
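The figures in this paragraph are enough to compute an incidence rate, whose denominator is PERSON TIME rather than persons. The sketch below uses only the numbers given above; the variable and function names are illustrative, and expressing the rate per 1,000 patient-days is simply one convenient choice of scale.

```python
def incidence_rate(new_cases, person_time):
    """New cases divided by the total person-time observed."""
    return new_cases / person_time

patients = 5031
patient_days = 127_859
infections = 596

# Average length of observation per patient, in days.
average_stay = patient_days / patients

# Incidence rate per patient-day, then rescaled per 1,000 patient-days.
rate_per_day = incidence_rate(infections, patient_days)
rate_per_1000_days = rate_per_day * 1000

print(f"Average observation: {average_stay:.1f} days")
print(f"Incidence rate: {rate_per_1000_days:.2f} infections per 1,000 patient-days")
```

Unlike risk, this measure carries units (cases per person-time), so it describes how rapidly infections occur rather than the probability that a given patient becomes infected.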



Survival

Probability of remaining alive for a specified period of time.

Calculated by:

$$S = \frac{A - D}{A}$$

Where S = survival

A = number of newly diagnosed cases

D = number of deaths observed in the newly diagnosed cases after following them for a specified period of time.

For acute leukemia the 5 year survival rate has been reported to be only 9% on average. For those younger than 65 years of age, the survival rate is approximately 14%, for those 65 years or older the figure is only 2%. We calculate these figures by means of a life table.

Case Fatality

The proportion of people who die from a given disease.

It is calculated by:

$$\text{Case Fatality} = \frac{D}{A}$$

Where D = number of deaths

A = Number of diagnosed patients

Below are some of the commonly occurring measures of Natality (measures concerning birth related events), Morbidity (measures concerning illness related events) and Mortality (measures concerning death related events).

| Name of rate or ratio | Numerator | Denominator | Expressed per number at risk |
|---|---|---|---|
| Death rate | Total number of deaths reported during a specified time interval. | Estimated mid-interval population. | per 1,000; per 10,000; per 100,000 |
| Birth rate | Number of LIVE births reported during a specified time interval. | Estimated mid-interval population. | per 1,000 |
| Fertility rate | Number of live births reported during a specified time interval from mothers 15-44 years. | Estimated number of women in age group 15-44 at mid-interval. | per 1,000 |
| Low birth weight ratio (this is really a proportion) | Number of live births under 2,500 grams (5 1/2 lb.) during a specified time interval. | Number of LIVE births reported during the same time interval. | per 100 |
| Incidence rate | Number of NEW cases of a specified disease reported during a specified time interval. | Estimated mid-interval population at risk. | per 100; per 1,000 |
| Attack rate (WARNING! This is really a proportion) | Number of new cases of a specified disease reported during a specified time interval. | Susceptible population at risk during the same time interval. | per 100; per 1,000 |
| Point prevalence ratio | Number of current cases of a disease existing at a specified point in time. | Estimated population at risk at the same point in time. | per 100; per 1,000; per 10,000 |
| Period prevalence ratio | Number of current cases of a specified disease during a specified time interval. | Estimated mid-interval population at risk. | per 100; per 1,000 |
| Proportionate mortality ratio (PMR) | Number of DEATHS assigned to a SPECIFIC cause. | TOTAL number of deaths from ALL causes reported during the same interval. | per 100 or 1,000 |
| Infant mortality rate | Number of deaths UNDER 1 yr. of age reported during a specified time interval, usually a calendar year. | Number of LIVE births reported during the same time interval. | per 1,000 |
| Fetal death rate | Number of fetal deaths of 28 weeks or more gestation reported during a specified time interval, usually a calendar year. | Number of fetal deaths of 28 weeks or more gestation reported during the same time interval PLUS the number of live births occurring during the same time interval. | per 1,000 |
| Fetal death RATIO (often confused with the above) | Number of fetal deaths of 28 weeks or more gestation reported during a specified time interval. | Number of live births reported during the same time interval. | per 1,000 |
| Cause-specific death rate | The number of deaths assigned to a SPECIFIED CAUSE during a specified time interval. | Mid-interval population. | per 100,000 |

Additional terms used in the medical literature.




Some things from your past:



If the value of the Fraction is GREATER than 1, then the NUMERATOR is ____________________ the DENOMINATOR.

If the value of the Fraction is LESS than 1, then the NUMERATOR is ____________________ the DENOMINATOR.

If the value of the Fraction is equal to 1, then the NUMERATOR is ____________________ the DENOMINATOR.

Consider the following statement:

Of the people who turned in the Biographical Sheet at the first lecture, __ were men and __ were women.

What is the ratio of MEN to WOMEN?

What is the ratio of WOMEN TO MEN?

What proportion of the responders were WOMEN?

Consider the following statement:

The crude death rate in Florida is 10.9, the crude death rate in Alaska is 4.4. Therefore, if you want to live longer, go to Alaska.

We must be able to account for variables that affect the DEPENDENT variable of a study. The phrase you will see in the literature is ADJUSTED FOR. The most common variables ADJUSTED FOR are age, sex, and race. This technique SHOULD be used anytime you are COMPARING two groups that differ on a variable (like age) that is related to your outcome.

There are several techniques that one can use to ADJUST FOR a variable. One is a statistical technique called covariance, which we will not discuss; another is called DIRECT ADJUSTMENT and is used when reporting certain indices of health.

The following examples illustrate the difference between unadjusted (crude) and adjusted figures and how a direct standardization is done.
[Table: Infants Born (N in 1000s) by Birth Weight category, including 1500 - 2499g, for Developed Country A and Developing Country B.]


Crude Infant mortality rate for Country A =

Crude Infant mortality rate for Country B =

Is this reasonable -- or is there something that confuses the comparison?

Weight Specific Rate for developed country:

<1500g = (870/20,000) = .0435 = 43.5 per 1000

1500 - 2499 = (480/30,000) = .016 =

>2499 =

We now need a common linkage of WEIGHT between the two countries. For simplicity let us use the weight distribution of Country A. The question now becomes:

If the developing country had the same weight distribution as the developed country, how would the death rates compare? THIS COMMON DISTRIBUTION ALLOWS FOR AN "EQUAL" COMPARISON.

Adjusted Rate =

{(62 x 20) + (20 x 30) + (9 x 150)} / (20 + 30 + 150)

= 3190/200,000 = .01595 = 15.95 per 1000
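The direct adjustment above can be sketched in a few lines. This is only an illustration: the weight-specific rates for Country B (62, 20 and 9 per 1000) and the births for Country A (20, 30 and 150 thousand) are read off the formula above, while the dictionary and function names are hypothetical.

```python
# Weight-specific infant mortality rates for Country B (deaths per 1000 births),
# as they appear in the adjusted-rate formula above.
rates_b_per_1000 = {"<1500g": 62, "1500-2499g": 20, ">2499g": 9}

# Standard population: Country A's births in each weight category, in thousands.
standard_births_thousands = {"<1500g": 20, "1500-2499g": 30, ">2499g": 150}

def directly_adjusted_rate(rates_per_1000, standard_thousands):
    """Apply each specific rate to the standard population and pool the results."""
    # (deaths per 1000 births) x (births in thousands) = expected deaths
    expected_deaths = sum(
        rates_per_1000[category] * standard_thousands[category]
        for category in rates_per_1000
    )
    total_births_thousands = sum(standard_thousands.values())
    # expected deaths / births in thousands = deaths per 1000 births
    return expected_deaths / total_births_thousands

adjusted_b = directly_adjusted_rate(rates_b_per_1000, standard_births_thousands)
print(f"Adjusted rate for Country B = {adjusted_b:.2f} per 1000")
```

Because the births are kept in thousands, dividing the 3190 expected deaths by 200 gives the rate per 1000 directly, which is the same quantity as 3190/200,000 = .01595.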
Table 1. Calculation of the Age-adjusted Mortality Rates from all Causes by the Direct Method: United States, 1950 and 1960.

| Age Group | Mortality from All Causes per 100,000 Population, 1950 | Mortality from All Causes per 100,000 Population, 1960 | Standard Population: Total U.S. Enumerated Population per 1,000,000 | Expected Number of Deaths that would Occur in Standard Population at Rates in 1950 | Expected Number of Deaths that would Occur in Standard Population at Rates in 1960 |
|---|---|---|---|---|---|
| <1 | 3,299.2 | 2,696.4 | 15,343 | 506.2 | 413.7 |
| 1-4 | 139.4 | 109.1 | 64,718 | 90.2 | 70.6 |
| 5-14 | 60.1 | 46.6 | 170,355 | 102.4 | 79.4 |
| 15-24 | 128.1 | 106.3 | 181,677 | 232.7 | 193.1 |
| 25-34 | 178.7 | 146.4 | 162,066 | 289.6 | 237.6 |
| 35-44 | 358.7 | 299.4 | 139,237 | 499.4 | 416.9 |
| 45-54 | 853.9 | 756.0 | 117,811 | 1,006.0 | 890.7 |
| 55-64 | 1,901.0 | 1,735.1 | 80,294 | 1,526.4 | 1,393.2 |
| 65-74 | 4,104.3 | 3,822.1 | 48,426 | 1,987.5 | 1,850.9 |
| 75-84 | 9,331.1 | 8,745.2 | 17,303 | 1,614.6 | 1,513.2 |
| 85+ | 20,196.9 | 19,857.5 | 2,770 | 559.5 | 550.4 |
| Total death rate all ages | 963.8 | 954.7 | -- | -- | -- |
| Total population | -- | -- | 1,000,000 | -- | -- |
| Total expected number of deaths | -- | -- | -- | 8,414.5 | 7,609.7 |
| Age-adjusted death rate per 100,000 | -- | -- | -- | 841.45 | 760.97 |
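The bottom rows of Table 1 can be reproduced from the age-specific rates and the standard population. The sketch below assumes the first rate column is 1950 and the second is 1960, as the table title suggests; the names `table1` and `age_adjusted_rate` are illustrative. Recomputed values should agree closely with the printed ones, though small differences can arise because the printed entries are rounded.

```python
# (age group, 1950 rate, 1960 rate, standard population)
# Rates are deaths per 100,000; the standard population sums to 1,000,000.
table1 = [
    ("<1",    3299.2,  2696.4,  15_343),
    ("1-4",    139.4,   109.1,  64_718),
    ("5-14",    60.1,    46.6, 170_355),
    ("15-24",  128.1,   106.3, 181_677),
    ("25-34",  178.7,   146.4, 162_066),
    ("35-44",  358.7,   299.4, 139_237),
    ("45-54",  853.9,   756.0, 117_811),
    ("55-64", 1901.0,  1735.1,  80_294),
    ("65-74", 4104.3,  3822.1,  48_426),
    ("75-84", 9331.1,  8745.2,  17_303),
    ("85+",  20196.9, 19857.5,   2_770),
]

def age_adjusted_rate(rows, rate_column):
    """Direct method: expected deaths in the standard population, per 100,000."""
    expected_deaths = sum(row[rate_column] * row[3] / 100_000 for row in rows)
    standard_total = sum(row[3] for row in rows)
    return expected_deaths / standard_total * 100_000

standard_total = sum(row[3] for row in table1)
adjusted_1950 = age_adjusted_rate(table1, 1)
adjusted_1960 = age_adjusted_rate(table1, 2)

print(f"Age-adjusted death rate, 1950: {adjusted_1950:.2f} per 100,000")
print(f"Age-adjusted death rate, 1960: {adjusted_1960:.2f} per 100,000")
```

Note how the crude rates (963.8 and 954.7) barely differ, while the adjusted rates show a much larger decline once both years are given the same age distribution.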

Sensitivity and Specificity

In trying to determine the etiology of illness (diagnosis) it is often necessary to use the laboratory for additional information. This is so common that many practicing physicians and other health professionals take the tests for granted. Two points MUST be kept in mind:

1. The laboratory test can (should) only CONFIRM the diagnosis.

2. BASIC ASSUMPTION is that the lab test can be trusted. For your own benefit and that of your patient check out:

a. Is anything going on that would alter the lab test:

1. Medications that conflict with the lab test?

2. Comorbidity that will interfere with the lab test?

3. Proper procedure followed by the PATIENT, PHYSICIAN, LAB PERSONNEL?

b. Is the test valid?

Validity is loosely defined as "appropriate for the task". That is to say, is the right test being ordered for the question you have in mind? Should you do a brain biopsy for suspected diabetes? If you determine that the test is appropriate, can you trust the results? This question of trust is what makes the notion of sensitivity and specificity important.

The table below is the reference point for all the terms that follow:

| | Disease + | Disease - | Total |
|---|---|---|---|
| Test + | a | b | a + b |
| Test - | c | d | c + d |
| Total | a + c | b + d | a + b + c + d |


1. Sensitivity:

The ability to correctly identify those who have the condition or disease. This translates statistically as THE PROPORTION OF THOSE WITH THE DISEASE WHO TEST POSITIVE.

$$\text{Sensitivity} = \frac{a}{a + c}$$

2. Specificity:

The ability to correctly identify those who DO NOT HAVE the condition or disease. This translates statistically as THE PROPORTION OF THOSE WHO DO NOT HAVE THE DISEASE WHO TEST NEGATIVE.

$$\text{Specificity} = \frac{d}{b + d}$$

3. Prevalence:

The proportion of those people in the "at risk" population who currently have the disease. This translates statistically as

$$\text{Prevalence} = \frac{a + c}{a + b + c + d}$$

4. Positive Predictive Value:

The proportion of those people WHO TEST POSITIVE who actually have the disease. This translates statistically as

$$\text{Positive Predictive Value} = \frac{a}{a + b}$$

5. Negative Predictive Value:

The proportion of those people WHO TEST NEGATIVE who are actually free of disease. This translates statistically as

$$\text{Negative Predictive Value} = \frac{d}{c + d}$$

6. False Positive Rate:

The proportion of those people WHO ARE DISEASE FREE, who have positive tests. This translates statistically as

$$\text{False Positive Rate} = \frac{b}{b + d}$$

7. False Negative Rate:

The proportion of those people WHO ARE SICK, whose lab test is negative. This translates statistically as

$$\text{False Negative Rate} = \frac{c}{a + c}$$


Steps to take in determining indices of diagnostic tests


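| | Target Disorder Present | Target Disorder Absent | Total |
|---|---|---|---|
| Lab Test + | (4) | (7) | |
| Lab Test - | (6) | (5) | |
| Total | (2) | (3) | (1) |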
1 = Arbitrary Sample Size

2 = Prevalence x (1)

3 = (1) - (2)

4 = Sensitivity x (2)

5 = Specificity x (3)

6 = (2) - (4)

7 = (3) - (5)

EXAMPLE: Assume you are looking for a disease that has a PREVALENCE of 2% in the population of interest. The test you are going to use has a SENSITIVITY of 90% and a SPECIFICITY of 95%. What is the PPV and NPV of the test?
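One way to work this example is to follow the seven steps listed above with an arbitrary sample size. The sketch below is a minimal illustration; the function name `predictive_values` and the choice of 10,000 as the sample size are assumptions, and any sample size gives the same predictive values.

```python
def predictive_values(prevalence, sensitivity, specificity, sample_size=10_000):
    """Fill in the 2 x 2 table step by step and return (PPV, NPV)."""
    diseased = prevalence * sample_size          # step 2: prevalence x (1)
    not_diseased = sample_size - diseased        # step 3: (1) - (2)
    true_positives = sensitivity * diseased      # step 4: sensitivity x (2)
    true_negatives = specificity * not_diseased  # step 5: specificity x (3)
    false_negatives = diseased - true_positives      # step 6: (2) - (4)
    false_positives = not_diseased - true_negatives  # step 7: (3) - (5)

    ppv = true_positives / (true_positives + false_positives)
    npv = true_negatives / (true_negatives + false_negatives)
    return ppv, npv

# Prevalence 2%, sensitivity 90%, specificity 95%, as in the example.
ppv_example, npv_example = predictive_values(0.02, 0.90, 0.95)

print(f"PPV = {ppv_example:.3f}")
print(f"NPV = {npv_example:.3f}")
```

With these inputs roughly one positive result in four belongs to someone who actually has the disease, even though the test itself is 90% sensitive and 95% specific.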

Points to Help Interpret Test Results

1. Sensitivity and Specificity do not depend on Prevalence of Disease.

2. Positive Predictive Value and Negative Predictive Value DO depend on Prevalence of disease.

3. If the prevalence of disease is low in your patient population (rare), then most of your positive test results will be FALSE POSITIVES.

4. If the prevalence of disease is high in your patient population, then most of your positive test results will be TRUE POSITIVES.
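The points above can be checked numerically. The sketch below holds sensitivity and specificity fixed at the 90% and 95% used in the earlier example and varies only the prevalence; the prevalence values themselves are hypothetical and chosen purely for illustration, as is the function name.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Proportion of positive tests that are true positives."""
    true_positive_fraction = sensitivity * prevalence
    false_positive_fraction = (1 - specificity) * (1 - prevalence)
    return true_positive_fraction / (true_positive_fraction + false_positive_fraction)

sensitivity = 0.90
specificity = 0.95

# Hypothetical prevalences, from rare to common.
prevalences = [0.001, 0.02, 0.20, 0.50]
ppv_by_prevalence = {
    p: positive_predictive_value(p, sensitivity, specificity) for p in prevalences
}

for p, value in ppv_by_prevalence.items():
    print(f"Prevalence {p:>6.1%}: PPV = {value:.3f}")
```

The test's own properties never change in this loop, yet the share of positive results that are true positives rises steadily with prevalence.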

Final Comments

While it is desirable to be able to calculate the various indices of a test, it is critical that you understand their application to a medical or health situation. As physicians, we want to RULE IN or RULE OUT disease states or conditions. That is to say, we want to say to someone YES, you have had a heart attack or NO, you did NOT have a heart attack. When presented with a patient you do a history and physical (to the extent possible), and while doing so you begin to generate a list of possible etiologies for the signs or symptoms elicited from the patient. This is loosely called generating a differential diagnosis. As you go along you begin to eliminate explanations from the list. When you have reduced the list as far as you can, you then appeal to ancillary sources, like the laboratory or consultants or other medical students. The goal is to reduce the list so that fewer and fewer possible explanations for the signs, symptoms or hypothesized disease state of the patient remain. This elimination process is called RULING OUT etiologies. Those that remain are ruled in and must be investigated thoroughly.

Since we do not know for sure what the disease is, we might seek additional laboratory results to help. WE THUS WILL BE IN A POSITION TO KNOW ONLY THE LAB TEST RESULTS. If the lab test is POSITIVE we would like to say you have disease X. The statistical translation of this process is to determine the probability that someone with a positive test actually has the suspected disease. This is the positive predictive value of the test. What we have to agonize over is the fact that UNFORTUNATELY there will be someone who tests positive for the disease and yet WILL ACTUALLY NOT HAVE THE DISEASE IN QUESTION. We will erroneously tell someone YOU HAVE THE DISEASE when in fact they do not. They were falsely positive or FALSE POSITIVES. Similarly, if their test results were NEGATIVE, we would like to declare them to be FREE OF DISEASE. The statistical translation of this situation is the probability that someone who tests negative is really free of disease. This is the negative predictive value of the test. Again, we have to agonize over the fact that someone who tests negative WILL ACTUALLY HAVE THE DISEASE. These people are falsely negative or FALSE NEGATIVES.

In the 2 x 2 table given in lecture, which precedes this explanation, the FALSE POSITIVES are indicated by the letter b. The FALSE NEGATIVES are indicated by the letter c. The cell identified as a is usually called the TRUE POSITIVES since these individuals have POSITIVE test results AND are indeed diseased. Similarly, the cell identified as d is termed the TRUE NEGATIVES since these people have NEGATIVE test results and are indeed free of disease.

To appreciate these terms let us consider Acquired Immunodeficiency Syndrome (AIDS). The ultimate definition of the disease rests on certain lab tests as well as physical presentation. The precursor of the disease rests on a laboratory test which detects the presence or absence of antibodies to the virus itself. Since detecting the antibody is cheaper than growing the virus, and since one develops an antibody only after exposure to a virus, one infers that if one is antibody positive then one has been exposed to the "AIDS VIRUS" (which is more appropriately referred to as HIV, Human Immunodeficiency Virus). If I tell someone YOU are HIV positive, based on a positive lab test, but the person is really FALSELY POSITIVE, what are the consequences? We know that the disease engenders panic and many times despair in the individual as well as the public. There are consequences for the person's sexual partner, potential childbearing, employment, insurance, long-term plans, and a plethora of social as well as medical disasters. On the other hand, if the person's test result is negative, and I tell the person they are not infected, but the person is FALSELY NEGATIVE, what are the consequences? Certainly false hope is engendered. The person may donate blood, engage in risky behavior, and from a social point of view, become an unknowing risk to society.

Thus the consequences of a testing situation must be evaluated before a decision to run a lab test is made. While I have chosen a real situation, it is admittedly dramatic. One forgets that the same thought process must be applied even when the test is, as some people say, routine. There is always the risk to the patient that must be weighed against the benefit to the patient. Will the test CHANGE YOUR TREATMENT PLAN? If the answer is NO, then you should not do the test. If the answer is YES, then at least you have a context within which to interpret the test.

There is much talk about defensive medicine these days. You will hear repeatedly, "I did the test for legal reasons, not for medical reasons." Studies done to evaluate this claim estimate that only about 10% of the lab work done can be termed defensive. The rest may really be unnecessary. You will be taking a course in medical jurisprudence later in your stay here; ask the instructor about this figure and for advice on how to proceed.

[Figure: effect of the cutoff point on test results. Labels: cutoff point set too low; cutoff point of greater sensitivity; cutoff point of greater specificity; true positives for cutoff point X.]


Variable

The term implies something that changes. How much is the change? This implies something that is:

1. Measurable

2. Discernible (or Observable)

Independent Variable:

Those features which describe or discern individuals or groups of individuals prior to the start of the study. AGE, SEX, PRIOR LAB VALUES, EXPOSURE TO A DRUG, ABSENCE OF EXPOSURE, ETC. One usually sees this defined in statistical texts as those variables under the control of the investigator. This is true if you understand that this means YOU can choose how to categorize or pick patients for a study.

Dependent Variable:

That which serves to assess the OUTCOME of the study.

Synonyms for Independent Variables -- predictors, precursors, "cause"

Synonyms for Dependent Variables -- predicted, outcome, result, consequent, effect


1) Married men will incur more physician visits than single men.

Independent variable ___________________________________________

Dependent variable ____________________________________________

Marital Status ------------------------------> # of physician visits

# of physician visits ----------------------> Marital status

2) What is the relationship between menstrual cycle phase at the time of surgery and the onset of the first post-operative menses?

Independent variable __________________________________________

Dependent variable ___________________________________________

Levels of Measurement

1. Nominal (Attribute, Qualitative)

Naming, categorical

a. Dichotomous

b. Polychotomous

2. Ordinal (Qualitative)

Ranked or ordered along some property of the variable.

Example: Stages of Cancer

Stage I

Stage II

Stage III

Example: Satisfaction with Personal Physician.

1. Very Satisfied

2. Somewhat Satisfied

3. Somewhat Unsatisfied

4. Very Unsatisfied

3. Interval (Quantitative, Continuous)

"Ratio" Interval + a real (or meaningful) zero. This course will treat ratio data as interval data. We will also, from here on refer to ratio data as results formed when one number is divided by another.

For purposes of this course and for general medicine, INTERVAL will be the most precise form of measurement we will need. You can think of this as forming an interval by subtracting the value at one end of a spectrum from the value at the opposite end of the spectrum in question. Consider the variable temperature. A temperature of 101°F yesterday and a temperature of 98°F today give an interval of 3 degrees. The INTERVALS between successive measurements are EQUAL. There are two things implicit in this level of measurement:

a. The variable is continuous. That is to say, we can get as precise as we want in the measurement of the variable.

b. An interval of a given length is interpretable no matter where it is on the scale.

Example: Temperature, Height, Weight.

Example: Exercise levels.

0. No exercise

1. Moderate exercise, no sweating.

2. Exercise to the point of sweating.

3. Strenuous exercise to the point of sweating 30 minutes a day.

4. Strenuous exercise for at least 1 hour per day.

Teaching point to remember for LEVELS OF MEASUREMENT

1. All categories of measurement must be mutually exclusive.

2. All categories of measurement must be jointly exhaustive.

3. Levels of measurement is a "one-way street".

Independent Variables should temporally and logically precede Dependent Variables


Elementary Descriptive Statistics

Measures of Central Tendency

A single number that best represents a group of observations.

For the definitions below consider the following example:

A group of children are brought to an emergency room after a flood. Their ages are: 1,1,1,6,4,6.


Mode

The most frequently occurring number in a series of numbers. There may be NO mode (all numbers occurring only once) or there may be several modes (more than one number occurring with greater frequency than all the other numbers).

The mode is generally used for nominal data. It is the quickest calculation to make.

Calculate by noting the frequency with which each value in a series occurs.

In the example above, the MODE = 1.


Median

This is a number which divides the frequency of observations into two equal parts. This number may be an actual observation, or a contrived one.

The median is generally used for ordinal data. It is easy to calculate but I suggest the following approach:

1. Arrange the numbers in ORDER from Low to High.

2. If the total number of observations is EVEN

add the two middle numbers together and divide by 2.

If the total number of observations is ODD

the median is the middle number.

Consider the example above:

Step 1. Order the numbers from Low to High

1, 1, 1, 4, 6, 6

Step 2. Is there an ODD or EVEN number of observations? There are six observations, so we add the two middle numbers together (1 + 4) and divide by 2. Thus the median is 2.5. Here the median is a CONTRIVED number in the sense that it is created and never appears as a real observation.

If there had been an ODD number of observations, for example:


We would again ORDER the set of observations:


and then choose the number which divides the set in half. In this example the median would be 4, since 2 observations lie to the left of 4 and 2 observations lie to the right of 4.

You have encountered the MEDIAN before, perhaps not by name, but by application. The 50% mark divides a group of observations in half thus it is the median. The next time you see a grade sheet, the number correct that is equal to 50% of the observations is the median mark. You can judge whether you are above or below the median by comparing your mark to this reference point.

The big advantage is that the median is unaffected by extreme scores. For example consider the following set of observations:


The median is still 4, since there are 2 observations to the left of 4 and two observations to the right of 4.


Mean

This is usually taken to be the arithmetic mean. It is defined as the sum of all the observations divided by the number of observations in the series.

It is most often used with INTERVAL data.

The biggest problem with the MEAN is that it is markedly affected by extreme scores.

In the example above, the mean is calculated as follows:

Σx = 1 + 1 + 1 + 4 + 6 + 6 = 19

N = 6 since there are 6 observations. Thus the mean is equal to 19/6 = 3.17. Notice please that this number is contrived. There is no 3.17 among our observations. It is nonetheless the best number to represent the average of the observations.

As a point of emphasis, consider the following set of observations:

1,1,2,6. The mean is 10 divided by 4 = 2.5

1,1,2,6000. Here the mean is 6004 divided by 4 = 1501. Note how the mean is markedly affected. IT MOVES IN THE DIRECTION OF THE EXTREME SCORE.
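The three measures of central tendency can be checked with Python's standard `statistics` module. This is only a sketch using the ages from the emergency room example and the two short series above; the variable names are illustrative.

```python
import statistics

# Ages of the children brought to the emergency room.
ages = [1, 1, 1, 6, 4, 6]

mode_age = statistics.mode(ages)      # most frequently occurring value
median_age = statistics.median(ages)  # averages the two middle values when N is even
mean_age = statistics.mean(ages)      # sum of observations divided by N

print(f"Mode = {mode_age}, Median = {median_age}, Mean = {mean_age:.2f}")

# Effect of one extreme score on the mean and on the median.
typical = [1, 1, 2, 6]
extreme = [1, 1, 2, 6000]

mean_typical = statistics.mean(typical)
mean_extreme = statistics.mean(extreme)
median_typical = statistics.median(typical)
median_extreme = statistics.median(extreme)

print(f"Means:   {mean_typical} vs {mean_extreme}")
print(f"Medians: {median_typical} vs {median_extreme}")
```

The mean moves from 2.5 to 1501 when the extreme score is introduced, while the median of both series stays the same.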

Measures of Dispersion


Range

The lowest number in a series subtracted from the highest number in the series.

Quick to compute but is of limited utility. May be used to calculate sample size.

For the series: 1,1,1,4,6,6 the range is 6 - 1 = 5.


Variance

This is one of the most commonly used terms to describe the spread of a set of observations. It is calculated by formulae in your readings. I will not ask you to calculate it for this class.


A. Large

Range of numbers

Number of people or observations in the study is small

B. Small

Range of numbers

Number of people or observations in the study is LARGE



Standard Deviation

This is the positive SQUARE ROOT of the Variance. Again, I will not ask you to calculate this for the class. The interpretation is important, however.


In certain circumstances determine NUMBER of OBSERVATIONS around the mean of a set of observations.

Accuracy of a set of readings


Range of numbers

Number of observations
Table 2. Summary of the Characteristics of 93 Patients Admitted to the Hospital for Suspected MI

                                (N = 43)         (N = 50)
No. of patients assigned to:
No. of patients with MI
No. of patients without MI
Age (years ± SD)                64.1 ± 14.9      71.0 ± 14.9
HDPI score (± SD)               0.369 ± 0.241    0.365 ± 0.243

CCU denotes cardiac care unit; MI, myocardial infarction; HDPI, Acute Ischemic Heart Disease Predictive Instrument.

Source: Green and Ruffin, MI Treatment in Men vs. Women, The Journal of Family Practice, Vol. 36, No.4, 1993


Coefficient of Variation

This is calculated by dividing the standard deviation of a set of observations by the mean of the observations.

This is a quick way of comparing the dispersion of two different sets of observations.
Males Females
s.d. =  s.d. =


Coef. of Variation:

Males =

Females =

This is only an exercise! We would only use this C.V. in the situation when the variable of interest is measured on different scales in two different studies.
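As a minimal sketch of the calculation, the function below divides a standard deviation by a mean. The numbers are the two age summaries (mean ± SD) printed in Table 2; they are used here only as an illustration and are not the answers to the exercise blanks above. The function name is made up for this example.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation divided by the mean (often quoted as a percent)."""
    return sd / mean

# Age in years from Table 2: 64.1 ± 14.9 and 71.0 ± 14.9
cv_first_group = coefficient_of_variation(14.9, 64.1)
cv_second_group = coefficient_of_variation(14.9, 71.0)

print(f"{cv_first_group:.1%}  {cv_second_group:.1%}")
```

Both groups have the same standard deviation, but the group with the lower mean has the slightly larger coefficient of variation.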

Statistical Testing

Researchers in the medical field may be motivated by different factors, but it is undeniable that people anticipate the results of their investigations. This anticipated result, sometimes vague, leads one to initiate a systematic set of activities which, hopefully, will lead to an unbiased determination of whether the idea has merit or not. The basic ground rule for going into these activities is usually that the initial idea is testable. This means it is capable of being rendered true or false by empirical data. This initial notion is sometimes called the SCIENTIFIC or RESEARCH HYPOTHESIS. It is really what the anticipated results might look like. The scientific hypothesis (or hypotheses) is then translated into a SET of specific statements called STATISTICAL HYPOTHESES, which are then evaluated by statistical techniques. The plausibility of these statistical hypotheses, decided after this evaluation process, gives credence to the research hypothesis. This evaluation process is called STATISTICAL TESTING. The following are examples of scientific hypotheses:

A) Early detection of breast cancer will increase the proportion of women who survive 5 years.

B) ACE inhibitors will have fewer side effects in hypertensive African-American males.

C) Combination therapy in HIV positive patients will persistently decrease viral load during the period of administration.

Notice that these statements reflect only the anticipated findings of the research. There may be no mention of comparison groups, etc., although these can certainly be included. These details must be attended to in the translation of these ideas into statistical hypotheses. For example:

Example: Lumpectomy vs. Lumpectomy plus radiation

Scientific hypothesis: Among women with ductal carcinoma in situ who undergo a lumpectomy alone or lumpectomy plus radiation, a difference will exist between the proportion experiencing tumor recurrence in the treated breast within five years after treatment.

Statistical Hypotheses: Eight hundred women with ductal carcinoma in situ were sampled and assigned randomly to have lumpectomy alone or lumpectomy plus radiation. Within 5 years, 56 of the 400 women who had lumpectomy alone had recurrence of cancer in the treated breast; 16 of the 400 women who had lumpectomy plus radiation had recurrence in the treated breast. Let $\pi$ represent the proportion of women who experience tumor recurrence, L indicate lumpectomy alone, and L/R designate lumpectomy plus radiation. The STATISTICAL HYPOTHESES are:

$H_0: \pi_{L/R} - \pi_{L} = 0$

$H_1: \pi_{L/R} - \pi_{L} \neq 0$
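These notes do not say which statistical test was applied to the lumpectomy data. As one possible illustration, the sketch below computes the observed difference in recurrence proportions and a two-proportion z statistic with a pooled standard error. The choice of test and all variable names are assumptions made for this example.

```python
from math import sqrt

recur_l, n_l = 56, 400      # lumpectomy alone
recur_lr, n_lr = 16, 400    # lumpectomy plus radiation

p_l = recur_l / n_l         # observed proportion with recurrence, L
p_lr = recur_lr / n_lr      # observed proportion with recurrence, L/R
diff = p_lr - p_l           # sample estimate of the difference tested in H0

# Under H0 both groups share one proportion, so the two samples are pooled.
p_pool = (recur_l + recur_lr) / (n_l + n_lr)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_l + 1 / n_lr))
z = diff / se

print(round(p_l, 2), round(p_lr, 2), round(diff, 2), round(z, 2))
```

The z statistic is read much like the t value discussed later: it says how many standard errors the observed difference lies from 0, the value stated in the null hypothesis.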

I. Decision Making

a. Clinical Situation -- Choose between alternatives.

1. At least 2 alternatives -- Not Sick or Sick.

2. Not treat or Treat

b. Literature -- same notion.

1. No difference between Drug A and Drug B or Drug A is different than Drug B.

2. Cases no different from Controls or they are different.

3. Experimental group is the same as the comparison group or it is different.

4. Exposed group is not different from the unexposed group, or it is.

II. Sampling

a. In the clinical situation or the literature -- conclusions based on sampling.

e.g. HTN on visit -- is the person really hypertensive?

THE UNDERLYING CONCERN IS ALWAYS WHETHER OUR SAMPLES ARE TRULY REFLECTIVE OF THE TRUE STATE OF AFFAIRS. For it is from these samples that we will infer what goes on in the population from which these samples are theoretically obtained.

III. Hypotheses -- Something to be shown correct or incorrect.

a. From a literature (as well as a clinical) point of view, there are two statements:

* The groups do not differ from each other (they are equal to each other -- the apparent difference is due to chance).

** The groups in fact differ from each other (something is going on -- it is due to something other than chance).

b. Convention calls the statement of NO DIFFERENCE the NULL HYPOTHESIS ( H0 )

c. The alternative statement is called the ALTERNATIVE HYPOTHESIS. (H1 )

d. Mathematically:




Explicit Null Hypotheses Accompanying Decision Situations*

Decision Situation      Null Hypothesis
Diagnostic Tests        This patient's test is no different from the test of the group called well.
Clinical Trials         This experimental treatment is no different from the treatment it is compared with.
Quality Control         This batch of production is no different from the usual high-quality products of this company.
Patient Satisfaction    This patient is no different from those who have benefitted from this therapy in the past.
Judicial                This defendant is no different from the group of people whom we call not guilty.
Graduate Education      This candidate for graduate school is no different from those who have succeeded in the past.
Used Cars               This car is no different from those that have proved dependable in the past.
Connubial               This spouse is no different from faithful spouses.


* No different indicates having no important difference.

IV. p - Value

a. You always enter the situation ASSUMING THE NULL HYPOTHESIS IS TRUE. (i.e. There is no relationship between the variables -- the difference is equal to 0 or that the ratio is equal to 1).

b. The p-value is a measure of the compatibility of YOUR data with the NULL hypothesis.


if the p-value is large, "accept" the Null Hypothesis.

if the p-value is small, accept the Alternative Hypothesis.

c. Large vs. small is YOUR DECISION. This is referred to as the α-level. Think of this as the point BEYOND REASONABLE DOUBT.

d. For purposes of this course, the formal definition of a p-value is the probability of obtaining a result as large as or larger than the one you observed, assuming the null hypothesis is true.

e. Type I error.

Because the p-value is a probability (a proportion) it can range from 0 -- 1. It is calculated on the assumption that the null hypothesis is true. Thus you are saying: I know the null hypothesis is TRUE, but the finding I see is so weird, that I am going to conclude the null hypothesis is false and accept the alternative. IT IS A MISTAKE TO DO SO, BUT YOU DO IT ANYWAY. Your subjective estimate of reasonable doubt has been exceeded. Thus you have made an error. YOU have rejected a true hypothesis.

V. How the Statistical Testing Process proceeds.

1. State the hypothesis of what you think will happen.

2. Generate the NULL hypothesis.

3. Decide on your α-level.

4. Collect data.

5. Apply a statistical test.

6. Decide on the truth status of the NULL Hypothesis.

VI. Confidence Intervals

a. Point Estimation


Depends on 3 things:

1. Sample size

2. Sample variability

3. α-level

c. Length of the interval

d. Decision Rule:

If the confidence interval contains THE NULL VALUE (either 0 for interval data or 1 for ratio data) then DO NOT reject H0.
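The decision rule can be written as a one-line check. The sketch below applies it to two of the caffeine intervals reported in Table A, where the measure is a relative risk and the null value is therefore 1. The function name and labels are illustrative only.

```python
def contains_null(lower, upper, null_value):
    """True when the null value lies inside the confidence interval."""
    return lower <= null_value <= upper

# (label, lower limit, upper limit) for two 95% CIs from Table A
rows = [("1 - 150 mg/day", 0.7, 3.0), ("151 - 300 mg/day", 1.1, 5.2)]

decisions = {
    label: "do not reject H0" if contains_null(low, high, 1.0) else "reject H0"
    for label, low, high in rows
}
print(decisions)
```

The first interval includes 1 and the second does not, which agrees with the p values of 0.33 and 0.04 listed for the same rows.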


Table A

Effects of caffeine consumption and other risk factors on low birth weight according to logistic regression for term deliveries, Yale-New Haven Hospital, 1980-1982

Parameter                   Adjusted relative risk   95% CI*       p value
Caffeine intake (mg/day)
  1 - 150                   1.4                      0.7 - 3.0     0.33
  151 - 300                 2.3                      1.1 - 5.2     0.04
  ≥ 301                     4.6                      2.0 - 10.2    0.0004
Nonwhite ethnicity          4.0                      2.4 - 6.6     0.0000
Parity 0§                   2.0                      1.2 - 3.3     0.007
Cigarette smoking           1.7                      1.1 - 2.9     0.02
Gestational age             5.6                      4.3 - 7.2     0.0000


* 95% CI (categorical) = $\exp[\beta \pm 1.96(SE)]$; 95% CI (continuous) = $\exp[\beta(X_1) \pm 1.96(SE)(X_1)] / \exp[\beta(X_0) \pm 1.96(SE)(X_0)]$, where $X_1$ is the value of interest of the variable, and $X_0$ the reference value.

Reference category is 0 mg/day.

Black and other compared with white ethnicity to calculate relative risk.

§ Compared with parity 1 or more to calculate the relative risk.

One or more cigarettes/day compared with none to calculate the relative risk.

Continuous variable. Thirty-seven weeks compared with 40 weeks gestation to calculate the relative risk.
Table 3. Patient Characteristics and Site of Care by Race at Hospitalization for Angiography.

Characteristic                              Whites, %   Blacks, %   p*
Sociodemographic factors
  Female                                    37.6        52.0        <.001
  Medicaid eligible                         3.8         23.3        <.001
Principal diagnosis
  Myocardial infarction                     19.0        22.5        .003
  Unstable angina                           25.0        27.0        ns
  Angina pectoris                           11.5        13.1        ns
  Chronic ischemia                          44.5        37.4        <.001
Secondary diagnoses
  Congestive heart failure                  7.7         11.9        <.001
  Diabetes mellitus                         14.2        26.2        <.001
  Chronic renal failure                     0.7         2.8         <.001
  Peripheral vascular disease               3.4         4.1         ns
  Cerebrovascular disease                   3.6         2.7         ns
  Chronic obstructive lung disease          8.0         5.4         .002
Type of hospital
  Public                                    9.6         15.1        <.001
  Teaching                                  66.1        71.0        .001
  Urban/suburban                            92.7        90.2        .002
  Revascularization procedures available    84.2        77.7        <.001

* χ² test; ns indicates not significant.

Table 4. Unadjusted Rates of Revascularization Procedures Within 90 Days After Angiography, Stratified by Race and Type of Hospital.

Type of Hospital Where Angiography Performed   Whites, %   Blacks, %   Relative Risk   90% Confidence Interval
Public                                         55.4        35.2        1.58            1.28 - 1.94
Private                                        53.9        37.5        1.44            1.32 - 1.56
Teaching                                       54.6        37.1        1.47            1.34 - 1.62
Nonteaching                                    52.9        37.3        1.42            1.22 - 1.64
Urban/suburban                                 54.1        37.0        1.46            1.35 - 1.59
Rural                                          53.9        38.3        1.41            1.10 - 1.80
Revascularization procedures available         56.0        39.7        1.41            1.30 - 1.54
Revascularization procedures not available     43.4        28.3        1.53            1.25 - 1.88

Table 5. Significant Multivariate Predictors of Revascularization Procedures Within 90 Days After Angiography.*

Variable                                    Adjusted Odds Ratio   95% Confidence Interval
Sociodemographic factors
  White                                     1.78                  1.56 - 2.03
  Male                                      1.28                  1.22 - 1.35
  Medicaid eligibility                      0.80                  0.71 - 0.91
Principal diagnosis
  Myocardial infarction                     2.14                  1.94 - 2.35
  Unstable angina                           2.78                  2.54 - 3.04
  Chronic ischemia                          2.17                  1.99 - 2.35
Secondary diagnoses
  Congestive heart failure                  0.76                  0.69 - 0.83
  Peripheral vascular disease               0.74                  0.65 - 0.85
  Cerebrovascular disease                   1.20                  1.05 - 1.37
  Chronic obstructive lung disease          0.79                  0.72 - 0.87
Type of hospital
  Revascularization procedures available    1.63                  1.52 - 1.75
  Public                                    1.11                  1.02 - 1.21
  Northeast                                 0.73                  0.67 - 0.80
  South                                     0.72                  0.66 - 0.77
  Midwest                                   0.80                  0.74 - 0.87
* Using logistic regression to adjust for all listed variables and age, secondary diagnoses of diabetes mellitus and chronic renal failure, and the teaching status and urban or rural location of the hospital in which angiography was performed.

Relative to angina pectoris.

Relative to West. 

Table 6. Adjusted White-to-Black Odds Ratios for Revascularization Procedures Within 90 Days After Coronary Angiography by Type of Hospital.*

Type of Hospital Where Angiography Performed   White-to-Black Odds Ratio   95% Confidence Interval
Public                                         2.11                        1.51 - 2.95
Private                                        1.73                        1.49 - 1.99
Teaching                                       1.84                        1.58 - 2.16
Nonteaching                                    1.63                        1.28 - 2.08
Urban/suburban                                 1.79                        1.56 - 2.05
Rural                                          1.63                        0.93 - 2.86
Revascularization procedures available         1.79                        1.55 - 2.08
Revascularization procedures not available     1.72                        1.28 - 2.32
*Using logistic regression to adjust for age; sex; region of residence; Medicaid eligibility; principal coronary diagnosis; secondary diagnoses of congestive heart failure, diabetes mellitus, chronic obstructive pulmonary disease, chronic renal failure, cerebrovascular disease, and peripheral vascular disease; and the ownership, teaching status, location, and availability of revascularization procedures at the hospital in which angiography was performed.


Examples of Two Statistical Tests

Example 1. Does the Average Height of the Male medical students in the class of 1996 differ from the average height of the Female medical students?
Males:

Mean = 70.42 inches

Standard deviation = 2.74

Number = 53

Females:

Mean = 64.97 inches

Standard deviation = 3.248

Number = 39


The resulting t value can be interpreted as how many "standard deviations" you are from the middle of a distribution which has its center on 0. Based on this value you conclude that the null hypothesis of NO DIFFERENCE BETWEEN THE HEIGHTS OF MEN AND WOMEN IN THE CLASS OF 1996 SHOULD BE REJECTED, and the alternative to that null hypothesis should be accepted -- There is a statistically significant difference in the height of male medical students and female medical students.
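The calculation of the t value is not reproduced in these notes. The sketch below shows one common way to obtain it from the summary figures above, a two-sample t statistic with a pooled variance. It assumes that the group of 53 is the males and the group of 39 is the females (these counts match the gender totals in Example 2) and that a pooled-variance test is appropriate; the variable names are illustrative.

```python
from math import sqrt

mean_m, sd_m, n_m = 70.42, 2.74, 53     # males
mean_f, sd_f, n_f = 64.97, 3.248, 39    # females

df = n_m + n_f - 2
# Pooled variance: each group's variance weighted by its degrees of freedom.
pooled_var = ((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / df
se_diff = sqrt(pooled_var * (1 / n_m + 1 / n_f))   # standard error of the difference
t = (mean_m - mean_f) / se_diff

print("df =", df, " t =", round(t, 2))
```

With 90 degrees of freedom, a t value beyond roughly 1.99 in either direction corresponds to p < .05 (two-sided), so a value this far from 0 leads to rejecting the null hypothesis.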

Example 2.

Question: Is there any difference between men and women in their ranking of Knowledge and Competency as the highest ranking expectation of the American public regarding their physician.

H0: There is no association between gender of the respondent and the ranking of Knowledge and Competency as the highest ranking expectation of the American people relative to their physician.

H1: There is an association between gender and ranking.
Knowledgeable and Competent

          Highest   Second   Third   Fourth   Total
Male      33        9        7       4        53 (57.6%)
Female    27        6        6       0        39 (42.4%)
Total     60        15       13      4        92 (100%)


Expected Number of Observations Under the Null Hypothesis

Knowledgeable and Competent

          Highest   Second   Third   Fourth   Total
Male      34.6      8.6      7.5     2.3      53 (57.6%)
Female    25.4      6.4      5.5     1.7      39 (42.4%)
Total     60        15       13      4        92 (100%)


This information is combined in a statistic called the Chi-Squared Statistic.

Note the p-value. It is greater than the criterion level of .05; therefore the data are compatible with the null hypothesis. The null hypothesis is supported, provided we have enough people. Thus, there is no association between gender of the respondent and the ranking of Knowledge and Competency as the highest ranking expectation of the American people relative to their physician.
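The way the observed and expected counts are combined can be sketched as follows. Each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. This is the usual Pearson chi-squared computation written with plain Python; the variable names are illustrative.

```python
observed = {
    "Male":   [33, 9, 7, 4],
    "Female": [27, 6, 6, 0],
}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

chi_square = 0.0
for counts in observed.values():
    row_total = sum(counts)
    for obs, col_total in zip(counts, col_totals):
        expected = row_total * col_total / grand_total   # count expected under H0
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)
print("chi-square =", round(chi_square, 2), " df =", df)
```

For 3 degrees of freedom the .05 critical value is about 7.81, so a statistic this small gives a p-value well above .05, in line with the interpretation above.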

Prospective Studies


I. Synonyms

A. Cohort

B. Follow-up

II. Key Points

A. Disease Free

B. Assignment of patients done by "Nature" or Investigator

C. YOU decide what information to collect and when to collect it.

III. Advantages

A. Can estimate the incidence of a disease (or whatever dependent variable you are studying) with a high degree of accuracy.

B. Reduce the emphasis on RECALL

C. Can obtain information on changes in habits.

D. Provides the opportunity to study the whole spectrum of morbidity and/or mortality.

E. Avoids the "late-look" bias.

IV. Disadvantages

A. Difficult and expensive.

B. Induce change in habits.

C. Selection bias may be more difficult to detect.

D. Very inefficient for rare diseases.

Questions to be asked when faced with a

Prospective Study

1. Study Population: How is the study population selected? Can you determine if the population studied is similar to your own population? Is the study population composed of individuals who have special characteristics that would select them for membership in the study? Where did they come from (referral center, general practice, general population, etc.)?

2. Sampling Procedures: Can you tell how the individuals were picked? Can you detect any SELECTION BIAS? How do patients who were asked to participate BUT REFUSED differ from those who participated?

3. Follow-up: Is there loss to follow-up? Are the reasons for attrition likely to be related to the outcome of the study?

4. Habits: Did subjects change their habits while the study was in progress? If yes, were these individuals put into separate subgroups for analysis? Do the authors periodically reexamine the cohorts to see if habits change?

5. Surveillance bias: Is surveillance bias operating? Are the cohorts being followed with equal intensity? Or is high-powered scrutiny being applied to certain subjects, which may bias results?

Case-Control Studies

I. Reasons for a Case-Control Study

A. Efficiency

B. Rare Diseases

C. Ethics

II. Problems in Case-Control Studies

A. Adequacy of Information

B. Biased Recall

C. Selection of Controls

D. Selection of Cases

III. Control Group Considerations

A. Multiple Controls

B. Community Controls

C. Matching

1) Wasted Matching

2) Overmatching

Questions to ask when faced with a

Case Control Study

1. Are the Data Dependable?

--- Since the data are obtained from the past, are the records complete?

2. Is Recall Bias a serious danger here?

--- Have attempts been made to assess or control for such a bias?

3. How alike are the cases and controls?

--- Do they differ ONLY on the absence of disease? Are other differences present that MIGHT bear on either the risk factor or the outcome or both? If yes, did the study control for these differences (e.g., by matching or a statistical technique)?

4. What kind of population do the cases represent?

--- Heterogeneous population (high generalizability)

--- Homogeneous population (low generalizability)

5. Are other biases evident? (Can Apply to Both PROSPECTIVE and


a. Detection bias (heightened awareness)

b. Late-look bias (Neyman bias)

c. Non-response bias

d. Volunteer bias

e. **** SELECTION ******

f. Admission bias

Establishing a Statistical Association or Statistical Relationship

Assume we want to know if two variables are statistically related. This is also referred to as a statistical association. The idea is to determine if the presence of one variable affects the occurrence of a second variable.


Is the taking of oral contraceptives related to thrombophlebitis?

1) Independent variable ___________________________________

2) Dependent variable ____________________________________

Level of measurement of the independent variable _______________

Level of measurement of the dependent variable _________________

Let us look at two approaches -- Prospective and "Retrospective" or CASE-CONTROL.


Prospective:

                          Thrombophlebitis   NO Thrombophlebitis
Birth Control Pills       a = 30             b = 970               a + b = 1000
No Birth Control Pills    c = 3              d = 997               c + d = 1000

Case-Control:

                          Thrombophlebitis   NO Thrombophlebitis
Birth Control Pills       a = 90             b = 45
NO Birth Control Pills    c = 10             d = 55
                          a + c = 100        b + d = 100


Risk Ratio

The letters A - D below represent the numbers of subjects in the four possible combinations of exposure and outcome status (in this instance, death).

A. Exposed persons who later die.

B. Exposed persons who do not die.

C. Unexposed persons who later die.

D. Unexposed persons who do not die.

The total number of subjects in this study is the sum of A + B + C + D. The total number of exposed persons is A + B, and the total number of unexposed persons is C + D.
Table 7. Summary of risk data from a cohort study.

             Death1    No Death   Total
Exposed      A         B          A + B
Unexposed    C         D          C + D
Total        A + C     B + D      A + B + C + D

1 In some studies, the outcome is development of disease rather than death.

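Among exposed persons the risk (R) of death is defined as:

$R_{\text{exposed}} = \frac{A}{A + B}$

Among unexposed persons the risk (R) of death is defined as:

$R_{\text{unexposed}} = \frac{C}{C + D}$

The Risk Ratio (RR), or Relative Risk, is:

$RR = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}} = \frac{A / (A + B)}{C / (C + D)}$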
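As a worked illustration, the sketch below applies these definitions to the prospective oral contraceptive and thrombophlebitis table given earlier (a = 30, b = 970, c = 3, d = 997), with thrombophlebitis taking the place of death as the outcome. The function name is made up for this example.

```python
def risk_ratio(a, b, c, d):
    """Return (risk in exposed, risk in unexposed, risk ratio) for a cohort 2 x 2 table."""
    risk_exposed = a / (a + b)      # A / (A + B)
    risk_unexposed = c / (c + d)    # C / (C + D)
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed

r_exp, r_unexp, rr = risk_ratio(a=30, b=970, c=3, d=997)
print(r_exp, r_unexp, round(rr, 1))
```

The risk is 30 per 1000 among pill users and 3 per 1000 among non-users, so the risk ratio is 10.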

Table 8. Relationship between 10-minute Apgar Scores and the Risk of Death in
the First Year of Life Among Children with Birth Weights of at least 2500 g.1
No Death
Apgar Score 0-3
Apgar Score 4-6

1Data used, with permission, from Nelson KB, Ellenberg JH: Apgar scores as predictors of chronic neurologic disability. Pediatrics 1981; 68:36.

The risk among exposed newborns is:

The risk among "less exposed" newborns is:

Quantification of the magnitude of this effect is achieved by calculating the risk ratio:

Attributable Risk and Attributable Risk Percent

Attributable Risk or Risk Difference (or excess risk) = RD. This is defined as:

$RD = R_{\text{exposed}} - R_{\text{unexposed}} = \frac{A}{A + B} - \frac{C}{C + D}$

Using the previously cited data relating 10 minute Apgar scores (0 - 3) vs (4 - 6) to the risk of death in the first year of life, the risk difference is:

Another measure of interest is the attributable risk percent (ARP), in which the risk difference is expressed as a percentage of the total risk experienced by the exposed group:

$ARP = \frac{RD}{R_{\text{exposed}}} \times 100\%$

For the Apgar score-infant mortality data, the attributable risk percent is:
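The Apgar counts are not reproduced in these notes, so the sketch below illustrates the two measures with the prospective oral contraceptive and thrombophlebitis table instead (a = 30, b = 970, c = 3, d = 997). The variable names are illustrative.

```python
a, b, c, d = 30, 970, 3, 997

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

# Risk difference: the excess risk among the exposed.
risk_difference = risk_exposed - risk_unexposed

# Attributable risk percent: the risk difference as a percentage of the exposed group's risk.
attributable_risk_percent = 100 * risk_difference / risk_exposed

print(round(risk_difference, 3), round(attributable_risk_percent, 1))
```

Here the excess risk is 27 per 1000, which is 90% of the risk experienced by the exposed group.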

Sometimes studies are constructed to take into account how long someone stays in the study, for example, how many years a person is studied. When this information is provided and the duration of study is important, a common framework for analysis is as follows:
Table 8a: Summary format of rate data from a cohort study
Number of Outcomes Person-time (PT) usually in years
Exposed Persons
Unexposed Persons
A + C


Several studies have looked at the existence of risk factors for heart attacks among working individuals. Since people come and go from the work force, how long someone stays in the study is critical for forming inferences about the safety of a job or the potential effect heart attacks have on labor force questions. Assume you have access to information that looks at baseline cholesterol levels in a workforce and then whether the workers subsequently develop an MI. The baseline measure is like a screen; the outcome is MI. Assume you study approximately 40,000 individuals for an average of 15 years (some longer, some shorter). Your data may look like this:
Table 8b. Baseline Cholesterol Levels in a Cohort of Men followed for an average of 15 years and subsequently Developing an MI. 
Chol. Lvl <5.1 mmol/L3
Chol. Lvl 5.2 - 6.2 mmol/L3


The interpretation would be that the rate of MI among white males with borderline high cholesterol levels was about three and one-half times that of white males with lower cholesterol levels. Notice that no adjustment for age has been made. Would this make a difference? Probably yes.

Analysis for Case-Control Studies

Unmatched Design

A Cases who were exposed

B Controls who were exposed

C Cases who were not exposed

D Controls who were not exposed

Although the summary tables for cohort and case-control studies are similar, it is important to remember that the underlying approaches to sampling differ, and the analysis must account for these differences. In a cohort study sampling is based upon exposure status, and the investigator thus determines the total numbers of exposed (A + B) and unexposed (C + D) persons that are included in the study. Risk of disease development then can be estimated separately for exposed and unexposed groups, and these two risks can be compared in a risk ratio (RR).

A case-control study, on the other hand, begins with sampling of persons with and without the disease of interest (A + C and B + D, respectively). With this approach, the proportion of persons in the study who have the disease is no longer determined by the disease risk in the source population but rather by the choice of the investigator. That is, a disease that occurs infrequently in the source population can be over-sampled, so that affected individuals constitute a large proportion of the study sample. This ability to over-sample affected individuals is why case-control studies are statistically efficient for the study of rare diseases.

Once the investigator determines the ratio of persons with and without the disease of interest in a case-control study, risk of disease no longer can be estimated. As shown in the following section, however, an indirect estimate of the incidence rate ratio can still be obtained in a case-control study.
Table 9. Summary of data collected in an unmatched case-control study.

             Cases     Controls   Total
Exposed      A         B          A + B
Unexposed    C         D          C + D
Total        A + C     B + D      A + B + C + D



With the notation introduced in Table 9, the probability that a case was exposed previously is estimated by:

$\frac{A}{A + C}$

The odds of exposure for cases represent the probability that a case was exposed divided by the probability that a case was not exposed. The odds then are estimated by:

$\frac{A / (A + C)}{C / (A + C)} = \frac{A}{C}$

Similarly, the odds of exposure among controls are estimated by:

$\frac{B / (B + D)}{D / (B + D)} = \frac{B}{D}$

The odds of exposure for cases divided by the odds of exposure for controls are expressed as the odds ratio (OR). Substituting from the preceding equations, the OR is estimated by:

$OR = \frac{A / C}{B / D} = \frac{AD}{BC}$

The OR is sometimes termed the exposure odds ratio, or the cross-product ratio of Table 9, because it results from dividing the product of entries on one diagonal of this table by the product of entries on the cross diagonal.
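As a worked illustration, the sketch below computes the odds ratio for the case-control oral contraceptive and thrombophlebitis table given earlier (a = 90, b = 45, c = 10, d = 55), once as a ratio of odds and once as the cross-product. The function name is made up for this example.

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = a / c        # A / C
    odds_controls = b / d     # B / D
    return odds_cases / odds_controls

or_from_odds = odds_ratio(a=90, b=45, c=10, d=55)
or_cross_product = (90 * 55) / (45 * 10)   # AD / BC

print(round(or_from_odds, 1), round(or_cross_product, 1))
```

Both routes give an odds ratio of 11, as they must, since they are the same quantity written two ways.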
Table 10. Summary of data collected from a hypothetical unmatched case-control
study of Reye's Syndrome and Aspirin Use.
Aspirin Use
No Aspirin Use


In other words, the odds of aspirin use for patients with Reye's Syndrome were almost ten times greater than the odds of aspirin use among controls. This will often be reported as: "To the extent that the OR provides a valid estimate of the relative risk, one could conclude from this investigation that use of aspirin for a preceding viral illness increased the likelihood of developing Reye's Syndrome tenfold."

Editorial Note: The quoted statement above must be interpreted carefully. It implies that a retrospective study can establish a "cause-effect" relationship. Statements like this try to ease this leap of faith by using the phrase "To the extent...", but it is a push nonetheless. In point of fact these particular data were taken at face value, since clinicians are no longer prescribing Aspirin for fever and headaches in children. Many subsequent studies have tried to verify the coincident occurrence of Aspirin use, viral illness and serious disease outcomes. Thus, a case control study provides an efficient means of INITIALLY looking at a serious disease, but does not establish a definitive cause and effect relationship.


Table 11. Numbers of cases and controls and relative risk according to a history of use of dietetic beverages and sugar substitutes by sex.

                      Men                                          Women
Exposure              Cases   Controls   Relative Risk   CI*       Cases   Controls   Relative Risk   CI*
DIETETIC BEVERAGES    144     155        0.8             0.6-1.1   69      46         1.6             0.9-2.7
SUGAR SUBSTITUTES     101     113        0.8             0.5-1.1   54      39         1.5             0.9-2.6
NO EXPOSURE           224     193        1                         74      80         1


* CI denotes 95 percent confidence interval.

Nonexposed subjects reported never using dietetic beverages or sugar substitutes and no current use of artificially sweetened foods
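A quick check on how to read Table 11: if the first two numbers in each block are taken to be cases and controls (the column order used in Table 12) and the NO EXPOSURE row is the reference group, then a crude cross-product odds ratio comes out close to the relative risks printed in the table. This is only a sketch under those assumptions. The published figures may have been adjusted for other factors, and the function name is made up for this example.

```python
def crude_odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Cross-product odds ratio of an exposure row against the reference row."""
    return (cases_exposed * controls_ref) / (controls_exposed * cases_ref)

men_ref = (224, 193)     # NO EXPOSURE row, men: cases, controls
women_ref = (74, 80)     # NO EXPOSURE row, women: cases, controls

men_beverages = crude_odds_ratio(144, 155, *men_ref)
women_beverages = crude_odds_ratio(69, 46, *women_ref)
men_substitutes = crude_odds_ratio(101, 113, *men_ref)
women_substitutes = crude_odds_ratio(54, 39, *women_ref)

print(round(men_beverages, 1), round(women_beverages, 1))
print(round(men_substitutes, 1), round(women_substitutes, 1))
```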

Table 12. Numbers of cases and controls and relative risk according to current frequency of use of dietetic beverages, sugar substitutes and artificially sweetened foods, by sex.

Exposure Men Women
Cases Controls Relative Risk Cases Controls Relative Risk
Dietetic Beverages
22 12 1.9 6 9 0.5
18 23 0.9 11 9 1.6
64 77 0.7 33 13 2.5
Sugar Substitutes

Powder packets















or equivalent/day
16 28 0.5 15 11 1.2
8 7 1.3 4 5 0.8
13 10 1.5
Dietetic Foods
31 33 0.9 20 16 1.4
13 18 0.6 9 16 0.5
12 13 0.8
NO Exposure 224 193 1 74 80 1


Associations (Continued)

I. Correlation or Association

A. Positive

B. Negative

II. Causation

A. Criteria

1. Strong Design

2. Evidence from Human Experiments


4. Consistency


6. Dose-Response

7. Epidemiologic Sense

8. Biologic Sense

9. Analogous to previously shown studies of causal association

Briefly: Statistical Association

Temporal Association

Alternative explanations ruled out



III. Applications

A. Below are statements which suggest association -- are they valid?

1. If you find that 60% of students who develop infectious mononucleosis are habitual smokers, this shows the presence of an association between the disease and smoking. (T F)

2. If you find that 5% of students who smoke develop infectious mononucleosis during a one-year follow-up period, this shows the presence of an association between the disease and smoking. (T F)

B. A suggested mechanism for "Association".

1. 2 x 2 table
              Positive   Negative
Exposed       A          B
Unexposed     C          D



3. If 60% of a large sample of male students and 30% of a large sample of female students smoke, there is an association between gender and smoking. (T F)

4. BE CAREFUL: the terms used to express a relationship -- like incidence or proportion (percent) -- imply a knowledge of a denominator or a total. Thus, you can deduce the missing cells in these instances.

Sample Size

I. Consider the following statement:

The violent crime rate in City A in 1988 was 30%. The violent crime rate in 1989 decreased to 15%. This shows a decrease of 15%.

Is the conclusion true or false?

II. Types of changes reported in the medical literature.

A. Absolute Change

B. Percentage change (proportional change)
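The two kinds of change can be told apart with the City A figures from the statement above. This is a minimal sketch; the variable names are illustrative.

```python
rate_1988 = 30.0   # violent crime rate, percent
rate_1989 = 15.0   # violent crime rate, percent

# Absolute change: simple subtraction, expressed in percentage points.
absolute_change = rate_1988 - rate_1989

# Percentage (proportional) change: the absolute change relative to the starting rate.
percentage_change = 100 * absolute_change / rate_1988

print(absolute_change, percentage_change)
```

The same drop is 15 percentage points in absolute terms but 50 percent in proportional terms, so a reported "decrease of 15%" is only meaningful once you know which type of change is meant.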

III. Power

The ability to find a significant difference if it really exists.

You apply it to instances when the authors show NO statistical significance.

The question reduces to: Is my sample size large enough to find a clinically important difference?

Using a Nomogram for CONTINUOUS Variables

Perform these Steps:

1. Decide what size difference between the two groups is clinically important.

2. Locate the difference on the horizontal axis.

3. Extend a vertical line to the diagonal line representing the standard deviation.

4. Extend a horizontal line to the vertical axis and read the required sample size.
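The nomogram itself is not reproduced in these notes. As a rough cross-check on a nomogram reading, a commonly used normal-approximation formula for comparing two means is n per group = 2 (z_alpha/2 + z_beta)² × SD² / difference². The sketch below assumes a two-sided α of 0.05, 80% power and equal group sizes. These settings, the function name and the example numbers (a difference of 10 with a standard deviation of 20) are hypothetical and are not taken from the course nomogram, which may be drawn for a different power and so give a different answer.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2)

# Hypothetical example: detect a difference of 10 when the standard deviation is 20.
n = n_per_group(difference=10, sd=20)
print(n)
```

Halving the difference you want to detect roughly quadruples the required sample size, which is why step 1, deciding what difference is clinically important, matters so much.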


Using a Nomogram for DICHOTOMOUS Variables

Perform these steps:

1. Identify one of the two groups as the control group.

2. Decide what difference between the two groups would be considered clinically important. Express this difference as a % change in the response rate.

3. Locate the % change on the horizontal axis.

4. Extend a vertical line to intersect with the diagonal line representing the response rate.

5. Extend a horizontal line from the intersection point to the vertical axis and read the required sample size.


A physician wants to assess the effects of calcium supplements on blood pressure. The physician wants to be able to detect a 5 mm difference between the treatment and the control group when the standard deviation is 15. Use a nomogram to determine the required sample size.

Answer: The required sample size of each group is __________.

What is the likely outcome of the above experiment if the research is conducted with 100 patients in each group?

Answer: ___________________.


A researcher is trying to assess the effectiveness of a new therapy. The standard therapy has a cure rate of 30%. The researcher is interested in whether the new therapy will cure 45%. The research is done with 90 patients in the treatment and control groups. The difference in cure rates is found to be nonsignificant. Use a nomogram to determine if the sample size was adequate.

The sample size was _____________. Approximately __________ should have been in each group.

Suppose the above researcher was interested in detecting a 100% increase in cure rate. Under these conditions was the sample size adequate if 60 patients were present in each group?

Answer: ____________________________________________________



We conducted a double-blind, randomized, placebo-controlled trial in 40 patients to evaluate the need for antibiotics in acute exacerbations of chronic bronchitis. All patients were sufficiently ill to require hospitalization although none needed ventilatory support; the presence of pneumonia was excluded. Treatment consisted of bronchodilators, corticosteroids, and either tetracycline, 500 mg, or placebo by mouth every 6 hours for 1 week. Arterial blood gases, spirometric tests, bacteriologic evaluation of sputum, and patient and physician evaluation of the severity of illness were assessed at the beginning and end of the study. All patients improved both symptomatically and by objective measures of lung function. At the end of the study period there were no differences between those patients receiving tetracycline and those receiving placebo. We conclude that antibiotic therapy is not needed in moderately ill patients with exacerbations of chronic bronchitis.

From the NICOTRA, et al. article
PaO2 Day 7 Day 7
Mean 74.1 68.1
St. Dev. 13.6 17.5


Sample Size = 20 per group
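One way to gauge whether 20 patients per group could have detected a difference of this size is an approximate power calculation. The sketch below treats the observed difference in means as the difference one wishes to detect, uses a normal approximation (with only 20 per group a t-based calculation would be somewhat more conservative), and assumes a two-sided alpha of 0.05. None of these choices come from the article. The variable names are hypothetical, and the table does not say which column is which treatment group.

```python
import math
from statistics import NormalDist

# Day 7 PaO2 values from the table above (group labels are not given there).
mean_a, mean_b = 74.1, 68.1
sd_a, sd_b = 13.6, 17.5
n_per_group = 20
alpha = 0.05  # assumed two-sided significance level

# Standard error of the difference between the two group means.
se_difference = math.sqrt(sd_a ** 2 / n_per_group + sd_b ** 2 / n_per_group)

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
approx_power = NormalDist().cdf(abs(mean_a - mean_b) / se_difference - z_alpha)

print(round(se_difference, 2), round(approx_power, 2))
```

A low value here means that a "no difference" result from a trial of this size is weak evidence that no difference exists.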

Decision Tree for the Null Hypothesis and Power


Checklist to be used by Authors when preparing or by Readers when analyzing a report of a randomized controlled trial (RCT)1
Yes No Unable to determine
1. State the unit of assignment.
2. State the method used to generate the intervention assignment schedule.
3. Describe the method used to conceal the intervention assignment schedule from participants and clinicians until recruitment was complete and irrevocable.
4. Describe the method(s) used to separate the generator and executor of the assignment.
5. Describe an auditable process of executing the assignment method.
6. Identify and compare the distributions of important prognostic characteristics and demographics at baseline.
7. State the method of masking.
8. State how frequently care providers were aware of the intervention allocation, by intervention group.
9. State how frequently participants were aware of the intervention allocations, by intervention group.
10. State whether (and how) outcome assessors were aware of the intervention allocation, by intervention group.
11. State whether the investigator was unaware of trends in the study at the time of participant assignment.
12. State whether masking was successfully achieved for the trial.
13. State whether the data analyst was aware of the intervention allocation.*
14. State whether individual participant data were entered into the trial database without awareness of intervention allocation.
15. State whether the data analyst was masked to intervention allocation.
16. Describe fully the numbers and flow of participants, by intervention group, throughout the trial.
17. State clearly the average duration of the trial, by intervention group, and the start and closure dates for the trial.
18. Report the reason for dropout clearly, by intervention group.
19. Describe the actual timing of measurements, by intervention group.
20. State the predefined primary outcome(s) and analyses clearly.
21. Describe clearly whether the primary analysis has used the intention-to-treat principle.
22. State the intended sample size and its justification.
23. State and explain why the trial is being reported now.
24. Describe and/or compare trial dropouts and completers.
25. State or reference the reliability, validity, and standardization of the primary outcome.
26. Define what constitutes adverse events and how they were monitored by intervention group.
27. State the appropriate analytical techniques applied to the primary outcome measure(s).
28. Present appropriate measures of variability (e.g., confidence intervals for primary outcome measures).
29. Present sufficient simple (unadjusted) summary data on primary outcome measures and important side effects so that the reader can reproduce the results.
30. State the actual probability values and the nature of the significance test.
31. Present appropriate interpretations (e.g., NS does not necessarily indicate no effect; P<.05 does not necessarily indicate proof).
32. Present the appropriate emphasis in displaying and interpreting the statistical analysis, in particular controlling for unplanned comparisons.
*If the data analyst is not masked as to the interventions, new treatments may be grossly favored over standard treatments.

This information may sometimes reveal duplicate publication rather than two separate trials by the same author(s).

Many trials are longitudinal and require several follow-up assessments. These assessments may be subjective based on the responses of questionnaires or scales. There is wide variation in how scales and questionnaires are constructed which may influence the assessment, reliability, validity, and responsiveness of the treatment outcome of interest. Providing information or references about the development of these outcome measures will enable readers to judge how confident they should be about the results.


1. Standards of Reporting Trials Group. A proposal for structured reporting of randomized controlled trials. JAMA. 1994;272:1926-1931.

Worksheet for Paper Review

1. Title:

2. Source:

3. Objective (Purpose):

4. Design:

5. Setting:

6. Patients:

7. Intervention:

8. Main Outcome Measures:

9. Main Results:

10. Conclusion:


-Comparison group





Loss to F/U

Data Source


Adatto K. Behavioral factors in urinary tract infection. JAMA. 1979;241:2525-26 R1 - R2

Hunter RS Antecedents of child abuse and neglect in premature infants: A prospective study in a newborn intensive care unit. Pediatrics. 1978;61:629-35 R3 - R9

Nicotra MB Antibiotic therapy of acute exacerbations of chronic bronchitis: A controlled study using tetracycline. Annals of Internal Medicine. 1982;97:18-21 R10 - R13

Ramond M A randomized trial of prednisolone in patients with severe alcoholic hepatitis. NEJM. 1992;326:507-12. R14 - R19a

Schrock CG Clarithromycin vs penicillin in the treatment of streptococcal pharyngitis. J FAM PRACT. 1992;35:622-626. R20 - R25

Spitzer WO The use of β-agonists and the risk of death and near death from asthma. NEJM. 1992;326:501-6. R26 - R32

Young MJ Sample size nomograms for interpreting negative clinical studies. Annals of Internal Medicine. 1983;99:248-51. R33 - R36

Sauve JS Does this patient have a clinically important carotid bruit? JAMA. 1993;270:2843-45. R37 - R39

Williams JW Randomized controlled trial of 3 vs. 10 days of trimethoprim/sulfamethoxazole for acute maxillary sinusitis. JAMA. 1995;273:1015-1021. R40 - R48

Supplementary Explanations

Levine MA Readers' guide for causation: Was a comparison group for those at risk clearly identified? ACP Journal Club 1992 S1 - S2

Altman DG Confidence intervals in research evaluation. ACP Journal Club 1992 S3 - S4

Laupacis A How should the results of clinical trials be presented to clinicians? ACP Journal Club 1992 S5 - S7

Cook D On the clinically important difference. ACP Journal Club 1992 S8 - S9

Oxman AD Users' guide to the medical literature. I. How to get started. JAMA 1993;270:2093-5;2096. S10 - S13

Guyatt GH Users' guides to the medical literature. II. How to use an article about therapy or prevention - Are the results of the study valid? JAMA 1993;270:2598-2601. S14 - S17

Jaeschke R Users' guide to the medical literature. III. How to use an article about a diagnostic test - Are the results of the study valid? JAMA 1994;271:389-91. S18 - S20

Homework Set 1

The following titles are from articles published in the medical literature. Determine if the terms INCIDENCE and PREVALENCE are used correctly.

1. Incidence of Blood Group O In An Earlier Series of Myocardial Infarction Patients.

2. The Prevalence of Cardiovascular Disease In Different Ethnic and Socioeconomic Groups in Beit Shemesh, Israel.

3. Incidence of Rheumatic Fever-- Summary of an Eight Year Study of Incoming Freshmen at the University of North Dakota.

4. Rheumatic Heart Disease Epidemiology. Part III. The San Luis Valley Prevalence Study.

5. Incidence of Primary Carcinoma of the Liver in the West of Scotland between 1965 and 1975.

6. Prevalence of Undiagnosed Cancer of the Large Bowel Found at Autopsy in Different Races.

7. The Rising Incidence of Cancer of the Pancreas--Further Epidemiologic Studies.

8. Incidence of Cancer in Men on a Diet High in Polyunsaturated Fat.

9. Age and Sex Variations in the Prevalence and Onset of Diabetes Mellitus.

10. The Incidence of Chronic Peptic Ulcer Found at Necropsy.


For questions 11 and 12, identify the independent and dependent variables.

11. The compliance rate with prescribed medical regimen will be greater among cancer patients who have high self-esteem and low anxiety compared with those with low self-esteem and high anxiety levels.

12. The effect of restraint systems on the incidence of injury to children in automobile accidents.



For questions 13 through 16, identify the level of measurement.

13. Age

14. Blood pressure

15. Ethnicity

16. Number of cups of coffee per day

17. You are working in an Emergency Room on your first clerkship and a patient comes in complaining of chest pain. You give the patient nitroglycerin, wait a few minutes and then ask the patient, "If the pain you came in with was a 10, what number would you assign to your discomfort now?" The patient responds 5. How are you going to interpret this answer?

a) The pain is less. (An ordinal interpretation)

b) A 50% decrease in pain. (An interval interpretation).

18. Characteristics of a normal distribution include the following:

a. The total area under the curve represents 100% of all values.

b. The mean and median and mode coincide.

c. Approximately 5% of the values lie beyond 2 standard deviations from the median.

d. The curve is symmetrical.

e. All the above.

19. In a study of 250 students taken from the general student population of a southern university, the mean systolic blood pressure was 116mm Hg, with a standard deviation of 4mm Hg. From this information approximately 99% of the general student population will have systolic blood pressure (mm Hg) in the range of:

a. 110-130mm Hg

b. 104-128mm Hg

c. 112-120mm Hg

d. 116-124mm Hg

e. 118-122mm Hg

20. In a study involving 150 health providers, the mean serum cholesterol level was found to be 176 mg/dL with a sample variance of 25 mg/dL. From this information approximately 1/3 of these providers, will NOT have a cholesterol level in the range of:

a. 161-191 mg/dL

b. 166-186 mg/dL

c. 171-181 mg/dL

d. 172-180 mg/dL

e. 175-177 mg/dL

21. From the following information compute the mean, median and mode.


22. T-cell counts from a series of newly diagnosed HIV positive males were collected. The mean was 176; the median was 200 and the mode was 224.

From this information the distribution of these data can be described as: (May be more than one correct answer).

a. Normally distributed.

b. Positively skewed.

c. Negatively skewed.

d. Skewed right.

e. Skewed left.

Homework Set 2

1. All of the following statements are true of the NORMAL distribution except:

a. The mean = median = mode

b. Approximately 50 percent of the observations are greater than the mode.

c. Approximately 68 percent of observations fall within 1 standard deviation of the mean

d. The number of observations between 0 and 1 standard deviations from the mean is the same as the number of observations between 1 and 2 standard deviations from the mean.

e. The shape of the curve does not depend on the value of the mean.

2. Randomization is a procedure used for assignment or allocation of subjects to treatment and control groups in experimental studies. Randomization ensures

a. that assignment occurs by chance.

b. that treatment and control groups are alike in all respects except treatment.

c. that bias in observations is eliminated.

d. that placebo effects are eliminated.

e. none of the above.

3. In comparing the difference between two means, the value of p is found to be .20. The correct interpretation of this result is:

a. the null hypothesis is rejected.

b. the difference is statistically significant.

c. the difference is compatible with the null hypothesis.

d. the sample size is small.

e. sampling variation is an unlikely explanation of the difference.

4. Correct statements concerning statistical inference include which of the following? (Choose all that are correct)

a. If the p value is very low, the difference between the groups must be very large.

b. If the sample size is large enough, it is easy to achieve statistical significance at the .05 level.

c. The confidence interval is dependent on the sample size, the variance of the sample and the degree of confidence.

d. The 95% confidence interval is longer than the 99% confidence interval for the same data.

For each case history that follows, select the study design that it most appropriately illustrates.

(A) Case series report

(B) Case-control study

(C) Clinical trial (Randomized Clinical Trial)

(D) Cohort study (Prospective study)

(E) Case report

5. A total of 300 newly diagnosed patients with laryngeal cancer are randomly allocated to treatment with either surgical excision alone or surgical excision with radiation treatment.

6. A 39 year old man who presents with a mild sore throat, fever, malaise, and headache is treated with penicillin for presumed streptococcal infection. He returns after a week with hypotension, fever, rash, and abdominal pain. He responds favorably to chloramphenicol, after a diagnosis of Rocky Mountain spotted fever is made.

7. A total of 3500 patients with thyroid cancer are identified and surveyed by patient interviews regarding past exposure to radiation.

8. A total of 10,000 Vietnam veterans, half of whom are known by combat records to have been in areas where Agent Orange was used and half of whom are known to have been in areas where no Agent Orange was used, are asked to give a history of cancer since discharge.

9. Patients admitted for carcinoma of the stomach are age and sex matched with fellow patients without a diagnosis of cancer and surveyed as to smoking history to assess the possible association of smoking and gastric cancer.

Homework Set 3

1. Identify the independent variable(s) and the dependent variable(s) in the Adatto study.

2. Assuming you are a family physician and are seeing a 51 year old woman who is presenting with a urinary tract infection, are you justified in using the results of the Adatto study in counseling your patient? What assumptions are you making if you do; conversely, what assumptions are you making if you don't?

3. Given the table below, compute the ODDS RATIO and interpret the finding.



                       Risk Factor
Outcome            Present    Absent
Present                6          4
Absent               112        242


4. Apply the questions covered in lecture regarding the problems that are inherent in a case-control study to the Adatto study.

5. A patient asks, "Doc, what are my chances of getting through this operation?" What would you need to know in order to answer the patient factually?

6. The following questions refer to the paper entitled "Antecedents of child abuse and neglect in premature infants: A prospective study in a newborn intensive care unit" by Hunter, et al.

6a. What type of study do the authors cite as a foundation for their study ?

6b. Justify your answer to Question 6a.

6c. Identify any sources of Study Population bias and sampling bias.

6d. Is there loss to follow-up and if so, do the authors address the problem?

6e. In the 24 item inventory cited on page 630, what level of measurement is used?

6f. What is the dependent variable in the study?

6g. What evidence did the authors collect to document the occurrence of the dependent variable?

6h. Given the evidence you cite in the preceding question, does this suggest any SURVEILLANCE BIAS?

6i. In the abstract, the authors use the term incidence. Is this a correct or incorrect use of the term? Why?

The following questions refer to the table below:

Acne Present Acne Absent
Eat Breakfast 20 50
Did Not Eat Breakfast 60 110


7a. Put in the correct margins for a retrospective study and calculate an odds ratio.

7b. Put in the correct margins for a prospective study and calculate a risk ratio.

7c. Interpret each calculation.

8. Read the descriptions of the studies below. Determine the scientific hypothesis, the independent and dependent variables, and indicate whether a one-tailed or a two tailed statistical test is suggested by the description.

a) Alcohol is assumed to be the causative agent in many accidents. It is further assumed that alcoholics have an increased risk of dying from accidents involving severe burns. A study was undertaken to evaluate the mortality of alcoholics and nonalcoholics admitted to a burn unit of a major hospital. Nine of 28 alcoholics died; 8 of 75 nonalcoholic patients died.

b) The relationship between parental smoking and number of colds per year was examined in nonsmoking teenagers. Nonsmoking teens in households where both parents smoke had 3 times the number of colds in 1 year compared to nonsmoking teens in households where neither parent smokes.

Homework Set 4

1. Given the 2 x 2 table below identify the following terms by cell identification.

                  Disease
                +       -
Test    +       a       b       a + b
        -       c       d       c + d
              a + c   b + d   a + b + c + d


i. Prevalence

ii. Sensitivity

iii. Specificity

iv. False positives

v. False Negatives

vi. Negative Predictive Value

vii. Positive Predictive Value

2. A test is said to have a Positive Predictive value of 77%. What does this mean?

3. A test has a negative predictive value of 90%. Your patient's test result is negative. What would you probably conclude?

4. For the table in question 1: Assume the disease you are looking for is a Myocardial Infarction (heart attack), or MI for short. The test you decide to use is Creatine Kinase (CK) or, as it is sometimes called, CPK (Creatine Phosphokinase). This is thought to rise early in an infarction. Recall your biochemistry and remember where CPK comes from and under what conditions it is released. Assume your study will take place in a Coronary Care Unit. This unit receives all patients suspected of having an MI. For purposes of this example assume you look at 360 patients. Assume you have a prevalence rate of 64%. The CPK has a sensitivity of 93% and a specificity of 88%. I'll admit that these figures are not that easy to work with, but they are actually from a study and not made up for ease in computation.

a) What are the respective PPV and NPV ?

b) Would you rely on this test to identify if your patient

i) has a heart attack?

ii) doesn't have a heart attack?

5. Assume you agree with my answer to question 4. You then get energetic and suggest that this test be used as a screening tool for general admissions to the hospital rather than just the CCU. There are 2300 admissions to the hospital and the community prevalence rate is 10%. What are the calculated PPV and NPV?

6. From the diagram on page 36 of the notes, answer the following questions with the correct letters:

A. Cutoff point set too low.

B . Cutoff point of greater sensitivity.

C. Cutoff point of greater specificity.

D. Cutoff point of greater false positive rate.

E. True positives for cutoff point X.

F. True negatives for cutoff point X.
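The arithmetic behind questions 4 and 5 of this homework set (filling the 2 x 2 table from a sample size, prevalence, sensitivity and specificity, then reading off the predictive values) can be sketched as follows. The function name is hypothetical, and the cell letters follow the table in question 1 with disease status in the columns. Fractional patient counts are kept as they are rather than rounded, which is a simplification.

```python
def predictive_values(n, prevalence, sensitivity, specificity):
    """Return (PPV, NPV) after filling the 2 x 2 table of question 1."""
    diseased = n * prevalence          # a + c
    not_diseased = n - diseased        # b + d
    a = diseased * sensitivity         # true positives
    c = diseased - a                   # false negatives
    d = not_diseased * specificity     # true negatives
    b = not_diseased - d               # false positives
    ppv = a / (a + b)                  # of all positive tests, how many are diseased
    npv = d / (c + d)                  # of all negative tests, how many are disease-free
    return ppv, npv


# Question 4: Coronary Care Unit, 360 patients, prevalence 64%.
ccu_ppv, ccu_npv = predictive_values(360, 0.64, 0.93, 0.88)

# Question 5: general admissions, 2300 patients, prevalence 10%.
hospital_ppv, hospital_npv = predictive_values(2300, 0.10, 0.93, 0.88)

print(f"CCU:      PPV = {ccu_ppv:.1%}, NPV = {ccu_npv:.1%}")
print(f"Hospital: PPV = {hospital_ppv:.1%}, NPV = {hospital_npv:.1%}")
```

Running both cases side by side shows how the same test, with unchanged sensitivity and specificity, yields different predictive values when the prevalence changes.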

Suggested Answers to Homework Set 1

1. Incorrect. The number of patients with blood group O is PREVALENCE.

2. This could be correct. If this was a survey of the patients in this geographic area and the purpose was to establish an estimate of existing disease then PREVALENCE is correct.

3. Incorrect. The title indicates that the authors are reporting on how many total cases of rheumatic fever were accumulated in the group of freshmen (better they should have said first year students!) over the eight years of the study.

4. This could be correct, assuming the intent of the study is to see how many people in the San Luis Valley have rheumatic fever.

5,6,7 correct.

8. This could be correct. The title indicates that the men are placed on a diet and then followed to see how many develop cancer. That is to say, how many NEW cases of cancer are diagnosed.

9. The term prevalence could be used correctly here-- giving one a baseline against which to compare. The term onset could be used as a synonym for incidence since the term connotes newly developed disease.

10. Incorrect. You don't develop a disease at necropsy! These would clearly be existing cases of chronic peptic ulcer.

11. The Independent variables are self-esteem and anxiety. The Dependent variable is compliance. The point I want you to think about is what data would you accept as evidence of compliance (e.g. appointments kept, number of pills taken, diary, someone to verify that the regimen was followed)? Similarly for the independent variables-- a standardized psychological scale for anxiety and self esteem, perspiration when talking to you about disease?

12. This was the title of an article. Presumably the independent variable is the type of restraint system used on the child. Here is a problem-- how old is the child? Do you use a car seat, a lap belt, a lap and shoulder combination, or all of these and maybe more? Those of you with children can probably come up with some new ones. Do you use the same restraint with all children, or only one type of device up to a certain age? The dependent variable is injury to children in automobile accidents. Here again, the precision of the term looms large: a fender bender, a high speed crash, etc.

13. Age -- interval. The difference between ages is theoretically the same anywhere on the scale (i.e. the difference between 45 and 42 is three years, and the difference between 25 and 22 is the same distance, three years).

14. Blood pressure -- interval. The clinical significance of the distance between measurements can be remarkable but the interval is the same theoretical distance. This is why clinicians use the terms low, normal, high and OH MY GOD! to help describe the pressure of the patient. If you have a patient with a blood pressure of 220/150 with headache, visual disturbances etc. you are in an emergency situation. Dropping the pressure to 210/150 is ten points but clinically you are still in trouble. A patient who is 130/80 and loses weight due to your persuasive style and Doctor-Patient Relationship training and now measures 120/75, has also experienced a drop of 10 points systolic but wasn't in trouble at either time. THE TEACHING POINT TO REMEMBER IS THAT IT IS THE CLINICAL INTERPRETATION OF THE DATA THAT IS IMPORTANT.

15. Ethnicity is nominal.

16. Number of cups of coffee/day is interval.

17. There is no real right answer here since either could be used. I would suggest that an ORDINAL interpretation be used for the following reasons:

a) The purpose of the question. It is usually the case that you want to assess the direction of the pain and if it is improving. The ordinal interpretation does this.

b) An interval response assumes a lot about the patient. If the patient is experiencing a first episode, an answer requiring a high level of precision usually cannot be duplicated the next day (a statement like "hey doc, great stuff, my pain is only 2 and a half today" is very unlikely).

18. The answer is E (all of the above).

19. This question turns on the properties of the Normal Distribution. The summary below gives the percentages based on the number of standard deviations from the mean. (NOTE: Because the distribution is normal one could also say standard deviations from the median or the mode, since the mean, median and mode coincide.)

Mean ± 1 s.d. = 68%    116 ± 4 = 112-120.

Mean ± 2 s.d. = 95%    116 ± 8 = 108-124.

Mean ± 3 s.d. = 99%    116 ± 12 = 104-128.

Thus, the answer is B.
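The three ranges used in this answer can be reproduced with a few lines of arithmetic. This is only a sketch. The variable names are hypothetical and the percentages are the rounded rule-of-thumb values quoted in the answer.

```python
mean_bp = 116   # mean systolic blood pressure, mm Hg
sd_bp = 4       # standard deviation, mm Hg

# Rounded coverage for 1, 2 and 3 standard deviations around the mean.
coverage = {1: "about 68%", 2: "about 95%", 3: "about 99%"}

ranges = {k: (mean_bp - k * sd_bp, mean_bp + k * sd_bp) for k in coverage}

for k, label in coverage.items():
    low, high = ranges[k]
    print(f"mean ± {k} s.d. ({label}): {low}-{high} mm Hg")
```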

20. Be careful here. The information gives the mean and the VARIANCE. The normal curve properties are based on the standard deviation. The standard deviation is the SQUARE ROOT of the variance. Thus the standard deviation is √25 = 5 mg/dL.

Using some of the explanation from above, approximately 68% (about 2/3) of the values lie within 1 standard deviation of the mean: approximately 34% between the mean and 1 standard deviation below it, and approximately 34% between the mean and 1 standard deviation above it. The remaining 32% (about 1/3) lie outside this range. We don't know in which direction an individual value will fall, so we have to include both directions.

Below the mean, the limit is:

mean - 1 s.d. = 176 - 5 = 171.

Above the mean, the limit is:

mean + 1 s.d. = 176 + 5 = 181.

Thus approximately 1/3 of the providers will NOT have a cholesterol level in the range 171 - 181.

The answer would be C.
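As a check on the reasoning for question 20, the standard deviation and the fraction of values falling outside mean ± 1 s.d. can be computed directly. The sketch assumes the cholesterol values are normally distributed, as the question implies, and uses hypothetical variable names.

```python
import math
from statistics import NormalDist

mean_chol = 176     # mg/dL
variance = 25       # the question gives the VARIANCE, not the standard deviation

sd_chol = math.sqrt(variance)                        # standard deviation
low, high = mean_chol - sd_chol, mean_chol + sd_chol

# Fraction of a normal distribution lying outside mean ± 1 s.d.
distribution = NormalDist(mu=mean_chol, sigma=sd_chol)
fraction_inside = distribution.cdf(high) - distribution.cdf(low)
fraction_outside = 1 - fraction_inside

print(sd_chol, (low, high), round(fraction_outside, 2))
```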

21. Mode is most frequent: There are 2 numbers that each have a frequency of 2 --- the number 1 and the number 5. Thus a bi-modal distribution (1,5)

The median is the middle point: 1,1,2,5,5,6.

Even number of observations (six of them) so add the middle two together and divide by two (2 + 5)/2 = 7/2 = 3.5.

The mean is the arithmetic average = Σx/N = 20/6 = 3.33.
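The same three summary measures can be checked with Python's standard library. The data list below is taken from the ordered values shown in this answer (1, 1, 2, 5, 5, 6), and the variable names are hypothetical.

```python
from statistics import mean, median, multimode

values = [1, 1, 2, 5, 5, 6]

data_mean = mean(values)        # sum of values divided by N
data_median = median(values)    # average of the two middle values when N is even
data_modes = multimode(values)  # every value tied for the highest frequency

print(round(data_mean, 2), data_median, data_modes)
```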

22. Since mean ≠ median ≠ mode, the distribution is NOT symmetric (and so not normally distributed). The mean is most affected by extreme scores and is thus moved in the direction of the extreme score. The median is next most affected by the extreme score(s), then the mode. Here the mean (176) is pulled below the median (200) and the mode (224), so the extreme scores are on the low side. The distribution is thus NEGATIVELY SKEWED or SKEWED LEFT. The literature uses both terms, so I put them both in; the answer is C and E.

Suggested Answers to Homework Set 2

1. D

2. A

3. C

4. B,C

5. C

6. E

7. A

8. D

9. B

Suggested Answers to Homework Set 3

1. The authors are somewhat slippery here. They ostensibly want to look at behavioral factors and see if they are related to recurrent urinary tract infections (UTI). Then they want to institute a regimen that will reduce recurrent UTI's. They never address the first question directly. The authors first determine if the groups differ on certain behavioral factors just from a descriptive point of view. They limit the factors to sexual habits, voiding behavior and personal hygiene. Thus, for the first part of their work, they have used these 3 behavioral factors as INDEPENDENT variables. After finding a difference in voiding habits they then posit a "cause" and effect proposition that voiding habits lead to recurrent UTI's and design a program to encourage women in the study group to void regularly. Thus the independent variable in the second part of the study is the behavioral program and the dependent variable is recurrent UTI's. The authors then use the following logic: The behavioral program encouraged people to void more frequently than their usual habits and now they have fewer recurrent UTI's; therefore voiding frequently MAY be protective of UTI's.

2. The authors indicate that two groups of people were excluded from the study: individuals who had a history of UTI or any serious chronic illness, and women older than 40 (see page 2525). Your patient is thus excluded from the scope of the study by her age. Are you still tempted? If so, you are assuming that age and any chronic illness your patient might have has no bearing on UTI's. This may be true; I will not answer that for the time being since this may keep your interest piqued when you get to the Renal system next year. If you don't know the influence of age or chronic disease then you would be justified in NOT applying the results of this study to your patient, and you must search the literature to see if a study has been done that includes people like your patient (matched -- in a sense) or look to the basic sciences to fill the gaps of the study.

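3. Assuming the study to be a case-control study, you will be using the ODDS RATIO and the definition is best cited as the cross-product ratio of the table. With a, b, c and d as the cells read left to right, top to bottom: OR = (a x d) / (b x c) = (6 x 242) / (4 x 112) = 1452 / 448 = 3.24 (approximately).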

Since the odds ratio is larger than 1.0, one might be willing to conclude that the risk factor is associated with the outcome. Later on we will see that we need to look at more than the calculated figure to interpret the association, but for now make sure that you can calculate the figure and that you know that the number against which you want to compare it is 1.0.
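For readers who want to check the arithmetic for the table in question 3 of Homework Set 3, a minimal sketch follows. It uses the standard cross-product form of the odds ratio. The cell letters a, b, c and d are simply the table cells read left to right, top to bottom, which is an assumed labelling since the table's rows carry no heading.

```python
# Cells of the 2 x 2 table from question 3, read left to right, top to bottom.
a, b = 6, 4
c, d = 112, 242

# Cross-product (odds) ratio; a value of 1.0 would indicate no association.
odds_ratio = (a * d) / (b * c)

print(round(odds_ratio, 2))
```

Because the cross-product is unchanged when rows and columns are swapped, the result does not depend on which way the table is oriented.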

4. The following is a suggested response to applying the questions posited in lecture regarding a case-control study. You may have additional points but these remarks may help clarify the points covered in lecture.

1. Are the data dependable?

The data were collected from a University Health Service which (think for a moment about your undergraduate college) generally handles uncomplicated medical problems. These records should be fairly complete. The outcome in question (a UTI) can be defined in an unambiguous way -- note the criteria stipulated in the article. Thus, if anyone has a question about whether or not someone is a "case" they should be able to go to the medical record and verify that the criteria are met. This is the acid test in practice: send two people to independently review the same material and see if they reach the same conclusion. The outcome here is not subtle and the women should know if they have an infection because of the manifestations of the illness-- dysuria, urgency and frequency. I would be satisfied as to the dependability of the data.

2. Is recall bias operating?

YES. This is a danger in all retrospective studies. The major teaching point here is that one tends to focus on noxious stimuli or major outcomes. Thus the cases could perhaps remember their voiding histories more clearly than people who have no particular reason to recall earlier behavior or action. A WORD OF WARNING HERE, HOWEVER. There are certain things that force individuals to suppress recollection. One example has been the recollection of mothers who give birth to children who present with birth defects -- either mental or physical. Histories of child abusers are many times devoid of clues when the parent is asked. One can generate explanations for these absences of memory on a common sense basis, but I would like to stress that people can overestimate as well as underestimate their actions and thus some corroboration of information is necessary. This corroboration can come from medical records (previous treatment for a specific illness or symptoms of an illness), neighbors, family, etc. The authors have tried to address the problem of estimation of behavior by asking the patients in each group about other habits WHICH, BIOLOGICALLY SPEAKING, MAKE SENSE! In this instance, hygiene and sexual habits. The reasoning here would be that if one tended to overestimate the times they waited to relieve their bladder they would also tend to overestimate other relevant habits. The article reports these habits in some detail so one could be persuaded that the interviewers were pretty good about getting information. Since there was a similar response pattern between both groups in these other areas, one could assume that the recall bias was not a major problem in the study.

3. How alike are the cases and controls?

This study uses university women who use the health service of the university. This provides a common setting for all subjects. Since I claimed (in Q1) that the service usually handles routine instances of illness, we can probably assume that the groups are generally in pretty good health. Notice that the heavy duty problems have been eliminated prior to the start of the study-- no chronic illness was allowed in either group. Thus the health spectrum could be assumed to be comparable between groups. Since women had to have been in the health service in order to be chosen, health seeking behaviors could also be assumed to be equal. If a woman decided to go to her family osteopathic physician at home she would not have been in the study because she uses services beyond the university. So matching was done here but on a GROUP basis rather than an individual basis.

4. What kind of populations do the cases represent?

These are university women. They are generally better educated and more concerned about their health than the general public. The age range is also truncated (18 - 39). While this population may be representative of college populations, it may not necessarily be suitable for all populations. To answer this question, you must satisfy yourself that the factors that make this study group special have no bearing on the disease (or outcome) in question. The three big ones here are age (young), no chronic illness, and education which might encourage better compliance with treatment regimens. You will be able to answer these concerns when you have studied the genitourinary system but now I would like to make sure you are aware enough to raise the questions. The TEACHING POINT TO REMEMBER is that a special population MAY be a good population to use if their "specialness" does not interfere with the natural disease process. If there is doubt you would be better off using the results ONLY on similar populations and not generalizing to other groups in other settings.

5. Are other BIASES evident?

a. Detection bias: In brief, are people looking for this disease now more than they did in the past?

Uncomplicated UTI's are not a new phenomenon and are a commonly occurring illness, relatively speaking. Because of the long standing nature of the disease entity I would not suspect this bias to be present.

b. Late look bias: In brief, are you looking at the disease close to the exposure? Looking late would exclude individuals who would have died soon after the exposure or who would have recovered without seeking medical care. Thus you would be looking at people who are better off, from a health perspective, just because they are still alive although ill.

With this definition in mind, I would not suspect this to be a problem here. Usually people do not die from a UTI. The symptoms of dysuria, frequency and urgency occur soon after the bacteria colonize, thus attention is sought quickly. This should be contrasted with something like CORONARY ARTERY DISEASE, which may not manifest itself clinically until years after exposure.

c. Non-response bias: In brief, this occurs when people fail to reply to solicitations for information. The important question to ask here is: do the respondents differ from the non-respondents in any way which could influence the outcome or exposure history? An example of this is in obtaining information via questionnaires from study subjects. There have been numerous examples of differences in history of alcohol use, drug use and prescription compliance between people who respond and people who fail to respond to such inquiries. This was found out only after the non-respondents were literally tracked down and questioned on the spot, so to speak, about certain facts relative to exposure status.

Applied to this study, the investigators followed only the last 37 case patients but all 84 controls for a more detailed look at urinary retention. That is less than 50% of the case group. It is not clear why this cut took place. While it is unlikely that the unstudied cases all voided quickly after the urge to do so, it is conceivable that the difference between the case and control groups could be diminished to a more equal state. This is a problem here.

d. Volunteer bias: In brief, this is a bias that creeps in if volunteers display different patterns of behavior than non-volunteers. We know that volunteers, over the short haul, are compliant and excited to be part of a study and hence are different from the average patient. DO NOT BE TOO QUICK TO INVALIDATE ALL STUDIES THAT USE VOLUNTEERS, HOWEVER. The question is whether the characteristics that differentiate volunteers from other people are RELEVANT to the outcome or exposure status. If the nature of a study is to look at the pathogenesis of disease, then the fact that they are human rather than a volunteer is what is important. If compliance is important in the study, then you must question the generalizability of the study.

Applied to this study, both groups were volunteers, so volunteering would not account for any difference in the histories given by the respondents. The fact that a behavioral regimen was part of the study raises concern for generalizability. One could say that if volunteers have a better behavioral compliance record than non-volunteers, then this study should show the absolute best that the behavioral treatment can do. NOTICE that only 65% of the patients experienced no reinfection if they followed the regimen. If as a practicing physician you see average people, then you should conclude that you will see less than this number if you try to duplicate their program. Also, what happened to the 12 people who were "lost to follow-up"?

e. SELECTION: This is really a collection of many of the biases above. It focuses your attention on any of the characteristics of the cases or controls that impinge on the outcome or exposure of the study. I think the best example would be if the cases and controls were inherently so different that one could not help but find a difference. One example would be a study of the nutritional status of children in which an author took one group from the inner city of Brooklyn, New York and a second group from Bloomfield Hills, Michigan. This would show differences all right, but would make sense only to a legislative initiative, not a medical intervention.

Applied to this study, the selection bias is really only evident in that which is described above. I did not catch any other error.

f. Admission Bias: I put this in because it is called Berkson's Paradox or Berkson's Fallacy in the literature. The bias essentially is that certain diseases bring individuals into the hospital in greater numbers than others: heart attack as opposed to influenza, for example. If the heart attack is also related to the outcome of interest, say kidney disease, and there is a high probability that the two diseases occur together, then the chances of finding the kidney disease in the hospital are increased. They are increased just because it can get in two ways: by itself and riding on the coattails of a jointly occurring disease. Thus the relationship between exposure and outcome is distorted, since if the heart attacks serve as controls, they also have kidney disease.

Applied to this study, the status of a UTI is the only criterion for admission to the study since complicating factors are eliminated (p. 2525). Thus a fair admission rate into the study for both controls and cases is assured.
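To see how admission alone can distort an association, here is a minimal sketch. Every probability in it is invented for illustration; none comes from the UTI study. It assumes two diseases, A and B, that are completely unrelated in the population, and that a person is admitted if any one of the applicable reasons gets them in.

```python
# Hypothetical illustration of admission (Berkson's) bias.
# All probabilities below are invented for the sketch; none come from the study.

def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio for a 2x2 table of disease A by disease B."""
    return (both * neither) / (a_only * b_only)

p_a, p_b = 0.10, 0.10                         # population prevalence of A and B (independent)
adm_a, adm_b, adm_other = 0.50, 0.20, 0.05    # chance each reason leads to admission

# Fraction of the population in each cell (independence assumed).
pop = {
    "both": p_a * p_b,
    "a_only": p_a * (1 - p_b),
    "b_only": (1 - p_a) * p_b,
    "neither": (1 - p_a) * (1 - p_b),
}

# A person is admitted if ANY applicable reason admits them.
admit = {
    "both": 1 - (1 - adm_a) * (1 - adm_b) * (1 - adm_other),
    "a_only": 1 - (1 - adm_a) * (1 - adm_other),
    "b_only": 1 - (1 - adm_b) * (1 - adm_other),
    "neither": adm_other,
}

# Fraction of the population that is in each cell AND in the hospital.
hospital = {cell: pop[cell] * admit[cell] for cell in pop}

or_population = odds_ratio(**pop)
or_hospital = odds_ratio(**hospital)
print(f"Odds ratio in the population:       {or_population:.2f}")
print(f"Odds ratio among hospital patients: {or_hospital:.2f}")
```

Under these assumptions the population odds ratio is 1 (no relationship), while the odds ratio among hospital patients is not. The only thing that changed is who got through the door.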

5. You have two choices: explain it to him in terms of risk or in terms of odds. First, you have to assume that the patient's operation is going to be identical to others that the surgeon has done in the past. This of course rests on your skill at doing a good History and Physical examination. The information gleaned from the H&P will let you know how similar the patient is to those the surgeon has worked on previously. Then, from a statistical point of view, the problem takes on the following form:
Surgeries performed by surgeon on patients like your patient


Thus if you chose the RISK approach you would say your chances are 50/80 or about 63% (actually 62.5%). If you chose the ODDS approach you would say 50/30, or 5 to 3. The patient would in all likelihood understand the RISK approach a little better (unless s/he plays the horses). But what you can see is that we are talking high risk stuff here. Your chances are only a little better than 50%. Does the patient really need surgery?
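The arithmetic behind the two approaches can be checked with a short sketch. The function name is hypothetical; the counts (50 and 30, 80 in all) are the ones used in the answer above.

```python
from fractions import Fraction

def risk_and_odds(events, non_events):
    """Return (risk, odds) as exact fractions for event / non-event counts."""
    risk = Fraction(events, events + non_events)   # events out of everyone
    odds = Fraction(events, non_events)            # events against non-events
    return risk, odds

# Counts taken from the answer above: 50 of one outcome, 30 of the other.
risk, odds = risk_and_odds(50, 30)
print(f"Risk = {risk} = {float(risk):.1%}")               # 5/8 = 62.5%
print(f"Odds = {odds.numerator} to {odds.denominator}")   # 5 to 3

# The two carry the same information: risk = odds / (1 + odds).
assert risk == odds / (1 + odds)
```

Risk puts the whole group in the denominator; odds puts only the other outcome there. That is why the same 50 patients give 62.5% one way and 5 to 3 the other.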

6a,b. The authors cite a RETROSPECTIVE study as the basis for their current work. This makes sense and is not an unusual occurrence. Recall that while child abuse is an abhorrent event it is fortunately rare. The use of a retrospective technique gives at least a clue as to whether certain variables occur together. If they do then further study might be indicated. The retrospective study is quicker and less expensive than a prospective study and thus is a reasonable place to start.

6c. The study population is from the INTENSIVE CARE UNIT of a hospital. This hospital draws from a wide geographic county. The authors cite that the distance one has to travel often impedes the amount of interaction between the infant and family. Since this is a regional center, it suggests that there are not that many intensive care units for infants (neonatal intensive care units) or that these kids are really sick! Indeed, children were excluded from the study if they rapidly recovered or died. Thus the prolonged stay, and the effect it might have on the family, might make this a tough population from which to generalize. Children who rapidly recovered were not included; these kids may be a better group from which to make generalizations, since most infants in the general population DO NOT go to an ICU. The explanation of the 21 families who were in the unit but not long enough for contact to be made is a curiosity to me. How long do they have to be there? The authors then proceed to enroll the remaining families into the study. This would preclude any sampling bias. FROM AN ETHICAL point of view the authors should be commended for providing the full hospital services to all families. THE TEACHING POINT is that either the STUDY population may be a problem or the SAMPLING may be a problem, or both. Attention must be paid to each one. Here the Sampling is OK but the Population may be suspect.

6d. The authors report data on all 255 families. Thus there is no loss to follow-up that I can find.

6e. The scale is: 0 = absent

1 = present to some degree

2 = strongly present

This is an example of an ORDINAL SCALE.


Although I didn't ask, the independent variable is the INVENTORY SCORE of the family.

6g. The authors have, as they say in the game, operationally defined the incidence of maltreatment as reports made to the local department of social services. I want you to go further than this, however, in your analyses to understand WHAT was reported. The evidence consists of: serious physical abuse -- 2 cases; and neglect -- 8 cases. The 8 neglect cases included failure to treat chronic or acute medical problems and failure to comply with minimal well child care (i.e. immunizations, etc.). BUT THE MOST FREQUENT complaint was inadequate parental supervision.

6h. There is SURVEILLANCE BIAS. The study involved the department of social services immediately after the family inventory was administered and the family was identified as high risk, and a special effort was made to make hospital support services available to them. These families are in a sense marked. Since follow-up visits are done by social services, the instances of abuse are more easily found because they are looking for it. I have few qualms about the detection of physical abuse, since any child who has these problems will most times come to the attention of medical personnel. My concern is with the non-physical instances of what has been termed maltreatment, in particular, lack of adult supervision. It is conceivable that the unreported group has left the children alone as well, and maybe as frequently. IT IS JUST THAT NO ONE IS LOOKING IN ON THEM TO CHECK. This is what I mean by surveillance bias. The high risk group is scrutinized more carefully than the "comparison group". A way around this would be to take a random sample of the unreported families and visit them on a regular basis to see if there are instances of unreported child abuse.

6i. The term incidence is used correctly. The children are all "abuse free" since they have just been born. They will BECOME abused. They will thus be NEW cases of abuse.

7a. Retrospective study.

                        Acne Present   Acne Absent
Eat Breakfast                20             50
Did not Eat Breakfast        60            110
Total                        80            160


7b. Prospective study

                        Acne Present   Acne Absent   Total
Eat Breakfast                20             50          70
Did not Eat Breakfast        60            110         170


7c. The rules for interpreting the ratios are the same. Here both are less than 1.0. This suggests that there is a protective effect operating. Prospectively then, your chances of getting acne are less if you eat breakfast relative to not eating breakfast. Retrospectively, one would interpret this as among those who had acne they tended not to eat breakfast.
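The two ratios referred to in 7c can be recomputed from the tables in 7a and 7b. This is a sketch that assumes the usual formulas: the odds ratio (cross-product ad/bc) for the retrospective table and the relative risk (ratio of the two row risks) for the prospective table. The variable names are mine.

```python
# 2x2 table from answers 7a and 7b.
#                  acne present, acne absent
a, b = 20, 50    # ate breakfast
c, d = 60, 110   # did not eat breakfast

# Retrospective (case-control) reading: the odds ratio, ad / bc.
odds_ratio = (a * d) / (b * c)

# Prospective (cohort) reading: the relative risk, using the row totals.
risk_breakfast = a / (a + b)         # 20 / 70
risk_no_breakfast = c / (c + d)      # 60 / 170
relative_risk = risk_breakfast / risk_no_breakfast

print(f"Odds ratio    = {odds_ratio:.2f}")
print(f"Relative risk = {relative_risk:.2f}")
```

Both values come out below 1.0, which is the "protective effect" reading given in 7c.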

8. A) Scientific Hypothesis: A greater proportion of alcoholics, compared to nonalcoholics, die following accidents in which severe burns are sustained.

Independent Variable: Alcoholic status (alcoholic vs. nonalcoholic)

Dependent Variable: Death (Yes/No)

One thing that is not described is the seriousness of the burn. Were the alcoholics and nonalcoholics matched on severity of burn? What are the co-morbid conditions (accompanying illnesses) that present with each patient? These are some of the things that must also be considered.

One-tailed vs. Two-tailed: Because of the assumptions, the authors would want to argue a one-tailed test is warranted.

B) Scientific Hypothesis: A relationship exists between parental smoking (smokers vs. nonsmokers) and number of colds per year in nonsmoking teenagers.

Independent Variables: Smoking status of parents (smoker vs. nonsmoker)

Dependent Variables: Number of colds in a 12 month period.

One-tailed vs Two-Tailed: Two-tailed. A relationship is suggested but no direction is stated. For example, it is not hypothesized that nonsmoking teens whose parents both smoke will have a GREATER number of colds per year than nonsmoking teens whose parents are nonsmokers.

Question: What if only one parent smokes? What if there is smoking, but it is not done in the house?
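The practical difference between the one-tailed and two-tailed choices in 8A and 8B can be sketched with a standard normal test statistic. The z value below is hypothetical and is not taken from either study; the function name is also mine.

```python
import math

def p_values(z):
    """One- and two-tailed p-values for a standard normal test statistic z.

    The one-tailed value assumes the hypothesized direction is 'greater'.
    """
    one_tailed = 0.5 * math.erfc(z / math.sqrt(2))    # P(Z >= z)
    two_tailed = math.erfc(abs(z) / math.sqrt(2))     # P(|Z| >= |z|)
    return one_tailed, two_tailed

z = 1.8  # hypothetical test statistic, chosen only for illustration
one, two = p_values(z)
print(f"one-tailed p = {one:.3f}, two-tailed p = {two:.3f}")
```

When the observed difference lies in the hypothesized direction, the two-tailed p-value is twice the one-tailed value. With this z, the one-tailed test falls under .05 and the two-tailed test does not, which is why the choice has to be justified before the data are seen.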

Suggested Answers to Homework Set 4


i. (a+c) / (a+b+c+d)

ii. a / (a+c)

iii. d / (b+d)

iv. b

v. c

vi. d / (c+d)

vii. a / (a+b)
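The expressions above can be collected into one small function. The sketch assumes the usual 2x2 layout (a = true positives, b = false positives, c = false negatives, d = true negatives). Under that assumption the seven answers are, in order: prevalence, sensitivity, specificity, false positives, false negatives, negative predictive value and positive predictive value. The counts in the example are hypothetical.

```python
def two_by_two_measures(a, b, c, d):
    """Measures from a 2x2 table, assuming the usual layout:

                     disease present   disease absent
    test positive          a                 b
    test negative          c                 d
    """
    total = a + b + c + d
    return {
        "prevalence": (a + c) / total,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positives": b,
        "false_negatives": c,
        "negative_predictive_value": d / (c + d),
        "positive_predictive_value": a / (a + b),
    }

# Hypothetical counts chosen only to exercise the formulas.
measures = two_by_two_measures(a=90, b=20, c=10, d=80)
for name, value in measures.items():
    print(f"{name}: {value}")
```

Sensitivity and specificity use the column totals (disease status), while the predictive values use the row totals (test result).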

2. This means that of the people who test positive on your lab test, 77% of them actually have the disease. You can infer from this that 23% will be falsely positive.

3. With a high NPV I would probably conclude that the patient DOES NOT HAVE the disease. I must recognize that I could be wrong 10% of the time, since that is the percentage of falsely negative tests.




In brief, 1) Prevalence x Sample size = .64 x 360 = 230

2) Sensitivity x 230 = .93 x 230 = 214

3) Appropriate subtraction yields 16 for the false negatives and 130 for the number of disease free people in the CCU.

4) Specificity x 130 = .88 x 130 = 114.

5) Subtract for the false positives: 130 - 114 = 16.

a) The PPV is equal to 214 / (214 + 16) = 214 / 230 = 93%

The NPV is equal to 114 / (114 + 16) = 114 / 130 = 88%

b) Given these high values I would feel comfortable concluding that if the test result is POSITIVE the patient is having an INFARCTION; if the test result is NEGATIVE I would feel comfortable concluding the patient DOES NOT have an infarction.
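The five steps above can be sketched as a small function that rebuilds the 2x2 table from the sample size, prevalence, sensitivity and specificity given in the answer. The function name is hypothetical. The counts are left unrounded, so they can differ by about a unit from hand calculations that round at each step.

```python
def expected_table(n, prevalence, sensitivity, specificity):
    """Expected 2x2 counts (unrounded) for n patients."""
    diseased = prevalence * n
    healthy = n - diseased
    tp = sensitivity * diseased      # true positives
    fn = diseased - tp               # false negatives
    tn = specificity * healthy       # true negatives
    fp = healthy - tn                # false positives
    return tp, fp, fn, tn

tp, fp, fn, tn = expected_table(n=360, prevalence=0.64,
                                sensitivity=0.93, specificity=0.88)
ppv = tp / (tp + fp)   # of the positive tests, how many are truly diseased
npv = tn / (tn + fn)   # of the negative tests, how many are truly disease free
print(f"TP={tp:.1f} FP={fp:.1f} FN={fn:.1f} TN={tn:.1f}")
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With the CCU figures this gives a PPV of about 93% and an NPV of about 88%.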

5) Notice that the Prevalence has dropped considerably. Since SENSITIVITY AND SPECIFICITY are INDEPENDENT of the prevalence, I should not expect them to change, and thus they will be the same for my calculations. The PPV and NPV are absolutely dependent on the prevalence and this is the major teaching point. Upon recalculation:

PPV is 46%

NPV is 99%

Thus, for the general population of this hospital I would be on extremely shaky ground concluding that a person has an MI based solely on a positive CPK. However, for this population, with this prevalence, if the patient had a NEGATIVE test I would conclude that there is NO MI. This is where you must study an article carefully. If the population has a prevalence different from yours, the PPV and NPV are going to change. Are your chances significantly improved for correctly diagnosing a medical problem based on the LAB? If no, don't run the test. It contributes little to your knowledge base and runs up the medical costs.
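To watch the predictive values move with prevalence, here is a minimal sketch using Bayes' rule with the sensitivity (.93) and specificity (.88) from the CCU answer. Only the 64% prevalence comes from the text; the lower prevalences are assumed values for illustration, since the prevalence for the general hospital population is not restated in this answer.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

SENS, SPEC = 0.93, 0.88   # values used in the CCU answer

# 0.64 is the CCU prevalence from the text; the others are assumed.
for prevalence in (0.64, 0.30, 0.10, 0.01):
    ppv, npv = predictive_values(prevalence, SENS, SPEC)
    print(f"prevalence {prevalence:>4.0%}: PPV {ppv:.0%}, NPV {npv:.0%}")
```

At an assumed prevalence of 10% the sketch gives values that round to the 46% and 99% quoted above. Sensitivity and specificity never change in the loop; only the prevalence does, and the PPV falls with it.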

6a. A 6b. A 6c. B 6d. A 6e. F 6f. E






OST 590 Biostatistics and Epidemiology

Please fill in the information requested. The answers will be used for demonstration purposes in the next few meetings.

1. Initials (First, Last) ___ ___

2. Age at last birthday (in years, please) ____

3. Sex ___ M ___ F

4. Height (in inches, please) _____

5. Weight (in pounds, please) _____

6. What year do you plan to graduate? ___ 2000 ___ 2001 ___ 2002 ___ Grad Student

7. Which major division of the medical field do you think you want to enter, at this time? (Does not apply to Graduate Students)

___ Medicine ___ Surgery ___ Don't Know

8. Please rank order the following expectations of physicians held by a sample of American people; 1 being the highest, followed by 2, 3, etc. (This is from a national survey entitled -- A Report Card on Americans' Primary Care Physicians)

A. ___ Be knowledgeable and competent

B. ___ Have a friendly personality

C. ___ Counsel patients on steps they could take to enjoy good health

D. ___ Really care about a patient's health