The Centre for Evidence-Based Medicine develops, promotes and disseminates better evidence for healthcare.

When making decisions about whether to treat patients with therapeutic interventions, we must take into account relevant information from randomised controlled trials (RCTs) or systematic reviews (SRs).

The integration of relevant evidence with clinical experience forms the cornerstone of evidence-based practice.

To assess evidence of the effectiveness of a therapeutic intervention, there are two important concepts that need to be addressed and understood. These are the internal and external validity of a given study.

External validity refers to the extent to which we can generalize the results of a trial to the population of interest, whereas internal validity refers to the extent to which a study properly measures what it is meant to measure. In other words, how much can we trust the results, given what the investigators did?

Look for the main characteristics of the study population. You should be asking how closely this population matches the patients you see and how much they differ. If they differ too much then, no matter how good the results are, it will be difficult to justify applying the treatment from a given paper to your patient.

Define what the actual intervention is. For drug treatment this is usually straightforward; however, price and availability are important issues to take into account when making decisions. For non-drug interventions, defining the intervention can be more problematic. The intervention may be unrealistic in your setting or too costly, or the training needed to deliver it may not be available.

What is missing from descriptions of treatment in trials and reviews

BMJ 2008 Jun 28;336(7659):1472-4.

Glasziou P, Meats E, Heneghan C, Shepperd S

If you are not happy with your understanding of bias, revisit the critical appraisal section.

In interpreting the findings there are three important concepts to understand: the sample size, the summary statistic chosen to report the outcome and the probability or confidence of the result.

For binary data, the event rate in the control or placebo group is known as the Control Event Rate (CER), and the event rate in the intervention or experimental group is known as the Experimental Event Rate (EER). Published articles will often report a relative measure; however, to understand the magnitude of the results you should look for the absolute risk reduction (ARR) and its associated confidence interval (Table 1).

Measure | Definition | How to calculate |
---|---|---|
Relative Risk (RR) | RR is how many times more likely it is that an event will occur in the intervention group relative to the control group. RR=1 means there is no difference between the two groups. RR>1 means the intervention increased the risk of the outcome. RR<1 means the intervention decreased the risk of the outcome | EER/CER |
Relative Risk Reduction (RRR) | RRR is the reduction in the rate of the outcome in the intervention group relative to the control group | (CER-EER)/CER |
Absolute Risk Reduction (ARR) | ARR is the absolute difference in the rates of events between the two groups and gives an indication of the baseline risk and intervention effect | CER-EER |
Number Needed to Treat (NNT) | NNT tells us the number of patients we need to treat to prevent one bad outcome | 1/ARR when the ARR is expressed as a decimal, or 100/ARR when the ARR is a percentage |

**Table 1**
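As a rough illustration of the formulas in Table 1, the sketch below computes each measure from raw event counts. The function name `effect_measures`, the `EffectMeasures` container and the trial numbers are hypothetical and chosen only for this example; they do not come from any study.

```python
from dataclasses import dataclass


@dataclass
class EffectMeasures:
    cer: float  # Control Event Rate
    eer: float  # Experimental Event Rate
    rr: float   # Relative Risk = EER / CER
    rrr: float  # Relative Risk Reduction = (CER - EER) / CER
    arr: float  # Absolute Risk Reduction = CER - EER
    nnt: float  # Number Needed to Treat = 1 / ARR (ARR as a decimal)


def effect_measures(control_events, control_total, exp_events, exp_total):
    """Compute the Table 1 measures from event counts in each group."""
    cer = control_events / control_total
    eer = exp_events / exp_total
    if cer == 0:
        raise ValueError("CER is zero, so RR and RRR are undefined")
    arr = cer - eer
    # With no absolute difference there is no finite NNT.
    # A negative ARR (more events on treatment) gives a negative value here.
    nnt = float("inf") if arr == 0 else 1 / arr
    return EffectMeasures(cer, eer, eer / cer, arr / cer, arr, nnt)


# Hypothetical trial: 20/100 events on control, 10/100 on the intervention.
m = effect_measures(20, 100, 10, 100)
print(f"RR={m.rr:.2f} RRR={m.rrr:.2f} ARR={m.arr:.2f} NNT={m.nnt:.0f}")
```

In this made-up example the relative risk reduction is 50%, but the absolute risk reduction is 10 percentage points, which corresponds to treating about 10 patients to prevent one event. In practice a fractional NNT is usually rounded up to the next whole patient.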

In its simplest form, the p value can be thought of as the probability that a result at least as extreme as the one observed would have happened by chance alone if the intervention truly had no effect. A p value of 0.05 therefore means that if there were no true effect and we repeated the study 100 times, we would expect a result this extreme about five times by chance. A p value of less than 0.05 is usually taken to indicate statistical significance. This threshold has no solid basis, other than being the number chosen many years ago. When many comparisons are being made, statistical significance can occur just by chance. A more stringent rule is to use a p value of 0.01 (1 in 100) or below as statistically significant.
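To see why many comparisons make chance findings more likely, the following minimal sketch computes the probability of at least one "significant" result when no true effect exists. It assumes the comparisons are independent, which is a simplification, and the function name `chance_of_false_positive` is hypothetical.

```python
def chance_of_false_positive(n_comparisons, alpha=0.05):
    """Probability of at least one p < alpha across n_comparisons tests
    when there is no true effect, assuming the tests are independent."""
    return 1 - (1 - alpha) ** n_comparisons


# Compare the conventional threshold with the more stringent 0.01 rule.
for k in (1, 5, 20):
    at_005 = chance_of_false_positive(k, alpha=0.05)
    at_001 = chance_of_false_positive(k, alpha=0.01)
    print(f"{k:>2} comparisons: {at_005:.2f} at 0.05, {at_001:.2f} at 0.01")
```

Under this assumption, twenty comparisons at the 0.05 threshold give roughly a 64% chance of at least one spurious significant result, compared with about 18% at the 0.01 threshold.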

A p-value is not the probability that a given result is right or wrong. It is not the probability that the result occurred by chance, or a measure of the clinical significance of the results. A very small p-value cannot compensate for the presence of a large amount of systematic error (bias) in the trial. If the opportunity for bias is large, the p-value is likely invalid and irrelevant.

In addition, you will want to see results expressed as a confidence interval (CI). The 95% CI gives an estimated range likely to include the true effect with a certain measure of certainty, in this case 95%. The width of the confidence interval gives an idea about how uncertain we are about the effect. A wide interval may indicate that more data should be collected and analyzed before determining the definitive result.
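One common way to obtain a 95% CI for an absolute risk reduction is the normal (Wald) approximation for a difference between two proportions. The sketch below uses that approximation as an assumption; it is not necessarily the method used in any given paper, and it is unreliable when groups are small or event rates are close to 0 or 1. The function name and the trial numbers are hypothetical.

```python
import math


def arr_confidence_interval(control_events, control_total,
                            exp_events, exp_total, z=1.96):
    """Return (ARR, lower, upper) using the normal approximation.
    z=1.96 corresponds to a 95% confidence interval."""
    cer = control_events / control_total
    eer = exp_events / exp_total
    arr = cer - eer
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(cer * (1 - cer) / control_total
                   + eer * (1 - eer) / exp_total)
    return arr, arr - z * se, arr + z * se


# Same event rates (20% vs 10%) in a small and a larger hypothetical trial.
for n in (100, 1000):
    arr, low, high = arr_confidence_interval(n // 5, n, n // 10, n)
    print(f"n={n} per group: ARR={arr:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With 100 patients per group the interval runs from about 0.002 to 0.198, so the true effect could be almost nil or nearly double the estimate. With 1,000 per group it narrows to roughly 0.069 to 0.131, which illustrates how interval width reflects the amount of data.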

What is a composite endpoint, why is it used in trials, and what are its potential weaknesses?

A composite endpoint is defined in terms of two or more primary clinical endpoints (at the patient level). Composite endpoints are used to enhance the statistical power of the study. The rationale for using them includes:

- The disease has many characteristics
- Low event rates occur in component endpoints or treatment effect on individual primary endpoints may be small
- Mortality needs to be accounted for in the overall endpoint
- To increase the power of demonstrating an overall treatment effect

Prevention of diabetes

BMJ 2006 Oct 14;333(7572):764-5.

Heneghan C, Thompson M, Perera R

A surrogate outcome is an outcome that is not the true health outcome of interest. However, the relationship between the two on the causal pathway is what makes a surrogate outcome relevant and helpful. For example, blood pressure is used as an endpoint in many trials as a surrogate for reduced mortality or morbidity such as stroke. Blood pressure is chosen because it has a well-established link with stroke, and because trials aiming to demonstrate reduced mortality would have to be large and long.

You need to ask whether there is a sound reason to use a surrogate endpoint and, if so, whether the relationship between the two outcomes has been well established.

The features outlined are sufficient to navigate quickly and efficiently through an article. Use the three targets (define the population and intervention; search for and understand the biases; and interpret the findings) to better understand decision making in clinical practice. Try selecting one journal and reading its RCTs, applying the principles set out above to find strengths and weaknesses in the study design. If you think you have something to say in terms of critical appraisal or applying the results, consider sharing it with a wider audience by submitting a rapid response.

External validity of randomised controlled trials: “to whom do the results of this trial apply?”

Lancet 2005; 365(9453):82-93

Rothwell PM

Methods in health services research. Interpreting the evidence: choosing between randomised and non-randomised studies

BMJ 1999; 319(7205):312-315

McKee M, Britton A, Black N, McPherson K, Sanderson C, Bain C

Users’ guide to detecting misleading claims in clinical research reports

BMJ 2004; 329(7474):1093-1096

Montori VM, Jaeschke R, Schunemann HJ, Bhandari M, Brozek JL, Devereaux PJ et al.