Potential barriers to accurate reporting of harms

Our reanalysis of Study 329 revealed significant variations in the way adverse events can be reported, demonstrating several ways in which the analysis and presentation of safety data can influence the apparent safety of a drug.

1. Use of an idiosyncratic coding system

The term ‘emotional lability’, as used in SKB’s ADECS, masks differences in suicidal behaviour between paroxetine and placebo.

2. Failure to transcribe all adverse events from the clinical record to the adverse event database

Our review of Case Report Forms disclosed significant under-recording of adverse events.

3. Filtering data on adverse events through statistical techniques

Keller et al. (and GSK in subsequent correspondence) ignored unfavourable harms data on the grounds that the difference between paroxetine and placebo was not statistically significant. This was at odds with the SKB protocol, which called for primary comparisons to be made using descriptive statistics. In our opinion, statistically significant or not, all relevant primary and secondary outcomes, and harms outcomes, should be explicitly reported. Testing for statistical significance is most appropriately undertaken for the primary outcome measures, since study power is based on these. We have not undertaken statistical tests for harms, since we know of no valid way of interpreting them. To get away from a dichotomous (statistically significant/non-significant) presentation of evidence, we opted to present all original and recoded evidence to allow readers their own interpretation. The data presented in RIAT Appendix 2 and related worksheets lodged at Study329.org will, however, readily permit other approaches to data analysis for those interested, and we welcome other analyses.

4. Restriction of reporting to events that occurred above a given frequency in any one group

In the Keller et al. paper, reporting only adverse events that occurred in more than 5% of patients obscured the harms burden. In contrast, we report all adverse events that have been recorded. These are available in table E in RIAT Appendix 2.

5. Coding an event under different headings for different patients (dilution)

The effect of reporting only adverse events that have a frequency of more than 5% is compounded when, for instance, agitation may be coded under agitation, anxiety, nervousness, hyperkinesis and emotional lability; thus, a problem occurring at a rate of >10% could vanish by being coded under different subheadings such that none of these reach a threshold rate of 5%.

Aside from making all the data available so that others can scrutinise it, one way to compensate for this possibility is to present all the data in broader system organ class (SOC) groups. MedDRA offers the following higher levels: psychiatric; cardiovascular; gastrointestinal; respiratory; and other. In RIAT Appendix 2, table E, the adverse events coded under ‘Other’ are broken down under the additional MedDRA SOC headings, including general, nervous system, metabolic, and pregnancy.
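The arithmetic behind dilution can be sketched in a few lines. The counts below are hypothetical, chosen only to show the effect, and are not taken from Study 329. The arm size, the function names and the assumption that each patient appears under exactly one term are also illustrative.

```python
# Hypothetical illustration only: these counts are invented, not Study 329 data.
ARM_SIZE = 100  # assumed number of patients in one treatment arm

# One underlying problem (agitation) coded under five different terms.
# Assumption: each affected patient is counted under exactly one term.
coded_events = {
    "agitation": 3,
    "anxiety": 2,
    "nervousness": 2,
    "hyperkinesis": 2,
    "emotional lability": 2,
}


def rate_pct(count, n=ARM_SIZE):
    """Percentage of patients in the arm with the event."""
    return 100.0 * count / n


def terms_above_threshold(events, threshold_pct=5.0, n=ARM_SIZE):
    """Keep only the terms whose rate exceeds the reporting threshold."""
    return {term: c for term, c in events.items() if rate_pct(c, n) > threshold_pct}


def grouped_rate_pct(events, n=ARM_SIZE):
    """Rate when all the terms are pooled into one broader group."""
    return rate_pct(sum(events.values()), n)


reported = terms_above_threshold(coded_events)  # nothing survives the 5% filter
combined = grouped_rate_pct(coded_events)       # pooled rate across the five terms

print("Terms reported above 5%:", reported)
print("Pooled rate (%):", combined)
```

With these invented counts, no single term exceeds the 5% threshold, yet the pooled rate is above 10%. Pooling into a broader group is one way to recover the signal, subject to the grouping caveats discussed under item 6.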

6. Grouping of adverse events

Even when presented in broader system groups, grouping common and benign symptoms with more important ones can mask safety issues. For example, in the Keller et al. paper, common adverse events such as dizziness and headaches are grouped with psychiatric adverse events in the ‘nervous system’ SOC heading. Since these adverse events are frequent across treatment arms, this grouping has the effect of diluting the difference in psychiatric side effects between paroxetine, imipramine and placebo.

We have followed MedDRA in reporting dizziness under ‘cardiovascular’ events and headache under ‘nervous system’. There may be better categorisations; our grouping is provisional rather than strategic. In RIAT Appendix 2, table E we have listed all events coded under each SOC heading, and we invite others to further explore these issues, including alternative higher level categorisation of these adverse events.

7. Insufficient consideration of severity

In addition to coding adverse events, investigators rate them for severity. If no attempt is made to take severity into account, readers may get the impression that there was an equal adverse event burden in each arm, when in fact all events in one arm might be severe and enduring while those in the other might be mild and transient.

One way to manage this is to look specifically at those patients who drop out of the study because of adverse events. Another method is to report those adverse events coded as severe for each drug group separately from those coded as mild or moderate. We used both approaches.
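The second approach amounts to tabulating events by arm and severity rather than by arm alone. The following is a minimal sketch using invented records, not Study 329 data; the arm labels, severity labels and function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records of (arm, investigator severity rating); not Study 329 data.
events = [
    ("drug", "severe"), ("drug", "severe"), ("drug", "moderate"), ("drug", "severe"),
    ("placebo", "mild"), ("placebo", "mild"), ("placebo", "moderate"), ("placebo", "mild"),
]


def totals_by_arm(records):
    """Count events per arm, ignoring severity."""
    return Counter(arm for arm, _severity in records)


def counts_by_arm_and_severity(records):
    """Count events per (arm, severity) pair."""
    return Counter(records)


totals = totals_by_arm(events)
by_severity = counts_by_arm_and_severity(events)

print("Events per arm:", dict(totals))
print("Severe events, drug:", by_severity[("drug", "severe")])
print("Severe events, placebo:", by_severity[("placebo", "severe")])
```

In this invented example both arms have the same number of events, so a table of totals would suggest an equal burden, while the severity breakdown shows that the severe events all fall in one arm.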

8. Coding of relatedness to study medication

Judgements by investigators as to whether an adverse event is related to the drug can lead to discounting the importance of an effect. We have included these judgements in the worksheets lodged at Study329.org, but we have not analysed them, because it became clear that the blind had been broken in several cases before relatedness was adjudicated by the original investigators, and because some judgements were implausible. For instance, it is documented in the Clinical Study Report (p. 279) that an investigator, knowing the patient was on placebo, declared that a suicidal event was ‘definitely related to treatment’, on the grounds that ‘the worsening of depression and suicidal thought were life threatening and definitely related to study medication [known to be placebo] in that there was a lack of effect’. Notably, of the 11 patients with serious adverse events on paroxetine (compared with two on placebo) reported in the Keller et al. paper, only one ‘was considered by the treating investigator to be related to paroxetine treatment’, thus dismissing the clinically significant difference between the paroxetine and placebo groups for serious adverse events.

9. Masking effects of concomitant drugs

In almost all trials, patients will be taking concomitant medications. The adverse events from these other medications will tend to obscure differences between active drug treatment and placebo. This may be a very significant factor in trials of treatments such as statins, where patients are often taking multiple medications.

Accordingly, we also compared the incidence of adverse events in patients taking concomitant medication with the incidence in those not taking other medication. We have not analysed other medications instituted in the course of the study, but the data are available in our RIAT Appendix 2, tables K and L, in the worksheets lodged at Study329.org, and in Appendix B from the Clinical Study Report. Several other angles in the data could be explored further, such as the effects of withdrawal of concomitant medication on adverse event profiles: the spreadsheets submitted document the day of onset of adverse events and the dates of starting or stopping any concomitant medication. Another option is to explore the possibility of prescribing cascades triggered by adverse events related to study medication.
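A comparison of this kind can be sketched as a stratified incidence calculation. The counts below are hypothetical and are not Study 329 data; the arm labels, data layout and function names are assumptions chosen only to show how pooling across strata can narrow an apparent difference.

```python
# Hypothetical counts only; not Study 329 data.
# Key: (arm, on_concomitant_medication)
# Value: (number of patients, number of patients with at least one adverse event)
strata = {
    ("drug", False): (10, 4),
    ("placebo", False): (10, 1),
    ("drug", True): (10, 6),
    ("placebo", True): (10, 5),
}


def stratum_incidence(arm, on_concomitant):
    """Incidence of adverse events within one arm and one stratum."""
    n_patients, n_with_event = strata[(arm, on_concomitant)]
    return n_with_event / n_patients


def pooled_incidence(arm):
    """Incidence for an arm with both strata combined."""
    n_patients = sum(n for (a, _), (n, _e) in strata.items() if a == arm)
    n_with_event = sum(e for (a, _), (_n, e) in strata.items() if a == arm)
    return n_with_event / n_patients


pooled_diff = pooled_incidence("drug") - pooled_incidence("placebo")
no_concomitant_diff = (
    stratum_incidence("drug", False) - stratum_incidence("placebo", False)
)

print(f"Pooled difference: {pooled_diff:.2f}")
print(f"Difference without concomitant medication: {no_concomitant_diff:.2f}")
```

In this invented example the drug-placebo difference is larger among patients not taking other medication than in the pooled comparison, which is the kind of masking described above.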

10. Ignoring effects of drug withdrawal

The protocol included a taper phase lasting 7-17 days that investigators were encouraged to adhere to, even in patients who were discontinued because of adverse events. The original paper did not analyse these data separately. The elevated rates of discontinuation-emergent psychiatric adverse events revealed in our analysis are consistent with dependence on and withdrawal from paroxetine, as reported by Fava (2006).


Fava M. Prospective studies of adverse events related to antidepressant discontinuation. J Clin Psychiatry. 2006;67(suppl 4):14-21.

Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry. 2001;40(7):762-772.