Death or oxygen, which is worse?

We have a big problem in neonatal research. We have constructed composite outcomes that have become the “standard of design”, but are not of much use for anyone. Because we are, rightly, concerned that death and other diagnoses may be competing outcomes, we often use as the primary outcome measure “death or BPD” or “death or severe retinopathy” or death or “neurodevelopmental impairment”. We have done this because dead babies can’t develop BPD, or developmental delay.

The idea, of course, is that we want to see if an intervention will improve survival without lung injury, for example. There are two problems with this, if the outcome is more frequent, but neither part of the outcome is individually significantly affected. What then? The other problem is that we might well find that death is less frequent but that lung injury is more frequent. And what then? If the composite outcome is unchanged, then strictly speaking we can only say that the study found no effect on the outcome, and an analysis of the parts of the composite outcome are considered secondary analyses.

This happens. The SUPPORT trial showed no effect of oxygen saturation targets on the primary outcome, but the low target babies had more mortality, while the high target babies had more retinopathy.

Study designs like this are effectively equating the parts of the primary outcome in importance for the analysis.

By studying the outcome of “death or BPD” we are effectively saying that an adverse outcome is being dead or being on low-flow oxygen at 36 weeks. I don’t think many readers of this blog would agree, if they themselves were critically ill, that surviving with a need for long-term domiciliary oxygen and being dead were equivalent.

This has again become painfully clear with the publication of the STOP-BPD trial. (Onland W, et al. Effect of Hydrocortisone Therapy Initiated 7 to 14 Days After Birth on Mortality or Bronchopulmonary Dysplasia Among Very Preterm Infants Receiving Mechanical Ventilation: A Randomized Clinical Trial. JAMA. 2019;321(4):354-63). This was a very high quality, important trial of hydrocortisone in ventilator dependent babies. Infants less than 1250 g birthweight and <30 wk gestation were randomized to placebo or to hydrocortisone 1.25 mg/kg/dose 4 times a day for a week, then 3 times a day for 5 days, then twice a day for 5 days then once a day for 5 days.

They had to be ventilator dependent at 7 to 14 days of age with a respiratory index (product of mean airway pressure and the fraction of inspired oxygen) equal to or greater than 3.5 for more than 12 h/d for at least 48 hours.

Which would mean for example a mean airway pressure of 8 and an FiO2 of 0.44.

During the initial months of the trial, participating centers noted that many infants receiving ventilation and considered at high risk of BPD had a respiratory index of less than 3.5 and were treated with corticosteroids outside the trial. Based on this feedback, the respiratory index threshold was reduced to 3.0 and finally to 2.5 (in May 2012 and December 2012, respectively) via approved protocol amendments.

By the end of the trial, then, an infant at 7 days of age, with a mean airway pressure of 8 on 32% oxygen or more would have been eligible.

The definition of BPD was oxygen requirement at 36 weeks (with an O2 reduction test if needing less than 30%). Death was also recorded to 36 weeks for the primary outcome. Which means that dying between 36 weeks and discharge would be considered a good outcome, if you didn’t have BPD.

The primary outcome occurred in 128/181 hydrocortisone babies (70.7%), and 140/190 controls (73.7%). In other words there was no impact of the hydrocortisone, which is what the abstract states. But at 36 weeks there were significantly, and substantially, more babies who received hydrocortisone alive than controls, 84.5% vs 76.3%, which was “statistically significant” p=0.048. Between 36 weeks and hospital discharge there were several deaths in each groups, and the difference had narrowed slightly, with 80% of hydrocortisone babies and 71% of control babies being alive, p=0.06.

This happened despite a very high rate of open-label hydrocortisone use in the control babies. In fact 108 of the 190 control babies received hydrocortisone.

The protocol is available with the publication, and it notes the following :

In case of life threatening deterioration of the pulmonary condition, the attending physician may decide to start open label corticosteroids therapy in an attempt to improve the pulmonary condition. At that point in time the study medication is stopped and the patient will be recorded as “treatment failure”.

This could occur during the 21 days of study drug use. In addition, physicians could give steroids after the 21 days of the study drug:

Late rescue therapy outside study protocol (late rescue glucocorticoids): Patients still on mechanical ventilation after completion of the study medication, i.e. day 22, may be treated with open label corticosteroids.

I’m not quite sure about this, but I think that 86 of those 108 control babies who received hydrocortisone got it during the 21 days study drug window, and 22 others received steroids after the study drug period. In the hydrocortisone group I can see no indication of how many got open-label steroids during the study drug period, but there are 6 who got steroids after the end of that period.

The substantial differences in mortality are despite a very high rate of treatment of babies randomized to control who received hydrocortisone, which will of course dilute the potential impact of the intervention.

There are modest differences in BPD between the groups, with the hydrocortisone babies having slightly more (100 cases vs 95), but if you express this result as “BPD among survivors”, the numbers are actually identical; just over 65% in each group.

I think the best interpretation of this study would be as follows: eligible babies who received immediate hydrocortisone, compared to those who waited and only received hydrocortisone in the case of a “life-threatening” deterioration, were less likely to die, but, if they survived had the same likelihood of developing BPD.

I hope there is neurological and developmental follow up planned for this trial, although the power of the study to say very much, when so many control babies received hydrocortisone, will be quite limited.

This is now a huge problem, the published article states there is no effect of hydrocortisone, but that is not what I get from the data.

Here is the cute graphic that accompanies the paper

Effect of Hydrocortisone 7-14 Days After Birth in Very Preterm Infants Receiving Mechanical Ventilation

What can we do about this? Based on this study, the use of hydrocortisone in a similar dose, to infants with substantial oxygen requirements after 7 days of age would be a reasonable choice. Waiting for life threatening deterioration (it would be interesting to know what that meant to the attending physicians!) seems to increase your risk of dying. I think it is unlikely that any neurological or developmental impacts of hydrocortisone are severe enough to be worse than dying, and I just hope that any long term outcome study of these infants does not use the outcome “death or low Bayley scores”.

Analyzing the deaths differently using survival curves gives the following, with a p-value suggesting that this is unlikely to be due to chance alone. I know it’s a bit more than .05, but there is only 1 chance in 17 that completely random numbers would give a difference like this :

https://cdn.jamanetwork.com/ama/content_public/journal/jama/937780/joi180157f2.png?Expires=2147483647&Signature=hTq13Xvb7CsNt4iGV2YbDuLylAyVKsFskeO4jGl-XR6W9ihP9Q0XV-8XD7qq1u6T9YHsWah4bxUVcNRetg35BknOyw-SIXqB~w9PCowxd3BK0ul8p3AW3W6vA4rPZ1xs692EplAQbCHUO8kO8sabQc5oz1pYmtXkfy8YYN08jU02zWEk2tkzFz101~tnugmfOeeBmeVxcEXpOgPonfnbHUNcRDSE6ZFwY0u5JjSI2j02wkWBS9TC00K825Y8uXCSs~RnBByZXc4~lLqxqly6LZ0Z-Qit1oSyfZqJi57eWT4JDQj1lthE17nAPVXhYNWXwYNm3Ft-5paaUTbCBpAcvA__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA

I think we have to stop using “death or BPD” as a composite dichotomous outcome for our studies.

There are alternatives, even when death and the other outcome of interest are competing.

One way is to analyze the same data differently. One method, for example, is to compare each babies outcome to all of the babies in the other group. A baby who dies receives zero points in comparison to the other group babies who died, receives -1 point in comparison to the other group babies who survived. Each surviving baby with BPD is then scored +1 point in comparison with the other group babies who died, zero points in comparison with the other group babies with BPD and -1 point in comparison with the surviving babies without BPD, and babies without BPD score +1 in comparison with babies who died or survived with BPD, and score 0 in comparison with babies who survived without BPD. The ratio of winning to losing babies is then referred to as the “win ratio”.

This is a variant of the method used by the study I discussed in my last post, Beitler et al examining different ways of determining optimal PEEP. Finkelstein DM, Schoenfeld DA. Combining mortality and longitudinal measures in clinical trials. Statistics in Medicine. 1999;18(11):1341-54. In fact it is more generally applicable, and there have been multiple publications about the method (and other related methods) as well as many publications using the methods, mostly in cardiology, where composite outcomes including death or a revascularization procedure, as one example, are common, but recognized to have differing weights. Pocock SJ, et al. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal. 2012;33(2):176-82.

For example, if you ran a study with 20 babies per group, and the results showed group A had 5 deaths and 10 survivors with BPD, group B had 10 deaths and 5 with BPD. Our usual analysis would say there was no impact on “death or BPD”. The analysis that I have just suggested, in contrast, gives a score in group A to each one of the dead babies of -10, and -15 to those in group B. The BPD babies each score+5 in group A and 0 in group B, and the survivors without BPD score +15 in both groups. The win ratio for the trial is 3.0 for group A, as there are 15 babies who win overall in most of their pairwise comparisons, and 5 who lose. Calculating the p=value for this is complicated, but well described, and methods for calculating the confidence interval of the win ratio are, also.

Effectively, what this kind of analysis does is to rank the adverse outcomes, death being scored before BPD.

I would be fascinated to see what the results of STOP-BPD would look like if this kind of analysis was performed, the win ratio of the hydrocortisone group works out to 5.2 to my calculation, compared to 3.4 for the controls. It could be that such a difference is statistically significant, and such an analysis might enable future trials to be designed using this method.

You can also with this technique examine different severities of BPD, with BPD being scored as moderate vs severe. This kind of analysis can also include longitudinal quantitative measures, such as duration of home oxygen therapy, or number of admissions after discharge. Things which are, I would suggest, far more important to parents than whether the oxygen is stopped before or after 36 weeks.

Before there are any other trials counting death and BPD as equally important outcome measures, or death and retinopathy, or death and developmental delay, or “death, BPD, NEC, LOS, IVH, ROP” we should reconsider how we measure and analyze outcomes. We should be including outcomes that are important to families, rank them according to their relative importance to parents, and analyze them using methods which are now well validated which take into account their relative importance.

6 Responses to Death or oxygen, which is worse?

Dr. Rangasamy Ramanathan says:

10 March 2019 at 03:41

Excellent analysis Keith. In the SUPPORT trial, depending on the data are analyzed, there are differences in p value. Sometimes, we try to manipulate statistics to get the p value we want!! Even the NEOPROM report concluded that there is no difference in the PRIMARY COMPOSITE OUTCOME of Death or major disability at 18-24 months (p=0.21) . ROP, but, then they go on to say death alone was statistically significant at a p of 0.01!!!

Bree Andrews, MD/MPH says:

11 March 2019 at 10:19

Thank you Keith for showing us yet again how to think about neonatal research.

Pingback: What to do about early postnatal steroids? | Neonatal Research
Pingback: Longer-term outcomes: what should we measure? part 2 | Neonatal Research
Pingback: Which is worse; death or a low Bayley score? Comparing composite outcomes between groups, taking into account clinical priorities. | Neonatal Research
Pingback: STOP-BPD follow up study | Neonatal Research