I received a very thoughtful comment from Reese Clark, whom many of you will know as a leader in neonatology; his many years of experience and important scientific contributions make him someone worth listening to.
He has doubts about the reliability of the BOOST-II results, and therefore about the oxygen saturation target ranges that should be used. He raises two things: first, that mortality was getting better during the period when lower saturations were being introduced; second, he refers to the meta-analysis by Manja et al. (Manja V, et al. Oxygen saturation target range for extremely preterm infants: A systematic review and meta-analysis. JAMA Pediatrics. 2015;169(4):332-40.)
I will refer to the systematic review first, because I didn’t comment on it when it was first published:
The systematic review by Manja, in fact, showed that death before hospital discharge was significantly increased by targeting low oxygen saturations, and that necrotizing enterocolitis was also increased. They downgraded the quality of evidence using, they stated, the GRADE criteria. But some of their reasons given for downgrading the evidence are bizarre, and not consistent with those guidelines at all.
For each of the outcomes they give these two reasons for downgrading them:
c. The pulse oximeter algorithm was modified midway through the study owing to a calibration correction, and this caused a deviation from SpO2 values.
d. The separation of SpO2 values obtained was not as planned in the study design/protocol. The median SpO2 value in the restricted arm (planned SpO2 of 85%-89%) was higher than 90% in some studies (Figure 1).
c. I don’t see how the change in the calibration would lead to downgrading the evidence. The trials were carried out as designed and, when the calibration error was discovered, this was noted so that the analyses could take it into account if need be. The statement is also not entirely true: there was no oximeter calibration change in SUPPORT or in BOOST-NZ.
d. This is just not true. The separation of SpO2 values actually obtained was not part of the study protocol. The protocol was to compare the saturation target ranges, not the saturations actually achieved. This is like saying a trial of an anti-hypertension drug is lower quality because the blood pressure was not lowered as much as expected. If you still see a significant difference in outcomes, despite the intervention being less successful than planned, isn’t that a major red flag?
Two other reasons for downgrading the evidence for the outcome “death before hospital discharge” are given as:
e. This was not a prespecified outcome in the Benefits of Oxygen Saturation Targeting II trial, which was prematurely stopped because of this outcome.
f. Only 4 of the 5 eligible trials reported on the outcome of death before hospital discharge (the Canadian Oxygen Trial group did not).
e. This is evidence of good research practice. If more children are dying in one arm of a trial than in the other, by a highly statistically significant (more than 3 standard deviations) degree, then to wait another 2 years, allowing continued enrollment, would be a criminally unethical thing to do. In addition, there are very few deaths between discharge and two years, so the difference is likely to remain.
f. Why should this lead to downgrading the evidence? It is the quality of the included trials for each outcome that is important, not whether all trials reported the outcome.
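To put the "more than 3 standard deviations" mentioned under point e into more familiar terms, here is a minimal sketch that converts a z-score into a two-sided p-value under a normal approximation. The function name is my own, and 3.0 is simply the threshold quoted above, not the exact test statistic from the trial.

```python
import math

def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal z-score."""
    # For a standard normal Z, P(|Z| > z) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

p = two_sided_p_from_z(3.0)
print(f"z = 3.0 -> two-sided p = {p:.4f}")
```

A difference of 3 standard deviations corresponds to a two-sided p-value of roughly 0.003, so "more than 3" means smaller still.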
At the time the Manja paper was published there were data regarding mortality at 24 months from 3 of the trials (SUPPORT, COT and BOOST-NZ). Mortality was increased by 16%, or in absolute terms, by 27 per 1000 infants, with the lower saturation target. This was not statistically significant, but not far off (95% confidence interval 0.98 to 1.37), and the evidence was downgraded to “moderate” quality for reasons c and d above. The new results from the BOOST-II studies show a relative increase in mortality of 20%, and an absolute risk difference of 35 per 1000 infants (all oximeters combined), which is remarkably close to the pooled results from the previous studies.
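As a rough consistency check on the figures above, the sketch below back-calculates the control-arm mortality implied by each pair of relative and absolute differences. It assumes the simple relation ARD = control risk × (RR − 1); the function name is hypothetical, and the implied baseline risks are my own arithmetic from rounded numbers, not values reported by the trials.

```python
def implied_control_risk(relative_risk: float, abs_diff_per_1000: float) -> float:
    """Back-calculate control-arm risk from RR and absolute risk difference.

    Assumes ARD = control_risk * (RR - 1), i.e. both figures describe
    the same comparison. Returns a proportion between 0 and 1.
    """
    return (abs_diff_per_1000 / 1000) / (relative_risk - 1)

# Figures quoted in the text above (rounded, so results are approximate).
earlier = implied_control_risk(1.16, 27)  # pooled SUPPORT, COT, BOOST-NZ
newer = implied_control_risk(1.20, 35)    # BOOST-II, all oximeters combined

print(f"Implied control-arm mortality, earlier trials: {earlier:.1%}")
print(f"Implied control-arm mortality, BOOST-II:       {newer:.1%}")

# A 95% CI for a relative risk that includes 1 is not significant at the 5% level.
ci_low, ci_high = 0.98, 1.37
significant = not (ci_low <= 1 <= ci_high)
print("Earlier pooled result significant at 5%:", significant)
```

Both pairs of figures imply a baseline mortality of about 17%, which is one way of seeing that the two sets of results describe very similar effects in very similar populations.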
To return to the first issue in the new comment, i.e. the fact that survival was improving during the period when lower saturations were being sporadically and inconsistently introduced: I think this is really questionable as evidence of the impact of lower saturation targets. It may be that survival was improving despite the lowering of saturation targets; in fact I think that a lot of the improved survival was due to changes in obstetrical attitudes and interventions, as extremely preterm babies are often delivered in much better condition these days than they used to be. The only way to answer reliably the question of the impact of saturation targeting practices is to perform the kind of large RCTs that we have performed.
I don’t see any other way of interpreting these data than to admit that lower saturation targets lead to higher mortality from a variety of causes, as well as an increase in necrotizing enterocolitis. We might not like it (I don’t like it) but I can’t see any other valid explanation of this weight of evidence from high-quality trials enrolling 5000 infants.
Reese Clark has now sent me some more interesting comments which I will put in the next post, and then discuss, probably in a third post.