I won’t make a point-by-point response to Reese’s comments, mostly because I agree with most of them!
Oxygen is toxic. Minimizing oxygen toxicity is a vitally important issue.
Alarm fatigue is a major problem. When we performed an audit in our NICU, the average in the intensive rooms was an alarm every 2 minutes. Many alarms are annoying but don’t require immediate intervention. The single greatest source of alarms is the pulse oximeter. If you make the limits narrower, the alarms will be even more frequent, and more likely to be ignored. Alarms which are ignored are worse than useless. We need smart alarms: as one not very smart example, a pulse oximeter high saturation alarm that switches off when the infant is in 21% oxygen, and automatically re-activates when the oxygen is re-started, would be a great idea, which I hereby copyright.
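To make the "not very smart" alarm concrete, here is a minimal sketch of the logic in Python. It is only an illustration: the class name, the upper limit of 95% and the idea of passing FiO2 as a percentage are my assumptions, not the behaviour of any real monitor.

```python
from dataclasses import dataclass

ROOM_AIR_FIO2_PERCENT = 21  # inspired oxygen in room air


@dataclass
class HighSaturationAlarm:
    """Hypothetical high-SpO2 alarm that is only armed in supplemental oxygen."""

    upper_limit: int = 95  # assumed limit, chosen only for illustration

    def is_armed(self, fio2_percent: float) -> bool:
        # Disarmed in room air; re-arms by itself as soon as oxygen is restarted.
        return fio2_percent > ROOM_AIR_FIO2_PERCENT

    def should_sound(self, spo2_percent: float, fio2_percent: float) -> bool:
        # Sound only when armed AND the saturation is above the upper limit.
        return self.is_armed(fio2_percent) and spo2_percent > self.upper_limit


alarm = HighSaturationAlarm(upper_limit=95)
print(alarm.should_sound(spo2_percent=99, fio2_percent=21))  # baby in room air
print(alarm.should_sound(spo2_percent=99, fio2_percent=30))  # baby in 30% oxygen
```

Because the armed state is recalculated from the current FiO2 on every reading, nobody has to remember to switch the alarm back on when the oxygen is restarted.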
Reese makes the point that the difference in actual achieved saturations between groups was less than expected, which may be due to many different factors, including the masking algorithm.
He also notes that the difference in mortality between centers, which we see all the time, is probably greater than the difference in mortality between the saturation target groups, and that the New Zealand trial showed a minor difference in the opposite direction.
My response to this is two-fold. If you look at the effect of an intervention in different subgroups, it would be remarkable if every subgroup had exactly the same benefit. So even if some centers, or countries, show effects which are of different size, or even occasionally in the other direction, that doesn’t invalidate the overall treatment effect. It is one of the reasons that you should be wary of subgroup analyses, even when they are pre-specified. You are bound to find differences between subgroups, hence the value of performing a statistical test of the interaction and, even when that is statistically significant, recognizing that it is potentially subject to bias. The NZ group did indeed show a minor difference in the opposite direction (14.7 vs 15.9%), but the confidence intervals for that are so wide that they include the possibility of a major effect on mortality in either direction (RR for mortality = 1.10 (0.68-1.78)).
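For anyone who wants to see why an interval like that is so wide, here is a rough sketch of the usual log-method calculation of a relative risk and its 95% confidence interval. The counts are hypothetical: I picked 27/170 and 25/170 only because they give 15.9% and 14.7%. They are not taken from the trial report, and a crude calculation like this should not be expected to match a published (possibly adjusted) estimate exactly.

```python
import math


def relative_risk_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Crude relative risk (group A vs group B) with a log-method confidence interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only (not the trial's actual data).
rr, lower, upper = relative_risk_ci(events_a=27, n_a=170, events_b=25, n_b=170)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With only a few dozen deaths per group, the interval comfortably spans 1, so a result like this is compatible with a substantial effect in either direction.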
The even larger differences in mortality between NICUs, even after correcting for baseline risk, are a major issue for neonatology. Quality control/benchmarking programs can address some of those issues, and Pediatrix have been extremely active in this field. I think there are opportunities to make improvements, using such data, that are greater than the effect on mortality of changing saturation limits. (I also think that such programs should be evaluated objectively, preferably using randomized trial designs.)
One of the questions that we might ask, using a secondary analysis of the saturation trial data, is: did centers with a higher overall mortality show a different effect than centers with a lower mortality? If such an effect was systematic and actually changed the direction of the effect, that would be really interesting. If the difference in mortality between the low and high saturation groups was randomly distributed, that would also be interesting and would confirm that the difference is likely due to the intervention.
I must say, though, that as yet I can’t see another explanation for the results of the oxygen trials than a true effect of lower saturations leading to increased mortality. The lower saturation targets lead to more hypoxia (by design), more intermittent hypoxia (Di Fiore JM, et al. Low Oxygen Saturation Target Range is Associated with Increased Incidence of Intermittent Hypoxemia. The Journal of pediatrics. 2012) followed by re-oxygenation and oxidative stress, and more intestinal circulatory fluctuations (this last bit is speculative, but might well be true). These disturbances may happen hundreds of times more often in babies in whom the saturations are kept lower.
If the difference between groups was less than expected, but there was still an increase in mortality with lower saturations, then I find that even more worrying!
I don’t know how we are going to resolve all these issues. There are some areas in neonatology where there is wide agreement (surfactant for all intubated preterm babies needing oxygen, nitric oxide for full term babies with hypoxic respiratory failure and an OI over 25, therapeutic hypothermia for babies with stage 2 HIE), and others where there is still much disagreement (when do you intubate that baby who might benefit from surfactant?). I think we owe it to the babies that we care for to find the best possible answer to these, and other, questions. When reasonable people disagree (as I said in a recent editorial) we can clearly see there is uncertainty, and the best way to settle uncertainty is to perform a trial. But will we ever be able to perform another definitive trial of oxygen saturation targets? Maybe when automated FiO2 controllers are widely available, and the algorithms are settled and adequately reliable, maybe then; but what ranges would we choose to examine?