The long-running epic of the oxygen saturation targeting trials is nearing completion. This publication of the joint results of the Australian and UK trials now includes the primary outcome for the trials, the combined rate of death or “disability”. Australia BOOST-II and United Kingdom Collaborative Group. Outcomes of Two Trials of Oxygen-Saturation Targets in Preterm Infants. The New England Journal of Medicine. 2016. Disability is defined as a cognitive or language score on the Bayley-3 of less than 85, severe visual loss, or disabling CP (GMFCS level of 2 or more). I will avoid (for a change) ranting about the inappropriateness of referring to a Bayley cognitive or language score of less than 85 as a “disability”.
Because of what happened during the trials the analysis can seem quite complex. But the overall message is that the adverse outcome was increased in the low saturation group when the two trials are combined, however you slice the data.
In case there are any readers who don’t know, a calibration artefact was discovered during the trials and corrected, so that each of these trials, and the COT trial, had babies monitored with oximeters from a before-correction group and an after-correction group. In the two trials, the difference in mortality only occurred after the change in oximeter algorithm. The smaller NZ trial used only the original algorithm and didn’t find an effect on mortality (or on long-term outcome), whereas SUPPORT, with somewhat different entry criteria, did show a difference in mortality despite using only the original oximeters. The Canadian Oxygen Trial also showed a higher mortality in the low saturation group after the oximeter adjustment, but it didn’t reach statistical significance.
The new publication shows no effect of the saturation target on “disability”, but the analysis of the primary outcome “death or disability” was significant for the pooled data. What gets complicated is that the UK group changed their primary outcome during the trial to be the rate of death or disability with the revised oximeters, whereas the Australians kept the whole group as their primary analysis. In the UK the oximeters were changed after roughly ¼ of the babies were enrolled, while in Australia 3/5 of the babies were studied with the original devices.
So the primary outcome analysis of the original trials as presented doesn’t include some of the randomized babies (in the UK trial), which bothers me a bit, but their data are presented and analyzed. And then there is quite a lot of detail in one of the tables. The combined outcome of death or disability was significant for the pooled data which included all of the randomized babies (48% vs 43%) and not far from significant for the revised oximeters (49% vs 44%, RR 1.12, 95% CI 0.99-1.27). As I mentioned above, there isn’t any sign of an effect on disability; the difference is all in mortality, now updated to mortality before 2 years of age, and it is most dramatic when the analysis is restricted to the revised oximeters. For the revised oximeters alone the relative risk of death in the low saturation target group was 1.45 (95% CI 1.16 to 1.82).
As everyone who now has Masimo oximeters uses the new algorithm, and other oximeters were never affected, this is the part of the results which is now most relevant, and I think it needs to be taken very seriously.
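For readers who want a feel for what “not far from significant” means for the revised-oximeter results, here is a rough sketch that backs an approximate two-sided P value out of a relative risk and its 95% confidence interval. It is my own illustration, not a calculation from the paper: it assumes the interval is symmetric on the log scale and based on a normal approximation, and the function name `p_from_ratio_ci` is made up for this example. The only inputs are the figures quoted above.

```python
import math

def p_from_ratio_ci(rr, lower, upper, z_crit=1.96):
    """Approximate z statistic and two-sided P value from a ratio and its 95% CI.

    Assumes (hypothetically) that the CI was built as
    exp(log(rr) +/- z_crit * SE) using a normal approximation.
    """
    # Standard error of log(RR), recovered from the width of the interval
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    # z statistic for the null hypothesis RR = 1, i.e. log(RR) = 0
    z = math.log(rr) / se_log
    # Two-sided P value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Figures quoted in the post for the revised oximeters
results = {
    "death or disability": (1.12, 0.99, 1.27),
    "death before 2 years": (1.45, 1.16, 1.82),
}

for label, (rr, lo, hi) in results.items():
    z, p = p_from_ratio_ci(rr, lo, hi)
    excludes_one = lo > 1 or hi < 1
    print(f"{label}: RR {rr}, z = {z:.2f}, approx P = {p:.3f}, "
          f"CI excludes 1: {excludes_one}")
```

Because the published estimates are rounded, the P values this gives are only approximate, but the contrast between the two outcomes is the point: the interval for death or disability just includes 1, while the interval for death sits well clear of it.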
One comment I would like to make is that the primary analysis for the trials is described as “pre-specified”. But how can the analysis by oximeter algorithm be pre-specified if the problem was discovered during the trial? Pre-specified is supposed to mean “determined before the trial started”. I think the analysis is just fine; the dilemma about what to do when this was discovered partway through a trial is not easily resolved, and the different choices of the 2 trials can both be justified. It is the use of the word “pre-specified” that I think is incorrect. Also, the definition of disability was changed after the study commenced, as they (quite appropriately) changed from the Bayley version 2 to the Bayley version 3. The authors describe these events quite clearly in the text, but as the analysis and the definition were changed after the trial started they shouldn’t be referred to as pre-specified. The authors are using the term to mean specified before the analysis was started, which is of course essential and very important, to avoid picking data that look interesting after they have been collected.
Ending the saga that I mentioned at the beginning now only requires the NeoPROM collaborative to analyze the individual patient data. It’s hard to think that this will give any result other than an increase in mortality with the lower oxygen target.
One other outcome of interest is that in this trial, as in all the others, there was no increase in blindness. This is despite an increase in retinopathy requiring treatment. This was also seen in the SUPPORT trial, but there was no increase in retinopathy in COT. I think this means we can be a bit reassured that the use of carefully targeted saturations in the low 90’s will not lead to a new epidemic of blindness. But we should not be sanguine about the risks of targeting the higher saturation range: treatment of retinopathy is not, by any means, without consequences. Even if we can usually prevent blindness, very severe myopia, loss of peripheral vision, and poor cosmetic results are common.