Who was Bayes, and what did he know about medical research?

I don’t have much detail to answer the first question: he was an 18th-century English mathematician whose work on probability was published after he died. That publication described what is now called Bayes’ theorem, which is a way of combining the prior probability of something happening with the evaluation of new data to arrive at an updated probability. (I think; someone tell me if I am way off base…)

So if you can calculate the probability of something, then you take into account any new information that you find, and recalculate a new probability. In some ways this is how we operate all the time in daily life, but Bayes thought of new ways of doing the calculations.
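To make that updating step concrete, here is a minimal sketch of Bayes’ theorem in Python. The function name and the numbers in the example (a prior of 0.5 and the two likelihoods) are made up purely for illustration; they do not come from any trial.

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Return P(hypothesis | data) from a prior and the two likelihoods."""
    # Total probability of seeing the data, whether or not the hypothesis is true
    evidence = prior * p_data_if_true + (1 - prior) * p_data_if_false
    return prior * p_data_if_true / evidence


# Hypothetical numbers: start undecided (0.5), then see data that are
# three times as likely if the hypothesis is true (0.6) as if it is false (0.2).
posterior = bayes_update(0.5, 0.6, 0.2)
print(round(posterior, 2))  # 0.75
```

Feeding the posterior back in as the next prior is all that "recalculating" means: the same data seen a second time would move 0.75 up to 0.9.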

Anyway, I have heard about incorporating Bayesian probability into clinical trial design for a while, but I don’t recall having seen many examples. The idea being that the usual way of doing a trial is to assume that the two arms of the trial have an equal probability of being preferable (which is sort of like the null hypothesis) then do the trial with a specified sample size, avoid looking at the data until the end of the trial (if possible, but with safeguards built-in, just in case) and then do a test of significance and declare that one arm of the trial was better, hence the benefit of treatment B is proven. A Bayesian trial explicitly incorporates prior probability into the design, encourages adaptive trial designs with flexible sample sizes, encourages repeated looks at the data as they are accumulating, and at the end produces a new posterior probability that treatment B is better than treatment A.

Which all sounds interesting, and I know there are examples of it actually being done, but I wasn’t aware of any perinatal trials.

Here though is a trial of antenatal intervention for urinary tract obstruction called the PLUTO trial. It was a multicenter RCT with a planned sample size of 150 women and their fetuses. After several years they were only able to randomize 31 mothers, so they had to stop the trial, as they probably ran out of both money and patience. Now as far as I can tell the trial team did not put anything about Bayesian analysis into the registration documents, nor the published protocol, but, given that the sample size was so small and the results therefore rather negative, they proceeded with an analysis using Bayes’ methods. They used some estimates of what they thought, before the trial, was the probability that antenatal shunting would be the better treatment, and then calculated the new probability that shunting is better by adding in the new data.

So what did they find in the trial? Seven of the 16 babies randomized to be shunted survived to one year of age; and 3 of the 15 randomized to be treated conservatively, with evaluation and treatment after birth, survived.

That shows what a bad condition this is. Fetuses were eligible in cases of visualisation of an enlarged bladder and dilated proximal urethra, bilateral or unilateral hydronephrosis, and cystic parenchymal renal disease, if the obstetrician was unsure of the best clinical management.

The CONSORT diagram below shows you how horrendously complicated it is to do and then analyze a trial like this.

[Figure: CONSORT diagram of the PLUTO trial]

So of the 16 allocated to shunting, 3 were not shunted, 1 changed their mind and decided to terminate the pregnancy, and there were 3 treatment-related pregnancy losses. Some of those allocated to conservative treatment got shunted anyway, and some others terminated. So how do you decide whether shunting was better or not? I know the ‘correct answer’ is an intention-to-treat (ITT) analysis: you just calculate according to the numbers randomized into each group. But I think there is a good case to be made here for, at least, taking out of the analysis the non-procedure-related terminations, which gives you 7/15 vs 3/13 survivors to 2 years. I think an analysis by procedure actually performed is interesting also, but you always have to be very careful, as you don’t know why the protocol violations occurred; it may be because of clinical factors that might also influence prognosis. Anyway the ‘as-treated’ analysis shows 8/14 survivors who were shunted vs 2/14 conservative.

This is suggestive that maybe the shunting really did help, but it is clearly still a maybe.
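The three ways of counting can be put side by side with a short sketch. The survivor counts are the ones quoted above. The choice of a two-sided Fisher exact test is my own assumption for illustration, since it suits such small numbers; it is not necessarily the test the trial authors used.

```python
from math import comb


def fisher_two_sided(surv1, n1, surv2, n2):
    """Two-sided Fisher exact p-value for surv1/n1 versus surv2/n2."""
    survivors = surv1 + surv2
    denom = comb(n1 + n2, survivors)

    def prob(k):
        # Probability of k survivors in group 1 if allocation made no difference
        return comb(n1, k) * comb(n2, survivors - k) / denom

    observed = prob(surv1)
    lowest = max(0, survivors - n2)
    highest = min(n1, survivors)
    # Add up every possible table that is no more likely than the observed one
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= observed * (1 + 1e-9))


# (shunt survivors, shunt total, conservative survivors, conservative total)
analyses = {
    "intention to treat": (7, 16, 3, 15),
    "excluding non-procedure terminations": (7, 15, 3, 13),
    "as treated": (8, 14, 2, 14),
}

for label, (s1, n1, s2, n2) in analyses.items():
    diff = s1 / n1 - s2 / n2
    p = fisher_two_sided(s1, n1, s2, n2)
    print(f"{label}: {s1}/{n1} vs {s2}/{n2}, difference {diff:.2f}, p = {p:.3f}")
```

In this sketch the as-treated split gives the smallest p-value of the three, and it is also the comparison most exposed to the selection problem described above, so it deserves the most caution.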

The reasons for going into so much detail of this trial (apart from the fact that it is a trial that we really needed, and it is a great shame that they were unable to get a bigger sample) is that the authors then incorporate a Bayesian analysis, they determine a prior probability that shunting would be better, then add in the new results, and then calculate a new probability that shunting really improves survival. They calculate the prior probability as being 0.79 that shunting is preferable, and with the new trial data they state that the posterior probability is now 0.86.
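One way to see how much work the new data did is to convert both probabilities to odds, because Bayes’ theorem in odds form says posterior odds = prior odds × Bayes factor. The sketch below back-calculates the factor implied by the two published probabilities. That "implied Bayes factor" is my own arithmetic and is not a figure reported by the authors.

```python
def odds(p):
    """Convert a probability to odds in favour."""
    return p / (1 - p)


prior, posterior = 0.79, 0.86  # probabilities quoted from the trial report

# Ratio of posterior odds to prior odds: how far the data shifted the odds
bayes_factor = odds(posterior) / odds(prior)

print(f"prior odds           {odds(prior):.2f}")      # 3.76
print(f"posterior odds       {odds(posterior):.2f}")  # 6.14
print(f"implied Bayes factor {bayes_factor:.2f}")     # 1.63
```

So the 31 randomized pregnancies multiplied the odds that shunting is better by roughly 1.6, a modest shift, and most of the final 0.86 is carried by the prior.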

So this is a Bayesian analysis of the results of a trial that was planned as a more conventional trial to determine superiority. My major problem with the analysis in this trial is that the prior probability is based on asking ‘experts’ what they thought. I think that even prior probabilities should be based on some sort of data. Having said that, the prior observational data were actually more positive than the experts’ opinions, which just shows you how careful you have to be about observational data: it can be seriously biased.

Trials actually planned using Bayesian methods are also interesting. I know little about this, so I was pleased to find this document, which reproduces what was taught to participants in a workshop on clinical research methods. It is a great introduction to what clinical research is all about, and there is quite a long and detailed section about Bayesian trial design.

I have been wondering about trials like the future lactoferrin trials, for example, if we try to calculate the likelihood that lactoferrin will prove to be very potent at preventing nosocomial infections (for which we actually have some hard data), and incorporate that into trial design, perhaps we can reduce the required sample size for future trials, and get an answer sooner.

By the way all the articles about Bayes use the same image, which probably isn’t him!

Posted in Neonatal Research | Tagged , | Leave a comment

10 Things Having A Preemie Has Taught Me About Life

One of the ‘parent of preemie’ blogs that I read is ‘Cheering on Charlie’, from a parent of a 26 weeker who is now a beautiful little girl of about 16 months (just about 1 yr old corrected!).

I really appreciated one of her recent posts, so I thought I would send out a link:

10 Things Having A Preemie Has Taught Me About Life.

Her honesty and insights are refreshing.


The fetal microbiome

One of the correspondents on the ‘Phenomena’ website is Carl Zimmer, who has a newish column at the NY Times about the findings which show that the fetus is not sterile. This is not completely new, but I think it is still not well understood. I thought until recently that the fetus and placenta were bug-free, but data from a couple of years ago about placental PCR analysis, showing lots of bugs, especially from premature deliveries, made it clear that I had been misinformed.

This article is a good introduction for the non-specialist.


New Book


I seem to have spent much of this year writing review articles of one type or another. One of them was for a new textbook on nutrition of the preterm neonate, edited by Sanjay Patole from Perth.

There are many excellent chapters, including an incisive evidence-based review of gastro-esophageal reflux in the newborn infant (author K J Barrington).

It has actually been published and printed very quickly, so it is quite up to date, and worth a look.


Pre-SUPPORT, what did we really know?

One of the misplaced criticisms that have been made about the SUPPORT trial and the consent forms was that we should have known before the trial that there would be a difference in mortality, and that there would be a difference in retinopathy.

Is that true?

What was the actual state of knowledge before the oxygen saturation targeting trials? I wanted to give a brief description of the history, to explain in more detail to those who read this blog but who are not aware how little we knew before SUPPORT and the other O2 trials.

There had indeed been previous trials of different levels of oxygen, but they were from 2 separate, very different, eras in neonatal care. Among the first Randomized Controlled Trials (RCTs) in medical care of the newborn infant, the first group of trials of restricted versus liberal oxygen therapy included a total of just over 400 babies. They compared prolonged use of very high oxygen such as >80% in one trial, or >50% for at least 28 days in another, to an alternate approach of limiting oxygen therapy, usually by means of a maximum FiO2 that could be given. One of those trials for example only allowed any oxygen if the baby was cyanosed, and then the maximum that could be given at any time was 40%. So even if the baby was blue and bradycardic, an increase in oxygen concentration was not allowed.

At this time, assisted ventilation was not available, blood gases were difficult to perform (often giving results the next day), there was no CPAP, babies were often starved for the first few days of life to avoid regurgitation even though there was no intravenous nutrition, and only infants who survived the first 2 days were entered into the largest of these studies; of course, the large majority of babies with significant lung disease had already died by that time.

The results of these trials were emphatic. Very prolonged, very high concentrations of oxygen caused retinal damage in moderately preterm infants. Such high oxygen treatment had become the standard of care despite a lack of evidence of safety or efficacy. Not only did the liberal oxygen groups have much more severe retinopathy of prematurity (RoP), there were no apparent adverse effects from restricted oxygen, specifically, despite what we would now consider inappropriate restrictions in the low O2 group, there was no increase in mortality. It may be that there was significant O2 toxicity in the high oxygen groups leading to pulmonary and other multi-organ injury to balance the adverse effects of the O2 restriction in the low oxygen groups. Or perhaps the studies were just too small to have the required power.

It was only later, using epidemiologic data, that there were suggestions that mortality may have been increased by restriction of oxygen. If it is indeed true, it may be because physicians began limiting oxygen exposure from the moment of birth, including during the first 2 days even though the strategy had never been tested during that interval. It may also be that the epidemiologic analysis was misleading, the paper by Cross (Cross KW: Cost of preventing retrolental fibroplasia? The Lancet 1973, 302(7835):954-956.) has been widely quoted, but the analysis is rather questionable, being based on projecting what the trends in neonatal mortality might have been if O2 had not been restricted.

I well remember reading, in the 1970’s, a newspaper article (I seem to remember it was in the Sunday Times) about the history of clinical trials which reported the horrific tale of the trial that had blinded 20 babies. I was shocked, as a teenager wanting to go into medicine, that doctors could do such terrible things. Of course the truth was that the blinded babies were in the usual treatment group, and the restricted oxygen group, the newer therapy, had less blindness and the trial saved the eyesight of many thousands of babies as a result. But the newspaper wasn’t about to let a little thing like the facts get in the way of a good story (clearly, this still happens!).

After this, manufacturers of incubators used to make them with holes in the sides, so that when you gave oxygen, usually just by attaching a tube to the oxygen inlet in the incubator and turning on the flow, the O2 concentration would stay below 40%. There was often a large red flag that became visible behind the incubator when you turned the mechanism to occlude those holes and risked exceeding 40%! Air/O2 mixers often required a manual override if you wanted to give more than 40% O2. I presume this was to try and shift the legal liability from the manufacturers to the doctors, if there was a case of RoP.

A few years later, when routine blood gas measurement became available, and other forms of medical supportive therapy were also developing, a second attempt was made to address oxygen therapy. This second wave of trials (more of a ripple) included only 2 studies, with a total of only 170 infants. The low oxygen groups had a PaO2 which was targeted to stay below 50 mmHg in one study or below 40 mmHg in the other (if a capillary blood sample was used the PO2 was kept below 35 mmHg), while it was kept around 100 mmHg in the comparison groups. Both studies suggested that aiming at a lower PO2 was safe, and seemed to lead to improved resolution of lung disease. In aggregate there was a slightly lower mortality in the babies who had more restricted oxygen therapy, which was not statistically significant. There was no effect on RoP in these trials; indeed, by avoiding frank hyperoxia, there was no significant RoP in either group.

Interestingly Dr Usher noted in the report of his trial, only ever described in a book, and sadly never published in a peer-reviewed journal, (which makes it quite hard to get hold of, thanks to Lisa Askie for sending me a copy) that even the low oxygen group often needed more than the arbitrary limit of 40% oxygen.

And those were the only reliable controlled data available pre-SUPPORT. There never was an O2 targeting trial using continuous monitoring, even when transcutaneous PO2 monitoring was introduced the limits to be aimed for were selected arbitrarily. In the early 1980’s pulse oximetry was introduced for continuous monitoring. The direct continuous non-invasive measurement of hemoglobin oxygen saturation was a great advance. The main disadvantage of pulse oximetry is the difficulty in avoiding hyperoxia; as the devices are accurate to plus or minus 5%, a measured saturation of 94% could actually be a saturation of 99% and a very high PaO2, in the range shown to be associated with RoP. This was considered to be less important than the advantages of truly continuous monitoring. Several workers suggested keeping the saturation below 95% or below 94% to reduce the risk. However, the failure to perform early trials examining different goals of oxygen therapy with continuous monitoring meant that there was no clear understanding of what level of oxygen saturation to aim for, and whether any particular saturation range was better than another.

One center in the UK decided that, because preterm babies ‘should’ still be in the uterus, where their saturation might be as low as 70%, there was no need to give oxygen unless the saturation was below 70%. Others were worried that a saturation too low might affect pulmonary vascular relaxation and they therefore kept saturations much higher, some with no maximum limit at all.

After the provocative publication from the UK, which showed very little RoP and no apparent harms, in particular no higher mortality, there were a large number of observational studies, mostly showing advantages to the development of RoP of lower saturation targets, and none showing increased adverse effects, in particular none showed any increase in mortality with their lowered saturation targets, nor any augmentation in adverse neurological consequences in the long term. However, the saturation targets examined in those observational studies all varied, and in most the lower saturation limits were in the low 90’s. The reduction in retinopathy was thought to be due to a reduction in frank hyperoxia. Further reductions to the high 80’s were suggested as a way of reducing oxidative damage (then being more widely investigated as a causal factor in other conditions such as bronchopulmonary dysplasia and even intracranial bleeding), and perhaps further reducing the incidence of RoP, but it was not known if this would be effective or safe, nevertheless, based on the observational data and the very old studies some centers did take up even lower saturation ranges.

I must be clear, keeping the saturations below 95% was considered adequate to prevent hyperoxia, and seemed to probably reduce RoP. It was thought by some that oxygen was no longer a major culprit in the causation of RoP as long as hyperoxia was avoided, so there were many studies looking at other potential risk factors, hypocarbia, hypercarbia, blood transfusions, and inadequate nutrition were all considered to be possibly important (and may indeed be important). It was not at all clear whether or not dropping the saturation ranges even lower, to below 90%, would actually have any additional benefit in reducing RoP.

And retinopathy had become a problem again. Since the introduction of continuous monitoring much more immature babies were surviving. The kinds of babies tested in the 2 early groups of trials almost never developed retinopathy; it was mostly babies of 26 weeks gestation and below who were developing the condition. Also, universal retinal screening was refined, and was picking up many more cases.

This is the situation that led to the enormous variations in O2 saturation targeting. Some centers felt that a saturation of 90 to 95% was the best idea, largely avoiding hyperoxia, and not wanting to risk hypoxia. So they would have a unit routine. The alarm limits for all preterm babies in the NICU would be set to, say, 89 and 96. Then the nurse would adjust the oxygen delivered, sometimes every couple of minutes, to stay within the target range, and the alarms would frequently sound, when the baby passed beyond those limits. Every preterm baby in the NICU would have the same limits, until they no longer needed oxygen. As the babies’ lungs improve, they will often have saturations above 95% even without supplemental O2, sometimes up to 100%. So if the oxygen requirements drop to 21% O2, and the saturation is above the target limit, the high saturation alarm is switched off. Some babies in the NICU never need oxygen, they may right from the start have high saturations even in 21% oxygen. We accept this because the PaO2 of a baby in room air, 21%, cannot easily exceed about 105 mmHg. Which was considered safe. It is very rare for a baby who has never had supplemental oxygen to develop RoP, even though their saturations may be above 95% for their whole life.

Another center examining the same data would set their oximeter alarm limits to a lower range. With that approach fewer babies ever get supplemental oxygen, they will tend to come out of oxygen earlier, but they will continue to have the same saturation target whenever they need oxygen until they are nearly ready to go home.

Unfortunately, some families who participated in SUPPORT have been misled by pressure groups, consisting of doctors who have never worked in an NICU, and PhD ethicists who have never even set foot in an NICU or taken the trivial effort of talking to someone who has. One such family who spoke on the day of the recent OHRP meeting noted (I paraphrase) ‘I didn’t realize that if my baby was in the trial that they would not receive the usual care of adjusting the saturation range according to his needs’. Unfortunately, whoever told them that lied to them.

As part of the preparation for the conference in Sydney where I just presented, I pulled out the PowerPoint presentation that I gave several years ago, when we were planning the O2 trials, and I was asked to present the rationale for the COT, Canadian Oxygen Trial (I think it was at the annual Canadian Paediatric Society meeting). In that presentation were two slides, one was the claims of the doctors who were happy using higher saturations, the other included the claims and rationale of those of us who were advocating lower saturations. Both groups claimed that their favorite range would reduce death, reduce BPD, and reduce morbidity in the long term. The low saturation proponents were in addition claiming that retinopathy might be reduced, the high saturation groups were claiming that retinopathy would not be further affected by going lower!

Yes we really were completely unsure what ranges were the best.


Travelling again

I am, for the next few hours, in Sydney Australia, at the end of a productive trip. We had 2 workshops, 1 on the current status of probiotics, the other on the question of consent for perinatal trials, focusing on the SUPPORT trial.

Both were productive I believe. And we will try and produce a report of each, and also another document which will be our recommendations. One of the best parts of the workshops was the involvement of several parents. Their contributions were the most important, and the use of social media by Melinda Cruz, the CEO and founder of the Australian Miracle Babies foundation allowed others to be involved in the discussion who weren’t even there… the wave of the future, or maybe of the present!

After it was over I took a ferry ride across Sydney Harbor, and on the way back a cruise liner was pulling out of the harbor behind the Opera House. It was a long exposure on a moving boat, but this is the least blurry of the shots that I got.

[Photo: cruise liner pulling out of Sydney Harbor behind the Opera House]


Does gestational age matter?

Taking a break from the SUPPORT brouhaha for a moment, here is a great systematic review from Greg Moore and colleagues in Ottawa. (Moore GP, Lemyre B, Barrowman N, Daboval T: Neurodevelopmental outcomes at 4 to 8 years of children born at 22 to 25 weeks’ gestational age: A meta-analysis. JAMA Pediatrics 2013,)

I have mentioned many times on this blog the problems with relying on the 18 month or 2 year Bayley score for determining if outcomes are seriously limited or not. Most ex-preterm infants with a Bayley MDI under 70 at 2 years do not have a cognitive impairment when you retest them at school age. Even infants with very low scores (under 50) will often be in the normal range if you test them later, at an age when they are old enough that you can really say something about intellectual ability. Testing later, say at 3 years, and using more extreme cutoffs makes the testing more predictive, but there is still a lot of uncertainty, once children approach school age then testing becomes better at discriminating between children with serious limitations and those without.

A lot of the previous research has shown very little effect of week by week changes in gestational age on outcomes. Epicure follow up at 6 years, for example, showed almost no gradient of outcomes between 23 and 25 weeks. Even large regional studies such as Epicure may, however, be suffering from a lack of power.

So Greg Moore and his co-authors did a systematic review of all the good quality studies that they could find that studied extremely preterm infants at early school age (4 to 8 years). They looked for cohort studies published after 2004, with follow up rates over 65% using standardized testing.

The results I want to concentrate on are the effects of increasing gestational age on the prevalence of impairment.

The authors divide impairment into moderate (more than 2SD below the mean on IQ testing, ambulant cerebral palsy, GMFCS 2 or 3, substantial visual impairment (worse than 20/40) or hearing restored with amplification) and severe impairment (IQ more than 3 SD below the mean, CP with GMFCS 4 or 5, no useful vision worse than 20/200 or profound hearing loss). There are nearly 900 babies in all in the 9 cohorts reported with gestational ages from 22 to 25 weeks.

When you have this large a number of patients, there is some reduction in rates of moderate impairment from 22 to 25 weeks, the frequency is around 24% for babies born at 25 weeks and increases by 6.5% for each week less, a statistically significant increase at p<0.01, but a smaller increase than you might think.
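To see what that gradient looks like week by week, here is a small worked sketch. It assumes the 6.5% is an absolute increase in percentage points for each week less, and that the trend is a straight line from 25 down to 22 weeks. Both are my assumptions for illustration, and the figures are approximations, not the review’s own per-week estimates.

```python
RATE_AT_25_WEEKS = 24.0   # % with moderate impairment at 25 weeks (approximate)
INCREASE_PER_WEEK = 6.5   # assumed: absolute percentage points per week less


def moderate_impairment_rate(weeks):
    """Approximate % with moderate impairment under a straight-line assumption."""
    return RATE_AT_25_WEEKS + INCREASE_PER_WEEK * (25 - weeks)


for weeks in (25, 24, 23, 22):
    print(f"{weeks} weeks: about {moderate_impairment_rate(weeks):.1f}%")
```

Under those assumptions the line runs from about 24% at 25 weeks to about 43.5% at 22 weeks, so even at the lowest gestational ages more than half of survivors would fall outside the moderate impairment category.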

When you look at severe impairment, there was no significant effect of weeks of gestational age, being 14% at 25 weeks, up to 17% at 23 weeks, (the numbers at 22 weeks are really too small to say anything (n=12)).

So the majority of survivors are unimpaired or mildly impaired at any gestational age, and few (too many, but still few) are severely impaired.

This is very interesting data, which helps to clarify the outcomes of these patients. I think such a low frequency of severely abnormal outcomes, and the lack of an effect of gestational age are very important. If there is no substantial effect of the number of weeks of gestation, and the proportion who are severely impaired is small (and the individuals affected unpredictable before birth) then this should not be part of our considerations about whether or not we should start intensive care.

Interestingly, both this article and the editorial which goes with it, discuss the question of using long-term outcome data as an issue in the decisions of whether to initiate intensive care for the extreme preterm.

I am coming to the opinion that the incidence of profound disability is so low (and I am the first to recognize that 15% is way too high), so unpredictable before birth, and so poorly correlated with gestational age, that we should not include this as a consideration in our decision making.

My good friend and colleague Bill Meadow has an article shortly to be published in ‘Neoreviews’. I hope he won’t be upset with me if I steal his closing paragraph, but he expresses, much better than I can, an opinion which is so close to mine, and which indeed reflects my own personal experience as a parent of an extremely preterm baby. We were incredibly fortunate to have Sophie Nadeau, Gene Dempsey, and the rest of the team at the Royal Victoria Hospital in Montreal helping us through those very difficult times.

Perhaps antenatal consults oughtn’t really be about helping parents make life-and-death decisions at all.  Perhaps they should be about reassurance and human kindness.  We (the neonatology team) are here for you in your moment of unexpected and indescribable fear.  We may not be able to help you with your decision prior to delivery – the data are just too ambiguous.  But we’ll be with you every step of the way.  And if things turn bad during the NICU stay, we’ll be there – supporting your autonomy and helping you make the difficult decision of what is now in your baby’s ‘best interests’.  That may be the best we can do.

That sounds to me like a pretty good ‘best’.


Marche pour les prématurés

Last Sunday was the annual ‘marche pour les prématurés’ in Montreal, and several other cities across Québec. Some other towns have their walks this coming Sunday (see the website of Préma-Québec, who organize them). This year the walk was covered by the RDI TV station, who put a short section on their evening news. You can see it here (in French). Apart from the introduction, with its obligatory statements like ‘the happiness of a birth is replaced by anguish when the baby is premature, often the parents have to live with the sequelae of saving their little miracle’ (Eugh), the rest of the presentation was rather well done. Of note there is an interview with our nursing unit chief and her daughter, born at 29 weeks, which Roxanne handles very well. And just before the end there is a flash of Violette picking up her badge with 24 written on it.


Blurring the line between patients and research subjects

The US Department of Health and Human Services held a public meeting about consent issues for research on August the 28th, largely as a result of the controversy surrounding SUPPORT. It was called ‘Matters Related to Protection of Human Subjects and Research Considering Standard of Care Interventions’.

It was streamed live on the internet, but I only got to watch an hour or so, I suppose that much of it will be available in various forms in the future, hopefully all in one place on the HHS website.

Some of the presentations are already available elsewhere, for example the presentation by 2 people who call themselves ‘historians of medicine and human subjects’ research’. They are Alice Dreger, PhD, Professor of Medical Humanities and Bioethics, Northwestern University, Chicago, IL and Susan M. Reverby, PhD, Professor in the History of Ideas and Professor of Women’s and Gender Studies at Wellesley College, MA.

The document is apparently a written version of their testimony to the panel set up by the HHS. It is a great example of the false dichotomy between clinical care and research; a lack of understanding of standard of care; a series of pontifications about acute care neonatology which only show they haven’t got a clue; and a misunderstanding of risk.

Drs Dreger and Reverby commence by claiming that doctors are being pushed into performing unethical research because of the pressure to publish, by the emphasis placed on academic productivity, and, even though they pay lip service to the idea that we may be motivated by a beneficent desire to do good, apparently that is being perverted by today’s academic climate. They of course give no data to support these offensive claims. There is of course also no acknowledgement that professional ethicists wholly supported by university salaries have academic pressure effectively coercing them into making scurrilous accusations about medical ethical lapses, nor that their livelihoods are threatened by physicians who have re-appropriated clinical ethics, so it is really good for their career prospects if they can claim that doctors are unethical and can’t be allowed to police themselves. (And in fact we don’t, IRBs all have non-medical and lay members, and more and more granting agencies require participation from families).

I will quote several illuminating passages:

As the OHRP found, parents should have been told that randomization into restricted trial arms in this study could potentially increase (or decrease) the odds that their babies would suffer death and particular impairments. The protocol and publications show that the study was designed to determine just those risks. Consent was so poorly handled in this trial, OHRP should have required—and should still require—that the involved institutions now inform parents what they should have been told before enrollment—that being in the SUPPORT study was likely to have changed how their baby was treated in the neonatal intensive care unit (NICU) and might have increased the risk of death and disability.

Of course, most of that is not true: parents were informed that this was a study to examine different oxygen saturation targets, and being in the study did not increase risks of death or disability. There was, I re-iterate, no difference in disability between the groups. Although a potential difference was one of the reasons the study was done, there was no reason to believe that the 1300 babies enrolled in the trial would have more deaths or disability than 1300 similar babies treated outside of the trial. There still is no reason to believe that mortality was higher for the group as a whole; it is likely that as a group they benefited.

Parents were of course told that being in the trial might change how their baby was treated, that is what the consent process is all about, that is what clinical trials such as this one are all about; not continuing to do something when we are unsure what to do, but randomly comparing alternatives, which might not have been what the doctor would otherwise have done, but would have been acceptable alternatives.

Dreger and Reverby are very agitated about the use of the term Standard of Care. They believe that the use of this term is ‘most egregious’, seemingly being ‘designed to reassure parents who might enroll their very premature babies that it would have made no real difference whether or not they enrolled’. It was of course a statement of fact. Oxygen saturation targets differed greatly according to different hospital protocols, none of which were based on good evidence regarding clinical outcomes, and therefore were within usual acceptable standards of care. The ranges tested in SUPPORT were less extreme than some in active use at the time.

Standard of Care is sometimes used as a legalistic term (the Standard of Care for a ruptured appendix is immediate surgery, so you are negligent if you didn’t follow the Standard) but often in medicine there is no such Standard, and a wide range of approaches are within current acceptable limits, and can be considered to be standard of care. That is how the term was used here, and it is accurate, both oxygen saturation ranges were within the limits of acceptable contemporary practice, they were all within ‘standard of care’.

If the best clinical judgment in NICUs was not evidence-based because we lacked the data the SUPPORT study was designed to generate, then the parents needed to be told that.

But they were told that! They were told: we don’t know which saturation range is best, that is why we are doing this study. In fact parents going into the trial were better informed than those who were not asked for consent. How many parents outside of the trial were told that the use of oxygen was extremely variable around the world, that the limits chosen in the unit where their baby would be treated were entirely arbitrary, and the saturation target limits chosen by their hospital might increase the risk of retinopathy or of disability?

Even if the same range of risks existed in ordinary NICU care, parents needed to understand their baby might be subject to a different subset of risk odds, and ultimately a different set of harms, via enrollment into this randomized clinical trial (RCT). The consent forms for the SUPPORT trial should have explained what care outside the trial would look like, and what risks were associated with that care.

I think you might expect that I don’t agree with this at all.

The purpose of consent is to ensure that reasonably foreseeable risks of the research are explained. If the same range of risks exists in ordinary NICU care, then to exaggerate the risks of being in the study by pretending that those risks are due to the research is untruthful. Trying to guess which ‘subset of risk odds’ and which ‘different set of harms’ parents need to understand is usually impossible until after the research is done.

We are also alarmed that pregnant women and their partners were asked to consider enrollment of their babies in this study just at the moment when those women were facing the very premature birth of their child. It is not clear to us that any mother, or her partner, in such a situation could have the mental wherewithal to seriously consider enrollment of their extremely premature baby in a major trial, particularly one that might change risks of death or disability. We hasten to remind those here that, much as we would dearly love data on certain interventions, there  are sometimes trials that simply cannot be done ethically.

Yes of course parents are stressed when their baby is in, or likely to be admitted to, intensive care, but we simply cannot accept the idea that there are some issues that cannot be investigated ethically as a result of their stress. As a group of people who talk to stressed parents every day of our professional lives (except when we are writing blogs) neonatologists are much more aware of this than most academic ethical thinkers. I have spoken to parents at 2 am who have just had a baby admitted to the NICU, and yes you can get a valid research consent in those circumstances. If the alternative is to give up, then I refuse that alternative.

we also must insist that discussing the SUPPORT trial as a case of so-called “standard of care research” is just plain wrong. Several of the experimental interventions in the SUPPORT study did not represent commonly used clinical interventions. For example, we are unaware of any NICU that would carefully seek to maintain a very premature baby at an oxygen saturation level of 85-89% regardless of the baby’s clinical status, as happened to babies in one arm of the oxygen saturation intervention. We are unaware of any NICU that would regularly withhold surfactant from very premature babies. Surfactant is a treatment widely believed to make an enormous difference in survivability of extreme prematurity. We are also unaware of NICUs where practitioners would, outside of research, be blinded as to the real oxygen saturation levels of children they are treating.

Unlike these authors, the SUPPORT investigators actually knew what goes on in an NICU. And yes, before the SUPPORT trial was published there were (and maybe there still are) NICUs that maintained a baby’s saturation in that range for as long as they needed oxygen, regardless of other issues.

The statement about surfactant is perhaps the most revealing of the authors’ ignorance. In this trial surfactant was either given routinely after intubation in the delivery room or according to a different protocol which tried to avoid intubation until it was clear that the individual baby needed it. ‘Withholding life-saving surfactant’ sounds terrible doesn’t it, especially if you are totally uninformed; the babies in the restricted surfactant group DID BETTER. There was a 9% reduction in the rate of BPD which was almost significant. Infants who had their ‘life-saving surfactant’ withheld less frequently required intubation or postnatal corticosteroids for BPD (P<0.001), and required fewer days of mechanical ventilation (P = 0.03), so they were better off if they were in the ‘surfactant withholding’ arm of the trial. Babies benefit from surfactant if they need it, but exactly when you can be sure they need it still isn’t clear, even after SUPPORT. This landmark study showed that if you don’t intubate for surfactant until the baby reaches 50% oxygen, then they do better than giving them all ‘life-saving surfactant’.

As I said, I think this statement is very revealing: it really shows how little the authors know about neonatal care, but they still feel qualified to malign the doctors who performed this trial. ‘We are unaware of any NICU that would regularly withhold surfactant from very premature babies’. That could only have been written by someone who has no conception of what the study was about, and never bothered to ask a neonatologist.

The statement about the masking of the saturation monitors is just bonkers. If the 2 approaches are clinically acceptable then to compare them in a masked fashion just makes the study more reliable. Of course you don’t do that outside of a trial, but it doesn’t make the trial non ‘standard of care’.

The authors object to the idea that health care should become

a so-called “learning health care system,” in which essentially every patient becomes a subject (as this) requires a system where the line between patient and research subject necessarily becomes blurry. We strongly object to this idea.

Well sorry, the line is already blurry. We must continually learn about what we are doing. There are many ways in which evaluation of patient outcomes is used to try to improve the care that we give. Even the most unreliable reasons for changing how we deliver care (such as what happened to the last patient you treated) are based on data from other patients. The idea of building learning health care systems is to make that process reliable, and to benefit everybody, including current patients. If I analyze anonymized retrospective data to see if one hospital has better outcomes than another, then all of the patients whose data I look at are contributing to that process. In that situation the process is without any risk to the individual, but it is limited: it can only give indications, and potential questions to answer.

Formally comparing current treatment approaches in randomized trials still treats all the patients as patients, who should, and in my experience do, receive compassionate thoughtful care. In fact, I would like to restate what I said above, there is no line, even a blurry one, between being a patient and a research subject, you can be both.


Comparative Effectiveness Research: a parable

Two neonatologists work in the same NICU. One of them routinely starts assisted ventilation of babies with volume ventilation, the other starts with pressure ventilation. There is controversy in the literature and in practice among currently active neonatologists; some preliminary data suggests that there might be advantages of the newer mode of ventilation. On the other hand a majority of neonatologists continue to use pressure ventilation. Observational data from the NICU where these two doctors work does not show any difference in outcomes of one doctor compared to the other.

The pair decide to participate in a large multi-center trial, comparing their two favorite approaches. Because any differences that arise are expected to be modest, the sample size is large, over 1000 extremely preterm infants will be studied. The 2 doctors realize that the data to support one approach over the other is weak, and the situation in their NICU is evidence that a definitive trial is needed.

The primary outcome of interest is lung injury. Because infants who die cannot develop lung injury, the primary outcome variable of death or bronchopulmonary dysplasia (BPD) is used, but the prior data show no difference in mortality; the only difference that people think may be found is BPD.

I think some of the controversy regarding comparative effectiveness research comes down to a difference in the answers to what is, to some extent, a philosophical question: what is the risk of being in such a study?

Is it the risks of all the things that can happen to sick extremely preterm infants: death, IVH, sepsis, NEC, PVL, Retinopathy as well as BPD in the two groups? Let us suppose that there is no prior reason to suppose any difference in the other outcomes, should the consent forms describe these outcomes as potential risks of the intervention? On the other hand the published systematic review does state that serious hemorrhage and PVL, when put together as a combined outcome, are more frequent in the volume group, but those data are really questionable, the pathophysiology of the 2 injuries are different, putting them together can be questioned, and the subset of studies which are most relevant for the design of this new trial do not confirm the increase, also observational data don’t show more brain injuries in hospitals that tend to use pressure rather than volume ventilation.

Public Citizen would probably argue that pressure ventilation has the risk that there may be more hypocapnia, which can lead to more brain injury, they would argue that, on average, intra-thoracic pressures are higher with pressure ventilation, which might affect cardiac output, etc. They would probably say that there are more breaths with excessive volume delivered during pressure ventilation which might lead to pneumothorax.

They would also argue that with volume ventilation there may be ETT leaks which lead to underestimation of the delivered tidal volume and consequent delivery of excessive pressure which might lead to pneumothorax.

They would state that all these potential  problems in the 2 groups must be described in detail in the consent form, supposing that they are risks of the research project.

They would also include in their objection letter the following bizarre statement ‘The random assignment of the premature infants to one of the different modes of ventilation that are currently used — independent of certain clinical factors that would normally be taken into account in making ventilation decisions as part of routine care of an individual infant — clearly has the potential to alter the care that the premature infants would otherwise receive as part of usual care if they are not enrolled in the trial.’

(The adolescent response comes to mind ‘Well, Duuuuh’.)

Well of course, that is the whole point of doing the study, instead of babies haphazardly getting one treatment or another depending on who is on call, which NICU they are in, the time of night, the availability of equipment and how opinionated the respiratory therapist is, they will be randomly assigned, so that all those other characteristics are balanced; which means that some babies that would have had pressure will get volume, some who would have had volume will get pressure.

It must also be emphasized that in no study are all eligible babies enrolled. This occurs for several reasons, but one of them is that if a doctor thinks, for an individual baby, that one of the study interventions is clearly preferable, they have a moral obligation to treat the baby with that particular treatment. Hopefully such decisions will have evidence-based reasoning. So the ethical requirement for equipoise must be understood to include equipoise for the individual patient. In other words, there must be no rational objection to this particular baby receiving either volume or pressure ventilation in order for them to be enrolled.

Furthermore, if during the study it becomes clear that the randomly assigned form of ventilation is not working for your patient then you must stop the study intervention and do what you think is best. This is not a theoretical concern, it happens all the time. Study protocols are designed to minimize such violations, by allowing other therapies in defined circumstances, for which the data are collected. All studies have some protocol violations, sometimes they are just mistakes, but sometimes it is because the clinical situation makes the doctor use treatment which is contrary to the protocol, which is as it should be if they have a good reason for supposing that their patient will be better off.

The consent forms for this hypothetical trial (actually a trial that really, really needs to be done! See my previous post) state that there are no additional risks of being in this trial in excess of routine care.

The consent forms state that ‘we don’t know if there is a difference in BPD, but that is why we are doing the study to find out, so we will count how many babies survive without BPD in each group.’

I think that would be appropriate.

The implication of the SUPPORT controversy, however, is that some people think the consent forms should state that the babies in one group are more likely to have BPD and that the babies in the other are more likely to have brain injury, that pneumothoraces are a risk of participation in the trial, and that babies managed outside of the trial will have individualized therapy designed by their doctors to give them the best outcome. Such a consent form would be much more misleading than stating that there are no additional risks to being in the trial.

Let us hypothesize a possible outcome of the trial.

At the end of the study there is a difference in BPD, 25% in the volume group and 35% in the pressure group, which is statistically significant.

The babies enrolled in the study have a 45% incidence of that list of serious complications above, slightly less than the 50% incidence in contemporary babies not in the trial.

So overall 30% of the babies in the study developed BPD. The same 1000 babies treated outside of the study, would probably have had the same overall incidence of BPD, (or more likely they would have had more, as just being in a trial has benefits).

The babies who were started on volume ventilation end up having the lower rate of BPD; if they had received volume ventilation outside of the trial, which would have been true for half of them, they would have had that same rate of BPD. But the other half, who would have been pressure ventilated, have, as individuals, benefitted from the trial.

The babies in the pressure group end up having a higher rate of BPD, but those who would have been treated with pressure ventilation outside of the trial, would have had that higher rate of BPD in any case. The individual babies who ‘would have’ been treated with volume ventilation, but get randomized to pressure ventilation, are more likely to have BPD than would have been their ‘fate’ if they had not been in the study.

Of course, we did not know before the study that there was going to be a difference, and there is no way to know for an individual baby which group they ‘would’ have been in outside of the trial. That is starting to get a bit metaphysical.

So can we say post hoc, that there was no additional risk of being in the trial? I would say yes.
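For anyone who likes to see the arithmetic, here is a rough sketch of the reasoning above in Python. The 25% and 35% BPD rates and the half-and-half split of usual care are the hypothetical figures from this parable, not real data, and the function and variable names are just mine for illustration. It also assumes that randomization is unrelated to which mode a baby would otherwise have received.

```python
# Hypothetical BPD rates from the imagined trial result in this parable.
BPD_RATE = {"volume": 0.25, "pressure": 0.35}


def expected_bpd(p_volume):
    """BPD risk for a baby who gets volume ventilation with probability p_volume."""
    return p_volume * BPD_RATE["volume"] + (1 - p_volume) * BPD_RATE["pressure"]


# In the trial: 1:1 randomization between the two modes.
in_trial = expected_bpd(0.5)
# Outside the trial: the parable supposes half would have had volume anyway.
outside_trial = expected_bpd(0.5)

# Change in risk for each (usual care, assigned) combination. Each combination
# covers a quarter of the babies under the assumptions stated above.
changes = [
    BPD_RATE[assigned] - BPD_RATE[usual]
    for usual in BPD_RATE
    for assigned in BPD_RATE
]
average_change = sum(changes) / len(changes)

print(f"BPD risk in trial: {in_trial:.0%}, outside trial: {outside_trial:.0%}")
print(f"Average change in risk from enrolling: {average_change:+.2f}")
```

Under those assumptions the risk is 30% either way, and the gains of the babies who switched to volume cancel the losses of those who switched to pressure. If usual care outside the trial were not split evenly, the outside figure would shift, but nobody could know in which direction before the trial was done.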

One pressure group will criticize the study, because there was not a 3rd arm of the trial where the doctor decides, based on his gut feeling, which mode of ventilation to use. Another states that the consent forms do not mention that there might have been a difference in mortality, which should be revealed to the parents.

The pressure group states that babies in the trial may receive a treatment which is different to what they would have had if they were not in the trial. The investigators reply ‘exactly, that is the whole point’.

The need for comparative effectiveness research is because we have many situations where the variations in practice are large, the variations in outcomes are great, and it is not clear which variations in practice are related to which variations in outcomes. Does randomly comparing 2 different modes of therapy in routine use pose risks? No more than (and arguably, less than) the risks of haphazard variations in care leading to different modes of therapy being used.

If you are admitted to an NICU which is not in the trial you might get volume or pressure ventilation. If you are admitted to an NICU participating in the trial you will be asked to join, and if you consent you might get volume or pressure ventilation.

So it all comes down to where you think the risks lie. If you calculate risk based on the overall outcomes of the participants there is no increase in risk. If you calculate risk after the study is over, when you know the results, and you compare the outcomes of the group who had the worse outcome with their potential outcomes had they not been in the trial, then their incidence of the adverse outcome was higher (and the other group  was lower). But that is not ‘risk’. That would be like saying that driving at 100 kph through a built up area is not risky if, after you get home, it turns out you didn’t actually hit anyone. Or that giving up smoking doesn’t decrease your risk if you end up with lung cancer anyway.

I think the way forward is to consult parents. I think we need to find situations in which parents would agree that a waiver of consent is reasonable. If parents don’t find that to be a reasonable option for a study, then they should be involved in improving the consent process. By which I do not mean making the forms longer. The SUPPORT forms were already almost unreadable, not because of the reading level, but because conveying the complexity of the issues, in the detail that is already required, led to forms which were 9 pages long or more. It is entirely appropriate to state that a study such as this does not increase risk, but how do we talk to parents about the implications of the fact that we might actually find an important difference between the groups once it is over?

Should we always include something like ‘when the study is completed it may be that one group has better outcomes than the other, for example there may be more babies with BPD in one group than in the other’. I think that is implicit in the consent process for such a trial, but perhaps that would be preferred by parents. We should ask them.
