Following on from my 2 recent posts, a new publication from the CHILD-BRIGHT network in Canada, (Synnes A, et al. Redefining Neurodevelopmental Impairment: Perspectives of Very Preterm Birth Stakeholders. Children. 2023;10(5) Open Access). CHILD-BRIGHT being a slightly tortuous acronym derived from Child Health Initiatives Limiting Disability—Brain Research Improving Growth and Health Trajectories. The article reports a study in which they asked a large number of stakeholders, mostly parents of extreme preterms, about whether they considered a particular outcome to be “a severe health condition”. This question was asked about 11 different scenarios, and respondents were also asked to rate the health of the child from 1 to 10.
The scenarios were constructed to reflect what are considered to be “severe NDI”. Ten such scenarios were created, and each participant received a randomly selected 5 of them; everyone also received the scenario of a child without any delay or impairment. Some examples are below: the first 4 scenarios, which describe children with isolated problems. The terms in parentheses were not shown to participants.
Some of the scenarios had problems in more than one domain.
The figure below shows the proportion of respondents who thought that the scenario was a severe health condition. The infant with a cognitive delay, scenario 3, designed to be more than 2SD below the mean of the normal population on their BSID3 cognitive score, was considered a severe health condition by only 4.6%.
The babies with delay in both language and cognitive development, or with cerebral palsy, with or without a language delay, were more likely to be considered to have a severe health condition.
There was little difference in the rankings, or in evaluations as a “severe health condition” between parents, health professionals, respondents who themselves had been born preterm, teachers or others.
You might think that scenario 3 is not what you would yourself consider a severe health condition; surely that is not what is meant in follow up studies as being a “severe NDI”? It is important to recognize that the first and last authors of that study are 2 of the powerhouses of neonatal follow up in Canada, Anne Synnes and Thuy Mai Luu, and that the scenarios were carefully constructed to accurately reflect current definitions. As they state in the discussion:
“Stakeholders, parents, and clinicians generally rated the clinical scenarios more favorably than expected. This is important because the term “severe neurodevelopmental impairment” is used to make professional recommendations about life and death decisions. These outcomes are also used to communicate with parents and prepare them for the future. Our results identify the potential for miscommunication when the term “severe” is used.”
To emphasize, these scenarios reflect outcomes for which the label of “severe NDI” is applied in follow up studies. These scenarios reflect outcomes for which we may then discuss with parents the options of ending life-sustaining treatments.
Even I, as a long time critic of “NDI”, was taken aback by this study; the descriptions of actual children, satisfying criteria for “NDI”, sometimes in more than one domain, should give us all pause. Do we really think that limiting care for a baby because of an increased risk of one of these outcomes is appropriate? That is not to say that no babies have outcomes more dramatic than these. There are occasional babies with profoundly limited function. I don’t know if there are reliable ways to predict those outcomes in the neonatal period.
Until follow up studies start to report on functional outcomes, using scales such as the GOS-E, that I recently mentioned, it will be impossible to know whether there are factors that we can use in the neonatal period to predict outcomes that are valid reasons for withdrawing life-sustaining interventions.
When it comes to more severe abnormalities, it remains the case that the prevalence of major disability correlates poorly with the Papile grade of IVH. This is in part because of some diagnostic ambiguity; some grade 2 hemorrhages are followed by early ventricular dilatation and may be reclassified as grade 3 hemorrhages. In addition, grade 3 hemorrhages are usually lumped with grade 4 hemorrhages in follow up studies, and it is thus not clear if their prognosis is better, worse, or similar. Also, grade 4 hemorrhages vary enormously, from small localised intracerebral bleeds to massive bilateral hemorrhage.
There are a couple of published severity grading systems, the Bassan and Al-abdi systems, which have been evaluated in independent cohorts. Both show that there are smaller grade 4, or intraparenchymal, hemorrhages which have very little effect on outcomes. As IPH becomes more extensive the range of outcomes tends to shift to more cerebral palsy, and greater impacts on development. Even the most severe IPH are by no means universally followed by profoundly impaired futures.
A huge limitation of this field is that outcomes are usually defined by performance on standardized testing of developmental status, usually at 18 months to 2 years of age, and not by the impacts on function. I am very uncomfortable with the idea of taking a potentially life-shortening decision based on the probability that an infant may have a low BSID score.
Function is a broad term referring not just to impairments, but to capacity and performance, which are related but distinct concepts, for example, even with limited capacity for certain tasks, the performance of an individual can be improved by practice and exercise.
In our recent article, about the outcomes of children who survived after end-of-life decisions, we used a functional outcome scale to describe how the children were doing, which I think is much more useful than scores on developmental screens, or even on IQ tests, when we are discussing values and quality of life. That system, the Glasgow Outcome Scale-Extended (GOS-E), has the following grades: GOS-E of 1 is a “normal” functional outcome, 2 is “normal” function with ongoing medical needs, 3 is permanent impairment, but with a prospect of independent living, 4 is disability with likely supervised living in the future, 5 is disability requiring assistance with activities of daily living, 6 is disability requiring assistance with activities of daily living needing the permanent presence of another person, 7 is complete dependence on another individual with no possible communication, 8 is death.
As mentioned, I don’t think that Bayley scores (or IQ results) are relevant for deciding whether or not we should continue life-sustaining treatments. Functional outcomes are much more important, and the little data available suggest that the general public agree with that. In an interesting study questioning adults on-line, Dominic Wilkinson and his group described several scenarios, all based on recent high-profile legal cases in the UK, and they asked the respondents whether they thought life was of value for the infant, and whether withdrawal of intensive care was permissible. Most of the respondents in the UK agreed that, for an infant with no awareness of their surroundings, or with only “possible” awareness, life was of little or no value, and that withdrawal of care was either permissible or mandatory.
The following was one of their cases, a child with profound limitations,
The large majority of respondents thought that this child’s life was of benefit to them, and 75% thought that continuing intensive care was morally obligatory. Only 12% stated that if it were their child they would want treatment withdrawn.
Such profoundly limited function, equivalent probably to a GOS-E 6 or worse, is very rare in our preterm population, with or without intracranial bleeds. But it is impossible to find any literature correlating severity of early ultrasound abnormalities with a functional outcome evaluation that is really relevant to decision making.
What we are currently faced with, therefore, is trying to predict whether outcomes will be so poor that continuing intensive care is questionable and should be discussed. But that prediction is based largely on guesswork, and on the risk that a baby will have “NDI”; and even “severe NDI” is far from what the general public think is a good reason for withdrawing Life-Sustaining Interventions.
If we display data such as those from Radic et al for their grade 4 group, and if the developmental screening test scores were normally distributed, they would look like the orange curve below. The majority of the babies are to the right of the 70 line, and therefore not “severely impaired”.
We could display the data from that study in arbitrary categories of severity, which follow-up studies are very keen on doing, in which case they would look like this:
These are the results (Radic et al) from the surviving babies who had follow up, which I estimated from the figure in the publication. Severe NDI in that study was severe CP [severely impaired ambulation or nonambulatory] and/ or developmental scores > 3 SD below the mean, and/or bilateral blindness.
As you can see, few babies with grade 3 bleeds have the “worst” grade of outcomes, less than 15%, and only 24% of those with grade 4 bleeds had “severe disability”. I can’t help but state this again, a low score on a developmental screening test is NOT a disability, or an impairment! Some children with such low scores will indeed eventually prove to have a “loss of function” which is an abbreviated definition of an impairment, and in some cases this might lead to a “limitation in activity or participation” an abbreviated definition of disability. But many will not.
That graph shows, yet again, that a serious abnormal finding on an early head ultrasound, shifts the uncertainty, at least regarding probable future developmental screening test results, but there will always be a large range of possible outcomes for an individual.
Here are similar data, just for the most severe IVH, from Desai et al
Again, only a minority (25%) of infants with a history of IPH (or IVH4) had “severe NDI” defined as cognitive scores of >3 SDs below the mean, GMFCS level IV/V CP, and blindness (vision, <6/60). Cognitive scores in this study were either Griffiths or BSID2 or BSID3.
The rare baby who is so profoundly affected that the general population would consider it reasonable to consider limitation of Life Sustaining Therapies, is probably in a subgroup of the “severe NDI” category. The proportion of babies with an IPH who have outcomes that are so severely impaired will therefore be rather less than 20%. Can we figure out which babies with IPH might end up with such poor outcomes?
Are more extensive IPH better for predicting severely abnormal outcomes? Both the ELGAN cohort and the study by Nathalie Maitre have shown that bilateral IPH is a better predictor of CP than unilateral IPH, but not all of that CP was severely disabling. Two other studies found very poor association between Bassan scores of the severity of IPH and developmental scoring at 2 years. The Al-Abdi score seems slightly better correlated with developmental scores than Papile categories, but the differences are minor. There don’t seem to be any good data relating IPH severity to longer term intellectual outcomes, nor to important functional abilities.
Our problem, then, is not just the uncertainty inherent in trying to predict the future for an individual, (the kind of problem that many physicians, such as oncologists and surgeons, are often faced with), but a serious lack of relevant information. How often does a particular head ultrasound abnormality (such as an extensive unilateral IPH, for example) actually lead to serious functional limitations, with an impact on abilities to communicate? I don’t know, indeed no-one does.
I think it is really important to remember the following:
1. Most babies with “NDI” are classified as such because of low scores on the BSID (or other developmental screening test).
2. Most babies with low BSID scores at 2 years do not have life-changing impairments; they don’t even have low scores on IQ tests if you retest them at 5 years. If you look at the figure below, taken from the 5 year follow up of the CAP trial, you can see that most babies with “severe DI”, that is a Bayley MDI >2SD below the mean, had an IQ on the WPPSI-3 above 70 (all those with dots within the red parallelogram).
3. Most babies with low BSID scores function very well, and have a normal quality of life.
I don’t mean to suggest that having a low score on an IQ test is a sign of a serious impairment warranting limiting LST! It is just one illustration of the limited value of early developmental screening tests for the long term.
Finally, all that we can really say for more serious abnormalities on early head ultrasound is that they shift the uncertainty a little bit more than minor abnormalities. Developmental progress is pushed more to the left, and more babies fall into arbitrary categories of moderate or severe delay.
What those categories mean to parents will be the subject of the next post…. coming soon.
Routine early head ultrasound is the de facto standard of care in preterm infants. Recent statements from learned societies usually recommend head ultrasound at around 7 days of age, and many centres do them earlier than that.
Older statements suggested that the reason for early routine ultrasound was to decide about the appropriateness of continuing intensive care; more recent statements tend to emphasize its importance for finding treatable abnormalities, such as IVH with an increased risk of post-hemorrhagic hydrocephalus, and for predicting outcomes.
When we look at the entire literature which has examined associations between abnormalities on brain imaging (ultrasound or MRI) and outcomes, we find very little information that we can use to talk to parents about the implications for their individual baby. Most studies have investigated, and shown, some sort of overall group correlation between worse imaging findings and poorer outcome. For example, with large enough datasets, one can show that there is a statistically significantly worse outcome among babies with a grade 1 or grade 2 hemorrhage compared to babies without hemorrhage.
I submit that this information is of extremely limited value for counselling individual parents. Here is the Forest plot from a recent meta-analysis of the impacts of grade 1 and grade 2 hemorrhages (which is to say hemorrhages confined to the sub-ependymal region and/or blood within, but not distending, the ventricles)
The plot shows that the adjusted Odds Ratio for having what is labelled “moderate-severe NDI” is probably about 1.35, which looks really bad.
Perhaps we should tell parents after the ultrasound: “your baby had a grade 2 IVH, she now has a 35% greater odds of having a moderate to severe handicap than if she did not have the IVH”…. Or Perhaps Not.
The prevalence of what is called “moderate-severe NDI” is 20% with a grade 1 to 2 hemorrhage, and 17% without a hemorrhage, if we add all these studies together. Of note, the data from Bolisetty which are used in this graphic and analysis are actually the data for “isolated” grade 1 and 2 IVH, that is, those which were not followed by PVL, porencephaly, or ventricular dilatation. That eliminated 40 of their 336 grade 1 and 2 IVH. Other studies have used the worst head ultrasound findings, or have not stated if they eliminated some of the cases post-hoc if they developed other brain injuries. The data from Shankaran et al actually include all grades of IVH that did not develop post-hemorrhagic hydrocephalus, therefore including some grade 3 and 4 hemorrhages, rather than just grades 1 and 2.
Perhaps we should tell parents after the ultrasound: “your baby had a grade 2 IVH, her relative risk of having a moderate to severe handicap is 18% higher than if she did not have the IVH”…. Or Perhaps Not.
The majority of so-called “Moderate-Severe NDI” is low scores on developmental screening tests, which, in the studies in these Forest plots, was either Bayley Scales of Infant Development version 2, or version 3 or Griffiths, or, in some studies, both BSID 2 and 3 were used depending on the year. Ages of follow up were from 18 months to 3 years. The specific items which led to a classification of “Moderate-Severe NDI” were either BSID2 MDI <70, or BSID3 cognitive score of <70, or either scale of the BSID 2 <70, or either motor or cognitive score on the BSID3 <70, or a mixture.
So perhaps we should tell parents after the ultrasound: “your baby had a grade 2 IVH, her relative risk of having a low score on the developmental screening test is 18% higher than if she did not have the IVH”… Or Perhaps Not.
Or, as mentioned below, people generally understand absolute risks better than relative risks. A more appropriate way of talking about the impact of a low grade IVH could be “there is a chance of the baby’s developmental assessment at 18 months to 3 years giving a low score; having the grade 2 IVH increases this chance slightly, from 17 babies out of every 100 having a low score to 20 out of every 100”.
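For anyone who wants to check the arithmetic behind those three ways of saying the same thing, here is a minimal sketch. It uses only the pooled percentages quoted above (20% with a grade 1 to 2 IVH, 17% without); the function names are mine, invented for illustration. Note that the crude odds ratio it gives is not the same thing as the adjusted odds ratio from the meta-analysis, which is pooled and adjusted differently.

```python
def odds(p):
    """Convert a probability (0 to 1) into odds."""
    return p / (1 - p)


def compare_risks(p_exposed, p_unexposed):
    """Return the crude odds ratio, relative risk and extra cases per 100."""
    return {
        "odds_ratio": odds(p_exposed) / odds(p_unexposed),
        "relative_risk": p_exposed / p_unexposed,
        "extra_per_100": (p_exposed - p_unexposed) * 100,
    }


# Pooled prevalences quoted in the post: 20% with grade 1-2 IVH, 17% without.
result = compare_risks(0.20, 0.17)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The same 3 extra babies per 100 can be dressed up as a relative risk of about 1.18 or a crude odds ratio of about 1.22, which is part of why the choice of format matters so much when talking to parents.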
Let us try and think about what that means for the population, and for an individual.
The average BSID MDI score was about 97 for very preterm infants (<30 weeks) in the study included in the above systematic review from Nova Scotia (Radic et al) who did not have an IVH, and, if they are normally distributed, this gives the distribution of scores shown below as the blue line.
The babies with grade 2 hemorrhages have BSID scores shifted downward, according to the results of that study, to a mean of around 92, (but the infants were also less mature and smaller with more other complications), which gives the orange line. As a result the proportion of babies with scores < 70 increases, in that study from about 20 to 26%, for grade 2 IVH.
One could ask if the scores among former preterm babies are indeed normally distributed, and they probably are, with perhaps, in some cohorts, a slight skew at the bottom end. It is very hard to be sure, however, as the data are not usually given; even when the mean and SD of the scores are described, the actual distribution is very rarely shown. It is something which we did show in the report of the 2 year outcomes of the CAP trial, which showed a shift upwards of about 3 points of the mean BSID2 MDI score in the caffeine group compared to the controls. Below is the graph of the cumulative distribution of the scores. You can see, at the bottom of the curves, that they are truncated at 49, as the few untestable babies were all assigned a score of 49.
It certainly looks very similar to a graph of the normal distribution displayed in a similar cumulative fashion, such as the one below with a mean of 0 and an SD of 1.
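As a rough illustration of what a cumulative normal curve encodes, and of what a small downward shift of the whole curve does to the proportion below a fixed cut-off, here is a short sketch. It works in SD units on the standard normal curve; the shift of one third of an SD is a purely hypothetical value chosen for illustration, not a figure from any of the studies discussed.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative proportion of a normal distribution lying at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# Cumulative proportions for the standard normal curve (mean 0, SD 1).
for z in (-2, -1, 0, 1, 2):
    print(f"z = {z:+d}: {normal_cdf(z):.3f}")

# HYPOTHETICAL shift: the whole curve moves down by a third of an SD,
# while the cut-off stays fixed at 2 SD below the original mean.
cutoff = -2.0
before = normal_cdf(cutoff, mean=0.0)
after = normal_cdf(cutoff, mean=-1.0 / 3.0)
print(f"proportion below cut-off: {before:.3f} -> {after:.3f}")
```

Under these assumptions the proportion below the cut-off roughly doubles, yet the great majority of the distribution is still above it. Real cohorts may be skewed at the bottom end, so observed proportions can differ from this idealised curve.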
The point I am trying to make, starting with the example of low grade IVH, is that the head ultrasound result just shifts the uncertainty a little. In a stable preterm baby with no other medical complications, having a grade 1 or 2 IVH does, probably, overall, have a minor impact on developmental progress in early infancy, if we examine a large number of babies.
What does that mean for the individual baby in their mother’s arms? How should we explain to parents, of differing educational and social backgrounds that the outcome of their baby is just as uncertain as it was before the head ultrasound, but the risk of having developmental delay is slightly greater, that the zone of uncertainty has been shifted downwards, a bit? Do most of us even understand risk?
Of course we are not alone; many other physicians and counsellors have to talk to their patients (or parents) about risks of long term outcomes. Usually, I think, they are talking about things which are not quite as nebulous as scores on developmental screening tests; rather they try to discuss prognosis for survival, recurrence, colostomy, amputation, etc. Even then there is always uncertainty, and almost never an ability to state with confidence what will happen to the individual.
Parents “often lack the health literacy needed to understand the words that their doctors use when describing medical alternatives. Patients even have difficulty comprehending many of the educational materials they receive from health providers. Although an average American reads at eighth grade reading level, health education materials are often written at a high school or college reading level, making the information contained in them inaccessible to the targeted audience.
Second, many patients have low numeracy skills, leaving them less able to derive useful meaning from the numerical information often presented in such materials (eg, risk and benefit statistics). To put the issues of low numeracy into perspective, approximately half of the adults in the United States are unable to accurately calculate a tip, and 20% of college-educated adults do not know what is a higher risk: 1%, 5%, or 10%. Thus, when an oncologist tells a patient that his or her 5-year chance of survival is 85% or if an educational pamphlet informs patients that the risk of nausea from chemotherapy is 55%, many patients will not understand such statistics well enough to use them as part of making an informed decision.”
That article from Peter Ubel and colleagues has 10 recommendations, based on the literature on decision-making and patients’ understandings of risks.
Use plain language to make written and verbal materials more understandable.
Present data using absolute risks.
Present information in pictographs if you are going to include graphs.
Present data using frequencies.
Use an incremental risk format to highlight how treatment changes risks from preexisting baseline levels.
Be aware that the order in which risks and benefits are presented can affect risk perceptions.
Consider using summary tables that include all of the risks and benefits for each treatment option.
Recognize that comparative risk information (eg, what the average person’s risk is) is persuasive and not just informative.
Consider presenting only the information that is most critical to the patients’ decision making, even at the expense of completeness.
Repeatedly draw patients’ attention to the time interval over which a risk occurs.
I wonder how many of us really understand what “an adjusted Odds Ratio of 1.35 of moderate-severe NDI” really means. I don’t suppose any of us would actually use those words when talking to parents, but can you explain that in clear language, that someone reading at an eighth grade level (about 13 years of age for the non-Americans) would understand? If you can do so, then 50% of your parents would still not understand! (And half of them would not even know what 50% means!) It is not, of course, just understanding the words or the numbers, but the subtle concepts underlying uncertainty, and trying to comprehend what that might mean for one particular baby.
Perhaps my usual approach for a baby with a grade 1 or 2 IVH is reasonable after all: just telling the parents, “the small bleed we just saw has almost no impact on your baby’s long term prospects, (s)he will almost certainly function well and have a good quality of life”, which is a phrase you can say for almost all our babies. What about more serious brain injury? Part 2 is coming…
As usual, the annual meeting of PAS had too many things going on simultaneously to be able to get to all the interesting looking neonatal research. But here are a few things, that were of interest to me, and which I either got to in person or wanted to.
How much CPAP after extubation?
The Éclat trial (happy to see a French trial acronym) studied babies <28 weeks who had received surfactant and were on caffeine and were about to be extubated. They were randomized to either standard CPAP pressures after their first extubation (6-8 cmH2O) or higher pressures (9-11). The primary outcome was extubation failure within 7 days, which was defined as the FiO2 increasing by 20%, or a resp acidosis (PCO2 >60 and pH <7.20), bad apneas or urgent need for re-intubation. This is their poster (from the PAS website), which shows there were many fewer extubation failures using higher CPAP, and no difference in other adverse outcomes.
With 69 per group they were not powered for many other outcomes, and it is interesting that the incidence of BPD is very high in both groups, but I don’t know what definition this was using, I would guess it included mild BPD.
Would you like a side order of budesonide with that surfactant?
The results of the Taiwanese trial (sadly lacking an acronym) which randomized infants to surfactant alone (Curosurf) or poractant with budesonide (0.25 mg/kg/dose) and allowed up to 3 surfactant doses in each group, were presented. To be eligible babies were <1500 g at birth and required intubation in the delivery room or within 4 hours after birth. There were slightly over 300 babies in the trial, and the primary outcome was “BPD or death”. The average GA was about 27.5 weeks in each group. Mortality was similar in the 2 groups at close to 10%. 60% of the surviving controls developed BPD, and 38% of the surviving budesonide babies, with, also, more mild cases in the steroid group, and more severe cases in the controls.
It looks like the steroids were more effective in the larger, more mature babies; secondary analysis by weight-stratified subgroup doesn’t show an effect in the <750 g babies, but there were fewer than 40 per group who were this small, and very little power. Blood pressures were higher in the treated group, suggesting systemic absorption, which you would expect. Alan Jobe has shown in preterm lambs that the majority of administered budesonide enters the systemic circulation, less than half being left in the lungs. I didn’t see any data on late-onset sepsis, and long term follow up will be important, as it is very early steroid use which has been most strongly associated with cerebral palsy in previous studies of systemic use.
None of the previous trials of steroids which have shown a decrease in BPD, and which have followed the babies after discharge, have shown improved long term respiratory health. Long term respiratory outcomes will be essential.
This looks encouraging, and I know some centres are doing this already, we will discuss in my centre, but I’m not sure I’m quite ready to expose every intubated VLBW to steroids on day one. Awaiting the PLUSS trial with bated breath.
Optimistic for long term pulmonary outcomes
Peter Dargaville presented the long term pulmonary and developmental outcomes of the OPTIMIST trial of minimally invasive surfactant treatment. In that trial 488 babies of 25 to 28 weeks gestation were randomized to MIST or ongoing CPAP or nasal IMV (PEEP of 5 to 8) if they needed 30% oxygen or more in the first 6 h of life. The primary outcome was “death or BPD”; there were slightly more deaths in the MIST group (10% vs 8%) but less BPD among survivors, 37% vs 45%. “Death or BPD” was therefore a bit less frequent with MIST (44% vs 50%) but was “not statistically significant”.
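The way the two components combine into the composite can be sketched with a couple of lines of arithmetic. This is only a back-of-envelope check using the rounded percentages quoted above, and the function name is mine; because the inputs are rounded, the results land near, but not exactly on, the reported 44% and 50%.

```python
def death_or_bpd(mortality, bpd_among_survivors):
    """Composite rate: deaths plus BPD among the babies who survived."""
    return mortality + (1 - mortality) * bpd_among_survivors


# Rounded figures from the post: deaths 10% vs 8%, BPD in survivors 37% vs 45%.
mist = death_or_bpd(0.10, 0.37)
control = death_or_bpd(0.08, 0.45)
print(f"MIST: {mist:.1%}, control: {control:.1%}")
```

It shows how a small excess of deaths and a larger reduction in BPD among survivors pull the composite in opposite directions, leaving a net difference smaller than the difference in BPD alone.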
The follow up of that trial up to 2 years of age gave this CONSORT flow diagram, showing that up to 2 years the mortality was very similar. For various reasons, especially COVID, in person follow up was only the minority, but they had on-line questionnaires filled in by parents, and all the developmental outcomes were similar between groups. The proportion of babies with language or cognitive delay was identical between groups.
In contrast the respiratory outcomes all look better in the MIST group.
There were questions about whether this was really a trial of early versus late surfactant, but as Peter rightly said, it was not! Control group babies did not all get surfactant, babies in both groups were intubated for surfactant if they reached 45% (MIST was not allowed) and after intubation further therapy was according to usual practice, so INSURE, or slower weaning, or whatever your usual practice was, were all OK. Most babies were expected to be on caffeine, but it was not mandated by the protocol.
This is one of very few studies of respiratory interventions which seems to show clear long term respiratory benefits, beyond just the presence of a BPD diagnosis. 72% of the controls got intubated within 72 hours of birth, compared to 37% of the MIST babies.
MIST in the delivery room, with coffee!
Anup Katheria presented another trial of thin catheter administration of surfactant, this time being randomized in the first hour of life, once babies of 24 to 29 weeks were stabilised on CPAP, and had received caffeine. Controls also received caffeine, but no surfactant. 180 babies were randomized, and the main outcome was need for intubation, determined by needing over 40% O2, or a respiratory acidosis or bad apneas. Babies with less-invasive surfactant and caffeine required intubation in the first 72 hours of life 23% of the time, compared to 53% of the time for the caffeine only group. The surfactant group did get their caffeine a bit earlier than the controls (at 52 minutes of age compared to 70 minutes). Death before discharge was rare in this study (only 3 babies died, all in the control group), and there seems to be less BPD in the surfactant group, 26% vs 39%; although the study was not designed to have power for the BPD diagnosis, it looks like there probably was an effect.
Anup Katheria helpfully listed the differences between his study and OPTIMIST
I really hate the idea of performing laryngoscopy without analgesia, and sedative premedications were not allowed in the OPTIMIST trial. I don’t know if they were allowed in CaLI, but if we can avoid intubation completely in a large number of LISA/MIST babies, which these survival curves below demonstrate dramatically, and therefore avoid prolonged desaturations, bradycardias and multiple intubation attempts, then maybe an un-premedicated less-invasive surfactant in early life will reduce, overall, the amount of pain experienced.
Putting things together
It is starting to look like very early minimally-invasive surfactant administration, perhaps routinely in the DR, and with very early caffeine administration, is the way to go. Not because it decreases “BPD” but because it seems that long term respiratory health is improved by avoiding intubation in the first few days of life (long term outcomes of SUPPORT and now OPTIMIST)
Perhaps early less-invasive surfactant should be co-administered with budesonide, but we should carefully consider that the NEUROSIS trial, of repeated budesonide inhalations starting during the first 24 hours, showed a small excess of mortality in the budesonide group. In addition, budesonide when mixed with surfactant enters the systemic circulation, and very early systemic steroids in preterm infants increase motor delay and cerebral palsy.
Of all the previous steroid trials that have shown reduced BPD and have also reported long term respiratory outcomes, I don’t think there is a single one which has shown improved longer term respiratory health.
Of course those studies are all contaminated by frequent treatment of control babies with steroids, so long term outcomes may be no different for that reason. On the other hand, potent systemic Steroids may decrease inflammation and thus lead to fewer babies needing oxygen at 36 weeks, but they also impair lung growth and interfere with lung development. I would like to see some evidence that long term respiratory health is improved before treating all babies receiving surfactant with budesonide, not to mention some evidence that there are no adverse effects on neurologic or developmental outcomes.
One trial which should have an impact on the care of newborn infants is the trial of early versus late hernia repair. 40 neonatal centres in the USA randomized 320 preterm infants to inguinal hernia repair before discharge (between 37 and 42 weeks, if I remember correctly) or later repair after 55 weeks PMA (55-60 weeks according to the record on clinicaltrials.gov). That record also notes a sample size of 600, and notes that a Bayesian analysis of the data will be the primary method. The primary outcome was a composite of adverse events, which I can’t now find details for but included things like post-operative apnea, requiring intubation post-op, and incarceration of the hernia.
The justification for pre-discharge repair has always been that incarceration was relatively common, and we should repair the hernia before discharge so that doesn’t happen. Incarceration, defined as needing surgery, sedatives or a surgeon to reduce the hernia, did occur in 4.5% of the late repair group. There were a few incarcerations in the early treatment group as well.
Overall there were more adverse events in the early surgery group, 26%, than the late group, 18%, which made it highly likely (96% probability) that late surgery led to fewer adverse events. The total number of hospital days after randomization was also shorter in the late repair group. One somewhat unexpected finding was that 10% of the hernias resolved in the late group. Apparently this has been suggested before (many years ago), and was probably a real finding, as the presence of most of the hernias was confirmed independently.
Some of our families live hundreds or thousands of kilometres away, and re-admission for later hernia repair may be more of a problem for them. But for the majority of families who live within easy reach of my hospital, it looks like we should probably wait for 3 months post-discharge before fixing their hernias.
The only real disappointment of this trial is the lack of a catchy acronym, it could have been HIDE (Hernias of the Inguinal region, Delayed vs Early repair) or some combination of S (for surgery) HI and T (for trial)…
Premies are cool enough
A randomized trial of cooling for asphyxiated preterm infants who have not been well studied in previous trials (33 to 35 weeks gestation) but who sometimes get cooled, because of “mission creep”, was performed by the NICHD network.
The calculation of sample size was interesting, as they designed the trial to get the largest sample which they considered feasible within a reasonable time, and planned a Bayesian analysis of the trial. They planned to enrol 168 babies, and achieved that, and care was identical to that for more mature babies, with careful normothermia in the controls, and 33.5 degree target temperature in the hypothermia group. Cooled babies were more likely to become hypoglycaemic, and more likely to have major bleeding, but most importantly more likely to die, 21% vs 15% with a Bayesian calculation of a 77% probability of a real increase in mortality with cooling compared to normothermia.
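To give a feel for what a statement like "a 77% probability of a real increase in mortality" means, here is a minimal sketch of a Bayesian comparison of two event rates. It is not the trial's analysis: it assumes flat Beta(1, 1) priors on each mortality rate, and the counts are hypothetical, back-calculated by assuming the 168 babies were split evenly and rounding 21% and 15% to whole deaths. The function name `prob_rate_higher` is my own. With these assumptions the answer will not match the published 77%, which came from the investigators' own model and prior.

```python
import random


def prob_rate_higher(events_a, n_a, events_b, n_b, draws=100_000, seed=0):
    """Posterior probability that arm A's true event rate exceeds arm B's.

    Assumes independent flat Beta(1, 1) priors on each rate (an assumption,
    not the trial's actual model) and estimates the probability by drawing
    from the two Beta posteriors.
    """
    rng = random.Random(seed)
    higher = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + events_a, 1 + n_a - events_a)
        rate_b = rng.betavariate(1 + events_b, 1 + n_b - events_b)
        if rate_a > rate_b:
            higher += 1
    return higher / draws


# Hypothetical counts: 84 babies per arm, deaths rounded from 21% and 15%.
p = prob_rate_higher(events_a=18, n_a=84, events_b=13, n_b=84)
print(f"P(mortality higher with cooling) ~ {p:.2f}")
```

The useful feature of this kind of output is that it is a direct probability statement about the effect, rather than a p-value, which is why it suits a trial whose sample size was fixed by feasibility.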
The primary outcome was death or “disability”, which was a Bayley 3 cognitive score <85 or GMFCS >2 or a seizure disorder or deafness. The primary outcome was very similar between groups, with a neutral probability on Bayesian analysis of being better or worse.
This is probably the best evidence we will get for the efficacy of therapeutic hypothermia in late preterm infants, and strongly suggests that it is not effective, and may be harmful.
Mild HIE, to cool or not to cool, that is the question.
From Imperial College London a combined report of 2 trials targeting term babies with signs of mild encephalopathy. They had to have a clinical examination, in the mild category, and a normal aEEG. The trials included one with babies randomized before 6 hours, who were then treated with normothermia or standard hypothermia, and a second group of babies who were already cooled, and who were >6 hours of age; they were randomized to shorter or standard hypothermia.
The trials are quite small, and so far only short term outcomes and MRI were available. Cooled babies were more likely to be intubated and to need inotropes. They also had longer hospital stays. As for the MRI results, they appear to be worse in the cooled groups than the normothermic infants.
As I wrote previously MRIs are very poorly predictive of outcomes on an individual basis, but there are overall correlations in groups of babies between poorer outcomes and worse MRI scores.
Therapeutic hypothermia for mild HIE has been growing in frequency, and I have been guilty of starting it when I wasn’t sure whether to, or not. I felt from previous data that it was probably safe, and possibly effective. I may have to revise that opinion, but clinical examination may change prior to the time limit for cooling, and babies with normal or mild encephalopathy may deteriorate. I think based on this data that we should be cautious cooling babies who are clinically in the mild category, ensure that they are re-examined at 5.5 hours and have aEEG or EEG. If the examination only shows signs of mild encephalopathy, and the electricity is normal, then scrupulous maintenance of normothermia and supportive care is indicated.
Acute phase reactants increase for many different reasons, such as being born. Other causes of increased CRP include maternal antibiotic prophylaxis, TTN, HMD, surfactant treatment, meconium aspiration syndrome, prolonged rupture of membranes, HIE, higher birthweight, gastroschisis, and probably someone scowling at the baby.
They do not rise instantly when a baby is infected, but take a variable amount of time to go up. Specificity and sensitivity for bacterial sepsis are both very poor, and we should stop using them for late onset sepsis. A Cochrane review, latest update 2019, revealed a sensitivity of 0.74 and a specificity of 0.62 for late onset sepsis, which, combined with the fact that the majority of screens for late onset sepsis are negative, means that the PPV for a raised CRP is less than 20%, and that about 5% of babies with a low CRP are nevertheless infected.
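The arithmetic behind those predictive values can be sketched as follows. The sensitivity and specificity are the figures quoted above; the 10% prevalence of proven sepsis among screened babies is my assumption for illustration, since the text only says that most screens are negative. Under that assumption the PPV comes out a little under 18%, and roughly 4 to 5% of babies with a low CRP would still be infected.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalence of 10% is an assumed value, not a figure from the review.
ppv, npv = predictive_values(sensitivity=0.74, specificity=0.62, prevalence=0.10)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}, infected despite low CRP = {1 - npv:.1%}")
```

Changing `prevalence` shows how strongly both figures depend on how often the babies being screened are actually infected.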
For early onset sepsis, there are still many places where they are included in initial evaluation, and repeated CRPs may be used to decide on continuing antibiotic treatment. In some studies an increased CRP is used to define late-onset sepsis, even in well-appearing babies. Which leads to some interesting circular arguments about who should be treated with prolonged antibiotic courses.
They give results from 2009-2014, where CRP was routine: “10,134 infants were admitted; 9,103 (89.8%) had CRP and 7,549 (74.5%) had blood culture obtained within 3 days of birth. CRP obtained ±4 hours from blood culture had a sensitivity of 41.7%, specificity 89.9% and positive likelihood ratio 4.12 in diagnosis of EOS. When obtained 24-72 hours after blood culture, sensitivity of CRP increased (89.5%), but specificity (55.7%) and positive likelihood ratio (2.02) decreased”.
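The likelihood ratios in that quotation follow directly from the sensitivity and specificity, as this small sketch shows. The function name is mine; the inputs are the rounded percentages from the quotation, so the results agree with the published 4.12 and 2.02 only to within rounding.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a diagnostic test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# CRP taken within 4 hours of the blood culture.
early_pos, early_neg = likelihood_ratios(0.417, 0.899)
# CRP taken 24-72 hours after the blood culture.
late_pos, late_neg = likelihood_ratios(0.895, 0.557)

print(f"Early CRP: LR+ = {early_pos:.2f}, LR- = {early_neg:.2f}")
print(f"Late CRP:  LR+ = {late_pos:.2f}, LR- = {late_neg:.2f}")
```

A positive likelihood ratio of 2 to 4 shifts the probability of infection only modestly, which is the quantitative version of saying the test discriminates poorly.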
They then compare approximately equal periods of about 2 years each with (n=4,977) and without (n=5,135) routine use of CRP. Of note the later period was also the time when the centres started using the Kaiser EOS calculator for the term and late preterm babies, and stopped doing some sepsis evaluations in preterm babies with ultra-low risk delivery characteristics. “We observed lower rates of EOS evaluation (74.5% vs. 50.5%), antibiotic initiation (65.0% vs. 50.8%), and antibiotic prolongation in the absence of EOS (17.3% vs. 7.2%) in the later period”.
They also showed no difference in the incidence of EOS (about 2 per thousand), no difference in how long it took for septic babies to receive antibiotics, and no differences in clinical outcomes.
CRP results do not sufficiently discriminate between infected and non-infected babies to be useful. This is true for EOS and for LOS. Abandoning CRP use has no adverse impact, but reduces antibiotic exposure of non-infected babies.
The Finnegan score was developed and evaluated in the 1970s as the first systematic way of monitoring babies going through perinatal drug withdrawal. Although it was an advance at the time, it was not developed using modern standards, and was never, I believe, studied in a randomized trial; indeed it would be hard to know what you could have compared it to, as other approaches were rather haphazard.
The Finnegan scoring system, and the treatment approaches based on it, became the default approach, and have been the standard of care for many years. The problem of neonatal drug withdrawal seems to be getting much more frequent, especially in the USA. Reliable prevalence data are of course difficult to obtain, but 7% of US mothers report taking opioids during pregnancy, and 1/5 of them report abusing opioids, rather than therapeutic use. The incidence of neonatal abstinence syndrome, NAS, nearly doubled between 2000 and 2017, and I think all indications are that it is still increasing, with 7 cases per 1000 births in 2017, i.e. nearly 1% of all births in the US.
In Canada in 2020 the incidence was about 6 per 1000 live births (not including Quebec who collect data separately). The opioid crisis in Canada is, generally speaking, worse in the west of the country and is spreading eastward, with the highest rates of overdose deaths in British Columbia, followed by Alberta then the prairies and Ontario, with far fewer cases as yet in Quebec.
With this enormous and increasing incidence, evidence-based methods for evaluating and intervening for these infants were required, and a new program, first reported in 2017, was developed, based on a simplified evaluation (which is where the name ESC, for Eat, Sleep, Console, comes from: is the infant able to eat, to sleep, and to be consoled), use of non-pharmacologic measures to calm the infant, reduced stimulation, skin to skin care, encouraging breast feeding, and limiting morphine use.
Several publications strongly suggested that the ESC approach was followed by a reduction in duration of hospital stays, and less opiate administration. This new publication (Young LW, et al. Eat, Sleep, Console Approach or Usual Care for Neonatal Opioid Withdrawal. N Engl J Med. 2023) reports the first large RCT, performed as a cluster randomized trial, with all the 26 centres switching from standard care to ESC, but doing so at times which were determined randomly; a “stepped wedge” design.
It would be easy to criticise this design, it is almost implicit in the design that ESC is preferable! No cluster was randomized to switch from ESC to standard care, which you could do if there was really equipoise. I think it shows that most people were already convinced that ESC was the better approach, but they wanted some scientifically valid way of confirming that. The design does give real-world information about the size of the impact of ESC, and allowed some post-discharge evaluation of safety.
The primary outcome of the trial was the age of being medically ready for discharge, defined as: “an age of at least 96 hours, a period of at least 48 hours without receipt of an opioid, at least 24 hours with no respiratory support and with 100% oral feeding, and at least 24 hours from initiation of maximum caloric density.” Over 1300 babies were enrolled, and 837 were discharged when medically ready, and therefore contributed to the primary outcome.
There were therefore 468 who were discharged before they met this definition, almost all were either <96 hours of age (n=211), or it was <48 hours since their last opioid dose (n=231). The investigators therefore also evaluated a “modified definition of medical readiness for discharge” which included an age of at least 72 hours and at least 24 hours without receipt of an opioid. This definition could be evaluated for 1164 of the infants.
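The two definitions are easier to compare when written out as explicit checks. This is only a sketch: the class and field names are hypothetical, and I have assumed that the modified definition relaxes the age and opioid-free thresholds while leaving the respiratory support, oral feeding and caloric density criteria unchanged, which the text above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class InfantStatus:
    """Hypothetical snapshot of one baby; all durations are in hours."""
    age: float
    since_last_opioid: float  # use float("inf") if no opioid was ever given
    without_respiratory_support: float
    on_full_oral_feeds: float
    since_max_caloric_density: float


def medically_ready(s, min_age=96, min_opioid_free=48):
    """Primary-outcome definition of medical readiness for discharge."""
    return (
        s.age >= min_age
        and s.since_last_opioid >= min_opioid_free
        and s.without_respiratory_support >= 24
        and s.on_full_oral_feeds >= 24
        and s.since_max_caloric_density >= 24
    )


def medically_ready_modified(s):
    """Modified definition: 72 hours of age and 24 hours opioid-free.

    Assumes the remaining criteria are the same as in the primary definition.
    """
    return medically_ready(s, min_age=72, min_opioid_free=24)


# A baby aged 80 hours who never received an opioid and is otherwise stable.
baby = InfantStatus(80, float("inf"), 80, 60, 48)
print(medically_ready(baby), medically_ready_modified(baby))
```

The example baby fails the primary definition only on age, which is the situation of many of the infants who went home before 96 hours.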
Just getting the babies home earlier, while a benefit of itself, is not necessarily an advantage if the babies need to be re-admitted for further NAS therapy, or for other related reasons, or if parental coping is affected and more babies have adverse outcomes. The babies were therefore followed for 3 months post-discharge for the following outcomes “(any acute or urgent care visit, emergency department visit, or hospital readmission), and a composite critical safety outcome at discharge and through 3 months of age (nonaccidental trauma or death).”
Follow-up of these babies may be very difficult, especially in the US health environment. “Outcomes after hospital discharge were assessed prospectively at 3 months of age by means of a review of electronic medical records (including linked medical records) and media review through a search of public records (e.g., news reports, obituaries, and registries)”.
The main results are below: as I mentioned, the primary outcome was assessed for 837 babies, but the actual mean length of hospital stay includes all the babies, as does the dramatic difference in the proportion who received morphine.
The modified definition occurred at 14.5 days in the usual care group, and 8.1 days in the ESC group.
For the 3 month outcomes :
About 2/3 of the hospitalisations in each group were possibly related to NAS.
This study should be the death-knell (whatever a “knell” is) for the Finnegan score. It is not particularly objective, nor clearly very useful for determining therapy. The simplified ESC system should rapidly become the standard of care, and the standard against which any innovations in NAS care should be tested.
As one example, there are several trials of buprenorphine for NAS, as an alternative to either morphine or methadone. Those trials (such as this one) have generally used the Finnegan system, and have often shown shorter treatment with buprenorphine than the alternative. The effect size of ESC is larger than that shown in those trials, in general. Maybe some of them should be redone, with ESC as the approach to care in both groups.
For those of my readers going to PAS in Washington DC, I have put together a panel to discuss painful procedures in neonatal research projects, which starts at 2pm on Saturday, April 29, 2023. The stellar cast of speakers includes the moderator John Lantos, who will introduce the topic and talk about research ethics and how painful interventions in children and babies fit into the ethical framework. I will give some examples of painful procedures in neonatal research projects, how they were justified, if at all, and try to create some guidelines for what might be acceptable. Ruth Grunau will talk about how pain adversely affects the developing brain, and Sunny Juul will talk about designing research projects that minimize additional pain.
I think it will be an important and informative session, hopefully with enough chairs for everyone who wants to attend. It is in the convention center 145 AB.
The definition of the outcome was “one or more of: intraventricular hemorrhage of grade 3 or 4, cystic periventricular leukomalacia, post-hemorrhagic ventricular dilatation, cerebellar hemorrhage, and cerebral atrophy” which are all defined more precisely in the supplemental material. One quibble I have with the definitions is of grade 4 IVH, defined as “parenchymal haemorrhagic infarction visible in the periventricular white matter”; what we actually see is echodensity, and it is not possible to really ascribe a mechanism of injury to an ultrasound appearance. As an upcoming review article that I co-authored will point out, periventricular echodensities are not all strongly associated with poor long term outcomes. The extent, location and whether uni- or bi-lateral are all important.
This study was large enough (n=1600) to show moderately large differences in the incidence of ultrasound abnormalities, 90% power to show a reduction from 34 to 26% among infants <28 weeks GA. Monitoring had to start within 6 hours of birth and continue for 72 hours, a treatment algorithm was followed to protocolise responses to low cerebral saturations.
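As a rough check on those numbers, the standard normal-approximation formula for comparing two proportions can be sketched as below. The two-sided alpha of 0.05 is my assumption, and this simple unadjusted formula is not necessarily what the investigators used; it ignores stratification, twins and any allowance for losses. It gives somewhat under 700 babies per group, comfortably within the 1600 enrolled.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.90):
    """Unadjusted sample size per group to detect p_control vs p_treated."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Reduction from 34% to 26%, with the power quoted in the text.
n = n_per_group(0.34, 0.26)
print(f"{n} per group, {2 * n} in total")
```

Because the required sample grows with the inverse square of the difference, a trial of this size could not detect much smaller effects, such as a reduction from 34% to 31%.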
The results in the 2 groups were, basically, identical.
My big question is whether this is a surprise. I think that low cerebral oxygen saturations are due to a large number of different processes, and that there is not a clear link between low cerebral oxygen saturation and severe IVH or PVL. Only 29% of the babies in the oximetry group had a change in management because of low saturations, so you would only expect the intervention to have any impact on that 1/3 of the babies. PVL may occur after an episode of shock, but the strongest correlation is with perinatal inflammatory disorders, like chorioamnionitis.
I also think it is a real possibility that less cerebral hypoxia could improve long term outcomes without necessarily changing brain ultrasound findings, and I really hope that after this great effort to perform a really important trial, there will be long term follow up of the survivors.
For now there is no clear evidence that routine cerebral NIRS, and responding to low cerebral oxygenation, improves short or long term clinical outcomes in the preterm.
Discussions with parents about the progress of their baby, when we have concerns about their survival, or poor prognosis, are not infrequent in the NICU. We may decide to continue as before, or to limit interventions, or to withdraw some, or all, life sustaining interventions. One of our fellows, Beatrice Boutillier had collated the results of a large number of such discussions in a French tertiary NICU, and investigated the outcomes, who died, who survived after treatment limitation or withdrawal decisions, and the long term outcomes of those who survived. Annie Janvier and I collaborated with Beatrice, and with Valérie Biran, the chief of the NICU at Robert Debré hospital, to publish the data from their NICU.
There is very little other such information in the literature. Although there are several publications about the proportions of deaths that follow end of life discussions, many of those have collected data only on babies who died, there is little about those who survive after a decision to limit or withdraw care.
Although each case is unique, we squeezed them into general categories, using a revision of the categories of Eduard Verhagen et al. In some, all ICU care was withdrawn, including assisted ventilation, this could either be because it was thought there was little chance of survival in an unstable infant (category C), or because the long term prognosis for an acceptable quality of life was thought unlikely (category D). There were some infants where there was a decision to not escalate care, such as deciding to not go to the OR, or to not start an inotrope, or sometimes just to withhold CPR, these were category B. Those with withholding some interventions or limiting care were the babies most likely to survive, 18/29. But even among those with withdrawal of life-sustaining interventions, 4/41 unstable, and 12/94 stable infants survived.
All of the babies who survived to discharge required some medical follow up, and for many there were multiple specialties involved. Of the 34 who survived to 2 years, we had functional outcome information on 32, 8 of whom had outcomes between functionally normal, up to functional limitations with likely supervised living in the future, the remaining 24 were predicted to need help with activities of daily living or to be partially or totally dependent. We used the Glasgow Outcomes Score-Extended, as adapted for children (which I think gives a much better picture of the functional abilities of the children than a label of “NDI”).
The results point out the uncertainties inherent in our practice; even unstable babies that you think will likely die, who are so sick that you decide to withdraw their life-sustaining interventions, may still survive. After doing neonatology for over 40 years now (I know, I don’t look that old) I find myself often humbled by seeing things I would have thought impossible. Recently I looked after a baby with a pH <7.00 for 12 hours, (the baby was on extreme support, the parents weren’t ready to withdraw LST, so we continued while assuring good analgesia), who survived and is doing OK after discharge.
I am sometimes tempted to give up trying to predict anything! But I think it is better just to be transparent and honest and be open about the uncertainties of life, and of neonatology. We must always recognize that we are lousy at predicting survival, and even worse at predicting long term functional disability. Even with all the tools at our disposal, it is rare that we can say anything definitive.
One thing that doesn’t seem to help parents is to try and develop predictive models which give a percentage likelihood of survival or of poor long term outcomes. Although it is a different time of life, a recent study is relevant: McDonnell SM, Basir MA, Yan K, Liegl MN, Windschitl PD. Effect of Presenting Survival Information as Text or Pictograph During Periviable Birth Counseling: A Randomized, Controlled Trial. Journal of Pediatrics. 2023. This study, using a vignette of a threatened delivery at 22 weeks gestation, showed that it didn’t make any difference to treatment decisions how outcome data were presented (text or pictogram); it also showed that it didn’t make any difference what data were presented! Whether a 30% or 60% survival probability was revealed to a randomly selected group of 1000 women of child-bearing age didn’t make any difference to decisions, and it didn’t even make any difference to what the participants thought the chance of survival was; when told the baby had a 30% chance of survival the participants thought the baby had a 68% chance of survival. Participants, however, thought that even with palliative care the baby was likely to survive (median 58% chance of survival), which does show that the presentation of the information, no matter how it was done, did not lead to the respondents having accurate knowledge of the results of palliative care.
“These seemingly nihilistic study results are illuminating. Should we be spending our time trying to get better at sharing outcome probabilities with expectant parents if the probabilities aren’t what matter to them?”
I never give percentage survival figures, unless parents ask for them (which they almost never do), and, to be honest, I am not at all sure that the difference between 30% survival and 60% should make a difference in decision-making. You might as well say 50:50, and leave it at that for both cases. There is a reasonable chance of survival, and ICU care is probably worth a try, I would say, and most of the respondents in the article by McDonnell seem to say the same thing. It might be different if the chances were 1% vs 99%, but it is hard to think of a scenario which could be included in such a study where the chances would be 1% survival.
To return to the case series that we just published, some of the considerations in counselling in the NICU are similar. We can try and calculate survival rates, but each case is different, each individual baby can only be 100% survival or 0% survival, and parental (and physician) attitudes and values are probably more important to the outcomes of such decisions than percentages. We have to always remember that neonatologists are remarkably bad at predicting the future.
We should also remember that parents of surviving preterm infants almost never regret their life and death decisions. In another study that we published recently, Thivierge E, et al. Guilt and Regret Experienced by Parents of Children Born Extremely Preterm. Journal of Pediatrics. 2022, 113 of 248 parents of former preterm infants, seen in the follow up clinic, expressed some regrets about the NICU stay. None of them regretted decisions made to start or continue NICU care. We know from other data that decisional regret is more common after decisions to opt for palliative or comfort care alone. The regrets that parents did express were more associated with lack of self-care during the NICU stay, regrets about having a preterm baby (for which mothers often blamed themselves) and regrets related to their role as parents in the NICU. Addressing those issues could help to improve parental mental health after discharge.