Longer-term outcomes of very preterm babies, what should we measure, when and why?

Two recent articles have discussed the issue of what outcomes we should measure to analyze neurological and developmental progress in the preterm baby. Both are thoughtful critical pieces that say many things that we need to think about as we follow our patients.

McCormick MC, Litt JS. The Outcomes of Very Preterm Infants: Is It Time to Ask Different Questions? Pediatrics. 2017;139(1).

This review/opinion piece describes some of the limitations of our current approach, and how many important outcomes are not routinely evaluated. It does, unfortunately, refer to many studies that have evaluated Bayley scores as if they measured IQ (for example, Betty Vohr’s 2004 study, comparing outcomes between NICHD network sites on 20 month Bayleys, is referred to as showing variations in IQ). IQ, which has major limitations as a measure of outcomes itself, does at least have some correlation with school success and difficulty; it should not be confused with developmental quotients from a BSID exam, which have no clear correlation with functional outcomes. Near the end they state the following:

there is a need to shift to multifaceted conceptual frameworks accounting for physiologic and environmental influences on health and development. Broadly construed, such models should incorporate longitudinal observations of function and changes in function due to maturation, family dynamics, and social environmental contexts. Of particular importance is the identification of appropriate interventions to buttress the child’s ability within his or her familial environment.

In other words, trying to reduce outcomes to a single number (such as a Bayley cognitive composite score) is a reductionist approach that is absurd; we need to examine the range of the child’s abilities, functions, behaviour and emotional life in order to help them.

Kilbride HW, et al. Prognostic neurodevelopmental testing of preterm infants: do we need to change the paradigm? J Perinatol. 2017.

This second article reiterates many of the issues I have been ranting about on this blog for a while. They start by discussing why one might want to do neurodevelopmental testing of preterm infants:

(1) the results can help determine which children should receive early intervention or enhanced educational services; (2) the assessments can be used as outcome measures in research protocols to determine whether specific neonatal interventions lead to better results and (3) such information may also be used to inform clinicians and parents about the appropriateness of providing care for certain groups of infants.

I think there are also other reasons for performing such testing; preparing parents for the future, and increasing the understanding of the patterns and developmental trajectories of very preterm babies, are 2 examples.

The authors then describe some of the tests that are used, focusing on the 3 editions of the Bayley tests of infant development. They note the now well-publicised shifts in the norms of those tests, and then, after a short section discussing the adult and adolescent outcomes of the very preterm baby, discuss whether early developmental testing can be used to predict later intellectual function test scores.

In general, the ability to predict cognitive outcomes at school age from infancy and preschool ages has been described as a conundrum. The elusive nature of estimates of IQ stability may be due to differences in sample selection, data analytic approaches, the presence of appropriate control groups as well as validity of assessment instruments, as discussed earlier. Even in the best of testing circumstances, defining impairment in early childhood is imprecise and is likely to over-estimate level of disability.

They note that there are major socio-economic impacts on development of the very preterm baby, and that those factors become more important over time. The CAP study cohort was a good example of this: the change in scores between the 18 month Bayleys and the 5 year WPPSI was greater in children whose parents had more social advantages.

IQ scores, from testing close to school age, are more closely associated with school performance than earlier developmental testing, but we should ask whether even those scores can, or should, be used as a way of determining whether a child’s life is worthwhile or not. For that is the implication of our use of developmental or IQ testing as a way of dichotomizing the lives of the survivors of NICU into those who are impaired and non-impaired, or intact and non-intact, or disabled and non-. Whatever the terminology, the outcome calculators have the advantage of not just relying on gestational age to predict outcomes, but the huge disadvantage that they are used by practitioners to predict on which side of the dichotomous outcome, “survival without disability” compared to “dead or disabled”, a baby will likely fall.

In reality the outcomes of our babies are not dichotomous: being dead is not the same as being disabled, all types of disability are not the same, and how a child with impairment experiences their own life and how they impact a family are not dichotomous phenomena, good or bad, either.

Telling a parent-to-be that a child has a predicted 21% chance of ‘survival without profound impairment’, in the example they use, actually means that they have a 33% chance of survival, and among survivors 64% do not have very low scores on Bayley-II testing at 20 months of age or disabling cerebral palsy. Saying that to parents requires that we know something about the outcome data that our statements are based on, and the major limitations of those data:

Categorization of children based on composite findings should be limited to outcome measurements for research purposes. Providers who counsel families prenatally regarding risk for extreme preterm or other difficult newborn conditions need to fully understand the implications of 24-month neurodevelopmental findings to avoid using terminology that overstates what is known.

I don’t fully agree with the first sentence there though; I think we need to rethink how we use composite outcomes when we design research. As I’ve mentioned previously, the SUPPORT oxygen targeting trial was actually a negative trial: the composite outcome of “death or severe retinopathy” was not significantly different between groups. Only the individual parts of that outcome were different, with death being higher and retinopathy being lower with the lower saturation targets. But to demonstrate that, the authors have done 3 analyses (the composite and, individually, RoP and survival), which inflates the risk of a type 1 error, and it has been suggested that this should be taken into account in the analysis. Other ways of analyzing trial outcomes with potentially competing outcomes have been proposed, instead of creating potentially confusing composites. I don’t actually think anyone really wanted to know what the impact of different oxygen saturation targets was on “survival without severe RoP”. We wanted to know whether it was safe to aim for lower targets that some people were already targeting (we were really asking that question about longer term outcomes, not expecting a difference in mortality), and whether it really further reduced RoP.
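As a rough illustration of why 3 analyses inflate the risk of a type 1 error, the sketch below computes the chance of at least one false positive across 3 tests, and the simplest (Bonferroni) correction. It assumes a conventional threshold of 0.05 and, loudly, that the tests are independent. The composite and its components are obviously correlated, so the true inflation in a trial like SUPPORT would differ; this is not the trial's analysis, and the function names are made up for the example.

```python
# Illustration only: assumes 3 independent tests at alpha = 0.05.
# Real composite/component analyses are correlated, so the actual
# inflation will differ. Function names are hypothetical.


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the overall error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    n_tests = 3  # composite, RoP, and survival
    print(f"family-wise error rate: {familywise_error_rate(alpha, n_tests):.3f}")
    print(f"Bonferroni threshold:   {bonferroni_threshold(alpha, n_tests):.4f}")
```

Under those assumptions the chance of at least one spurious "significant" result rises from 5% to a little over 14%, and the corrected per-test threshold is about 0.017.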

To return to the comment about prenatal counselling, I have to agree with the authors: we should completely avoid presenting outcomes as a risk of a composite outcome compared to not having that composite outcome. The risks of death and of potentially life-affecting impairments must be presented separately; some parents will want to explore the different kinds and severities of various potential outcomes, some will want much less detail, or only focus on the chances of the most severely limiting outcomes. It is important that we don’t just note something like “parents do not want a handicapped child” without exploring what that means to them. In studies where parents have been asked what they meant by phrases such as that one (and there aren’t many such studies), they generally state that to them an outcome which would make them accept withholding or withdrawal of life-sustaining interventions is a child that ‘cannot think’, or has “no ability to communicate”. In other words, certainly not a low Bayley score or learning difficulties at school, but the most profound limitations.
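To make the separation concrete, the 21% figure from the example earlier can be taken apart with a few lines of arithmetic. This is only a sketch using the two numbers quoted in the post (33% survival; 64% of survivors without very low Bayley-II scores or disabling cerebral palsy); the variable names are invented for the illustration.

```python
# Figures quoted in the post; variable names are hypothetical.
p_survival = 0.33                 # chance of survival
p_no_profound_given_surv = 0.64   # among survivors, no profound impairment

# The composite "survival without profound impairment" is the product.
p_composite = p_survival * p_no_profound_given_surv

# The other side of the composite lumps together two very different outcomes.
p_death = 1.0 - p_survival
p_survival_with_profound = p_survival * (1.0 - p_no_profound_given_surv)

print(f"survival without profound impairment: {p_composite:.0%}")
print(f"death:                                {p_death:.0%}")
print(f"survival with profound impairment:    {p_survival_with_profound:.0%}")
```

With these inputs the composite comes to about 21%, while the remaining 79% splits into roughly 67% death and 12% survival with profound impairment, which is the distinction that a single composite figure hides.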

A brand new publication by parents of an extremely preterm baby and Mark Hudak, a neonatologist from Florida, has just appeared. The father writes a blog, “They don’t cry”, that I have often visited (which is unfortunately not mentioned in the article, and includes a great video of the baby, now a 4 year old child, reading a dinosaur book). The article recounts the experiences of the parents, and is well worth reading; if you don’t have access to the article, Eric Ruthford (the dad) recounts some of the same experiences in the early posts on his blog. One horrifying interaction came just before his son, Gabriel, was born:

When birth became imminent at 22 weeks and 6 days, 2 neonatologists counseled us that standard practice was to not resuscitate infants born before 23 weeks and 0 days and that many neonatologists in our region believed that resuscitation was unethical in the 22nd week.

The neonatologist who arrived 30 minutes after Miri’s water broke said, “At this stage, I don’t recommend that babies should be intubated because the results are so poor. If you give birth after midnight—that’s just the line for when we’ll intervene—I’ll be the one who comes and resuscitates the baby, but my heart won’t fully be in it.”

I hope the neonatologist who said that, and suggested that the approach would change at midnight (is that on the first stroke of midnight, or was he going to wait until the 12 chimes had all rung?), is embarrassed by that now. Apparently he did finally come to resuscitate Gabriel at 11:20 pm at 22 weeks and 6 days, and did a good job according to the parents, who offer the following advice:

  • Physicians should seek to understand the values and motivations that underlie the wishes that parents express. If parents ask the physician to not resuscitate their infant, the physician can probe this by saying, “What, in your mind, are some reasons for this decision?” Although some may think this is insensitive, an honest response will help illuminate underlying parental concerns and allow the physician to speak directly to them.

Our motivations were driven both by our religious value that all life, no matter how brief, glorifies God and by our belief in Gabriel’s autonomy—if he could survive, we owed him that chance.

  • When an infant is going to be born in to the “gray zone” in which resuscitation is a parental choice, the physician can say, “Your child will be welcome in our nursery.” Such an approach would have greatly diminished our stress without introducing bias either way and would have affirmed Gabriel as a person. Miri remembers being especially frustrated during the antenatal counseling that the doctors talked about him as a medical condition, not as Gabriel—we had picked his name at that point—or even as, “your baby.” Miri viewed 22 weeks and 6 days as a description of her condition, not as a way of describing Gabriel, and she regarded the statistics relating gestational age to outcomes as being similarly impersonal.

  • The physician can talk about the differences between a child who lives an hour in the delivery room versus one who lives for a few days or weeks in the NICU. Some parents might believe that a short goodbye would be easier. Other parents might feel worse if they did not give their child a chance to survive. We were in the latter camp, and were sobered but not dissuaded when the doctor who recommended against resuscitation told us that setbacks and failures in an infant’s treatment become harder to take later on. In our 5-month NICU stay, Gabriel did have setbacks that frightened us and we often feared that he might not survive: but we never had second thoughts about our decision to offer him a chance for life.

  • For some parents, statistics about functional outcomes will influence decisions. Optimally, outcomes should be more robustly descriptive. “Profound to severe disability” and “severe to moderate disability” sounded to us like “life without parole.” It would be helpful to hear directly from the parents of a premature infant about their perception of their child’s happiness—and their own. For parents concerned about their child’s future abilities, a visit from a pediatric neurologist or developmental specialist who can provide first-hand knowledge about the daily lives of former premature infants could be similarly instructive. For parents concerned about the expense of care and about their inability to leave money in their wills to a potentially disabled adult, a visit from a financial case worker could help. Alternatively, an online system or binder with printed materials might convey information in all 3 areas.

The parents’ thoughts are accompanied by a thoughtful discussion by the neonatologist, who states:

They suggest that parents have an opportunity to talk with other parents of premature infants who survived with disability. Perhaps neonatologists should have the same opportunity to challenge their biases. An increasing literature attests to the fact that many disabled survivors of prematurity self-report an acceptable quality of life and do not regret their survival. And should not that be a key consideration for all of us?

The final section is written by the 3 authors together; it ends:

Exploring the fundamental motivations behind parental desires can guide information sharing to be more illuminating than a recitation of survival statistics or graded descriptions of long-term neurodevelopment that do not meaningfully convey a child’s potential abilities. Under similar circumstances, 2 sets of parents may reach different but nonetheless supportable informed decisions. A physician often has these discussions thinking what he or she would decide in a similar circumstance. Yet in the gray zone, the physician is obliged to put aside personal bias to forge a partnership with the parents and to support their most informed decision on behalf of themselves and their child.

That picks up some vitally important issues. Policies and position statements have in the past focussed on ensuring that we tell parents all of the bad things that can happen, and all the potential limitations of extremely preterm babies. When do we tell them the positives? What most preterm babies can do, how they positively impact the lives of their families, along with the difficulties?

The outcomes that we should be measuring should be broader and related to function and abilities. They should be reported in ways which describe the range of capacities of our graduates, and show their abilities, not just their disabilities. We should not lump together outcomes which have very different implications for parents, and for the child themselves. As Saroj Saigal and I wrote in an editorial once (Barrington KJ, Saigal S. Long-term caring for neonates. Paediatr Child Health. 2006;11(5):265-6), we should be proud of the way that neonatologists invented the field of outcomes research, but we need to do still more, to ensure that we don’t just identify and measure problems but study ways to lessen their impacts and further improve the lives of our patients.

About keithbarrington

I am a neonatologist and clinical researcher at Sainte Justine University Health Center in Montréal
  1. Drpkrajiv says:

    A great blog. The facts of the predicament of the 23 weeker are very well presented. The analysis of the baby's brain potential is cloudy, and in the end are we justified to play god? We should have more objective analysis of the brain, like MR spectroscopy, to prognosticate intact survival at the bedside. The values may need to be standardised for age.

    • I think we need to be very careful about introducing new technologies for prognostication. Unless we have a very high positive predictive value for profoundly limited outcomes, then using any additional test risks making the predicament worse. A recent systematic review of EEGs for prediction, for example, showed that there are statistically significant correlations between EEG abnormalities and outcomes. That is not at all the same thing as saying that for an individual patient, EEG findings are sufficiently prognostically accurate to use them to make treatment decisions.

  3. Thanks for a very thoughtful post. The outcomes that matter change over time as children age. We begin at 3-4 months with using the GMA to assess risk for CP, the TIMP to assess functional abilities, and the HINE to assess neurologic signs. These measures tell us what we should do NOW, i.e., at 3-4 months, while we watch longitudinally for other outcomes that matter without a wait and see attitude that prevents children from receiving therapy until they don’t walk or otherwise present huge failure to meet expected developmental outcomes. On the other hand, negative results on these early tests are very reassuring as they have high predictability to later outcome. Statement of Interests: I am the Manager of Infant Motor Performance Scales, the publisher of the TIMP.

