My previous post about the FETO trials noted that the published trials reported a clear benefit of antenatal treatment of the highest risk group, but the moderate risk group had an improved outcome which didn’t meet classic definitions of statistical significance. I noted then that, if the trials had been run as a single trial with 2 risk strata, the overall benefit for the entire sample would have been highly significant, and that it was likely that a test for interaction would not have shown a statistically significant difference between the risk strata.
In this re-analysis the authors looked at the GA of intervention and conclude that FETO leads to improved survival in both the severe and the moderate groups, indeed overall there was no impact of severity (O:E LHR) on effectiveness of the procedure, rather it seemed that earlier treatment was associated with greater improvements in survival, but also greater increases in prematurity.
These 2 figures show the survival advantage of having FETO as compared to usual treatment based on the O:E LHR. Even with the lowest risk babies in the trial, FETO is advantageous, as shown by the blue line and the blue shaded area representing the confidence limits of the benefit. The dotted lines are what the predicted survival benefits would have been if the babies had FETO at different GA, not allowed by the protocol, suggesting an even greater survival benefit for the moderate babies if they had FETO earlier.
Of course this analysis is confounded by the correlation between severity and timing of intervention, as by design more severe lesions were treated earlier.
The next challenge will be to find ways of preventing membrane rupture and preterm labour after fetoscopic intervention; if we can do that, then the benefit of FETO for improving survival will likely be even greater. I think this analysis confirms that, if it is available, FETO should be discussed with mothers carrying fetuses at moderate risk, not just the most severely affected.
Just imagine for a moment that you are the parent of Jo, who is 4 years old, Jo has a sudden onset of breathlessness and the investigations in the Emergency Room show a spontaneous pneumothorax, that needs a drain. You are approached to participate in a research trial of a new anaesthetic spray, shown to be effective in animals, to be given prior to the intercostal incision between Jo’s ribs for chest drain insertion. The control group will have a placebo spray with no active ingredients. The study is a standard, high-quality placebo-controlled trial with masked randomization, and masked intervention. Sounds great, right? You are informed that the local ethics committee approved the trial, which has been registered on a public database.
I think any parent agreeing to potentially have a chest drain inserted in their child without local anaesthesia would be failing in their duty of care to that infant; any researcher designing, performing, or interpreting such a study is unethically assuring that half of the research subjects will feel avoidable pain; and any journal publishing the results is complicit in promoting such unethical practices. The approval of such a trial by an ethics committee does not absolve anyone of their responsibility for their participation.
There is no excuse for designing or performing or publishing a study where any of the infants are assigned to have pain. Just as there would be no excuse for those involved in Jo’s care allowing or participating in a trial where half of the children have a chest drain inserted with no analgesia.
How should we perform research to advance the care of newborn infants who need painful procedures?
The only reason for performing clinical research is to improve care. So trials showing that a new intervention is better at controlling pain than no analgesia are useless. The only justification for such a trial would be if analgesic interventions currently known to be effective were unavailable or toxic. Neither is true. Skin-to-skin care may not always be possible prior to a heel-stick, for example, if it is needed urgently, or the parents are absent, or the baby is too unstable to move in a hurry. In those cases giving a dose of sucrose and a soother can easily be done. If you don’t have licensed preparations of sucrose then concentrated sterile glucose can be given orally; overall I think it is a little less effective, but it still causes a beneficial reduction in pain.
Clinically useful, ethically acceptable pain research is not difficult to perform in newborn infants; the above examples are from around the world, and I have found examples from countries with good resources, and from countries with very limited resources. As has been noted (Harrison D, et al. Sweet Solutions to Reduce Procedural Pain in Neonates: A Meta-analysis. Pediatrics. 2017;139(1):e20160955.), the effectiveness of sweet solutions has been known since the very first trial, and since their effectiveness became incontrovertible, more than 800 babies have been assigned to groups who are intended to experience avoidable pain. Since that analysis was published in 2017, there have been many more.
The first priority must always be to put the baby and their family first, and treat them as you would wish Jo to be treated, if you had to go with Jo to the Emergency Room. Ensuring that your child had adequate pain protection for a planned painful procedure should definitely be your goal, the same thing applies for our patients.
Unethical pain studies are still being published, in journals which include several from mainstream publishing houses.
In these studies, published recently and appearing on-line in recent weeks, newborn infants were assigned by the researchers to experience pain. The reviewers of the articles accepted that the research subject babies had pain imposed by the investigators. The editors of the journals involved are also complicit by agreeing to publish articles whose research required that babies experienced avoidable pain.
This must stop.
It is unacceptable to inflict pain on babies in order to complete a research project.
Please inform the editorial board of the following journals of your concerns. The chief editor’s professional email is provided, or an email address associated with the publication if I can find one.
I will repost this page with new articles whenever I spot them.
The Effect of Maternal Voice on Venipuncture Induced Pain in Neonates: A Randomized Study. Chen Y, Li Y, Sun J, Han D, Feng S, Zhang X. Pain Manag Nurs. 2021 Oct;22(5):668-673. doi: 10.1016/j.pmn.2021.01.002. Epub 2021 Mar 2. PMID: 33674242. The trial randomized 58 babies undergoing venepuncture to have a recording of their mother’s voice, or to have the painful procedure without analgesia. Editor-in-Chief: Dr Elaine Miller, MILLEREL@ucmail.uc.edu
Effect of White Noise and Lullabies on Pain and Vital Signs in Invasive Interventions Applied to Premature Babies. Döra Ö, Büyük ET. Pain Manag Nurs. 2021 Dec;22(6):724-729. doi: 10.1016/j.pmn.2021.05.005. Epub 2021 Jun 28. PMID: 34210600. In this trial (which was either randomized, according to the text, or non-randomized, according to the figure) 66 babies were assigned to have a lullaby, or white noise, or no intervention, prior to the pain of a “blood collection” (not further specified). Editor-in-Chief of Pain Management Nursing: Dr Elaine Miller, MILLEREL@ucmail.uc.edu
Gao H, Xu G, Li F, Lv H, Rong H, Mi Y, Li M. Effect of combined pharmacological, behavioral, and physical interventions for procedural pain on salivary cortisol and neurobehavioral development in preterm infants: a randomized controlled trial. Pain. 2021 Jan;162(1):253-262. doi: 10.1097/j.pain.0000000000002015. PMID: 32773596. In this trial, 38 preterm infants were randomized within 72 hours of birth to have all of their painful procedures during hospitalisation untreated; they were allowed to get a soother when they cried! In the combined interventions group, preterm infants received sucrose, massage, music, non-nutritive sucking, and gentle human touch. I think this is the worst of all the trials I have seen recently: the control babies had repeated evaluations of PIPP score during painful procedures, and continued to have multiple experiences of moderate to severe pain, with average PIPP scores of over 12. How can the reviewers and editors possibly have thought this was OK? Editor-in-Chief: Dr Francis Keefe, keefe003@mc.duke.edu
Here is some suggested text you can cut and paste into an email if you wish.
The article “XXXXX” published in your journal, describes research which is clearly unethical and should not have been published. In this study, newborn babies were assigned to a group designed to experience pain. Effective methods to prevent pain caused by skin breaking procedures are well known, easily available, and cheap or free. Those methods include kangaroo care/skin-to-skin contact, oral sucrose or glucose solutions, especially when combined with non-nutritive sucking, and breastfeeding. There is no valid reason for denying such pain reduction methods to research subjects. Publication in a high-quality journal such as the one you edit gives credibility to the research and suggests that it is acceptable to inflict pain on babies in order to complete a research project. Research which compares an analgesic intervention for a painful procedure in newborn infants to an untreated control group is useless in improving care. As effective pain measures are already well known, the only research which could possibly improve care is that which compares different analgesic interventions, or examines the addition of measures to those already known to be effective. The most effective way your journal could improve pain control in newborn infants would be to cease publishing research which unethically randomizes babies to have avoidable pain. All future trials in newborn infants undergoing planned painful procedures that the journal publishes should ensure that research subjects in all groups receive proven effective methods of pain control. I urge you to retract this article and to establish editorial standards which prohibit the publication of research in which avoidable pain is imposed on newborn infants. Pain inflicted on babies in order to perform research should be an immediate criterion for rejection of a manuscript.
Although babies under 25 weeks account for a tiny proportion of births, and a small proportion of NICU admissions, the importance of the question asked in the title can be seen by the ongoing number of publications, below are just a few of recent relevant publications of interest.
These first 2 are results from an on-line survey study in the UK. The 336 respondents were 50% neonatologists, but, as an on-line study, it is hard to know how representative the responses are. Nevertheless it really suggests a shift in attitudes, with respondents more likely than in previous years to accept active intervention at 22 weeks, and to take into account other risk factors than just gestational age. A major concern I have with these types of decisions is that they are often based on risk of mortality, and also on risk of “major impairment”. Although what this means is discussed in the “framework” document, and is generally intended to mean a severe intellectual impairment (IQ <55), disabling CP, blindness or profound hearing impairment, I don’t think that the implications of that label are necessarily always explicit, especially in an online survey which didn’t, as far as I can see from the supplement, re-iterate the criteria for a definition of major impairment.
Do we really think that an increased risk of blindness is a good reason for denying intensive care to babies? Or disabling motor dysfunction with a normal intellect? I think in reality that it is the risk of severe intellectual impairment which drives all of our concerns about “severe impairment”, and that we should acknowledge that. To take this further, if a baby was born needing resuscitation at term, with known congenital blindness, would it be ethically acceptable to deny that baby intervention because of known (rather than a predicted statistically increased chance of) blindness? We could ask the same thing for congenital deafness, and for GMFCS of 3 or 4 (level 3 meaning able to walk with a hand-held mobility device, and level 4 meaning self-mobility is limited, may need a wheelchair, but usually with head and trunk control), if we had a definite diagnosis of those impairments prior to birth would it be justifiable to withhold intensive care? I think we should focus on the outcomes we really worry about the most, level V GMFCS, and very severe intellectual impairment. They are outcomes which are relatively uncommon, even at the most extreme gestational ages.
In addition, in this study, the respondents were asked to estimate the probability of death and severe impairment, and the responses were extremely variable. This figure shows the range of opinions about potential survival and proportion of “severe disability”, each dot represents one response, the horizontal lines are medians and IQRs, and the asterisks are estimates from the on-line NICHD risk calculator. However, the NICHD calculator now gives a range of outcomes, rather than a point estimate, and the definitions of profound impairment, etc. are not the same as the BAPM definitions. The data do show major variability in estimates of survival and long term outcomes.
This survey study asked physicians involved in perinatal care what makes an acceptable quality of life, and what proportion of infants born at various extreme gestational ages would be likely to have an acceptable QoL, and then what their own resuscitation preferences were. One notable finding is that very few respondents had any encounters with former very preterm children, and those with the fewest encounters, in particular obstetricians, had the most negative views of quality of life outcomes.
There was a close correlation between having a negative opinion of the QoL of extremely preterm infants, and not wishing to actively intervene. This has been shown before; the over-estimation of serious adverse outcomes drives an unwillingness to actively intervene.
Of course, an unacceptable quality of life is by no means equivalent to severe disability, the large majority of disabled children have a good quality of life, there is nothing in the literature to support the prejudice against extremely immature babies, that supposes that they have a high probability of an unacceptable quality of life.
One other disturbing result of the LoRe study was the belief that having an extremely preterm baby would have a negative impact on the family as a whole, a belief held by 73% of the respondents. There is absolutely no data to support this, the large majority of families report both positive and negative impacts of very preterm delivery. Of course there are challenges, sometimes major challenges, but living with a former extremely preterm infant does not have an overall negative impact on families.
In this study parents were counselled about threatened extreme preterm delivery at between 23 weeks 0 days and 24 weeks 6 days, they and their physicians then completed questionnaires about shared decision making, and they were later followed up to see if they experienced decisional regret. This study took place in the Netherlands, where active care is generally not offered at 23 weeks gestation, and, in fact, the mothers included in this study all delivered at 24 weeks (n=3) or later. To be brutal, shared decision-making does not occur in the Netherlands for babies born at <24 weeks gestation. Parents are given no share in that decision. Four of the 22 counselling sessions resulted in a decision for perinatal palliative care, there were low scores for decisional conflict, and, from the few surveys returned 1 month later, little decisional regret. I can’t tell from the publication if any babies actually died after receiving perinatal palliative care or not, which is an important limitation. Decisional regret is relatively uncommon after many different decisions in life, we all tend to believe in retrospect that we probably made the right decision. Decisions which lead to a child dying may be different, especially if you later discover that your baby had a greater chance of survival, and of survival with a good QoL, than you were informed. Another study by this group showed no significant difference in decisional regret according to the decision taken, but there were only 4 who had a decision for palliative care, and 2 of those 4 had high scores for decisional regret.
De Proost L, et al. On the limits of viability: toward an individualized prognosis-based approach. J Perinatol. 2020;40(12):1736-8. This article is a commentary, the authors of which include the first author of the previous article, pleading for a revision of the Dutch guidelines to take into account other factors than gestational age, and against a rigid guideline based on GA alone. Indeed, I think it is about time; survival of babies in Holland at 24 weeks is substantially lower than many other jurisdictions, and survival of babies at 23 weeks gestation is 0. This is despite a high quality health care system, dedicated staff, good regionalisation, excellent training, all the things that are needed to have excellent results. I think Dutch families deserve better.
Overall, I think these articles give me some hope that things are changing in the right direction: caregivers are more willing to intervene for more immature babies, and some prominent physicians are working to promote individualized decision making, rather than blanket denials of active care. There is still a crying need for education, otherwise couples with threatened extremely preterm delivery are being given information which is biased and inaccurate.
Among the participants of this study, “the majority described a good QoL in terms of emotional well-being (eg “loved”, “happy”, “supported”), whereas a poor QoL was described in terms of functionality (eg “dependent” and “confined”)”. They were a mix of different ethnic, religious, educational and financial backgrounds, but the sample is too small to analyze whether those factors were important in their attitudes.
It has been shown several times that healthcare workers’ attitudes differ from the rest of the population. I think probably rather more than 54% of caregivers would agree that “some disabilities are worse than death”; certainly that is the implication from the study by LoRe et al, but this study from Indianapolis had a substantial number of respondents who disagreed with that statement.
Update: Jan 10 2022; I slightly changed the title of this post, from “how do we make decisions for the most immature babies, and their families” to “how do we make decisions for the most immature babies, with their families” as I think that better reflects the subject matter (about shared decision-making) and my feelings of how we should partner with families to make the best decisions for our patients.
In order to answer the question posed in the title you would need to take babies at risk of seizures, but not yet having clinically diagnosed convulsions, randomize them to have routine continuous EEG monitoring or not and then treat according to the EEG results or according to whether they had clinical seizures. In order to know what was happening in the controls, they should have continuous EEG monitoring as well. Then you would follow the babies up to an age where clinically important differences can be reliably determined.
That study design ignores the fact that we don’t even really know whether treatment of clinical seizures improves long term outcomes, compared to just ignoring them. I am not suggesting we should ignore them, just that we have no proof that anticonvulsant treatment of seizures improves long term outcomes. Many neonates have a few seizures during a relatively short period of their life, and it is possible that treating them doesn’t do much to protect their brain. It is, however, the standard of care to treat recurrent clinical seizures in the newborn, and current evidence suggests that phenobarbitone is the preferable first line treatment.
NEST (Newborn Electrographic Seizure Trial) was designed to enrol 520 babies in 2 groups, with amplitude integrated EEG either masked or unmasked. However, slow enrolment and loss of equipoise led to termination of the trial after 212 babies were randomized, and 20 in each arm were lost to outcome assessment, leaving 86 babies in each group.
As the visual abstract above shows, in terms of outcomes there were small differences in “survival without disability” which favoured the treatment of clinical seizures alone group. As this table shows all of the electrographic seizure treatment babies and all but 3 CSG (clinical seizure treatment alone) babies did indeed have seizures on the EEG, and babies in the ESG (electrographic seizure treatment group) received more anticonvulsants; most babies in both groups received the maximum 40 mg/kg of phenobarb that was allowed but more ESG babies received phenytoin and 3rd or 4th medications.
What isn’t discussed very much is that the ESG babies had a greater seizure burden, especially between 12 and 72 hours, where the seizure burden was about double. So, despite randomization, seizure burden was higher in the group that eventually did worse. Also, it is not clear why there were 78 babies in the CSG group with seizures, but only 64 of them received phenobarb, whilst the treatment algorithm suggests that any baby with a single seizure should receive at least one loading dose of 20 mg/kg of phenobarb. Perhaps they had just a single seizure and the attending physician decided not to follow the protocol, suggesting that they were less seriously affected infants.
Unfortunately the early termination of this trial and the difference in seizure burden despite randomization make it difficult to conclude one way or the other regarding treatment of non-clinical seizures. But the small difference in the results favouring clinical seizure treatment alone will allow future investigators to design and perform another trial. The actual differences in medications administered were relatively small: there were 14 in the clinical group who received no anticonvulsants, and 16 who received more than one anticonvulsant, compared to 0 and 26 in the electrographic group. This suggests that only 24 babies in the trial had their management affected (the 14 additional babies who received no anticonvulsant in the clinical group, plus the 10 additional babies who received more than one anticonvulsant in the electrographic group), which makes the trial, the largest yet conducted of the issue, severely under-powered. What I mean is that even if the underlying reason for treating was different in the two groups, the majority of infants in both groups received 40 mg/kg of phenobarb and nothing else, and you wouldn’t expect much difference in outcome in those infants, even though perhaps the ESG babies received their second phenobarb load earlier.
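The arithmetic behind that figure of 24 can be written out as a short sketch. The four counts are the ones quoted above; the denominator (the 86 babies per arm with outcome data) is my assumption about which babies those counts refer to, and the function and variable names are purely illustrative.

```python
# Counts as quoted in the text above for the NEST trial.
# csg = clinical seizure treatment group, esg = electrographic seizure treatment group.
csg = {"no_anticonvulsant": 14, "more_than_one": 16}
esg = {"no_anticonvulsant": 0, "more_than_one": 26}

# ASSUMPTION: the counts refer to the 86 babies per arm with outcome data.
ANALYSED_PER_ARM = 86


def management_affected(clinical, electrographic):
    """Babies whose drug exposure plausibly differed because of allocation."""
    # Extra babies left untreated when only clinical seizures were treated.
    extra_untreated = clinical["no_anticonvulsant"] - electrographic["no_anticonvulsant"]
    # Extra babies escalated beyond one drug when EEG seizures were treated.
    extra_multi_drug = electrographic["more_than_one"] - clinical["more_than_one"]
    return extra_untreated + extra_multi_drug


affected = management_affected(csg, esg)
share = affected / (2 * ANALYSED_PER_ARM)
print(f"{affected} babies, about {share:.0%} of {2 * ANALYSED_PER_ARM} analysed")
```

On those assumptions, only about one baby in seven could have contributed to any difference in outcome between the arms, which is why a much larger trial would be needed.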
A future trial will have to be significantly larger than this one, large enough to ensure that randomization leads to a seizure burden which is similar between groups, and large enough to ensure a significant difference in anticonvulsant receipt. I think conventional EEG is probably preferable, although many centres cannot easily install cEEG in the middle of the night, so starting with aEEG and switching to cEEG when possible would be a pragmatic design reflecting reasonable current practice.
Based on this trial and the rest of the literature, I still have concerns that multiple electrographic seizures are associated with, and may worsen, brain injury. I certainly don’t think this means that we should ignore the frequent “non-convulsive” seizures that are seen in asphyxia, especially after initial doses of anticonvulsants, but perhaps we can be less aggressive at treating occasional electrographic seizures. In contrast, I have seen babies with electrographic status epilepticus lasting hours, with no clinically evident convulsive movements, and I am sure that can’t be good for your brain. I think it is likely that phenobarb treatment of such babies is the right thing to do, as brain protection by phenobarb has been shown in a number of models, and in one human infant RCT. What the second line treatment should be, I am completely in the dark; bumetanide, or perhaps just more phenobarb?
One of the trials we have been waiting for has just been published: Dargaville PA, et al. Effect of Minimally Invasive Surfactant Therapy vs Sham Treatment on Death or Bronchopulmonary Dysplasia in Preterm Infants With Respiratory Distress Syndrome: The OPTIMIST-A Randomized Clinical Trial. JAMA. 2021;326(24):2478-87. As you can see above, there was “no significant difference” in the primary outcome of death or BPD, as defined by needing more than 30% O2, or being on respiratory support, or having <30% oxygen and failing a room air challenge, all at 36 weeks. I have ranted often enough about this outcome that I won’t repeat it here. I will say that I think OPTIMIST is the second best trial name ever in neonatology (after ELVIS), but maybe it was too OPTIMISTIC to think that there would be a large reduction in BPD in the most immature infants who were stable enough to be eligible for this study. To be eligible the baby had to be breathing spontaneously on CPAP in the NICU, so many of the sickest babies would already be intubated, and infants thought to be in need of immediate intubation were also excluded. Those are quite appropriate exclusions, but it means that the remaining babies were relatively lower risk infants.
I’m trying out a new format with a sort of home-made visual summary of the article that you can see above, but it’s quite a lot of extra work, so I don’t know if I’ll continue. Anyway, my bottom line is that MIST looks positive for the 27 and 28 week infants, but the unexpected increase in mortality at 25-26 weeks gives me pause. Apparently the causes of mortality were distributed among all the usual causes, which is a bit strange as all the usual adverse outcomes were slightly less common with MIST, late onset sepsis, NEC, spontaneous perforation and IVH.
As usual, I hate to say it, but I think we need another study: a trial with even earlier surfactant, perhaps prophylactic MIST or at 25% oxygen, and concentrating on the more immature infants, perhaps combined with a higher dose of caffeine.
I think if such a trial could show the pulmonary benefits, in particular the reduction in home oxygen therapy, without a mortality effect, then MIST/LISA could become routine for the very immature, 25 to 26 week infant. Until then, I think that an overall benefit in those babies is not proven.
The upcoming issue of “Seminars in Perinatology” is about the controversies in caring for the babies <25 weeks gestation. Babies born at 24, 23, 22 or even now 21 weeks gestation are so physiologically immature that we can’t just assume that something that works at 28 weeks will also be beneficial for that sub-group. My contribution to the collection of articles (Barrington KJ. The most immature infants: Is evidence-based practice possible? Semin Perinatol. 2021:151543) was an attempt to find data about the most immature babies from recent large RCTs, the kind of evidence base that is reliable for determining treatment options. I searched large multi-centre RCTs from the last 5 years, and found 30 trials enrolling a total of over 25,000 very preterm infants. Many trials excluded the most immature babies, either formally by setting a lower limit for gestational age, or by being performed in countries (France and Holland) where active treatment at 22 and 23 weeks is absent or rare, and even at 24 weeks is (or was) quite limited. I could finally find only 3 babies of 22 weeks whose data were presented in those studies, and very few who were clearly of 23 weeks gestation. There were three trials that reported results by GA strata that included a stratum <25 weeks, and those trials, in total, included 711 babies.
Our entire evidence-base for therapies in the most immature babies is thus extremely limited, and most things that we have questions about are uncertain for those babies. That uncertainty can be seen in other articles of the series, such as this one (Sindelar R, et al. Respiratory management for extremely premature infants born at 22 to 23 weeks of gestation in proactive centers in Sweden, Japan, and USA. Semin Perinatol. 2021:151540) in which Richard Sindelar and his co-authors describe how different centers use assisted ventilation for the profoundly immature. In Uppsala, Sweden they start the babies on “conventional” ventilation with volume guarantee, and are able to maintain the majority of their babies like this, in Iowa, USA they start the infants on high-frequency jet ventilation and extubate at 850 grams minimum, in Kanagawa, Japan they start on conventional ventilation but change to high-frequency oscillation early if cardiac function is good. The authors describe a few points in common in their approaches, and one that is not mentioned. They have in common the use of 2.0 mm ETT for the 22 weeks infants, a focus on preventing over-distension, especially with the oversight of experienced clinicians, minimal handling, not trying to extubate too early, and giving surfactant very early. What they don’t mention is probably the most important factor; a belief that good survival is possible, that good long term outcomes are usual, and that it really is worth it.
To follow on from my previous post, there is also an article from Thuy Mai-Luu who runs our follow-up program at Sainte Justine, and Rebecca Pearce, an ex-NICU parent. They discuss how to improve neonatal follow-up to make it more relevant to parents, they include the following table:
The focus on examining the strengths, rather than just the weaknesses, of former preterm infants, and avoiding simplistic dichotomised outcomes in favour of a broader, more balanced portrait of functions, are approaches that I absolutely agree with.
There are several other articles in this issue that should have a major impact on how we practice, including a pro-active integrated approach to the mother with her fetus, the nursing approach to these babies and their parents, and many others, all of which I would recommend. Given the possibility of more than 50% survival at 22 weeks gestation, shown by several groups, it is no longer ethically justifiable to universally deny intervention to such infants. But as we plan our intervention strategies for such babies we should all be listening to the experts and learn from how they have been able to get such good results.
Neonatologists can be said to have invented “Outcomes Research”, the systematic evaluation of patients after an acute-care event. It is a field of research which has now been extended from early infancy into the adult period and provided enormous insights into what happens to babies who are born very preterm. When the field was developing there were concerns about how our patients would fare when their brain development had been interrupted by preterm delivery, often accompanied by serious illness, and sometimes major complications, surgery, and brain injuries seen on CT scan or ultrasound. As a result much work has concentrated on neurological examinations and developmental screening tests, partly to evaluate and describe outcomes, but also to ensure that children received any interventions they might need with minimal delay.
The major question that we need to ask now is: what outcomes are most important to parents? One serious problem with our outcome research had been the attempt to categorize infants into “normal” and “abnormal”. This has been driven by the appropriate concern that our NICUs should be aiming for not just improved survival, but for improved “intact survival”. That is a term that I have heard many times, but it requires some thought; what do we mean by intact? And who decides what is an acceptable outcome?
I have written many times on this blog about the problems that arise when we decide that a Bayley score at 18 to 24 months of age more than 2 SD below the mean is an adverse outcome, whereas a score of 1.9 SD below the mean is a good outcome, even when the child has a serious behaviour disorder. This arbitrary and artificial dichotomising of the whole of child development skews the literature, and has impacts on how we care for babies.
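To make the arbitrariness of that cut-off concrete, here is a minimal sketch. It assumes the usual scaling of Bayley composite scores (mean 100, SD 15), so that 2 SD below the mean is a score of 70; the function names and the labels "adverse" and "good" are purely illustrative.

```python
# Minimal sketch of dichotomising a continuous score at a fixed threshold.
# Assumption: composite scores are scaled to mean 100 and SD 15,
# so the "2 SD below the mean" cut-off corresponds to a score of 70.
BAYLEY_MEAN = 100.0
BAYLEY_SD = 15.0


def score_at(sd_below_mean: float) -> float:
    """Convert 'number of SD below the mean' into a composite score."""
    return BAYLEY_MEAN - sd_below_mean * BAYLEY_SD


def dichotomise(score: float, cutoff: float = 70.0) -> str:
    """Label a score the way a dichotomised outcome would (illustrative labels)."""
    return "adverse" if score < cutoff else "good"


if __name__ == "__main__":
    for sd in (1.9, 2.1):
        s = score_at(sd)
        print(f"{sd} SD below the mean -> score {s:.1f} -> {dichotomise(s)}")
```

Two children whose scores differ by only 3 points end up in opposite categories, while a child scoring 71.5 and one scoring 110 are counted as having the same outcome.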
In this post I will not be concentrating much on how the artificial dichotomising of child development skews research design, even though that is a major problem: it is easier to design studies with a dichotomous composite, where death and “NDI” are given equal weight, and the answer to whether one treatment is better than another is then supposed to be yes or no, based on the impact on that composite outcome.
In everyday clinical practice we (as health care professionals) really want to know whether treatment A decreases death compared to treatment B, and if not, what are the relative impacts on developmental progress, pulmonary development, neurologic impairment, etc. But what do parents want to know? I think we can be sure that parents would want to know if one treatment compared to another affected mortality, but beyond that, what outcomes are really important to parents? What are their relative importance? What kind of adverse outcome would be enough to outweigh an improvement in survival?
Personally, I would be surprised if a group of parents cared whether their babies’ Bayley scores were more likely to be above or below 70.
A group of collaborators from my hospital have just published an article reporting a study (which is part of the Parent Voices Project) where parents were asked their opinions about the progress of their extremely preterm infants. Jaworski M, et al. Parental perspective on important health outcomes of extremely preterm infants. Arch Dis Child Fetal Neonatal Ed. 2021:fetalneonatal-2021-322711. This was a group of parents with 213 children aged from 18 months to 7 years of age, born at less than 29 weeks gestational age. Children are routinely followed until 36 months corrected age, and then some are discharged if they are doing well with normal developmental progress and the parents have no behavioural concerns, which means that the older children in the cohort tend to be those where there are some issues that need following. Overall 55% of parents had some concerns about their infant’s developmental progress, including 18% who had concerns about behavioural and emotional issues. Parents of the 53 infants identified as having “severe NDI” (defined as a motor, language, or cognitive composite score <70 on the Bayley3, or cerebral palsy with a GMFCS of 3, 4, or 5, or needing amplification or being visually impaired in both eyes), were only slightly more likely to have concerns about the development of their infant than the 82 without “NDI”. Growth, feeding and respiratory concerns were similarly represented among the subgroups. Mothers and fathers had similar concerns and there were no substantial differences by gestational age group.
Of the issues that matter to parents, the quotes suggest that it is really the child’s functioning that they worry about (‘I wish she would express herself more clearly’), whereas scores on standardised tests were never mentioned! Of course, if those scores help to identify infants who would benefit from intervention, then they may be of value to the individual. If they, overall, reflect the function of the group then they could be a reasonable way of summarizing outcomes. But behavioural/emotional development is very rarely mentioned in outcome studies, despite the major importance to many parents, nor are feeding and growth commonly reported.
This project says to me that we should systematically describe and report behavioural, emotional, feeding and growth outcomes in our follow-up studies, and in the follow-up of our RCTs in extremely preterm infants. Those things concern parents just as much as developmental delay, and have impacts on the families similar to other things that we describe routinely.
As a parent myself of an extremely preterm infant, perhaps I can allow myself, for this first post of the new year, to give my opinion of what outcomes matter. My daughter indeed had her standardized tests, administered by a friend and colleague in the McGill University follow up system (Dr Elise Couture). I was rather uninterested in the results of those tests, I don’t even remember if she “passed” or not. I knew she was progressing, was learning new skills, and had behavioural issues that were not out of the ordinary. She had feeding issues that were more disruptive than any problems with cognitive development, and when she got to school had some issues with her learning style, of an executive function type. With the dedication of her mother, in particular, she was able to progress through school, and is very soon to graduate from high school, at which she works extremely hard. For me the most important features of follow-up were to know whether there were any specific therapies that could help to make her life easier: Did she need physio? Would speech therapy help? How could we best assist her learning pattern? Her ENT problems and pulmonary development were also of major interest, and fortunately she didn’t have any major hearing or vision problems needing intervention, other than spectacles. I try to imagine what my attitude would have been if she had turned out to have a Bayley MDI (she did the version 2) score <70. I can’t imagine Annie or I would have treated her any differently, and the idea that having a low Bayley score would have been the same outcome as her being dead, I find offensive.
On the other hand, evaluating, and finding a way to summarize, the developmental progress of our patients is worthwhile; having a more robust development, more similar to that of babies born at term, would be a good thing. A more detailed description of outcomes, including average scores and the spread of scores, on standardized screening tests, would give a more useful picture of the outcomes of our patients than just the proportion of babies below various thresholds. More information about what outcomes matter to parents should help us to design follow up programs that respond to their needs.
Recently my heart betrayed me, after more than 60 years of excellent service it decided to beat far too quickly. My atria started to beat at nearly 300 times per minute, and my ventricles, unable to beat that fast, responded to every second atrial contraction to beat at 145 per minute; this is a cardiac arrhythmia known as atrial flutter. Although I felt OK in general, I could feel my heart was beating too fast, and after waiting for 30 minutes to see if it would calm down, I decided to take myself off to a local emergency room. I didn’t feel sick enough to call an ambulance, so I called a cab and asked them to take me to the nearest Emergency Room, at the Jewish General Hospital in Montreal. When I arrived, I was seen within seconds by a nurse who triaged me as a level 2 emergency, which meant an immediate admission to one of the ‘crash’ rooms (level 1 means you are actively bleeding or in cardiac arrest!). Within about 5 minutes of my arrival in the ER, I was in a highly monitored bed, with my ECG displayed for the emergency room doctor to see, an I.V. in place, in a hospital gown, having my vital signs taken by an experienced nurse, who was able to tell me that I was stable and that the doctor would see me within a few minutes, which she did.
In total I was in the ER for about 8 hours, during which time I had an x-ray, 2 formal ECGs, an expensive drug that I had never heard of before, multiple disposables, and a bedside echocardiogram. The expensive drug converted my heart rhythm to normal, after about 10 minutes, and then after a few hours of monitoring, I was able to go home. I got back home in the early hours of Canada Day, the year was the 150th anniversary of the convention that created Canada. Two days later I was seen by a specialist, and had another ECG and a formal echocardiogram by an experienced technician. About 4 weeks after this (I delayed the appointment for my annual vacation) I had a 48 hour recording of my heart activity, which showed no further evidence of atrial flutter, I was then seen by an internationally reputed expert in cardiac electrophysiology within a month, and we decided together that I should have a cardiac catheterisation and an endomyocardial ablation, which means advancing a catheter into my heart and burning the lining of the heart in a particular place to make it very unlikely that another episode would ever occur.
Three weeks after seeing the specialist in his clinic I was admitted to a day hospital, an ECG showed my heart rhythm was normal, and I was sent to the catheterisation room, one of the most advanced that exists anywhere, where under x-ray control, and with 3-dimensional mapping of my cardiac activity, the electrophysiologist burnt the inside of my heart, and interrupted the pathways that might otherwise leave me at risk of a recurrence. About 6 hours later I was allowed to go home.
When I got home from that, I thought I had better figure out my bill to make sure I could afford it all:
Taxi to hospital for initial episode $14, taxi home from hospital $14, outpatient prescription drugs $1.84 (outpatient drugs are not completely covered for most people), parking for the first specialist appointment $12. Public transport to get to the Sub-specialist appointment $3.25
Total cost $45.09
That’s it.
Everything else was covered by the provincial healthcare system, paid for by taxes.
Tommy Douglas was a former professional boxer, who was also a baptist minister, and is the father of Canadian Medicare. He was from the Canadian Prairies, has been referred to as the ‘greatest Canadian of all time’ and worked tirelessly to start a Canadian Health Care system which provides care to all, regardless of ability to pay.
Our system (actually systems, there are significant differences between provinces that are responsible for administering health care) is far from perfect, acute and emergency care tend to be favoured, so neonatal care, for example, what I do on a daily basis, is in a privileged position. Central management has advantages, it makes regionalisation quite effective, so we have almost no avoidable deliveries of very preterm babies in non-tertiary hospitals. Central management also creates problems, with, for example, the size of medical school intakes oscillating as the government tries to decide if we have too many physicians or too few, and keeps changing its mind.
Chronic care, and domiciliary care are the big losers in our system, as it is politically easier to cut budgets when the adverse effects are slowly cumulative rather than acutely visible. Non-urgent surgery is another place where our system does relatively poorly, so a hip replacement might be quite delayed, with consequent avoidable pain and disability. Although, in fact, some type of waiting list for non-urgent procedures is an important way of containing costs; if everyone can get a hip replacement within a few days of qualifying for one, there has to be a great deal of redundancy in the system.
One interesting comparison with the US system was made a few years ago by John Ralston Saul. The cost of US Medicare and Medicaid, divided by the entire US population, (even though those items only cover a small part of the US population) was substantially greater than the cost of Canadian Medicare, divided by the entire Canadian Population; but the Canadian system covers everybody. A system with a layer of administration dedicated to making a profit has to be more costly.
Outpatient drug costs are one area where there are substantial differences between provinces. In Quebec where I live, if your employer does not provide drug benefits there is an ‘individual mandate’ and everyone must buy their own medication insurance. This insurance has a maximum co-pay of about 1/3, but has an annual cap, which means that there is an annual maximum which any individual has to pay of 1066$ in any year. As I have now passed 65 years I am covered by the provincial medication insurance, as is anyone without a job or with a lower income.
I am certainly grateful for the Canadian health care system, both as a patient and as a part of the system; no-one finds themselves in debt because of medical costs, and those who pay taxes pay for the medical care of those who pay little or no taxes. In a relatively just society, that is how it should be.
I was stimulated to publish this post, which I wrote 3 years ago, on hearing of the tribulations of my friend and colleague Nick Embleton. In one of his posts on his blog he mentioned his gratitude that his medical care was covered by the UK National Health Service. He referred to the care as being “free”.
But of course it is not free, the nurses and physicians and technicians and administrative personnel are all paid, the drugs are paid for, the equipment is bought, the hospitals are built and maintained (sometimes poorly!). None of that is “free”.
What actually is happening is that we, as a society, have decided that when anyone gets sick we will all pay a little bit towards their care. Anyone who pays taxes in Quebec contributed a few cents towards my care, the purchase of the 3-dimensional fluoroscopy unit, my drugs, the salary of the nurses who cared for me and my physicians’ fees. Even those in other provinces contributed a little, as the federal government collects money from taxes and sends it to each province in the form of equalization payments.
This is sometimes referred to as public health insurance, but it isn’t really even that. Insurance policies are paid for by the individual towards an eventual adverse occurrence, in which case their costs will be paid. Paying for health care as a collective is a way of caring for those who are sick at the moment that they need it. No-one has to worry about health care bills or becoming bankrupt. Of course, the young rich and healthy contribute towards the care of the old poor and sick more than they get back… at first. But one day we all need health care, to receive it without having to dread the bills that might arrive is a sign of a healthy society.
There have been attempts to de-fund Canadian Medicare, and the NHS, from ideologues whose idea of society is more a “sink or swim” image. But Medicare in Canada and the NHS in the UK are extremely popular, so attempts in the UK to destroy the system are usually disguised as attempts to “improve efficiency” or “decrease overheads”. Our systems, however, are already incredibly efficient, with much lower proportions of costs going to administration than happens in systems relying on profit-taking insurance companies.
I am privileged to live in a province (Quebec) where these collective principles are never under attack. None of the political parties are opposed to Medicare being funded out of the public purse (i.e. our collective contributions), even though they may have different approaches, and tend to try to re-organize the system every time there is a change in government, with resultant chaotic periods. I can rest assured that the next time I need acute care I will not end up with a bill (if I have to take an ambulance it will be even cheaper, as that is covered also) because my colleagues, my neighbours, and people in far away towns that I will never meet will pay for my costs.
Thank you Tommy Douglas, thank you Canada, thank you Quebec.
One feature of the redesign of Pubmed from a couple of years ago was that the abstract view now gives a place for the conflict of interest statement, which in this case is very long. The article has 44(!) authors of whom 10 were employees of Illumina, and 4 others had some potential conflicts. The study was apparently designed by a group of people 4 of whom were Illumina employees, and the corresponding author is an Illumina employee. That really makes a mockery of the statement in the small print after the article that “Outside the sponsor employees listed as authors, the sponsor organization had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.” I have no idea what that means when company employees were intimately involved in every single stage of the trial, which was entirely funded by the company.
One might guess that this potentially leads to some bias. Over-interpretation of the value of WGS is likely in a study designed to determine whether WGS leads to clinically important changes in clinical management (COM, as they abbreviate it) if the very existence of the company funding the trial depends on clinicians believing that WGS is often beneficial.
The trial was designed in a fairly innovative way, probably as a way of trying to ensure high consent rates, they performed WGS in all of the babies and families in the trial, but told them the results at different times. Infants and families (at least one family member had to participate, so either duo, trio or more WGS was performed) were randomized to have sequencing results revealed either at 15 days after randomization or at 60 days. Infants less than 120 days of age were eligible if they had a strong clinical suspicion of a monogenic disorder. They were then followed to see if a COM occurred up to 60 days after enrolment for the primary outcome, and then up to 90 days to see what happened to the delayed reveal group.
I have a question about the ethics of this; if WGS revealed a treatable condition (for example a metabolic disorder requiring a diet change) would it be ethical to continue to hide the results for 45 days? I don’t know if there was a possibility in the protocol to do that, but I think that question does point out that it is extremely unlikely that WGS will reveal a condition that needs, and is responsive to, a specific treatment. Most such conditions will be rapidly screened out by the testing we already do in the NICU. In the ethics section of the protocol there is no mention of this as an issue.
One thing I like about this study is that they acknowledge that those other tests are going on, and therefore compare the COM in the 2 groups, but that starts to get a bit murky, what does COM mean in the delayed group? These were critically ill babies in the NICU (mostly, 7% from the PICU and 10% from cardiovascular ICU) so changes in management were happening all the time.
There is no actual calculation of sample size. They decided to enrol 300 infants and then state that this gives an 80% power to detect “a significant increase in the proportion of cases with COM on the assumption of a 20% difference in diagnostic yield between the early and delayed groups”. Which is a strange way to express it, and doesn’t state what difference in COM they are powered to detect. In the published protocol accompanying the article they state that 300 infants gives a 90% power to detect “a significant increase in proportion of cases with” COM.
I think that determining an arbitrary sample size based on resources is not unreasonable in trial design, sample size is always a compromise between the ideal and the possible, but the power should then be calculated appropriately and presented clearly.
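As an illustration of what presenting the power clearly could look like, here is a minimal sketch using the normal approximation for comparing two proportions. The proportions used (10% versus 20%) and the group sizes are my own assumptions for the example; they are not taken from the trial protocol, and the function name is hypothetical.

```python
# Minimal sketch: approximate power of a two-sided test comparing two
# proportions, using the normal approximation (unpooled variance).
# All numeric inputs below are illustrative assumptions, not trial data.
from statistics import NormalDist


def power_two_proportions(p_control: float, p_treated: float,
                          n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power; ignores the negligible probability of
    rejecting in the wrong direction."""
    nd = NormalDist()
    se = (p_control * (1 - p_control) / n_per_group
          + p_treated * (1 - p_treated) / n_per_group) ** 0.5
    z_effect = abs(p_treated - p_control) / se
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: 10% COM in one group, 20% in the other.
    for n in (100, 150, 200, 300):
        pw = power_two_proportions(0.10, 0.20, n)
        print(f"n per group = {n}: approximate power = {pw:.2f}")
```

A table like this, with the assumed proportions stated explicitly, lets the reader see exactly what difference a fixed sample size can and cannot detect.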
Sample size has to be based on a clear primary outcome. I was very surprised that the protocol has no clear definition of the primary outcome. The section of the protocol on determination of test outcomes is all about the WGS results, there is no definition of what is meant by COM, and in particular what is meant by COM in the delayed group. It was apparently the site PI who decided about the presence and classification of a COM, and they were of course, not blinded, but knew what genetic results were available. In the second supplement to the publication there is a table that looks like this
This looks like a reasonable way of defining COM after genetic testing, and I will assume that this also was used to determine COM in the delayed group if they received a genetic diagnosis prior to getting the WGS results.
That being so, there were finally 354 babies randomized; why there were more than 300 is never explained. The primary outcome of the study was that, at day 60, 21% of the early WGS group had a change of management and in the delayed group 10% had a COM, which using chi-square with Yates’ correction gives a p-value of 0.011.
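For readers who want to check this kind of comparison themselves, here is a minimal sketch of a chi-square test with Yates’ continuity correction for a 2×2 table. The counts in the example are hypothetical, chosen only to be close to 21% versus 10% with equal arms; they are not the published per-group numbers, so the resulting p-value should not be expected to match the one quoted above.

```python
# Minimal sketch: chi-square test with Yates' continuity correction
# for a 2x2 table, using only the standard library.
import math


def yates_chi_square(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Table layout: [[a, b], [c, d]], e.g. rows = groups,
    columns = (COM, no COM). Returns (chi-square, p-value) with 1 df."""
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc| (floored at zero).
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # HYPOTHETICAL counts: 37/177 (about 21%) versus 18/177 (about 10%).
    chi2, p = yates_chi_square(37, 177 - 37, 18, 177 - 18)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The result is sensitive to the exact denominators, which is one more reason to report the raw counts alongside the percentages.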
Because I am cynical I wonder if there are more than 300 infants in the study because after 300 it wasn’t quite significant, so they decided to enrol another 50 or so… It is probably libel to write that, and as I am against reliance on p-values in any case, I wouldn’t dream of making that statement. But to go 18% over the planned sample size (354 rather than 300) is unusual.
More important than the p-values is the question of whether there was a real added value to the parents or the infant of getting early WGS results. One thing I really appreciate about this, and a few other publications about the possible value of WGS, is the individual patient data that are available (obviously without any details that could allow identification of the individual). Which means that some of the purported COM can be investigated in detail. For example subject 312, who had a diaphragmatic hernia and a secundum ASD. This patient was in the early group and is listed as having a COM type S after a non-contributory WGS. I think this is pushing it a bit, I presume the idea is that they proved the baby probably did not have a lethal genetic condition, and that was the added value of WGS. Patient 17 had a long QT interval, and the specific gene defect found was in RYR2, which apparently leads to a malignant form of recurrent ventricular tachydysrhythmias. Treatment is probably the same as other prolonged QT patients, but I can certainly go along with the evaluation that this WGS result helped the infant and their family, the standard panel for long QT apparently doesn’t pick this up.
Another infant in the trial, listed as having a COM, had a diagnosis of a hypomyelinisation myopathy, for which there is no specific therapy; did knowing the gene really help the infant? Surely the supportive care the baby needs is related to his hypotonia, feeding difficulties etc. and not to his genetic abnormality.
Another example is an infant with multiple anomalies who had a diagnosis of Cornelia de Lange syndrome after WGS, for whom the COM is noted as both M and S. However, there is no specific treatment for the syndrome, and supportive care depends on the symptomatology, not on which gene is abnormal. I’ve seen several babies with this syndrome, and not knowing which gene was abnormal did not, I think, have any impact on how we looked after them.
One thing that geneticists are all too aware of, but which seems often lacking in publications of the proposed benefits of WGS, is the problem of phenotype-genotype correlation (or the lack of correlation). It is vital to remember that exactly the same genetic defect can cause a range of phenotypic expression. In relatively common disorders, for example CHARGE there are enough cases to have publications about the issue, and there have even been cases of monozygotic twins with CHARGE who have dramatically different phenotypes. As the article I just linked to concludes, “the phenotype cannot be predicted from the genotype”.
Although I am being very critical of this article, I do think that WGS is sometimes indicated, and I do think there may sometimes be important benefits for families. Just knowing the diagnosis can be a benefit, having some idea of prognosis is important, even though it is important to always be aware of the limitations of predictions based on genotype. For genetic counselling, a diagnosis, and sometimes the identification of the gene, may be essential. And, for future possible pre-implantation diagnosis, knowing the precise genetic abnormality is essential. For some parents, however, knowing that their infant’s problems are due to genes inherited from them can cause them to feel guilty.
In a study which Annie Janvier and our genetics group at Sainte Justine did, 14% of parents, when asked about their experience with Chromosomal Micro-Array testing, expressed concern that they may be responsible for their child’s genetic condition, and 12% of parents did actually feel an increase in guilt after the test (although others, also about 10%, felt less guilty).
The benefits of WGS testing, however, very rarely include a specific change in management. In this trial, unlike other studies, there are very few instances of COM involving palliative care or end-of-life care. Two of the cases listed as having a COM involving palliative or end-of-life care had a non-contributory WGS, so I don’t understand how that helped to make the decision. In other cases reported in other studies it seems to me that performing WGS only delayed the decision to redirect care. Often, the clinical details presented for those other cases suggest that redirection of care would have been appropriate without WGS. If parents and care teams continue intensive care in the hope that the WGS is going to produce a result which will dramatically change their infant’s treatment and prognosis, then that is going to be an extremely rare occurrence, and parents should know that. It may be that the most common impact of WGS testing is continuing invasive care and life support while waiting for a result, care which may be painful, stressful, and ultimately prove futile.
Informed consent for WGS testing requires that we have an honest evaluation of the potential benefits and risks. Some of the investigators who have published on the issue seem to have vested interests in proving how beneficial it is to clinical care. In my personal experience, the most common benefit of a positive WGS to the individual baby has been that we can stop searching for other diagnoses, and concentrate on their symptomatic management. The majority of WGS are non-contributory, however, and we just carry on doing what we were before. I think our group’s scepticism about the likelihood of finding a treatable diagnosis means that we have not delayed redirection of care while waiting for a WGS result.
All of our advanced diagnostic testing, from genomes to PET scans to whatever is next, needs to be accompanied by an unbiased evaluation of benefits and risks (and costs). Only then can we adequately inform parents.