Are these data relevant to my practice?

A couple of recently published trials have made me wonder about that question, and about how to assess whether an effect suggested by the results of a trial is relevant to how I practice, and would likely be reproduced if we introduced the intervention in our NICU.

The specific question raised by these two studies is, “When the control group of a trial has an adverse outcome more frequently than I see in my practice, can I expect the same relative effect if I apply the intervention to my patients?”

For example, in this randomized controlled trial of elevated midline head positioning of extremely preterm infants (Kochan M, et al. Elevated midline head positioning of extremely low birth weight infants: effects on cardiopulmonary function and the incidence of periventricular-intraventricular hemorrhage. J Perinatol. 2019;39(1):54-62), the authors examined the impact on IVH of keeping infants of less than 1000g birth weight in a seat at 30 degrees of elevation with the head maintained in a midline position during the first 4 days of life, compared to supine positioning (without elevation) accompanied by changes in head position. The primary outcome of that study was the total frequency of peri- and intra-ventricular hemorrhage, which was actually a little higher with the head elevated, 34/90, compared to 31/90 for the FLAT group. On secondary outcome analysis the distribution of PIVH was different, with more intracerebral (grade 4) bleeds in the FLAT group than in the ELEVated group, 14 vs 6 (the incidence of grades 3 and 4 together, the more commonly used outcome, was 18 vs 11, or 20% vs 12%).

That looks like an interesting difference, if it can be confirmed. Let’s assume for the moment that this is a real impact of the intervention, and not due to other effects, such as random differences (very likely in a small study), lack of blinding (impossible to do in a study like this), adverse impacts of the control intervention (possible, but a quite standard nursing approach is described), unmasked randomization (method of randomization is not adequately described apart from the “use of a randomization table”), baseline imbalance (there are some differences between groups) and other possible sources of bias. If we assume those things for the sake of this argument, we are still left in the controls with a very high incidence of intracerebral hemorrhage, 16%, among an otherwise unselected group of infants with a birth weight below 1000g, average 732 grams, average gestation 25 – 26 weeks, and an even higher incidence of grade 3 and 4 together of 20%.
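To get a feel for how much room random variation has in a trial of this size, here is a minimal sketch in Python. It is not an analysis reported in the paper; it simply takes the severe IVH counts quoted above (11/90 ELEV vs 18/90 FLAT) and computes a risk ratio with an approximate 95% confidence interval on the log scale. The function name and the choice of a simple Wald interval are assumptions made for illustration.

```python
import math

def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio (treatment / control) with an approximate Wald CI on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial proportions
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Grade 3 or 4 hemorrhage: 11/90 with elevated midline positioning, 18/90 flat
rr, lower, upper = risk_ratio_ci(11, 90, 18, 90)
print(f"RR = {rr:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

With these counts the interval runs from roughly 0.3 to 1.2, so the data are compatible with anything from a large reduction to a modest increase in severe IVH.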

In my practice, and in the CNN overall in recent years (annual reports are available on the website), the frequency of severe IVH (3+4) under 1000g is more like 11%. In the methods section of this article the authors note a recent incidence of overall IVH of 40% in their practice, and that “eighteen percent of these infants” had grades 3 and 4. It isn’t clear to me whether that means 18% of the ELBW, or 18% of the 40% who had IVH, which would be 7.2%.

In general, to be confident about the impact of an intervention, it is preferable to see at least one very large trial with narrow confidence intervals including babies from multiple centers. The alternative is to have several trials with different risks in the controls and similar relative risk reductions, which allows us to conclude that something which appears effective in a very high risk population is also effective in a lower risk population. That is the kind of information that we have for inhaled nitric oxide in term infants, for example. Here is the forest plot for the iNO studies; as you can see, the larger studies had relative risks of “death or ECMO” between 0.56 and 0.74, and the confidence intervals all overlap substantially, despite differing control group risks.

The second figure is revised from the term iNO Cochrane review; I have ordered the studies by the control group risk of adverse outcome, and shown the risk difference. As you might expect, the higher the control group risk, the greater, in general, is the risk difference, even though the relative risk (or risk ratio) is similar. That is why you need to be sceptical when you hear statements such as “it has only been shown to work in high-risk groups”, as we used to hear about probiotics. It is much harder to show a “statistically significant” outcome when the baseline risk is lower. But if the relative risk reductions are similar across studies with different baseline risk, that is a good reason to think that the impacts are the same.
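The point about baseline risk can be illustrated with a small sketch. It assumes a fixed risk ratio of 0.65 (an illustrative value inside the 0.56 to 0.74 range quoted above, not a pooled estimate) and uses the usual normal-approximation formula for comparing two proportions, with a two-sided alpha of 0.05 and 80% power. The control risks of 60%, 30% and 10% are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(control_risk, risk_ratio, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect a given risk ratio (normal approximation)."""
    p1 = control_risk
    p2 = control_risk * risk_ratio
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

ASSUMED_RR = 0.65  # illustrative assumption, same relative effect at every baseline

for control_risk in (0.60, 0.30, 0.10):  # hypothetical control group risks
    risk_difference = control_risk * (1 - ASSUMED_RR)
    n = n_per_group(control_risk, ASSUMED_RR)
    print(f"control risk {control_risk:.0%}: risk difference {risk_difference:.1%}, "
          f"about {n} babies per group")
```

The relative effect is identical in all three rows, but the risk difference shrinks and the required sample size grows by roughly an order of magnitude as the control risk falls from 60% to 10%.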

For this positioning intervention, which has significant impacts on care of these babies, the high incidence of PIVH in the controls, reducing to an incidence with the intervention which is more similar to other recent publications, is consistent with an impact of the intervention, but is also consistent with a randomly higher incidence in the controls than usual. This is why the meaning of the ‘18% of these infants’ is important to me. If, over several years, they have had an 18% incidence of severe IVH among the ELBW, and in this study it is similar at 20%, with a substantially lower incidence in the group who had the intervention, I think it is more likely that the reduction is a real impact of the intervention (in contrast to having a usual rate of 7.2% that increased to 20% during the period of the study).

So what do other trials say? Well there aren’t any, I think. The 2017 Cochrane review found 2 trials comparing supine positioning with the head midline, to supine with the head turned to the side (total n=110), and found no differences of note in any outcome.

This is a potentially important study, which certainly needs to be repeated, and it would be nice to study head elevation and midline positioning as separate interventions.

The other trial that has made me think about this issue is the PREMILOC trial. (Baud O, et al. Effect of early low-dose hydrocortisone on survival without bronchopulmonary dysplasia in extremely preterm infants (PREMILOC): a double-blind, placebo-controlled, multicentre, randomised trial. The Lancet. 2016;387(10030):1827-36). In that trial, routine administration of hydrocortisone at low doses over a total of 10 days led to a reduction in mortality before discharge, and a reduction in oxygen use at 36 weeks compared to control in infants of 24 to under 28 weeks.

When trying to assess the relevance for my practice, I note that the study did not include any infants under 24 weeks, or with a birth weight below the 3rd percentile, or with ruptured membranes before 22 weeks, or with 5 minute Apgar under 4, or with congenital anomalies detected prenatally. It was overall a very high quality multicenter trial including a little over 1000 babies.

One of the striking things in the results of that trial is the high mortality before discharge in the 24 and 25 week infants, which is 44% with placebo and 42% with hydrocortisone. In my practice and in the 2017 CNN report, mortality among infants at 24 and 25 weeks admitted to the NICU (data which include those other very high risk groups) is 20%. The incidence of severe IVH is also very high in PREMILOC, at about 25%. Even at 26 and 27 weeks mortality in both groups is higher than the CNN mortality, 7.6% with hydrocortisone and 15.3% with placebo in the PREMILOC trial, compared to 6% in the CNN, again including growth restricted babies.

There are also other signs in the publication that the treatment approaches were substantially different to what we do in our practice: 35% of placebo and 30% of hydrocortisone babies were on inotropes at study entry, and 43% in each group received insulin during the trial.

If I already have a mortality which is substantially lower than the intervention group in a trial such as this, am I likely to see an impact of the intervention in my practice?

The answer, I would say, is definitely a ‘maybe’. If the same pathophysiology for the adverse outcome exists, and if the underlying risk factors are similar (including aspects of obstetric management), then I might expect a similar relative reduction in mortality, and a smaller absolute reduction in mortality. Of course, as the absolute benefit becomes smaller, the potential secondary adverse effects of the intervention become more important; in this specific case the increase in late-onset sepsis may be more important in the calculation of the risk-benefit balance.
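As a worked illustration of “similar relative reduction, smaller absolute reduction”, the sketch below carries the 26 and 27 week mortality figures quoted above (15.3% placebo, 7.6% hydrocortisone) over to a local baseline of 6%. The key assumption, which is exactly the thing in doubt, is that the relative effect transfers unchanged; the function and field names are hypothetical.

```python
def transfer_effect(trial_control_risk, trial_treated_risk, local_baseline_risk):
    """Apply a trial's risk ratio to a different baseline risk (assumes the RR transfers)."""
    risk_ratio = trial_treated_risk / trial_control_risk
    expected_local_risk = local_baseline_risk * risk_ratio
    arr_trial = trial_control_risk - trial_treated_risk
    arr_local = local_baseline_risk - expected_local_risk
    return {
        "risk_ratio": risk_ratio,
        "expected_local_risk": expected_local_risk,
        "arr_trial": arr_trial,
        "arr_local": arr_local,
        "nnt_trial": 1 / arr_trial,  # number needed to treat in the trial population
        "nnt_local": 1 / arr_local,  # number needed to treat at the local baseline
    }

# Mortality at 26-27 weeks: 15.3% placebo, 7.6% hydrocortisone; local (CNN) baseline 6%
result = transfer_effect(0.153, 0.076, 0.06)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the number needed to treat to prevent one death rises from about 13 in the trial to about 33 at the lower baseline, while any adverse effect that does not depend on baseline mortality, such as late-onset sepsis, would weigh the same in both settings.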

I am not saying that we should ignore the results of trials if there are details of management or outcomes that are different to local data; that would just lead to chaos and a complete rejection of evidence-based practice. I am saying that we need to be thoughtful about interpretation and application of trial data. If local approaches and outcomes are dramatically different from a published trial, then we should take into account the toxicity/adverse impacts of the intervention, and whether there are other data from a variety of populations that are consistent. To go back to the probiotics example, the relative reduction in the risk of NEC is similar across studies, suggesting a real impact overall, and although there are differences between the studies, they consistently found no adverse effects of probiotic use. Even if you have locally a low incidence of NEC, it is likely that it will be further reduced by probiotics, without any increased risk, and without much cost.

For routine midline head-elevated positioning, or routine use of hydrocortisone in the 1st day of life, I will wait for more data from different groups with differing baseline risks before changing my practice.

Posted in Neonatal Research | 1 Comment

Pulse Oximetry screening; a bizarre decision in the UK.

Universal pulse oximetry screening for critical congenital heart disease is a simple, cheap addition to universal hearing and metabolic screening with undeniable benefits. Infants with undiagnosed life-threatening congenital heart disease can be detected prior to closure of the ductus arteriosus, and prior to discharge from hospital. Infants who have such critical disease can have intervention, including surgery, with a lower mortality compared to infants who present after discharge, who are often in shock at the time of diagnosis. Many bodies, including the Canadian Paediatric Society, have come out in favour of universal pulse oximetry screening as a result.

Despite all of the data about the benefits of screening, in the UK the National Screening Committee has just recommended against inclusion of pulse oximetry screening in the national program of neonatal screening, which basically means the neonatal physical exam.

The main justification of this decision appears to be their evaluation of evidence about “harms” of screening. Potential harms are listed in the following way:

• A positive result from pulse oximetry will generate some harms, including: parental anxiety, a longer stay in hospital, possible transfer to the neonatal unit, further tests to assess for non-symptomatic conditions.
• For many of these babies the further investigations will be unnecessary and the baby will be identified as healthy. This is a false positive result.

They also seem to doubt the benefits of a true-positive screen:

• For babies with CHD or other non-cardiac condition it is not clear that investigations and identification of these conditions will lead to any better outcome than a diagnosis at the time the baby becomes symptomatic.

I find this decision bizarre given that the preferred method of the committee is therefore physical examination, which has a false negative rate in the UK of over 50%, and a false positive rate of about 50%. Those data are referred to in this report, as the best studies were performed in the UK, but the potential harms of physical examination (a high false positive rate) and the false reassurance of negative physical examinations are not considered. I think the routine physical examination is therefore far more questionable than routine pulse oximetry for the detection of congenital heart disease.

In some studies, in fact, referral for neonatal cardiac ultrasound is not increased by routine pulse oximetry screening. It is just more appropriately targeted. As about 4 to 5% of newborn infants will have a murmur, referral based on the clinical exam leads to many more false positives, especially when the target condition is critical heart disease, than oximetry.

What I find most bizarre, is that between the actual report of the literature review, and the chapter which compares the findings of that review to the criteria for institution of a screening program, there is a huge disconnect. The addition of pulse oximetry screening to the neonatal exam seems to fulfill all the criteria in the list.

Criterion 5

This risk of discharge home without a diagnosis or of severe acidosis has been estimated to be reduced by around 60% with pulse oximetry.
The benefit of newborn screening will be reduced if antenatal detection increases significantly, however current models suggest that newborn screening will remain clinically effective and cost-effective for life-threatening or critical CHDs until antenatal detection rates are above 85-90%…
Non-cardiac conditions leading to low oxygen saturation, such as respiratory or infective illness, may be found in infants with low oxygen saturations (false positive screening results). The benefits and costs of further investigation and early diagnosis of such conditions requires further investigation before these diagnoses can be considered a benefit of screening.

I do agree with that last paragraph; babies with lowish saturations who have, for example, increased pulmonary vascular resistance may do well without intervention, just being in the slowest percentiles for the resolution of their fetal pulmonary vasoconstriction. Some probably do benefit from finding low saturations, such as those with sepsis, but it isn’t clear from the literature exactly how many “false positives” actually have conditions that need, and benefit from, intervention. But I also don’t think you can write all the ‘false positives’ off as an undue risk of screening; at least some of the babies will benefit.

The consultation process included an in-depth examination of the ‘false positives’ from the UK pilot study. You can see that if you click on the ‘Notes from the Pulse Oximetry workshop’ on this page. That review examined what happened to the 239 babies with a positive screen (0.73% of the screens performed); there were 14 babies with congenital heart disease. Being fairly conservative in their opinions, there were another 36 babies who had conditions requiring treatment. 32 babies had discharge delayed despite not having a treatable condition, and the remaining false positives went home as planned anyway.
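The pilot figures above can be laid out as simple arithmetic. The sketch below uses only the numbers quoted in this post; the total number of screens is back-calculated from “239 positives = 0.73%”, so it is approximate, and the category labels are my own shorthand rather than the workshop’s.

```python
positive_screens = 239
positive_rate = 0.0073  # 0.73% of screens performed, as quoted (rounded)

outcomes = {
    "congenital heart disease": 14,
    "other condition requiring treatment": 36,
    "discharge delayed, no treatable condition": 32,
}
# Everything else: false positives who went home as planned
outcomes["went home as planned"] = positive_screens - sum(outcomes.values())

# Approximate number of babies screened, back-calculated from the rounded rate
screened_estimate = positive_screens / positive_rate
print(f"approximately {screened_estimate:,.0f} babies screened")

for label, count in outcomes.items():
    share_of_positives = count / positive_screens
    per_10000_screened = count / screened_estimate * 10_000
    print(f"{label}: {count} ({share_of_positives:.1%} of positives, "
          f"about {per_10000_screened:.1f} per 10,000 screened)")

treatable = outcomes["congenital heart disease"] + outcomes["other condition requiring treatment"]
print(f"positives with a condition needing treatment: {treatable / positive_screens:.1%}")
```

On these numbers about one positive screen in five identified a baby who needed treatment, and the clearest “harm”, a delayed discharge without a treatable condition, affected roughly 10 babies per 10,000 screened.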

Criterion 8

A pathway for clinical investigation after a positive screen result on pulse oximetry has not been clearly established or evaluated in practice. Essential considerations prior to implementation of pulse oximetry in a national screening programme would therefore be to agree a policy for investigation to identify cardiac and non-cardiac causes of low oxygen saturation, including consideration of the resource implications and acceptability to parents.

I am not really in agreement here. The first stage of evaluation of a baby with a confirmed positive screen should be a rapid expert cardiac ultrasound; if there is no structural heart disease, then the second step is not entirely clear, I would agree. Who needs evaluation in what order for what conditions? Do they all need a chest x-ray? Or a blood culture? But eliminating critical congenital heart disease is a clear priority as the first step on the pathway.

Criterion 10

…Early detection of life-threatening CHDs in asymptomatic newborns allows management aimed at preventing cardiovascular collapse before intervention, a particular risk for duct-dependent cardiac defects, and there is some evidence that this can lead to improved short and long-term outcomes after surgery.


Criterion 14

Antenatal ultrasound, newborn clinical examination and pulse oximetry appear acceptable as screening tests. However the acceptability of high false positive rates (which may raise anxiety) and false negative rates (leading to false reassurance) requires further exploration for all screening modalities.

I don’t know the literature about parental anxiety from false positive oximetry screening, but there are several studies about parental impacts of false positive hearing screens, which are very reassuring. They show that false positive screens are not a major burden to families, and that they appreciate the value of screening despite the false positive test of their child. It is important to have good communication with the parents prior to or during the screen, with written and/or verbal information. The experience from the Birmingham study seems to show the same thing. (Powell R, et al. Pulse oximetry screening for congenital heart defects in newborn infants: an evaluation of acceptability to mothers. Archives of disease in childhood Fetal and neonatal edition. 2013;98(1):F59-63).

Most neonatal screening tests are trying to detect relatively rare phenomena, and false positives for almost everything (which may cause stress) are more common than true positives. In this light, the ratio of true positives to false positives for pulse oximetry screening is similar to many of the things we already screen for. For example, for neonatal hypothyroidism screening there are between 2 and 3 times as many false positives as true positives, and if your program starts with T4 screens that ratio is even higher. The benefits of hypothyroidism screening are so evident that we accept a substantial number of false positives, and so do the parents.

As for false negatives, this is also an issue with other screens, such as hearing screening, and parental information is key. But is it truly an issue? If a child has critical congenital heart disease which is missed by a falsely negative oximetry screen, and then presents with a serious deterioration later on, are they worse off than if they had not had a screen at all? Are parents likely to say “he had a normal oximetry screen, so we won’t take him to the emergency room?” I think kids with a false negative screen will likely be in exactly the same position as non-screened kids, therefore without an adverse impact.

Criterion 15

Existing evidence suggests that the benefits outweigh the harms for newborn screening, when the screening test is clinical examination with or without pulse oximetry, and for antenatal screening, when the screening test is antenatal ultrasound.


Criterion 16

The existing evidence strongly suggests that pulse oximetry in conjunction with clinical examination is more cost-effective than clinical examination alone. Further evidence, including estimation of QALYs, continues to support this.

Strongly agreed!

The literature review does have a couple of strange features, including a figure showing a list of congenital heart diseases causing death in the first year of life, which is headed by ventricular septal defect. They don’t give the correct reference for the source from which the data are derived, and I can’t find any data in the references by Wren, which they seem to be referring to, that support that figure. If the quality of the understanding of the problem is reflected by that figure, that is a concern (when did you last see an infant with a VSD die? Before 1 year of age? After excluding complex cardiac malformations which happen to include a VSD?). A fairly recent study from France, for example, put VSD as the least likely to cause death before 1 year of age.

Overall I think the recommendation of this committee doesn’t follow from their review of the literature and their own pilot project. If you are in the UK you can make a comment before this is finalized; go to the University of Birmingham website here, and follow the links.



Gastric acid is good for your bones.

We’ve known for a while now that suppressing gastric acid production in preterm infants increases Necrotising Enterocolitis and also systemic sepsis. Presumably this is because the intestinal microbiome is deranged by allowing the survival of pathogens as they pass through the stomach. There is direct evidence of this effect.

In older children we also know from a large multicenter RCT that PPI (proton pump inhibitor) use increases pulmonary infections in children with poorly controlled asthma, without any benefit.

There are also data about the adverse impacts of PPI use on iron and calcium absorption, mostly from adults, and that this increases the chance of having low bone density and more fractures. They also cause hypomagnesaemia and reduce B12 absorption.

Now observational data from a huge database (Malchodi L, et al. Early Acid Suppression Therapy Exposure and Fracture in Young Children. Pediatrics. 2019:e20182625) show the same association in young children. There were nearly a million children in the database of the US Military HealthCare System, and, amazingly, more than 10% of them had acid suppression therapy in the first year of life. 0.9% of the children had a PPI, 8% an H2 receptor antagonist, and 2% had both.

Exposed infants were more likely to have fractures; the longer they received acid suppressants and the earlier they started, the greater the risk.

Although there is some doubt about the precise mechanism, the data are very consistent, and our very preterm babies, who are already at risk of poor bone mineralisation, should not be placed on acid suppression therapy unless there is reliable evidence that they have a clinical condition caused by gastric acid.

For most of the 10% of the infant population who received the therapy in this study, I can wager that there was no such evidence apart from, perhaps, a tiny minority. For nearly all of the NICU patients who receive such therapies, I can make the same wager, and be quite sure that my stake will be safe.

Even among infants who are thought to have clinical events related to reflux (a thought which is almost always shown to be erroneous when tested objectively) there is little or no reason to block acid production.

The latest in a large number of studies attempting to elucidate the relationship, if any, between reflux and cardio-respiratory events presents recordings of multi-channel impedance, pH-metry and cardiorespiratory monitoring in 52 preterm infants who had symptoms thought to be due to reflux (regurgitation, rumination or irritability) and on-going cardiorespiratory events. (Nobile S, et al. Correlation between cardiorespiratory events and gastro-esophageal reflux in preterm and term infants: Analysis of predisposing factors. Early Hum Dev. 2019;134:14-8.) Babies were on average about 1kg birth weight and between 1 and 2 months old when studied. Five of the 52 babies perhaps had a temporal association between reflux and events, as calculated by the “symptom association probability” of more than 95%, and in only 3 of those did the events follow the reflux, within 2 minutes. Of those events which followed reflux episodes, many were non-acid or very weakly acidic.

So even among the small proportion of events which occur within a couple of minutes of a proven reflux event, which may be a random association or possibly causative, there is no good reason to suppose that blocking gastric acid production will make any difference.

In the absence of multiple intraluminal impedance studies, the only clinical sign which strongly suggests reflux is a vomit stain somewhere. (As we are just past father’s day, I can testify to the diagnosis of reflux from the puke-stained shoulders of my shirts). If the baby is found with regurgitated milk in their bed, then they have had a reflux episode, no other clinical sign is reliable.

In a recent review article from an expert in the field, Sudarshan Jadcherla, the highlight of the conclusion regarding therapy of reflux disease was “No single pharmaceutical or non-pharmacological target exists to treat infant GERD”. Not only that but commonly used therapies are toxic, and may affect your patients’ bones.


What respiratory outcomes are important?

When bronchopulmonary dysplasia was first described by Northway in 1967 he didn’t try to produce a definition, his paper was a description of a small number of preterm survivors of high oxygen and positive pressure ventilation. He noted some years later the long term implications for pulmonary health of the survivors, during adolescence and as young adults, many had hyperinflation, and small airways dysfunction was common. Many had on-going respiratory symptoms.

Since then we have been trying to arrive at a way of predicting long term pulmonary health without waiting for those very prolonged outcomes. Andy Shennan came up with the idea of oxygen requirements at 36 weeks because that was a better predictor of post-discharge pulmonary problems than oxygen need at 28 days, which had become the rather arbitrary definition prior to his publication. Even with the first publication in 1988, the positive predictive value of O2 at 36 weeks for post-discharge respiratory problems was only 63% and the NPV was 90%. With changes in survival and the pattern of lung injury, the usefulness of oxygen at 36 weeks as a predictor of clinically important pulmonary injury has diminished.

A study by the Canadian Neonatal Network/Neonatal Follow-up Network examined the predictive ability of various combinations of oxygen therapy and other respiratory support among infants <29 weeks gestation for predicting what they called “serious respiratory morbidity” which was defined as either (1) 3 or more rehospitalizations after NICU discharge owing to respiratory problems (infectious or noninfectious); (2) having a tracheostomy; (3) using respiratory monitoring or support devices at home such as an apnea monitor or pulse oximeter; and (4) being on home oxygen or continuous positive airway pressure at the time of assessment between 18 and 21 months corrected age. (Isayama T, et al. Revisiting the Definition of Bronchopulmonary Dysplasia: Effect of Changing Panoply of Respiratory Support for Preterm Neonates. JAMA Pediatr. 2017;171(3):271-9.) At least 3 rehospitalizations was chosen because the 95th percentile of the number of readmissions owing to respiratory problems in this cohort was 2. Such serious morbidity occurred in 6% of the approximately 1500 babies included, and was not well predicted by oxygen requirements at 36 weeks; it was better predicted by either oxygen or respiratory support (which included high flow nasal cannulae at more than 1.5 litres per minute or other invasive or non-invasive support), and determining that at 40 weeks post-menstrual age was a better predictor than at 36 weeks. Of the babies in the cohort, 24% had oxygen or respiratory support at 40 weeks, and 16% of them had “serious respiratory morbidity”; of those who did not have O2 or respiratory support at 40 weeks, 2.4% had this severe morbidity.

Although this definition gives an adjusted odds ratio for serious respiratory morbidity of 6.1 (95% CI 3.4-11.0), which is better than a definition at 36 weeks, I am not sure how useful it is. With a PPV of 16%, even alongside a fairly high negative predictive value, a large majority of infants with this definition of BPD do not have ‘serious respiratory morbidity’.
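The three percentages quoted from that study (24% test positive at 40 weeks, 16% morbidity among them, 2.4% among the rest) are enough to rebuild an approximate 2x2 table and see why a respectable odds ratio can coexist with a poor positive predictive value. The sketch below works in proportions of the cohort; the sensitivity, specificity and crude odds ratio it derives are my own back-calculations from rounded figures, not values reported by the authors, and the crude odds ratio is unadjusted, so it need not match the adjusted 6.1.

```python
test_positive = 0.24   # on oxygen or respiratory support at 40 weeks
ppv = 0.16             # serious respiratory morbidity among test positives
risk_if_negative = 0.024  # serious respiratory morbidity among test negatives

# Cells of the 2x2 table, as proportions of the whole cohort
true_pos = test_positive * ppv
false_pos = test_positive * (1 - ppv)
false_neg = (1 - test_positive) * risk_if_negative
true_neg = (1 - test_positive) * (1 - risk_if_negative)

prevalence = true_pos + false_neg
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
npv = true_neg / (true_neg + false_neg)
crude_odds_ratio = (true_pos * true_neg) / (false_pos * false_neg)

print(f"prevalence of SRM: {prevalence:.1%}")
print(f"sensitivity: {sensitivity:.1%}, specificity: {specificity:.1%}")
print(f"PPV: {ppv:.1%}, NPV: {npv:.1%}")
print(f"crude (unadjusted) odds ratio: {crude_odds_ratio:.1f}")
```

The reconstructed prevalence comes out close to the 6% quoted above, which is a useful consistency check. Because the outcome is uncommon, the NPV is high almost by default, while 84% of babies labelled as having BPD by this definition do not go on to have the outcome.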

The bigger problem, maybe, is the definition of SRM (typing the whole phrase takes me too long!). Being home on an apnea monitor doesn’t seem equivalent to me to having 3 hospitalisations, or still being on CPAP at 18 month follow-up. I think determining which definition of BPD (a definition created by physicians) is best at predicting SRM, as defined by physicians, won’t necessarily help us a great deal. We should first determine which respiratory outcomes are most important to families, and then rank the clinically important outcomes (or rather ask parents to rank them); if we did that we could produce some sort of ordinal score of severity of preterm chronic lung disease. Then we could figure out if it is possible to predict those outcomes before they occur. We also should be very careful about introducing items which may be very strongly affected by different practice patterns (such as home apnea or saturation monitoring); in some centers home use of such monitors is exceedingly rare, compared to others.

If I were to guess, I would think that families, during the first year or 2 of life, would be more disturbed by emergency room visits and hospitalisations (especially PICU re-hospitalisations), and next by home oxygen, especially as the infant becomes more active, and next by daily use of respiratory medications. Tracheostomy would probably be way out at the top, but includes very few babies in Canada. But I don’t think my guesses are worth much; we should really ask parents, and other members of society.

Another recent publication from the NICHD network only examined possible variations of BPD definitions at 36 weeks, not at other post-menstrual ages. (Jensen EA, et al. The Diagnosis of Bronchopulmonary Dysplasia in Very Preterm Infants: An Evidence-Based Approach. Am J Respir Crit Care Med. 2019.) They tested 18 different ways of analyzing different levels of respiratory support and FiO2 at 36 weeks for their accuracy in predicting post-discharge serious respiratory morbidity, which was defined as death between 36 weeks and follow up or:

the occurrence of at least one of the following: tracheostomy placed any time prior to follow-up; continued hospitalization for respiratory reasons at or beyond 50 weeks PMA; use of supplemental oxygen, respiratory support, or respiratory monitoring (e.g. pulse oximeter, apnea monitor) at follow-up; or ≥2 re-hospitalizations for respiratory reasons prior to follow-up. Continued hospitalization at 50 weeks PMA is approximately 2 standard deviations above the mean age at discharge for extremely preterm infants included in Neonatal Research Network studies. Two or more re-hospitalizations represents the upper 75th percentile for re-hospitalization number among Network babies.

In this study, the best definition appeared to depend solely on the level of respiratory support at 36 weeks, irrespective of the oxygen needs. Using these diagnostic criteria, infants breathing in room air at 36 weeks PMA did not have BPD. Disease severity among the remaining infants was classified according to support: grade 1, nasal cannula at flow rates ≤2L/min; grade 2, nasal cannula at flow rates >2L/min or non-invasive positive airway pressure; and grade 3, invasive mechanical ventilation.

Using these definitions the percentage of babies with respiratory morbidity increased from 10% among infants without BPD, to 19% with grade 1, 35% with grade 2, and 77% with grade 3.
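Because this grading depends only on the mode of support, it can be written as a very short rule. The sketch below is just a restatement of the grades as summarised in this post, not code from the paper; the support labels (`room_air`, `nasal_cannula`, `niv`, `invasive`) and the function name are hypothetical, and the morbidity figures are the rounded percentages quoted above.

```python
def bpd_grade(support, flow_l_per_min=None):
    """Return 0 (no BPD) to 3 from respiratory support at 36 weeks PMA.

    `support` uses hypothetical labels: 'room_air', 'nasal_cannula',
    'niv' (non-invasive positive airway pressure) or 'invasive'.
    Oxygen concentration is deliberately not an input.
    """
    if support == "room_air":
        return 0
    if support == "nasal_cannula":
        if flow_l_per_min is None:
            raise ValueError("flow rate is needed to grade nasal cannula support")
        return 1 if flow_l_per_min <= 2 else 2
    if support == "niv":
        return 2
    if support == "invasive":
        return 3
    raise ValueError(f"unknown support type: {support!r}")

# Proportion with death or serious respiratory morbidity by grade, as quoted above
MORBIDITY_BY_GRADE = {0: 0.10, 1: 0.19, 2: 0.35, 3: 0.77}

grade = bpd_grade("nasal_cannula", flow_l_per_min=1.0)
print(f"grade {grade}: {MORBIDITY_BY_GRADE[grade]:.0%} had the outcome in the cohort")
```

Written this way, it is easy to see how sensitive the grade is to unit practice: a baby on 2.5 L/min in one center and 2 L/min in another would be graded differently for the same lungs.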

My comments about this study echo those for the CNN paper, the relative importance of these different aspects of SRM is variable, and an overall moderately good capacity to predict which infants may develop SRM is interesting, but would it be enough to be used as an interim outcome measure for clinical trials? Or as a marker of respiratory outcomes for quality control? In addition some of the measures are likely to be heavily influenced by practice patterns, some centers will use higher flow rates and lower oxygen concentrations, some will remove CPAP earlier than others, or extubate with higher ventilatory requirements.

One of the reasons for bringing this up now is a publication of a study aimed at reducing lung injury in the very preterm infant. (Davis JM, et al. The role of recombinant human CC10 in the prevention of chronic pulmonary insufficiency of prematurity. Pediatr Res. 2019.) I think this is the second small preliminary trial of Clara Cell Protein in very preterm babies, the previous one having 22 babies total (7 controls, 7 with low dose and 7 with higher dose Clara Cell Protein, called CC10 in the new study). It is a recombinant version of a protein produced by human Clara cells, which has several potentially beneficial effects, including reduction of lung inflammation. In the previous trial there was a reduction of signs of lung inflammation with a single intratracheal dose; that study was underpowered for clinical outcomes, and didn’t really show any evidence of clinical benefit. The new trial included 44 babies in all, of 24 to 29 weeks and intubated for RDS. They were included in 1 of 2 sub-studies; the first randomized kids to low dose (1.25 mg/kg) or placebo, the second to higher dose (5 mg/kg) or placebo, given as a single intratracheal dose within 4 hours of surfactant, and at less than 24 hours of age.

The primary outcome of the study was survival without what they called CPIP, chronic pulmonary insufficiency of prematurity, which was defined as presence of any one of the following: (1) evidence of respiratory symptoms (e.g., coughing and wheezing) or use of respiratory medications by parental diaries or pulmonary questionnaires (CPIP-SS), (2) one or more re-hospitalizations for respiratory causes (CPIP-RH), (3) administration of respiratory medications (including oxygen) (CPIP-RM), and (4) at least one non-routine medical visit for respiratory causes (CPIP-DV).

These sound like things that are likely to be of some importance to families; the primary outcome was analyzed as a dichotomous, yes/no outcome, and in addition they recorded the number of criteria for CPIP that were satisfied.

These criteria are clearly much less severe than the outcomes of the 2 other studies discussed above, and almost all of the babies had CPIP, with no difference between placebo and CC10 groups, at either dose. Overall only 8 of the 44 babies survived without CPIP. About half of the babies in the trial did not have BPD by the usual criterion of oxygen needs at 36 weeks, and most of the babies who did not have BPD still had CPIP.

What should the outcome of interest be for studies trying to reduce lung injury in the very preterm infant? Is an interim outcome such as BPD, by any of these definitions, of sufficient predictive capacity to use it in place of clinically important pulmonary dysfunction in the first years of life? I think we have to be very careful that, even if we could come up with criteria that predict chronic pulmonary symptoms among the general preterm population, an intervention that acutely reduces the chances of satisfying those criteria will not necessarily improve long term pulmonary function. Post-natal steroids, for example, by reducing inflammation, reduce the number of babies who satisfy definitions of BPD which are based on oxygen needs, but there is no evidence that they improve long term pulmonary function.

In order to answer the question of the usefulness of “BPD” as a surrogate outcome for improvements in pulmonary health in clinical trials, Anna Maria Hibbs and her colleagues have performed a systematic review of large trials of BPD prevention that also published data on long term respiratory health outcomes. (Corwin BK, et al. Bronchopulmonary dysplasia appropriateness as a surrogate marker for long-term pulmonary outcomes: A Systematic review. J Neonatal Perinatal Med. 2018;11(2):121-30). The review found 5 trials: the DINO trial from Melbourne of DHA for preventing developmental impairment, which also looked at lung injury, the NO-CLD trial, the early iNO trial which disappointingly has no obvious acronym, SUPPORT, and the SOD the lungs trial (that is actually my own personal acronym for the trial).

All the studies reported BPD, using a variety of definitions, and all had some measures of longer term pulmonary health: all reported healthcare utilization (hospitalizations, and visits to the doctor or the emergency room); 3 reported respiratory illness (e.g. asthma); 4 reported respiratory medication use (e.g. bronchodilators, steroids, diuretics, oxygen); one study reported 1 year mortality.

A consistent relationship between BPD and subsequent markers of respiratory morbidity was not seen in the trials studied. Only the NO-CLD trial found a significant decrease in rates of BPD. This study also saw significant decreases in the use of respiratory medications including oxygen, but in no other outcome measures. The SOD study also found a significant decrease in respiratory illness requiring asthma medications, but failed to find a significant decrease in BPD. Similarly, the SUPPORT study failed to find a significant decrease in BPD, but did see significant decreases in measures of healthcare utilization and respiratory illness.

Not only therefore is BPD, using any definition, a very limited way of describing severity of early pulmonary injury, it is clearly of very low value as a surrogate for clinical trials aimed at improving pulmonary health.

This brings into even sharper relief the concerns about using BPD as part of the combined outcome of BPD or death. Mortality matters, BPD, not so much.

Chronic pulmonary dysfunction and respiratory illness in early life are important adverse outcomes for families; finding better ways to describe, predict, and prevent them is a priority.


High-flow in non-tertiary neonatal units: Hunting for answers. #EBNEO

I think Brett Manley is going for the record as the person with the highest proportion of his publications in the FPNEJM, he now has 3, with 2 of them as first author. This is the HUNTER trial where babies in level 2b NICUs, as we could call them, i.e. nurseries with access to CPAP, but not prolonged invasive ventilation, were randomized to receive either CPAP or high-flow nasal therapy. (Manley BJ, et al. Nasal High-Flow Therapy for Newborn Infants in Special Care Nurseries. The New England journal of medicine. 2019;380(21):2031-40) Babies are not usually kept in such units if they are less than 32 weeks or less than 1200 g, so to be eligible for this trial they had to be at least 31 weeks gestation, >1200 g birth weight, and to need non-invasive respiratory support according to the attending pediatrician. This was performed as a non-inferiority trial, and was designed to be able to detect an increase in therapeutic failure from 17% with CPAP to 27% with high flow.

Babies in the high flow group were placed on 6 litres per minute of a heated humidified gas mixture via the Fisher & Paykel Optiflow device, which could be increased to a maximum of 8 lpm. CPAP was delivered using a bubble system and either a mask or prongs, at 6 cmH2O, which could be increased to a max of 8 cmH2O. If a high-flow baby failed they could be treated with CPAP at 8 cmH2O; if a CPAP baby failed they were out of the trial and a discussion with the regional NICU was expected.

There were over 750 babies randomized, and the primary outcome was treatment failure in the first 72 hours, which was defined as follows: they got to maximal support and needed more than 40% oxygen for more than 1 hour (to stay at 91 to 96% saturation), or had a respiratory acidosis with a pH of less than 7.2, or had severe apnea.

Treatment failure occurred in 20.5% of the high flow babies and 10.2% of the CPAP group. Many of those who failed high flow were transferred to CPAP and about half of them stabilized, so in the end just over 5% of each group were intubated within the 1st 72 h, and around 6% in total in each group. There were twice as many pneumothoraces needing intervention in the CPAP group (4.8% vs 2.4%, although the 95% confidence interval is compatible with no difference). Slightly more babies in the high-flow group were transferred to tertiary NICUs, 13% vs 11%.

These are really important, clinically relevant data for level 2 nurseries. The eventual clinical outcomes were quite similar in the two groups, so I think it would be acceptable to continue to use high-flow in such nurseries as long as you had back up CPAP readily available. There were no clear advantages of high-flow demonstrated in this study, but other studies have found that parents prefer high-flow, and that it appears to be more comfortable for the babies.

The only quarrel I have with this study is how tortuous the language becomes with non-inferiority studies. We are told that high-flow was “not non-inferior” to CPAP. Why not just “was inferior”? There were substantially more treatment failures, which passed the non-inferiority margin, hence the intervention was inferior in terms of the primary outcome.
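To make the arithmetic behind that quarrel explicit, here is a minimal sketch of how a non-inferiority result is usually read from the confidence interval of the risk difference. The failure rates (20.5% and 10.2%) are the ones quoted above; everything else is an assumption for illustration: the group sizes of 375 each (the trial is only described here as having over 750 babies), the 10 percentage point margin (inferred from the design of 17% vs 27%), the function names, and the simple Wald interval, which is not necessarily the analysis the trial used.

```python
import math

def risk_difference_ci(p_new, n_new, p_ref, n_ref, z=1.96):
    """Risk difference (new minus reference) with an approximate 95% Wald interval."""
    diff = p_new - p_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    return diff, diff - z * se, diff + z * se

def interpret(lower, upper, margin):
    """Classify a non-inferiority comparison in which a higher risk is worse."""
    if upper < margin:
        return "non-inferior"
    if lower > margin:
        return "inferior, by more than the margin"
    if lower > 0:
        return "inferior, non-inferiority not shown"
    return "inconclusive"

# Failure rates from the post; group sizes are ASSUMED (about 750 babies split evenly).
diff, lower, upper = risk_difference_ci(0.205, 375, 0.102, 375)
verdict = interpret(lower, upper, margin=0.10)
print(f"difference {diff:.3f}, 95% CI {lower:.3f} to {upper:.3f}: {verdict}")
```

Under those assumptions the whole interval sits above zero, so the excess of treatment failures with high-flow is more than a failure to demonstrate non-inferiority, which is the point being made above.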


Research Outcomes in Neonatology : must do better.

When planning a research project with neonatal patients the first question should be, what am I investigating? The PICO outline, standing for Patients, Intervention, Controls (or comparison) and Outcome, is a standardized way of asking the simple question: if I do X to babies with certain characteristics, what happens compared to not doing X?

‘What happens’ is called the outcome or outcomes of interest. You would think that asking that question in a simple way is very straightforward. Measuring ‘what happens’ in ways that are standardized and reproducible and easily understandable to the potential users of your research should be the goal. How are we doing as a neonatal community?

Regular readers of this blog will know that it is not that rare for a published article to be quite unclear about what exactly the primary outcome was, or how it was defined or measured.

Surely large multicentre trials do better? Not always! Webbe JWH, et al. Inconsistent outcome reporting in large neonatal trials: a systematic review. Archives of disease in childhood Fetal and neonatal edition. 2019.

In this systematic review the authors studied published RCTs in NICUs that had at least 100 babies per group over a recent 5 year period, 2012-2017. They found 76 such trials with over 43,000 total participants.

Across 76 trials 216 distinct outcomes were reported. The most commonly reported outcome was survival; reported in 67 trials (88%). The next most commonly reported outcomes were necrotising enterocolitis (53 trials (70%)); bronchopulmonary dysplasia (50 trials (66%)); sepsis (48 trials (63%)) and retinopathy of prematurity (43 trials (57%)). In relation to neurodevelopmental outcomes, visual impairment or blindness were only reported in 21 trials (28%) and 42 trials (55%) did not report any developmental outcomes. Even among the 10 trials involving the largest numbers of infants, major neonatal conditions were not universally reported. Of the 216 outcomes reported, 92 were only reported in a single trial.

Where trials reported the same outcomes, for example, retinopathy of prematurity, these may not be comparable if different outcome measures are used; for example, bilateral retinopathy of prematurity stage ≥3 and retinopathy of prematurity needing surgery. Sepsis was recorded using 43 different outcome measures; bronchopulmonary dysplasia 16 outcome measures and necrotising enterocolitis 13 outcome measures. In relation to the 216 outcomes, 889 different outcomes measures were reported; of these, 639 were only reported in a single trial.

I have written many times about composite outcomes, and the problems with their use; this review shows their prevalence:

Composite outcomes were used in 41 trials (54%); the most commonly reported composite being a composite of death and bronchopulmonary dysplasia (13 trials (17%)). This composite was reported using six different measures and at two time points. There was heterogeneity among composites: 33 different composite outcomes were reported using 69 incomparable composite outcome measures, with 58 of these outcome measures only reported in a single study.

I have also published a paper about parent and family involvement in neonatal research; this article clarifies just how far we have to go:

Among 76 included trials and after reviewing published papers and protocols where available, we found that no trial reported patient or parent involvement in outcome selection.

Large clinical neonatal trials are usually clear about what the primary outcome was, and, most often, how it was measured, but if the outcomes are not comparable between trials it becomes very difficult to summarize the available literature. Even though certain trials may indeed need to study outcomes that are not universally collected (feeding trials, as one possible example, may need to report time to oral feeding competence), it would be very helpful if all trials reported a core outcome set, with universally applied definitions and timing. That way different interventions can be summarized and compared, for their impact on the same outcomes, which should be outcomes which are important to families, not just to the medical team.

This study came from the COIN group (Core Outcomes in Neonatology) who are working to do just that. Hopefully the items that we measure in the future include measures of impacts of our care that are standardized and which make a difference in the lives of the babies we care for.

Other groups are also working on the same issue (ICHOM, for example, the International Consortium for Health Outcomes Measurement). I hope they all get together so that we do not have more than one Core Outcomes Set!


Just do it! Who should go home on oxygen?

A new guideline from the ATS has been published, which gives recommendations for home oxygen therapy for children, one large group of which is, of course, babies with bronchopulmonary dysplasia. Hayes D, Jr., et al. Home Oxygen Therapy for Children. An Official American Thoracic Society Clinical Practice Guideline. Am J Respir Crit Care Med. 2019;199(3):e5-e23.

The overall recommendation is that children with chronic hypoxemia should receive home oxygen therapy.

It is hard to argue with a recommendation like that! The big question of course is, how to define chronic hypoxemia? It should, I would think, be defined by a level of oxygenation which has been shown to have an adverse clinical impact on the child, on their growth or development or chronic illness.

You can only really determine that by performing controlled trials where children with various levels of oxygenation receive either home oxygen or no O2. You would have to start with relatively higher levels of saturation to ensure safety, and then test lower levels until you find a clinical impact.

Of course, we don’t have good sequential data like that, but what do we have?

The guideline authors reviewed the literature and found 11 observational studies comparing outcomes of BPD kids who had home oxygen therapy (which they abbreviate as HOT). Ten of them were eliminated as being biased, so the only evidence they have to support their recommendations is from a 1996 study. (Moyer-Mileur LJ, et al. Eliminating sleep-associated hypoxemia improves growth in infants with bronchopulmonary dysplasia. Pediatrics. 1996;98(4 Pt 1):779-83).

In that study infants who were already home on oxygen were admitted overnight to a clinical research unit and taken off oxygen for at least 8 hours. They were then divided into 3 groups depending on their saturations.

Group 1 consisted of 11 infants whose Sao2 decreased to between 88% and 91% for more than 1 hour of sleep. Group 2 consisted of 34 infants whose sustained minimum Sao2 was 92% or greater for the entire recording. Group 3 consisted of 18 infants whose sustained minimum Sao2 was less than 88% for more than 1 hour of sleep. [HOT] for infants in groups 1 and 2 was stopped, but it was continued for infants in group 3. After a subsequent prolonged sleep Sao2 evaluation, all but 6 infants in group 3 graduated to groups 1 and 2, and [HOT] was stopped.

Which sounds like a reasonable design for a prospective observational study, my main question being what was done about intermittent desaturations? Certainly at discharge many of our babies have baseline saturations in room air above 92%, but have multiple brief, or not so brief, desaturations to below 88%. This report only refers to sustained desaturations which make up 12.5% of the 8 hour sleep study.

The age at which the trial of stopping HOT was done varied between 4 months average in group 1, 6 months in group 2, and 9 months in the infants who were initially in group 3. The major outcome reported for the infants is their growth; although hospitalisations were recorded, they don’t appear to be reported.

The study showed that group 1 infants, those with sustained saturations for more than an hour below 92% when O2 was stopped during an overnight study, had a fairly dramatic falling off of their growth after HOT was discontinued by the time of their next clinical visit, 6 to 8 weeks later. Their energy intake was the same, but their weight gain dropped from 16 g/kg/d to 4 g/kg/d. There were a total of 14 babies in that group.

So I guess a reasonable recommendation, based only on this one study including 14 relevant babies, and made with not much confidence, could be that after a few months of HOT, oxygen could be stopped if a prolonged sleep saturation recording did not show a sustained period of more than 1 hour of desaturation to below 92%, and that infants with lower sustained saturations (to between 88 and 92%) risk a short term period of reduced growth if HOT is stopped. It should also state that there is no information about how prolonged that growth reduction might be, or about any other health effects.


This is the recommendation:

For patients with BPD complicated by chronic hypoxemia, we recommend that HOT be prescribed (strong recommendation, very low-quality evidence). Chronic hypoxemia is defined as either 1) greater than or equal to 5% of recording time spent with an SpO2 less than or equal to 93% if measurements are obtained by continuous recording or 2) at least three separate findings of an SpO2 less than or equal to 93% if measurements are obtained intermittently.

The ATS guideline also gives an instructive interpretation of what they mean by a strong recommendation:

Table 1. Meanings of the Strength of the Recommendations

A Strong Recommendation Conveys . . . | A Conditional Recommendation Conveys . . .
It is the right course of action for >95% of patients. | It is the right course of action for >50% of patients but may not be right for a sizable minority.
“Just do it. Don’t waste your time thinking about it, just do it.” | “Slow down, think about it, discuss it with the patient.”
You would be willing to tell a colleague that he or she did the wrong thing if he or she did not follow the recommendation. | You would NOT be willing to tell a colleague that he or she did the wrong thing if he or she did not follow the recommendation because there is clinical equipoise.
The recommended course of action may make a good performance metric. | The recommended course of action would NOT make a good performance metric.

I find that table incredible!!

Based on very near to no data whatsoever, the ATS tell you that you should be like sheep and not even think about their recommendation (“Just do it”: was this sponsored by NIKE?), and you should criticise your colleagues if they did something different. You are not supposed to even discuss starting O2 with the parents!! (Shared decision-making anyone?)

The recommendations are nothing like the only study that they use to support their recommendations! 5% of recording time at <94% is the recommendation based on a study which showed an impact (of questionable long term health significance) for children with a sustained saturation <92% for at least 1 hour, i.e. 12.5% of recording time.
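The mismatch between the two thresholds is easy to check with a few lines of arithmetic. This is only a sketch: it assumes an 8 hour recording (the study describes at least 8 hours off oxygen), and the function names are mine, invented for illustration.

```python
def percent_of_recording(minutes, recording_hours):
    """Share of a recording, in percent, taken up by a given number of minutes."""
    return 100 * minutes / (recording_hours * 60)

def minutes_of_recording(percent, recording_hours):
    """Minutes of a recording that correspond to a given percentage of it."""
    return percent / 100 * recording_hours * 60

RECORDING_HOURS = 8  # ASSUMED duration of the overnight study

# Study criterion: more than 1 hour of sustained desaturation.
study_percent = percent_of_recording(60, RECORDING_HOURS)
# ATS criterion: 5% of recording time at or below the threshold.
ats_minutes = minutes_of_recording(5, RECORDING_HOURS)

print(f"Study: 60 min is {study_percent:.1f}% of the recording")
print(f"ATS: 5% of the recording is {ats_minutes:.0f} min")
```

On an 8 hour recording, 5% is about 24 minutes, and as the guideline is worded that time is a total rather than one sustained episode, while the study infants had more than 60 sustained minutes at a lower saturation.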

This recommendation, if followed, will mandate thoughtless imposition of home oxygen therapy on tens of thousands of babies.

In addition (and of course I always find some additions) this is based on a single >20 year old observational study. By relying on this, if they followed their statement development guidelines, they imply that there are no randomized trials. But that is not the case, of course: Askie LM, et al. Oxygen-saturation targets and outcomes in extremely preterm infants. The New England journal of medicine. 2003;349(10):959-67.

In the first BOOST trial, 350 preterm babies with evolving lung disease born at less than 30 weeks were randomized to higher or lower saturations if they still needed oxygen at 32 weeks. Saturations of either 95-98% or 91-94% were targeted, until the infant no longer needed oxygen. So this impacted on how many children went home on oxygen, what saturations they targeted at home, and how long they stayed on oxygen after discharge. 30% of the high sat target group went home on oxygen, compared to 17% of the low target group.

That study showed no health benefits of higher saturation targets, and no impact on growth. There was a hint of worse pulmonary outcomes with higher saturations. Of course, some babies in each group had acceptable saturations in no oxygen prior to discharge, but no clinical benefits were found in the overall study.

In the STOP-ROP trial 650 very preterm babies with threshold retinopathy who needed oxygen were randomized to higher (96-99%) and lower (89-94%) saturation targets, which were continued until ROP was regressing, which was almost always by 3 months corrected age. By that time there were substantial numbers of babies still on oxygen, 47% vs 37%, mostly, but not all, at home (13% vs 7% remained hospitalized). During those 3 months there was no impact on growth.

Again in contrast to what is stated by the ATS guidelines, which claim that there are no adverse effects of home oxygen, both of these trials noted an increase in some adverse pulmonary outcomes. In BOOST there were more late pulmonary deaths, and in STOP-ROP there were more infants with “pneumonia/CLD events” in the group who had more prolonged oxygen and were more likely to go home on oxygen.

Proof of safety of the oxygen therapy thresholds mandated by the ATS (mandate is not too strong a word given their definition of what they mean by strong recommendation) does not exist. Lower thresholds might be safer, and would lead to fewer children needing home oxygen.

It is incoherent to have a strong recommendation which is supported by extremely low quality data, or indeed, as in this case, no data at all. A recommendation to discuss the possible pros and cons with parents, a clear outline of what those pros and cons might be, a more careful analysis of potential and proven impacts, positive and negative, and an outline of what data are needed to be able to provide evidence-based guidelines, and a commitment to fund such trials would serve our patients much better than this.

Of course some babies need and benefit from home oxygen, but providing HOT to many babies who might well not need it, and could potentially suffer adverse effects, is not the way to ensure that those who truly benefit will actually receive it.
