Do blood transfusions treat apnoea of prematurity?

There has for a long time been a thought that anaemic babies with many apnoeas could benefit from a blood transfusion, which would decrease their apnoeic spells. This idea has never been directly tested by an RCT, that is, a trial in which infants with apnoea were randomized to receive a transfusion or control, and the response accurately determined. I actually started such a trial when I was in San Diego, but only enrolled a tiny number of babies before leaving to return to Canada; the fellow who was involved finished at about the same time as me, and the project was sadly terminated. I admit that it is not ethical to randomize infants to a trial, with all the stress imposed, and the goodwill of parents involved, and then not have a mechanism to complete the trial, and I apologize to the parents and families involved.

The evidence we do have, therefore, comes from observational studies of various kinds and from secondary analyses of RCTs of blood transfusions at differing thresholds, in which the impacts on apnoea or on intermittent hypoxia (IH) have been recorded. Just to remind my readers, most IH is caused by apnoea, and prolonged recordings of saturation are much easier than the prolonged multichannel recordings required to objectively quantify apnoea. I will use the terms interchangeably here, partly because the harms of multiple recurrent apnoea spells are probably due to frequent desaturation leading to hypoxic injury, and resaturation, with consequent oxidative injury.

Interestingly, the 2 types of evidence give contradictory results. Observational studies tend to show a reduction in apnoea, or IH, after a transfusion, whereas RCTs don’t show a difference in apnoea, or IH, by randomized group; those with a haemoglobin maintained at a higher level do not have less apnoea/IH than those allowed to have lower Hgb.

This is an object lesson in the hazards of observational studies, especially for a condition which is very variable in severity (between patients and between days) and which eventually improves.

I actually use this example when I teach statistics and research design to fellows!

The figure below shows two columns of randomly distributed numbers which I generated, each with a mean of 0 and an SD of 1, with the pairs connected by row number. The numbers on the vertical axis are the number of standard deviations above and below the mean, so “0” represents the overall average frequency of IH per hour in the sample (it could be 4, for example). This figure could be the number of IH per hour on day 7 and on day 14 of life of 200 preterm babies.

Suppose one decides to give a treatment only to those babies who have more IH than the mean, which means selecting the babies in the top half of the distribution, and then measures IH frequency again after the treatment. Even if the treatment had no effect whatsoever on IH frequency, the result you would get is shown below.

This is regression to the mean. There are very many examples of this, as a potential explanation of positive results in observational studies. Let me give you one example, of an uncontrolled study of an apnoea treatment (Marlier L, et al. Olfactory Stimulation Prevents Apnea in Premature Newborns. Pediatrics. 2005;115(1):83–8). In this study, babies having recurrent apnoea were exposed to a nice smell, lavender wafted through their incubator, and the investigators evaluated apnoea frequency afterwards. There were fewer apnoeas when the babies had a pleasant odour in the incubator! Without a randomized control group such data are worthless. Any variable condition will tend to get better if you start treating it when it is worse than average, whether you use something effective or give a placebo.

If you tend to give transfusions to anaemic babies who are having more apnoeas, then you are immediately creating exactly this situation. Even if transfusions have no impact on apnoea, an observational study will show a significant reduction in apnoea frequency following transfusion.
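For readers who want to try this themselves, here is a minimal sketch of the same thought experiment in Python. It is not the spreadsheet I used for the figure; the function name, the seed and the use of the sample mean as the treatment cutoff are all assumptions made for illustration. Each baby gets two independent random values (before and after), so the "treatment" has no effect by construction.

```python
import random
import statistics


def simulate_no_effect_treatment(n_babies=200, seed=1):
    """Simulate IH frequency (in SDs from the mean) before and after a useless treatment.

    Hypothetical helper: "before" and "after" are independent N(0, 1) draws,
    so any change in the treated group is regression to the mean.
    """
    rng = random.Random(seed)
    before = [rng.gauss(0, 1) for _ in range(n_babies)]
    after = [rng.gauss(0, 1) for _ in range(n_babies)]

    # Treat only the babies who are worse than average at baseline.
    cutoff = statistics.mean(before)
    treated = [(b, a) for b, a in zip(before, after) if b > cutoff]

    mean_before = statistics.mean([b for b, _ in treated])
    mean_after = statistics.mean([a for _, a in treated])
    return len(treated), mean_before, mean_after


if __name__ == "__main__":
    n, mean_before, mean_after = simulate_no_effect_treatment()
    print(f"treated babies: {n}")
    print(f"mean IH before treatment (SDs): {mean_before:.2f}")
    print(f"mean IH after treatment (SDs):  {mean_after:.2f}")
```

With any reasonably large sample, the treated babies start well above the mean and end up close to it, despite nothing having been done to them.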

Also striking is what happens if you ask the question, “do babies who have the most apnoeas have the greatest benefit from transfusion?” If you plot the initial apnoea frequency against the improvement after the treatment, using the randomly generated numbers in the figure above, you get a correlation coefficient of 0.56 and a p-value of <0.0001.

Remember, these are from entirely random numbers, after an intervention with no real impact whatsoever on apnoea frequency! All you have to do is select the most severely affected babies to treat, and they will seem to improve the most.
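The same point can be checked with a short sketch, again under assumptions of my own (hypothetical function names, a fixed seed, and "improvement" defined as baseline minus follow-up). For two independent standard normal values, the expected correlation between baseline and improvement among babies in the top half is roughly 0.5, in the same range as the 0.56 I found with my own random numbers.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient, written out to avoid dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def baseline_vs_improvement(n_babies=200, seed=1):
    """Correlate baseline severity with apparent improvement after a useless treatment."""
    rng = random.Random(seed)
    pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_babies)]

    # Only babies above the population mean (0) are "treated".
    treated = [(before, after) for before, after in pairs if before > 0]
    baseline = [before for before, _ in treated]
    improvement = [before - after for before, after in treated]
    return pearson_r(baseline, improvement)


if __name__ == "__main__":
    print(f"r = {baseline_vs_improvement():.2f}")
```

The correlation arises purely because the baseline value appears on both axes: a high starting value guarantees a large "improvement" when the second value is unrelated to the first.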

We can see this, as a potential explanation of results which claim to show a benefit of transfusion on IH, in several studies, such as this one (Kovatis KZ, et al. Effect of Blood Transfusions on Intermittent Hypoxic Episodes in a Prospective Study of Very Low Birth Weight Infants. J Pediatr. 2020;222:65–70). In that study, they examined IH before and after blood transfusion, as well as before and after transfusion of other blood products.

IH decreased from 5.3/h to 3.6/h after blood transfusion, and was unchanged at 4.6/h before and after the small number of transfusions of other blood products.

This is exactly the result you would expect if caregivers were more likely to transfuse anaemic preterm infants when they were having a greater than average number of IH, but give plasma, platelet, or other transfusions for reasons that have nothing to do with apnoea; even if there were absolutely no effect of transfusions on apnoea incidence or IH.

I am not picking this study out as a particularly egregious example; in fact, it is a better study than most, as it at least had the non-RBC controls. You would also see something similar if there was a real impact of RBC transfusions on IH.

If we look at the RCTs of blood transfusions in the preterm, which have compared different transfusion thresholds, there is no apparent impact on IH or on apnoea. This includes the most recent publication, which is a secondary analysis of an RCT (Franz AR, et al. Effects of liberal versus restrictive transfusion strategies on intermittent hypoxaemia in extremely low birthweight infants: secondary analyses of the ETTNO randomised controlled trial. Arch Dis Child Fetal Neonatal Ed. 2025). The ETTNO trial was a randomized comparison of differing transfusion thresholds in infants <1000g birthweight that I have already discussed, which showed no impact on the primary outcome of “survival without NDI”. There was no impact on survival or on developmental progress, as measured by the Bayley version 2 MDI, results of which were identical between groups. The transfusion thresholds were a bit complicated in ETTNO: the high threshold group had 3 different thresholds according to postnatal age in stable babies, and 3 higher thresholds in “critically ill” babies. The low transfusion group had a different matrix of 6 transfusion thresholds according to postnatal age and being stable or not. Of note, one of the indications for being considered “critically ill” was 6 or more apnoeas/day requiring nursing intervention, or IH to <60% saturation >4 times per day.

This new secondary analysis compared IH frequency and severity according to randomized group. About 50% of the babies had good enough recordings of saturation for analysis, a subgroup who seemed representative of the whole sample.

There were no differences in any index of IH between groups.

In the PINT study, one of our secondary outcomes was how many babies in the low vs high transfusion threshold groups had “apnea requiring treatment”, which was 55% in the lower Hgb group and 60% in the higher Hgb group; in other words, the small difference was in the opposite direction and not in favour of an impact of Hgb on apnoea frequency.

I don’t think there are any secondary outcome data on apnoea or IH from the TOP trial; if anyone knows of any, please let me know. A much older trial (1984) of only 56 preterm infants reported apnoea frequency among infants randomized to either have their Hgb kept above 100 g/l, or to be transfused only for clinical indications, including surgery, but also including severe apnoea not responding to theophylline as a clinical indication. There were no differences in recorded apnoea frequency despite differing Hgb concentrations. The only controlled data which show a possible impact on apnoea are from the Iowa trial of 100 preterm infants <1300 g (Bell EF, et al. Randomized trial of liberal versus restrictive guidelines for red blood cell transfusion in preterm infants. Pediatrics. 2005;115(6):1685–91), which had more apnoea spells in the lower threshold group, and more apnoea requiring nursing intervention, 0.4/day compared to 0.2/day. In that trial, infants were allowed to receive a transfusion, even if they were in the low threshold group, if they had multiple apnoeas, which is a possible confounder in analysing the meaning of that result.

I looked for systematic reviews of transfusion in the preterm to see if any had analysed the impact on apnoea, and was unable to find any other reliable data, but read my next post to see what I did find.

My take home message is that there are few reliable data to show that apnoea or IH is more frequent in infants with lower Hgb, nor any reliable evidence that RBC transfusion reduces apnoea or IH in the preterm.

If you transfuse babies who are having more apnoea, or more IH, than average they will usually have a reduction in their episodes.

But :

If you don’t transfuse babies who are having more apnoea, or more IH, than average they will usually have a reduction in their episodes.

The only way to resolve the issue would be to do a trial similar to the one that I started years ago. Enrol anaemic infants with apnoea or IH, randomize them to transfusion or control and obtain objective recordings of their responses. I have a strong feeling, based on my evaluation of the currently available data, that both groups will show a reduction in apnoea/IH, and that there would be little or no difference between the two groups.

Posted in Neonatal Research

Non-invasive high-frequency oscillation; worth the hassle?

Non-invasive HFOV can be delivered by a variety of different equipment and interfaces. The high flows and upper airway turbulence probably have an impact on gas exchange; it appears that the effective dead space of the oro-nasopharynx is washed out (De Luca D, Dell’Orto V. Non-invasive high-frequency oscillatory ventilation in neonates: review of physiology, biology and clinical data. Arch Dis Child Fetal Neonatal Ed. 2016;101(6):F565–F70), but how much transmission of the oscillatory pressures to the lung occurs is uncertain. Transmission does occur under some circumstances, however, as several groups have shown. In this cross-over study, for example (Gaertner VD, et al. Transmission of Oscillatory Volumes into the Preterm Lung during Noninvasive High-Frequency Ventilation. Am J Respir Crit Care Med. 2021;203(8):998–1005), nHFOV was applied starting with a pressure amplitude of 20 cmH2O, then adjusted to give either a PCO2 of 40-60, or, if the baby was already normocapnic, adjusted to the lowest pressure that gave visible chest wall oscillations. The article doesn’t state what the eventual pressure amplitudes were. Nevertheless, using transthoracic impedance tomography, they were able to detect chest wall movements which were about 1/5 the amplitude of the babies’ tidal volume movements. They also showed that when the oscillations were switched on, there was a decrease in the amplitude of the infant’s own tidal respiratory movements, which I presume was a reflex reduction, secondary to an increase in CO2 clearance by the HFO, which would decrease endogenous respiratory drive.

The figure shows the amplitudes of impedance changes over a period of CPAP compared to nHFOV in the upper panels, and the lower coloured pictures show that the oscillations were preferentially transmitted to the right lung, especially the central and “upper” regions (the babies were in ventral positioning).

It appears likely, then, that there are some pressure oscillations in the distal airways during nHFOV, which might lead to some gas exchange. It is possible, therefore, that respiratory support with nHFOV may have some advantage over CPAP. Any possible advantage over nIPPV (with NAVA or fixed pressures and/or non-synchronised) is not so clear, and requires some clinical trials to confirm.

There have been a few recent trials of nHFOV, both as a mode of routine support post-extubation, and for primary respiratory support after birth in preterm infants. There are also now several systematic reviews and meta-analyses; there seem to be some SR/MA factories churning these things out, some of which appear to have been written with AI, and sometimes include non-existent references. One needs to be really careful these days, both as a reader and as a peer-reviewer. I now check much more carefully than in the past when I am peer-reviewing an article (both primary research and review articles) to be sure that the key references actually exist. Unfortunately I don’t read Chinese, and many references only seem to appear in Chinese databases. In this SR of post-extubation nHFOV, for example (Prasad R, et al. Noninvasive high-frequency oscillation ventilation as post-extubation respiratory support in neonates: Systematic review and meta-analysis. PLoS One. 2024;19(7):e0307903), there are numerous included RCTs for which I cannot find the original article. Some appear to only show up when searching the Chinese medical publication database, so it is impossible for me to check the definitions of treatment failure, for example, or the characteristics of included babies.

That review shows a reduction in extubation failure when nHFOV was compared to nCPAP,

and a smaller advantage of nHFOV compared to nIPPV.

That review showed no difference in any other clinically important outcomes, such as lung injury as measured by BPD at 36 weeks, IVH, or the other usual neonatal complications.

It is also not clear if all the babies benefited from an optimal approach to extubation, with caffeine pre-treatment, evaluation of spontaneous breathing, and higher CPAP levels in those still requiring oxygen, all of which can reduce extubation failure rates, and might eliminate a possible advantage of nHFOV, or not. The included trials are also all tiny or modestly sized, and only one was individually significant, which inflates the probability of a chance finding.

In addition, as mentioned, there are some problems with availability of these publications. The reference list of this SR/MA has no entry for Zhu 2019; it is supposed to be reference 37, but reference 37 is a different publication, a non-controlled report of nHFOV use, and I can’t find any link to a trial of nHFOV authored by Zhu in 2019, including from searching the Chinese database, CNKI. Also Liang 2019, and several other references, do not have a URL link; I searched the CNKI database and found a link to the Liang article, which had an abstract in English (the things I do for my readers!), but the full publication is behind a paywall, so I have no idea how extubation failure was defined in that study.

The systematic review seems to show that nHFOV post-extubation leads to a decrease in extubation failure from an overall rate in these studies of 1 in 4 with CPAP (which seems very high) to about 1 in 9. In the studies comparing with nIPPV, the overall rate decreases from 1 in 6 with nIPPV to about 1 in 9 with nHFOV. But, if you eliminate the trials that I cannot find, and the tiny trial with an extremely high failure rate in controls, there is no clear advantage to post-extubation nHFOV compared with nIPPV in extubation failure.

Failing extubation and being re-intubated is, in itself, an important outcome to parents (and I would guess to the babies!). Even if there were no other advantage in terms of lung injury, VAP, or length of stay, a reduction in re-intubation would in itself be worth the hassle. For now, I am not convinced of the advantage of nHFOV compared to nIPPV for routine support post-extubation in the very preterm infant (who has the highest chance of being re-intubated). The data for other interventions, including NIV-NAVA (Tome MR, et al. NIV-NAVA versus non-invasive respiratory support in preterm neonates: a meta-analysis of randomized controlled trials. J Perinatol. 2024;44(9):1276–84), are also unconvincing as evidence for routine use.

The current evidence shows that prior to extubating babies we should ensure therapeutic caffeine treatment and perform a Spontaneous Breathing Test. If the baby passes the test, very preterm infants should receive either nIPPV, nHFOV, or NIV-NAVA, with a PEEP of 6 if in 21% oxygen, and 8 or 9 if they have residual oxygen needs.

Decreasing extubation failure, and the need for reintubation, with its resultant trauma to the airways, and trauma to the parents’ hopes, is an important goal for research, even if longer term pulmonary complications are not affected.



Avoiding IVs in moderately preterm babies

A new very large (for neonatology) RCT has just been published (Ojha S, et al. Full exclusively enteral fluids from day 1 versus gradual feeding in preterm infants (FEED1): an open-label, parallel-group, multicentre, randomised, superiority trial. Lancet Child Adolesc Health. 2025). Mothers were approached prior to preterm delivery, and babies were enrolled if they delivered between 30 and <33 weeks gestation, and were deemed clinically stable, prior to 3 hours of age. Prior to delivery the mothers either gave full written signed consent, or they verbally agreed to the study, in which case they had a full written consent later. This allowed babies to be enrolled early in life, at an average of about 1.5 hours. Twins were therefore randomized as a unit to the same group; there were 2088 babies from 1761 mothers.

About half of the babies already had an IV when they were enrolled. In the intervention group, babies then received “full-milk-feeds”, starting at 60-80 mL/kg/d (the published protocol states 60 mL/kg/d), whereas the control group had a maximum of 30 mL/kg/d enteral liquids on day 1.

This is described thus: “For infants in the full milk group, we started milk feeds within 3 h of birth at 60–80 mL/kg per day via a gastric tube and continued milk feeds without intravenous fluids or parenteral nutrition”. But this just isn’t true; 80% of the full milk group infants did have intravenous fluids on day 1, as their own figure shows. I think this is really just a question of wording. The full milk group babies were supposed to attempt full enteral nutrition, but large numbers of babies in this GA group will have expectant IV antibiotics for possible early onset sepsis, so they required IV access for good clinical care. 50% of the babies had an IV at enrolment (there is no mention anywhere in the publication or the supplementary materials of IV antibiotics). This also means that 300 babies in the full feeds group had an IV inserted, within the first few hours of birth, after they were assigned to the full enteral feeding group. This is never explained, or even mentioned or discussed.

They also never state what was done with feeding volumes for infants who had an IV running; were the IV fluids included in the fluid calculation? If a baby weighed 1.2 kg, for example, and had an IV running at 2 mL/h for their antibiotics, was the 40 mL/kg/d of IV liquid in addition to the enteral liquids? Or was that volume deducted from their feeds? This is an important detail given that 80% of the “full enteral” group had IV fluids.
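To make the arithmetic in that example explicit, here is a small sketch. The function names are hypothetical, and the two scenarios (IV volume added to, or deducted from, the enteral volume) are simply the two possibilities raised above, not anything described by the trial. The 60 mL/kg/d figure is the lower end of the day 1 enteral volume reported for the full milk group.

```python
def iv_ml_per_kg_per_day(rate_ml_per_hour, weight_kg):
    """Convert an IV infusion rate in mL/h to mL/kg/d for a baby of a given weight."""
    return rate_ml_per_hour * 24 / weight_kg


def total_fluids_if_added(enteral_ml_kg_d, iv_ml_kg_d):
    """Scenario A (assumption): IV fluids are given on top of the full enteral volume."""
    return enteral_ml_kg_d + iv_ml_kg_d


def enteral_volume_if_deducted(target_ml_kg_d, iv_ml_kg_d):
    """Scenario B (assumption): IV fluids are deducted from the enteral target."""
    return max(target_ml_kg_d - iv_ml_kg_d, 0)


if __name__ == "__main__":
    iv = iv_ml_per_kg_per_day(rate_ml_per_hour=2, weight_kg=1.2)
    print(f"IV volume: {iv:.0f} mL/kg/d")
    print(f"Scenario A, total fluids: {total_fluids_if_added(60, iv):.0f} mL/kg/d")
    print(f"Scenario B, enteral volume: {enteral_volume_if_deducted(60, iv):.0f} mL/kg/d")
```

In the first scenario the baby receives 100 mL/kg/d in total on day 1; in the second the milk volume falls to 20 mL/kg/d, which is less than the control group's maximum of 30 mL/kg/d. The two interpretations are very different interventions.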

After the initiation of the trial, the local care team could do whatever they wanted, in terms of increasing feed rate, or the source of additional feeds (donor milk or formula), or defining feed intolerance, or measuring gastric residuals(!), or timing of fortification. Full feeds were defined as at least 140 mL/kg/d for 3 consecutive days.

Forty percent of the “full enteral feeding” group had more than 24 hours of IV fluids, but again we have no idea how much of this was due to IV antibiotic use. The babies were all preterm, many had respiratory distress, more than a fifth had ruptured membranes for >24 hours, so I am sure that many had (and needed) IV antibiotics. On the other hand, there were 71% who were delivered by Cesarean section, and babies delivered by CS with intact membranes don’t need antibiotics.

I am sure some also needed IV dextrose for treating hypoglycaemia; we are told that the incidence of hypoglycaemia was the same between groups, but how many had an IV for low blood sugar on day 1 is not reported. In the supplemental data we learn that there were about 7% of babies in the full feeds group who “did not adhere to the protocol”, i.e. had intravenous fluids after 24 hours of age, who were in that situation because of hypoglycaemia. Also, 4% of the full feeds group had an IV after 24 hours of age for “other clinical reasons”, which I guess must include IV antibiotics, but that seems extremely low to me. 12% of them had an IV for not tolerating full feeds, and, as mentioned, 7% for hypoglycaemia. In other words, nearly 90% of the babies did tolerate full feeds from birth.

The primary outcome was duration of hospitalisation, which was determined according to local practice. There was no impact of study group on the primary outcome, which was just over 32 days in each group. There were also no differences in the secondary outcomes of NEC (which was rare, 4 cases vs 6 cases), or late-onset sepsis (which was uncommon, 3% vs 2%). Among gestational age subgroups, the primary and these secondary outcomes were similar.

There were differences in TPN use, number and duration of central line use, and the numbers of peripheral IVs inserted; as you would guess, these were all reduced in the early enteral feeds group.

My take away from this trial, and several other smaller trials, is that full nutritional support can be given, from birth, by the enteral route in a large proportion of preterm infants of 30 to 34 weeks, and if they have no other clinical indication for IV access, one can completely avoid IVs. Infants who need IV antibiotics can usually have their antibiotics discontinued at 36 hours of age (because most of them have negative cultures), after which most of them can be on full enteral fluids. A number of recent trials, some of which I have discussed in the blog, have shown the toxicity of TPN. These new data show that a large proportion of the 30 to 34 weeks babies (the majority of preterm babies in the NICU) can be managed without ever receiving TPN. They can also avoid the pain of IV insertion attempts, and the discomfort of IV infiltration episodes.

There does not seem to be any good reason to start at less than 60 mL/kg/d of enteral milk feeds in this group of babies on day 1. Some babies will have difficulty tolerating this, especially infants with IUGR, and in those babies you may need to slow down feeding advancement, or even sometimes to back down to smaller volumes or temporarily stop feeds. Some will also need IV glucose, but I can’t see any good reason for not at least trying to give full enteral nutrition from birth in these babies, even if they need IV access, and a small volume of crystalloid solution, for antibiotic administration.


Bob Bartlett RIP

I just learned of the very recent death of Dr Robert Hawkes Bartlett, May 8, 1939 – October 20, 2025. He was a surgeon who had been developing extracorporeal oxygenation systems for cardiothoracic surgery who realised that extracorporeal circulation could be used for prolonged support, and was willing to try it out for a baby who was dying.

He told the story of the first baby who received ECMO treatment in his Presidential address to the American Society of Artificial Internal Organs in 1985 (Bartlett RH. Esperanza. Presidential address. Trans Am Soc Artif Intern Organs. 1985;31:723–6).

“That child, treated in 1975 was.. a little girl. Her mother was just a girl herself. A Mexican peasant girl living in Baja who could neither read nor write and who realized, when she became pregnant in 1974, that her baby, if it lived at all, would fare no better. We all have hopes and dreams, and when we become parents our most fervent hope is that our children will live well, grow up bright and beautiful, and exceed the station of their parents, whatever that is. Poor Mexican mothers know that they can give the gift of opportunity to their new offspring in the form of United States citizenship by having the child born in this country. So it was that this young mother, consumed with the wish for a better life for her unborn child, crossed the border and set out for Los Angeles when her labor pains began. But as fate.. would have it, her water broke on the freeway and she took the next off-ramp to Orange County Medical Center. The baby was born – a perfect little girl- but something was wrong. The delivery had been difficult. The neonatologist tried to explain, “Mal respire. Mal grande. Intubation. Ventilator, Oxygen. Pressure. Hypoxia, Seizures.”

“The neonatologist knew that we were working with ECMO (rather unsuccessfully) with adult patients. Would we give it a try? The babe was dying. The arterial PO2 was 12. In the middle of the night, with the aid of a flashlight so as not to disturb the other patients, we tried to explain to the mother through an interpreter the ultimate in high tech procedures which had never been used successfully for an infant. She signed the consent form with an X, scared to death for her little girl and more scared that the official-looking form would bring recognition, deportation, perhaps imprisonment. She went in to see her baby girl, cyanotic, on a ventilator, with tense nurses and residents standing about. And the next day she disappeared, leaving her baby 2 gifts : a US citizenship and a name – Esperanza- Hope.

“…we ligated the patent ductus arteriosus and placed a catheter to monitor pressure in the pulmonary artery. This established the diagnosis of persistent pulmonary hypertension of the newborn. When the spasm finally relaxed and the blood flowed through the lung, our patient could be weaned off bypass, and off the ventilator. Soon she had a foster family.”

The baby survived, and Ann Arbor started a program of offering ECMO for full term infants who were expected to die because of cardiorespiratory failure, usually hypoxic secondary to PPHN. They developed predictive criteria which were reasonably good at predicting which hypoxic babies under full intensive care would die, with over an 80% accuracy. But with ECMO they had over 80% survival.

Bob was criticized for not doing a randomized controlled trial when introducing this new life-saving technology, which could be likened to doing an RCT of parachute use when falling out of a plane (Yeh RW, et al. Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial. BMJ. 2018;363:k5094). Nevertheless, there were many sceptics in many parts of the world who thought they could have saved these babies without ECMO. He listened to them, and designed a study which minimized the number of potential deaths (Bartlett RH, et al. Extracorporeal circulation in neonatal respiratory failure: a prospective randomized study. Pediatrics. 1985;76(4):479–87). The “randomized play the winner” trial was a unique approach to trial design for a situation where adverse outcomes (death) were extremely likely. In essence, the first baby was randomized, and depending on whether they survived or died, the successive randomizations were weighted to increase the chance that a baby would be in the group with survivors, or decrease the chance of being in a group where the previous infant had died.

This design was likened to randomizing by pulling a ball from a sack; within the sack one starts with a black ball (ECMO) and a white ball (standard care). If a baby was randomized to ECMO, for example, and then survived, an extra black ball was added to the sack prior to the next randomization. Likewise, an extra white ball would be added if the baby was randomized to ECMO and died, or if they were randomized to standard care and survived. That way the previous “winner” group would have more chance of being the group assignment for the next baby. As it happened, the first baby was randomized to ECMO and survived (so a 2nd black ball was added), and the second baby was randomized to routine care and died (so a 3rd black ball was added). This progressively increased the chances of a subsequent baby being in the ECMO group, and another 10 babies were randomized to ECMO, who all survived. This reached the pre-specified success criterion, and the trial was terminated.
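The way the sack evolves can be replayed with a few lines of code. This is only a sketch of the urn rule as I have described it: the function and the arm labels are hypothetical, and it replays the reported sequence of allocations and outcomes rather than simulating the random draws themselves.

```python
from fractions import Fraction


def replay_play_the_winner(history, black=1, white=1):
    """Replay a randomized play-the-winner urn.

    history: list of (arm, survived) tuples, arm being "ECMO" or "standard".
    Returns the probability of drawing ECMO (a black ball) before each
    allocation, and the final (black, white) ball counts.
    """
    ecmo_probabilities = []
    for arm, survived in history:
        ecmo_probabilities.append(Fraction(black, black + white))
        # A black ball is added after an ECMO survivor or a standard-care death;
        # a white ball after an ECMO death or a standard-care survivor.
        ecmo_favoured = (arm == "ECMO") == survived
        if ecmo_favoured:
            black += 1
        else:
            white += 1
    return ecmo_probabilities, (black, white)


if __name__ == "__main__":
    # The sequence reported in the trial: 1 ECMO survivor, 1 standard-care
    # death, then 10 more ECMO survivors.
    reported = [("ECMO", True), ("standard", False)] + [("ECMO", True)] * 10
    probabilities, final_urn = replay_play_the_winner(reported)
    for baby, p in enumerate(probabilities, start=1):
        print(f"baby {baby}: P(ECMO) = {p}")
    print(f"final urn (black, white): {final_urn}")
```

Replaying the reported sequence, the chance of being assigned to ECMO rises from 1/2 for the first baby to 2/3 for the second, 3/4 for the third, and 12/13 by the twelfth.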

If this had been a standard RCT then 0/1 compared to 11/11 would not be “statistically significant”; by Fisher exact test the p value is 0.08. But it wasn’t designed as such a trial, and the results did exceed the pre-specified criterion for advantage of ECMO, without consigning large numbers of babies to the inferior treatment, or, to put it less politely, to die.
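The Fisher exact calculation is easy to reproduce by hand or in code. The sketch below uses a hypothetical helper written with only the standard library; it treats the result as a conventional 2×2 table, which, as noted, is not how the trial was designed to be analysed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Rows: ECMO, standard care. Columns: survived, died.
    p = fisher_exact_two_sided(11, 0, 0, 1)
    print(f"p = {p:.3f}")
```

With 11 survivors and 1 death among 12 babies, the only question is which baby died; the chance that it is the single standard-care baby is 1 in 12, which gives the p-value of about 0.08.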

The observational data reported prior to this trial were already convincing enough for Neil Finer in our centre, and he went off for a few months to train in ECMO, then returned to Edmonton to start the first Canadian ECMO program, a process I was delighted to have a small part in.

A couple of years later we held an ECMO conference in Lake Louise, at which I got to meet Bob Bartlett, a delightful, thoughtful, humble man; you can detect those characteristics in the kindness of his description above of the dilemma of the mother of Esperanza.

The conference we held was in the winter, and the schedule was designed so that we could go skiing in the afternoons. Bob was a much better skier than I was, and I remember him skiing down the slopes, amid the most beautiful scenery on earth, with his Sony Walkman playing his favourite tunes as he skied.

Dr Bartlett was a thoracic surgeon whose dedication to improving patient care saved tens of thousands of newborn babies. A page on the ELSO website is dedicated to his memory, and includes a link to a fairly recent video about the development of ECMO.


Neonatal Research Shorts : October 2025

Afifi J, et al. Atropine Versus Placebo for Neonatal Nonemergent Intubation: A Randomized Clinical Trial. J Pediatr. 2025;286:114719

I had thought this was a settled issue; Neil Finer showed many years ago that atropine alone decreased bradycardias during intubation. But as the authors of this new study point out, there are very few (or no) data about atropine as part of an intubation cocktail in the newborn. I have a bit of a beef with the introduction, which suggests that the Kelly and Finer trial mentioned above was limited, as it did not “follow recommended premedication protocols”. But, when Neil Finer and Mark Kelly performed that study, there were no premedication protocols, and everyone in the world was intubating babies awake, and un-premedicated. Apart from this minor wording issue, the rationale for the study is reasonable. All the babies received fentanyl (1-2 microg/kg) and succinylcholine (2 mg/kg) premedication, and they were randomized to additional atropine (20 microg/kg) or placebo.

The primary outcome was dichotomous: occurrence of severe bradycardia (<80 bpm for >10 seconds). There were 73 intubations, with quite a large imbalance between the size of the groups, 49 placebo and 24 atropine. The randomization schedule was blocked, so I am not sure why there is such a big difference, which has an impact on the power of the study, compared to having 35 or so in each group. There was much more severe bradycardia among the controls, much more bradycardia <100, and much longer median duration of bradycardia, than among the atropine babies.

The premedication cocktail used is probably optimal, except for the dose of fentanyl, different doses of which have never been adequately compared. I am not sure that 1 microg/kg is adequate analgesia; my feeling is that larger doses, 2 to 5 microg/kg, are needed for a procedure which is quite painful. But that feeling should be better investigated, and analysing pain in babies who have received a muscle relaxant is rather tricky! Alternatives to atropine, specifically glycopyrrolate, an analogue which has a better safety profile as it doesn’t cross the blood-brain barrier, could well be preferable, but have not been adequately studied in the newborn.

Take Home Message: Premedication for neonatal endotracheal intubation should include atropine.

Ambalavanan N, et al. Early Intratracheal Budesonide to Reduce Bronchopulmonary Dysplasia in Extremely Preterm Infants: The Budesonide in Babies (BiB) Randomized Clinical Trial. JAMA. 2025.

This is the type of study that would normally warrant an entire post on its own, but it has the misfortune to appear after the similar PLUSS trial.

Differences to the PLUSS trial include that all the babies were intubated, there was no enrolment of babies receiving surfactant by LISA/MIST. In addition, all the babies were enrolled prior to their first dose of surfactant. They were all <29 weeks GA and <50 hours of age. They could receive a maximum of 2 doses, if they were retreated with surfactant <50 h of age.

Sample size was similar to PLUSS at 641 total; the initial plan was for 1160 infants, but the trial was stopped after around 50% for futility. In general, I think it is a mistake to stop for futility, but in this case, with the null results of PLUSS, which the investigators would have been aware of, the chances of finding important effects of budesonide became very low.

There was a tiny difference in mortality, 15% budesonide vs 13% placebo, and no differences in BPD, 63% of survivors per group. The combined outcome was therefore close to identical in the 2 groups. Unlike the secondary analyses of PLUSS, there was no difference in outcomes in any subgroup, including the subgroup of babies with more severe lung disease (>50% FiO2).

Other secondary outcomes were also similar between groups. There was a small shift in severity of BPD: there was no difference in severe BPD, but slightly fewer moderate and more mild BPD in the budesonide group compared to placebo, although all the 95% CIs included no difference. I think this puts the nail in the coffin of routine budesonide supplementation in very preterm infants. We can’t overcome the adverse impacts of preterm birth and ventilatory support with exogenous steroid treatment.

Take Home Message : There is no rôle for routine addition of budesonide to early surfactant replacement.

Singh G, et al. Dopamine versus epinephrine for neonatal septic shock: an open labeled, randomized controlled trial. J Perinatol. 2025.

This is a single centre trial among term and late preterm infants (>34 wk) with septic shock. It was registered on the 1st of March 2023, the first patient was enrolled on the 6th of March, and it enrolled 80 babies with fluid refractory septic shock; that is, they continued to have signs of shock (defined in the publication) despite up to 60 mL/kg of fluid in term babies and up to 30 mL/kg in the preterm. Enrolment seems to have been completed in under 2 years, with 206 cases of septic shock, of whom 108 were “fluid-refractory”, 28 had exclusions and 80 were randomized. It is hard for me to imagine an NICU with over 100 cases a year of septic shock among term and near-term infants. The authors give no bacteriology, but the infants were diagnosed with pneumonia (60%), meningitis (30%) or “parenteral diarrhea”, whatever that is (7%). Infants who received epinephrine were more likely to have reversal of their shock at 60 minutes (78% vs 63%), and more likely to have reversal of shock within 40 minutes (65% vs 43%). However, almost all the babies died, with no difference between groups (85% vs 88%).

Take Home Message : There are very few data on which to base choice of inotropes in newborns with septic shock. Clinical outcomes are poor, but there are some indications of better haemodynamics with epinephrine than dopamine.

Rochow N, et al. Individualized Target Fortification of Breast Milk with Protein, Carbohydrates, and Fat for Preterm Infants: Effect on Neurodevelopment. Nutrients. 2025;17(11).

A couple of years ago, the group from McMaster published a randomized trial of individualized fortification of mother’s milk, compared to standard fortification, among VLBW babies, and showed improved growth outcomes. Milk was analyzed 3 times a week, and fortification adjusted according to the results of that analysis. When I compare those older results with our local practice, they had rather poorer growth in their comparison (standard fortification) group, with a mean of 2.3 kg at 36 weeks, than we do (Lapointe M, et al. Preventing postnatal growth restriction in infants with birthweight less than 1300 g. Acta Paediatr. 2016;105(2):e54–9), with an approach where we routinely target 165 mL/kg/day of milk fortified to 81 kcal/100mL. We rapidly increase volumes or fortification in case of poor growth, and the mean discharge weight, at a mean of 37.9 weeks, was 2.88 kg.

The advantage of the individualized fortification was greatest, as one might expect, in the subgroup of infants whose maternal breast milk had lower than average protein content. The babies with MoM protein content that was higher than the average had relatively modest impacts of the individualized approach.

Of course, our approach means that babies who need to have their fortification adjusted will have passed a period of poor growth, and babies may pass several days with inadequate protein or energy intakes. In contrast, the individualized approach could be considered more “prophylactic”: by adjusting fortification according to measured breast milk composition, one can ensure that recommended intakes are received throughout the hospitalisation, and hopefully avoid periods of poor growth. Or, at least, that would be the case if the intervention started early after birth; unfortunately, in this study, the intervention started at an average of 24 days of life.

I have a lot of sympathy with the ideas behind the individualized fortification approach, based on the known variability of breast milk content, for women who deliver at term or preterm (for a very nice review see Gates A, et al. Review of Preterm Human-Milk Nutrient Composition. Nutr Clin Pract. 2021;36(6):1163–72). But, it is time-consuming and costly, and needs some specialized equipment, i.e. an infra-red analyser for protein and fat, and a lab system for the lactose. Do the short term impacts translate to longer term outcomes? 69 of the original sample of 103 infants were examined at 18 months corrected age with neurologic examinations and Bayley version 3.

As you can see here, all the Bayley scores were a little higher in the intervention (individualized fortification) group. The differences were actually smaller between the low protein controls, and the low protein intervention infants. You can also see that there are only 51 babies with Bayley scores in this table, I don’t know where the results went for the other 18 babies, who, according to the CONSORT flow diagram, had their BSID evaluations. In a table a bit further on in the publication, which shows the proportions of infants below certain threshold scores of their BSID scores, the lost babies re-appear.

Take Home Message : These results suggest that individualized fortification might have some long term benefits, but that is not yet proven; there were at least no adverse impacts shown.


Predicting neurological and developmental outcomes. Why? How?

There are a huge number of publications correlating medium term outcomes (by which I mean outcomes around 1 to 2 years of age) with findings in the neonatal period. Most have concerned various approaches to brain imaging, although other studies have evaluated EEG, NIRS, early structured physical examinations, counting how many complications the baby had, the type of feeding they received…. I am sure my readers could construct a longer list.

Several recent publications concerning the extremely preterm infant have triggered this post.

In the preterm infant, brain injury on imaging is very common, yet most preterm babies actually function very well. Why, therefore, have we spent so much effort trying to refine brain imaging, to find better ways to predict outcomes?

The main point I want to make is the following:

Finding a statistically significant correlation is not the same as an individually useful prediction. Investigations seeking answers about the cause of neurological and developmental problems in the newborn might be satisfied to find a statistically significant correlation, that could be a reasonable research goal. But, just because there is a significant correlation doesn’t mean that a finding is useful to predict an individual child’s prognosis.

Let me give one older, illustrative example: Tusor N, et al. Punctate White Matter Lesions Associated With Altered Brain Development And Adverse Motor Outcome In Preterm Infants. Sci Rep. 2017;7(1):13250. This study quantified the punctate white matter lesions (PWML) on MRI at term equivalent age, in a multicentre cohort of 500 preterm infants of <33 weeks. They examined the infants at 20 months corrected age, performed Bayley version 3 developmental screening and a neurological exam. 114 infants had PWML and a neuro exam, of whom 10 had CP with a GMFCS grade 2-5; 281 infants had no PWML and a neuro exam, of whom 2 had CP of those grades.

The incidence of CP among all those who showed up for a neurological examination, therefore, was 3.5%. If they had PWML on the MRI at term the incidence was 9%. So the difference in CP incidence was statistically significant, with an Odds Ratio of 6.6 (95% CI 2 to 22, which does not include 1). But for an individual baby the finding was completely useless as a prediction!

If you found PWML on the MRI you could say with over 90% confidence to the parents, that the infant would not have disabling CP!
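The arithmetic behind that "over 90%" figure can be sketched in a few lines. This uses only the counts quoted above (10 of 114 with PWML, 2 of 281 without); the function and variable names are my own, for illustration.

```python
def risk(events: int, total: int) -> float:
    """Proportion of infants in a group who had the outcome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


# Counts quoted in the post from the Tusor et al. cohort
cp_with_pwml, n_with_pwml = 10, 114
cp_without_pwml, n_without_pwml = 2, 281

risk_with = risk(cp_with_pwml, n_with_pwml)
risk_without = risk(cp_without_pwml, n_without_pwml)

# What you could tell the parents of a baby WITH lesions on the MRI
chance_no_cp_despite_pwml = 1.0 - risk_with

print(f"CP if PWML present:    {risk_with:.1%}")
print(f"CP if PWML absent:     {risk_without:.1%}")
print(f"No CP despite PWML:    {chance_no_cp_despite_pwml:.1%}")
```

The relative difference between the two groups is large, which is why the association is statistically significant, but the absolute risk in the "abnormal" group stays below 10%.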

Even among those with PWML in the cerebro-spinal tracts, or those with large numbers of lesions, the individual prediction of CP was always under 30%.

Some years ago, an abnormal finding of what was called DEHSI, an abnormal white matter appearance of Diffuse Excessive High Signal Intensity, was reported as a frequent occurrence at term equivalent age in very preterm infants (Maalouf EF, et al. Magnetic resonance imaging of the brain in a cohort of extremely preterm infants. J Pediatr. 1999;135(3):351–7); it was initially thought to be a poor prognostic factor. However, further study suggested that it was not strongly associated with worse developmental outcomes. This new study, from a multicentre cohort of about 340 babies from Cincinnati of 32 weeks GA or less (Derbie AY, et al. Diffuse white matter abnormality is independently predictive of neurodevelopmental outcomes in preterm infants. Arch Dis Child Fetal Neonatal Ed. 2025), used automated objective measurement of the signal intensity in the Centrum Semiovale. If you remember from your neuroanatomy, this is a large region of subcortical white matter superior to the lateral ventricles and the corpus callosum, which on each side of the brain is roughly the shape of half an oval! Increased signal intensity on T2 imaging, calculated by their algorithm as being more than 1.8 SD higher than the mean density, they called Diffuse White Matter Abnormality (DWMA). As you can see from the outlined regions in the MRI below, you wouldn’t have spotted those regions by eye; it really requires their computerised algorithm.

These babies were then followed, with a good percentage returning for evaluation (nearly 90%), at 2 years for Bayley’s and neuro examination; at 3 years (also around 90%) they had cognitive testing done using a tool that is well validated, and somewhat predictive of later academic performance: “the Differential Ability Scales, second edition (DAS-II) General Conceptual Ability (GCA) score”.

There was a strong correlation between several different factors, including the MRI findings, and motor scores, diagnosis of CP, and cognitive scores.

This kind of study always leaves me ambivalent. On the one hand, the finding that the extent of white matter injury correlates with motor, and, less strongly, with cognitive outcomes, is an unsurprising confirmation of the importance of abnormal brain development in very preterm infants. This is a very well done study, with infants from a wide range of gestational ages, excellent high quality follow up, and extensive statistical analysis.

On the other hand, what on earth are we supposed to do about it? The outcomes measured are largely outcomes that have little importance for parents. As the Parents’ Voices project has confirmed, parents don’t care about Bayley scores, and CP with a GMFCS=3 is not considered by parents to be a serious adverse outcome.

In addition, the correlation with the outcomes is not even very close. The following figure shows the contribution of various factors to the cognitive outcomes of the infants. It shows that having antenatal corticosteroids (ACS), maternal breast milk (MMDD) and being a girl (SEX) had a substantial positive impact, whereas High-Risk Social Status (HRSS) had a major adverse impact, especially if combined with moderate or serious Brain Abnormalities (shown on the figure as msBA, defined by a Kidokoro score of over 7 on the term-equivalent MRI).

Those other factors were much more strongly correlated with the outcomes than the DWMA. For example, receiving antenatal steroids gave a ß value of +13.5, compared to about -2 for the DWMA, and receiving maternal milk at discharge had a ß of +8. I think this means that, after correcting for other factors, the cognitive score was on average 8 points higher among those who received maternal milk compared to those that did not; this is a huge impact from something that we can do something about! Or, depending on how the statistics were done, it might be 0.8 SD higher: 1 SD was 20 points on the cognitive score, so 0.8 SD would be 16 points higher. As you can see from that graph above, the DWMA had a relatively weak association with cognitive score (all the results were similar for the motor composite score). The ß of about -2 means that, for each 1 SD increase in DWMA volume, there was an average of a 0.2 SD lower score on the Cognitive Composite, or 4 points lower.
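The conversion between effects expressed in SD units and in score points is simple enough to sketch. This is only an illustration of the arithmetic in the paragraph above; it takes the SD of 20 points from this post, and which scale the published ß values are actually on remains my assumption.

```python
# SD of the cognitive score, as quoted in this post (not re-derived from the paper)
COGNITIVE_SD_POINTS = 20.0


def sd_units_to_points(effect_sd: float, sd_points: float = COGNITIVE_SD_POINTS) -> float:
    """Convert an effect expressed in SD units into score points."""
    return effect_sd * sd_points


# Maternal milk, IF the coefficient corresponds to 0.8 SD (one of the two readings)
milk_points = sd_units_to_points(0.8)

# DWMA: about 0.2 SD lower score per 1 SD increase in DWMA volume
dwma_points = sd_units_to_points(-0.2)

print(f"Maternal milk (0.8 SD reading): {milk_points:+.0f} points")
print(f"DWMA (per 1 SD of volume):      {dwma_points:+.0f} points")
```

On either reading of the milk coefficient (8 points or 16 points), the modifiable factor outweighs the imaging marker several times over.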

Unlike some other markers on TEA brain imaging, the absence of DWMA does not appear to be very predictive of absence of CP, but I can’t find enough data to calculate the Negative Predictive Value in this publication.

I also find the way the results are presented to be questionable. The subtitles of the various sections of the results all use the term “prediction” such as in “prediction of motor performance”, “prediction of cognitive performance”, “prediction of CP”. These all presuppose that the prediction is useful, whereas they all were, in reality, very poorly predictive. It would have been better to title the sections “correlation with…”

There is a really good paper describing the methodology, goals, and interpretation of prognostic studies that I have been reading (Kent P, et al. A conceptual framework for prognostic research. BMC Med Res Methodol. 2020;20(1):172). It makes the important distinction between prognostic determinants and prognostic markers, and between various stages of exploratory and confirmatory research projects. This project would therefore be an exploratory study. Although the authors of this study have shown that there is a relatively weak association between the extent of DWMA and lower scores on the developmental screening tests at 2 to 3 years, we can’t tell from this whether the DWMA is a marker or is causative, and whether interventions aimed at reducing DWMA will improve outcomes.

To come back to the sentence in bold type above, the authors have shown a statistically significant correlation between having more white matter injury, using their algorithm on TEA brain MRI, and poorer motor scores at 2 years, lower cognitive scores at 3 years, and a diagnosis of CP (stage 1 or more). But, is that association useful in prognosis for an individual? They have shown that other factors are much more strongly associated with those outcomes, some of which are potentially modifiable on a group basis (working harder to ensure that mothers get steroids before delivery) or on an individual basis (maternal milk intake).

Another very recent study from the same group (Mahabee-Gittens EM, et al. Severity of punctate white matter lesions in preterm infants: antecedents and cerebral palsy prediction. Pediatr Res. 2025) analysed PWML (the punctate lesions discussed above) and divided the 28 who had PWML into tertiles of extent of lesions. 39 of the 339 infants in the study (12%) developed CP, of whom 6 were among those who had PWML.

To put it another way, the large majority of the babies who developed CP (33 of 39) did not have PWML. The majority of those with PWML, 22 of 28, did not develop CP.
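Those counts are enough to fill in a complete 2x2 table, so the usual test characteristics can be worked out. The sketch below uses only the four numbers quoted above (339 infants, 28 with PWML, 39 with CP, 6 with both); the class and function names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for a test (PWML on MRI) against an outcome (CP)."""
    tp: int  # PWML and CP
    fp: int  # PWML, no CP
    fn: int  # no PWML, CP
    tn: int  # no PWML, no CP

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def from_totals(n_total: int, n_test_pos: int, n_outcome: int, n_both: int) -> TwoByTwo:
    """Build the table from marginal totals and the overlap."""
    fp = n_test_pos - n_both
    fn = n_outcome - n_both
    tn = n_total - n_test_pos - fn
    return TwoByTwo(tp=n_both, fp=fp, fn=fn, tn=tn)


table = from_totals(n_total=339, n_test_pos=28, n_outcome=39, n_both=6)

print(table)
print(f"Sensitivity: {table.sensitivity:.1%}")
print(f"Specificity: {table.specificity:.1%}")
print(f"PPV:         {table.ppv:.1%}")
print(f"NPV:         {table.npv:.1%}")
```

The sensitivity comes out at about 15%, so a strategy that relied on PWML to select babies would miss roughly 85% of those who went on to develop CP.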

This study also shows no significant correlation between the PWML and Bayley scores, on any of the 3 domains, including the motor composite.

This is a very unimpressive predictive capacity for the individual baby. Nevertheless, an accompanying editorial somehow uses those results to push the idea that all ex-preterm babies should have MRIs. They claim that doing so would allow early intervention, not mentioning that targeting early intervention to those with PWML on the MRI would fail to include the large majority of babies with CP. In fact, as usual, the study showed that the strongest predictor of poor developmental scores was Social Status, followed, in this analysis, by chorioamnionitis.

A much better way of targeting early intervention then, according to these data, is not to perform routine MRI, but to forget the MRI and routinely enrol infants with poor social status into early intervention programs.

Here’s another idea. Why not take the money that would be spent on routine MRI, and just give it to the poorest parents? (Bouchelle ZM, et al. Unconditional cash transfers to low-income preterm infants and their families: a pilot randomized controlled trial. J Perinatol. 2025) This pilot trial is investigating the impacts of giving unconditional money transfers to families that are at highest risk of having a baby with developmental delay. That is, according to the MRI study above, those with poor social status. Previous studies of giving money to families, with babies mostly born at term, (Gennetian LA, et al. Effects of a monthly unconditional cash transfer starting at birth on family investments among US families with low income. Nat Hum Behav. 2024;8(8):1514–29) showed that “mothers spent more time engaged in cognitively stimulating activities with their children. In addition, ~25% of the value of the cash gift was used on children’s books, toys, activities, clothing, diapers, and children’s electronic items/devices” which are all things likely to improve infant development, and which poor families struggle to obtain.

For 100 babies eligible for a TEA MRI, if they cost, say, $500 each, that $50,000 could instead be divided among the 10 families at highest social risk, and the impacts on infant and family well-being might be dramatic. I think it would be ethically justifiable to have some minor conditions attached, such as participation in an early intervention program; such programs have been shown to improve cognitive and motor function in infancy (see the latest Cochrane Review: Orton J, et al. Early developmental intervention programmes provided post hospital discharge to prevent motor and cognitive impairment in preterm infants. Cochrane Database Syst Rev. 2024;2(2):CD005495).

Instead of performing routine MRIs and searching for abnormalities with a weak association with CP or developmental delay, we should focus on ways to improve those delays and improve outcomes. The individual predictive ability of any finding on MRI at term-equivalent age is between low and extremely low. Such studies may well have value for research, but for improving outcomes of former very preterm babies they are useless.


Caffeine is good for the preterm brain; might more caffeine be even better?

One of the pivotal RCTs in neonatology was the CAP study (Schmidt B, et al. Long-term effects of caffeine therapy for apnea of prematurity. N Engl J Med. 2007;357(19):1893–902). We performed that study because there were no data on the long term impacts of caffeine, and there was a worry that blocking adenosine receptors in babies having multiple hypoxic episodes might be a bad idea. Adenosine is an inhibitory neurotransmitter that is produced during hypoxia, and decreases the brain metabolic rate to protect against hypoxic damage. So giving caffeine to babies having a lot of apnoeas could potentially have been harmful.

As it turned out, caffeine, given for a few weeks in the neonatal period to babies <1250g birth weight who, in the first 10 days of life, were thought to have an indication for caffeine by the attending physician, had a lasting positive impact, with some fine motor benefits even out to 11 years of age (Murner-Lavanchy IM, et al. Neurobehavioral Outcomes 11 Years After Neonatal Caffeine Therapy for Apnea of Prematurity. Pediatrics. 2018;141(5)). Why is caffeine so beneficial? It could be because of a reduction in apnoea, and in the consequent intermittent hypoxia; it also appears to have effects on cerebral oxidative injury and on apoptosis. There is one article, for example, from a study in mice, which showed that caffeine reduced hypoxia-induced white matter injury (Back SA, et al. Protective effects of caffeine on chronic hypoxia-induced perinatal white matter injury. Ann Neurol. 2006;60(6):696–705).

The dose of caffeine that we gave was based on the data available at the time. It appeared to have a wide margin of safety, so the standard dose that we settled on, a 20 mg/kg load of caffeine citrate followed by 5 mg/kg/d, could be increased to a maximum of 10 mg/kg/d if thought to be clinically indicated. I published an abstract, and presented orally, at the PAS meeting in 2010, which showed that infants who got a higher dose, compared to controls who had a higher “dose of placebo”, had the same benefits on long term outcome that we saw among the overall group. To my shame, I never followed up and wrote up the full publication. But, as I was writing this post, I dug out the PowerPoint presentation that I gave in 2010 of this secondary analysis. I thought I would give you all a treat and show some of the results. We had 1583 infants with data for these analyses, and the medians for caffeine exposures are shown below:

An example of the results is given below.

This shows the percentage of infants in each of the two groups who had a low Bayley (version 2) MDI score, <85, at 18 months corrected age, divided by duration of treatment. Infants who had treatment longer than the median of 45 days, with either caffeine or placebo, were more likely to have low scores than those with shorter treatment. It also shows that the difference between caffeine and placebo groups was the same, regardless of duration of study drug administration. We did the same kind of analysis for maximum daily dose received, total accumulated dose, average daily dose, and the PMA at which caffeine was stopped. None of the analyses showed any impact of caffeine exposure on the advantages of the caffeine group. The ORs in the figure above are the Odds Ratios for “cognitive delay” between caffeine and placebo in the 2 subgroups.

For the much smaller numbers with cerebral palsy, the results were different: longer duration of caffeine use, and higher total dose received were associated with a much greater difference between caffeine and placebo groups.

When we looked at average daily dose, or maximum dose received there was no association with the impact of caffeine on CP.

In other words, infants in the placebo group were more likely to develop CP (GMFCS grade 1 or worse) if they received study drug for longer than the median, but those who received caffeine were not. This made us wonder if the impact of caffeine on motor function was by different mechanisms to the impact on cognitive development. The overall primary outcome (which was “death or NDI”) was not associated with any of the metrics of caffeine exposure; the outcome, as usual, was driven by lower Bayley scores, so the impact of duration of therapy on CP was not noticeable in the composite primary outcome.

I must acknowledge the collaborators in the CAP trial here, the co-authors of that abstract were myself, Robin Roberts, Barbara Schmidt, Elizabeth Asztalos, Aida Bairam, Arne Ohlsson, Koravangattu Sankaran, and Alfonso Solimano, as well as the other CAP investigators; and of course the PI and driving force behind the trial was Barbara Schmidt. The title of the abstract was : The Caffeine for Apnea of Prematurity (CAP) trial: analysis of dose effects.

As I mentioned above, the enrolment criteria included use of caffeine to aid extubation, to prevent apnoea, or to treat apnoea. Some infants therefore received caffeine early as prophylaxis against apnoeic spells, others later after apnoea had become evident. Peter Davis led the group to publish a subgroup analysis, (Davis PG, et al. Caffeine for Apnea of Prematurity trial: benefits may vary in subgroups. J Pediatr. 2010;156(3):382–7), showing that the subgroup who received caffeine for prevention of apnoea had fewer benefits, in terms of the long term advantages, than those who received it for treatment of apnoea or to assist in extubation. In contrast, the babies who received caffeine earlier (less than the median of 3 days of age) had more advantages than those who received it later (>=3 days), they had the biggest decrease in duration of oxygen treatment, and therefore in having a diagnosis of BPD, and more long term advantages.

The CAP study was performed in the days before we were so aggressive with non-invasive support, so large numbers of the early treated babies were intubated at the time they were enrolled. There is therefore some overlap with another secondary analysis in that paper, which was the impact of caffeine according to the respiratory support at the time of randomization. Intubated and CPAP babies had more benefit from caffeine than those not on any support.

As I was preparing this post I read a Systematic Review and Meta-analysis which purported to be an SR of prophylactic caffeine, compared to control groups (Miao Y, et al. Effect of prophylactic caffeine in the treatment of apnea in very low birth weight infants: a meta-analysis. J Matern Fetal Neonatal Med. 2023;36(1)). It was, however, seriously flawed, and I won’t provide a link. The authors had included the CAP babies as if they had all received prophylactic treatment, and had also included, a second time, the babies in the CAP prophylactic subgroup from the Davis article, who were therefore double counted. That wasn’t all: 4 of the references in the reference list, supposedly to articles included in the SR, were not to the right articles, which meant that 4 of those that were actually included had no reference, and no way to find them, including one study (Ke H 2018) which apparently had over 1000 VLBW babies. A failure of peer review, I’m afraid.

A good quality SR, in comparison, is this analysis of caffeine and dose effects, on long term outcomes. Oliphant EA, et al. Caffeine for apnea and prevention of neurodevelopmental impairment in preterm infants: systematic review and meta-analysis. J Perinatol. 2024;44(6):785–801. The results of analyses for BPD, PDA, and survival, for caffeine compared to placebo, were almost entirely dependent on the CAP trial (weight of over 95%). When they analysed the dose comparison trials, higher doses led to less apnoea and less BPD.

Another reasonably good SR was a comparison of caffeine use by timing, early (<3 days) vs late (3 days or later): Karlinski Vizentin V, et al. Early versus Late Caffeine Therapy Administration in Preterm Neonates: An Updated Systematic Review and Meta-Analysis. Neonatology. 2024;121(1):7–16. It only included 2 small RCTs, each with about 90 subjects, so most of the data are from the 9 observational studies. All of the outcomes were better with early treatment, except for mortality. As the authors point out, mortality may be higher in the early treatment group because you have to survive at least 3 days to be in the late treatment group, i.e. there was a survival bias.
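A toy calculation shows how that survival bias can arise even when timing has no effect at all on death. All of the numbers below are HYPOTHETICAL, invented purely for illustration; none of them come from the review or from any trial.

```python
def crude_mortality(deaths: int, n: int) -> float:
    """Deaths divided by the number of infants classified into the group."""
    return deaths / n


# HYPOTHETICAL cohort: two identical sets of 500 infants, and caffeine timing
# has NO effect on death in this toy example.
infants_per_strategy = 500
deaths_before_day_3 = 20    # hypothetical deaths in the first 3 days
deaths_after_day_3 = 48     # hypothetical later deaths (10% of the 480 survivors)

# "Early" group: infants are classified from the start, so every death counts.
early_mortality = crude_mortality(
    deaths_before_day_3 + deaths_after_day_3, infants_per_strategy
)

# "Late" group: an infant who dies before day 3 never receives late caffeine,
# so those deaths are never counted against late treatment.
late_group_size = infants_per_strategy - deaths_before_day_3
late_mortality = crude_mortality(deaths_after_day_3, late_group_size)

print(f"Early group mortality: {early_mortality:.1%}")
print(f"Late group mortality:  {late_mortality:.1%}")
```

In this invented example the early group appears to have higher mortality (13.6% vs 10.0%) purely because of who is eligible to be counted in the late group.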

One study that will help a lot with the use of caffeine is the ICAF trial, which was presented at PAS in 2024, but whose results have still not been published. The intervention was restarting caffeine or placebo at 36 weeks, after the clinically used caffeine was stopped, until 42 weeks 6 days PMA. Outcomes were the number of Intermittent Hypoxia episodes, inflammatory markers, and MRI findings at about 45 weeks. It seems to be taking a long time to publish, as enrolment had already finished in June 2023. The initial sample size was planned to be 220, according to the registration documents, but only 160 were reported in the abstract. As it was an abstract, there aren’t all the details about why the sample size was not achieved. Nevertheless, I think the benefits of continuing caffeine to 43 weeks that were reported are very interesting, and warrant a reconsideration of our usual approach. We could do with some more information, though, before prolonging caffeine for all babies <30 weeks. Are there risks of stopping caffeine at home after discharge? Are there benefits other than those reported, which have a clinical impact? The outcomes reported in the abstract are very interesting (go here for a review): there was much less intermittent hypoxia, but whether that translated into a useful clinical benefit is uncertain. What was an important benefit, however, was an 11 day shorter length of hospital stay. If that is due to the intervention, which seems probable, even though it was an “exploratory” analysis, then that is a potential major advantage of continuing caffeine much longer than we usually do.

The reason for ruminating about caffeine use is that we still don’t really know the optimal dose (perhaps higher doses would be better), the optimal timing (earlier may well be better, but the data are very weak), or the optimal duration (longer might be better; ICAF will help, and hopefully there is a long term outcome plan for the ICAF babies).

A new observational study (Ostrem BEL, et al. Cumulative caffeine exposure predicts neurodevelopmental outcomes in premature infants. Pediatr Res. 2025) examined the total dose of caffeine received in babies at UCSF. They calculated the average daily caffeine dose of a cohort of infants <32 weeks who had TEA brain MRI. Average dose was used to divide the babies into tertiles of dose received; infants had Bayley version 3 scoring at 30 months corrected age.

The infants received a standard bolus dose of 20 mg/kg; the median maintenance dose was 7.6 mg/kg/d (IQR 6.3-8.7), which is higher than the starting CAP trial dose of 5 mg/kg/d. So many of these babies had higher doses, with a range therefore between about 5 and 10 mg/kg/d. The 3 tertiles included 23 babies in each group. They also divided the total caffeine exposure by the number of days between birth and the due date, to give what they called average daily caffeine exposure. This averaged about 3.3, as it included a variable number of days that the infants received no caffeine.

As you can see from those violin plots, infants who had more caffeine had, in general, higher scores on motor and language domains, and slightly higher scores in the cognitive domain. The data were corrected for Gestational Age and duration of oxygen therapy. There was also, interestingly, very poor correlation between MRI findings and Bayley scores, which I will come back to in a future post.

These new data are again suggestive of a developmental benefit of caffeine in the long term, with infants receiving more caffeine having higher scores, after correction for potential confounders.

The major concern that I have about higher standard doses comes from the 1 trial which showed an adverse impact of very high doses (McPherson C, et al. A pilot randomized trial of high-dose caffeine therapy in preterm infants. Pediatr Res. 2015;78(2):198–204). In that trial the loading dose was a whopping 80 mg/kg, over a 36 hour period, compared to 30 mg/kg over 36 hours in the comparison group. Both groups received maintenance of 10 mg/kg/d. The high-dose infants in that study had more cerebellar hemorrhages on MRI, and more abnormal signs at term, although longer follow up did not show a disadvantage of the high-dose group.

You may recall that the first neonatal comparison in the PLATIPUS platform is between 3 different caffeine dosing groups, with low dose being the standard of 20 mg/kg load followed by 10 mg/kg/d, medium dose being 30 mg/kg followed by 15 mg/kg/d, and high dose being 40 mg/kg followed by 20 mg/kg/d. Treatment is up to 36 weeks, followed by open label treatment according to the team’s preference. The hierarchical composite outcome of that study should capture both the potential hazards suggested by the McPherson study, and the possible advantages of higher doses. Of course the optimal duration of therapy will remain uncertain until further studies similar to ICAF are performed.


Unethical research practice, fraud and abuse of trust.

One of the worst kinds of unethical research practice is to fail to publish results after a prospective study.

Parents consent to research for altruistic motives, in the belief that their baby’s participation will help the care of other, future, babies. Failing to carry through and publish, or at the very least, make results publicly available on the registration website, is an abuse of that consent. It is fraudulent to ask for consent for a prospective study and then to hide the results because they are not what you wanted to find.

This is one of the major reasons behind the mandatory registration of prospective research on publicly available registers. All prospective research, including essential details such as eligibility, sample size, interventions, and primary outcomes, must be registered, prior to enrolling patients. This is an important safeguard against investigators changing those details, without an openly available rationale. Clearly, sample sizes may change (almost always decrease) because of unforeseen circumstances, eligibility criteria may be adjusted, and even major outcomes may be changed. However, the initially planned primary outcome must always be reported, even if it is adjusted prior to analysis of the results. Any change in outcomes reported, or additional outcomes, after the results have been analyzed must also be clearly reported as being post hoc, and are always only hypothesis generating.

A recent study of a relatively non-invasive test of lung maturity (testing the L:S ratio on gastric aspirates (GAS), from a gastric tube that was inserted for clinical reasons) was performed, and some results have just been published by the clinical investigators. Heiring C, et al. Predicting Surfactant Need at Birth: Failed Validation of a Bedside Method Using Gastric Aspirates. Acta Paediatr. 2025. The method failed to adequately predict the need for surfactant… as far as we know.

This was a 4 centre study from Denmark among infants of <30 weeks gestation and less than 45 minutes of age, who had not yet had surfactant. The primary outcome as noted on the registration page was the L:S ratio on the samples; the registration page notes: “The primary objective is to measure the L/S-ratio in fresh GAS using the AIMI 1.0/2.0 L/S POC Device and compare the L/S-ratio with the need for surfactant treatment aiming to validate the previously defined cut-off L/S-ratio for surfactant treatment and to determine if the cut-off L/S ratio needs adjustment before starting FAST 2 RCT”.

Of note, this is a really good acronym! It comes from the methodology used and the intervention expected Fourier trAnsform infra-red spectroscopy guided Surfactant Therapy. Or maybe, Fast Assessment of Surfactant deficiency to speed up Treatment.

The study has been performed, parents have consented, and the samples have been submitted for testing, but the patent holders who work with the company are refusing to allow publication of the pre-defined outcome. As the title of this post states, those partners in this study are clearly acting unethically; they are abusing the trust of the families who consented to the study.

Fortunately, we know about this because of the courage of the clinical investigators, and of the Editors of Acta Paediatrica, who have written and published the above article describing the study, documenting the dispute, and noting “disagreements over how the study findings should be reported and which findings to include. Specifically, the laboratory group proposed an unbalanced emphasis on lecithin (DPPC) alone as a predictor of surfactant treatment, based on post hoc analyses using an open dataset outside the framework of the agreed-upon protocol”. As I mentioned above, I don’t think that post-hoc analyses should be banned, but it is essential to focus on the results of the primary, pre-planned analyses. Anything that results from an inspection of the results, after they have been collected, is inherently unreliable, and must be submitted to further independent testing.

It appears that the company has “swivelled”, and an abstract at the recent PAS meeting appears to be reporting a study using the same device, but now discusses its use for predicting prolonged respiratory support (>6 h duration), not surfactant requirement. Firstly, I would caution the researchers at the Mayo Clinic to ensure that they have a legally binding agreement to publish the results, especially the pre-specified primary outcome, otherwise they may find themselves in the same dilemma as the Danish researchers. Secondly, what is the purpose of this? How does that help? What would you do about it? The abstract doesn’t have enough detail to explain potential uses, but hopefully the investigators will make that clear in the future.

It is not unusual for a company which has invested in development of a new drug/technique/machine, and then finds it not to be very useful for the initial indication, to find something post hoc and swivel to that as an indication. It is an understandable reaction; one does not wish to lose the investment that has previously been made, and finding another indication might save the family jewels. One positive example of this is sildenafil, which was initially being developed as an angina treatment (where it seems to be effective, but may cause profound hypotension if the patient then takes their nitrates), but the company, post hoc, noted the frequent side effect of erections! And we all know what followed.

Research ethics approvals should include a legally binding agreement that the results will be submitted for publication, and that the publication must report the approved primary outcome, which must be identified as such. I don’t believe that is the case in many jurisdictions, but it should become the norm, to avoid situations such as this one.


Caring for the most extremely immature infants

There have been multiple publications concerning this issue recently, many from the tiny baby collaborative.

The first 2 publications are about the overall approach to providing intensive care at extremely low GA:
Bernardini LB, et al. It’s the little things. A framework and guidance for programs to care for infants 22-23 weeks’ gestational age. J Perinatol. 2025. This is a discussion of the many issues that should be addressed in centres trying to improve outcomes for these babies, including a recognition that they face specific challenges, require particular attention to detail, have unique physiologic limitations, and deserve an integrated caring approach with a committed team which includes obstetricians, nurses, all the allied health professionals, and parents. Indeed, despite the well-documented differences in the clinical approach of centres with good outcomes, one thing they all have in common is a belief that these babies can do well! You may have to convince your obstetricians that a delivery at 22 weeks is not a miscarriage, but is an extremely high-risk delivery that deserves the best possible care.

Many of the important considerations in this graphic are, appropriately, a bit vague. It is hard to disagree that maintaining “stable BP”, for example, is important, but exactly how to do that, and what to do when the BP is drifting downward at 4 hours of age, is beyond the scope of this graphic, and, unfortunately, has almost no evidence base to determine best practice.

Whatever you do, though, you should try to keep doing the same thing. Protocolized care (which means developing and following protocols) is essential in order to provide consistent, high-quality care (Al Gharaibeh FN, et al. The impact of standardization of care for neonates born at 22-23 weeks gestation. J Perinatol. 2025). This report, from a regional health care program with 30,000 annual deliveries, shows the results of the progressive implementation of a program to support the care of such babies. Guidelines covering many different aspects of care of these infants were implemented. In the early period, none of the 19 infants of 22 weeks GA, and 39 of the 45 23-week infants, received intensive care. This increased progressively, reaching 29/34 and 48/51 in the post-implementation epoch. For some reason these data are presented as Odds Ratios, which makes no sense to me when they are the result of an active decision. More importantly, survival improved, even when limited to the infants receiving active treatment, and complications of prematurity were stable or improved. Length of stay of infants receiving intensive care was shortened.
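To make the point about Odds Ratios concrete, here is a small sketch using only the counts quoted above. It shows the simple proportions receiving intensive care in each epoch next to a crude, unadjusted odds ratio. The helper names are hypothetical, and the crude odds ratio is for illustration only; it is not the estimate reported in the paper.

```python
def proportion(events, total):
    """Simple proportion of infants who received intensive care."""
    return events / total


def crude_odds_ratio(late_yes, late_no, early_yes, early_no):
    """Unadjusted odds ratio, late vs early epoch; None when a zero cell makes it undefined."""
    if late_no == 0 or early_yes == 0:
        return None
    return (late_yes * early_no) / (late_no * early_yes)


# (received intensive care, total infants), as quoted in the text
early = {"22 wk": (0, 19), "23 wk": (39, 45)}
late = {"22 wk": (29, 34), "23 wk": (48, 51)}

summary = {}
for ga in early:
    e_yes, e_n = early[ga]
    l_yes, l_n = late[ga]
    summary[ga] = {
        "early_pct": round(100 * proportion(e_yes, e_n), 1),
        "late_pct": round(100 * proportion(l_yes, l_n), 1),
        "crude_or": crude_odds_ratio(l_yes, l_n - l_yes, e_yes, e_n - e_yes),
    }

for ga, row in summary.items():
    print(ga, row)
```

At 22 weeks the change is from 0% to about 85%, which anyone can interpret, while the crude odds ratio cannot even be calculated because of the zero cell. At 23 weeks the proportions go from about 87% to 94%, which a crude odds ratio of roughly 2.5 describes far less transparently.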

There are many publications documenting the efficacy of protocolizing complex care. The extremely immature infant is a prime example of a group who require such an approach.

Isayama T, et al. Survival and unique clinical practices of extremely preterm infants born at 22-23 weeks’ gestation in Japan: a national survey. Arch Dis Child Fetal Neonatal Ed. 2024;110(1):17-22. This paper is a good example of that premise: despite doing a lot of different things in different centres (some of which I would immediately label as “unnecessary”, “excessive” or even “dumb”), survival is very high. Japanese centres have strict protocols that all the staff follow, such that survival in this cohort at 22 weeks was 63% of those resuscitated (the majority) and at 23 weeks, 80% of those resuscitated (all except 2 of 757 infants). Among questionable practices in their protocols, 128 of the 145 level 3 centres in Japan perform echocardiography at least 3 times a day in the first 3 days; they measure a variety of different variables, as shown below, but what exactly they do in response to AV valve regurgitation, for example, is unclear.

116 of the centres also perform head ultrasound twice a day. What on earth you do about the head ultrasound result I also have no idea, especially as redirection of care is extremely uncommon in Japan.

Most of their ventilated babies are sedated with phenobarbital or morphine; many also use fentanyl, even though it is useless as a sedative. 90% give probiotics, and some use donor milk if there is insufficient maternal milk. I was surprised to see that 50% give formula in this situation, but it seems to happen rarely in Japan, from other studies I have seen. Nearly all of them, 86%, give glycerine enemas in the 1st few days after birth, often on multiple occasions.

It is hard to argue with success, and the survival rates of a largely unselected population are excellent. However, some of the longer term outcomes from Japan are concerning. This brand new publication, for example (Haga M, et al. Prevalence and risk factors for neurodevelopmental impairment in very preterm infants without severe intraventricular hemorrhage or periventricular leukomalacia. Early Hum Dev. 2025;206:106286), shows rather poor outcomes among the babies of 22 and 23 weeks, even though they are selecting, for this publication, the infants who do not have severe IVH or PVL. The results are not directly comparable to those from other countries as they use a Japanese evaluation tool, the Kyoto Scale of Psychological Development, but the statistical spread of the results is similar to other tests, being normalised with a mean of 100 and an SD of 15. The following selection from their results includes the classification of the developmental test result first, with normal being >84, and delay being <70. You can see that the results are quite concerning at 22 and at 23 weeks; then, in the table, appear the actual mean scores and the incidence of CP.

This contrasts with outcomes from other places, such as these national data from Sweden, which include infants with brain injury on ultrasound (Soderstrom F, et al. Outcomes of a uniformly active approach to infants born at 22-24 weeks of gestation. Arch Dis Child Fetal Neonatal Ed. 2021). Of note, not all these infants had formal developmental testing, but Moderate-Severe is similar to the “delay” in the above study.

It may be just my prejudices, but I like to think that the much less interventionist approach in Sweden, with fewer ultrasounds, and more focus on integrating parents, helps to lead to better long term development.

To go back to the obstetric part, the following article confirms the marked lack of evidence to support any intervention for the mother threatening delivery at 22 or 23 weeks. (LeMoine FV, et al. Considerations for obstetric management of births 22-25 weeks’ gestation. J Perinatol. 2025). The weight of the limited observational evidence is strongly in favour of steroid administration, however, and probably also magnesium sulphate.

Agren J, et al. Tiny baby math: supralinear implications for management of infants born at less than 24 weeks gestation. J Perinatol. 2025. This article is an explanation of the major impact of the tiny size of these patients on everything we do to them. The relative volume of fluid flushes leads to very high potential sodium administration rates and possibly seriously excessive heparin doses. We need to develop better small equipment (and be prepared to use 2.0 mm ETTs, for example) and to reduce the volume of blood taken for lab testing. We can completely eliminate CRP testing, for example (that’s my take, not theirs), and we can run electrolytes exclusively on the same whole blood sample that we use for the blood gas, with no increase in volume (and also measure ionized calcium, glucose, lactate, total bilirubin).
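The flush arithmetic is easy to sketch. The example below assumes flushes of 0.9% sodium chloride, which contains 154 mmol/L of sodium; the baby weights, flush volume and number of flushes are hypothetical values chosen for illustration, and are not figures from the article.

```python
SALINE_SODIUM_MMOL_PER_ML = 0.154  # 0.9% sodium chloride contains 154 mmol/L of sodium


def flush_sodium_mmol_per_kg_per_day(weight_kg, flush_volume_ml, flushes_per_day):
    """Sodium delivered by saline flushes alone, per kg body weight per day."""
    total_mmol = flush_volume_ml * flushes_per_day * SALINE_SODIUM_MMOL_PER_ML
    return total_mmol / weight_kg


# Hypothetical routine: eight 0.5 mL flushes per day, the same for both babies.
tiny = flush_sodium_mmol_per_kg_per_day(0.4, 0.5, 8)   # 400 g infant
term = flush_sodium_mmol_per_kg_per_day(3.0, 0.5, 8)   # 3 kg infant

print(round(tiny, 2), round(term, 2))
```

Under these assumptions the 400 g baby receives about 1.5 mmol/kg/d of sodium from flushes alone, 7.5 times as much per kg as the 3 kg baby, before any prescribed sodium is counted.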

Which brings us neatly to fluid balance management, firstly an analysis of current use of humidification (Stoll CM, et al. Approaches to incubator humidification at <25 weeks’ gestation and potential impacts on infants. J Perinatol. 2025) which shows that centres start at varying relative humidity, mostly over 75%, because there is no good evidence to decide what to start at, but that weaning can probably be quite fast. Using high humidity will help to avoid major trans-epidermal fluid loss, and the accompanying heat loss, from the latent heat of vapourisation (high-school physics!) and the consequent hypernatraemia.

That is confirmed by this study from Uppsala (Naseh N, et al. Fluid Balance in Infants born at 22-23 Weeks’ Gestation: Trajectories and Associations with Outcomes. J Pediatr. 2025:114661), where they commence incubator humidity at 85%, and start IV fluids at 110 mL/kg/d at 23 weeks, or 120 mL/kg/d at 22 weeks. They adjust subsequently to aim for the following: a) maximum weight loss of 10–15% at a postnatal age of 3-5 days; b) weight deficit of ~10% at 7 days, and regain of birth weight by 10–14 days; c) plasma sodium <150 mmol/L; d) initial sodium intake <4 mmol/kg/d. These are similar to our goals and, I would think, those of many other centres.

I really don’t like the way this graphic is constructed. For one thing, I think the dotted lines, which are labelled as “Incidence (%)”, are actually prevalence, i.e. the proportion of infants on that particular day with that diagnosis (if it was incidence it couldn’t go down again). Also, putting a continuous variable like weight loss on the same graph as a discrete variable, like the proportion of a relatively small number of patients, which is then shown as a line, is really questionable. Acute Kidney Injury here is defined by oliguria. Of the 69 included infants, 7 received insulin, the precise indications for which I couldn’t find, but there were more than 7 who had a blood glucose >20 mmol/L, so I presume it must be a persistent blood glucose >20.


As you can see from the error bars, weight loss of >15% was not rare, and was associated with increased mortality. They don’t mention IVH, but in our local data (which I haven’t yet published) the combination of hypernatraemia (>145 mmol/L) with concurrent hyperglycaemia (>12 mmol/L) often led to severe hyperosmolarity, with peak osmolarity often >320 mOsm/L, which was strongly associated with severe IVH.
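For readers who want to see how sodium and glucose add up, a common bedside estimate of serum osmolarity is 2 x sodium + glucose + urea, with all values in mmol/L. The sketch below uses that estimate with hypothetical values, including an assumed urea of 5 mmol/L; it is an illustration only, and is not necessarily how osmolarity was derived in the local data mentioned above, which may have been measured rather than calculated.

```python
def estimated_osmolarity(sodium_mmol_l, glucose_mmol_l, urea_mmol_l=0.0):
    """Common bedside estimate of serum osmolarity: 2 x sodium + glucose + urea (all mmol/L)."""
    return 2 * sodium_mmol_l + glucose_mmol_l + urea_mmol_l


# Hypothetical values, not patient data; urea of 5 mmol/L is an assumption.
at_thresholds = estimated_osmolarity(145, 12, urea_mmol_l=5)
worse_day = estimated_osmolarity(152, 20, urea_mmol_l=5)

print(at_thresholds, worse_day)
```

With these assumed numbers, a baby sitting right at both thresholds has an estimated osmolarity of 307, while a sodium of 152 with a glucose of 20 gives 329. Each 1 mmol/L rise in sodium counts double, so modest simultaneous increases in both values quickly take the total past 320.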

Looking after these babies is a real challenge. Many centres are now reporting survival at 22 weeks which is greater than 1/3, and in Japan is approaching 2/3, with dramatically better survival at 23 weeks. Compared to the rest of care for seriously or critically ill patients of other ages, that is very far from being futile! The quality of life of the large majority of survivors is excellent, even though many have challenges, in particular with speech development and executive function. Those challenges should be used as a lever to improve educational and other support services for these patients, not an excuse to deny them intensive care.

It seems that more and more centres are offering active intensive care to infants at these profoundly low gestational ages. I think it is often appropriate to give these babies a chance to “prove themselves”, but we must take into account other risk factors, in particular growth restriction and outborn status, which is often accompanied by lack of antenatal steroids, when we discuss the best approach with parents.

We owe it to these families to do everything we possibly can to improve their chances, with a dedicated team who believe that these babies are worth the effort: a team which has examined the literature and its own practices in detail, has constructed clear protocolized care plans, and is prepared to follow them. Only then can these most immature babies get the care they deserve.


Hypotension and Shock. Optimising treatments

A new single centre RCT of permissive hypotension (PH) compared to “standard treatment” (ST) of very preterm infants 24 to <30 weeks GA, with a mean BP lower than their GA, has just appeared (Alderliesten T, et al. Treatment of Hypotension of Prematurity: a randomised trial. Arch Dis Child Fetal Neonatal Ed. 2025). In the intervention (PH) group, infants only received treatment if they developed signs of poor perfusion. In the ST group they immediately received a fluid bolus and a dopamine infusion, followed by adding dobutamine, then epinephrine; hydrocortisone was given if the baby needed more than just dopamine. They don’t explicitly say what the goals of the cardiovascular support were, and what triggered increases in dose or the addition of other agents. I assume that the goal was to have a mean BP above the GA, but perhaps the goal was the GA+3 or something; it should have been stated. About 1/3 of the 40 ST babies only had a fluid bolus, and no further catecholamine support; about 1/4 of the 46 PH babies eventually had a fluid bolus. From table 2 it looks like there were 19 babies who had a fluid bolus and went on to receive catecholamines in the ST group, and 6 PH babies, at least, if the numbers in the table were the maximum doses received.

The table is poorly explained; the floating numbers in the right-most column are presumably p-values, and I assume that the first of them (0.004) refers to the whole block of data about what seems to be maximum intervention received, and not to the 0 vs 0! It seems that only 32 of the 40 ST babies actually received any treatment for their hypotension.

The babies were enrolled into the study between 2011 and 2018, and the study was eventually terminated, in part because of falling eligibility. This period probably covers the introduction of routine delayed cord clamping in preterm deliveries, which has led to a major reduction in the diagnosis and treatment of hypotension.

The interventions in the PH group presumably are for 15 babies who developed signs of poor perfusion, or a mean BP of GA-5 mmHg (according to the discussion section they were mostly for low BP); but 12 of them had good perfusion after a fluid bolus, and they had no further intervention. Why there are, it seems, 8 ST babies who didn’t receive any intervention isn’t clear. What you can also see in this publication is that the mean BP in the 2 groups was just about identical, despite the interventions in the ST group.

The primary outcome of the trial was developmental outcome using Bayley version 3 cognitive and motor scores at 24 months corrected age. Secondary outcomes included mortality and the usual NICU complications.

The major finding of the trial is that the 2 groups had almost identical long term outcomes. The mean scores, the proportion more than 1 SD below the mean, and traditional combined outcomes were all very similar between the groups, as were the acute complications, NEC and IVH. The mortality was slightly higher in the permissive hypotension group, 6/46 vs 3/40, but they note that there were 2 PH babies who died after the intervention period because of late-onset sepsis (LOS) with Gram-negative organisms and shock.
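To put the 6/46 vs 3/40 mortality difference in context, here is a minimal sketch of the raw risks and a two-sided Fisher exact test on those counts, written with the standard library only. This is my own illustrative calculation and not the analysis reported in the paper; the function name is hypothetical, and the trial was not powered for mortality.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in row 1, with margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one
    total = sum(prob(k) for k in range(lowest, highest + 1)
                if prob(k) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Deaths and survivors, from the counts quoted in the text
ph_died, ph_total = 6, 46
st_died, st_total = 3, 40

risk_ph = ph_died / ph_total
risk_st = st_died / st_total
risk_difference = risk_ph - risk_st
p_value = fisher_exact_two_sided(ph_died, ph_total - ph_died,
                                 st_died, st_total - st_died)

print(round(100 * risk_ph, 1), round(100 * risk_st, 1), round(p_value, 2))
```

The raw risks are about 13% and 7.5%, a difference of roughly 5.5 percentage points, which with only 9 deaths in total is well within what chance alone could produce.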

I do not understand the presentation of the cognitive and motor scores; they are identical between the 2 study arms, but they are presented as being approximately 91 in the 2 groups. But, further down in the main results table, they are presented as “using age at BSID-III assessment corrected (CA) for prematurity” and the means are closer to 101. Surely the primary outcome was already the BSID scores corrected for prematurity?

The interventions, therefore, were different to those in our HIP trial (Dempsey EM, et al. Hypotension in Preterm Infants (HIP) randomised trial. Arch Dis Child Fetal Neonatal Ed. 2021;106(4):398-403). In our trial the infants in both groups were also free of signs of poor perfusion, and they all received a fluid bolus; then the intervention group were started on 5 microg/kg/min of dopamine, compared to placebo in the controls. This meant that the BPs were different between groups:

We have recently published our long term outcomes (Marlow N, et al. Outcomes of extremely preterm infants who participated in a randomised trial of dopamine for treatment of hypotension (the HIP trial) at 2 years corrected age. Arch Dis Child Fetal Neonatal Ed. 2025) which showed no major differences, but with the proviso that we were very underpowered, and terminated the trial well before the planned sample size.

There was, if anything, a trend towards a better outcome in the intervention group.

In the new trial, one can really quibble about the interventions used. Although dopamine is most commonly used for this indication around the world, which is why we used it in the HIP trial, it acts as a pure vasoconstrictor in the preterm infant; improved perfusion of any organ has not been shown with dopamine use in the newborn. In particular, dopamine is a cerebral vasoconstrictor, and has been shown in several models to decrease brain blood flow, or brain oxygenation.

Second line therapy with dobutamine is also debatable: dobutamine is a vasodilator, and does not reliably increase BP, so using it to treat hypotension is questionable (although it may be effective in improving perfusion).

There are a few concerns with this article: the delay between completion of follow up (2020) and publication is very long, and there are the specific items that I have mentioned above, but most importantly they didn’t reference the original article which invented the term “permissive hypotension”! (Dempsey EM, et al. Permissive hypotension in the extremely low birthweight infant with signs of good perfusion. Arch Dis Child Fetal Neonatal Ed. 2009;94(4):F241-4.)

The sad fact is that we still don’t know what to do about the baby with reasonable clinical perfusion who has a numerically low BP. The totality of the current data suggests that it is entirely acceptable to just wait and see, especially in the current era of delayed cord clamping, where hypovolaemia is very unlikely.
