“families affected by HIE remain sidelined in advocacy and institutional representation. We are routinely excluded in neonatology priority setting, where patient stakeholders may be represented, but HIE voices are not. The exclusion of HIE families leads to research agendas that don’t reflect our questions, timelines that don’t reflect our realities, and policies that fall short of what our children and families need.”
I think she has a point: many of our current parent partners are families of preterm infants, understandably, as they often spend weeks or months with us. HIE babies usually have shorter stays in the NICU, but the impacts on the families are just as great, and the long term impacts are sometimes greater. We should make extra efforts to ensure their voices are heard. She also points out some of the deficiencies of the longer term follow up of these children:
“some families access follow-up to age two or three, very few have support as their children enter school, face academic challenges, or develop seizures, behavioral challenges, anxiety, or sensory processing issues. Research continues to overemphasize early developmental scores through assessments that are showing to not be predictive of neurocognitive development later in childhood when administered at age 2 or 3, and underemphasize the very issues that families identify as most critical in daily life”.
Any regular reader of this blog will know how much I agree about the limitations of early behavioural screening tests. Longer follow up of these infants is essential, both to get a better picture of the impacts of HIE and to help the families find the resources they need.
She ends with the following :
Max is now thirteen. He’s full of curiosity, humor, and resilience. He plays basketball, loves sushi, and is fiercely proud of how far he’s come. But he’s still living with the effects of HIE. We all are. Our journey didn’t end at discharge. It’s ongoing—and so is the work. Let’s keep going, together.
One of the numerous major advances in neonatology during my career has been the introduction of therapeutic hypothermia for infants with Hypoxic Ischemic Encephalopathy (HIE). Mortality is decreased, by about 25%, and long term morbidity among survivors is also decreased, by about 33%. Those estimates of effect size come from the Cochrane review, which provides the following Forest plot (I’m sorry about the quality of the image, the version in the pdf of the review is much clearer, but it extends over 2 pages, with a page break in the middle).
The Cochrane review also analyzed the impacts of cooling after dividing the infants according to severity of HIE, confirming that moderate and severe HIE both benefit. Unfortunately, the long term outcomes have been reported mostly up to 2 to 3 years. In the Cochrane review, 6 year outcomes are only available for the NICHD trial, which reviewed 120 survivors at 6 years; CP, IQ <70, executive function score <70, and moderate/severe disability were all lower in the hypothermia group than in the controls, but the differences were small (and not “statistically significant”). The Cochrane review dates from 2013, and in 2014 the TOBY trial from the UK published follow up to 6-7 years of age. They showed more babies surviving without disability in the hypothermia group and “Among survivors, children in the hypothermia group, as compared with those in the control group, had significant reductions in the risk of cerebral palsy (21% vs. 36%, P=0.03) and the risk of moderate or severe disability (22% vs. 37%, P=0.03)”. Executive function scores were also higher in the cooled babies, and full scale IQ was 5 points higher (NS).
Scores on motor function scales were progressively worse as the children aged. Behavioural problems became more prevalent. Cognitive scores were overall fairly stable, but there was a progressive decrease in cognitive scores in the subgroup of infants who had damage to the mammillary bodies on MRI.
The changes were all due to deterioration in cognitive scores : “no further CP, epilepsy, or ASD diagnoses were made, cognitive performance declined in 11 children (28.0 %; p = 0.002; Wilcoxon test)”.
These studies point out the importance of much longer follow up of these children. They are an interesting contrast to preterm infants, who, overall, tend to have improved scores on standardized tests over time. The data on behavioural issues from the Dutch study, particularly increased internalizing behaviours, were interesting to me, as I was not really aware of this as a problem after HIE. Behavioural problems are also really important to families, and they may be amenable to interventions to improve them.
These data make me wonder about the 2 issues of mild asphyxia, and the late preterm infant. If the prevalence of adverse outcomes changes so much over time, it may be that our decisions about which babies to cool are being influenced by somewhat unreliable data. There is so little longer term outcome data from the RCTs of cooling, to 5 to 6 years and beyond, that we might be missing a measurable benefit of cooling in such subgroups.
It is vital that trials of cooling for HIE, in groups for whom it is not yet proven to be beneficial, continue to follow the participants at least until early school age, and preferably towards adolescence. Only then will we be able to develop reliable data on the risks and benefits.
An excellent acronym for this trial. Hopefully it will lead to a trend in acronyms based on European culinary specialities. Very preterm infants, n=151, of 23 to 32 weeks GA were randomized to receive delivery room CPAP with a face mask, or with a nasal mask, in a single centre study from Monash in Melbourne. Delayed cord clamping was attempted, without respiratory support; the cord was clamped immediately if the baby needed intervention. If the baby needed positive pressure ventilation, that was delivered by face mask, the same in the two groups. When the babies could be placed on CPAP, they randomly had either a face mask placed, or a nasal mask.
If the nasal CPAP was unsuccessful and the baby needed PPV, they were switched to a face mask. The authors supply some videos of the procedures, including this one of a baby started on nCPAP, then changed to face mask PPV.
The primary outcome of the trial was CPAP success defined as the “proportion of infants managed with CPAP only (ie, without positive pressure ventilation, intubation, chest compressions or adrenaline) between birth and transfer to the NICU. If a newborn received no respiratory support, that was considered success of the treatment group.” The proportions of CPAP success are shown in the following table.
All the usual clinical outcomes were similar between groups. Admission FiO2 was lower in the nasal group.
It looks like the advantage of nasal, compared to face mask, CPAP was because more of the face mask group required PPV, 47/77, compared to 31/74 nasal mask subjects. This is consistent with previous findings that face mask application can cause apnoea. Stimulation of the trigeminal nerve area can provoke respiratory pauses, and bradycardia, and it seems that the nasal mask, creating pressure over a much smaller area around the base of the nose, does not have this effect. The authors note that the starting pressure was intended to be 5 to 8 cmH2O in the 2 groups, but that the clinicians started CPAP at a pressure which was, on average, 1 cmH2O higher in the nasal group. The nasal group also had heated humidified gases, compared to the cold dry gases in the face-mask group. These 2 differences are potential confounding reasons for the difference between the 2 groups. But, because fewer nasal group babies had PPV, the peak inspiratory pressure applied was lower in that group than the face mask group.
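To make the size of that difference concrete, here is a small sketch that turns the two PPV counts quoted above into proportions, a risk difference and a risk ratio. The variable names are mine, and this is only descriptive arithmetic on the published counts, not a reproduction of the authors' analysis.

```python
# Counts of infants who received PPV, as quoted in the text.
face_ppv, face_n = 47, 77     # face mask group
nasal_ppv, nasal_n = 31, 74   # nasal mask group


def proportion(events: int, total: int) -> float:
    """Simple proportion of infants with the event."""
    return events / total


face_risk = proportion(face_ppv, face_n)
nasal_risk = proportion(nasal_ppv, nasal_n)

# Absolute difference (face minus nasal) and relative risk (nasal / face).
risk_difference = face_risk - nasal_risk
risk_ratio = nasal_risk / face_risk

print(f"PPV with face mask:  {face_risk:.1%}")
print(f"PPV with nasal mask: {nasal_risk:.1%}")
print(f"Risk difference:     {risk_difference:.1%}")
print(f"Risk ratio:          {risk_ratio:.2f}")
```

In round figures that is about 61% versus 42%, roughly 19 fewer babies receiving PPV per 100 treated with the nasal mask, keeping in mind the confounders (starting pressure, humidification) mentioned above.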
Despite these limitations, it seems that there may well be significant advantages in applying a nasal mask, compared to a face mask, for CPAP in the delivery room in the very preterm infant. Although the authors did not show any improvement in clinical outcomes (the study was not powered for such outcomes), any intervention which decreases the need for PPV during transition is probably a good thing for lung protection.
To put this in context of previous research, the Cochrane review (Ni Chathasaigh CM, et al. Nasal interfaces for neonatal resuscitation. Cochrane Database Syst Rev. 2023;10(10):CD009102), which did not include data from this trial, showed a reduction in the need for intubation in the DR with nasal interfaces compared to face mask. The 5 trials included in the Cochrane review were fairly heterogeneous: one included term infants; the nasal interface was a short nasal prong in 2 trials, short binasal prongs in 3 trials; and a different device generated the pressure in the 2 groups in 2 of the trials. The Cochrane review showed a reduction in DR intubation; of note, this new trial had very few DR intubations, 6 in the nasal group and 7 in the face mask group. Adding this to the Cochrane MA will have little impact, as the weight will be small and the tiny difference is in the same direction as the current MA results.
The review also showed fewer babies needing chest compressions, but that outcome was entirely dependent on the one trial that included full-term infants.
In the Donaldsson 2021 study the large majority of infants (<28 weeks) in both groups, more than 82%, received PPV. In Kamlin’s study, about the same proportion received non-intubated PPV (just over 50%) but fewer were intubated. McCarthy et al don’t seem to report how many infants needed PPV.
My interpretation of this is that it would be preferable in the very preterm infant to avoid face masks for initial CPAP support in the DR. It appears that the advantages of nasal prong systems and a nasal mask are similar; overall there is a reduction in the need for intubation in the DR, and perhaps for PPV.
This RCT from Colm O’Donnell’s group at the national maternity centre in Dublin enrolled 200 babies <32 weeks. The idea was to determine if there was an advantage to routine immediate CPAP application, using a round Fisher & Paykel face mask, cold dry gases, and a t-piece resuscitator. The comparison, selective, group had face mask CPAP (using the same system) applied if they developed signs of respiratory distress after 5 minutes of age. Infants in both groups had standard resuscitation, with PPV being started if they were apnoeic, or if they had a heart rate <100. The primary outcome was the requirement for PPV. Although the difference in the primary outcome was small and not statistically significant, more babies in the selective group required PPV in the DR, in both GA strata, and nearly half of the selective group received early CPAP (before 5 minutes).
More extensive resuscitation (intubation, chest compressions) was very similar between groups, as were all the clinical outcomes after NICU admission. Although the study did not show any differences that were “statistically significant”, there were no benefits to delaying CPAP.
My interpretation of all this is that the very immature infant would probably most benefit from early CPAP applied with a nasal interface. I like the idea of using a nasal mask to avoid the potential trauma of inserting a nasal prong; prong insertion can usually be done gently, but sometimes, especially in the smallest babies, it is a tight fit and probably hurts. A really useful trial would be to investigate routine early CPAP with a nasal mask, which can also be used for PPV, compared to using a face mask according to current NRP standards.
I subscribe to Google alerts, which sends me an email whenever the phrase “neonatal research” appears on a new website or a new post. I was interested, therefore, to receive an alert about an article which, according to the blog “Bioengineer.org”, showed a major genetic contribution to the occurrence of Necrotising Enterocolitis.
The blog post includes the following quote “Bai et al.’s study represents a landmark in neonatal research by providing compelling evidence for the heritability of necrotizing enterocolitis in very preterm infants. The twin study design elegantly disentangles genetic predisposition from environmental influences and firmly establishes a genetic foundation for this complex disease”.
This was intriguing, so I checked on the original article (Bai R, et al. Genetic susceptibility to necrotizing enterocolitis in very preterm infants: evidence from twin data. Pediatr Res. 2025). A nice study, from a group of authors in China, one of whom is my good friend Shoo Lee, working with the Chinese Neonatal Network. They collected data on NEC incidence and chorionicity of twin pairs of less than 32 weeks GA (or <1500g). They found no difference in the likelihood of a concordant diagnosis of NEC between monochorionic and dichorionic twins. They did further analysis restricting to surgical NEC, or comparing early and late onset NEC, and found no difference between mono- and di-chorionic twins.
In other words, the actual findings of the study are exactly the opposite of what the post on that blog stated. The conclusion of the Bai et al authors was : “heritability does not play a major role in the development of NEC”.
I don’t think an actual human being, reading the article, could possibly have misinterpreted the findings quite as dramatically as whatever generated the blog post. The post is accompanied by the following cute image, which they note was AI generated. My only explanation for this dramatic misinterpretation of the original research article is that the post itself is also AI generated, and that the AI engine just loaded the title and some sub-headings from the results (which are, indeed, misleadingly worded as if there were positive findings : “Heritability contributes to NEC” and “Heritability contributes to certain subgroups of NEC”), without being able to realize that the actual results, in what should have been sub-titled “Heritability contribution to NEC”, show that the contribution was actually zero.
At least this is on an obscure blog, and will probably not cause any harm. In contrast, actual primary publications are also being generated by AI, reporting research that never actually happened. Government policy is also being influenced by review articles written by AI, which include non-existent research, or research which has been misinterpreted, often purposefully so, for partisan ends. This is a major issue for the future of medical research.
Ybarra M, et al. Low-Grade Germinal Matrix Hemorrhage-Intraventricular Hemorrhage and Concomitant Preterm Brain Injuries: Neurodevelopmental Outcomes at 3 Years of Age. J Pediatr. 2025:114713. Previous studies of the long term outcome of infants with germinal matrix or low grade IVH have been inconsistent. Some have shown an association with poorer developmental progress, and others have shown no impact. Some of this variability may be due to uncertainty about diagnostic criteria, with slightly larger amounts of intraventricular blood being classified differently. Some is probably due to the variable association with other brain injury, not readily seen on ultrasound, such as white matter injury, or cerebellar haemorrhages. We now routinely perform imaging of the posterior fossa, which was not easy with older ultrasound machines, but small cerebellar haemorrhages are still hard to see, without MRI.
In this cohort from Toronto, 175 infants <32 weeks GA had ultrasounds; they also had early cerebral MRI at 32 to 34 weeks, if they were stable, and then again at term equivalent age. Neurologic and developmental assessments were performed at 3 years (Bayley version III). As for the results, low grade haemorrhages had no correlation with outcomes, unless associated with either large cerebellar haemorrhages or more extensive white matter injury. It has always been fascinating to me that germinal matrix haemorrhage, which destroys the primary source of cortical neurones, has so little impact on long term outcomes. It speaks to the plasticity of the newborn brain: if the Germinal Matrix is injured, other parts of the brain take over neurone production.
Take home message : there was no apparent impact of GMH or small intraventricular haemorrhages without dilatation on long term development. Cerebellar haemorrhages, if large, are associated with delayed language development at 3 years, and white matter abnormalities, if extensive, are associated with motor delay, and cerebral palsy.
Interesting review article on the impact of pasteurization, using the standard (Holder) pasteurization method, as well as some information about alternatives. The dash (-) in the figure above means no effect, rather than deletion. As you can see, there are multiple impacts of pasteurization in addition to the expected impact on bacteria. Some bacteria are resistant to Holder pasteurization, so donor breast milk still has an impact on the preterm intestinal microbiome, both by direct colonization with the surviving organisms, and because of the impact of HMOs and other components of human milk which remain despite pasteurization.
The figure also shows, in the upper right third, some alternative pasteurization methods which have been investigated, and which all show lesser impacts on breast milk components: HTST (high temperature short time), HPP (high pressure processing) and UV-C (UV-C!). These alternative methods are equally effective at reducing bacterial load in the donor milk, and hopefully can be used in the future to give donor milk which is closer to Mother’s Own Milk.
Take home message : Holder Pasteurization has major impacts on the composition of human milk. Alternative methods should be investigated, and approved.
In this observational study, the authors correlated the diet of a cohort of preterm infants <32 weeks GA with the findings on MRI at term. The cohort was enrolled over a long period, including a couple of years prior to the availability in their centre of donor milk (DHM), 2012-2014, and several years afterward, 2014-2022. They include babies who almost exclusively received Mother’s own milk (MoM) and those receiving mostly formula, as well as the group with DHM. Brain volumes were greater in the human milk groups compared to formula, and diffusion tensor imaging also showed diffusivity differences, in the Corpus Callosum and the PLIC (posterior limb of the internal capsule). As the authors note, there is no good evidence from RCTs that DHM leads to better clinical neurological or developmental outcomes than formula. Nevertheless, these data are consistent with a beneficial effect of human milk on brain development, shared by DHM and MoM.
Take home message : human milk seems to promote larger brains.
One of the benefits of MoM is that it routinely contains probiotic organisms, usually including Bifidobacteria. In this trial, 70 preterm infants <32 weeks were randomized to control or to a supplement of Bifidobacterium animalis subsp. lactis. As often happens in some journals the article is written in somewhat strange English; one example : “Quality control and data analysis were conducted after instrument analysis, using assessment of the peak significante equation of standard curves”. They ran a statistical comparison of the baseline characteristics of the randomized groups. This is a practice that Pediatric Research should know is ridiculous. If the groups were randomized, why run such a statistical test? It is superfluous, potentially misleading, and the CONSORT statement specifically states that it should not be done; Pediatric Research is supposed to follow CONSORT guidelines.
I started to include this article in the post as I thought it was a demonstration of the possible anti-inflammatory impact of this Bifidobacterium on the preterm intestine. But I now realize that I haven’t got a clue what most of it means. The following figure, for example, is supposed to show correlations between a large number of “metabolites”; about 30 were selected from over 250 that were found in the stools, including, for example, 34 different bile acids. These figures are supposed to show correlations, negative and positive, between “metabolites”.
The legend to the figure states “Red indicates positive correlation, blue indicates negative correlation, and the darker the colour, the stronger the correlation. a Probiotic group week 2 VS Control group week 2. b Probiotic group week 2 VS Control group week 2”…. What on earth is this supposed to mean? Aside from the fact that at least 62,500 comparisons were possible, is this comparison within the 2 groups, or between the 2 groups?
What it seems to show is that they measured a huge number of molecules, the concentrations of some of them were correlated with the concentrations of others. But so what?
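To put a number on the multiplicity problem, here is a rough sketch of the arithmetic. It assumes exactly 250 metabolites (the article only says "over 250") and a conventional 5% significance threshold; neither assumption comes from the paper, and I have no idea whether the authors applied any correction.

```python
from math import comb

n_metabolites = 250   # assumption: the article says "over 250"
alpha = 0.05          # assumption: conventional nominal threshold

# Every cell of a 250 x 250 correlation matrix, including the diagonal
# and both orderings of each pair.
matrix_cells = n_metabolites ** 2

# Distinct pairs of different metabolites (each pair counted once).
unique_pairs = comb(n_metabolites, 2)

# If there were no true correlations at all, this many pairs would
# still be expected to cross the nominal threshold by chance.
expected_chance_hits = unique_pairs * alpha

# Per-test threshold that a Bonferroni correction would require.
bonferroni_alpha = alpha / unique_pairs

print(f"Matrix cells:          {matrix_cells:,}")
print(f"Distinct pairs:        {unique_pairs:,}")
print(f"Expected chance hits:  {expected_chance_hits:,.0f}")
print(f"Bonferroni threshold:  {bonferroni_alpha:.2e}")
```

Even counting each pair only once, there are more than 31,000 possible correlations, so well over a thousand "significant" ones would be expected from noise alone under these assumptions.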
They also show dramatic differences in serum TLR4 concentrations between groups. I am unsure if circulating concentrations of TLR4 are of any interest; TLR4 is normally attached to granulocytes, as part of the receptor complex which recognizes lipopolysaccharide. Published serum concentrations range from the pg/mL range to the mg/mL range, with these new results being intermediate, in the ng/mL range. Such enormously variable normal ranges (over 1 million fold differences) make me very sceptical about any results. Serum TNF-α and IL-1β were also dramatically lower in the Bifido group. They also give exactly the same data, regarding clinical complications, in table 3 and a completely superfluous figure 10.
Pediatric Research used to be a journal that only published high-quality research, although they were rarely clinical studies, which was previously one of my criticisms of the journal. If this is typical of the quality of what is currently getting through peer review and editorial control, then the journal has fallen far indeed.
Take home message : Pediatric Research is no longer the high-quality source it once was.
This is a report of the 3 year outcomes of babies from a cluster RCT of a quality improvement initiative in Japan. The original publication showed no impact of the QI program (INTACT), so the authors combined the groups for this publication describing their neurological and developmental outcomes. Babies were VLBW and ranged from 22 to 31 weeks GA. Below is a selection from the extensive results; “severely delayed” refers to being <70 on the cognitive subscale of the Kyoto Scales of Psychological Development. The KSPD seems to have a similar mean to the BSID ver3, when tested on the same infants, but has a wider distribution, so a score <70 was considered severely delayed.
There was very little blindness or deafness, so, as usual, it was cognitive delay which was responsible for most of the infants who were classified as “NDI”. Unfortunately, the authors don’t report many things which matter to parents, in particular there is no mention of behavioural problems. They do have a table that they call “functional outcomes” but that is actually a report of the medical interventions being received by the infants, at 3 years of age, including home oxygen, NG tube feeding, anticonvulsant medications, etc. All of which were rare.
Take home message : The majority of survivors at every gestational age, even the most immature, do not have “moderate or severe NDI”. There is a progressive increase in “moderate or severe NDI” as GA decreases.
This is a very interesting trial evaluating the usefulness of clinical assessment of the circulation in adults with septic shock in a large international multicentre trial. Patients with suspected sepsis, who required norepinephrine after 1 litre fluid bolus, and had an elevated serum lactate, were randomized. A standardized method of measuring capillary refill time was agreed upon,
CRT was assessed by applying firm pressure to the ventral surface of the distal phalanx of a finger, using a glass microscope slide. The pressure was increased until the skin was blank, maintained for 10 seconds, and then released. The time required to return to the normal skin color was measured with a chronometer and a refill time longer than 3 seconds was defined as abnormal
and the algorithms were activated if the cap filling time was abnormal in the CRT-PHR (cap refill time- personalized haemodynamic resuscitation) group.
As you can see, if the CRT was >3 seconds, you first check the pulse pressure, and if it is >40 mmHg, then you check the diastolic BP, which may lead to increasing norepinephrine dose; the next stage may be to give more fluid to see if there is a response, and then progress to bedside echocardiography, which may lead to specific treatments, or more fluid, or eventually to low dose dobutamine.
The control group had “standard care”, CRT was recorded but the algorithm was not followed.
The primary outcome was a hierarchical composite: (1) all-cause mortality within 28 days, (2) duration of vital support (vasoactives, mechanical ventilation, and kidney replacement therapy) truncated at day 28, and (3) length of hospital stay truncated at day 28.
The trial was analyzed by the Win Ratio. 1400 patients were randomized. As it was not a paired study (one way of using the Win Ratio), but individually randomized, they stratified the patients by APACHE score; then, within strata, every patient in group 1 was compared with every patient in group 2, to determine if they won or lost. There were therefore 244 000 paired comparisons. The CRT-PHR group won 49% of the comparisons, compared to 42% for the control, usual care group. The remaining 9% were exact ties.
This exceeded the limits for statistical significance; mortality was identical at 26.5%, but there were more ICU free days, and shorter hospital stays in the CRT group. The table of interventions shows that more of the CRT group received vasopressin, more received dobutamine, and they received less fluid; at 6 hours of treatment, their CRT was shorter, and serum lactate was lower.
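For readers who have not met the Win Ratio before, here is a minimal sketch of how an unpaired, hierarchical comparison of this kind can be tallied. The `Patient` class, the function names and the four toy patients are all hypothetical, invented for illustration. The real trial compared patients within APACHE strata and will have handled timing of death and censoring in ways that this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patient:
    died_by_day_28: bool
    support_days: int    # duration of vital support, truncated at 28
    hospital_days: int   # length of hospital stay, truncated at 28


def compare(a: Patient, b: Patient) -> int:
    """Return +1 if a beats b, -1 if b beats a, 0 for a tie.

    Walks down the hierarchy and stops at the first level that differs.
    """
    if a.died_by_day_28 != b.died_by_day_28:
        return -1 if a.died_by_day_28 else 1
    if a.support_days != b.support_days:
        return 1 if a.support_days < b.support_days else -1
    if a.hospital_days != b.hospital_days:
        return 1 if a.hospital_days < b.hospital_days else -1
    return 0


def tally(treated: List[Patient], control: List[Patient]) -> Tuple[int, int, int]:
    """Compare every treated patient with every control patient."""
    wins = losses = ties = 0
    for t in treated:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    return wins, losses, ties


# Toy data, invented purely to exercise the functions.
toy_treated = [Patient(False, 3, 10), Patient(True, 28, 28)]
toy_control = [Patient(False, 5, 12), Patient(True, 28, 28)]
toy_wins, toy_losses, toy_ties = tally(toy_treated, toy_control)
toy_win_ratio = toy_wins / toy_losses

# Win ratio implied by the percentages quoted in the text.
reported_win_ratio = 0.49 / 0.42

# Why roughly 244 000 pairs? Assuming (my assumption) two equal arms of
# 700 split into two equally sized strata of 350 per arm:
unstratified_pairs = 700 * 700
two_equal_strata_pairs = 2 * 350 * 350

print(toy_wins, toy_losses, toy_ties, toy_win_ratio)
print(round(reported_win_ratio, 2), unstratified_pairs, two_equal_strata_pairs)
```

The win ratio is simply wins divided by losses, so 49% against 42% gives a value of about 1.17. The pair count also makes sense: comparing everyone with everyone would give 490 000 pairs, whereas two roughly equal strata would give about 245 000, close to the number reported.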
The analysis is illustrated below in the 2 strata of the APACHE Score (a higher score predicting higher mortality); this showed a greater difference in the sicker patients.
I found this fascinating, in terms of the intervention being investigated, the trial design, and the analysis methodology.
Many of my readers will know of my concerns about the way we analyse composite outcomes in neonatology. Comparing “death or BPD”, “death or NDI”, “death or hiccups”, between randomized groups, as if they were of equal importance, and as if we were always sure that they would change in the same direction with an intervention. This trial is one of a growing trend towards using hierarchical composites, with death being given the greatest weight in the analysis, followed by other clinical outcomes in descending order of importance. Clearly an example to be followed in neonatology.
Combining such clinical signs with the direction of change in serum lactate (the absolute value doesn’t help much in the first couple of days as it is often high after birth), urine output (also not much use immediately after birth), level of activity etc, seems to me to be likely to be important in determining treatment in septic babies also. But we have very few good randomized trials of treatment approaches in septic newborns.
This trial gives us some pointers of how we could reasonably design such a trial, with a structured algorithm of interventions, including clinical pointers and targeted functional echocardiography in some patients, and how to design and analyse the primary outcome. We could develop a consensus algorithm (it couldn’t really be evidence-based) and test against usual care, with a hierarchical composite outcome including death and brain injury and duration of intensive care support, for example.
Unfortunately all of the large confirmatory studies have been completely null, without a hint of a benefit. Including LIFT-Canada, which is in submission so I won’t go into any details, but I can say that we did not show a benefit of bLF.
There continue to be some trials which do seem to show an effect of bLF, including this very new trial (Plaza-Astasio V, et al. Preventing Sepsis in Preterm Infants with Bovine Lactoferrin: A Randomized Trial Exploring Immune and Antioxidant Effects. Nutrients. 2025;17(19)). Just over 100 VLBW infants were randomized to bLF supplementation or control, prior to 72 hours of age, and followed for LOS, as well as lab tests of antioxidant and immunologic effects. LOS was defined as “Laboratory confirmed sepsis” after 72 hours. The authors followed the NeoKisses definitions, which, as far as I can tell, include so-called “clinical sepsis” without a positive blood culture, but in the supplementary materials of this new study there are the same number of organisms listed as the episodes of sepsis, that is 11 in the bLF group and 21 in the placebo group. In other words they showed a reduction in culture-positive sepsis.
The authors note that their breast feeding rates were lower than some of the other large trials, at around 75% compared to over 90% in the large trials, and suggest this as a possible explanation for the difference of their results compared to the larger RCTs. That seems to me doubtful: if bLF was only effective in formula fed babies, then they could not have shown such a large decrease. Ochoa and her collaborators have published an IPD meta-analysis of the VLBW infants enrolled in their 2 trials (see below) which suggested that the impact of bLF was much greater among babies with low human milk intake (11% bLF, 21% controls). Although they do indeed show that, what is strange is that their analysis shows that LOS was much more frequent in babies with a high human milk intake, either with bLF (35%) or in their controls (39%), which is hard to understand. Another secondary analysis, of the data from ELFIN and the original Manzoni trial, showed similar reductions in LOS by bLF among breast-milk fed and formula fed, or mixed feeds babies. The reductions in LOS by bLF were very small and consistent with random variation in ELFIN. The interaction term was not significant, suggesting that the reduction in LOS was similar regardless of feed type.
The authors of the new study also note that their control frequency of sepsis was high, which is again true; a 40% incidence of LOS in a group of infants with a mean GA of 30 wks is extremely high. Having a higher baseline frequency of an abnormality will generally tend to make the impact of an intervention seem greater (see my recent posts on regression to the mean), but that doesn’t mean that such an impact would disappear completely when the incidence is lower.
One other difference that they do not mention is the source of bLF; the newly published trial used DicoPharm, just as did Manzoni. Akin’s study used the same product and also showed a reduction in culture-positive sepsis. Theresa Ochoa in her 2 studies used a product from Tatua ™ in the first study, derived from pasteurized milk, which had no effect on culture-positive sepsis, and a product from Friesland Campina in the other trial, which seems to be extracted by freeze-drying and not heat treated. The second trial showed a decrease in culture positive sepsis (from 11 to 8%, NS) not shared by the first study. Other studies either don’t mention the source of the bLF (Kaur et al) or I cannot obtain them as they aren’t in PubMed, or any other database that I can access (Liu, Tang, Dai). Another new study, from Egypt, randomized only formula fed infants (Ellakkany N, et al. Influence of bovine lactoferrin on feeding intolerance and intestinal permeability in preterm infants: a randomized controlled trial. Eur J Pediatr. 2024;184(1):30). They had an enormously high rate of LOS in the controls (60%) and a lower, but still extremely high, incidence in the bLF treated infants, 43.3%. The preparation they used was produced in Egypt, and I can’t find any details of how it was prepared. Finally I found one other trial, performed in Pakistan in infants with an average GA of about 34 weeks (Ariff S, et al. Evaluation of Bovine Lactoferrin for Prevention of Late-Onset Sepsis in Low-Birth-Weight Infants: A Double-Blind Randomized Controlled Trial. Nutrients. 2025;17(11)), with a product from Hilmar in the USA which appears to have been prepared from freeze-dried milk (and perhaps not heat treated). They had an 8% incidence of culture-positive LOS in controls, and a combined 6% in the 2 treatment groups (with 2 different doses of bLF); total n of about 300.
There are lab studies showing that pasteurization decreases the biologic activity of bLF. bLF is degraded by heat treatment, it aggregates, and it binds iron less well (Remadevi R, Mead D. A Study on the Bioavailability of Lactoferrin under Pasteurisation at Different Conductivities and Solid Contents. Journal of Food Research. 2025;14(2)). It could well be that heat-treatment of milk, prior to extraction of bLF, causes sufficient structural changes in the molecule for it to no longer have the multiple beneficial effects on bacterial proliferation that have been documented. This might be one reason why donor human milk (which is always pasteurized, usually by Holder pasteurization, the only method approved by HMBANA) is less effective at decreasing NEC than mother’s own milk.
There are, however, known to be major differences in the biologic activity of different sources of bLF. One study examined 10 different bLF sources, and compared several different aspects of structure and activity between them, as well as their own bLF and human LF (Lonnerdal B, et al. Biological activities of commercial bovine lactoferrin sources. Biochem Cell Biol. 2021;99(1):35–46). There were major differences between bLF sources. As one example, they examined uptake of the LF by Caco-2 cells, and whether the LF transported iron into those cells.
The details of what that means are not that important here (partly because my own understanding is limited, but also because it isn’t certain what this particular aspect has to do with their biologic effect of decreasing infections), but what this does show is that different sources of bLF are extremely different. They also found very variable degrees of contamination of the bLF product with other proteins. The Hilmar product, as one example, “contains a relatively low concentration of Lf and relatively high concentrations of a-S1-casein, a-S2-casein, and J domain-containing protein”, whereas the Dicopharm product had lots of LF and relatively less of the other proteins.
I think, before we give up completely on bLF supplementation as a potential way to decrease LOS in the preterm, there is room for another study, investigating specifically the Dicopharm product, which has been consistently associated with decreases in culture-positive LOS. It may be that the story of bLF to prevent LOS still has a twist in the tale.
After my post on regression to the mean, and its importance in studies of apnoea therapy, I was thinking of other examples. Some which have been most evident to me are those which impact on areas of medicine that I have researched myself. One example, from many years ago now, looked at the haemodynamic effects of dopamine in sick preterm infants. Seri I, et al. Regional hemodynamic effects of dopamine in the sick preterm neonate. J Pediatr. 1998;133(6):728–34.
This study was performed during the 1st 2 days of life, a period when blood pressure normally gradually increases, and when renal vascular resistance falls dramatically. These known baseline changes are an additional confounder in the results of non-controlled studies. The subjects were preterm infants with what they termed “compensated shock”, that is they had a BP between the 10th and 90th percentiles, but were oliguric (<0.6 ml/kg/h of urine) and/or had slow capillary filling. They were all given dopamine, with echographic indices performed before and after.
What you can see is that overall mean BP increased, after doses of dopamine between 2.5 and 7.5 microg/kg/min.
And an index of renal vascular resistance, the pulsatility index in the renal artery, decreased.
These are actually changes that you would expect over time in the first hours of life. The time difference between the 2 measurements was relatively short, at about 30 minutes, so one could argue, perhaps, that the changes are too quick to just be postnatal adjustment. Maybe they were caused by the dopamine?
Interestingly, the authors also presented results after, post hoc, dividing the infants into responders (who had a >10% increase in mean BP) and non-responders.
This shows that the “responders”, panel A, had a lower mean BP before dopamine treatment, of about 35, and it increased to about 43 afterwards. The “non-responders”, panel D, had a mean BP, before and after dopamine, of just over 40 mmHg.
This is exactly what you would see if the results are entirely due to regression to the mean. Those with lower BP than average will tend to have an increase after any treatment, including placebo. It would be surprising, in an observational study such as this, for them to have given dopamine to babies with a higher BP than average.
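To make the point concrete, here is a minimal simulation sketch in Python. It is not a re-analysis of the study: the BP distribution (an underlying mean of 40 mmHg with an SD of 3) and the measurement-to-measurement fluctuation (SD of 4) are assumptions chosen only for illustration. Every simulated infant gets two BP readings drawn from the same distribution, so there is no treatment effect at all, and the infants are then split post hoc using the same ">10% increase" definition of a responder.

```python
import random
import statistics


def simulate_responder_split(n_infants=2000, seed=1):
    """Two BP readings per infant, with NO treatment effect between them."""
    rng = random.Random(seed)
    responders, non_responders = [], []
    for _ in range(n_infants):
        usual_bp = rng.gauss(40, 3)          # assumed underlying mean BP (mmHg)
        before = usual_bp + rng.gauss(0, 4)  # assumed short-term fluctuation
        after = usual_bp + rng.gauss(0, 4)   # same distribution: nothing changed
        # Post hoc definition taken from the text: >10% increase in mean BP
        if after > 1.10 * before:
            responders.append((before, after))
        else:
            non_responders.append((before, after))
    return responders, non_responders


def mean_before_after(pairs):
    """Mean of the first and of the second reading in a group."""
    return (statistics.mean(p[0] for p in pairs),
            statistics.mean(p[1] for p in pairs))


responders, non_responders = simulate_responder_split()
resp_before, resp_after = mean_before_after(responders)
non_before, non_after = mean_before_after(non_responders)

print(f"'responders'     n={len(responders)}  before={resp_before:.1f}  after={resp_after:.1f}")
print(f"'non-responders' n={len(non_responders)}  before={non_before:.1f}  after={non_after:.1f}")
```

Under these assumptions the "responders" start with a lower mean BP than the "non-responders", simply because a low first reading is more likely to be followed by a reading that is more than 10% higher.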
Having said that, dopamine will in some circumstances, I think it is clear, increase BP, probably not by much at a dose of 2.5, but there is enormous variability in dopamine kinetics (and pharmacodynamics); some infants might have an increase in BP at low doses, and some have no effect at very high doses. Dopamine is, however, an effective vasoconstrictor, and any increase in BP is entirely due to vasoconstriction in the newborn. In this study, however, both “responders” and “non-responders” had a decrease in renal vascular resistance. Why would this be? As I mentioned above, renal vascular resistance is known to decrease dramatically after birth; this study, for example, shows an 88% decrease in RVR over the first 2 weeks of life, most of which is in the first 2 to 3 days.
I also subjected the animals to an infusion of Fenoldopam, a selective agonist of vascular dopamine receptors, which showed absolutely no renal vasodilatation.
These examples demonstrate, yet again, that one has to be very sceptical about the results of observational studies of the responses to an intervention. Whenever we treat a baby who has a problem which varies in intensity, be it apnoea, low blood pressure, oxygen requirements, oliguria, or anything else that you can think of, unless you randomize and treat only half of the infants, one can never know if any changes which are seen are due to the intervention, or just regression to the mean. Babies with BP lower than average will always tend to have higher BP the next time you measure it. Babies with low urine output will always tend to have higher urine output after an interval.
Controls, controls, controls. Preferably randomized controls. They are essential for determining the impacts, efficacy and safety of our interventions.
I mentioned in my previous post, an issue with meta-analyses; there have been several I have read recently which are very problematic. They seem to be produced by groups that have little concern for the quality of their product.
This article is free access, which means that someone paid about $4600 US to put this misleading nonsense online. Springer journals commonly publish poor quality articles under their pay-to-publish model, and really, if any peer reviewer worth his salt had read this, it is immediately evident that there are huge issues. Just to take one other minor example, they state that the ETTNO trial did not describe the means of randomization. I guess the SR authors just didn’t read the methods, which actually describe them in more than the usual detail: “The random sequence was computer generated with variable block size (2-10) using the software RandList version 2.1 (DatInf)”.
They also have weighted some outcomes in a way that the small, unobtainable Chinese trials (n of between 70 and 180) have much more weight in the analysis of duration of oxygen (for example) than the large ETTNO trial (n>1000). This is presumably because of the minuscule SD of the data from those trials; for example, Wang 2013 apparently showed a duration of oxygen in the liberal transfusion group of 14 days (SD 2) compared to 18 days (SD 3). This study of 86 babies has a 40% weight in the analysis as a result, compared to ETTNO, given an 11% weight, probably because the SD of the duration of oxygen therapy is realistic, 50 days (SD 33). My guess is that the supposed SD of duration of O2 therapy in Wang, and the other trials with extremely narrow distributions, is actually an SEM, but as the articles are inaccessible there is no way to check that.
To explain further, continuous outcomes in meta-analyses are usually weighted by the inverse of the variance. This is done so that articles with more precision in their estimates (usually the larger trials) have more impact on the calculated overall mean effect. When the variance (however it is reported) is very small, then the article might have an outsized impact on the MA, which is why it is so important to be sure that the data are reliable, and that the reported variability in the data is really a SD, and not a SEM.
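The arithmetic can be sketched in a few lines of Python. This is an illustration, not a reconstruction of the published analysis: it uses a simple fixed-effect inverse-variance weighting with only two trials, and the arm sizes (43 per arm for the 86-baby trial, 500 per arm for ETTNO) and the use of an SD of 33 in both ETTNO arms are assumptions. The published weights of 40% and 11% come from an analysis with more trials, and possibly a random-effects model, so the numbers here will not match them.

```python
import math


def diff_variance_from_sd(sd_a, n_a, sd_b, n_b):
    """Variance of a difference in means when the reported spread is an SD."""
    return sd_a ** 2 / n_a + sd_b ** 2 / n_b


def diff_variance_from_sem(sem_a, sem_b):
    """Variance of a difference in means when the reported spread is an SEM."""
    return sem_a ** 2 + sem_b ** 2


def relative_weights(variances):
    """Fixed-effect inverse-variance weights, scaled to sum to 1."""
    inverse = {name: 1.0 / v for name, v in variances.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


# Assumed arm sizes (not taken from the trial reports)
small_n, large_n = 43, 500

ettno_variance = diff_variance_from_sd(33, large_n, 33, large_n)

# Scenario 1: take the "SD 2" and "SD 3" of the small trial at face value
as_reported = relative_weights({
    "small trial": diff_variance_from_sd(2, small_n, 3, small_n),
    "ETTNO": ettno_variance,
})

# Scenario 2: the same numbers are actually SEMs
if_sem = relative_weights({
    "small trial": diff_variance_from_sem(2, 3),
    "ETTNO": ettno_variance,
})

# SD implied by an SEM of 2 with 43 babies per arm: SD = SEM * sqrt(n)
implied_sd = 2 * math.sqrt(small_n)

print("weights if reported values are SDs: ", as_reported)
print("weights if reported values are SEMs:", if_sem)
print(f"SD implied by an SEM of 2: {implied_sd:.1f} days")
```

With these assumed numbers, treating the reported values as SDs lets the small trial dominate the pooled estimate, whereas treating them as SEMs hands most of the weight back to the large trial.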
If the analyses were redone, giving appropriate weight to the larger trials, then there would be no impact of transfusion threshold on respiratory outcomes.
This matters. Individual carers could give transfusions to preterm infants with the expectation that they will shorten the duration of oxygen therapy, or positive pressure respiratory support, based on this erroneous SR/meta-analysis.
Recently, when I do a lit search, I often find more Systematic Reviews and Meta-Analyses than there are original trials. I think there are academics who think it’s so much easier to just recycle the results of someone else’s research than to perform a trial themselves.
Reputable journals should be very careful about publishing SR/MA. They should ensure that the SR was registered, and follows PRISMA guidelines and ensure that they are not just re-performing reviews that have already been well done. They should require that the authors provide pdf copies of the original trial publications with the submission, so that peer reviewers can verify the accuracy of what is being presented. Peer reviewers should ensure that the articles included really exist, that they are trials of the intervention being evaluated, and that the results are accurately analysed.
A related issue is the question of whether the original data are reliable or not. I have read, and reviewed, articles which seem to have been written by AI, and which are probably entirely fictitious. Others have probably skewed their results to be more positive, or have reported different outcomes to those planned when they found something interesting post hoc. A new tool has been developed to try and counter these issues, called INSPECT-SR, which is available as a preprint (Wilkinson J, et al. INSPECT-SR: a tool for assessing trustworthiness of randomised controlled trials. medRxiv. 2025:2025.09.03.25334905). The tool gives multiple checks to perform when writing an SR, as an attempt to eliminate data which are not reliable. It is very sad that the integrity of published trials has to be questioned, but it is a reality of our current state of affairs.
Determining the integrity of a Systematic review is even more difficult, as one often does not have access to the original trials, to see if they have been accurately interpreted. The 2 SRs that I have recently criticized, one about Caffeine in the newborn, and this one about transfusions, are both addressing issues for which I was an author of one of the major included trials. My involvement made it immediately obvious to me that there were serious errors in interpretation, and that the SR/MA was very flawed. Systematic reviews of other issues, that I have had less direct involvement with, may have been just as flawed, but it could have escaped my notice.
There is much pressure in some academic circles to “publish or perish”, and to get something, anything, into print. In some countries medical students are expected to publish an article prior to being awarded their MD degree. In others, junior academics cannot advance unless the combined weight of their output, when printed, exceeds a certain number of kg (or at least it seems that way). Journals now have a major interest in publishing anything that is submitted along with a cheque. Springer seem to be particularly egregious among the older established publishers, but some newer groups, like the Frontiers journals and MDPI, have an extremely uneven profile, some of their titles being clearly predatory pay-to-publish journals, and others having higher standards.
It is incumbent on us, in the present day, to be sceptical of everything that we read, primary research and SRs. Pre-registration of trials and SRs, and data sharing are essential to ensure the integrity of the research on which we base our clinical decisions for critically sick babies.
There has for a long time been a thought that anemic babies with many apnoeas could benefit from a blood transfusion which would decrease their apnoeic spells. This idea has never been directly tested by an RCT. That is, a trial in which infants with apnoea were randomized to receive a transfusion or control, and the response accurately determined. I actually started such a trial when I was in San Diego, but only enrolled a tiny number of babies before leaving to return to Canada; the fellow who was involved finished at about the same time as me, and the project was sadly terminated. I admit that it is not ethical to randomize infants to a trial, with all the stress imposed, and the goodwill of parents involved, and then not have a mechanism to complete the trial, and I apologize to the parents and families involved.
The evidence we do have, therefore, comes from observational studies of various kinds and from secondary analysis of RCTs of blood transfusions at differing thresholds, in which the impacts on apnoea or on intermittent hypoxia (IH) have been recorded. Just to remind my readers, most IH is caused by apnoea, and prolonged recordings of saturation are much easier than the prolonged multichannel recordings required to objectively quantify apnoea. I will use the terms interchangeably here, partly because the harms of multiple recurrent apnoea spells are probably due to frequent desaturation leading to hypoxic injury, and resaturation, with consequent oxidative injury.
Interestingly, the 2 types of evidence give contradictory results. Observational studies tend to show a reduction in apnoea, or IH, after a transfusion, whereas RCTs don’t show a difference in apnoea, or IH, by randomized group; those with a haemoglobin maintained at a higher level do not have less apnoea/IH than those allowed to have lower Hgb.
This is an object lesson in the hazards of observational studies, especially for a condition which is very variable in severity (between patients and between days) and which eventually improves.
I actually use this example when I teach statistics and research design to fellows!
The figure below shows two columns of randomly distributed numbers which I generated, each connected by row number, with a mean of 0 and a SD of 1. If we take “0” to mean the overall average frequency of IH per hour in the sample (it could be 4, for example), and the numbers on the vertical axis to be the number of standard deviations above and below the mean, this figure could be the number of IH per hour on day 7 and on day 14 of life of 200 preterm babies.
Suppose one decides to give a treatment only to those babies who have more IH than the mean, which means selecting the babies in the top half of the distribution, and then measures IH frequency again after the treatment. Even if the treatment had no effect whatsoever on IH frequency, the result you would get is shown below.
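The same thought experiment can be sketched in Python, using only the standard library. The seed and variable names are arbitrary, and these are freshly generated random numbers, not the ones behind the figures: two independent columns of 200 standard normal values stand in for IH frequency (in SD units) on day 7 and day 14.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary seed so the run is reproducible
n_babies = 200

# Two independent columns, mean 0 and SD 1: day 14 does not depend on day 7,
# so any "treatment" given in between has, by construction, no effect.
day7 = [rng.gauss(0, 1) for _ in range(n_babies)]
day14 = [rng.gauss(0, 1) for _ in range(n_babies)]

# "Treat" only the babies who are above the sample mean on day 7
cutoff = statistics.mean(day7)
treated = [i for i in range(n_babies) if day7[i] > cutoff]

before = statistics.mean(day7[i] for i in treated)
after = statistics.mean(day14[i] for i in treated)

print(f"treated babies: {len(treated)}")
print(f"mean before treatment: {before:+.2f} SD")
print(f"mean after treatment:  {after:+.2f} SD")
```

The treated group starts well above the mean and ends close to it, even though nothing was done that could have changed the second measurement.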
This is regression to the mean. There are very many examples of this, as a potential explanation of positive results in observational studies. Let me give you one example, of an uncontrolled study of an apnoea treatment. Marlier L, et al. Olfactory Stimulation Prevents Apnea in Premature Newborns. Pediatrics. 2005;115(1):83–8. In this study, babies having recurrent apnoea were exposed to a nice smell, lavender wafted through their incubator, and they evaluated apnoea frequency afterwards. There were fewer apnoeas when the babies had a pleasant odour in the incubator! Without a randomized control group such data are worthless. Any variable condition will tend to get better if you start treating it when it is worse than average, whether you use something effective or give a placebo.
If you tend to give transfusions to anaemic babies who are having more apnoeas, then you are immediately creating exactly this situation. If transfusions have no impact on apnoea, then an observational study will show a significant reduction in apnoea frequency following transfusion.
Also striking is what happens if you ask the question, “do babies who have the most apnoeas have the greatest benefit from transfusion?”, and then plot the initial apnoea frequency against the improvement after the treatment. Using the randomly generated numbers in the figure above, this gives a correlation coefficient of 0.56 and a p-value of <0.0001.
Remember, these are from entirely random numbers, after an intervention with no real impact whatsoever on apnoea frequency! All you have to do is select the most severely affected babies to treat, and they will seem to improve the most.
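A short Python sketch of that correlation, again with freshly generated numbers rather than the ones used for the figure, so the exact coefficient will differ from 0.56. The `pearson_r` helper is written out by hand to keep the example dependency-free. For independent standard normal values with selection of the above-average half, the expected correlation between baseline and "improvement" works out to roughly 0.5.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(11)  # arbitrary seed
# (baseline, follow-up) pairs of independent random numbers, mean 0 and SD 1
pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]

# "Treat" only the babies whose baseline is above average
selected = [(b, f) for b, f in pairs if b > 0]
baseline = [b for b, _ in selected]
improvement = [b - f for b, f in selected]  # fall in frequency after sham treatment

r = pearson_r(baseline, improvement)
print(f"n selected = {len(selected)}, r(baseline, improvement) = {r:.2f}")
```

The correlation arises because the baseline value appears on both sides: it is one axis of the plot and also part of the "improvement" on the other axis.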
IH decreased from 5.3/h to 3.6/h after blood transfusion, and was unchanged at 4.6/h before and after the small number of transfusions of other blood products.
This is exactly the result you would expect if caregivers were more likely to transfuse anaemic preterm infants when they were having a greater than average number of IH, but give plasma, platelet, or other transfusions for reasons that have nothing to do with apnoea; even if there were absolutely no effect of transfusions on apnoea incidence or IH.
I am not picking this study out as a particularly egregious example, in fact, it is a better study than most, as it at least had the non-RBC controls. You would also see something similar if there was a real impact of RBC transfusions on IH.
If we look at the RCTs of blood transfusions in the preterm, which have compared different transfusion thresholds, there is no apparent impact on IH or on apnoea. This includes the most recent publication, which is a secondary analysis of an RCT (Franz AR, et al. Effects of liberal versus restrictive transfusion strategies on intermittent hypoxaemia in extremely low birthweight infants: secondary analyses of the ETTNO randomised controlled trial. Arch Dis Child Fetal Neonatal Ed. 2025). The ETTNO trial was a randomized comparison of differing transfusion thresholds in infants <1000g birthweight that I have already discussed, which showed no impact on the primary outcome of “survival without NDI”. There was no impact on survival or on developmental progress, as measured by the Bayley version 2 MDI, results of which were identical between groups. The transfusion thresholds were a bit complicated in ETTNO: the high threshold group had 3 different thresholds according to postnatal age in stable babies, and 3 higher thresholds in “critically ill” babies. The low transfusion group had a different matrix of 6 transfusion thresholds according to postnatal age and being stable or not. Of note, one of the indications for being considered “critically ill” was 6 or more apnoeas/day requiring nursing intervention, or IH to <60% saturation >4 times per day.
This new secondary analysis compared IH frequency and severity according to randomized group. About 50% of the babies had good enough recordings of saturation for analysis, a subgroup who seemed representative of the whole sample.
There were no differences in any index of IH between groups.
In the PINT study, one of our secondary outcomes was how many babies in the low vs high transfusion threshold groups had “apnea requiring treatment”, which was 55% in the lower Hgb group and 60% in the higher Hgb group. In other words, the small difference was in the opposite direction and not in favour of an impact of Hgb on apnoea frequency.
I don’t think there are any secondary outcome data on apnoea or IH from the TOP trial, if anyone knows of any, please let me know. A much older trial (1984) of only 56 preterm infants reported apnoea frequency among infants randomized to either have their Hgb kept above 100 g/l, or to be transfused only for clinical indications, including surgery, but also including severe apnoea not responding to theophylline as a clinical indication. There were no differences in recorded apnoea frequency despite differing Hgb concentrations. The only controlled data which show a possible impact on apnoea are from the Iowa trial, of 100 preterm infants <1300 g (Bell EF, et al. Randomized trial of liberal versus restrictive guidelines for red blood cell transfusion in preterm infants. Pediatrics. 2005;115(6):1685–91) which had more apnoea spells in the lower threshold group, and more apnoea requiring nursing intervention, 0.4/day compared to 0.2/day. In that trial, infants were allowed to receive a transfusion, even if they were in the low threshold group, if they had multiple apnoeas, which is a possible confounder in analysing the meaning of that result.
I looked for systematic reviews of transfusion in the preterm to see if any had analysed the impact on apnoea, and was unable to find any other reliable data, but read my next post to see what I did find.
My take home message is that there are few reliable data to show that apnoea or IH is more frequent in infants with lower Hgb, nor any reliable evidence that RBC transfusion reduces apnoea or IH in the preterm.
If you transfuse babies who are having more apnoea, or more IH, than average they will usually have a reduction in their episodes.
But :
If you don’t transfuse babies who are having more apnoea, or more IH, than average they will usually have a reduction in their episodes.
The only way to resolve the issue would be to do a trial similar to the one that I started years ago. Enrol anaemic infants with apnoea or IH, randomize them to transfusion or control and obtain objective recordings of their responses. I have a strong feeling, based on my evaluation of the currently available data, that both groups will show a reduction in apnoea/IH, and that there would be little or no difference between the two groups.
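What such a trial would show if transfusion truly had no effect can be sketched with a small Python simulation. All the numbers are assumptions for illustration (an underlying IH frequency of about 4 per hour, the size of the fluctuation, and a sample far larger than any real trial), and the "transfusion" is deliberately modelled as doing nothing. A real effect, if one exists, would show up as a difference between the two arms.

```python
import random
import statistics

rng = random.Random(3)  # arbitrary seed

# Each baby has an underlying IH frequency plus short-term fluctuation.
# Values are illustrative only; a Gaussian can even go negative.
babies = []
for _ in range(4000):
    usual = rng.gauss(4.0, 1.0)         # assumed underlying IH per hour
    before = usual + rng.gauss(0, 1.5)  # recording at enrolment
    after = usual + rng.gauss(0, 1.5)   # later recording, no treatment effect
    babies.append((before, after))

# Enrol only babies with more IH than average, then randomize them
overall_mean = statistics.mean(b for b, _ in babies)
enrolled = [pair for pair in babies if pair[0] > overall_mean]
rng.shuffle(enrolled)
half = len(enrolled) // 2
transfused, control = enrolled[:half], enrolled[half:]


def mean_change(group):
    """Mean change in IH per hour (negative means fewer episodes)."""
    return statistics.mean(after - before for before, after in group)


change_transfused = mean_change(transfused)
change_control = mean_change(control)

print(f"transfused arm: {change_transfused:+.2f} IH/h")
print(f"control arm:    {change_control:+.2f} IH/h")
print(f"difference:     {change_transfused - change_control:+.2f} IH/h")
```

Both arms improve by a similar amount, and only the comparison between them shows that the improvement has nothing to do with the intervention.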