Long term outcomes; the 2 year Bayley tells us very little

The Bayley Scales of Infant Development were created to screen babies for developmental delay, and can be used as one way of identifying children with potential problems, and then determining whether they might benefit from intervention. Unfortunately, they have become a way of measuring outcomes of neonatal interventions, and are often used to determine whether such an intervention is of benefit or not.

The 5 year follow up of the CAP (Caffeine for Apnea of Prematurity) trial was fascinating for me; we compared the results of 18 month Bayley scores (version 2) to IQ testing done at 5 years. Only 18% of babies who had a Bayley score below 70 at 18 months had an IQ score below 70 at 5 years.

I have added 2 lines to the graph that we published. The babies whose scores are represented by the dots to the left of the blue line are those who had a Bayley MDI <70. The children under the red line are those with an IQ score <70 at 5 years. The black line is from the original and shows the regression between the Bayley score and how much it changed compared to the IQ at 5 years, which shows that the lower the Bayley scores the more, on average, that they increased by 5 years.

A new publication from the ELGAN study group has done a similar thing, but in a lot more detail, and has defined adverse outcomes at 10 years of age, comparing the results of the Bayley version 2 scores (BSID) and motor evaluation (Gross Motor Functional Classification Score) at 2 years of age to IQ tests and other evaluations at 10 years. (Taylor GL, et al. Changes in Neurodevelopmental Outcomes From Age 2 to 10 Years for Children Born Extremely Preterm. Pediatrics. 2021)

At 2 years, they defined profound NDI as BSID-II MDI <50, PDI <50, or GMFCS 5, and moderate to severe NDI as BSID-II MDI 50 to 70, PDI 50 to 70, GMFCS 3 to 4, bilateral legal blindness, or bilateral hearing loss requiring amplification; the others were considered “none to mild”.

At 10 years the expert panel they put together came up with these definitions: moderate impairment (IQ 55–70, GMFCS 3, bilateral hearing loss requiring amplification, bilateral legal blindness, Autism Spectrum Disorder level 2, or epilepsy), severe impairment (IQ 35–54, GMFCS 4, or ASD level 3), or profound impairment (IQ <35, GMFCS 5, or ASD level 3 combined with IQ 35–54). They had data at both times for just over 800 babies <28 weeks gestation.

The first publication I remember that made this clear to me was by Maureen Hack (Hack M, et al. Poor Predictive Validity of the Bayley Scales of Infant Development for Cognitive Function of Extremely Low Birth Weight Children at School Age. Pediatrics. 2005;116(2):333-41). She showed that, of 78 babies with a 20 month MDI <70, only 29 had an 8 year IQ score <70. She also noted that the babies who were less likely to improve were those with neurosensory abnormalities.
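Hack's two counts can be turned into a positive predictive value with simple arithmetic. The short Python sketch below uses only the numbers quoted above; the variable names are illustrative labels, not terms from the paper.

```python
# Counts quoted above from Hack et al. 2005.
low_mdi_at_20_months = 78   # babies with a 20-month MDI < 70
low_iq_at_8_years = 29      # of those, babies with an 8-year IQ < 70

# Positive predictive value: of the babies flagged by the early test,
# what fraction still score below 70 at school age?
ppv = low_iq_at_8_years / low_mdi_at_20_months
no_longer_low = 1 - ppv

print(f"Still below 70 at 8 years: {ppv:.0%}")
print(f"No longer below 70:        {no_longer_low:.0%}")
```

On these counts roughly 37% stayed below 70 and about 63% did not, which points in the same direction as the 18% seen in CAP.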

The new study also seems to show that infants with profound “NDI” at 2 years who had severe motor problems (GMFCS 4 or 5) were less likely to improve. Looking at their table 3, most of the babies with severe GMFCS scores either stayed profoundly impaired or worsened from 2 to 10 years; only a few improved from profound to moderate or severe.

Overall these data are rather encouraging: the proportion of babies with adverse outcomes, and in particular severe or profound impairments, is much lower at 10 years than the 2 year evaluation would suggest and, importantly, predictions for individual patients are quite unreliable.

Looking at things from another point of view, a publication from the NICHD network examining the outcomes at 2 years of age (Rysavy MA, et al. The relationship of neurodevelopmental impairment to concurrent early childhood outcomes of extremely preterm infants. J Perinatol. 2021) compared the Bayley version 3 results of close to 3,500 babies of 22 to 26 weeks gestation to other outcomes such as hospital readmission, surgery in infancy, feeding problems leading to gastrostomy tube placement, medication use, and medical equipment needs at home.

Although many of those outcomes were more frequent among infants with so-called “Neuro-Developmental Impairment” or NDI they also occurred in infants without this label, and some outcomes such as re-hospitalisation for respiratory illness or surgery were very similar across groups of infants with no “NDI” to severe “NDI”.

As the authors of this study note, many of the outcomes that are reported in this paper are things which are important to parents, and impact their families, but are usually not collected or presented in detail.

They also note the following:

NDI does not have a consensus definition. Published definitions vary widely across studies. Small variations in the definition of NDI can have a substantial influence on its rate in a population and on its association with specific variables. Despite this, studies of NDI in children born extremely preterm are used as the basis for recommendations to make treatment decisions, including whether to direct care toward survival or palliation, and are frequently used as a component of the primary outcome of major clinical trials.

You will have noted that I usually put “NDI” in quotation marks as it is a term that I think should be abandoned. Most infants at 2 years of age are labeled as NDI because of a low score on the Bayley scales. But a low score on a developmental screening test is NOT an impairment. Although such scores are of some value for identifying infants who might benefit from further evaluation or intervention, they should not be used to determine whether a baby’s life is worth living.

The Bayley 4 is coming; I think it was released in 2019. I know it was re-standardized, but I don’t know if the standardisation will prove to be more reflective of the general population, or if it will be more predictive of longer-term impairment. But I doubt if any test designed to detect developmental delay in early childhood can predict school difficulties or persistent intellectual difficulties. Using such scores well beyond their initial intended purpose, such as for a definition of profound impairment, and then using the chance that a baby may have “profound impairment” in decision-making, is an enormous mistake.

Posted in Neonatal Research | Leave a comment

Coagulopathy and intraventricular haemorrhage

Intraventricular haemorrhages continue to be a source of concern to families of very preterm babies, and to all of us; severe haemorrhages are associated with poorer outcomes, especially bilateral extensive periventricular haemorrhagic infarction.

This is one of my occasional series of reviews of a particular neonatal issue, that are not particularly triggered by a new publication but by a question about current clinical practice. The question being: “When we find a serious intracranial haemorrhage in an at-risk very preterm baby, should we perform coagulation studies?” and of course the related questions “how should we interpret the results?” and “how should we respond to the results?”

Reviewing the literature for this is frustrating: there are not a large number of publications, and they are contradictory.

Some studies have found an association between serious IVH and prolonged clotting times, whereas others have not. Some have shown associations between IVH and genetic abnormalities relating to clotting factors, whereas others have not.

Earlier studies from the ’70s and ’80s tended to use the same normal values for babies of all gestational ages. But babies with haemorrhages are generally less mature, and less mature babies have more prolonged clotting times, even in the absence of an IVH. For example, Christensen et al showed that babies under 28 weeks had longer PT and aPTT than babies of 29 to 34 weeks. Neary et al enrolled more extremely immature babies and showed that at 23 to 24 weeks the mean PT was 22.6 s with a standard deviation (SD) of 7 s (at 24 to 25 weeks the mean was 21 s); the mean aPTT was 83 s (SD 37 s) in the most immature babies, and 72 s (SD 21 s) at 25 to 26 weeks. This study only included babies without serious IVH. Another study by the same group also showed very low concentrations of factors II and IX, and a low concentration of factor VII in the preterm.
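To get a feel for how wide gestational-age-specific ranges are, the means and SDs quoted above can be turned into rough upper limits. This is only a sketch: taking mean plus 2 SD as an upper limit assumes a roughly normal distribution, which clotting times may well not follow, and these are not the published reference ranges from the paper. The function name and dictionary labels are illustrative.

```python
# Means and SDs (seconds) as quoted above from Neary et al.
# The labels are copied loosely from the text.
quoted_values = {
    "PT, 23 to 24 weeks": (22.6, 7),
    "aPTT, most immature babies": (83, 37),
    "aPTT, 25 to 26 weeks": (72, 21),
}

def upper_limit(mean, sd, n_sd=2):
    """Return mean plus n_sd standard deviations.

    ASSUMPTION: using mean + 2 SD as an upper limit presumes a roughly
    normal distribution; it is not a published reference range.
    """
    return mean + n_sd * sd

for label, (mean, sd) in quoted_values.items():
    limit = upper_limit(mean, sd)
    print(f"{label}: mean {mean} s, approximate upper limit {limit:.1f} s")
```

On those assumptions an aPTT well over 100 s could still fall within the expected range for the most immature babies, which is why a single set of normal values for all gestational ages is misleading.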

As more immature babies have more IVH and also have more prolonged clotting times in the lab, there will seem to be an association between coagulopathies and IVH, simply because both are linked to immaturity, unless you use gestational age-adjusted normal values, or adjust statistically for GA. It is of course possible that the reason more immature babies have more IVH is because they have disrupted coagulation. The only way to be sure would be to have large numbers of babies with and without IVH of each gestational age and determine within GA strata whether babies with longer clotting times are more likely to have IVH.
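The stratification argument can be made concrete with a toy example. The counts below are entirely invented for illustration and come from no study: within each gestational age stratum the IVH risk is identical whether or not clotting times are prolonged, yet pooling the strata produces an apparent association.

```python
# HYPOTHETICAL counts, invented purely to illustrate confounding by GA.
# Each stratum maps clotting status -> (babies with IVH, total babies).
strata = {
    "<28 weeks": {"prolonged": (24, 80), "normal": (6, 20)},
    ">=28 weeks": {"prolonged": (1, 20), "normal": (4, 80)},
}

def stratum_risk(stratum, status):
    """IVH risk within one GA stratum for one clotting status."""
    ivh, total = strata[stratum][status]
    return ivh / total

def pooled_risk(status):
    """IVH risk for one clotting status, ignoring gestational age."""
    ivh = sum(groups[status][0] for groups in strata.values())
    total = sum(groups[status][1] for groups in strata.values())
    return ivh / total

for stratum in strata:
    print(
        f"{stratum}: prolonged {stratum_risk(stratum, 'prolonged'):.0%}, "
        f"normal {stratum_risk(stratum, 'normal'):.0%}"
    )
print(
    f"Pooled: prolonged {pooled_risk('prolonged'):.0%}, "
    f"normal {pooled_risk('normal'):.0%}"
)
```

With these made-up numbers the risk is 30% vs 30% and 5% vs 5% within strata, but 25% vs 10% when pooled, only because the immature stratum contributes most of the babies with prolonged clotting times.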

We also should remember that as well as lower concentrations of procoagulant factors, newborns, and preterm newborns in particular, also have lower concentrations of anticoagulant factors (Protein C, Protein S, and Antithrombin). Full term newborns at least are at higher risk of thrombotic complications as a result.

In general, among very preterm babies, I find that those studies which prospectively obtained cord blood, or early blood samples prior to the occurrence of IVH, show no difference between babies with and without haemorrhages (such as Neary et al 2015). Those which studied coagulation tests taken later on, such as at 48 hours, show more prolonged clotting times in babies with IVH, which suggests that perhaps prolonged coagulation tests are a result, rather than a cause, of IVH. This is illustrated by a study by Beverley et al. Babies who had “grade 4 IVH” had a mean gestational age of 28.7 weeks, compared to 32.3 weeks for those without IVH. Cord blood aPTT was somewhat longer in those who eventually developed an IVH, but only by 10 seconds. At 48 hours, those without an IVH had an aPTT which had shortened by 10 seconds, whereas the IVH babies had not changed much, and the difference was then “statistically significant”.

The answer to the first question I asked therefore is “not clear”. It doesn’t appear that there is much difference between the baseline coagulation studies of very preterm babies who go on to develop IVH compared to GA matched controls without IVH, so routinely measuring coagulation studies is of no proven value. Results should probably be evaluated compared to GA appropriate standards if you do obtain them, recognizing that among the most extremely immature infants values are not well documented (Neary’s study for example only had 9 babies of 23 weeks).

If you find clotting times which are even longer than GA appropriate standards, what should you do about it? There are four RCTs of Fresh Frozen Plasma use analyzed in the Cochrane Review, which appears to have been last updated in 2004. That review includes all of the interventions which could be classified as “volume expansion”, but one of its comparisons is FFP versus no treatment. The analysis shows no overall difference in IVH between groups, but none of the studies had coagulopathy as an entry criterion. It is interesting that one small controlled trial (Beverley 1985) seemed to show a reduction in IVH with FFP administration, but did not show that FFP improved coagulation study results! The 2 groups had practically identical PT and aPTT after either FFP or no treatment, and only slightly higher fibrinogen concentration after FFP.

To answer my initial questions, then: “When we find a serious intracranial haemorrhage in an at-risk very preterm baby, should we perform coagulation studies?” I would say this is of no proven value; disturbed coagulation studies may be a result rather than a cause of IVH. Of course, the occasional patient with a congenital coagulation disorder that may warrant intervention may be missed if we never do such tests. Haemophilia A in Canada has an incidence of about 20/100,000 male births, Haemophilia B is about 4/100,000, Von Willebrand’s disease (VWD) is about 12/100,000 births, and other factor deficiencies together are somewhere between Haemophilia A and VWD; so overall about 0.05% of boys and 0.025% of girls.
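The overall percentages follow from adding up the quoted incidences. The sketch below makes the arithmetic explicit under some labelled assumptions: the haemophilias are counted for boys only, the Haemophilia B figure is taken to be per male births, and the “other factor deficiencies” are taken as a range from 12 to 20 per 100,000 applying equally to both sexes.

```python
# Incidences per 100,000 as quoted above.
HAEMOPHILIA_A = 20   # per 100,000 male births
HAEMOPHILIA_B = 4    # per 100,000 (ASSUMED to be male births as well)
VWD = 12             # per 100,000 births, both sexes
# "Somewhere between Haemophilia A and VWD", taken here as a range.
OTHER_RANGE = (VWD, HAEMOPHILIA_A)

def combined_percent(sex, other):
    """Combined incidence as a percentage of births for one sex.

    ASSUMPTION: haemophilia A and B are counted for boys only; VWD and
    the other deficiencies are counted equally for both sexes.
    """
    per_100k = VWD + other
    if sex == "boy":
        per_100k += HAEMOPHILIA_A + HAEMOPHILIA_B
    return per_100k / 100_000 * 100

for sex in ("boy", "girl"):
    low, high = (combined_percent(sex, other) for other in OTHER_RANGE)
    print(f"{sex}s: {low:.3f}% to {high:.3f}%")
```

Under these assumptions the totals come out at about 0.048% to 0.056% for boys and 0.024% to 0.032% for girls, in line with the rounded figures in the text.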

Given the low risk of missing a congenital disorder, I think it is reasonable to limit coagulation screens to babies with something unusual in their presentation, such as an unexpected IVH, or bleeding in other sites in addition to the IVH; unless, of course, new data in the future demonstrate a clear link between disturbed coagulation studies and the development of IVH.

“How should we interpret the results?” I think we should use GA appropriate standards, with the proviso that different laboratories give different results; but differences between good quality labs are nowhere near the same order of magnitude as the differences between mildly, moderately, and extremely preterm infants. So if your lab does not have good normals for a 25-week infant (for example) you won’t go far wrong using published normal values.

“How should we respond to the results?” I think that unless a baby is actively bleeding we have no evidence that responding to abnormal coagulation studies will improve any outcomes. If we do find a result which is outside of GA appropriate normals, then replacement of the missing factors could possibly be the right thing to do, but volume expansion may be hazardous, so I think giving the replacement as slowly as feasible, or giving factor concentrates if a specific factor deficiency is discovered, should be the approach.

It is also possible that haemorrhages cause disturbances of coagulation which then lead to progression of the grade of IVH, but that also is unproven, and seems to me to be unlikely, as the pathophysiology of the worst haemorrhages appears to be venous infarction rather than continued bleeding.

A large prospective study of very early coagulation studies among babies under 27 weeks gestation, with enough numbers to determine with more certainty the link between coagulopathies and IVH would be great. It is getting harder to obtain cord blood now that delayed clamping is the norm, so such a study would be difficult to complete, but it would really help in the clinical care of our tiniest, at-risk, babies.


Back to blogging

I have been unable for various reasons to blog for a while, but getting back in the saddle is a good feeling, and I plan to try and blog at least once a week. There have been a few blog posts that have been partially completed over the last 3 months, some of which I will work on and post in the next few days. In the meantime, as an added treat, here are a few pictures of birds! The first 3 are from western Europe, the other 3 are Québecois.

Avocet with worm
Greenshank and Snipe
Pileated Woodpecker
Purple Finch

The HIP trial, treating hypotension in the extremely preterm infant.

I guess the HIP trial should be about screening for congenital dislocation, but in reality, it stands for Hypotension In Preterm infants. The long road to the unsatisfactory conclusion of the trial started more than 10 years ago, in discussions with Gene Dempsey. After he left his fellowship with me in Montreal, we continued to discuss how best to answer the questions about the treatment of hypotension, which then was, and still is, a major controversy. We wanted a large pragmatic trial comparing usual therapy, which was to give a bolus of saline and then start dopamine if the mean BP remained below the gestational age in weeks, to an approach where treatment was only given if there were signs of poor perfusion. We realized that it would be ethically difficult to randomize, to placebo, babies with clinical signs of poor perfusion (even though we have no good idea how to treat hypotensive babies with poor perfusion either!) so the only eligible babies should be those without such signs.

The protocol that we thought best would be to allow rescue treatment for enrolled babies if they became profoundly hypotensive or developed other signs, but preferably with something other than dopamine.

Fortunately for me and the other investigators, Gene Dempsey didn’t leave this as a pipe dream, but put together an application to the EU under the FP7 scheme, and was successful in securing funding. Gene’s brilliance and his ability to get things done have recently been rewarded by Ireland’s first and only chair in Neonatology! Congratulations Professor Dempsey.

Part of the justification for granting these funds was that we should develop a specific neonatal formulation of dopamine. As you might imagine, the clinician/investigators didn’t care a bit about that; we would have been happy to study off-the-shelf dopamine made by anybody, but as dopamine is not approved for use in the newborn in Europe (or Canada or the USA, in fact), we were mandated to perform the study as a basis for a neonatal approval for the drug.

Unfortunately, that led to enormous delays in getting the study started. The company we were working with wanted a formulation without preservatives (as did we, if possible), and a manufacturer told them it was possible, but then it turned out not to be stable. So we waited for a formulation with a preservative but designed for newborns. Finally, as we were starting to get worried about running out of time, we received approval to use off-the-shelf dopamine, and had a one-year extension to the grant. Once we actually started recruitment it was slow going, which was somewhat expected, but the PIP (pediatric investigation plan) that had been approved by the EMA forced us to restrict the study to only babies with arterial catheters in place. I understood the restriction: non-invasive blood pressures are very unreliable in very tiny babies, especially when they are hypotensive! I will usually try to put in an arterial catheter in a baby if I start some inotrope/vasopressor treatment, but in my centre, for babies at 25 weeks and above, we don’t otherwise routinely put in arterial catheters; we restrict them to those who have clinical compromise, who wouldn’t actually be eligible for the trial!

The PIP also mandated that babies have a head ultrasound prior to enrolment to document a lack of severe abnormalities. Again I can see the point, serious head ultrasound abnormalities were part of the primary outcome, but the reality of clinical practice is that babies are often hypotensive in the first few hours of life, and head ultrasounds are often not available when the decision to treat is made.

So in our centre, despite screening babies for many months we weren’t able to randomize anyone. Other centres were more successful, either because they used more arterial catheters or had more hypotensive babies, and often the attending neonatologist performed the head ultrasounds. Over the 2 years that the trial was recruiting there were over 800 babies less than 28 weeks gestation born in participating hospitals, 307 of whom had an arterial catheter in place, 91 of those were eligible for the trial, and finally 58 were randomized. We had planned for 830 babies, which would have been feasible if we had 3 times as many centres and twice as much time.
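The recruitment funnel is easier to appreciate as stage-to-stage percentages. The sketch below uses only the counts quoted above; “over 800” births is taken as exactly 800, which is an assumption, so the first percentage is an approximate upper bound. The stage labels and function name are illustrative.

```python
# Counts quoted above. ASSUMPTION: "over 800" births is taken as 800.
funnel = [
    ("born under 28 weeks in participating hospitals", 800),
    ("had an arterial catheter", 307),
    ("eligible", 91),
    ("randomized", 58),
]
PLANNED_SAMPLE = 830

def step_rates(steps):
    """Fraction of each stage that reached the following stage."""
    return [
        (later_name, later_n / earlier_n)
        for (_, earlier_n), (later_name, later_n) in zip(steps, steps[1:])
    ]

for name, rate in step_rates(funnel):
    print(f"{name}: {rate:.1%} of the previous stage")

share_of_plan = funnel[-1][1] / PLANNED_SAMPLE
print(f"Randomized as a share of the planned sample: {share_of_plan:.1%}")
```

The biggest losses are at the arterial catheter and eligibility steps, and the final enrolment is about 7% of the planned sample.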

Finally, the grant expired and we had to terminate the trial, which was a huge disappointment. However, it is nevertheless the largest prospective RCT of hypotension in the preterm and the largest placebo-controlled RCT of dopamine use in newborn infants. (Dempsey EM, Barrington KJ, Marlow N, O’Donnell CPF, Miletin J, Naulaers G, et al. Hypotension in Preterm Infants (HIP) randomised trial. Archives of Disease in Childhood: Fetal and Neonatal Edition. 2021). I recount all these difficulties for anyone planning to study similar issues in the future. In older studies, about 50% of extremely preterm babies were hypotensive, whereas in HIP a bit less than 33% of extremely preterm babies with arterial catheters in place satisfied our criteria for enrolment, which were: (1) less than 28 weeks gestation; (2) mean arterial pressure below the gestational age in completed weeks for at least 15 minutes; (3) a head ultrasound without serious IVH.

In the 2 groups, the study drug (dopamine or placebo) was given in increasing doses (up to a maximum of 20 mcg/kg/min) until an acceptable BP was achieved (above the GA), or until the baby developed profound hypotension or signs of shock. That was quantified as a mean BP more than 5 below the treatment threshold, or at least 2 of the following: BP more than 3 below threshold; capillary filling time more than 4 seconds; serum lactate over 4 mmol/L.
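The rescue criteria read naturally as a small decision rule. The sketch below encodes them as described above, purely as an illustration of the logic; it is not the trial's actual software and is not intended for clinical use. It assumes blood pressure is in mmHg and that the treatment threshold equals the gestational age in completed weeks; the function and parameter names are hypothetical.

```python
def meets_rescue_criteria(mean_bp, ga_weeks, cap_refill_s, lactate_mmol_l):
    """Return True if the rescue criteria described above are met.

    ASSUMPTIONS: mean_bp is in mmHg and the treatment threshold is the
    gestational age in completed weeks. Illustrative only.
    """
    threshold = ga_weeks

    # Profound hypotension: mean BP more than 5 below the threshold.
    if mean_bp < threshold - 5:
        return True

    # Otherwise at least 2 of the 3 signs must be present.
    signs = [
        mean_bp < threshold - 3,   # BP more than 3 below threshold
        cap_refill_s > 4,          # capillary filling time over 4 seconds
        lactate_mmol_l > 4,        # serum lactate over 4 mmol/L
    ]
    return sum(signs) >= 2


# Example: a 25-week baby with a mean BP of 21, capillary refill of
# 5 seconds and lactate of 2 mmol/L meets two of the three signs.
print(meets_rescue_criteria(21, 25, 5, 2))
```

Writing the rule out this way shows that a baby with a mean BP only 1 or 2 below the threshold could still qualify for rescue treatment on perfusion signs alone.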

The primary outcome was survival without severe brain injury on ultrasound, which was slightly more frequent among placebo infants, 69%, than among dopamine babies, 62%. Mortality was almost identical (7 babies vs 6), and other outcomes were also similar; obviously, at its final size the trial was severely underpowered.

The only difference of note between groups was that the placebo babies were more likely to get back-up treatment: 66% vs 38%.

The study illustrates some of the difficulties in doing such a study. We really wanted to do a pragmatic study that reflects current clinical practice, which was not consistent with the requirement to have a head ultrasound prior to starting treatment, at least in most North American sites where radiology organizes the imaging. Similarly, the restriction to only babies with arterial lines, while justifiable, does not, I think, reflect universal practice.

How could we do another study to get sufficient power to answer the questions? I think that a grant from an agency that commonly funds clinical trials in newborns, with a focus on just getting a clinical answer without concerns about specific neonatal formulations would help. Using the clinical research networks that are in place or currently developing would really help, although we have made some great relationships in planning and performing this trial.

The long term outcome of these babies is being collated and analysed at present. This was a co-primary outcome and will also, of course, be greatly underpowered, but might be able to give us a hint as to whether avoiding dopamine in the first few days of life has a major impact on development. There will also be other publications addressing side studies; the first of them, which will be appearing soon (first author Liesbeth Thewissen), evaluates the impacts of hypotension and its treatment on cerebral NIRS signals. Watch this space!


More about Prebiotics

I don’t know if there is an “official” definition of prebiotics, but I think of them as molecules present in the diet that promote the growth of probiotic organisms. I believe that originally the term was applied only to molecules that are not digested by humans (or I guess another animal being studied) and are only digested by commensal organisms, including those that have a positive health benefit. I have seen the term used more widely, however, to include such molecules as lactoferrin, which is partially degraded by humans to produce lactoferricin, and which can be absorbed and have direct impacts on health. It also is not, I believe, digested by any probiotic organisms.

Although lactoferrin is a very interesting molecule, it probably shouldn’t be considered a prebiotic; the most interesting prebiotic molecules are in fact carbohydrates, including a number of oligosaccharides.

Even though the Human Milk Oligosaccharides (HMOs) are not digested and have no direct nutritional benefit for human babies, they together form the 3rd most abundant component of breast milk, after lactose and lipids. There are very many of them which appear to have importance for the growth of probiotic bacteria.

To get much further into biochemistry than I am really qualified for, the HMOs are a group of compounds which consist of various combinations of glucose, galactose, fucose, N-acetylglucosamine and N-acetyl neuraminic acid. The last mouthful in that sentence is one of a group of molecules, sugars with 9 carbon atoms, which are collectively known as sialic acid; N-acetyl neuraminic acid is also sometimes itself referred to as sialic acid, just to make it a bit more confusing.

This is a wordy preamble to introduce the fact that there is one of the HMOs which looks like it might be extremely important in the pathophysiology of NEC, or rather in the prevention of NEC, and that is a molecule known as…. take a deep breath…. Disialyllacto-N-tetraose (DSLNT).

Here is a schematic of what it looks like

The purple diamonds are the N-acetyl neuraminic acid residues, the blue circle is glucose, the yellow circles are galactose, and the blue square is N-acetylglucosamine. A few years ago now the idea that this particular HMO might be very important arose from a number of studies including an animal model of NEC (Autran CA, et al. Sialylated galacto-oligosaccharides and 2′-fucosyllactose reduce necrotising enterocolitis in neonatal rats. Br J Nutr. 2016;116(2):294-9), which was followed by a multicentre cohort study (Autran CA, et al. Human milk oligosaccharide composition predicts risk of necrotising enterocolitis in preterm infants. Gut. 2018;67(6):1064-70) showing that among mothers who were providing breast milk to their babies, the infants who nevertheless developed NEC had much lower concentrations of that particular HMO in the breast milk they were receiving.

The figure on the left shows cases with Bell stage 3 NEC in red squares, Bell stage 2 as yellow circles and the other grey dots are the matched controls.

This has just been confirmed in an independent cohort, (Masi AC, et al. Human milk oligosaccharide DSLNT and gut microbiome in preterm infants predicts necrotising enterocolitis. Gut. 2020) which again showed much lower DSLNT concentrations in the breast milk of babies who went on to develop NEC. In this study they also analyzed the intestinal microbiome and showed that babies who developed NEC had lower Bifidobacterium longum concentrations.

This work has a number of implications, for one, I wonder whether screening donor mother’s milk for the concentration of DSLNT would be feasible, and whether selecting milk with higher concentrations might enhance the protection that donor milk provides to babies whose mothers cannot produce all the milk they need.

Of course, the question of whether supplementation of the infants’ diet with DSLNT might prevent NEC is going to be the next issue. It appears that it can be synthesised, but I have no idea about the potential cost of synthetic DSLNT or whether it could be extracted from human breast milk.

Prebiotics have actually been tested in clinical trials for NEC prevention, but here you have to be very careful, and realize that not everything that has been tested is actually a prebiotic according to the definitions above, and none of the studies have tested any of the most likely effective HMOs, such as DSLNT.

A new network meta-analysis, for example, (Chi C, et al. Effects of Probiotics in Preterm Infants: A Network Meta-analysis. Pediatrics. 2021;147(1)) includes 5 articles that they state studied a prebiotic. One of the studies did not actually include a prebiotic, one of the studies included a group receiving lactoferrin, and the 2 others that are easily available studied inulin. The 5th studied a “fructo-oligosaccharide” which I think is also inulin. None of these molecules are the prebiotics that we need to be studying. This network meta-analysis did, however, report, based on 45 publications including over 12,000 participants, findings that confirm those of other reviews that NEC is reduced by probiotic supplementation and that combination preparations appear to be more effective.

I think the next stage ought to be a trial of babies receiving breast milk (either maternal or donor) and a multi-strain probiotic mixture including B. longum subsp. infantis, which randomizes the infants to prebiotic or placebo. The prebiotic could either be a mixture of HMOs, or DSLNT; how we can obtain these I am quite unsure.


When should we transfuse preterm babies, and why? Redux.

The TOP trial has just been published in the FPNEJM (Kirpalani H, et al. Higher or Lower Hemoglobin Transfusion Thresholds for Preterm Infants. N Engl J Med. 2020;383(27):2639-51). It was a multicenter, non-masked RCT among 1800 babies of less than 1 kg birthweight, between 22 weeks and <29 weeks and <48 hours of age. They had not had a previous red cell transfusion unless they had needed an emergency transfusion before 6 hours of age (which happened in about 5%). The infants were randomized to a higher or lower transfusion threshold, and the primary outcome was survival without neurological impairment or developmental delay at about 2 years corrected age.

The study was therefore almost twice as large as the ETTNO trial that I posted about earlier this year, with similar entry criteria, except that in ETTNO babies could be enrolled up to 72 hours of age, and 25% of them had already had at least one transfusion. The average difference in haematocrit between the 2 groups in ETTNO was about 3% from week 3 to week 10, equivalent to a haemoglobin difference of about 1.1 g/100mL. This was a smaller haemoglobin separation between groups than in TOP (average 1.9 g/100mL).

The primary outcome of TOP was not different between higher and lower transfusion threshold groups, and no part of the primary was different. Important, pre-specified secondary outcomes were also all just about identical between the 2 groups. This included brain injury as diagnosed on ultrasound, bronchopulmonary dysplasia (very slightly more frequent in the group transfused at a higher threshold, 59% vs 56%) and necrotising enterocolitis, 10% in each group.

The other trials with similar treatment comparisons are the aforementioned ETTNO, and PINT, as well as the Iowa transfusion trial. The Iowa trial was a little different in that it included babies up to 1300 g (just up to 1 kg in the other trials) and had the same transfusion thresholds throughout the study, depending only on respiratory status and not changing with postnatal age. Here are the thresholds for the 4 trials, converted where necessary into Haemoglobins (g/100mL) and rounded to the nearest 0.5:

Or presented as Haematocrit, rounded to the nearest 1%.
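For readers who want to redo this kind of conversion themselves, here is a minimal Python sketch. It assumes the common rule of thumb that haematocrit (%) is roughly 3 times the haemoglobin (g/100mL); that factor is an assumption and not necessarily the one used for the figures in this post. The function names and the example inputs are hypothetical and are not trial thresholds.

```python
import math

# ASSUMPTION: haematocrit (%) is roughly 3 x haemoglobin (g/100mL).
HCT_PER_HB = 3.0

def round_half_up(value, step):
    """Round to the nearest multiple of step, with ties rounding up."""
    return math.floor(value / step + 0.5) * step

def hct_to_hb(hct_percent):
    """Haematocrit (%) to haemoglobin (g/100mL), to the nearest 0.5."""
    return round_half_up(hct_percent / HCT_PER_HB, 0.5)

def hb_to_hct(hb_g_per_100ml):
    """Haemoglobin (g/100mL) to haematocrit (%), to the nearest 1%."""
    return round_half_up(hb_g_per_100ml * HCT_PER_HB, 1)

# Hypothetical example values, not thresholds from any of the trials.
for hct in (25, 30, 36):
    print(f"Hct {hct}% is about Hb {hct_to_hb(hct)} g/100mL")
print(f"Hb 11.5 g/100mL is about Hct {hb_to_hct(11.5)}%")
```

Rounding to the nearest 0.5 g/100mL hides differences of up to 0.25, so small apparent discrepancies between converted thresholds should not be over-interpreted.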

The definitions of “Sick” and “Not Sick” are somewhat different between the studies. For TOP it was entirely respiratory; they used “a higher threshold when respiratory support was warranted. Respiratory support was defined as mechanical ventilation, continuous positive airway pressure, a fraction of inspired oxygen (Fio2) greater than 0.35, or delivery of oxygen or room air by nasal cannula at a flow of 1 liter per minute or more.”

In ETTNO being sick meant “having at least 1 of the following criteria: invasive mechanical ventilation, continuous positive airway pressure with fraction of inspired oxygen >0.25 for >12 hours per 24 hours, treatment for patent ductus arteriosus, acute sepsis or necrotizing enterocolitis with circulatory failure requiring inotropic/vasopressor support, >6 nurse-documented apneas requiring intervention per 24 hours, or >4 intermittent hypoxemic episodes with pulse oximetry oxygen saturation <60%”.

In PINT it was just respiratory support “assisted ventilation, continuous positive airway pressure, or supplemental oxygen” without further specification.

As I have ranted on about before, this makes no sense. Why do we think that a preterm infant with a saturation of 92% in 30% oxygen needs to have a higher haemoglobin than a baby with this saturation in 21% oxygen? Or if they are intubated? Maybe if they are intubated on high-frequency ventilation with a very high mean airway pressure there might be enough impact on their cardiac function to limit tissue oxygen delivery, but in the majority of patients, moderate respiratory disease or respiratory support should have no impact on tissue oxygenation or transfusion needs.

Infants with a limited cardiac output might need to have a higher haemoglobin to maintain oxygen delivery to the tissues, but I actually think that is unlikely to be a common problem; perhaps in septic shock, or with a cardiomyopathy, but most babies can probably increase their cardiac output to respond down to quite low haemoglobin concentrations. The ETTNO trial inclusion of needing cardiovascular support makes much more sense than the other criteria for demanding a higher threshold.

As you may know, the PINT outcome study showed no major difference in long term development between the high and low threshold groups, but there were some minor differences in Bayley scores which appeared to favour the high threshold group. The proportion of survivors with a Bayley II MDI less than 70 was 18% in the high group vs 24% in the low group, so, being extra careful, they also analysed the proportion of survivors who had an MDI <85, which looked different between groups: 34% high threshold vs 45% low threshold. As a result, there remained a concern that perhaps a higher threshold would be preferable; these 2 new studies demonstrate that is not the case. Transfusion thresholds in the low columns above are consistent with good practice, and will lead to fewer babies being transfused without measurable adverse effects.

One other thing that I noticed is that the Iowa trial showed some differences in apneas between the groups, with the babies who received fewer transfusions having more apneas, and more severe apneas.

In both TOP and ETTNO, given the difficulty in clinical research of accurately quantifying apnea, the only relevant data point they give is when the caffeine was finally stopped (nearly 100% of these babies having received caffeine). In the 2 trials caffeine was stopped at about the same time in both groups, suggesting that persistent apnea is not more common if you let the haemoglobin fall to these levels.

In these 2 linked blog posts I have tried to answer the question of when to transfuse, and have avoided the question of “why?”

The “why” should surely be to prevent complications or improve outcomes. The “why” on a physiologic basis is to improve oxygen-carrying capacity, when that oxygen-carrying capacity is too low to allow adequate oxygen delivery and when this leads to tissue hypoxia. There is no sign from these new data, when analyzed together with the older information, that transfusing above the Low Transfusion Thresholds is of any benefit, and I think we are way above the threshold where tissue hypoxia becomes an issue. Further data on clinical situations where a transfusion is necessary would be really helpful (do babies in shock benefit from a transfusion?). There is also, of course, no clear evidence of any harm from an approach to transfusion which follows either the high or low thresholds, or something in between!

With delayed cord clamping, initial blood work done on whatever blood is left in the placenta when possible, and restricted blood sampling throughout hospitalisation, we should be able to dramatically reduce transfusion requirements. Then we should ask the parents of infants at-risk of needing a transfusion whether they would prefer that their infant receives erythropoietin (or darbepoetin) in order to reduce the probability even further. We can ask them that while still reassuring them that blood transfusion is extremely safe.

Should all asphyxiated babies have MR spectroscopy?

MRI post-asphyxia, and post-rewarming, seems to be more predictive of long term outcomes than MRI at term for preterm infants. Imaging and analysis of the Apparent Diffusion Coefficient in the PLIC (Posterior Limb of the Internal Capsule), for example, has reasonably good predictive value, and I like the fact that it is relatively objective: I can look at an MRI on a hospital computer, put my cursor on the PLIC and the software gives me a number; the lower the number the worse the outcome, in simple terms. I am not any good at interpreting MRIs by looking at the images, so getting a nice clear number appeals to me (and usually confirms what I think clinically). I can’t say it has ever really helped me in the clinical care of a baby.

We haven’t always been getting spectroscopy with our MRIs, but the new framework for practice from the British Association of Perinatal Medicine recommends the following:

Where possible, Proton (1H) MRS Lactate/N acetyl aspartate (Lac/NAA) of the basal ganglia and thalamus should be performed with the MRI at 5-15 days after birth. This is the most accurate predictor of outcome in babies who have undergone therapeutic hypothermia (TH).

The two references they give for that statement are 2 of the only three studies I am currently aware of. Mitra S, et al. Proton magnetic resonance spectroscopy lactate/N-acetylaspartate within 2 weeks of birth accurately predicts 2-year motor, cognitive and language outcomes in neonatal encephalopathy after therapeutic hypothermia. Arch Dis Child Fetal Neonatal Ed. 2019;104(4):F424-F32 and Lally PJ, et al. Magnetic resonance spectroscopy assessment of brain injury after moderate hypothermia in neonatal encephalopathy: a prospective multicentre cohort study. Lancet Neurol. 2019;18(1):35-45. The third one being Barta H, et al. Prognostic value of early, conventional proton magnetic resonance spectroscopy in cooled asphyxiated infants. BMC Pediatr. 2018;18(1):302.

This figure shows basically what we are discussing, placing the voxel of interest in the thalamus, and then looking at the proton spectrum.

According to the framework for practice, the lactate to NAA ratio is highly predictive of adverse outcomes. Mitra et al (from where I took that image) refer to the Lac+Thr/tNAA (total NAA) which I think is the same thing as what Lally et al call the Lactate-NAA ratio. Barta et al also report that they calculated the Lac/NAA ratio, but that it was not one of the 3 ratios that adequately discriminated between the “good and poor outcome groups”.

The study by Lally defined an adverse outcome as death (I think there was only 1 death in this cohort) or a score on the Bayley 3 language or composite of <85 or cerebral palsy (GMFCS 2 or worse), seizure disorder or deafness, and has this figure:

Which shows a number of interesting things: firstly there are only 12 babies with an “adverse” outcome for the NAA concentration data (probably because it takes substantially longer to get this result, and perhaps because it is not available everywhere), there are about 26 adverse outcomes for the metabolite ratios; secondly, all of the measurements and calculated ratios show an overlap between the “normal” and “adverse” outcome babies. According to this study, the sensitivity of a Lactate-NAA ratio >0.22 is 88% and the specificity is 90% with the area under the ROC curve being 0.94. According to the same study, an absolute NAA concentration <5.6 mmol/kg brain wet weight has a sensitivity of 100% and a specificity of 97% with an ROC curve area of 0.99.
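To get a feel for what a sensitivity of 88% and a specificity of 90% might mean for an individual baby, here is a minimal Python sketch that converts them into predictive values. The 88% and 90% are the figures quoted above for the Lactate-NAA ratio; the 20% prevalence of adverse outcome is purely an assumption for illustration, not a number taken from any of the studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical prevalence of adverse outcome, chosen only for illustration.
ASSUMED_PREVALENCE = 0.20

ppv, npv = predictive_values(0.88, 0.90, ASSUMED_PREVALENCE)
print(f"PPV {ppv:.2f}, NPV {npv:.2f}")
```

Changing `ASSUMED_PREVALENCE` shows how much the predictive value of an abnormal ratio depends on how common adverse outcomes are in the babies being scanned.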

If we look in detail at the results of Mitra et al, they included 55 infants with 16 deaths and 20 with an abnormal motor outcome, 19 with an abnormal cognitive outcome and 21 with abnormal language outcome by 2 years of age. By abnormal outcome they mean either death or a score on the Bayley 3 composite of <85. They show this figure which I just can’t understand; if death or a low Bayley score on the cognitive composite was abnormal, then there should be 35 red dots, and 20 green dots, but there aren’t. There are actually more green dots than red, and even though it is difficult to count them there are more than 20 green dots.

In addition, the dot colours are the wrong way round for the language outcome. More importantly, the cut-off they use for determining what is abnormal is different to the threshold used by Lally et al: a Log10 of -0.4 is a Lac+Thr/tNAA ratio of about 0.4. The ratio used as the threshold value by Lally et al (0.22) gives a Log10 value of about -0.66. Looking at these figures, if you applied the threshold that Lally et al used to the babies in the Mitra study, it would classify many of their normal babies in the abnormal group.
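The conversion between the two ways of expressing the threshold is simple enough to check. A minimal Python sketch, using only the two cut-offs mentioned above (the variable names are mine, not from either paper):

```python
import math

# Mitra et al: cut-off expressed as log10 of the Lac+Thr/tNAA ratio.
mitra_cutoff_log10 = -0.4
mitra_cutoff_ratio = 10 ** mitra_cutoff_log10   # roughly 0.40

# Lally et al: cut-off expressed directly as a Lactate-NAA ratio.
lally_cutoff_ratio = 0.22
lally_cutoff_log10 = math.log10(lally_cutoff_ratio)   # roughly -0.66

print(f"Mitra cut-off as a ratio: {mitra_cutoff_ratio:.2f}")
print(f"Lally cut-off as log10:   {lally_cutoff_log10:.2f}")
```

On the ratio scale the Mitra cut-off is therefore nearly twice the Lally cut-off, which is why applying one study's threshold to the other study's babies reclassifies so many of them.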

Looking at these data I really do not see that there is enough data to support performing MR spectroscopy on all babies with HIE.

  1. We can with some degree of confidence, based on about 45 babies, state that if the Lac/NAA ratio is high, then outcomes are probably going to be worse than if it is lower. But what threshold ratio we should use to make this determination is not clear, and it doesn’t seem to work in Hungary.
  2. Does it really matter if a Bayley 3 score is <85? I think that dividing babies up into the “normal” and “abnormal” is unhelpful, and creating a category that includes 16% of the non-asphyxiated population and calling them “abnormal” is questionable. Dichotomizing the richness of human development should be avoided.
  3. What is the clinical value to an individual or their family that is added by performing MR spectroscopy? Does knowing the Lac/NAA ratio help the family in some way? Does it help them to access services, or prepare for the future, or is there some other benefit?
  4. Does MR spectroscopy adequately differentiate between the babies with very severely abnormal outcomes (such as an inability, or very restricted ability, to communicate) and babies who have no, minor, or moderate disability?
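The 16% in point 2 comes straight from the way the score is standardized. A minimal Python sketch, assuming the composite score is normally distributed with a mean of 100 and an SD of 15 in the reference population (so that 85 is exactly 1 SD below the mean):

```python
from statistics import NormalDist

# Assumed standardization of the composite score: mean 100, SD 15.
reference_population = NormalDist(mu=100, sigma=15)

# Proportion of the reference population expected to score below 85.
proportion_below_85 = reference_population.cdf(85)

print(f"{proportion_below_85:.1%} of the reference population scores <85")
```

In other words, roughly 1 in 6 children who were never asphyxiated would be labelled “abnormal” by this cut-off.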

As far as I can see currently, perhaps spectroscopy can help us to perform asphyxia intervention studies with shorter follow up. If we can confirm that interventions that have no impact on spectroscopy have little or no clinical benefit, then MR spectroscopy as a biomarker for research might be very valuable to screen out things that are unlikely to improve outcomes.

The absolute NAA concentration, based on a very small sample, might be a better biomarker, but it currently takes an extra 25 minutes in the magnet, if you have the right 3T magnet and the software.

As for clinical use as a routine after HIE, the data regarding NAA concentration or the metabolite ratios are based on very small numbers of babies, with outcomes that are not very important.

If future studies can focus on extremely adversely affected babies, and if they use the same thresholds to classify studies as abnormal or not, then we might in the future have enough data to support using routine spectroscopy as a clinical tool.

Diuretics as Anticonvulsants?

In recent years there has been a lot of interest in neonatal seizures and how to treat them. Older studies confirmed that phenobarbitone (or phenobarbital, I never know these days) appears effective, but with limits; many babies have a partial response, and many more continue to have electrical seizures even after they stop clinical convulsions. Other agents have limited efficacy: phenytoin is not as effective as a second line agent compared to giving more phenobarb, and levetiracetam appears not to be living up to the initial hope that it would be a valuable neonatal agent.

Neonatal neurones are different.

In contrast with the adult brain, immature neurons actively accumulate chloride via the electroneutral NaK2Cl transporter known as NKCC1. Under these conditions, GABAA receptor activation results in a net efflux of negatively charged chloride ions, which depolarizes the membrane. The size of the chloride flux is important, and smaller anion effluxes may not trigger action potentials. However, if the membrane is depolarized sufficiently to trigger action potentials and open voltage-gated calcium channels, GABA action is clearly excitatory. Under both conditions, GABAA receptor activation may still be inhibitory by virtue of strongly depolarizing glutamate-mediated activity. The importance of the shunting effect of GABA is well established by the finding that when all GABAA receptors are blocked, the net effect is proconvulsant in the neonatal brain. Thus, synaptically released GABA has a dual action, both excitatory and inhibitory, in the immature nervous system.

That paragraph is (slightly) adapted from Dzhala VI, et al. Bumetanide enhances phenobarbital efficacy in a neonatal seizure model. Ann Neurol. 2008;63(2):222-35, the title of which reveals what this post is all about. Loop diuretics work by inhibition of the NaK2Cl cotransporters in the thick ascending limb of the loop of Henle. All of them work on this ion pump but with differing affinity. Bumetanide has the highest affinity and thus requires the lowest dosage; in terms of clinical efficacy there is not much to choose between the agents. Toxicities may differ depending on other possible effects; ethacrynic acid, for example, is hepatotoxic and may well cause more deafness. I always wondered, in oliguric babies, whether the fact that you need to excrete fewer millimoles of bumetanide than of furosemide might make it more effective, as the ion pump is only on the luminal side so you have to excrete the molecule before it can work (Oliveros MMD, et al. The use of bumetanide for oliguric acute renal failure in preterm infants. Pediatr Crit Care Med. 2011;12(2):210-40). In any case, it was previously thought that bumetanide might be more selective for the NKCC2, but that is probably not the case. The choice of bumetanide rather than furosemide for anticonvulsant effects might just have been an arbitrary choice at first, but most of the animal studies have been with this molecule.

Including the study by Dzhala et al; in that study, isolated slices of neonatal rat hippocampus and intact neonatal rat hippocampal preparations were induced to have convulsions by perfusing them with very low magnesium. That preparation leads to electrical phenomena which are identical to seizures. They then “treated” their preparations with drug combinations.

Phenobarbital failed to abolish or depress recurrent seizures in 70% of hippocampi. In contrast, phenobarbital in combination with bumetanide abolished seizures in 70% of hippocampi and significantly reduced the frequency, duration, and power of seizures in the remaining 30%

I am aware of 2 studies examining the effect of bumetanide on human neonatal seizures. The new one is the stimulus for this blog post (Soul JS, et al. A Pilot Randomized, Controlled, Double-Blind Trial of Bumetanide to Treat Neonatal Seizures. Ann Neurol. 2020). In this study, a diverse group of 43 term and late preterm newborns with seizures (about half HIE, the remainder with strokes, intracranial haemorrhage or “other”) who had already received at least 20 and less than 40 mg/kg of phenobarb and were being continuously monitored with video-EEG were randomized to receive either a further 5-10 mg/kg of phenobarb and a placebo or the phenobarb plus a dose of bumetanide (increasing doses 0.1, 0.2 or 0.3 mg/kg). This was considered to be a pilot trial, and therefore the outcomes of interest are to do with the feasibility of a full trial, in this instance the safety and kinetics of bumetanide, with an exploratory outcome of the effects of bumetanide at the 3 doses on seizure burden.

There were 27 babies who received bumetanide and 16 controls; there was some baseline imbalance, with all the stroke babies being in the bumetanide group. Also by chance, there was a higher seizure burden among the bumetanide babies than the controls, mean of 2.5 minutes of seizures per hour compared to 1.1 at baseline; 3.3 compared to 1.6 during the last 2 hours prior to drug administration.

That baseline imbalance means that we should be careful analyzing the data. The phenomenon of regression to the mean implies that we should expect a larger reduction in seizure burden among the bumetanide babies than the controls, just because they started with a higher burden. At first glance, that seems to be what they showed: seizure burden decreased by 1.2 with bumetanide and by 0.1 in controls during the first 4 hours after the study drug. When they looked at the second half of that 4 hour period, the controls had no improvement in seizure burden compared to pre-treatment, whereas the bumetanide babies had a decrease. When analyzed by dose, the higher dose of bumetanide looks more effective.


With the proviso of small numbers, in a situation where the natural history is very variable, and the baseline imbalance, this pilot suggests a potential role for bumetanide in neonatal seizures. The animal model I referred to above is of questionable relevance, so I did a quick search for other models which are potentially more relevant to the babies we see, by which I mean an intact neonatal mammal with an asphyxial injury. The two articles I found have differing results. The first (Cleary RT, et al. Bumetanide enhances phenobarbital efficacy in a rat model of hypoxic neonatal seizures. PLoS One. 2013;8(3):e57148) gave phenobarb and then bumetanide to neonatal rats before a hypoxic insult that normally causes seizures (95% of the rats had at least 5 seizures). Bumetanide alone had an impact on seizure numbers but only at 0.3 mg/kg; phenobarb alone had an impact; phenobarb plus either dose of bumetanide had an additive impact in reducing seizures.

In the other study (Johne M, et al. Phenobarbital and midazolam suppress neonatal seizures in a noninvasive rat model of birth asphyxia, whereas bumetanide is ineffective. Epilepsia. 2020) phenobarb worked when given before the asphyxia, but not afterwards, and bumetanide did not improve the efficacy of phenobarb when given either before or after the asphyxia. However, that study outcome was the proportion of animals with seizures, which was about 100% with phenobarb or phenobarb+bumetanide. Midazolam, given after asphyxia, did decrease the proportion of rats with seizures.

In fact, these two studies aren’t necessarily in conflict; both showed that close to 100% of rats have seizures after asphyxia, which doesn’t change when pretreated with either phenobarb or phenobarb+bumetanide. The Johne study doesn’t report the numbers of seizures.

To return to human beings, the other trial of bumetanide was terminated very early. It was a dose-finding trial which only included babies with asphyxia and seizures who had received 20 mg/kg of phenobarb (Pressler RM, et al. Bumetanide for the treatment of seizures in newborn babies with hypoxic ischaemic encephalopathy (NEMO): an open-label, dose finding, and feasibility phase 1/2 trial. Lancet Neurol. 2015;14(5):469-77). The trial was stopped after 14 babies were enrolled, because they did not reach efficacy and because of some hearing loss. Of note, all the babies except one also had an aminoglycoside and several did not have seizures during the 2 hours baseline recording prior to receiving the bumetanide with an extra 10 mg/kg of phenobarb.

If you actually look at the results in detail, all of the babies who had the higher doses of bumetanide, 0.2 (n=6) or 0.3 (n=1) mg/kg, and who had seizures during the baseline period, had a major reduction (>50%) in seizure burden during the subsequent 4 hours; only 1 of them (who was having 16 minutes of seizures per hour) had another drug (midazolam) during that period. As has been pointed out by Marianne Thoresen (Thoresen M, Sabir H. Epilepsy: Neonatal seizures still lack safe and effective treatment. Nat Rev Neurol. 2015;11(6):311-2), that looks to me like a signal for efficacy. In the NEMO trial, 3 babies of the 11 survivors had hearing loss. In the new trial, there were 2 of 26. In many studies, around 10% of babies who survive therapeutic hypothermia for HIE have hearing loss. The frequency of hearing loss in NEMO is not enormously different.
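One rough way to check whether 3 of 11 is out of line with a background rate of about 10% is an exact binomial tail probability. This is only a sketch: it assumes the 10% figure applies to the NEMO survivors and that each baby's risk is independent, neither of which is established by the trial itself.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more events in n babies if each has risk p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


BACKGROUND_RATE = 0.10  # approximate rate after hypothermia for HIE, as quoted above

nemo_rate = 3 / 11        # hearing loss among NEMO survivors
new_trial_rate = 2 / 26   # hearing loss in the new pilot trial
p_nemo = prob_at_least(3, 11, BACKGROUND_RATE)

print(f"NEMO {nemo_rate:.0%}, new trial {new_trial_rate:.0%}")
print(f"Chance of 3 or more of 11 at a 10% background rate: {p_nemo:.3f}")
```

Under those assumptions, seeing 3 or more affected babies among 11 would happen by chance roughly 9 times in 100, so the NEMO figure is not clearly outside what the background rate would produce.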

Putting all this together, I think there remains a major possibility that bumetanide (or potentially other loop diuretics if they penetrate the brain) is a useful addition to therapeutic phenobarb levels in the neonate, particularly in HIE and perhaps in stroke patients.

Adverse effects of diuretics are few and relatively easy to treat. In infants with asphyxial oliguria, in my experience (I don’t like to say “in my experience”, but there seems to be very little data), they don’t seem to cause significant diuresis, and I don’t use them for that indication. If we can avoid treating asphyxia with antibiotics, and particularly aminoglycosides, then we can probably significantly reduce the chance of hearing loss with loop diuretics.

More trials please.

Do Sub-Ependymal Haemorrhages cause cerebral palsy?

The germinal matrix is a region in the immature brain where a large proportion of cortical neurones are formed before they migrate out to form the neo-cortex. It is intensely metabolically active as it is producing hundreds of thousands of neurones per minute. It has a very high blood flow as a result and large fragile vessels that bleed easily. Such germinal matrix/sub-ependymal haemorrhages are associated in most studies with no discernible effects on long term neurological, developmental or functional outcomes. This has always fascinated me: how can such a critical region of the developing brain be severely damaged in a way which is clearly visible on head ultrasound, without much effect on brain development? The plasticity of the newborn and especially the preterm brain is remarkable.

A new publication from the amazing group in Melbourne (Hollebrandse NL, et al. School-age outcomes following intraventricular haemorrhage in infants born extremely preterm. Arch Dis Child Fetal Neonatal Ed. 2020) suggests that such bleeds might indeed have impacts. Using data accumulated over many years they present the school-age outcomes of 499 babies born at <28 weeks. The results are drawn from 3 cohorts that the group has studied in the state of Victoria, Australia, from 1991, 1997 and 2005.

The high-quality follow-up has been maintained over those time periods and includes IQ testing, academic achievement and executive function evaluation, motor function tests, and examination for cerebral palsy. The authors note that there was no appreciable difference in the results between the 3 time periods.

There were decreasing trends with worsening grade of IVH for multiple birth and antenatal corticosteroid treatment, and increasing trends for male sex, receiving surgery, postnatal corticosteroids, bronchopulmonary dysplasia and cystic PVL

In terms of IQ, academic achievement and executive function there was no apparent difference between babies with grade 1 or 2 IVH and control babies without IVH.

Babies with grade 3 or 4 IVH more often had IQ scores <-2SD (22% and 42% respectively, compared to 12% without IVH), and many more had at least one academic skill below the term norms. (There are only 23 babies with grade 3 and 12 with grade 4 IVH in these cohorts.)

Executive function did not change with grade of IVH.

Although the methods mention the term-born controls on 2 or 3 occasions, they don’t present any comparisons between the term controls and the preterm babies. I think they mention the term controls (always one of the strengths of the publications from this programme) mostly as they are the source of the standardized scores for IQ and academic achievement.

The main finding of the study, to my mind, is the association between low grades of IVH and motor abnormalities, that is: any motor dysfunction, cerebral palsy, or a low MABC score. All are more frequent with grade 1 and in particular grade 2 IVH than among babies without a haemorrhage.

This is consistent with some other studies, but not all. As in all observational studies, we cannot be sure that this association is causative. The analysis was not adjusted for many complications of neonatal care that are known to be associated with poorer outcomes, such as postnatal steroids, BPD, late-onset sepsis, NEC, sex, or surgery. There was however a similar proportion of boys in the no IVH, grade 1 and grade 2 groups, so it would probably make no difference to adjust for sex. There was more BPD (44% controls, 56% grade 1 and 2 combined) and postnatal steroids (32% vs 46%), however, in those 2 groups, both of which are themselves associated with CP and could possibly account for these findings. One of the other studies that examined this issue (Payne AH, et al. Neurodevelopmental outcomes of extremely low-gestational-age neonates with low-grade periventricular-intraventricular hemorrhage. JAMA Pediatr. 2013;167(5):451-9) examined 1472 infants <27 weeks in the NICHD network. They did not show an impact of low-grade IVH on motor outcomes and had a much smaller difference in BPD between the grade 1 and 2 babies (53% with BPD) and the controls without IVH (47%), and no difference in postnatal steroid use (14% in each group). In other studies (EPIPAGE for example) grade 2 but not grade 1 haemorrhages were associated with CP in adjusted analyses.

My evaluation of all this is that it appears that low grade IVH may have an association with motor dysfunction in some cohorts, but it is not clear at all to me that it is causative. It could well be that babies who are somewhat more unstable in the first 3 days are more likely to have low-grade haemorrhage and more likely to develop BPD, more likely to have received postnatal steroids, and probably at somewhat greater risk of white matter injury. All of which may predispose them to develop motor problems. Low-grade IVH could be considered a potential marker for infants that require extra effort to ensure they get followed up and are evaluated repeatedly to see if they need intervention.

Does intravitreal bevacizumab adversely affect long term development? Two simultaneous systematic reviews say yes, or no.

A reliable answer to the above question would require a large multicentre RCT comparing intravitreal bevacizumab (IVB) to laser, powered for long term outcomes. Such a trial does not currently exist.

As a result, 2 groups have just published systematic reviews of the observational studies that have compared outcomes of non-randomized cohorts with bevacizumab compared to either no treatment or to laser therapy.

The 2 reviews come to opposite conclusions!

Kaushal M, et al. Neurodevelopmental outcomes following bevacizumab treatment for retinopathy of prematurity: a systematic review and meta-analysis. J Perinatol. 2020, state “Bevacizumab treatment for severe ROP is associated with increased risk of cognitive impairment and lower cognitive and language scores in preterm infants”.

Tsai CY, et al. Neurodevelopmental Outcomes After Bevacizumab Treatment for Retinopathy of Prematurity-A Meta-Analysis. Ophthalmology. 2020, on the other hand, state “severe NeuroDevelopmental Impairment risk was not increased in ROP patients after IVB treatment. Bayley-III scores were similar in the IVB and control groups, except for a minor difference in motor performance”.

In addition to cohort comparisons there is a small amount of data (n=16) from one of the centres involved in the BEAT-RoP RCT. All of the other studies reviewed were comparisons of non-randomized cohorts.

Such studies are fraught with potential bias. In our centre, for example, we were at first only using IVB for babies with BPD who were extubated and fragile and who we really didn’t want to re-intubate for laser surgery. They were therefore higher risk than laser treated babies. We also were using IVB mostly for babies with posterior disease who are not necessarily comparable to babies with zone 2 retinopathy.

Why would 2 almost simultaneous systematic reviews produce diametrically opposite results?

The first thing I did was to look at the tables with the included studies. Kaushal et al includes 13 studies, whereas Tsai has 8, three of which are not in Kaushal.

The discrepancies seem to be because Tsai included 2 studies that compared outcomes of babies who received IVB to babies with no treatment, and in one of those cases to a second control group of babies without retinopathy. You would think that such studies would show a difference in outcomes between IVB and control but in fact they showed very little. Those studies were not eligible for the review of Kaushal et al.

Kaushal includes 2 studies only reported as abstracts, which were not in Tsai’s publication list, and included 2 studies published in 2020 which may have appeared after Tsai finished their literature review. In addition 2 of the studies in Kaushal’s review only supplied mortality data, and one other does not appear to have supplied any data used in their analyses. The main difference in data sources, therefore, seems to be that Kaushal included Zayek et al and Arima et al from 2020, whereas Tsai included the above-mentioned studies with untreated controls.

As for the results, the definitions of “Severe Neurodevelopmental Impairment” are similar in the 2 reviews, and both reviews conclude that the 95% CI include an RR of 1.0, but Tsai’s analysis includes 5 studies and an RR of 1.52 (95% CI 0.91, 2.54) whereas Kaushal includes 3 studies (only 2 of which are in the Tsai analysis) and an RR of 1.33 (95% CI 0.74, 2.39).

As for the scores on the cognitive composite of the Bayley 3 evaluation, the Kaushal review, based on 6 studies, shows that cognitive scores are 1.8 points less with IVB than laser (95% CI -3.5, -0.1; the figure axis title wrongly states that this result “favours IVB”), whereas Tsai et al also have 6 studies of IVB vs laser (only 3 of the studies are in both reviews) and cognitive scores 1.69 points less with IVB (95% CI -4.9, +1.6). The 2 studies in Tsai’s review that compared IVB to no treatment are calculated separately as showing little difference with wide confidence intervals (-2.6, 95% CI -8.2, +3).

In a similar way, but with a more marked difference, the scores on the Bayley 3 language composite are lower in the Kaushal review, 5.4 points less with IVB than laser (95% CI -9.2, -1.6), but in the Tsai review the difference in scores is only 1.36, (95% CI -5.5, +2.8).
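The whole disagreement between the 2 reviews comes down to which confidence intervals happen to exclude “no difference”. A small Python sketch that lines up the intervals quoted above and checks each against its null value (1.0 for a risk ratio, 0 for a difference in scores); the labels are mine, and only the confidence limits from the text are used:

```python
# (label, lower 95% limit, upper 95% limit, null value)
reported_intervals = [
    ("Kaushal severe NDI (RR)", 0.74, 2.39, 1.0),
    ("Tsai severe NDI (RR)", 0.91, 2.54, 1.0),
    ("Kaushal cognitive", -3.5, -0.1, 0.0),
    ("Tsai cognitive", -4.9, 1.6, 0.0),
    ("Kaushal language", -9.2, -1.6, 0.0),
    ("Tsai language", -5.5, 2.8, 0.0),
]


def excludes_null(lower, upper, null):
    """True when the null value lies outside the interval."""
    return not (lower <= null <= upper)


excluding = [
    label
    for label, lower, upper, null in reported_intervals
    if excludes_null(lower, upper, null)
]

for label, lower, upper, null in reported_intervals:
    verdict = "excludes" if excludes_null(lower, upper, null) else "includes"
    print(f"{label}: ({lower}, {upper}) {verdict} {null}")
```

Only the two Kaushal score comparisons exclude zero, and the cognitive one only just; the point estimates in the 2 reviews all lean in the same direction.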

What does this all mean? Basically, I don’t think you can rely on the results of these SRs to give an answer to the question. Systematic reviews of observational studies suffer from the same problems as the observational studies they are based on. Differences in characteristics of the babies treated with either therapy are likely, and, no matter how the data are adjusted, such biases remain.

Long-term visual outcomes are clearly better with IVB, with much lower rates of severe myopia. I think all that you can say about long term developmental and neurological outcomes is that there remains a concern that there could be adverse impacts of IVB, but the data collected so far are conflicting. I think we should give parents a choice when retinopathy treatment is required, informing them that for aggressive or posterior disease there are advantages of IVB, and also major unknowns for the long term. Of course the ophthalmologists treating the babies have to agree to that also!

Clearly, the large multicentre RCT, powered for long term outcomes, that I mentioned at the beginning of this post, is needed. These systematic reviews suggest that such a trial should be powered to find a 5 point difference in cognitive scores on the Bayley version 3, which would need close to 150 patients per group. Alternatively, it could be powered to find a 10% difference in the proportion of children with neurological impairment or developmental delay; in this high-risk group that would need somewhere in the region of 200 babies per group, depending on what the hypothesized baseline rate is. Those sample sizes seem achievable to me without too much difficulty, and I think this should be considered a priority for our community.
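For anyone wanting to see where numbers of that order come from, here is a minimal Python sketch using the usual normal-approximation formulas for 2 equal groups. Everything not in the text above is an assumption: a 2-sided alpha of 0.05, 80% power, an SD of 15 for the Bayley composite, and a purely hypothetical baseline rate of 20% falling to 10% for the dichotomous outcome. It also ignores loss to follow-up.

```python
from math import ceil
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the normal quantiles for a 2-sided alpha and the desired power."""
    z = NormalDist().inv_cdf
    return z(1 - alpha / 2) + z(power)


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference in means of `delta`."""
    return ceil(2 * (_z_sum(alpha, power) * sd / delta) ** 2)


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between 2 proportions."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(_z_sum(alpha, power) ** 2 * variance / (p1 - p2) ** 2)


# 5 point difference in a score with an assumed SD of 15.
n_cognitive = n_per_group_means(delta=5, sd=15)

# 10% absolute difference, with hypothetical rates of 20% vs 10%.
n_impairment = n_per_group_proportions(0.20, 0.10)

print(n_cognitive, n_impairment)
```

With a higher assumed baseline rate (for example 30% vs 20%) the second number rises substantially, which is why the hypothesized baseline matters so much.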

Type 1 RoP with plus disease (A) and after laser surgery (B). Hwang CK, et al. Outcomes after Intravitreal Bevacizumab versus Laser Photocoagulation for Retinopathy of Prematurity: A 5-Year Retrospective Analysis. Ophthalmology. 2015;122(5):1008-15.
