Results
Database searches found 288 results in total (Figure 1). The citation
manager automatically removed 27 duplicates, leaving 261 titles and
abstracts, which were independently reviewed by two authors. This
resulted in assessment of 58 full texts. Six authors were contacted to
clarify key information, of whom three responded with
clarification. This gave 22 publications with sufficient detail to be
included in the final analysis and risk of bias assessment (Figure 2)
(39-60). No amendments were made to the registered protocol.
Characteristics of all included publications are summarized in Table 1.
Across the 22 included manuscripts there were results for 897 penicillin
allergy cases (median 28, range 2 – 158; Table 1). The majority of
cohorts were from Europe (n=20, 91%) and two (9%) were from the USA. Nearly
all studies (20 of 21 that included this information, 95%) were based
in dedicated specialist Allergy Centers/Units. Time interval from most
recent reaction to time of BAT was reported in 19 (83%) studies, with
the maximum time for any one study up to 540 months. Time from sample
collection to sample processing was only reported in nine (41%) of the
studies. Of these, one (11%) reported “immediate” analysis, one
(11%) reported “<2 hours”, and six (67%) reported
<24 hours. Penicillin allergy definition was based on European
allergy diagnostic criteria as outlined by EAACI or ENDA in eight (36%)
of the studies. Clinical history plus at least one of skin test results,
sIgE or drug provocation tests was used in a further 11 studies (50%).
History alone was used in three (14%).
Sensitivity and specificity values for all 22 included studies, their
risk of bias and applicability concerns are presented in Figure 2. The
SI threshold for positivity varied across the publications (2, 2.5 and 3
were all used). An estimation of a summary receiver operator
characteristic (SROC) curve was generated using results from all 22
studies (Figure 3). Higgins’ I² for heterogeneity
was 55.3% (95% confidence interval (CI), 27.9% – 72.4%),
indicating moderate between-study heterogeneity, with
τ² equal to 0.2522. The p-value for Cochran's Q statistic was <0.0001,
suggesting that the heterogeneity is statistically
significant. Twelve of the studies used an SI of 2 as the positive threshold
for the diagnostic test. This allowed calculation of a summary point
sensitivity of 51% (95% CI, 46% – 56%) and specificity of 89%
(95% CI, 85% – 93%), with AUC 0.666, I² 14.4% (95%
CI, 0% – 54%), τ² 0 and p = 0.30 (Figure 4).
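For readers less familiar with these statistics, Higgins' I² is conventionally related to Cochran's Q by I² = max(0, (Q − df)/Q), where df is the number of studies minus one. The Python sketch below illustrates that relationship only. The function names are hypothetical, the Q values are illustrative and not taken from this review, and the review's own estimates may have come from different software or a different estimator.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Return Higgins' I^2 as a percentage, from Cochran's Q and the study count."""
    df = n_studies - 1  # degrees of freedom
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


def q_implied_by(i2_percent: float, n_studies: int) -> float:
    """Invert the formula: the Q that would yield a stated I^2 (0 <= I^2 < 100)."""
    if not 0.0 <= i2_percent < 100.0:
        raise ValueError("I^2 must be in the range [0, 100)")
    df = n_studies - 1
    return df / (1.0 - i2_percent / 100.0)


# Illustrative values only (not results from the review).
print(i_squared(42.0, 22))  # Q twice the degrees of freedom
print(i_squared(10.0, 22))  # Q below the degrees of freedom is truncated to zero
```

Because I² depends only on Q and the number of studies, restricting the analysis to fewer, more homogeneous studies (as with the SI threshold of 2 subgroup) can lower I² substantially.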
From the 22 manuscripts reporting both on sensitivity and specificity,
six reported results for two different BAT assay types. Of the 28 assay
types reported, 18 (64%) used flow cytometric analysis of activation of
basophils collected directly from the patient. Four (14%) measured
sulfidoleukotriene production. Two (7%) manuscripts measured histamine
release. The remainder used indirect assays or direct observation, in which basophil morphology
was examined under a microscope to determine whether basophils had been
activated. The different methods had similar sensitivity and specificity
profiles as can be seen in comparison of SROC curves (Figure 5) and as
seen by an even spread across the SROC curve of all 22 studies in Figure
3.
The minimum number of basophils required for a sample to be analyzed was
reported in 16 studies (73%) and was lowest in the very earliest BAT
studies (from 1963 (54), 1964 (59) and 1986 (53)), where only 20
basophils were required to be seen per mm² of the
microscope field. In the immediately analyzed whole blood assays that
used flow cytometry there was a median value of 500 basophils required
per sample, with a range from 200 to 1000. Eleven studies used an SI
threshold of 2 and reported the minimum number of basophils used
in their assay, which allowed estimated summary points for sensitivity and
specificity to be generated (Figure 6). The use of a minimum of 1000
basophils per test (sensitivity 0.47 (95% CI, 0.39 – 0.56) and specificity 0.89
(95% CI, 0.78 – 0.95)) did not seem to confer any improvement
in sensitivity or specificity over the use of a lower minimum of 500
(sensitivity 0.47 (95% CI, 0.22 – 0.73) and specificity 0.91 (95% CI,
0.84 – 0.95)).
All studies were of high or at least unclear risk of bias. The most
frequent source of potential bias was due to the patient selection
process with 14 of 22 studies (64%) rated as high risk in this domain
(Figure 2). This was largely due to the fact that most studies did not
specify how patients were identified, or if consecutive patients were
used, and several looked at only a few selected patients. Although
careful patient selection may bias the results, it did mean that
most of the studies accurately identified patients with immediate
penicillin allergy.
In keeping with GRADE guidance on grading the certainty of evidence in
diagnostic test accuracy, we have considered the domains of imprecision
and publication bias (32). There was considerable inconsistency in the
reported sensitivity (ranging from 0.23 to 0.94) with minimal
overlapping of the 95% CI (Figure 2). This did however improve when we
considered only those studies with a positive SI threshold of 2 (Figure
4). Specificity was found to be fairly consistent (ranging from 0.67 to
0.99). The specificity also demonstrated extensive overlapping of 95%
CI (Figure 2), suggesting good consistency. Although there was variation
in CI width for the reported sensitivity, the majority of studies (16 of
22, 73%) showed a 95% CI that was entirely above the sensitivity of
0.19 seen with sIgE, which is the clinical comparison which we hope to
improve upon with BAT. The 95% CI for specificity were much narrower
than for sensitivity, demonstrating no need to lower the grading of the
certainty of the evidence based on imprecision.
Publication bias was assessed for all 22 studies using a funnel plot (Figure 7).
Subjectively this does not seem to be symmetrical, which suggests that
there may be evidence of publication bias. However, it is recognized that
funnel plots may overestimate publication bias in meta-analyses of
diagnostic test accuracy (36). Given that papers both for and against
the use of BAT in clinical practice were published, studies were in the
majority not funded by organizations with for-profit interest, and the
authors are unaware of any unpublished studies in this field, it was
felt that publication bias was not a reason to downgrade the certainty
of evidence. Although one study showed BAT was more likely to be
positive in those with a severe reaction (41), there is no clear
relationship between the degree of BAT positivity and the severity of
the index allergic reaction across other studies. We felt this work did
not show any sensitivity-specificity relationship, and have therefore
not upgraded the certainty of evidence. Overall GRADE certainty of the
evidence for sensitivity is “very low”, and for specificity is
“low”, suggesting “the true effect might be markedly different from
the estimated effect”. The sensitivity grading was lowered on the grounds of marked
inconsistency.
Discussion
This work primarily highlights the significant heterogeneity of methods
used in BAT and results gained by the use of BAT in penicillin allergy.
As a summary point should only be calculated using methods with the same
positive threshold, our primary finding from this work is that, with flow
cytometric analysis and an SI threshold of 2, BAT in penicillin allergy has an
estimated summary point sensitivity of 51% (95% CI, 46% – 56%) and
specificity of 89% (95% CI, 85% – 93%). When compared to sIgE, another
in vitro diagnostic recognized for use in penicillin allergy
diagnosis, BAT shows improved sensitivity (sIgE sensitivity of 19.3%
(95% CI, 12.0% – 29.4%)) but lower specificity (sIgE specificity of
97.4% (95% CI, 95.2% – 98.6%)) (13).
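One way to put the sensitivity and specificity trade-off between BAT and sIgE on a common scale is to convert each pair into positive and negative likelihood ratios, using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below is illustrative only. The function name is hypothetical, the inputs are the summary point estimates quoted above, and the calculation ignores the confidence intervals around them.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Summary point estimates quoted in the text (point values only).
for name, sens, spec in [("BAT", 0.51, 0.89), ("sIgE", 0.193, 0.974)]:
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

A higher LR+ favors a test for ruling in allergy, and an LR− closer to zero favors it for ruling out allergy, which is the sense in which the two tests are compared here.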
Limitations
One recurring theme across all 22 papers included was that there was a
significant risk of bias through patient selection (Figure 2). The
majority of papers only included final results on patients with definite
immediate allergy compared to control groups with no history of allergy
and able to tolerate oral penicillin. This aids clarity in understanding
what a diagnostic test is showing, but it is not applicable to clinical
practice, where indeterminate results and alternative diagnoses, such as
delayed drug hypersensitivity and chronic spontaneous urticaria,
complicate the clinical picture. Future work to overcome this issue
should be undertaken, with prospectively collected consecutive samples
from participants with suspected penicillin allergy who are all offered
the gold standard specialist work up.
Another limitation is that, while many of the studies confirmed that
patients were classified according to the European Academy of Allergy
and Clinical Immunology (EAACI) or European Network for Drug Allergy
(ENDA) guidelines, not all participants will have had exactly the same
series of tests as part of this assessment. For example, a participant
with a positive skin prick test or sIgE will not have gone on to
undergo a DPT. Furthermore, it is now well documented that skin
testing can also lead to false positives, with a recent meta-analysis
reporting a summary sensitivity of 31% (95% CI, 19% – 46%) and a
specificity of 97% (95% CI, 94% – 98%) (13). The
definition of an “immediate reaction” also varied, ranging from any reaction
within 30 minutes of drug administration (45, 50) to those occurring
within 24 hours (61).
It is understandable that the majority (91%) of these participants were
recruited from Allergy Centers after an outpatient
referral for assessment. While there has been work looking at
de-labelling inpatients with DPT, no studies reported BAT results from
an inpatient setting. Future work is required to explore if BAT can be
used in different clinical settings, such as an emergency department, or
admissions ward, or other outpatient facility, other than a highly
specialized allergy clinic.
The time between the last reaction and BAT assessment also varied widely
between studies, and also within studies. It was therefore not possible
to undertake any sub-group analysis to assess how time from the last
reaction may have influenced the BAT outcome. As one potential use
for BAT would be to “rule-out” penicillin allergy in a person with a
distant history of reaction, it would be important to know if a BAT
result is reliable many years after the last penicillin exposure. A
study published by Fernandez et al. showed that BAT reactivity decreased
significantly even over a four-year study period (62). Only 1 of 41
patients was BAT positive at the four-year mark. It does suggest that
perhaps the clinical utility of BAT as a “rule-out” test may be
limited to those that have been referred to an allergy service as soon
as possible after the reaction. This may well complement the current
shift in practice toward direct DPT in low-risk patients with a distant
history of penicillin reaction. BAT could be used in more severe
reaction settings, such as severe intraoperative reactions where
multiple drugs are given at the same time. If a BAT is negative, it may
provide reassurance to allow a patient to undergo DPT and be
de-labelled. However, with its high specificity, BAT may be a good
“rule-in” test and, if positive, could save patients from having a
potentially harmful positive DPT. Future studies looking at the use of
BAT as a diagnostic test should be clear about the time from reaction
for the samples analyzed, as this may have a significant effect on the
BAT outcome. Further work exploring the clinical relevance of the
negativisation rates of BAT is warranted.
Analysis of methods
To denote a positive BAT result, Salas in 2018 used an SI of 1.5 based
on a ROC curve analysis comparing penicillin allergic and control
results. However, Dreborg commented in 2018 that this was a concern, as
SI should be at least 2 (63). The EAACI 2015 position paper calls for an
effort to be made to standardize the BAT assay, and as such future work
should keep an SI of 2 as the positive threshold to allow comparison of
results across different groups.
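To make the threshold discussion concrete, the sketch below shows how a positive or negative BAT call might be derived from an SI. It assumes the common definition of SI as the percentage of activated basophils after allergen stimulation divided by the percentage in the unstimulated negative control. That definition is an assumption here and is not stated in this section, and individual studies may define or gate it differently. The function names and example percentages are hypothetical.

```python
def stimulation_index(pct_activated_stimulated: float,
                      pct_activated_control: float) -> float:
    """SI = % activated basophils with allergen / % activated in negative control.

    ASSUMPTION: this ratio definition is the commonly used one, not taken
    from the studies reviewed here.
    """
    if pct_activated_control <= 0:
        raise ValueError("negative-control activation must be greater than zero")
    return pct_activated_stimulated / pct_activated_control


def bat_positive(si: float, threshold: float = 2.0) -> bool:
    """Call the test positive when SI is at least the threshold (default 2)."""
    return si >= threshold


# Hypothetical sample: 4.5% activation with allergen, 3.0% in the control.
si = stimulation_index(4.5, 3.0)
print(si, bat_positive(si), bat_positive(si, threshold=1.5))
```

The same sample can be classed as negative at a threshold of 2 and positive at 1.5, which is why pooling results across studies requires a shared threshold.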
Information on minimum number of identified basophils required for any
single BAT test was not available for all studies. The subgroup analysis
showed the summary sensitivity and specificity were extremely similar,
suggesting no difference between the use of 500 or 1000 basophils. Using
a lower minimum would be much more efficient when working with a precious
resource such as basophils isolated from whole blood. This is
clinically useful information as using a lower minimum required number
of basophils will increase the chances of collecting a usable sample
from a patient.
Abuaf et al., in 2008, compared CD63 and CD203c as markers of basophil
activation and suggested that CD203c was potentially a better marker
(39). This was repeated more recently by Heremans et al. in 2022, who
also showed similar results, suggesting CD203c may give a slightly
improved sensitivity (60). The subgroup analysis comparing these two
methods against results from CD63 showed no statistically significant
difference between the methods.
The study by Molina et al. (49) looked at the use of novel dendrimeric
antigens (DeAns) as carrier molecules for benzylpenicilloyl and
amoxicilloyl in dense and stable hapten-carrier conjugates. This did not
provide any diagnostic benefit above the use of benzylpenicilloyl,
amoxicilloyl or free penicillin in BAT in this small sample.
Clinical use
A questionnaire sent out in 2007 to allergists across the world
suggested 54% of responders used BAT in the work up of drug
hypersensitivity (64). A 2018 world-wide survey of the cost of allergy
assessment found the median cost for BAT at €90, with only DPT
costing more than BAT at €190 (25). A cost analysis from the same
group concluded that, despite the cost, widespread penicillin allergy
testing with ST and DPT would be cost saving due to the use of more
targeted antibiotics, fewer courses of antibiotics, fewer outpatient
visits and fewer hospital days for those admitted (65). The current role
for BAT in clinical practice would therefore be to decrease the number
of DPT that need to be performed with their associated cost and risk.
The current order in which BAT is suggested to be used in penicillin
allergy is before ST for patients with a high-risk history, and after ST
for low-risk patients (28). As the sensitivity of BAT was better than
skin prick testing (51% vs 30%), and the specificity slightly lower
(89% vs 97%), this paper would support the use of BAT to improve the
sensitivity of allergy investigations, and reduce the number of patients
requiring DPT to exclude penicillin allergy (Figure 8).
Some studies suggest that the use of both sIgE and BAT together improves
sensitivity (41, 66). However, this opinion is not universally held, as
some groups have shown no improvement in sensitivity with the use of
sIgE and BAT together, and do not support the use of both methods (67).
The 2020 EAACI position paper suggests that “it is advisable to perform
in vitro tests in addition to ST in high-risk patients in order
to improve the sensitivity of the allergy workup and thus reduce the
need for DPT (moderate/strong)”, but does not clarify if one or both
tests should be done, or which test is preferred (16). BAT shows clearly
improved sensitivity above sIgE (51% vs 19%) (13). However, including
BAT and sIgE with their respective specificities of 89% and 97% would
still mean a small proportion of patients may erroneously be considered
positive for penicillin allergy after optimal assessment, despite being
able to tolerate penicillin. For BAT to become a routine part of the
diagnostic work up for penicillin, it must either have a sensitivity
that is high enough for it to be used as a screening test, or a
specificity higher than skin test or sIgE (>97%).
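The effect of using BAT and sIgE together can be illustrated with a simple calculation. The sketch below assumes the workup is called positive if either test is positive, and that the two tests are conditionally independent given true allergy status. Independence is an assumption made purely for illustration and is not supported by the source. The conflicting findings on combined testing suggest the results may be correlated in practice, in which case the real gain in sensitivity would be smaller. The function name is hypothetical.

```python
from typing import Tuple


def combine_either_positive(sens_a: float, spec_a: float,
                            sens_b: float, spec_b: float) -> Tuple[float, float]:
    """Combined (sensitivity, specificity) for an 'either test positive' rule.

    ASSUMPTION: the two tests are conditionally independent given true
    allergy status. This is illustrative, not a finding of the review.
    """
    # A true case is missed only if both tests miss it.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A non-allergic patient is correctly negative only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Summary point estimates quoted in the text: BAT (51%, 89%), sIgE (19%, 97%).
sens, spec = combine_either_positive(0.51, 0.89, 0.19, 0.97)
print(f"combined sensitivity = {sens:.3f}, combined specificity = {spec:.3f}")
```

Under this rule the combined specificity can never exceed that of the less specific test, which is the arithmetic behind the concern that some tolerant patients would still be labelled allergic.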
Alternatively, another potential use of BAT could be as an in vitro
diagnostic option for identifying clavulanic acid-specific allergy.
Since hypersensitivity reactions to the amoxicillin-clavulanic acid
combination are very common, being able to determine if it is
clavulanic acid eliciting the allergic reaction would rescue amoxicillin
use as a single-drug formulation. To date there is no commercially
available sIgE to clavulanic acid. In two recent studies BAT was able
to successfully diagnose clavulanic acid allergy in an adult population
(41, 68). This is another way that BAT can be used to accurately
determine true amoxicillin or clavulanic acid allergy.