Haplotype analysis and genetic association study
We used in silico data generated from a previously reported
fine-mapping GWAS on genetic predictors of penicillin allergy using the
Illumina Immunochip array (Illumina Inc. CA, USA) that covers the HLA
loci.11 We extracted from the full dataset the
genotypic data corresponding to the genetic variants located in the HLA-DRB3 locus and its vicinity. The HLA-DRB3 locus is not
referenced according to the GRCh38/hg38 build, but rather according to
the Homo sapiens chromosome 6 genomic contig, GRCh38 reference
assembly alternate locus group ALT_REF_LOCI_220,21 with the following coordinates: hg38
chr6_GL000251v2_alt:3,934,009-3,947,126. Since the genomic positions
of the Illumina Immunochip array were initially reported according to
the NCBI36 build, we used the Liftover tool22 from the
UCSC Genome Browser database to convert the genomic position of the HLA-DRB3 locus from the GRCh38 reference assembly to the NCBI36
build (chr6:32,571,675-32,584,792). Because of the complex structure and
high level of linkage
disequilibrium (LD) in the HLA locus,23 we considered
all the genetic variants located in the intergenic region between the HLA-DRA and HLA-DRB5 genes that included the HLA-DRB3 gene. Sample quality-control measures included: sample
call rate (>90%), overall heterozygosity, and relatedness
testing. We assessed cryptic relatedness using identity-by-descent
analysis. Genetic variants were removed from the primary analysis if
they had a call rate <90%, a significant departure from
Hardy-Weinberg equilibrium (exact HWE-P <
10⁻⁴ among controls), or a minor allele frequency
<5%. We performed the genetic association analysis according
to the allelic model. We completed the haplotype association analysis
using a moving window with a fixed width of 4 markers. We performed
LD-pairwise analysis on all adjacent pairs of genetic variants using a
matrix output for both the expectation-maximization (EM) algorithm and
the composite-haplotype method.24,25 We used D′ values
in the LD plots. We estimated haplotype frequencies using the EM
algorithm with a maximum of 50 EM iterations and an EM convergence
tolerance of 0.0001.26 We compared haplotype
frequencies using the Chi-squared test and reported the corresponding
odds ratio (OR), the 95% confidence interval, and the associated P-value for
each haplotype. Given the exploratory nature of our analysis, we
considered a genomic region as potentially relevant if it encompassed
genetic variants that were significantly associated with the risk of
delayed hypersensitivity to penicillins in both per-variant and
per-haplotype association analyses. All statistical analyses were
performed using the SNP & Variation Suite (Golden Helix, Inc., Bozeman,
MT, USA).
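
The variant-level quality-control thresholds described above (call rate, exact HWE test among controls, minor allele frequency) can be illustrated with a minimal Python sketch. This is not the SNP & Variation Suite implementation. The function names, the `(hom1, het, hom2, missing)` count layout, and the choice to compute call rate and minor allele frequency over cases and controls combined are assumptions made for illustration. The exact test uses a commonly used recursion over possible heterozygote counts.

```python
def hwe_exact_p(n_hom1, n_het, n_hom2):
    """Exact Hardy-Weinberg P-value from genotype counts at one biallelic variant."""
    n = n_hom1 + n_het + n_hom2
    if n == 0:
        return 1.0
    hom_rare = min(n_hom1, n_hom2)
    rare = 2 * hom_rare + n_het  # copies of the rarer allele
    probs = [0.0] * (rare + 1)   # relative probability of each heterozygote count

    # Start near the expected heterozygote count, matching the parity of `rare`.
    mid = rare * (2 * n - rare) // (2 * n)
    if mid % 2 != rare % 2:
        mid += 1
    probs[mid] = 1.0

    # Walk down from `mid` towards fewer heterozygotes.
    hets, homr = mid, (rare - mid) // 2
    homc = n - hets - homr
    while hets >= 2:
        probs[hets - 2] = (probs[hets] * hets * (hets - 1)
                           / (4.0 * (homr + 1) * (homc + 1)))
        hets, homr, homc = hets - 2, homr + 1, homc + 1

    # Walk up from `mid` towards more heterozygotes.
    hets, homr = mid, (rare - mid) // 2
    homc = n - hets - homr
    while hets <= rare - 2:
        probs[hets + 2] = (probs[hets] * 4.0 * homr * homc
                           / ((hets + 2) * (hets + 1)))
        hets, homr, homc = hets + 2, homr - 1, homc - 1

    # P-value: total probability of outcomes no more likely than the observed one.
    p = sum(x for x in probs if x <= probs[n_het]) / sum(probs)
    return min(1.0, p)


def variant_passes_qc(case_counts, ctrl_counts,
                      min_call_rate=0.90, min_maf=0.05, hwe_alpha=1e-4):
    """Apply the three variant filters; counts are (hom1, het, hom2, missing)."""
    called = missing = alt_alleles = 0
    for hom1, het, hom2, miss in (case_counts, ctrl_counts):
        called += hom1 + het + hom2
        missing += miss
        alt_alleles += het + 2 * hom2
    if called == 0 or called / (called + missing) < min_call_rate:
        return False
    alt_freq = alt_alleles / (2 * called)
    if min(alt_freq, 1 - alt_freq) < min_maf:
        return False
    # HWE is tested among controls only, as stated in the text.
    return hwe_exact_p(*ctrl_counts[:3]) >= hwe_alpha
```

For example, `variant_passes_qc((25, 50, 25, 0), (25, 50, 25, 0))` keeps the variant, whereas the same genotype counts with 20 missing calls per group fail the 90% call-rate threshold.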
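
The per-variant test under the allelic model compares allele counts, rather than genotype counts, between cases and controls. The sketch below shows one standard way to compute it from a 2x2 table: a Pearson chi-squared test with one degree of freedom, and an odds ratio with a Woolf (log-scale) 95% confidence interval. The function name, the argument order, and the absence of a continuity correction are assumptions; the source does not state how the software computes the interval.

```python
import math


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic chi-squared test and odds ratio from a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    # Pearson chi-squared statistic, 1 degree of freedom, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Odds ratio for the minor allele with a Woolf 95% confidence interval.
    # Assumes all four cells are non-zero.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return {"chi2": chi2, "p": p_value, "or": odds_ratio,
            "ci95": (ci_low, ci_high)}


# Hypothetical counts: 30/70 minor/major alleles in cases, 15/85 in controls.
result = allelic_association(30, 70, 15, 85)
```

The same 2x2 layout applies to the per-haplotype comparison if the "minor allele" counts are replaced by counts of a given haplotype versus all other haplotypes.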
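
The moving window of 4 markers used for the haplotype association analysis can be pictured as follows. The step size of one marker and the marker names are assumptions; the text specifies only the fixed width.

```python
def sliding_windows(markers, width=4):
    """Return every run of `width` consecutive markers, advancing one marker at a time."""
    return [tuple(markers[i:i + width])
            for i in range(len(markers) - width + 1)]


# Hypothetical marker names, ordered by genomic position.
ordered_markers = ["snp1", "snp2", "snp3", "snp4", "snp5", "snp6"]
windows = sliding_windows(ordered_markers)
```

Each window would then be passed to haplotype frequency estimation and tested separately, so a region with m markers yields m - 3 windows.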
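
To make the EM settings concrete (at most 50 iterations, convergence tolerance 0.0001), the sketch below estimates haplotype frequencies for a single pair of biallelic variants from unphased genotype counts and derives D′ from them. It is a simplified two-marker illustration, not the software's algorithm, and it does not cover the composite-haplotype method. The 3x3 count layout, the haplotype labels, the uniform starting frequencies, and the convergence criterion (largest absolute change in any frequency) are assumptions.

```python
def em_haplotype_freqs(n, max_iter=50, tol=1e-4):
    """Estimate two-marker haplotype frequencies by EM.

    n[i][j] is the number of individuals carrying i copies of allele `a`
    at marker 1 and j copies of allele `b` at marker 2 (i, j in 0..2).
    """
    total_haps = 2 * sum(sum(row) for row in n)
    # Haplotype counts that are unambiguous given the genotypes.
    base = {
        "AB": 2 * n[0][0] + n[0][1] + n[1][0],
        "Ab": 2 * n[0][2] + n[0][1] + n[1][2],
        "aB": 2 * n[2][0] + n[1][0] + n[2][1],
        "ab": 2 * n[2][2] + n[2][1] + n[1][2],
    }
    double_het = n[1][1]  # phase is ambiguous only for double heterozygotes
    freqs = {h: 0.25 for h in base}
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is AB/ab rather than Ab/aB.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies from expected haplotype counts.
        new = {
            "AB": (base["AB"] + double_het * w) / total_haps,
            "ab": (base["ab"] + double_het * w) / total_haps,
            "Ab": (base["Ab"] + double_het * (1 - w)) / total_haps,
            "aB": (base["aB"] + double_het * (1 - w)) / total_haps,
        }
        delta = max(abs(new[h] - freqs[h]) for h in new)
        freqs = new
        if delta < tol:
            break
    return freqs


def d_prime(freqs):
    """Absolute D' computed from two-marker haplotype frequencies."""
    p_major1 = freqs["AB"] + freqs["Ab"]  # frequency of allele A at marker 1
    p_major2 = freqs["AB"] + freqs["aB"]  # frequency of allele B at marker 2
    d = freqs["AB"] - p_major1 * p_major2
    if d >= 0:
        d_max = min(p_major1 * (1 - p_major2), (1 - p_major1) * p_major2)
    else:
        d_max = min(p_major1 * p_major2, (1 - p_major1) * (1 - p_major2))
    return abs(d) / d_max if d_max > 0 else 0.0


# Hypothetical genotype counts in which alleles A and B always travel together.
counts = [[50, 0, 0],
          [0, 40, 0],
          [0, 0, 10]]
hap_freqs = em_haplotype_freqs(counts)
ld = d_prime(hap_freqs)
```

In this constructed example the two markers are in complete LD, so the estimated frequencies concentrate on the AB and ab haplotypes and D′ approaches 1.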