The IPD-IMGT/HLA Database – New developments in reporting HLA variation
Introduction
The Immuno Polymorphism Database (IPD) is a set of specialist databases related to the study of polymorphic genes in the immune system. The primary database within the IPD project is the IPD-IMGT/HLA Database, a locus-specific database for the hyperpolymorphic allelic sequences of the genes in the HLA system, also known as the human Major Histocompatibility Complex (MHC). The HLA complex is a region of ∼4 Mb on the short arm of chromosome 6 (6p21) containing >220 genes, many of which contribute to immune defence against infection [1]. The MHC is one of the most complex and polymorphic regions of the human genome [1]. The core genes of interest in the HLA system are 21 highly polymorphic HLA genes, whose protein products mediate human responses to infectious disease and influence the outcome of cell and organ transplants. These genes encode the hyperpolymorphic HLA class I (HLA-A, -B and -C) and class II (HLA-DP, -DQ and -DR) molecules, which diversify immunity within human populations and form the major genetic barriers to clinical transplantation of cells, tissues and organs. The ontology of the HLA alleles has been developed continuously since 1968 [2], when work in this area began. Thirty years later, the IPD-IMGT/HLA Database was established to serve this purpose [3]. The database continued the work of an initial phase of development in which HLA class I and II sequences were published by the WHO Nomenclature Committee [4], [5], [6], [7], [8]. The first public release of the IPD-IMGT/HLA Database was made in December 1998. In the 17 years since that release [9], over 21,000 submissions have been accepted for inclusion in the database, leading to the definition of 14,000 HLA alleles by the end of November 2015. Since its inception, the database has been updated every three months, with over 66 major releases, to include all the publicly available sequences officially named by the WHO Nomenclature Committee.
A driving force behind the development and continued success of the IPD-IMGT/HLA Database is its use by the transplantation community. The HLA molecules, whose polymorphic variants are stored in the database, play a key role in transplantation: the success of kidney and bone marrow transplantation correlates with the degree to which donor and recipient are HLA matched. HLA matching is recognised as a critical determinant of outcome for patients receiving unrelated donor haematopoietic stem cells for haematological disorders [10]. This has led to progressive improvements in the level of resolution achieved by HLA class I and II typing methods. HLA alleles are now routinely characterised using molecular typing methods and, looking ahead, next-generation sequencing techniques [11], [12], [13]. HLA typing focuses on distinguishing both synonymous and non-synonymous differences in the nucleotide sequences encoding the protein domains of HLA class I and II that bind peptides and interact with variable lymphocyte receptors. These improvements have required the development of a nucleotide sequence database that is both accurate and comprehensive.
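The distinction between synonymous and non-synonymous differences can be illustrated with a minimal sketch. It assumes two coding sequences that are already aligned, in frame and of equal length, and it uses the standard genetic code. The function name `classify_differences` and the toy sequences are hypothetical illustrations; they are not real HLA alleles and are not part of the database tools.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_differences(reference, variant):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Returns a list of (codon_number, reference_codon, variant_codon, kind),
    where kind is "synonymous", "non-synonymous" or "unknown" (the last for
    codons containing gaps or ambiguity codes). Codon numbers start at 1.
    """
    reference, variant = reference.upper(), variant.upper()
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    if len(reference) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")

    differences = []
    for start in range(0, len(reference), 3):
        ref_codon = reference[start:start + 3]
        var_codon = variant[start:start + 3]
        if ref_codon == var_codon:
            continue
        ref_aa = CODON_TABLE.get(ref_codon)
        var_aa = CODON_TABLE.get(var_codon)
        if ref_aa is None or var_aa is None:
            kind = "unknown"
        elif ref_aa == var_aa:
            kind = "synonymous"
        else:
            kind = "non-synonymous"
        differences.append((start // 3 + 1, ref_codon, var_codon, kind))
    return differences


if __name__ == "__main__":
    # Toy sequences for illustration only (Ala-Glu-Lys versus Ala-Asp-Lys).
    for difference in classify_differences("GCTGAAAAA", "GCCGACAAA"):
        print(difference)
```

In this toy comparison the first codon changes without altering the encoded amino acid, while the second changes the amino acid; a typing method that resolves only protein-level differences would miss the first.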
Details of the IPD-IMGT/HLA Database have been published previously, and a more detailed description of its structure and content is available elsewhere [14]. Here we report some of the new additions to the database.
IPD data sources
IPD receives submissions from laboratories across the world. These submissions are curated and analysed and, if they meet the strict requirements, an official allele designation is assigned. The IPD-IMGT/HLA Database is the official repository for the WHO Nomenclature Committee for Factors of the HLA System, and submission to it is the only way to receive an official allele designation for a sequence. Sequence submissions come from a variety of sources; the majority are from laboratories involved in clinical
Tools available at IPD-IMGT/HLA
The project provides a large number of tools for the analysis of HLA sequences. These tools are either custom written for the database or incorporated into existing tools on the European Bioinformatics Institute (EBI) website [22], [23] (see Table 1). We are constantly adding new tools and resources to the database, and list here a number of recent developments.
DPB1 T-Cell Epitope Algorithms
The IPD-IMGT/HLA Database also collaborates with clinicians to provide web-based versions of published algorithms, which have a
Discussion
The IPD-IMGT/HLA Database provides a centralised resource for the study of the HLA system, whether this is clinically or scientifically focussed. The database and accompanying tools allow the study of HLA alleles from a single site on the World Wide Web. It aids in the management and development of HLA nomenclature, providing a continuing and updated resource for the WHO Nomenclature Committee. The challenges for the database are to keep up with this increase in submitted sequences, keep pace
Funding
This work was supported by Histogenetics; One Lambda Inc., part of Thermo Fisher Scientific; Conexio; DKMS; Abbott Molecular Laboratories Inc.; the American Society for Histocompatibility and Immunogenetics; Fujirebio; Illumina; Olerup SSP; LabCorp; Lifecodes + Immucor Gamma; the European Federation for Immunogenetics; Zentrales Knochenmarkspender-Register Deutschland; Anthony Nolan; the Asia–Pacific Histocompatibility and Immunogenetics Association; BAG Healthcare; Be the Match Foundation;
List of abbreviations (alphabetical)
CWD – common and well-documented
DDBJ – DNA DataBank of Japan
EBI – European Bioinformatics Institute
EMBL – European Molecular Biology Laboratory
ENA – EMBL-European Nucleotide Archive
FTP – file transfer protocol
HGVS – Human Genome Variation Society
HML – Histoimmunogenetics Markup Language
INSDC – International Nucleotide Sequence Database Collaboration
IPD – Immuno-Polymorphism Database
MHC – Major Histocompatibility Complex
NMDP – National Marrow Donor Program
SBT – sequence based typing
SNP – single
Acknowledgements
The authors would like to thank Angie Dahl of the Be The Match Foundation, for her continuing work in securing on-going funding for the IPD-IMGT/HLA Database. We would like to thank all of the individuals and organisations that support our work financially.
Finally, the authors would like to thank Paul Flicek and the European Bioinformatics Institute for technical and infrastructure support.
References (40)
- et al. Nomenclature for factors of the HLA system, 1991. Immunobiology (1993)
- et al. Nomenclature for factors of the HLA system, 1991. Hum. Immunol. (1992)
- et al. Frequency and targeted detection of HLA-DPB1 T cell epitope disparities relevant in unrelated hematopoietic stem cell transplantation. Biol. Blood Marrow Transplant (2007)
- et al. Effect of T-cell-epitope matching at HLA-DPB1 in recipients of unrelated-donor haemopoietic-cell transplantation: a retrospective study. Lancet Oncol. (2012)
- et al. Nonpermissive HLA-DPB1 disparity is a significant independent risk factor for mortality after unrelated hematopoietic stem cell transplantation. Blood (2009)
- et al. A T-cell epitope encoded by a subset of HLA-DPB1 alleles determines nonpermissive mismatches for hematologic stem cell transplantation. Blood (2004)
- et al. The impact of amino acid variability on alloreactivity defines a functional distance predictive of permissive HLA-DPB1 mismatches in hematopoietic stem cell transplantation. Biol. Blood Marrow Transplant (2015)
- et al. HLA class I sequence-based typing. Hum. Immunol. (1993)
- et al. Common and well-documented HLA alleles: report of the Ad-Hoc Committee of the American Society for Histocompatibility and Immunogenetics. Hum. Immunol. (2007)
- et al. Gene map of the extended human MHC. Nat. Rev. Genet. (2004)
- Nomenclature for factors of the HL-A system. Bull. World Health Organ.
- Development of the international immunogenetics HLA database. Hum. Immunol.
- HLA-DRB nucleotide sequences, 1990. Immunogenetics
- HLA class II nucleotide sequences, 1991. Tissue Antigens
- HLA class I nucleotide sequences, 1991. Tissue Antigens
- The IPD and IMGT/HLA database: allele variant databases. Nucl. Acids Res.
- Diverging effects of HLA-DPB1 matching status on outcome following unrelated donor transplantation depending on disease stage and the degree of matching for other HLA alleles. Leukemia
- High-throughput, high-fidelity HLA genotyping with deep sequencing. Proc. Natl. Acad. Sci. USA
- A multi-site study using high-resolution HLA genotyping by next generation sequencing. Tissue Antigens
- HLA typing for the next generation. PLoS One