Elsevier

Human Immunology

Volume 77, Issue 3, March 2016, Pages 233-237
The IPD-IMGT/HLA Database – New developments in reporting HLA variation

https://doi.org/10.1016/j.humimm.2016.01.020

Abstract

IPD-IMGT/HLA is a constituent of the Immuno Polymorphism Database (IPD), which was developed to provide a centralised system for the study of polymorphism in genes of the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The primary database within the IPD project is the IPD-IMGT/HLA Database, which provides a locus-specific database for the hyper-polymorphic allele sequences of the genes in the HLA system, also known as the human Major Histocompatibility Complex. The IPD-IMGT/HLA Database was first released over 17 years ago, building on the work of the WHO Nomenclature Committee for Factors of the HLA system that was initiated in 1968. The IPD-IMGT/HLA Database enhanced this work by providing the HLA community with an online, searchable repository of highly curated HLA sequences. Many of the genes encode proteins of the immune system and are hyper-polymorphic, with some genes currently having over 4000 known allelic variants. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute, we are able to provide public access to these data through the website, http://www.ebi.ac.uk/ipd/imgt/hla.

Introduction

The Immuno Polymorphism Database (IPD) is a set of specialist databases related to the study of polymorphic genes in the immune system. The primary database within the IPD project is the IPD-IMGT/HLA Database, which provides a locus-specific database for the hyper-polymorphic allelic sequences of the genes in the HLA system, also known as the human Major Histocompatibility Complex (MHC). Within the MHC is the HLA complex, which is a region of ∼4 Mb on the short arm of chromosome 6 (6p21) containing >220 genes, many of which contribute to immune defence against infection [1]. The core genes of interest in the HLA system are 21 highly polymorphic HLA genes, whose protein products mediate human responses to infectious disease and influence the outcome of cell and organ transplants. The MHC is one of the most complex and polymorphic regions of the human genome [1]. These highly complex genes encode the hyper-polymorphic HLA class I (HLA-A, -B and -C) and class II molecules (HLA-DP, -DQ, and -DR), which diversify immunity within human populations and form the major genetic barriers to clinical transplantation of cells, tissues and organs. The ontology of the HLA alleles has been continuously developed since 1968 [2], when work in this area began. Thirty years later, the IPD-IMGT/HLA Database was established to serve this purpose [3]. The IPD-IMGT/HLA Database continued the work of an initial phase of development when HLA class I and II sequences were published by the WHO Nomenclature Committee [4], [5], [6], [7], [8]. In the 17 years since the IPD-IMGT/HLA Database was first released [9], over 21,000 submissions have been accepted for inclusion in the database, leading to the definition of 14,000 HLA alleles by the end of November 2015. The first public release of the IPD-IMGT/HLA Database was made in December 1998. Since its inception, the database has been updated every three months, with over 66 major releases, to include all the publicly available sequences officially named by the WHO Nomenclature Committee.

A driving force behind the development and continued success of the IPD-IMGT/HLA Database is its use by the transplantation community. The HLA molecules, whose polymorphic variants are stored in the database, play a key role in transplantation, with the success of kidney and bone marrow transplantation correlated with the degree to which donor and recipient are HLA matched. HLA matching is recognised as a critical determinant of outcome for patients receiving unrelated donor haematopoietic stem cells for haematological disorders [10]. This has led to progressive improvements in the level of resolution achieved by HLA class I and II typing methods. HLA alleles are now routinely characterised using molecular typing methods and, increasingly, next-generation sequencing techniques [11], [12], [13]. The typing of HLA focuses on distinguishing differences at both the synonymous and non-synonymous level for the nucleotide sequences encoding the protein domains of HLA class I and II, which bind peptides and interact with variable lymphocyte receptors. These improvements have required the development of a nucleotide sequence database that is both accurate and comprehensive.

Details of the IPD-IMGT/HLA Database have been published previously, and a more detailed description of the database, its structure and its content is available elsewhere [14]. Here we report some of the new additions to the database.

Section snippets

IPD data sources

IPD receives submissions from laboratories across the world. These submissions are curated and analysed, and if they meet the strict requirements, an official allele designation is assigned. The IPD-IMGT/HLA Database is the official repository for the WHO Nomenclature Committee for Factors of the HLA System, and is the only way of receiving an official allele designation for a sequence. Sequence submissions come from a variety of sources; the majority are from laboratories involved in clinical

Tools available at IPD-IMGT/HLA

The project provides a large number of tools for the analysis of HLA sequences. These tools are either custom written for the database or are incorporated into existing tools on the European Bioinformatics Institute (EBI) website [22], [23] (see Table 1). We are constantly adding new tools and resources to the database, and list here a number of recent developments.

DPB1 T-Cell Epitope Algorithms

The IPD-IMGT/HLA Database also collaborates with clinicians to provide web-based versions of published algorithms, which have a

Discussion

The IPD-IMGT/HLA Database provides a centralised resource for the study of the HLA system, whether this is clinically or scientifically focussed. The database and accompanying tools allow the study of HLA alleles from a single site on the World Wide Web. It aids in the management and development of HLA nomenclature, providing a continuing and updated resource for the WHO Nomenclature Committee. The challenges for the database are to keep up with this increase in submitted sequences, keep pace

Funding

This work was supported by Histogenetics; One Lambda Inc., part of Thermo Fisher Scientific; Conexio; DKMS; Abbott Molecular Laboratories Inc.; the American Society for Histocompatibility and Immunogenetics; Fujirebio; Illumina; Olerup SSP; LabCorp; Lifecodes + Immucor Gamma; the European Federation for Immunogenetics; Zentrales Knochenmarkspender-Register Deutschland; Anthony Nolan; the Asia–Pacific Histocompatibility and Immunogenetics Association; BAG Healthcare; Be the Match Foundation;

List of abbreviations (alphabetical)

  • CWD – common and well-documented

  • DDBJ – DNA DataBank of Japan

  • EBI – European Bioinformatics Institute

  • EMBL – European Molecular Biology Laboratory

  • ENA – EMBL-European Nucleotide Archive

  • FTP – file transfer protocol

  • HGVS – Human Genome Variation Society

  • HML – Histoimmunogenetics Markup Language

  • INSDC – International Nucleotide Sequence Database Collaboration

  • IPD – Immuno-Polymorphism Database

  • MHC – Major Histocompatibility Complex

  • NMDP – National Marrow Donor Program

  • SBT – sequence based typing

  • SNP – single

Acknowledgements

The authors would like to thank Angie Dahl of the Be The Match Foundation, for her continuing work in securing on-going funding for the IPD-IMGT/HLA Database. We would like to thank all of the individuals and organisations that support our work financially.

Finally, the authors would like to thank Paul Flicek and the European Bioinformatics Institute for technical and infrastructure support.

References (40)

  • WHO-Nomenclature-Committee

    Nomenclature for factors of the HL-A system

    Bull. World Health Organ.

    (1968)
  • J. Robinson et al.

    Development of the international immunogenetics HLA database

    Hum. Immunol.

    (1998)
  • S.G.E. Marsh et al.

    HLA-DRB nucleotide sequences, 1990

    Immunogenetics

    (1990)
  • S.G.E. Marsh et al.

    HLA class II nucleotide sequences, 1991

    Tissue Antigens

    (1991)
  • J. Zemmour et al.

    HLA class I nucleotide sequences, 1991

    Tissue Antigens

    (1991)
  • J. Robinson et al.

    The IPD and IMGT/HLA database: allele variant databases

    Nucl. Acids Res.

    (2015)
  • B.E. Shaw et al.

    Diverging effects of HLA-DPB1 matching status on outcome following unrelated donor transplantation depending on disease stage and the degree of matching for other HLA alleles

    Leukemia

    (2010)
  • C. Wang et al.

    High-throughput, high-fidelity HLA genotyping with deep sequencing

    Proc. Natl. Acad. Sci. USA

    (2012)
  • C.L. Holcomb et al.

    A multi-site study using high-resolution HLA genotyping by next generation sequencing

    Tissue Antigens

    (2011)
  • N.P. Mayor et al.

    HLA typing for the next generation

    PLoS One

    (2015)