A microarray platform-independent classification tool for cell of origin class allows comparative analysis of gene expression in diffuse large B-cell lymphoma

PLoS One. 2013;8(2):e55895. doi: 10.1371/journal.pone.0055895. Epub 2013 Feb 12.

Abstract

Cell of origin classification of diffuse large B-cell lymphoma (DLBCL) identifies subsets with biological and clinical significance. Despite the established nature of the classification, existing studies display variability in classifier implementation, and a comparative analysis across multiple data sets is lacking. Here we describe the validation of a cell of origin classifier for DLBCL, based on balanced voting between 4 machine-learning tools: the DLBCL automatic classifier (DAC). DAC shows superior survival separation for assigned Activated B-cell (ABC) and Germinal Center B-cell (GCB) DLBCL classes relative to a range of other classifiers. DAC is effective on data derived from multiple microarray platforms and from formalin-fixed, paraffin-embedded samples, and is parsimonious, using 20 classifier genes. We use DAC to perform a comparative analysis of gene expression in 10 data sets (2030 cases). We generate ranked meta-profiles of genes showing consistent class association, using ≥6 data sets as a cut-off: ABC (414 genes) and GCB (415 genes). The transcription factor ZBTB32 emerges as the most consistent and differentially expressed gene in ABC-DLBCL, while other transcription factors such as ARID3A, BATF, and TCF4 are also amongst the 24 genes associated with this class in all data sets. Analysis of enrichment of 12323 gene signatures against the meta-profiles and against each data set individually confirms consistent associations with signatures of molecular pathways, chromosomal cytobands, and transcription factor binding sites. We provide DAC as an open-access Windows application, and the accompanying meta-analyses as a resource.
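The two computational ideas in the abstract, voting between several classifiers and keeping genes that are class-associated in at least 6 of 10 data sets, can be illustrated with a minimal Python sketch. This is not the DAC implementation. The abstract does not state the voting rule or the ranking criterion, so the sketch assumes that each tool reports a probability that a case is ABC, that the four tools carry equal weight, that cases with an intermediate mean are left unclassified, and that genes are ranked by the number of data sets in which they appear. The function names, the 0.6 confidence threshold, and the placeholder gene identifiers are all hypothetical.

```python
from collections import Counter


def balanced_vote(abc_probabilities, threshold=0.6):
    """Combine per-tool P(ABC) values with equal weight (assumed rule).

    `threshold` is a hypothetical confidence cut-off: cases whose mean
    probability falls between 1 - threshold and threshold are left
    unclassified.
    """
    mean_abc = sum(abc_probabilities) / len(abc_probabilities)
    if mean_abc >= threshold:
        return "ABC"
    if mean_abc <= 1 - threshold:
        return "GCB"
    return "Unclassified"


def meta_profile(per_dataset_genes, min_datasets=6):
    """Return (gene, count) pairs for genes seen in >= min_datasets data sets.

    Each element of `per_dataset_genes` is the collection of genes found to
    be associated with one class in one data set. Ranking by count (ties
    broken alphabetically) is an assumption of this sketch.
    """
    counts = Counter(
        gene for genes in per_dataset_genes for gene in set(genes)
    )
    kept = [(gene, n) for gene, n in counts.items() if n >= min_datasets]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Toy inputs only; these are not values from the study.
    print(balanced_vote([0.9, 0.8, 0.7, 0.95]))
    toy = [["GENE_A", "GENE_B"]] * 6 + [["GENE_A", "GENE_C"]] * 4
    print(meta_profile(toy))
```

With a cut-off of 6, a gene seen in all 10 toy data sets ranks above one seen in exactly 6, and a gene seen in only 4 is dropped.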

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • B-Lymphocytes / cytology
  • Cell Survival / genetics
  • Chromosomes, Human / genetics
  • Computational Biology / methods*
  • Focal Adhesions / genetics
  • Gene Expression Profiling / methods*
  • Humans
  • Lymphoma, Large B-Cell, Diffuse / genetics*
  • Lymphoma, Large B-Cell, Diffuse / immunology
  • Lymphoma, Large B-Cell, Diffuse / pathology*
  • Oligonucleotide Array Sequence Analysis*
  • Transcription Factors / genetics

Substances

  • Transcription Factors