Génie: literature-based gene prioritization at multi genomic scale

Nucleic Acids Res. 2011 Jul;39(Web Server issue):W455-61. doi: 10.1093/nar/gkr246. Epub 2011 May 23.

Abstract

Biomedical literature is traditionally used to inform scientists of the relevance of genes to a research topic. However, many genes, especially those from poorly studied organisms, are not discussed in the literature. Moreover, a comprehensive manual summary of the literature attached to the genes of an organism is generally infeasible because of the large number of genes and abstracts involved. We introduce the novel Génie algorithm, which overcomes these problems by evaluating the literature attached to all genes in a genome, and to their orthologs, according to a selected topic. Génie showed high precision (up to 100%) and outperformed other algorithms in most of the benchmarks, especially when high sensitivity was required. Moreover, the prioritization of zebrafish genes involved in heart development, using human and mouse orthologs, showed high enrichment in differentially expressed genes from microarray experiments. The Génie web server supports hundreds of species and millions of genes, and offers novel functionalities. Typical run times of under a minute, even when analyzing the human genome with hundreds of thousands of literature records, allow the use of Génie in routine lab work.

Availability: http://cbdm.mdc-berlin.de/tools/genie/.
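
The idea of ranking genes by the topical relevance of their own literature pooled with that of their orthologs can be illustrated with a toy sketch. This is not the published Génie implementation: the abstract does not specify the classifier or the statistic, so the sketch assumes that a set of on-topic abstracts is already known and scores each gene with a one-sided hypergeometric enrichment test. The function names (`hypergeom_tail`, `rank_genes`), the gene labels, and the abstract IDs are all hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k on-topic abstracts among n
    drawn at random from N abstracts, K of which are on-topic."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def rank_genes(target_genes, gene_to_abstracts, orthologs, on_topic):
    """Rank target genes by enrichment of on-topic abstracts.

    Each gene's literature is pooled with the literature of its orthologs,
    so a gene with no abstracts of its own can still be scored.
    (Assumed scoring scheme, chosen for illustration only.)
    """
    universe = set().union(*gene_to_abstracts.values())
    N = len(universe)
    K = len(universe & on_topic)
    results = []
    for gene in target_genes:
        pooled = set(gene_to_abstracts.get(gene, set()))
        for orth in orthologs.get(gene, []):
            pooled |= gene_to_abstracts.get(orth, set())
        n = len(pooled)
        k = len(pooled & on_topic)
        p_value = hypergeom_tail(k, n, K, N) if n else 1.0
        results.append((gene, p_value))
    # Smallest p-value first: strongest association with the topic.
    return sorted(results, key=lambda item: item[1])


# Hypothetical data: zebrafish genes without literature of their own,
# scored through human and mouse orthologs. Abstract IDs are made up.
gene_to_abstracts = {
    "human:NKX2-5": {1, 2, 3},
    "mouse:Nkx2-5": {3, 4},
    "human:ALB": {5, 6, 7},
}
orthologs = {
    "zebrafish:nkx2.5": ["human:NKX2-5", "mouse:Nkx2-5"],
    "zebrafish:alb": ["human:ALB"],
}
on_topic = {1, 2, 3, 4}  # abstracts assumed to concern heart development

ranked = rank_genes(list(orthologs), gene_to_abstracts, orthologs, on_topic)
for gene, p_value in ranked:
    print(f"{gene}\t{p_value:.4f}")
```

In this toy data set the zebrafish gene whose orthologs carry only on-topic abstracts ranks first, while the gene whose ortholog literature is entirely off-topic receives a p-value of 1. In practice the on-topic set would come from a text classifier trained on the selected topic rather than being given by hand.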

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Gene Expression Profiling
  • Genes*
  • Genomics
  • Heart / embryology
  • Humans
  • Internet
  • MEDLINE
  • Mice
  • Models, Animal
  • Software*
  • Zebrafish / embryology
  • Zebrafish / genetics