Prediction of the subcellular localization of eukaryotic proteins using sequence signals and composition

Proteomics. 2004 Jun;4(6):1591-6. doi: 10.1002/pmic.200300769.

Abstract

Locfind, a tool for the sequence-based prediction of the subcellular localization of eukaryotic proteins, is introduced. It is based on bidirectional recurrent neural networks trained to read the amino acid sequence sequentially and produce localization information along the sequence. Systematic variation of the network architecture, in combination with an efficient learning algorithm, led to 91% correct localization predictions for novel proteins in fivefold cross-validation. The data and evaluation procedure are the same as those used for the non-plant part of the widely used TargetP tool by Emanuelsson et al. The Locfind system is available on the WWW for predictions (http://www.stepc.gr/~synaptic/locfind.html).
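
As a rough illustration of what "reading the sequence bidirectionally and producing localization information along the sequence" can mean, the following minimal sketch runs one simple recurrent pass left to right and another right to left over a one-hot encoded protein, then combines both hidden states at each residue into class probabilities. It is not the Locfind implementation: the weights are random and untrained, and the hidden size, the class labels, the tanh recurrence, and all function names are assumptions made for this example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical class labels, chosen only for illustration.
CLASSES = ["mitochondrial", "secretory", "other"]


def one_hot(residue):
    """Encode one amino acid as a 20-dimensional indicator vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def rand_matrix(rows, cols, rng):
    """Random weights; a real model would learn these from training data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_pass(inputs, w_in, w_rec):
    """Simple tanh recurrence; returns one hidden state per position."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_along_sequence(seq, hidden_size=4, seed=0):
    """Return one probability vector over CLASSES for every residue."""
    rng = random.Random(seed)
    n_in = len(AMINO_ACIDS)
    w_in_f = rand_matrix(hidden_size, n_in, rng)
    w_rec_f = rand_matrix(hidden_size, hidden_size, rng)
    w_in_b = rand_matrix(hidden_size, n_in, rng)
    w_rec_b = rand_matrix(hidden_size, hidden_size, rng)
    w_out = rand_matrix(len(CLASSES), 2 * hidden_size, rng)

    xs = [one_hot(r) for r in seq]
    forward = rnn_pass(xs, w_in_f, w_rec_f)
    # Run the second pass on the reversed sequence, then flip the states
    # back so that index i again refers to residue i.
    backward = rnn_pass(xs[::-1], w_in_b, w_rec_b)[::-1]
    # Each position sees context from both its left and its right.
    return [softmax(matvec(w_out, f + b)) for f, b in zip(forward, backward)]


if __name__ == "__main__":
    for residue, probs in zip("MLSRAV", predict_along_sequence("MLSRAV")):
        print(residue, [round(p, 3) for p in probs])
```

Because the backward pass starts at the C-terminus, the output at every residue depends on the whole sequence rather than only on the residues preceding it, which is the motivation for a bidirectional architecture when localization signals can lie on either side of a position.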

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Eukaryotic Cells / chemistry*
  • Humans
  • Internet
  • Molecular Sequence Data
  • Neural Networks, Computer
  • Protein Sorting Signals / genetics*
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • Reproducibility of Results
  • Sequence Analysis, Protein
  • Subcellular Fractions / metabolism*

Substances

  • Protein Sorting Signals
  • Proteins