Journal article
Improving signal and transit peptide predictions using AlphaFold2-predicted protein structures
Journal of molecular biology, Vol.436(2), 168393
01/15/2024
DOI: 10.1016/j.jmb.2023.168393
PMCID: PMC10843742
PMID: 38065275
Abstract
Many proteins contain cleavable signal or transit peptides that direct them to their final subcellular locations. Such peptides are usually predicted from sequence alone using methods such as TargetP 2.0 and SignalP 6.0. While these methods are usually very accurate, we show here that an analysis of a protein's AlphaFold2-predicted structure can often be used to identify false positive predictions. We start by showing that when given a protein's full-length sequence, AlphaFold2 builds experimentally annotated signal and transit peptides in orientations that point away from the main body of the protein. This indicates that AlphaFold2 correctly identifies that a signal is not destined to be part of the mature protein's structure and suggests, as a corollary, that predicted signals that AlphaFold2 folds with high confidence into the main body of the protein are likely to be false positives. To explore this idea, we analyzed predicted signal peptides in 48 proteomes made available in DeepMind's AlphaFold2 database (https://alphafold.ebi.ac.uk). Applying TargetP 2.0 and SignalP 6.0 to the 561,562 proteins in the database results in 95,236 being predicted to contain a cleavable signal or transit peptide. In 95.1% of these cases, the AlphaFold2 structure of the full-length protein is fully consistent with the prediction of TargetP 2.0 or SignalP 6.0. In the remaining 4.9% of cases where the AlphaFold2 structure does not appear consistent with the prediction, the signal is often only predicted with low confidence. The potential false positives identified here may be useful for training even more accurate signal prediction methods.
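The structural check described in the abstract can be illustrated with a minimal sketch. This is not the authors' published criterion: the function name `flag_possible_false_positive`, the C-alpha contact definition, and all thresholds (`contact_cutoff`, `min_contacts`, `min_plddt`, `min_seq_sep`) are hypothetical choices made for illustration only. The sketch assumes that per-residue C-alpha coordinates and pLDDT values have already been extracted from an AlphaFold2 model, and that a sequence-based predictor has supplied the cleavage site.

```python
import math


def flag_possible_false_positive(ca_coords, plddt, cleavage_site,
                                 contact_cutoff=8.0, min_contacts=4.0,
                                 min_plddt=70.0, min_seq_sep=4):
    """Return True if a predicted signal looks folded into the mature protein.

    ca_coords     -- list of (x, y, z) C-alpha coordinates, one per residue
    plddt         -- list of per-residue pLDDT values (0-100)
    cleavage_site -- number of residues in the predicted signal/transit peptide

    All thresholds are illustrative assumptions, not values from the paper.
    """
    signal = range(cleavage_site)
    mature = range(cleavage_site, len(ca_coords))
    if len(signal) == 0 or len(mature) == 0:
        return False

    # Count signal-to-mature C-alpha pairs that lie within the cutoff,
    # skipping pairs that are close simply because they are sequence neighbours.
    total_contacts = 0
    for i in signal:
        for j in mature:
            if j - i < min_seq_sep:
                continue
            if math.dist(ca_coords[i], ca_coords[j]) <= contact_cutoff:
                total_contacts += 1

    mean_contacts = total_contacts / len(signal)
    mean_plddt = sum(plddt[i] for i in signal) / len(signal)

    # A signal that packs against the mature domain AND is modelled with high
    # confidence is treated here as a candidate false positive.
    return mean_contacts >= min_contacts and mean_plddt >= min_plddt
```

Under these assumptions, a peptide that AlphaFold2 builds pointing away from the main body makes few contacts and is not flagged, whereas one that is confidently packed into the fold is. A flag from a sketch like this would only mark a prediction for closer inspection.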
Details
- Title: Subtitle
- Improving signal and transit peptide predictions using AlphaFold2-predicted protein structures
- Creators
- Venkata R Sanaboyana - Department of Biochemistry & Molecular Biology, University of Iowa
- Adrian H Elcock - Department of Biochemistry & Molecular Biology, University of Iowa. Electronic address: adrian-elcock@uiowa.edu
- Resource Type
- Journal article
- Publication Details
- Journal of molecular biology, Vol.436(2), 168393
- DOI
- 10.1016/j.jmb.2023.168393
- PMID
- 38065275
- PMCID
- PMC10843742
- NLM abbreviation
- J Mol Biol
- eISSN
- 1089-8638
- Grant note
- DOI: 10.13039/100008893, name: University of Iowa; DOI: 10.13039/100000002, name: National Institutes of Health, award: R35 GM122466
- Language
- English
- Electronic publication date
- 12/06/2023
- Date published
- 01/15/2024
- Academic Unit
- Physics and Astronomy; Biochemistry and Molecular Biology
- Record Identifier
- 9984530265702771