This tool uses a deep learning, sequence-based model to predict peptide solubility.
It is intended for peptides or proteins expressed in E. coli that are shorter than 200 residues. Its predictions may also apply more broadly.
It performs very well on short peptides (<50 residues). The AUROC and accuracy are 95% and 91.3%, respectively, outperforming the existing method DSResSol at predicting the solubility of short peptides.
The training data contain 18,453 sequences (47.6% positive and 52.4% negative) sourced from PROSO II. The sequences span a wide range of lengths (18 to 198 residues).
Accuracy is lower for long peptide sequences (>100 residues).
PROSO II defines a sequence as soluble if it was transfectable, expressible, secretable, separable, and soluble in an E. coli system.
Maximum input length: 200 residues.
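The length limit and the length-dependent accuracy described above can be checked before a sequence is submitted. The following is a minimal sketch, not part of the tool itself: the function name `check_sequence` is hypothetical, and the restriction to the 20 standard one-letter amino acid codes is an assumption, since the description does not say which characters the tool accepts.

```python
from typing import Tuple

# Assumption: only the 20 standard one-letter amino acid codes are accepted.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 200  # maximum input length stated in the description


def check_sequence(raw: str) -> Tuple[str, str]:
    """Clean a raw sequence and return it with a note on expected reliability."""
    # Remove all whitespace (spaces, line breaks) and normalize to upper case.
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("empty sequence")

    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError("non-standard residue codes: " + "".join(invalid))

    if len(seq) > MAX_LENGTH:
        raise ValueError(
            f"sequence has {len(seq)} residues; the limit is {MAX_LENGTH}"
        )

    # Length regimes taken from the description: <50 and >100 residues.
    if len(seq) < 50:
        note = "short peptide: strong reported performance"
    elif len(seq) > 100:
        note = "long sequence: lower accuracy expected"
    else:
        note = "mid-length: no specific claim in the description"
    return seq, note
```

For example, `check_sequence("acdef ghik")` returns the cleaned sequence `ACDEFGHIK` together with the short-peptide note, while a 201-residue input raises a `ValueError`.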
1. Enter a single peptide (raw sequence):