Protein secondary structure refers to the local conformations that form when the backbone atoms of a polypeptide chain coil or fold along an axis. It describes the spatial arrangement of the peptide backbone only and does not cover the conformation of amino acid side chains. Two common annotation and evaluation schemes are Q3 and Q8: Q3 is a coarse three-state classification (H/E/C), while Q8 is the fine-grained eight-state classification defined by DSSP. Secondary structure is stabilized mainly by hydrogen bonds. In practice, a protein usually combines several conformations rather than consisting of a single pure alpha-helix or beta-sheet form, and the proportions differ from protein to protein.
1. Protein Sequence (max 1024 aa):
Secondary Structure Prediction
A more accurate secondary structure prediction model (Q3) is available at Protein Structural Property Prediction (ProtSA).
Secondary structure prediction is an essential step in protein machine learning, catalytic residue analysis, and protein structure prediction. The 8 secondary-structure states are: H (alpha-helix), G (3-10 helix), I (pi-helix), B (isolated beta-bridge), E (extended strand/beta-sheet), T (hydrogen-bonded turn), S (bend), and C/blank (coil/other).
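To relate the eight states listed above to the three Q3 states, Q8 labels are usually collapsed by a fixed lookup. The sketch below assumes one widely used reduction (H, G, I to H; E, B to E; everything else to C). This page does not state which mapping its Q3 model uses, and the names `Q8_TO_Q3` and `q8_to_q3` are illustrative only.

```python
# Assumed reduction: a common DSSP 8-to-3 convention, not confirmed by this page.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",  # alpha-, 3-10 and pi-helix -> helix
    "E": "E", "B": "E",            # extended strand and isolated bridge -> strand
    "T": "C", "S": "C", "C": "C",  # turn, bend and coil -> coil
}


def q8_to_q3(q8: str) -> str:
    """Collapse a Q8 string into Q3; any other symbol (blank, '-', 'L') becomes coil."""
    return "".join(Q8_TO_Q3.get(state, "C") for state in q8)


print(q8_to_q3("CHHHHGGGTTEEEEBSC"))
```

Other reductions exist (for example, mapping only H to helix and only E to strand), and the choice changes the reported Q3 accuracy, so the mapping should be fixed before comparing numbers across tools.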
Model Performance Metrics
We trained a protein secondary-structure prediction model that supports both Q8 and Q3 output. The test results on the CB513 dataset, which are at a state-of-the-art (SOTA) level, are listed below:
- CB513 Test Accuracy (Q8): 0.7587
- CB513 Test Accuracy (Q3): 0.8731
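Q3 and Q8 accuracy are per-residue scores: the fraction of positions where the predicted state equals the reference state. The following is a minimal sketch of that calculation for a single sequence, assuming the scores are pooled over residues (the page does not say whether they are pooled or averaged per protein). The function name and the two example strings are made up for illustration.

```python
def per_residue_accuracy(true_labels: str, pred_labels: str) -> float:
    """Fraction of positions where the predicted state matches the reference."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label strings must have the same length")
    if not true_labels:
        raise ValueError("label strings must not be empty")
    matches = sum(t == p for t, p in zip(true_labels, pred_labels))
    return matches / len(true_labels)


# Hypothetical 14-residue example with one mismatch (13 of 14 correct).
true_ss = "CHHHHHHCCEEEEC"
pred_ss = "CHHHHHCCCEEEEC"
print(f"{per_residue_accuracy(true_ss, pred_ss):.4f}")
```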
[Q8 Classification Report]
class          precision   recall   f1-score   support
H              0.8850      0.9424   0.9128     43037
G              0.5375      0.4450   0.4869     5173
I              0.7033      0.4497   0.5487     796
E              0.8464      0.8789   0.8623     30090
B              0.5756      0.1622   0.2531     1831
T              0.6149      0.6481   0.6310     16457
S              0.5892      0.4262   0.4946     13541
C/L/           0.6908      0.6911   0.6910     33086
micro avg      0.7661      0.7587   0.7624     144011
macro avg      0.6803      0.5805   0.6100     144011
weighted avg   0.7562      0.7587   0.7541     144011
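The macro and weighted averages in the Q8 report can be recomputed from the per-class rows: the macro average is the unweighted mean over the eight classes, and the weighted average weights each class by its support. The sketch below does this for the F1 column, using the figures from the report; the variable names are illustrative, and the "C/L/" row is stored under the key "C".

```python
# (precision, recall, f1, support) per class, copied from the Q8 report.
q8_rows = {
    "H": (0.8850, 0.9424, 0.9128, 43037),
    "G": (0.5375, 0.4450, 0.4869, 5173),
    "I": (0.7033, 0.4497, 0.5487, 796),
    "E": (0.8464, 0.8789, 0.8623, 30090),
    "B": (0.5756, 0.1622, 0.2531, 1831),
    "T": (0.6149, 0.6481, 0.6310, 16457),
    "S": (0.5892, 0.4262, 0.4946, 13541),
    "C": (0.6908, 0.6911, 0.6910, 33086),
}

total_support = sum(row[3] for row in q8_rows.values())

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(row[2] for row in q8_rows.values()) / len(q8_rows)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(row[2] * row[3] for row in q8_rows.values()) / total_support

print(f"support={total_support} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

The gap between the two averages reflects class imbalance: rare states such as B and I have low F1 and pull the macro average down, while the weighted average is dominated by H, E, and coil.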
[Q3 Classification Report]
class          precision   recall   f1-score   support
H              0.8914      0.9211   0.9060     48993
E              0.8566      0.8543   0.8554     31845
C              0.8664      0.8447   0.8554     61789
accuracy                            0.8731     142627
macro avg      0.8714      0.8733   0.8723     142627
weighted avg   0.8728      0.8731   0.8728     142627
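Two relationships tie the Q3 rows together: overall accuracy equals the support-weighted mean of the per-class recalls, and each F1 score is the harmonic mean of precision and recall. The sketch below checks both against the reported figures; the variable names are illustrative, and small differences are expected because the inputs are already rounded to four decimals.

```python
# (precision, recall, support) per class, copied from the Q3 report.
q3_rows = {
    "H": (0.8914, 0.9211, 48993),
    "E": (0.8566, 0.8543, 31845),
    "C": (0.8664, 0.8447, 61789),
}

total_support = sum(support for _, _, support in q3_rows.values())

# Accuracy = correctly predicted residues / all residues
#          = sum over classes of (recall * support) / total support.
accuracy = sum(recall * support for _, recall, support in q3_rows.values()) / total_support

# F1 = harmonic mean of precision and recall, per class.
f1_scores = {
    label: 2 * precision * recall / (precision + recall)
    for label, (precision, recall, _) in q3_rows.items()
}

print(f"accuracy={accuracy:.4f}")
for label, f1 in f1_scores.items():
    print(f"{label}: f1={f1:.4f}")
```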
References
- Jones, D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292:195-202.
Last updated: 2026-03-19