Protein Intrinsically Disordered Region Prediction (ProtIDR)

ProtIDR combines a protein large language model with a CRF sequence-labeling layer to predict intrinsically disordered regions (IDRs) at residue level. For each amino acid position, the model predicts whether the residue belongs to an IDR and outputs a corresponding confidence score. Applications include protein function annotation, structure-prediction support, and mutation analysis.
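To make the sequence-labeling step concrete, here is a minimal sketch of Viterbi decoding for a two-state linear-chain CRF (0 = non-IDR, 1 = IDR). The function name `viterbi_decode` and all emission and transition scores are illustrative assumptions, not ProtIDR's parameters; in the tool itself, the per-residue scores would come from the protein language model.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][k]   : score of label k at residue t
    transitions[j][k] : score of moving from label j to label k
    """
    if not emissions:
        return []
    n_labels = len(emissions[0])
    score = list(emissions[0])          # best path score ending in each label
    backptr: List[List[int]] = []       # best previous label per step and label
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels),
                         key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_j] + transitions[best_j][k] + emissions[t][k])
            ptrs.append(best_j)
        score = new_score
        backptr.append(ptrs)
    # Trace back from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path


# Toy scores (hypothetical): residue 2 weakly prefers "IDR" on its own.
emissions = [[2.0, 0.0], [2.0, 0.0], [0.9, 1.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
# Staying in the same state is rewarded, switching is penalized.
transitions = [[0.5, -0.5], [-0.5, 0.5]]

labels = viterbi_decode(emissions, transitions)
print(labels)
```

In this toy case a per-residue argmax would label residue 2 as IDR, while the transition scores lead the decoded path to keep it in the surrounding ordered segment. This is the kind of smoothing a CRF layer adds on top of independent per-residue scores.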

The model was trained on approximately 2,000 high-quality annotated sequences; CAID2 served as the validation set and CAID3 as the test set for performance evaluation. The model is lightweight and fast at inference, which makes it suitable for large-scale protein sequence analysis and real-time online prediction services.

1. Protein Sequences (up to 10 FASTA entries):

Parsed sequences: 0, total residues: 0
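The following is a minimal sketch of how pasted FASTA text could be parsed and summarized in the format shown above. The function names `parse_fasta` and `summarize`, and the choice to reject (rather than truncate) input beyond 10 entries, are assumptions for illustration; this page does not describe ProtIDR's actual parser.

```python
from typing import List, Tuple

MAX_ENTRIES = 10  # limit stated on the input form


def parse_fasta(text: str, max_entries: int = MAX_ENTRIES) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalize case.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    if len(records) > max_entries:
        # Assumption: oversize input is rejected instead of silently truncated.
        raise ValueError(f"at most {max_entries} FASTA entries are accepted")
    return records


def summarize(records: List[Tuple[str, str]]) -> str:
    """Format the counter line displayed under the input box."""
    total = sum(len(seq) for _, seq in records)
    return f"Parsed sequences: {len(records)}, total residues: {total}"


example = ">seq1 toy entry\nMKT\nAYI\n>seq2\nGSH\n"
records = parse_fasta(example)
print(summarize(records))
```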



Model Performance Metrics

Performance Comparison: ProtIDR vs. ESMDisPred (CAID3 Test Set)
========================================================================================
                          Overall Performance Metrics
========================================================================================
Metric                      ProtIDR (Ours)           ESMDisPred (SOTA)
----------------------------------------------------------------------------------------
Accuracy                    0.8413                   0.8370
MCC                         0.6104                   0.6430
ROC-AUC                     0.8922                   0.8950
Average Precision (AP)      0.7575                   0.7780
F1-max                      0.7261                   0.7590
Optimal Threshold           0.425                    N/A
========================================================================================
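The F1-max and Optimal Threshold rows can be read as the result of a threshold sweep over the per-residue confidence scores. The sketch below assumes that the optimal threshold is the cutoff that maximizes IDR F1, and that a residue is called IDR when its score is at or above the cutoff; neither convention is stated on this page. The scores and labels are toy values, not CAID3 data, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def f1_at(scores: List[float], labels: List[int], thr: float) -> float:
    """F1 of the IDR class (label 1) when predicting IDR for score >= thr."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_max(scores: List[float], labels: List[int]) -> Tuple[float, Optional[float]]:
    """Sweep every observed score as a candidate cutoff; keep the best F1."""
    best_f1, best_thr = 0.0, None
    for thr in sorted(set(scores)):
        f1 = f1_at(scores, labels, thr)
        if f1 > best_f1:
            best_f1, best_thr = f1, thr
    return best_f1, best_thr


# Toy per-residue confidence scores and reference labels (hypothetical).
scores = [0.9, 0.8, 0.45, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 0, 1, 0]

best_f1, best_thr = f1_max(scores, labels)
print(f"F1-max = {best_f1:.4f} at threshold {best_thr}")
```

Under these assumptions, applying the reported cutoff of 0.425 to ProtIDR's confidence scores would amount to `score >= 0.425` per residue.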

========================================================================================
                          Per-Class Performance (Residue-Level)
========================================================================================
Class            Metric         ProtIDR (Ours)           ESMDisPred (SOTA)
----------------------------------------------------------------------------------------
IDR (1)          Precision      0.7357                   0.7380
                 Recall         0.7067                   0.7800
                 F1             0.7209                   0.7580
----------------------------------------------------------------------------------------
Non-IDR (0)      Precision      0.8821                   0.8920
                 Recall         0.8963                   0.8640
                 F1             0.8892                   0.8780
========================================================================================
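For readers who want to reproduce metrics of this kind, the sketch below computes per-class precision, recall, and F1, plus MCC, from residue-level confusion counts, using the standard definitions. The confusion counts are toy values and the function names are hypothetical; only the precision and recall figures in the final check are taken from the table above.

```python
import math
from typing import Tuple


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, harmonic_f1(precision, recall)


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy residue-level confusion counts with IDR as the positive class (hypothetical).
tp, fp, fn, tn = 70, 25, 30, 275

idr = precision_recall_f1(tp, fp, fn)
# For the Non-IDR class the roles swap: its true positives are tn,
# its false positives are fn, and its false negatives are fp.
non_idr = precision_recall_f1(tn, fn, fp)

print("IDR     P/R/F1:", [round(v, 4) for v in idr])
print("Non-IDR P/R/F1:", [round(v, 4) for v in non_idr])
print("MCC:", round(mcc(tp, fp, fn, tn), 4))

# Consistency check on the reported ProtIDR IDR row: F1 from P and R.
print("F1 from reported P/R:", round(harmonic_f1(0.7357, 0.7067), 4))
```

Recomputing F1 from the rounded precision and recall in the table reproduces the reported F1 values to within rounding error.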
        

Last updated: 2026-04-30