Protein loops on structurally similar scaffolds
Loop: an irregular region that DSSP assigns to neither alpha helix nor beta sheet.
Stem: a regular secondary structure fragment flanking a loop.
Motif: a loop together with its stems.
This database is composed of a series of motif families; each family is one entry. A family is named by its loop length, the types of its two stems (a for helix, b for strand), and a serial number starting from 0. For example, in family 4ab1 the loop length is 4, the stem upstream of the loop is a helix, and the downstream stem is a strand.
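The naming rule above can be decoded mechanically. The following is a minimal sketch, assuming names always have the shape digits, two stem letters (a or b), digits; the names FamilyName and parse_family_name are hypothetical helpers, not part of the database.

```python
import re
from typing import NamedTuple

# Assumption: "a" marks a helix stem and "b" a strand stem, as in 4ab1.
STEM_TYPES = {"a": "helix", "b": "strand"}

_FAMILY_RE = re.compile(r"^(\d+)([ab])([ab])(\d+)$")


class FamilyName(NamedTuple):
    loop_length: int
    upstream_stem: str
    downstream_stem: str
    serial: int


def parse_family_name(name: str) -> FamilyName:
    """Split a family name such as '4ab1' into its four components."""
    match = _FAMILY_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised family name: {name!r}")
    length, upstream, downstream, serial = match.groups()
    return FamilyName(
        int(length), STEM_TYPES[upstream], STEM_TYPES[downstream], int(serial)
    )
```

For example, parse_family_name("4ab1") yields a loop length of 4, a helix upstream, a strand downstream, and serial number 1.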
The sequence and structural parameters of a family are stored in a file with the same name as the family. The coordinates of each motif in a family are stored in their own file in PDB format (for example 6bb1_0.pdb). These files are extracted from the original PDB entries and are translated and rotated so that all the motifs in the family can be superimposed.
The following is a sample entry with explanatory remarks interleaved.
#Head information
The first part gives summary information.
The similarity of two sequences is the global alignment score
computed by the align program of the FASTA package.
NAME 6bb1 (name of the family)
TYPE beta-beta (type of the stems)
REPRESENTATIVE 1mlb-A_33
(The representative of the family is named by its PDB code,
chain ID, and the number at which the motif begins.)
MEMBERS 12 (number of motifs in the family)
CLUSTER 5 (number of conformational sub-clusters)
#Loop element
Columns: motif name, motif sequence (loop residues in upper case, stem residues in lower case), coordinate file, sub-cluster index
>1mlb-A_33 ..lhwyqqKSHESPrllik. 6bb1_0.pdb 0
>1hil-A_39 ..ltwyqqKPGQPPkvliy. 6bb1_1.pdb 0
>1fdl-L_33 ..lawyqqKQGKSPqllvy. 6bb1_2.pdb 0
>1vge-L_33 ..lawyqqKPGKAPrlliy. 6bb1_3.pdb 0
>1eap-A_33 ..igwyqhKPGKGPrllih. 6bb1_4.pdb 1
>1wtl-B_33 ..vnwfqqRPGQAPkvliy. 6bb1_5.pdb 0
>2fb4-L_35 ...nwyqqLPGMAPklliy. 6bb1_6.pdb 2
>1bre-B_34 ...iwyqqKLGKAPnlliy. 6bb1_7.pdb 0
>1jhl-L_33 ..lawyqeKPGKTNnlliy. 6bb1_8.pdb 3
>4bjl-B_34 ..vtwyqhLSGTAPklliy. 6bb1_9.pdb 4
>1rei-A_33 ..lnwyqqTPGKAPklliy. 6bb1_10.pdb 0
>7fab-L_36 ...kwyqqLPGTAPkl.... 6bb1_11.pdb 0
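The member lines above can be split into their fields with a few lines of Python. This is a sketch only: it assumes the four columns are whitespace-separated, that the name has the form code-chain_number, and that the loop is exactly the upper-case part of the motif sequence, as the sample suggests. Member and parse_member_line are hypothetical names.

```python
from typing import NamedTuple


class Member(NamedTuple):
    pdb_code: str
    chain: str
    start: int
    motif_sequence: str
    loop_sequence: str
    coordinate_file: str
    subcluster: int


def parse_member_line(line: str) -> Member:
    """Parse one '>name sequence file subcluster' line of the loop element block."""
    name, sequence, coordinate_file, subcluster = line.lstrip(">").split()
    pdb_code, rest = name.split("-", 1)
    chain, start = rest.split("_", 1)
    # Assumption: upper-case letters are loop residues, lower-case are stems.
    loop = "".join(ch for ch in sequence if ch.isupper())
    return Member(
        pdb_code, chain, int(start), sequence, loop, coordinate_file, int(subcluster)
    )
```

Applied to the first line of the sample, this gives PDB code 1mlb, chain A, start 33, loop KSHESP, coordinate file 6bb1_0.pdb and sub-cluster 0.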
#Motif sequence
Aligned motif sequences
..LHWYQQKSHESPRLLIK.
..LTWYQQKPGQPPKVLIY.
..LAWYQQKQGKSPQLLVY.
..LAWYQQKPGKAPRLLIY.
..IGWYQHKPGKGPRLLIH.
..VNWFQQRPGQAPKVLIY.
...NWYQQLPGMAPKLLIY.
...IWYQQKLGKAPNLLIY.
..LAWYQEKPGKTNNLLIY.
..VTWYQHLSGTAPKLLIY.
..LNWYQQTPGKAPKLLIY.
...KWYQQLPGTAPKL....
#Loop sequence
Aligned loop sequences
KSHESP
KPGQPP
KQGKSP
KPGKAP
KPGKGP
RPGQAP
LPGMAP
KLGKAP
KPGKTN
LSGTAP
TPGKAP
LPGTAP
#Fingerprint
Sequence pattern: in each column the most frequently used
amino acid is in the first row, and a ":" means the same residue as the row above
K P G K A P
: : : : : :
: : : : : :
: : : : : :
: : : : : :
: : : : : :
: : : Q : :
L : : : S :
: S : T : :
: : : : G :
R L : E P :
T Q H M T N
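The fingerprint table can be rebuilt from the aligned loop sequences. The sketch below sorts the residues of each column by decreasing frequency and prints ":" for repeats. Breaking ties alphabetically is an assumption; it happens to reproduce the sample above, but the source does not state the rule. The function name fingerprint is hypothetical.

```python
from collections import Counter


def fingerprint(loops):
    """Return the fingerprint rows for a list of equal-length loop sequences."""
    columns = []
    for column in zip(*loops):
        counts = Counter(column)
        # Most frequent residue first; ties broken alphabetically (assumption).
        ordered = sorted(counts, key=lambda aa: (-counts[aa], aa))
        cells = []
        for aa in ordered:
            cells.append(aa)
            cells.extend(":" * (counts[aa] - 1))  # ":" = same as the row above
        columns.append(cells)
    # Transpose the per-column lists back into printable rows.
    return [" ".join(row) for row in zip(*columns)]


loops = [
    "KSHESP", "KPGQPP", "KQGKSP", "KPGKAP", "KPGKGP", "RPGQAP",
    "LPGMAP", "KLGKAP", "KPGKTN", "LSGTAP", "TPGKAP", "LPGTAP",
]
for row in fingerprint(loops):
    print(row)
```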
#Geometry parameters
DISTANCE 6.62 ( 0.18)
Average distance between the two ends of the loop
Vector1 ( 0.22 -0.80 -0.56)
Direction of beginning stem
Vector2 ( -0.45 0.78 0.43)
Direction of ending stem
Vector3 ( 0.12 0.99 -0.06)
Direction of loop
Struc diversity(A) ( 2.0 4.1 6.8 3.9 2.8 1.9)
Maximum variation of each CA atom in the loops of that family
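As an illustration of how the direction vectors might be used, the sketch below reads a vector line and computes the angle between two vectors. The angle is not a quantity reported by the database; parse_vector and angle_degrees are hypothetical helpers, and the line layout is assumed to match the sample.

```python
import math
import re


def parse_vector(line):
    """Extract the three decimal components from a line like 'Vector1 ( 0.22 -0.80 -0.56)'."""
    values = re.findall(r"-?\d+\.\d+", line)
    if len(values) != 3:
        raise ValueError(f"expected three components in: {line!r}")
    return tuple(float(v) for v in values)


def angle_degrees(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


v1 = parse_vector("Vector1 ( 0.22 -0.80 -0.56)")
v2 = parse_vector("Vector2 ( -0.45 0.78 0.43)")
print(angle_degrees(v1, v2))
```

For the sample family the two stem directions come out at about 165 degrees apart, that is, nearly antiparallel.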
#RMSD matrix of conformation of loops (A)
0.00 0.99 0.61 0.49 1.57 0.47 1.99 0.32 2.19 1.26 0.60 0.73
0.99 0.00 0.80 1.04 0.94 1.09 2.86 0.87 1.58 1.27 0.83 0.36
0.61 0.80 0.00 0.62 1.57 0.81 2.36 0.46 2.14 1.30 0.67 0.46
0.49 1.04 0.62 0.00 1.72 0.65 1.92 0.48 2.30 1.05 0.65 0.79
1.57 0.94 1.57 1.72 0.00 1.68 3.29 1.53 0.89 1.61 1.31 1.19
0.47 1.09 0.81 0.65 1.68 0.00 2.08 0.51 2.33 1.42 0.94 0.88
1.99 2.86 2.36 1.92 3.29 2.08 0.00 2.13 3.74 2.22 2.19 2.61
0.32 0.87 0.46 0.48 1.53 0.51 2.13 0.00 2.12 1.19 0.57 0.59
2.19 1.58 2.14 2.30 0.89 2.33 3.74 2.12 0.00 1.87 1.79 1.79
1.26 1.27 1.30 1.05 1.61 1.42 2.22 1.19 1.87 0.00 0.89 1.22
0.60 0.83 0.67 0.65 1.31 0.94 2.19 0.57 1.79 0.89 0.00 0.64
0.73 0.36 0.46 0.79 1.19 0.88 2.61 0.59 1.79 1.22 0.64 0.00
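Because the motifs of a family are stored already superimposed, an RMSD between two loops can be sketched directly from matching coordinates. This is an assumption-laden illustration: the source does not say which atoms enter the matrix or whether the loops are re-fitted first, and rmsd is a hypothetical helper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    No superposition is performed; the coordinates are assumed to share a frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```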
#Sequence identities difference matrix of loop (%)
0.0 67.0 50.0 67.0 67.0 83.0 83.0 67.0 83.0 67.0 83.0 83.0
67.0 0.0 50.0 33.0 33.0 33.0 50.0 50.0 50.0 67.0 50.0 50.0
50.0 50.0 0.0 33.0 33.0 67.0 67.0 33.0 50.0 67.0 50.0 67.0
67.0 33.0 33.0 0.0 17.0 33.0 33.0 17.0 33.0 50.0 17.0 33.0
67.0 33.0 33.0 17.0 0.0 50.0 50.0 33.0 33.0 67.0 33.0 50.0
83.0 33.0 67.0 33.0 50.0 0.0 33.0 50.0 67.0 50.0 33.0 33.0
83.0 50.0 67.0 33.0 50.0 33.0 0.0 50.0 67.0 33.0 33.0 17.0
67.0 50.0 33.0 17.0 33.0 50.0 50.0 0.0 50.0 50.0 33.0 50.0
83.0 50.0 50.0 33.0 33.0 67.0 67.0 50.0 0.0 83.0 50.0 67.0
67.0 67.0 67.0 50.0 67.0 50.0 33.0 50.0 83.0 0.0 50.0 17.0
83.0 50.0 50.0 17.0 33.0 33.0 33.0 33.0 50.0 50.0 0.0 33.0
83.0 50.0 67.0 33.0 50.0 33.0 17.0 50.0 67.0 17.0 33.0 0.0
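The entries of the loop matrix above are consistent with a simple count of mismatching aligned positions, divided by the alignment length and rounded to a whole percent. The sketch below reproduces several sample values under that assumption; the source does not spell out the formula, and percent_difference is a hypothetical name.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    # Rounded to a whole percent, matching the one-decimal ".0" style of the sample.
    return float(round(100.0 * mismatches / len(seq_a)))


print(percent_difference("KSHESP", "KPGQPP"))  # first two loops: 4 of 6 positions differ
```

Applying the same function to the full aligned motif rows, with the "." padding counted as ordinary characters, also reproduces the 40.0 reported for the first two motifs.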
#Sequence identities difference matrix of motif (%)
0.0 40.0 35.0 30.0 40.0 55.0 45.0 40.0 45.0 45.0 40.0 55.0
40.0 0.0 35.0 25.0 40.0 25.0 30.0 35.0 35.0 35.0 25.0 45.0
35.0 35.0 0.0 20.0 40.0 50.0 40.0 30.0 30.0 45.0 30.0 50.0
30.0 25.0 20.0 0.0 25.0 35.0 25.0 20.0 20.0 35.0 15.0 40.0
40.0 40.0 40.0 25.0 0.0 50.0 40.0 35.0 35.0 40.0 35.0 50.0
55.0 25.0 50.0 35.0 50.0 0.0 25.0 40.0 50.0 35.0 25.0 45.0
45.0 30.0 40.0 25.0 40.0 25.0 0.0 25.0 40.0 25.0 15.0 25.0
40.0 35.0 30.0 20.0 35.0 40.0 25.0 0.0 30.0 35.0 25.0 40.0
45.0 35.0 30.0 20.0 35.0 50.0 40.0 30.0 0.0 45.0 30.0 55.0
45.0 35.0 45.0 35.0 40.0 35.0 25.0 35.0 45.0 0.0 30.0 35.0
40.0 25.0 30.0 15.0 35.0 25.0 15.0 25.0 30.0 30.0 0.0 35.0
55.0 45.0 50.0 40.0 50.0 45.0 25.0 40.0 55.0 35.0 35.0 0.0
#Sequence identities difference matrix of global PDB chains (%)
0.0 25.0 20.0 38.0 23.0 72.0 61.0 72.0 67.0 61.0 73.0 60.0
25.0 0.0 25.0 40.0 27.0 68.0 58.0 70.0 71.0 58.0 69.0 57.0
20.0 25.0 0.0 36.0 21.0 68.0 59.0 65.0 68.0 58.0 67.0 59.0
38.0 40.0 36.0 0.0 40.0 62.0 58.0 61.0 64.0 60.0 63.0 58.0
23.0 27.0 21.0 40.0 0.0 68.0 60.0 66.0 69.0 62.0 67.0 60.0
72.0 68.0 68.0 62.0 68.0 0.0 76.0 18.0 39.0 76.0 19.0 77.0
61.0 58.0 59.0 58.0 60.0 76.0 0.0 76.0 79.0 10.0 75.0 20.0
72.0 70.0 65.0 61.0 66.0 18.0 76.0 0.0 35.0 76.0 16.0 78.0
67.0 71.0 68.0 64.0 69.0 39.0 79.0 35.0 0.0 78.0 37.0 79.0
61.0 58.0 58.0 60.0 62.0 76.0 10.0 76.0 78.0 0.0 76.0 19.0
73.0 69.0 67.0 63.0 67.0 19.0 75.0 16.0 37.0 76.0 0.0 76.0
60.0 57.0 59.0 58.0 60.0 77.0 20.0 78.0 79.0 19.0 76.0 0.0
#FASTA SCORE matrix of loop sequence
- 14 20 15 14 13 3 13 4 10 8 4
- - 24 35 34 37 24 21 23 12 28 23
- - - 30 29 23 13 29 19 15 23 14
- - - - 40 38 28 31 28 18 38 29
- - - - - 33 23 26 26 13 33 24
- - - - - - 30 24 21 18 34 29
- - - - - - - 14 11 26 30 37
- - - - - - - - 14 16 24 15
- - - - - - - - - 1 21 12
- - - - - - - - - - 20 32
- - - - - - - - - - - 31
- - - - - - - - - - - -
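The FASTA score matrices list only the upper triangle, with "-" as a placeholder elsewhere. A minimal reader might look like the sketch below. Mirroring each score into the lower triangle assumes the score of a pair does not depend on its order, which the source does not state; parse_upper_triangle is a hypothetical helper.

```python
def parse_upper_triangle(lines):
    """Read an upper-triangular score block into a full square matrix.

    Cells given as '-' (the diagonal and lower triangle) stay None unless
    filled by mirroring a score from the upper triangle.
    """
    rows = [line.split() for line in lines if line.strip()]
    size = len(rows)
    matrix = [[None] * size for _ in range(size)]
    for i, row in enumerate(rows):
        if len(row) != size:
            raise ValueError(f"row {i} has {len(row)} cells, expected {size}")
        for j, cell in enumerate(row):
            if cell != "-":
                # Assumption: scores are symmetric, so mirror across the diagonal.
                matrix[i][j] = matrix[j][i] = int(cell)
    return matrix


scores = parse_upper_triangle(["- 14 20", "- - 24", "- - -"])
```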
#FASTA SCORE matrix of motif sequence
- 66 73 75 67 60 57 58 51 56 67 49
- - 86 99 81 100 86 76 79 74 95 66
- - - 100 79 76 73 87 83 71 88 57
- - - - 97 93 90 89 92 76 105 73
- - - - - 80 74 69 74 75 86 61
- - - - - - 95 73 68 75 100 69
- - - - - - - 71 65 86 103 85
- - - - - - - - 75 69 81 54
- - - - - - - - - 55 80 48
- - - - - - - - - - 81 73
- - - - - - - - - - - 79
- - - - - - - - - - - -
#FASTA SCORE matrix of global PDB chain sequence
- 1093 1177 921 1090 218 501 199 250 488 188 502
- - 1100 885 1046 262 557 219 204 541 221 527
- - - 940 1143 272 521 284 234 529 261 527
- - - - 876 342 525 327 297 506 311 524
- - - - - 251 500 243 185 483 250 493
- - - - - - 73 598 475 50 575 49
- - - - - - - 54 22 1276 81 1110
- - - - - - - - 480 58 588 22
- - - - - - - - - 15 452 17
- - - - - - - - - - 62 1122
- - - - - - - - - - - 47
- - - - - - - - - - - -