A Motif database with structurally diverse loops and conserved frameworks


In the Protein Data Bank (PDB), loops are the most variable regions of protein structures, but the secondary structures that anchor them are often conserved. As a result, there are many small motifs composed of a variable loop flanked by stable alpha-helices or beta-strands. This page provides several databases of such motifs derived from the PDB. An example is shown below:

a picture of loops
The picture shows an example motif family from the database. The loops adopt variable conformations, while the anchoring secondary structures are structurally conserved. This family is composed of the 6 motifs listed below:
MOTIF ID        SHEET       LOOP         SHEET
EE1bro-A_21_4   TSIDLYYEDH  GTGQ         PVVLI
EE1ede_42_7     AHYLDE      GNSDAED      VFLC
EE1ivy-A_40_10  KHLHYWFV    ESQKDPENSP   VVLWL
EE1tht-A_26_11  QELHVWET    PPKENVPFKNN  TILIA
EE1wht-A_36_10  RSLFYLLQ    EAPEDAQPAP   LVLWL
EE2ctc_53_8     PIYVLKF     STGGSNRP     AIWIDL

Here, we define some terms used in this database.

Loop: as defined by the DSSP program.
Motif: An alpha-helix [or beta-strand] -- loop -- alpha-helix [or beta-strand] segment.
Framework: A pair of secondary structures (either alpha-helices or beta-strands) bracing a loop.
Motif id: A unique id assigned to a motif, such as EH1one-B_320_7. Here E and H are the types of the secondary structures, 1one is the PDB id, B is the chain id in the PDB file, 320 is the residue number at which the loop begins, and 7 is the loop length.
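The motif id format defined above can be split into its fields mechanically. The following is a minimal Python sketch, not the database's own tooling. The names MOTIF_ID, MotifId and parse_motif_id are hypothetical. It assumes that the two leading letters are DSSP codes (E for beta-strand, H for alpha-helix) given in sequence order, that the PDB id is four characters starting with a digit, and that the chain id is optional, as in EE1ede_42_7.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <SS><SS><PDB id>[-<chain>]_<loop start>_<loop length>
MOTIF_ID = re.compile(
    r"^(?P<ss1>[EH])(?P<ss2>[EH])"      # flanking secondary-structure types
    r"(?P<pdb>[0-9][A-Za-z0-9]{3})"     # four-character PDB id
    r"(?:-(?P<chain>[A-Za-z0-9]))?"     # optional chain id
    r"_(?P<start>\d+)_(?P<length>\d+)$" # loop start residue and loop length
)


class MotifId(NamedTuple):
    ss_before: str        # assumed: structure preceding the loop
    ss_after: str         # assumed: structure following the loop
    pdb_id: str
    chain: Optional[str]  # None when the id carries no chain
    loop_start: int
    loop_length: int


def parse_motif_id(motif_id: str) -> MotifId:
    """Split a motif id into its fields, or raise ValueError."""
    m = MOTIF_ID.match(motif_id)
    if m is None:
        raise ValueError(f"not a recognised motif id: {motif_id!r}")
    return MotifId(
        m["ss1"], m["ss2"], m["pdb"], m["chain"],
        int(m["start"]), int(m["length"]),
    )


if __name__ == "__main__":
    print(parse_motif_id("EH1one-B_320_7"))
    print(parse_motif_id("EE1ede_42_7"))
```

Ids that do not follow this assumed layout, for example with negative residue numbers or insertion codes, would be rejected by the pattern and would need it to be extended.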
This database is composed of a main database and three sub-databases.
The main database contains the motif families from the entire PDB.
The first sub-database comprises the motif families from homologous protein families.
In the second sub-database, the loops in each motif family are of the same length.
The last sub-database is a collection of loops with variable conformations in different PDB files.
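To illustrate the same-length constraint of the second sub-database, the following Python sketch partitions the example family shown earlier on this page by loop length. It only illustrates the idea and is not the procedure used to build the sub-database. The names FAMILY and split_by_loop_length are hypothetical, and the check that the last field of each motif id equals the loop length follows the motif id definition given above.

```python
from collections import defaultdict

# The example family from this page: (motif id, first strand, loop, second strand)
FAMILY = [
    ("EE1bro-A_21_4", "TSIDLYYEDH", "GTGQ", "PVVLI"),
    ("EE1ede_42_7", "AHYLDE", "GNSDAED", "VFLC"),
    ("EE1ivy-A_40_10", "KHLHYWFV", "ESQKDPENSP", "VVLWL"),
    ("EE1tht-A_26_11", "QELHVWET", "PPKENVPFKNN", "TILIA"),
    ("EE1wht-A_36_10", "RSLFYLLQ", "EAPEDAQPAP", "LVLWL"),
    ("EE2ctc_53_8", "PIYVLKF", "STGGSNRP", "AIWIDL"),
]


def split_by_loop_length(family):
    """Group motif ids so that every group holds loops of one length."""
    groups = defaultdict(list)
    for motif_id, _first, loop, _second in family:
        # The last field of the motif id is the loop length; verify it.
        declared = int(motif_id.rsplit("_", 1)[1])
        if declared != len(loop):
            raise ValueError(f"{motif_id}: loop has {len(loop)} residues")
        groups[len(loop)].append(motif_id)
    return dict(groups)


if __name__ == "__main__":
    for length, ids in sorted(split_by_loop_length(FAMILY).items()):
        print(length, ids)
```

In this example family only the two motifs with 10-residue loops end up in the same group; the other four loops each have a distinct length.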
The authors of the database are:
Dr. Weizhong Li et al. of the Molecular design lab