First, the principle of homology modeling. Homology modeling predicts the three-dimensional structure of a protein of unknown structure (the target) from its primary sequence, using a homologous protein of known structure as a template and carrying out the prediction with bioinformatics methods through computer simulation and calculation.
Homology modeling rests on two principles. First, the structure of a protein is uniquely determined by its amino acid sequence, so in theory the secondary and tertiary structure can be obtained from the primary sequence. Second, the tertiary structure of proteins is more conserved in evolution than the sequence. If the amino acid sequences of two proteins are 50% identical, then about 90% of the α-carbon atoms deviate in position by no more than 3 Å, which is what makes homology modeling workable for structure prediction. Homology modeling usually requires the sequence identity between the template protein and the target protein to be higher than 30%.
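The 30% identity requirement can be checked on any pairwise alignment before committing to a template. The following is a minimal sketch, not part of Modeller: the function names are hypothetical, and identity is computed over aligned columns in which neither sequence has a gap, which is only one of several common definitions (other tools divide by the alignment length or by the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: both strings come from the same alignment, so they have
    equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def usable_as_template(aligned_target: str, aligned_template: str,
                       threshold: float = 30.0) -> bool:
    """Apply the rule of thumb from the text: identity higher than 30%."""
    return percent_identity(aligned_target, aligned_template) > threshold
```

Both arguments are the gapped rows of one alignment, for example the target and template rows taken from an .ali file.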
• Single template modeling
Step 1. Search for templates
Methods for searching templates
✓ Thorough approach: Use Modeller's build_profile.py to search the local sequence database, then use compare.py to compare the candidate templates with one another and obtain a distance matrix, and select the desired template based on that matrix.
✓ Broad approach: Use the BLAST search offered by databases such as UniProt or the PDB to find similar sequences, then look up the desired entry in the PDB and download its structure as a template.
✓ Simple approach: Directly use the template found by SWISS-MODEL.
Step 2. Template alignment
Convert your own sequence to the .ali format, following the corresponding file in the basic-example folder, then run align2d.py with mod9.18 to generate the myprotein-1aoaA.ali and myprotein-1aoaA.pap files.
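Converting a raw sequence to the .ali format can be scripted. The sketch below writes a PIR-style entry of the kind used for target sequences in the Modeller basic example: a `>P1;code` line, a ten-field colon-separated description line beginning with `sequence`, and the residues terminated by `*`. The function name `to_pir` and the 75-column line width are assumptions made for illustration; compare the output against the .ali file in your own basic-example folder.

```python
def to_pir(code: str, sequence: str, width: int = 75) -> str:
    """Return a PIR (.ali) entry for a target sequence with no known structure.

    `code` is the alignment code (for example 'myprotein'); `sequence` is the
    one-letter amino acid sequence and may contain whitespace or line breaks.
    """
    cleaned = "".join(sequence.split()).upper().rstrip("*")
    if not cleaned:
        raise ValueError("sequence is empty")
    # Ten colon-separated fields; the residue range and chain fields stay empty
    # because the target has no structure file.
    fields = ["sequence", code, "", "", "", "", "", "", "0.00", " 0.00"]
    description = ":".join(fields)
    residues = cleaned + "*"  # '*' marks the end of the sequence
    body = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([f">P1;{code}", description, *body]) + "\n"


if __name__ == "__main__":
    # Placeholder residues for illustration only, not a real protein.
    with open("myprotein.ali", "w") as handle:
        handle.write(to_pir("myprotein", "MKTAYIAKQR"))
```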
Step 3. Model establishment
In the output summary, the lower the molpdf and DOPE scores of the six models the better, and the closer the GA341 score is to 1 the better. Here, the second model, myprotein.B99990002.pdb, is selected.
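The selection rule above (lower molpdf and DOPE, GA341 close to 1) can be written down explicitly. This is a hedged sketch rather than the author's procedure: the `ModelScore` type, the `pick_best_model` name, the GA341 cutoff parameter, and the choice to rank by DOPE first and break ties with molpdf are all assumptions, and the scores have to be copied in from your own Modeller output.

```python
from typing import NamedTuple, Sequence


class ModelScore(NamedTuple):
    filename: str
    molpdf: float
    dope: float
    ga341: float


def pick_best_model(scores: Sequence[ModelScore],
                    min_ga341: float = 0.0) -> ModelScore:
    """Pick the model with the lowest DOPE score among acceptable models.

    Models whose GA341 score is below `min_ga341` are discarded first;
    molpdf is used only to break ties between equal DOPE scores.
    """
    candidates = [s for s in scores if s.ga341 >= min_ga341]
    if not candidates:
        raise ValueError("no model passes the GA341 cutoff")
    return min(candidates, key=lambda s: (s.dope, s.molpdf))
```

Ranking on a single score keeps the rule simple; when molpdf and DOPE disagree, inspecting the candidate structures by eye is still advisable.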
Step 4. Model evaluation
Visualizing the model in VMD shows clearly that a large region has not been modeled, forming three tails. Single-template modeling performs poorly here because one template supplies only a limited amount of information.
• Multi-template modeling
Step 1. Template merging
First, use salign.py to align the templates with one another and merge them into one alignment.
Step 2. Template alignment
Use align2d_mult.py to align the target sequence with the templates.
Step 3. Manually add the template
Use the single-sequence alignment script align2d.py to generate the alignment result file myprotein-2f2oA.ali, and add its template entry to myprotein-mult.ali.
Step 4. Model establishment
Use model_mult.py to build a model.
Step 5. Model visualization
• PROCHECK evaluation details
The following Ramachandran plot is for the model produced by Modeller's multi-template modeling. Red marks the most favoured regions, yellow the allowed regions, and white the disallowed regions.
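Each point in a Ramachandran plot is the (phi, psi) pair of one residue, and both values are dihedral angles between four backbone atoms: phi is defined by C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The sketch below shows how such an angle can be computed from atomic coordinates. It is an illustration only and not PROCHECK's own code; the function names are hypothetical, and coordinates are plain (x, y, z) tuples in Å.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle in degrees, in the range [-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("the two central atoms coincide")
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, axis)
    d2 = _dot(b2, axis)
    v = (b0[0] - d0 * axis[0], b0[1] - d0 * axis[1], b0[2] - d0 * axis[2])
    w = (b2[0] - d2 * axis[0], b2[1] - d2 * axis[1], b2[2] - d2 * axis[2])
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev: Vec, n: Vec, ca: Vec, c: Vec) -> float:
    """Backbone phi angle of residue i."""
    return dihedral(c_prev, n, ca, c)


def psi(n: Vec, ca: Vec, c: Vec, n_next: Vec) -> float:
    """Backbone psi angle of residue i."""
    return dihedral(n, ca, c, n_next)
```

Whether a given (phi, psi) pair falls in a favoured, allowed, or disallowed region is decided by PROCHECK's own region map, which is not reproduced here.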