DeepMainMast

DeepMainMast is a de novo modeling protocol that builds an entire protein 3D model directly from a cryo-EM map with a resolution of up to 5 Å. It outputs a .pdb file containing the protein structure modeled from the input map. If the cryo-EM map contains DNA/RNA, only the protein structures within the complex are modeled. To model the complete complex, we recommend ComplexModeler (https://em.kiharalab.org/algorithm/ComplexModeler).

Our results webpage comprises three tabs: Results Visualization, Output Logs, and Job Configuration.

Results Visualization:
The "Results Visualization" panel shows the protein structure generated by DeepMainMast.
On the right-hand side, the "Download Outputs" button lets you download the modeled structure in .pdb format. The downloaded file consolidates all output structures (four with Rosetta, two without) into a single file; each structure is described below. You can visualize or refine the structure with tools such as PyMOL, Coot, or Chimera.
You can also view the map online by clicking the "Show map" button. Once loaded, the default contour level matches your input; to change it, click the "..." button beside "isosurface." Under "Type: Isosurface", use the slider bars to adjust the iso-surface value and opacity. This lets you assess how well the modeled structure fits the map.

The output 3D model is colored by the DAQ (AA) score, from red (-1.0) to blue (1.0), which indicates local structural quality. Blue marks well-modeled regions; red marks regions that may be less reliable. The output contains four models when Rosetta is used and two models when it is not.

  • MODEL1: Cα-only structure, where all modeled positions are colored by the DAQ (AA) score.
  • MODEL2: Cα-only structure, excluding amino acids with a DAQ (CA) score below -0.5 and replacing amino acids with a DAQ (AA) score below -0.5 with 'UNK'.
  • MODEL3: Full-atom structure, where all modeled positions are colored by the DAQ (AA) score. Rosetta is used to build the full-atom model.
  • MODEL4: Full-atom structure, excluding amino acids with a DAQ (CA) score below -0.5 and replacing amino acids with a DAQ (AA) score below -0.5 with 'UNK'. Rosetta is used to build the full-atom model.

Comparing these models shows which positions pass the DAQ score thresholds and which were removed or replaced with 'UNK'.
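Because the downloaded file holds several structures, it can be convenient to separate them before loading them into other tools. The following is a minimal sketch, assuming the file uses the standard PDB MODEL/ENDMDL records to delimit each structure; the function name `split_models` and the two example ATOM lines are made up for illustration and are not real DeepMainMast output.

```python
def split_models(pdb_text):
    """Split multi-model PDB text into a dict {model_number: [record lines]}.

    Assumes each structure is wrapped in standard MODEL / ENDMDL records.
    """
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            # The model serial number follows the record name.
            current = int(line[6:].split()[0])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif current is not None:
            models[current].append(line)
    return models


# Illustrative input only: coordinates and values are invented.
example = """MODEL        1
ATOM      1  CA  ALA A   1      11.104  13.207   2.100  1.00  0.00           C
ENDMDL
MODEL        2
ATOM      1  CA  UNK A   1      11.104  13.207   2.100  1.00  0.00           C
ENDMDL
"""

models = split_models(example)
for number, lines in models.items():
    print(f"MODEL{number}: {len(lines)} record(s)")
```

Each entry of the returned dictionary can then be written to its own file if a viewer handles single-model files more comfortably.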

Output Logs:
The 'Output Logs' panel compiles all outputs generated by the scripts.
If you're interested in monitoring the job's progress during execution, this section provides a comprehensive overview.

Job Configuration:
In the 'Job Configuration' panel, you'll find the input parameters used for this specific job. These records serve to maintain a log of your submitted input for reference.

Problem Debugging:
If you encounter any issues, please report them to us by email. Use the subject line 'DeepMainMast problem: [jobid]', where [jobid] is the job ID displayed in the title. The job ID lets us locate and debug the job in the backend and respond promptly.

Contact:
dkihara@purdue.edu, gterashi@purdue.edu, xiaowang20140001@gmail.com.

DeepMainMast is a de novo modeling protocol that builds an entire protein 3D model directly from an EM map of up to 5 Å resolution.

If you encounter problems, please contact Daisuke Kihara (dkihara@purdue.edu), Xiao Wang (xiaowang20140001@gmail.com), or Genki Terashi (gterashi@purdue.edu).


Example: EMD-2513 (three chains)
Input Map file: emd_2513.mrc

Input Sequence file: emd_2513.fasta

Input AF2 template file: emd_2513_af2.pdb

Contour level: 0.01

Result Example: full-atom model building and refinement by Rosetta
Tutorials: Tutorial PPT, Tutorial Web, Workshop Video


Click "Schedule Job" once you have filled in all input fields.

Because the DM computation is limited by memory, please use DiffModeler(seq) instead if your structure has more than 10k residues.

Please make sure your contour level is lower than the density values of your region of interest. This is an absolute density threshold, not a number of standard deviations.


If you are not sure, use a threshold of 0; our model can automatically detect the structure regions.
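If your viewer reports the contour level in standard deviations, the value has to be converted to an absolute density before it is entered here. The sketch below shows one way to do that, assuming the sigma level is measured as standard deviations above the mean density of the map; some programs measure from zero instead, so check your viewer's convention. The function name `sigma_to_absolute` and the voxel values are illustrative only; reading the voxel values from the map file is not shown.

```python
from statistics import fmean, pstdev


def sigma_to_absolute(densities, n_sigma):
    """Convert a contour level in standard deviations to an absolute density.

    Assumption: the sigma level is counted from the mean density of the map.
    """
    values = list(densities)
    return fmean(values) + n_sigma * pstdev(values)


# Toy voxel values standing in for a real density map.
toy_map = [0.0, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.1]
absolute_level = sigma_to_absolute(toy_map, 1.0)
print(f"absolute contour level: {absolute_level:.3f}")
```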







Please use a sequence file in FASTA format. If the target protein has multiple chains, include all sequences. If it has multiple copies of the same chain (homo-multimer), include that sequence once per copy, so that the number of entries matches the number of copies in the complex. Each chain must have an ID line (beginning with a greater-than sign (">")) and a sequence line.
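As an illustration of the rule above, the following sketch writes one FASTA entry per chain copy. The helper name `build_fasta`, the ID naming scheme, and the short sequences are hypothetical placeholders, not a format required by the server beyond what is described above.

```python
def build_fasta(chains):
    """Build FASTA text with one entry (ID line + sequence line) per chain copy.

    chains: list of (name, sequence, copies) tuples.
    """
    entries = []
    for name, sequence, copies in chains:
        for copy_index in range(1, copies + 1):
            # Repeat the same sequence once for every copy in the complex.
            entries.append(f">{name}_{copy_index}\n{sequence}")
    return "\n".join(entries) + "\n"


# Hypothetical complex: two copies of chain A and one copy of chain B.
fasta_text = build_fasta([
    ("chainA", "MKTAYIAKQR", 2),
    ("chainB", "GSHMLEDPVA", 1),
])
print(fasta_text)
```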




AlphaFold2 (AF2) modeled structure in PDB format. Please combine all single-chain structures into one PDB file, with a "TER" record separating the records of different chains. The chain IDs do not matter. For identical chains, you only need to provide the chain records once. Submitting an AF2 model is optional, but we highly recommend it because it often improves accuracy substantially. To obtain AF2 models or to use AlphaFold3, see the tutorial.
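Combining the single-chain models can be scripted. The sketch below is one possible approach, assuming each input is the text of a single-chain PDB file: it keeps only the ATOM/HETATM records of each chain and closes every chain with a "TER" record. The function name `combine_chains` and the example records are made up for illustration.

```python
def combine_chains(chain_texts):
    """Concatenate single-chain PDB texts into one PDB text.

    Only ATOM/HETATM records are kept; each chain is followed by a TER record.
    """
    out = []
    for text in chain_texts:
        atoms = [line for line in text.splitlines()
                 if line.startswith(("ATOM", "HETATM"))]
        if atoms:
            out.extend(atoms)
            out.append("TER")
    out.append("END")
    return "\n".join(out) + "\n"


# Invented one-atom "chains" standing in for real single-chain AF2 models.
chain_1 = (
    "ATOM      1  CA  MET A   1      10.000  10.000  10.000  1.00  0.00           C\n"
    "END\n"
)
chain_2 = (
    "ATOM      1  CA  GLY A   1      20.000  20.000  20.000  1.00  0.00           C\n"
    "END\n"
)
combined = combine_chains([chain_1, chain_2])
print(combined)
```

With real files, each element of the input list would be the contents of one single-chain model read from disk.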

Leave this field empty if you do not plan to use an AlphaFold2 structure.




Please select Yes if two or more chains are identical. This is important for correct chain assignment of identical chains.




Rosetta refinement builds the full-atom structure, but it takes a very long time (about 5 hours per 1,000 residues). Select Yes only if you need a full-atom structure, and expect a long wait.
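To get a rough idea of the waiting time before submitting, the rate quoted above can be applied to the residue count of your FASTA file. This is a back-of-the-envelope sketch only: the function names are hypothetical, the estimate assumes the time scales linearly with the number of residues, and actual run times will vary.

```python
def count_residues(fasta_text):
    """Count residues in FASTA text, skipping ID lines that start with '>'."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if line.strip() and not line.startswith(">")
    )


def rosetta_hours(n_residues, hours_per_1000=5.0):
    """Rough Rosetta time estimate, assuming linear scaling with residue count."""
    return n_residues / 1000 * hours_per_1000


# Made-up two-chain input.
toy_fasta = ">chainA\nMKTAYIAKQR\n>chainB\nGSHMLEDPVA\n"
n = count_residues(toy_fasta)
print(f"{n} residues, about {rosetta_hours(n):.2f} hours with Rosetta")
```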