DiffModeler is a computational tool that uses a diffusion model to automatically build full protein complex structures from cryo-EM maps at 0-20 Å resolution.
Full protein complex structure in .cif file
Collect a 3D cryo-EM map from the microscope in .mrc or .map format.
Example map
You can also find many maps in EMDataResource to use as test
examples.
Please compress all single-chain PDB files into a single .zip, .tar.gz, or .tar file for upload. We can also model part of the
protein complex if you only know the structures of some of its chains.
Please check the AlphaFold Database for single-chain structures by
UniProt ID.
You can also use the EBI Search Tool to search structure
databases for the most similar structures, which we can use as templates to model the protein complex.
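The archive of single-chain PDB files described above can be prepared with Python's standard zipfile module. The following is a minimal sketch, not an official DiffModeler utility: the function name zip_pdb_files is hypothetical, and storing the files at the top level of the archive is an assumption that the instructions do not state explicitly.

```python
import zipfile
from pathlib import Path


def zip_pdb_files(pdb_dir, archive_path):
    """Bundle every .pdb file found directly in pdb_dir into a .zip archive.

    Returns the list of file names written, in sorted order.
    """
    pdb_files = sorted(Path(pdb_dir).glob("*.pdb"))
    if not pdb_files:
        raise ValueError(f"no .pdb files found in {pdb_dir}")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pdb_file in pdb_files:
            # Assumption: files are stored by bare name (no directories) so
            # that names such as "p142.pdb" match the entries in the config file.
            archive.write(pdb_file, arcname=pdb_file.name)
    return [pdb_file.name for pdb_file in pdb_files]
```

Calling zip_pdb_files("templates", "templates.zip") would collect every .pdb file in a local "templates" directory; both names are placeholders.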
pdb_config_file is a text file (only .txt files are accepted) in which each line lists a PDB file name from the zipped
archive and its corresponding chains. One file can correspond to several identical chains.
Example config file:
Suppose you have p142.pdb (corresponding to chains A and B) and p143.pdb (corresponding to chain C) in the zip file;
then the config file should be
p142.pdb A B
p143.pdb C
Each line of the config file should have the form "[file_name] [chain_id1] [chain_id2] ...". If one template corresponds to multiple
chains, add all of their chain IDs on the same line, separated by spaces.
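The config format above is simple enough to generate programmatically, which avoids spacing mistakes. The sketch below is illustrative only: build_config is a hypothetical helper, and the check that no chain ID is assigned to two templates is an assumption about what the server expects, not a documented rule.

```python
def build_config(template_chains):
    """Return config text with one '[file_name] [chain_id1] ...' line per template.

    template_chains maps a PDB file name to the list of chain IDs it models.
    """
    lines = []
    seen = set()
    for file_name, chains in template_chains.items():
        if not chains:
            raise ValueError(f"{file_name} has no chain IDs")
        for chain in chains:
            # Assumption: each chain ID should be assigned to only one template.
            if chain in seen:
                raise ValueError(f"chain {chain} is assigned more than once")
            seen.add(chain)
        # File name first, then its chain IDs, all separated by single spaces.
        lines.append(" ".join([file_name, *chains]))
    return "\n".join(lines) + "\n"
```

For the example above, build_config({"p142.pdb": ["A", "B"], "p143.pdb": ["C"]}) produces the two lines "p142.pdb A B" and "p143.pdb C", which can then be written to a .txt file.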
Please use a sequence file in FASTA format. Each chain must have an ID line (beginning with a greater-than sign, ">")
and a SEQUENCE line.
In the ID line, please include only the chain ID without any other information. If multiple chains share an
identical sequence, list them on one ID line, separated by commas (",").
Example Sequence ID line:
>A,B,C,D
MATPAGRRASETERLLTPNPGYGTQVGTSPAPTTPTEEEDLRR
>E,F
VVTFREENTIAFRHLFLLGYSDGSDDTFAAYTQEQLYQ
This indicates 6 chains: A, B, C, and D share one identical sequence, and E and F share another.
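Before uploading, it can help to confirm that a sequence file expands to the chains you expect. The following is a minimal sketch of a parser for the comma-separated ID line convention described above; parse_chain_fasta is a hypothetical helper, not the server's own parser, and the rejection of repeated chain IDs is an assumption.

```python
def parse_chain_fasta(text):
    """Map each chain ID to its sequence for a FASTA with '>A,B' style ID lines."""
    chain_to_seq = {}
    current_ids = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID line holds only chain IDs, separated by commas.
            current_ids = [c.strip() for c in line[1:].split(",") if c.strip()]
            if not current_ids:
                raise ValueError("ID line has no chain IDs")
            for chain_id in current_ids:
                # Assumption: a chain ID may appear on only one ID line.
                if chain_id in chain_to_seq:
                    raise ValueError(f"chain {chain_id} is listed more than once")
                chain_to_seq[chain_id] = ""
        else:
            if not current_ids:
                raise ValueError("sequence line found before any ID line")
            # Every chain on the preceding ID line shares this sequence.
            for chain_id in current_ids:
                chain_to_seq[chain_id] += line
    return chain_to_seq
```

Applied to the example file above, the result has six keys (A through F), with A, B, C, and D mapped to the first sequence and E and F mapped to the second.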
Please make sure your contour level is lower than the density values of your region of interest.
This is an absolute density threshold, not a number of standard deviations.
Please do not enter 0; you must provide a contour level to remove the very noisy outer regions.
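If your viewer reports the contour level in standard deviations (sigma), it has to be converted to an absolute density value before it is entered. The sketch below assumes the common convention that a sigma level is measured from the mean density over all voxels; viewers differ on this, so please confirm the convention your software uses. The function name sigma_to_absolute is hypothetical, and densities stands for the voxel values of your map flattened into a sequence of floats.

```python
import statistics


def sigma_to_absolute(densities, sigma_level):
    """Convert a sigma-based contour level to an absolute density threshold.

    Assumption: the sigma level means 'mean + sigma_level * standard deviation',
    computed over all voxel values in the map.
    """
    mean = statistics.fmean(densities)
    std = statistics.pstdev(densities)  # population standard deviation
    return mean + sigma_level * std
```

For instance, a map whose voxel values have mean 2.0 and standard deviation 1.0 gives an absolute threshold of 4.0 at a 2-sigma contour.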
For maps at 0-2 Å resolution, the diffusion process will be skipped.
Therefore, if you want to use the diffusion model, please just enter an approximate resolution for your map.
Once you have collected the input files, please submit your job here (DiffModeler(seq)->here). For each input field, please provide the files or information collected above.
Once you have filled in all inputs, simply click the upload button to submit the job. After submission, you will be redirected to the “View Job” page. If you are not registered, please bookmark the link; once the job is done, you can view it from this link. If you are registered, you will receive an email notification once the job is done, and you can also check the job status from the my jobs list under job manager.
Once the job is done, you can check the modeled structure from the link you bookmarked. There you can also download the modeled structure in .pdb format by clicking the “Download Outputs” button. You can also visualize the 3D cryo-EM map online to check its consistency with the modeled structure. For more detailed instructions, please see the “Instructions” on the same page.
If you notice any strange outputs or a job failure, please submit a backend review using the field at the bottom of the “View Job” page. We will get back to you as soon as possible.
Full code is available here.
Wang, X., Zhu, H., Terashi, G., Taluja, M., & Kihara, D. (2024). DiffModeler: Large macromolecular structure modeling for cryo-EM maps using a diffusion model. Nature Methods. https://doi.org/10.1038/s41592-024-02479-0