DAQFinder

DAQFinder is a tool designed to identify protein sequences in cryo-EM maps using data from the UniProt and PDB databases. It first runs the DeepMainmast main-chain tracing process to determine the protein backbone structure. It then searches protein sequence databases (UniProt and PDB) based on the amino acid types predicted along the traced backbone. To speed up the search, DAQFinder uses MMseqs2 as a prefilter, followed by dynamic programming (DP) for final scoring. Scores are normalized to Z-scores; a Z-score of 10.0 or higher indicates a significant hit.
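
The Z-score normalization and the 10.0 cutoff described above can be illustrated with a minimal Python sketch. It assumes that each raw score is normalized against the mean and population standard deviation of all scores from the same search; the text does not specify the reference distribution, so treat that as an assumption. The function names are hypothetical and are not part of DAQFinder.

```python
from statistics import mean, pstdev

SIGNIFICANCE_THRESHOLD = 10.0  # cutoff stated in the text above


def z_scores(raw_scores):
    """Convert raw alignment scores to Z-scores.

    Assumption: scores are normalized against the mean and population
    standard deviation of all scores in the same search.
    """
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All scores are identical, so no sequence stands out.
        return {seq_id: 0.0 for seq_id in raw_scores}
    return {seq_id: (score - mu) / sigma for seq_id, score in raw_scores.items()}


def significant_hits(raw_scores, threshold=SIGNIFICANCE_THRESHOLD):
    """Return (sequence_id, z_score) pairs at or above the threshold, best first."""
    zs = z_scores(raw_scores)
    hits = [(seq_id, z) for seq_id, z in zs.items() if z >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling significant_hits with a dictionary that maps sequence IDs to raw scores returns only the sequences whose Z-score reaches the cutoff, sorted from highest to lowest.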

When PDB95 is used as the target database, alignments and scores are calculated for all sequences. However, when UniProt+PDB is used as the target database, MMseqs2 searches the databases first, and only the resulting sequences are aligned and scored. If MMseqs2 does not find any hits in the initial search, the result will be shown as "No Sequence Found".
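
The difference between the two database modes can be sketched as follows. This is only an outline of the control flow described above, not DAQFinder's actual code: load_all, prefilter, and align_and_score are hypothetical callables supplied by the caller, standing in for the database loader, the MMseqs2 prefilter, and the DP scoring step. Passing the query as a single predicted sequence is also an assumption.

```python
NO_HIT_MESSAGE = "No Sequence Found"  # message quoted in the text above


def search(predicted_sequence, database_name, load_all, prefilter, align_and_score):
    """Return (sequence_id, score) pairs sorted best first, or the no-hit message.

    load_all, prefilter and align_and_score are hypothetical stand-ins for
    the database loader, the MMseqs2 prefilter and the DP scoring step.
    """
    if database_name == "PDB95":
        # PDB95: every sequence in the database is aligned and scored.
        candidates = load_all(database_name)
    else:
        # UniProt+PDB: only sequences returned by the prefilter are scored.
        candidates = prefilter(predicted_sequence, database_name)
        if not candidates:
            return NO_HIT_MESSAGE
    scored = [
        (seq_id, align_and_score(predicted_sequence, sequence))
        for seq_id, sequence in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```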

DAQFinder has been tested on EM maps at resolutions up to 5 Å. For maps and regions with lower (worse) resolution, predictions of protein backbone structures and amino acid types are less reliable. This typically results in shorter hits and lower Z-scores, or in "No Sequence Found" for a UniProt+PDB search.

Our results webpage comprises three tabs: Results Visualization, Output Logs, and Job Configuration.

Results Visualization:
The "Results Visualization" panel consists of two main components: a structure viewer and a table. The table displays the sequence IDs, Z-scores, and sequence details identified by DAQFinder. It is sorted by Z-score, with higher Z-scores indicating more likely hits. When you click the "ShowChain" button, only the selected structure is displayed in the structure viewer along with the map. Clicking the "Download" button allows you to download the structure.

Output Logs:
The "Output Logs" panel compiles all outputs generated by the scripts.
If you're interested in monitoring the job's progress during execution, this section provides a comprehensive overview.

Job Configuration:
In the "Job Configuration" panel, you'll find the input parameters used for this specific job. These records serve to maintain a log of your submitted input for reference.

Problem Debugging:
If you encounter any issues, please report them to us by email. Use the subject line format "DAQFinder problem: [jobid]", where [jobid] is the job ID displayed in the title. This lets us locate and debug the job in the backend quickly and respond promptly.

Contact:
dkihara@purdue.edu

DAQFinder is a sequence database search protocol that uses backbone models traced from an EM map of up to 5 Å resolution.

The sequence database to search. The default database is PDB95. You can also use UniProt or Custom (upload your FASTA file below).

Please upload your FASTA file if you select Custom.
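
Before uploading a custom FASTA file, it can help to check it locally. The following is a minimal Python sketch of such a check. The allowed alphabet (the 20 standard amino acids plus the ambiguity codes B, J, O, U, X, and Z) is an assumption; the server's actual validation rules are not documented here, and the function names are hypothetical.

```python
# Assumed alphabet: 20 standard amino acids plus common ambiguity codes.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY" + "BJOUXZ")


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_fasta(text):
    """Return a list of human-readable problems; an empty list means no problems found."""
    problems = []
    records = read_fasta(text)
    if not records:
        problems.append("no records found")
    for header, sequence in records:
        if not sequence:
            problems.append(f"{header}: empty sequence")
        bad = sorted(set(sequence) - ALLOWED)
        if bad:
            problems.append(f"{header}: unexpected characters {''.join(bad)}")
    return problems
```

For example, reading the file contents with open(path).read() and passing them to check_fasta lists any records with empty sequences or unexpected characters.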