SMART (Simple Modular Architecture Research Tool) is an online tool for identifying and annotating protein domains. Its data is synchronized with the UniProt, Ensembl and STRING databases, and more than 1,300 protein domains have been manually annotated. First released in 1997, it is still widely used more than 20 years later.

Tool URL:


SMART has two modes: normal and genomic. They differ mainly in the underlying protein database: the normal-mode database contains redundant entries, while genomic mode uses only the proteomes of completely sequenced genomes.

The two modes use different color schemes, but the interface is otherwise similar. You can switch by clicking the corresponding mode. For example, enter genomic mode like this:

You can search for protein domains by entering a UniProt/Ensembl protein ID (or accession number, ACC), or by pasting in the protein sequence itself.
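If you are not sure which kind of input you have, a small local check before pasting it into the form can help. The following is only a minimal sketch: the function name `classify_query` is made up for illustration, the UniProt accession pattern is the one published in the UniProt help pages, and the entry-name pattern, the Ensembl protein ID pattern and the accepted amino acid letters are assumptions that may not match exactly what SMART accepts.

```python
import re

# UniProt accession pattern as published in the UniProt help pages.
UNIPROT_ACC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Assumption: entry names (IDs) look like P53_HUMAN.
UNIPROT_ID = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")
# Assumption: Ensembl protein IDs are ENS + optional species prefix + P + 11 digits.
ENSEMBL_PROTEIN = re.compile(r"ENS[A-Z]*P[0-9]{11}")
# Assumption: 20 standard residues plus the common ambiguity/rare codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYBXZUO")


def classify_query(text: str) -> str:
    """Guess whether the text is an ID, an accession, or a raw sequence."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        return "empty"
    # Drop a FASTA header line if one is present.
    if lines[0].startswith(">"):
        lines = lines[1:]
    # Join the remaining lines and remove any inner whitespace.
    query = "".join("".join(lines).split()).upper()
    if not query:
        return "empty"
    if ENSEMBL_PROTEIN.fullmatch(query):
        return "ensembl_id"
    if UNIPROT_ACC.fullmatch(query):
        return "uniprot_acc"
    if UNIPROT_ID.fullmatch(query):
        return "uniprot_id"
    if set(query) <= AMINO_ACIDS:
        return "sequence"
    return "unknown"
```

For example, `classify_query("P04637")` is recognized as an accession and `classify_query("P53_HUMAN")` as an entry name. Identifiers are tested before sequences because accessions always contain digits, while sequences contain only letters.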

Click the Sequence SMART button to submit the job. After a few seconds, you get the prediction result shown in the figure below. The domain architecture diagram on the page is interactive, and it can also be saved as an SVG vector graphic for use in a paper.

The following is a list of predicted domains, including those shown and not shown in the figure above.

If you have many sequences to analyze, you can use SMART's batch processing page instead (as shown in the figure below, the question mark button in the Sequence analysis section links to it). URL:

Prepare the UniProt/Ensembl protein IDs (or ACCs) or the protein sequences (in FASTA format) according to the requirements on the page. You can submit up to 10,000.
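If your FASTA file holds more than 10,000 sequences, you will need to split it before submitting. Here is a minimal sketch of one way to do that locally. The function names `parse_fasta` and `batch_fasta` are made up for illustration, and the code only prepares text to paste or upload; it does not talk to SMART in any way.

```python
from typing import Iterator, List, Tuple

MAX_PER_BATCH = 10_000  # submission limit mentioned in the text


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batch_fasta(
    records: List[Tuple[str, str]], batch_size: int = MAX_PER_BATCH
) -> Iterator[str]:
    """Yield FASTA-formatted strings with at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        yield "".join(f">{header}\n{seq}\n" for header, seq in batch)
```

Each string yielded by `batch_fasta(parse_fasta(text))` can then be saved to its own file and submitted as a separate batch job.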

This is just a brief introduction. If you are interested, go ahead and try the other functions~

