COGcollator 2.0:

a tool for analysis of distant relationships between homologous protein families
Please cite: Dibrova D.V., Rykov S.Y. (2024) COGcollator 2.0: an improved web server for analysis of distant relationships between homologous protein families. Bioinformatics of Genome Regulation and Structure/Systems Biology (BGRS/SB-2024), Fourteenth International Multiconference. Abstracts. August 5–10, 2024, Novosibirsk, Russia: P. 60–62. [DOI: 10.18699/bgrs2024-1.1-16]
Database
Available databases:
  • COG2020 — the latest version of the profile HMMs for the COG database released in 2020 (March 2024, 4870 entries)
  • Pfam — the Pfam 36.0 release (July 2023, 20795 entries)
ID
Enter the domain ID that corresponds to the selected domain database. For the Pfam database, you can use either a standard identifier (e.g. PF00001) or a short unique name (e.g. 7tm_1).
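The two accepted forms of a Pfam ID can be told apart with a simple pattern check. The sketch below is only an illustration, not the server's own code: the function name `describe_pfam_id` is hypothetical, and the pattern (the letters PF followed by five digits) is an assumption based on the example above and the usual Pfam accession convention.

```python
import re

# Assumed shape of a standard Pfam identifier: "PF" plus five digits (e.g. PF00001).
PFAM_ACCESSION = re.compile(r"^PF\d{5}$")


def describe_pfam_id(domain_id: str) -> str:
    """Return "accession" for IDs like PF00001, otherwise "name" (e.g. 7tm_1).

    Hypothetical helper: it only classifies the string and does not check
    that the ID actually exists in the Pfam release used by the server.
    """
    cleaned = domain_id.strip()
    if not cleaned:
        raise ValueError("domain ID must not be empty")
    if PFAM_ACCESSION.match(cleaned):
        return "accession"
    return "name"


if __name__ == "__main__":
    for example in ("PF00001", "7tm_1"):
        print(example, describe_pfam_id(example))
```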
E-value Threshold:
Only hits with an E-value not higher than this threshold are shown on the graph. The E-value of each hit of the selected profile is calculated by hmmsearch for the full sequence (that is, proteins containing a duplication of the target domain obtain a much higher score). Hits were identified in a database of prokaryotic proteins sampled from the 1309 genomes used to construct the COG database (Galperin et al., 2021) and in a database of eukaryotic proteins sampled from 102 manually selected eukaryotic genomes. For the analysis of Pfam domains, the number of eukaryotic species was reduced to 61 at the expense of vertebrates and insects.
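The thresholding rule can be sketched as follows. This assumes the hits are available in the tabular layout written by `hmmsearch --tblout`, where the fifth whitespace-separated column holds the full-sequence E-value; how the server actually stores its hits is not stated here. The function name `filter_hits_by_evalue` and the sample records are hypothetical.

```python
from typing import Iterable, List, Tuple


def filter_hits_by_evalue(
    tblout_lines: Iterable[str], threshold: float
) -> List[Tuple[str, float]]:
    """Keep hits whose full-sequence E-value is not higher than `threshold`.

    Assumes the `hmmsearch --tblout` column order: target name, accession,
    query name, accession, full-sequence E-value, score, bias, ...
    """
    kept = []
    for line in tblout_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target_name = fields[0]
        full_seq_evalue = float(fields[4])
        if full_seq_evalue <= threshold:  # "not higher than" the threshold
            kept.append((target_name, full_seq_evalue))
    return kept


# Made-up records in the assumed tblout layout (not real search results).
SAMPLE_TBLOUT = [
    "# target name  accession  query name  accession  E-value ...",
    "protA - 7tm_1 PF00001.24 1e-50 170.1 0.1 2e-50 169.0 0.1 1.0 1 0 0 1 1 1 1 -",
    "protB - 7tm_1 PF00001.24 0.001 15.2 0.0 0.002 14.1 0.0 1.0 1 0 0 1 1 1 1 -",
    "protC - 7tm_1 PF00001.24 0.5 5.3 0.0 0.7 4.9 0.0 1.0 1 0 0 1 1 1 1 -",
]

if __name__ == "__main__":
    for name, evalue in filter_hits_by_evalue(SAMPLE_TBLOUT, 1e-3):
        print(name, evalue)
```

With a threshold of 1e-3, the hit with an E-value exactly equal to the threshold is kept, because the comparison is inclusive.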
Filter:
If “Yes”, only COGs whose annotated coordinates overlap by at least 5 aa with hits of the current domain profile HMM are shown. This hides COGs that appear on the graph not because of a possible homology relationship with the current domain but because they are fused with it and/or with related domains. Proteins with all their COGs hidden are shown as proteins without any COGs (small black dots).
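A minimal sketch of the overlap rule, assuming 1-based inclusive residue coordinates for both the COG annotations and the profile hits (the coordinate convention is not specified here). The names `overlap_length` and `filter_cogs` are hypothetical and do not come from the server.

```python
from typing import List, Tuple

MIN_OVERLAP_AA = 5  # minimum overlap, in amino acids, stated for the filter

Interval = Tuple[int, int]            # (start, end), assumed 1-based inclusive
CogAnnotation = Tuple[str, int, int]  # (COG ID, start, end)


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def filter_cogs(
    cogs: List[CogAnnotation],
    hits: List[Interval],
    min_overlap: int = MIN_OVERLAP_AA,
) -> List[CogAnnotation]:
    """Keep COGs overlapping at least one hit by `min_overlap` residues or more.

    An empty result corresponds to a protein displayed without any COGs.
    """
    return [
        (cog_id, start, end)
        for cog_id, start, end in cogs
        if any(overlap_length((start, end), hit) >= min_overlap for hit in hits)
    ]


if __name__ == "__main__":
    # Made-up protein with two annotated COGs and one profile hit.
    cogs = [("COG0001", 1, 100), ("COG0002", 150, 300)]
    print(filter_cogs(cogs, [(96, 120)]))  # COG0001 shares 5 aa with the hit
    print(filter_cogs(cogs, [(97, 120)]))  # only 4 aa shared, nothing is kept
```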