List of putative accessory gene clusters conserved near Type III genetic modules which passed the Type III association score cut-off (>24). All annotated accessory protein sequences are available in Shah2018.TypeIIIaccessory.faa. For each putative accessory protein family, the cluster id, the size (i.e. number of members per cluster), and the calculated Type III association score are listed. An example (gene) locus id is also provided for reference. Names are provided for accessory protein families identified in earlier studies (Haft et al. 2005; Vestergaard et al. 2014; Makarova et al. 2015). Thirty-nine of 76 putative accessory protein families are newly identified. Links are provided to sequence alignments, gene and genome maps, and profile-profile alignment results against Pfam and PDB. A rudimentary functional prediction is given in the last two columns.
cluster | name | score | size | length | example | links | comment | description |
---|---|---|---|---|---|---|---|---|
1 | Csx1/Csm6 | 73.88 | 116 | 446 | J114_15030 | ALN Gene Pfam PDB | CARF+HEPN | N-ter CARF, C-ter HEPN toxin |
2 | Csx1 | 69.25 | 94 | 464 | Thet_1098 | ALN Gene Pfam PDB | CARF+HEPN | N-ter CARF, C-ter HEPN toxin |
3 | Csm6 | 62.33 | 69 | 347 | NIES39_J03350 | ALN Gene Pfam PDB | CARF+RelE | N-ter CARF, C-ter RelE toxin |
5 | Cas11/Csx19 | 59.36 | 64 | 179 | Athe_0139 | ALN Gene Pfam PDB | core gene (SS) | |
6 | Csx1 | 58.51 | 61 | 461 | aq_378 | ALN Gene Pfam PDB | CARF+HEPN | N-ter CARF, C-ter HEPN toxin |
7 | WYL/Csx1 | 42.59 | 51 | 335 | THEYE_A0059 | ALN Gene Pfam PDB | CARF+WYL | N-ter CARF, C-ter WYL |
9 | Csx1 | 37.91 | 49 | 399 | PTH_1930 | ALN Gene Pfam PDB | CARF+Nuclease | N-ter CARF, C-ter putative endonuclease |
11 | Csx15/20/peptidase | 35.59 | 41 | 129 | Vpar_1818 | ALN Gene Pfam PDB | peptidase | peptidase associated with CRISPR3 in Synechocystis. Suggests an unknown step, possibly proteolytic maturation of some Cas protein |
17 | Cas_RecF | 24.7 | 31 | 331 | Tneu_0569 | ALN Gene Pfam PDB | SMC ATPase | ABC AAA ATPase cassette found in SMC type DNA repair proteins. Called CasRecF in Vestergaard et al. 2014 |
23 | Csx1 | 4.61 | 20 | 378 | LS215_0610 | ALN Gene Pfam PDB | found elsewhere | CARF protein also found elsewhere on the genome far from Type III modules |
25 | Csx3 | 55.83 | 18 | 100 | THEYE_A0067 | ALN Gene Pfam PDB | Csx3 (CARF) | |
27 | Csx18 | 50.07 | 17 | 94 | sll7069 | ALN Gene Pfam PDB | adaptation associated | found in cyanobacterial Type III systems that have an adaptation module |
28 | Csx21 | 66.76 | 17 | 229 | Cyan7425_0157 | ALN Gene Pfam PDB | Type III-D associated | possible C-terminal RNA binding domain. Associated exclusively with Type III-D systems |
29 | CorA | 45.91 | 14 | 572 | CLB_2115 | ALN Gene Pfam PDB | CorA-like | C-ter. CorA-like Mg2+ transporter domain. N-ter. domain unknown. Associated with Type III-B modules in bacteria and archaea. |
33 | Csa3 | 10.47 | 13 | 215 | Pcal_0266 | ALN Gene Pfam PDB | Type I-A associated | CARF-HTH or Csa3 normally associated with Type I-A systems as a transcriptional repressor or activator |
35 | Csx3 | 61.07 | 12 | 310 | CYB_0599 | ALN Gene Pfam PDB | Csx3-AAA | |
36 | WYL | 50.71 | 11 | 451 | Mic7113_2620 | ALN Gene Pfam PDB | HTH-WYL | N-ter. DNA binding domain and C-ter WYL domain that may be involved in regulation (mainly cyanobacteria) |
37 | Csx26 | 56.62 | 11 | 125 | M1627_1089 | ALN Gene Pfam PDB | HNH nuclease | restriction endonuclease/HNH endonuclease |
38 | C3a38 | 54.35 | 10 | 277 | CLOAM0849 | ALN Gene Pfam PDB | unknown | Unknown structure/function |
42 | Cas11/Csx19 | 74.4 | 9 | 140 | FSU_2045 | ALN Gene Pfam PDB | core gene (SS) | CasT3 - specific for bacterial type III-A/III-B systems |
43 | C3a43 | 58.83 | 9 | 492 | Ferpe_1557 | ALN Gene Pfam PDB | Lon protease | likely ATP-dependent protease domain at N-ter. (Lon family). Large unknown C-terminal domain. |
45 | PD-DExK | 28.54 | 9 | 288 | Dd586_3238 | ALN Gene Pfam PDB | possible nuclease | Possible nuclease-related domain at C-terminal end |
47 | protease | 27.09 | 9 | 103 | YN1551_2381 | ALN Gene Pfam PDB | peptidase | peptidase like cluster 11. The finding of peptidases as part of the accessory proteome, both in archaea and in some bacteria, points to an unknown step. Possible proteolytic maturation or activation of some Cas protein. |
50 | Csx1 | 61.07 | 8 | 277 | Cpha266_2053 | ALN Gene Pfam PDB | TM+CARF | N-ter transmembrane, C-ter CARF |
55 | C3a55 | 44.44 | 7 | 283 | Riv7116_3423 | ALN Gene Pfam PDB | ABC permease | likely oligonucleotide ABC transporter permease |
57 | HerA | 20.46 | 7 | 583 | Vdis_1158 | ALN Gene Pfam PDB | Type III coevolved | HerA helicase normally involved in DNA repair, but this version of it has coevolved with crenarchaeal Type III modules |
58 | NurA | 33.01 | 7 | 348 | Cmaq_1511 | ALN Gene Pfam PDB | Type III coevolved | NurA-like nuclease: ssDNA endonuclease and 5'-3' exonuclease on ss or dsDNA |
59 | C3a59 | 55.84 | 7 | 178 | Nos7107_2826 | ALN Gene Pfam PDB | Trans-membrane (TM) | transmembrane, unknown structure/function |
64 | C3a64 | 43.07 | 7 | 78 | M1627_1085 | ALN Gene Pfam PDB | Unknown | unknown structure/function |
67 | C3a67 | 40.03 | 6 | 388 | Cagg_1075 | ALN Gene Pfam PDB | AAA-Csx3 | Csx3 at C-terminus; looks like nucleotidyl phosphokinase at N-terminus |
69 | C3a69 | 29.69 | 6 | 125 | B005_5545 | ALN Gene Pfam PDB | unknown | unknown structure-function |
76 | C3a76 | 39 | 6 | 96 | Ssol_2352 | ALN Gene Pfam PDB | unknown | unknown structure-function |
77 | Csx16 | 29.73 | 6 | 99 | NE0116 | ALN Gene Pfam PDB | a.k.a. cas_VVA1548 | |
80 | C3a80 | 27.48 | 6 | 1184 | Metvu_1289 | ALN Gene Pfam PDB | AAA+ ATPase | AAA+ ATPase, DUF499 family, unknown function |
81 | Csx1 | -9.48 | 6 | 179 | M1425_0870 | ALN Gene Pfam PDB | Type I associated | small CARF protein often also associated with Type I systems |
83 | cas_RFas | 88.69 | 6 | 157 | VMUT_1493 | ALN Gene Pfam PDB | cluster 17 associated | unknown structure-function, always associated with Cas-RecF, cluster 17 |
84 | cas_RFas | 66.57 | 6 | 128 | VMUT_1494 | ALN Gene Pfam PDB | cluster 17 associated | possible RNA recognition motif (RRM) at N-ter. Always associated with Cas-RecF, cluster 17 |
87 | Cmr7 | 74.65 | 6 | 185 | LS215_0814 | ALN Gene Pfam PDB | Cmr7 | Cmr7 (Sulfolobus) |
93 | C3a93 | 30.57 | 5 | 155 | Thit_1351 | ALN Gene Pfam PDB | poss. AAA ATPase | likely AAA+ ATPase |
96 | Mvol_0529-fam | 42.25 | 5 | 125 | CLB_2116 | ALN Gene Pfam PDB | DNA binding C-ter. | possible DNA binding C-ter. domain |
104 | Csx1 | 39.74 | 5 | 466 | PYCH_07970 | ALN Gene Pfam PDB | CARF+PIN | N-ter CARF, C-ter PIN toxin |
107 | Csx1 | 84.37 | 5 | 680 | TTHB155 | ALN Gene Pfam PDB | CARF | large CARF protein with interspersed stretches of sequence that have no significant domain matches |
108 | cas_RFas | 78.8 | 5 | 180 | Tneu_0568 | ALN Gene Pfam PDB | cluster 17 associated | unknown structure-function, always associated with Cas-RecF, cluster 17 |
116 | C3a116 | 57.92 | 4 | 257 | AZL_010430 | ALN Gene Pfam PDB | cluster 29 associated | unknown structure-function, always associated with CorA, cluster 29 |
121 | Csx23 | 62.04 | 4 | 168 | CFF8240_1673 | ALN Gene Pfam PDB | unknown | unknown structure-function |
123 | C3a123 | 37.09 | 4 | 211 | Cagg_1060 | ALN Gene Pfam PDB | unknown | unknown structure/function |
124 | C3a124 | 44.42 | 4 | 433 | Cyan10605_3519 | ALN Gene Pfam PDB | unknown | SYNPCC7002_F0039 has an SMC_N domain pointing at a function in DNA metabolism and recombination such as recN, recF. The gbk entry for SYNPCC7002_F003 states that Slr7100 is a homolog, which seems unlikely. However, the latter is CRISPR3-associated. |
139 | C3a139 | 28.91 | 4 | 1158 | Metvu_1226 | ALN Gene Pfam PDB | HKD+Snf2 | N-ter. HKD family nuclease fused to C-ter. Snf2-like helicase domain which has no helicase activity |
146 | C3a146 | 68.43 | 4 | 177 | SSO1421 | ALN Gene Pfam PDB | DNA binding HTH | Likely HTH domain, DNA binding protein |
152 | C3a152 | 25.85 | 3 | 143 | Anacy_5891 | ALN Gene Pfam PDB | Type III-C specific | Type III-C specific. Likely N-ter. RRM (RNA recognition motif); unknown function |
156 | C3a156 | 28.1 | 3 | 330 | NIES39_M01150 | ALN Gene Pfam PDB | oxidoreductase | methylenetetrahydromethanopterin reductase |
159 | C3a159 | 27.06 | 3 | 304 | Calkr_2554 | ALN Gene Pfam PDB | methyl transferase | N-ter. methyl transferase domain |
162 | C3a162 | 40.43 | 3 | 181 | Calni_1607 | ALN Gene Pfam PDB | unknown | unknown structure-function |
166 | C3a166 | 32.1 | 3 | 181 | Chy400_2477 | ALN Gene Pfam PDB | poss. crRNA proc. | possible crRNA maturation ribonuclease, instead of Cas6 |
168 | C3a168 | 31.1 | 3 | 374 | MYO_4810 | ALN Gene Pfam PDB | unknown | slr7083 in S.6803. Single k.o. for this gene. Belongs to CRISPR3. |
173 | Csx21 | 70 | 3 | 107 | Adeg_0988 | ALN Gene Pfam PDB | unknown | unknown structure-function |
174 | C3a174 | 74.3 | 3 | 309 | Adeg_0798 | ALN Gene Pfam PDB | nucleotidyl trans. | nucleotidyl transferase. Could be involved as the toxin in the Type IV toxin-antitoxin antiphage mechanism. See PMID: 24465005 for details. |
178 | C3a178 | 37.97 | 3 | 120 | Fisuc_1555 | ALN Gene Pfam PDB | putative invertase | putative transposase or invertase |
181 | C3a181 | 25.55 | 3 | 275 | Kcr_0450 | ALN Gene Pfam PDB | diadenylate cyclase | diadenylate cyclase (c-di-AMP synthetase), DisA, possibly involved in signalling |
186 | C3a186 | 28.77 | 3 | 469 | ANT_24470 | ALN Gene Pfam PDB | C-3',4' desaturase | C-3',4' desaturase CrtD |
187 | C3a187 | 39 | 3 | 624 | Syn7502_02851 | ALN Gene Pfam PDB | NERD+UvrD | DNA/RNA helicase + (exo)nuclease at C-terminus |
189 | C3a189 | 50.73 | 3 | 1053 | Hoch_1322 | ALN Gene Pfam PDB | kinase | catalytic domain of bacterial serine/threonine kinases (STKs) |
193 | C3a193 | 43.78 | 3 | 1067 | Thebr_0949 | ALN Gene Pfam PDB | poss. alt. Cas3 | looks like a variant of Cas3 for Type III |
194 | C3a194 | 26.83 | 3 | 477 | TepRe1_0131 | ALN Gene Pfam PDB | AA transporter | Amino acid transporter or permease |
196 | C3a196 | 46.29 | 3 | 96 | Runsl_4349 | ALN Gene Pfam PDB | unknown | unknown structure-function |
198 | C3a198 | 40.67 | 3 | 532 | K649_15110 | ALN Gene Pfam PDB | TPR-repeat | tetratricopeptide repeat |
205 | C3a205 | 67.5 | 3 | 387 | MLP_11360 | ALN Gene Pfam PDB | unknown | unknown structure-function |
206 | C3a206 | 29.39 | 3 | 535 | B005_5542 | ALN Gene Pfam PDB | TAP-like protein | TAP-like protein; family of peptidases and hydrolases |
211 | cas_RFas | 51.67 | 3 | 187 | Pogu_1163 | ALN Gene Pfam PDB | unknown | unknown structure-function |
212 | C3a212 | 39.67 | 3 | 338 | RoseRS_0371 | ALN Gene Pfam PDB | SNc+LTD | N-ter. endonuclease domain with a C-ter. lamin tail domain (LTD) |
214 | C3a214 | 30.16 | 3 | 639 | Rcas_3915 | ALN Gene Pfam PDB | GlgB | 1,4 alpha glucan-branching enzyme, GlgB |
216 | C3a216 | 25.6 | 3 | 77 | SSA_1254 | ALN Gene Pfam PDB | adaptation associated | associated with Type III systems that have an adaptation module. Shows sequence similarity with Cas1 in other bacteria. |
222 | C3a222 | 44 | 3 | 39 | Vpar_1800 | ALN Gene Pfam PDB | unknown | unknown structure-function |
223 | C3a223 | 43.33 | 3 | 84 | YG5714_0635 | ALN Gene Pfam PDB | unknown | unknown structure-function |
227 | PrimPol | 36.73 | 3 | 577 | Thein_1350 | ALN Gene Pfam PDB | adaptation polymerase | N-ter. tetratricopeptide, C-ter. likely primase/polymerase domain. Linked to adaptation. Possible reverse transcriptase repair |
230 | C3a230 | 28.76 | 3 | 145 | TT_P0114 | ALN Gene Pfam PDB | RecX family protein | likely RecX recombination regulator |
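
For readers who want to work with the table programmatically, the following is a minimal Python sketch, not part of the original analysis. It assumes the table is available as pipe-delimited text with the nine columns of the header row above; the helper names `parse_table` and `above_cutoff` are hypothetical. The threshold of 24 is the cut-off stated in the caption, applied here as a strict "greater than" comparison.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[int, float, str]]

# Column names copied from the table header.
COLUMNS = ["cluster", "name", "score", "size", "length",
           "example", "links", "comment", "description"]


def parse_table(text: str) -> List[Record]:
    """Parse pipe-delimited rows into dictionaries (hypothetical helper)."""
    records: List[Record] = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Pad or trim to nine cells; the trailing pipe yields an extra empty cell.
        cells = (cells + [""] * len(COLUMNS))[:len(COLUMNS)]
        # Skip the header row, the '---' separator row, and blank lines.
        if cells[0] == "cluster" or set(cells[0]) <= {"-"}:
            continue
        row: Record = dict(zip(COLUMNS, cells))
        row["cluster"] = int(cells[0])
        row["score"] = float(cells[2])
        row["size"] = int(cells[3])
        row["length"] = int(cells[4])
        records.append(row)
    return records


def above_cutoff(records: List[Record], cutoff: float = 24.0) -> List[Record]:
    """Keep rows whose Type III association score is strictly above the cut-off."""
    return [r for r in records if r["score"] > cutoff]


# A few rows copied verbatim from the table, used as sample input.
SAMPLE = """
cluster | name | score | size | length | example | links | comment | description |
---|---|---|---|---|---|---|---|---|
1 | Csx1/Csm6 | 73.88 | 116 | 446 | J114_15030 | ALN Gene Pfam PDB | CARF+HEPN | N-ter CARF, C-ter HEPN toxin |
5 | Cas11/Csx19 | 59.36 | 64 | 179 | Athe_0139 | ALN Gene Pfam PDB | core gene (SS) | |
17 | Cas_RecF | 24.7 | 31 | 331 | Tneu_0569 | ALN Gene Pfam PDB | SMC ATPase | ABC AAA ATPase cassette found in SMC type DNA repair proteins. Called CasRecF in Vestergaard et al. 2014 |
23 | Csx1 | 4.61 | 20 | 378 | LS215_0610 | ALN Gene Pfam PDB | found elsewhere | CARF protein also found elsewhere on the genome far from Type III modules |
"""

records = parse_table(SAMPLE)
selected = above_cutoff(records)
print([r["cluster"] for r in selected])
```

Splitting on `|` is sufficient here because no cell in the table contains a pipe character; a table with escaped pipes would need a proper parser.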