match no. | target id | target length | alignment length | probability | E-value | coverage | match description
1 | cd15860 | 60 | 34 | 74.8 | 9.5 | [ --------------                                  ] | SNARE_USE1: SNARE motif of USE1. USE1 (unconventional SNARE in the ER 1 homolog, also known as SNARE-like tail-anchored protein 1 or SLT1) is involved in retrograde transport of CopI coatomer coated vesicles from the Golgi to the ER. It forms a complex with syntaxin18 (Ufe1p, Qa), Bnip1 (Sec20p, Qb) and Sec22b (R-SNARE). USE1 is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.
2 | pfam01985 | 84 | 33 | 70.3 | 7.8 | [                      --------------             ] | CRS1_YhbY: CRS1 / YhbY (CRM) domain. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localizes to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome.
3 | pfam10078 | 89 | 58 | 67.6 | 38 | [--------------------------                       ] | DUF2316: Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function.
4 | pfam10431 | 81 | 53 | 66.1 | 6.9 | [  ------------------------                       ] | ClpB_D2-small: C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermus thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerization, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilize the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724.
5 | pfam03750 | 120 | 40 | 65.2 | 11 | [  ------------------                             ] | DUF310: Protein of unknown function (DUF310). This family contains a number of archaeal proteins that are completely uncharacterized. The proteins are between 130 and 160 amino acids long. Their C-terminus contains several conserved residues.
6 | cd07936 | 85 | 17 | 62.5 | 3.8 | [                               -------           ] | SCAN: SCAN oligomerization domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several vertebrate proteins that contain C2H2 zinc finger motifs, many of which may be transcription factors playing roles in cell survival and differentiation. This protein-interaction domain is able to mediate homo- and hetero-oligomerization of SCAN-containing proteins. Some SCAN-containing proteins, including those of lower vertebrates, do not contain zinc finger motifs. It has been noted that the SCAN domain resembles a domain-swapped version of the C-terminal domain of the HIV capsid protein. This domain model features elements common to the three general groups of SCAN domains (SCAN-A1, SCAN-A2, and SCAN-B). The SCAND1 protein is truncated at the C-terminus with respect to this model; the SCAND2 protein appears to have a truncated central helix.
7 | cd02682 | 75 | 51 | 55.9 | 54 | [     ----------------------------                ] | MIT_AAA_Arch: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in mostly archaebacterial AAA-ATPases. The molecular function of the MIT domain is unclear.
8 | PRK07178 | 472 | 38 | 54.5 | 9.4 | [--------------------                             ] | PRK07178: pyruvate carboxylase subunit A; Validated
9 | pfam12926 | 81 | 46 | 53.1 | 50 | [     ----------------------                      ] | MOZART2: Mitotic-spindle organizing gamma-tubulin ring associated. FAM128A and FAM128B proteins have been re-named MOZART2A and B. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, the MOZART proteins, both 1 and 2, do not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for them. The exact function of MOZART2 is not clear.
10 | cd07232 | 407 | 52 | 51.8 | 35 | [           ----------------------                ] | Pat_PLPL: Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants and fungi. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates.
11 | pfam11116 | 84 | 49 | 51.3 | 45 | [    ---------------------                        ] | DUF2624: Protein of unknown function (DUF2624). This family is conserved in the Bacillaceae family. Several members are named as YqfT. The function is not known.
12 | pfam07130 | 74 | 39 | 47.3 | 22 | [ ------------------                              ] | YebG: YebG protein. This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS.
13 | COG0276 | 320 | 28 | 45.3 | 25 | [                      ------------               ] | HemH: Protoheme ferro-lyase (ferrochelatase)
14 | pfam15469 | 181 | 45 | 44.9 | 67 | [ -------------------                             ] | Sec5: Exocyst complex component Sec5. This family of conserved eukaryotic Sec5 proteins does not represent the Sec5-Ral binding site.
15 | pfam04542 | 71 | 53 | 43.3 | 58 | [   -----------------------                       ] | Sigma70_r2: Sigma-70 region 2. Region 2 of sigma-70 is the most conserved region of the entire protein. All members of this class of sigma-factor contain region 2. The high conservation is due to region 2 containing both the -10 promoter recognition helix and the primary core RNA polymerase binding determinant. The core binding helix interacts with the clamp domain of the largest polymerase subunit, beta prime. The aromatic residues of the recognition helix, found at the C-terminus of this domain, are thought to mediate strand separation, thereby allowing transcription initiation.
16 | pfam06252 | 119 | 77 | 41.8 | 1.2E+02 | [        --------------------------------------   ] | DUF1018: Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function.
17 | pfam04361 | 155 | 27 | 41.4 | 30 | [    -------------                                ] | DUF494: Protein of unknown function (DUF494). Members of this family of uncharacterized proteins are often named Smg.
18 | cd07125 | 518 | 51 | 41.4 | 33 | [     --------------------------                  ] | ALDH_PutA-P5CDH: Delta(1)-pyrroline-5-carboxylate dehydrogenase, PutA. The proline catabolic enzymes of the aldehyde dehydrogenase (ALDH) protein superfamily, proline dehydrogenase and Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC 1.5.1.12), catalyze the two-step oxidation of proline to glutamate; P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. In certain prokaryotes such as Escherichia coli, PutA is also a transcriptional repressor of the proline utilization genes.
19 | pfam09261 | 82 | 41 | 40.8 | 64 | [                    ------------------           ] | Alpha-mann_mid: Alpha mannosidase, middle domain. Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase.
20 | pfam09317 | 284 | 19 | 39.7 | 91 | [                             --------            ] | DUF1974: Domain of unknown function (DUF1974). Members of this family of functionally uncharacterized domains are predominantly found in various prokaryotic acyl-coenzyme A dehydrogenases.
21 | COG1421 | 137 | 71 | 39.0 | 1.2E+02 | [   ----------------------------------            ] | Csm2: CRISPR/Cas system CSM-associated protein Csm2, small subunit
22 | cd14757 | 151 | 58 | 38.5 | 60 | [ --------------------------                      ] | GS_EcDosC-like_GGDEF: Globin sensor domain of Escherichia coli Direct Oxygen Sensing Cyclase and related proteins; coupled to a C-terminal GGDEF domain. Globin-coupled-sensors belonging to this subfamily have a C-terminal diguanylate cyclase (DGC/GGDEF) domain coupled to the globin sensor domain. DGC/GGDEF likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP). Members include Escherichia coli DosC (also known as YddV), the gene for which is found in a two-gene operon, dosCP. In DosC, the sensory globin domain is coupled to a GGDEF-class diguanylate cyclase, while in DosP, a heme-containing PAS domain is coupled to an EAL-class c-di-GMP phosphodiesterase. DosP and DosC associate in a di-GMP-responsive Escherichia coli RNA processing complex along with polynucleotide phosphorylase (PNPase), enolase, RNase E, and RNA.
23 | pfam04659 | 99 | 34 | 36.7 | 38 | [            --------------                       ] | Arch_fla_DE: Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD.
24 | COG1937 | 89 | 29 | 36.1 | 58 | [    ------------                                 ] | FrmR: DNA-binding transcriptional regulator, FrmR family
25 | TIGR03059 | 82 | 14 | 34.8 | 8.8 | [                           -----                 ] | psaOeuk: photosystem I protein PsaO. Members of this family are the PsaO protein of photosystem I. This protein is found in chloroplasts but not in Cyanobacteria.
26 | cd05930 | 445 | 28 | 34.7 | 23 | [------------                                     ] | A_NRPS: The adenylation domain of nonribosomal peptide synthetases (NRPS). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.
27 | COG1528 | 167 | 48 | 34.1 | 67 | [       --------------------                      ] | Ftn: Ferritin
28 | COG1722 | 81 | 39 | 33.8 | 1.2E+02 | [ ----------------                                ] | XseB: Exonuclease VII small subunit
29 | COG3141 | 97 | 61 | 33.8 | 96 | [ ------------------------------                  ] | YebG: dsDNA-binding SOS-regulon protein, induction by DNA damage requires cAMP
30 | pfam14401 | 152 | 10 | 33.6 | 17 | [                              ----               ] | RLAN: RimK-like ATPgrasp N-terminal domain. An uncharacterized alpha+beta fold domain that is mostly fused to a RimK-like ATP-grasp and is found in bacteria and euryarchaea. Members of this family are almost always associated in gene neighborhoods with a GNAT-like acetyltransferase fused to a papain-like peptidase. Additionally, M20-like peptidases, GCS2, 4Fe-4S ferredoxins, a distinct metal-sulfur cluster protein and ribosomal proteins are found in the gene neighborhoods. Contextual analysis suggests a role for these in peptide biosynthesis.
31 | pfam08784 | 103 | 24 | 33.5 | 47 | [     ------------                                ] | RPA_C: Replication protein A C terminal. This domain corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints.
32 | COG3484 | 255 | 16 | 33.0 | 33 | [                               ------            ] | COG3484: Predicted proteasome-type protease
33 | pfam06133 | 108 | 43 | 32.2 | 1.5E+02 | [   ---------------------                         ] | DUF964: Protein of unknown function (DUF964). This family consists of several relatively short bacterial and archaeal hypothetical sequences. The function of this family is unknown.
34 | pfam00025 | 174 | 16 | 31.7 | 22 | [                                -------          ] | Arf: ADP-ribosylation factor family. Pfam combines a number of different Prosite families together.
35 | COG3679 | 118 | 40 | 31.6 | 1.6E+02 | [      ------------------                         ] | YlbF: Cell fate regulator YlbF, YheA/YmcA/DUF963 family (controls sporulation, competence, biofilm development)
36 | COG3945 | 189 | 73 | 31.3 | 2.5E+02 | [    ---------------------------------            ] | COG3945: Hemerythrin-like domain
37 | TIGR02937 | 158 | 55 | 31.2 | 1.4E+02 | [ -------------------------                       ] | sigma70-ECF: RNA polymerase sigma factor, sigma-70 family. This model encompasses all varieties of the sigma-70 type sigma factors including the ECF subfamily. A number of sigma factors have names with a different number than 70 (e.g. sigma-38), but in fact, all except for the Sigma-54 family (TIGR02395) are included within this family. Several Pfam models hit segments of these sequences including Sigma-70 region 2 (pfam04542) and Sigma-70, region 4 (pfam04545), but not always above their respective trusted cutoffs.
38 | cd01259 | 124 | 16 | 31.0 | 26 | [                                          ------ ] | PH_APBB1IP: Amyloid beta (A4) Precursor protein-Binding, family B, member 1 Interacting Protein pleckstrin homology (PH) domain. APBB1IP consists of a Ras-associated (RA) domain, a PH domain, a family-specific BPS region, and a C-terminal SH2 domain. Grb7, Grb10 and Grb14 are paralogs that are also present in this hierarchy. These adapter proteins bind a variety of receptor tyrosine kinases, including the insulin and insulin-like growth factor-1 (IGF1) receptors. Grb10 and Grb14 are important tissue-specific negative regulators of insulin and IGF1 signaling and may contribute to type 2 (non-insulin-dependent) diabetes in humans. RA-PH functions as a single structural unit and is dimerized via a helical extension of the PH domain. The PH domains here are proposed to bind phosphoinositides non-canonically and are unlikely to bind an activated GTPase. The tandem RA-PH domains are present in a second adapter-protein family, MRL proteins, Caenorhabditis elegans protein MIG-10, the mammalian proteins RIAM and lamellipodin and the Drosophila melanogaster protein Pico, all of which are Ena/VASP-binding proteins involved in actin-cytoskeleton rearrangement. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.
39 | pfam10771 | 65 | 22 | 30.9 | 41 | [                                      ---------- ] | DUF2582: Protein of unknown function (DUF2582). This family is conserved in bacteria and archaea. The function is not known.
40 | pfam02023 | 92 | 24 | 30.7 | 32 | [                                -----------      ] | SCAN: SCAN domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerization.
41 | cd04150 | 159 | 32 | 29.4 | 28 | [                         --------------          ] | Arf1_5_like: ADP-ribosylation factor-1 (Arf1) and ADP-ribosylation factor-5 (Arf5). The Arf1-Arf5-like subfamily contains Arf1, Arf2, Arf3, Arf4, Arf5, and related proteins. Arfs1-5 are soluble proteins that are crucial for assembling coat proteins during vesicle formation. Each contains an N-terminal myristoylated amphipathic helix that is folded into the protein in the GDP-bound state. GDP/GTP exchange exposes the helix, which anchors to the membrane. Following GTP hydrolysis, the helix dissociates from the membrane and folds back into the protein. A general feature of Arf1-5 signaling may be the cooperation of two Arfs at the same site. Arfs1-5 are generally considered to be interchangeable in function and location, but some specific functions have been assigned. Arf1 localizes to the early/cis-Golgi, where it is activated by GBF1 and recruits the coat protein COPI. It also localizes to the trans-Golgi network (TGN), where it is activated by BIG1/BIG2 and recruits the AP1, AP3, AP4, and GGA proteins. Humans, but not rodents and other lower eukaryotes, lack Arf2. Human Arf3 shares 96% sequence identity with Arf1 and is believed to generally function interchangeably with Arf1. Human Arf4 in the activated (GTP-bound) state has been shown to interact with the cytoplasmic domain of epidermal growth factor receptor (EGFR) and mediate the EGF-dependent activation of phospholipase D2 (PLD2), leading to activation of the activator protein 1 (AP-1) transcription factor. Arf4 has also been shown to recognize the C-terminal sorting signal of rhodopsin and regulate its incorporation into specialized post-Golgi rhodopsin transport carriers (RTCs). There is some evidence that Arf5 functions at the early-Golgi and the trans-Golgi to affect Golgi-associated alpha-adaptin homology Arf-binding proteins (GGAs).
42 | pfam10168 | 717 | 55 | 28.9 | 1.1E+02 | [ -----------------------                         ] | Nup88: Nuclear pore component. Nup88 can be divided into two structural domains; the N-terminal two-thirds of the protein has no obvious structural motifs but is the region for binding to Nup98, one of the components of the nuclear pore. The C-terminal end is a predicted coiled-coil domain. Nup88 is overexpressed in tumor cells.
43 | TIGR02105 | 72 | 42 | 28.9 | 1.3E+02 | [     --------------------                        ] | III_needle: type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus.
44 | cd00454 | 111 | 30 | 28.6 | 1.7E+02 | [    -------------                                ] | TrHb1_N: truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 1 (N). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties: TrHb1s (N), TrHb2s (O) and TrHb3s (P). Typical of the TrHb1s (N) group is a protein matrix tunnel. It includes a Mycobacterium tuberculosis TrHb1, Mt-trHbN, which is encoded by the glbN gene. Mt-trHbN is expressed during the Mycobacterium stationary phase, and plays a specific defense role against nitrosative stress. The cyanobacterium Synechococcus sp. PCC 7002 TrHb1 GlbN is constitutively expressed, and likely also protects cells from reactive nitrogen species.
45 | pfam02861 | 53 | 29 | 28.4 | 55 | [   ------------                                  ] | Clp_N: Clp amino terminal domain. This short domain is found in one or two copies at the amino terminus of ClpA and ClpB proteins from bacteria and eukaryotes. The function of these domains is uncertain but they may form a protein binding site.
46 | PRK00117 | 157 | 35 | 28.4 | 1.1E+02 | [    ----------------                             ] | recX: recombination regulator RecX; Reviewed
47 | pfam13260 | 54 | 17 | 27.8 | 69 | [                               --------          ] | DUF4051: Protein of unknown function (DUF4051). This family of proteins is functionally uncharacterized and is found in bacteria. Proteins in this family are approximately 60 amino acids in length.
48 | COG2137 | 174 | 49 | 27.4 | 2.1E+02 | [   ------------------------                      ] | RecX: SOS response regulatory protein OraA/RecX, interacts with RecA
49 | pfam07607 | 128 | 32 | 26.6 | 95 | [            ----------------                     ] | DUF1570: Protein of unknown function (DUF1570). A family of hypothetical proteins in Rhodopirellula baltica. This family carries a highly conserved HExxH sequence motif characteristic of members of the Peptidase clan MA.
50 | PRK04374 | 869 | 54 | 26.4 | 1.7E+02 | [              -----------------------            ] | PRK04374: PII uridylyl-transferase; Provisional
51 | COG4230 | 769 | 31 | 26.4 | 76 | [      ------------                               ] | PutA2: Delta 1-pyrroline-5-carboxylate dehydrogenase
52 | pfam09090 | 250 | 37 | 26.2 | 1.3E+02 | [ ---------------                                 ] | MIF4G_like_2: MIF4G like. Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices.
53 | TIGR01238 | 500 | 35 | 25.5 | 90 | [     --------------                              ] | D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase (PutA C-terminal domain). This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. Members of this branch are the C-terminal domain of the PutA bifunctional proline dehydrogenase / delta-1-pyrroline-5-carboxylate dehydrogenase.
54 | PRK11905 | 1208 | 35 | 25.5 | 94 | [     ---------------                             ] | PRK11905: bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed
55 | PRK06526 | 254 | 37 | 25.3 | 36 | [                ---------------                  ] | PRK06526: transposase; Provisional
56 | pfam06006 | 71 | 18 | 25.3 | 45 | [                 -------                         ] | DUF905: Bacterial protein of unknown function (DUF905). This family consists of several short hypothetical Enterobacteria proteins of unknown function. Structural analysis of the surface features of the protein YvyC has revealed a single cluster of highly conserved residues on the surface. Additionally, these residues fall into two groups which lie within the two largest of the three cavities identified over the surface. The conclusion from this is that these two cavities, with Leu 58, Glu 75, Ile 82, Glu 83 and Pro 86 conserved, are likely to be important for the molecular function and reflect the cavities found on the surface of the FlaG proteins in pfam03646.
57 | cd08736 | 120 | 20 | 25.2 | 60 | [               ----------                        ] | RGS_RhoGEF-like: Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is found in the Rho guanine nucleotide exchange factor (RhoGEF) protein subfamily of the RGS domain containing protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RhoGEFs link signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RGS domain of the RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. The RGS-GEFs subfamily includes the leukemia-associated RhoGEF (LARG), p115RhoGEF, and PDZ-RhoGEF. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.
58 | cd14367 | 42 | 21 | 24.5 | 78 | [    ---------                                    ] | CUE_CUED2: CUE domain found in CUE domain-containing protein 2 (CUED2) and similar proteins. CUEDC2 is a novel negative regulator of progesterone receptor (PR) and functions to promote the progesterone-induced PR degradation by the ubiquitin-proteasome pathway. It also acts as the regulator of JAK1/STAT3 signaling through inhibiting cytokine-induced phosphorylation of JAK1 and STAT3 and the subsequent STAT3 transcriptional activity. All members in this subfamily contain a CUE domain.
59 | pfam01152 | 120 | 34 | 24.5 | 1.3E+02 | [   ---------------                               ] | Bac_globin: Bacterial-like globin. This family of heme binding proteins is found mainly in bacteria. However, they can also be found in some protozoa and plants.
60 | cd02977 | 105 | 56 | 23.9 | 49 | [    -------------------------                    ] | ArsC_family: Arsenate Reductase (ArsC) family; composed of TRX-fold arsenic reductases and similar proteins including the transcriptional regulator, Spx. ArsC catalyzes the reduction of arsenate
61 | cd01482 | 164 | 20 | 23.9 | 32 | [             --------                            ] | vWA_collagen_alphaI-XII-like: Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
62 | TIGR01967 | 1283 | 105 | 23.9 | 4.3E+02 | [  --------------------------------------------   ] | DEAH_box_HrpA: RNA helicase HrpA. This model represents HrpA, one of two related but uncharacterized DEAH-box ATP-dependent helicases in many Proteobacteria and a few high-GC Gram-positive bacteria. HrpA is about 1300 amino acids long, while its paralog HrpB, also uncharacterized, is about 800 amino acids long. Related characterized eukaryotic proteins are RNA helicases associated with pre-mRNA processing. The HrpA/B homolog from Borrelia is 500 amino acids shorter but appears to be derived from HrpA rather than HrpB.
63 | PRK13676 | 114 | 42 | 23.5 | 2.9E+02 | [  ---------------------                          ] | PRK13676: hypothetical protein; Provisional
64 | pfam06771 | 85 | 42 | 23.3 | 73 | [                      ------------------         ] | Desmo_N: Viral Desmoplakin N-terminus. This family represents the N-terminus of viral desmoplakin. Desmoplakin is a component of mature desmosomes, which are the main adhesive junctions in epithelia and cardiac muscle. Desmoplakin is also essential for the maturation of adherens junctions. Note that many family members are hypothetical.
65 | pfam04994 | 77 | 40 | 22.9 | 1E+02 | [      ----------------                           ] | TfoX_C: TfoX C-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found associated with pfam00383 in a member from Neisseria meningitidis serogroup B. It is also found as an isolated domain in some proteins, suggesting that it is an autonomous domain.
66 | cd12117 | 474 | 26 | 22.7 | 51 | [-----------                                      ] | A_NRPS_Srf_like: The adenylation domain of nonribosomal peptide synthetases (NRPS), including Bacillus subtilis termination module Surfactin (SrfA-C). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the adenylation domain of the Bacillus subtilis termination module (Surfactin domain, SrfA-C) which recognizes a specific amino acid building block, which is then activated and transferred to the terminal thiol of the 4'-phosphopantetheine (Ppan) arm of the downstream peptidyl carrier protein (PCP) domain.
67 | TIGR01075 | 715 | 62 | 22.6 | 99 | [            -----------------------------        ] | uvrD: DNA helicase II. Designed to identify uvrD members of the uvrD/rep subfamily.
68PRK00977803822.42.4E+02[----------------                                 ]PRK00977exodeoxyribonuclease VII small subunit; Provisional
69PRK072184233822.380[  ------------------                             ]PRK07218replication factor A; Provisional
70PRK1190410383822.21.1E+02[    ----------------                             ]PRK11904bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed
71TIGR033491836122.23.3E+02[ ---------------------------                     ]IV_VI_DotUtype IV / VI secretion system protein, DotU family. At least two families of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family (TIGR03348). The other is the family described by this model. Members include DotU from the Legionella pneumophila type IV secretion system. Many of the members of this protein family from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology.
72cd041551743722.157[                       ----------------          ]Arl3Arf-like 3 (Arl3) GTPase. Arl3 (Arf-like 3) is an Arf family protein that differs from most Arf family members in the N-terminal extension. In its inactive, GDP-bound form, the N-terminal extension forms an elongated loop that is hydrophobically anchored into the membrane surface; however, it has been proposed that this region might form a helix in the GTP-bound form. The delta subunit of the rod-specific cyclic GMP phosphodiesterase type 6 (PDEdelta) is an Arl3 effector. Arl3 binds microtubules in a regulated manner to alter specific aspects of cytokinesis via interactions with retinitis pigmentosa 2 (RP2). It has been proposed that RP2 functions in concert with Arl3 to link the cell membrane and the cytoskeleton in photoreceptors as part of the cell signaling or vesicular transport machinery. In mice, the absence of Arl3 is associated with abnormal epithelial cell proliferation and cyst formation.
73PRK02821771622.146[                           ------                ]PRK02821hypothetical protein; Provisional
74pfam098502066122.13.2E+02[ ---------------------------                     ]DUF2077Uncharacterized protein conserved in bacteria (DUF2077). This domain, found in various hypothetical prokaryotic proteins, has no known function.
75pfam12550812422.035[                                 ----------      ]GCR1_CTranscriptional activator of glycolytic enzymes. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. This family activates the transcription of glycolytic enzymes.
76pfam096655113521.81.3E+02[                     ---------------             ]RE_Alw26IDEType II restriction endonuclease (RE_Alw26IDE). Members of this entry are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. The characterized specificities of the three members are GGTCTC, CGTCTC and the shared subsequence GTCTC.
77pfam136101405321.711[            -----------------------              ]DDE_Tnp_IS240DDE domain. This DDE domain is found in a wide variety of transposases including those found in IS240, IS26 and IS6100.
78pfam124552744021.53.2E+02[  -----------------                              ]DynactinDynein associated protein. This domain family is found in eukaryotes, and is approximately 280 amino acids in length. The family is found in association with pfam01302. There is a single completely conserved residue E that may be functionally important. Dynactin has been associated with Dynein, a microtubule motor protein which is involved in organelle transport, mitotic spindle assembly and chromosome segregation. Dynactin anchors Dynein to specific subcellular structures.
79pfam05016894021.52E+02[           -------------------                   ]Plasmid_stabilPlasmid stabilisation system protein. Members of this family are involved in plasmid stabilisation. The exact molecular function of this protein is not known. This family also encompasses RelE/ParE.
80cd14423471921.468[      -------                                    ]CUE_UBR5CUE domain found in E3 ubiquitin-protein ligase UBR5 and similar proteins. UBR5, also called E3 ubiquitin-protein ligase, HECT domain-containing 1, hyperplastic discs protein homolog (HYD), progestin-induced protein, EDD, or Rat100, belongs to the E3 protein family of HECT (homologous to E6-AP C-terminus) ligases. It is frequently overexpressed in breast and ovarian cancer, suggesting a role in cancer development. UBR5 is involved in DNA-damage signaling. It can ubiquitinate DNA topoisomerase II-binding protein 1 (TopBP1) in the presence of the E2 enzyme UBCH4. It also activates the DNA-damage checkpoint kinase CHK2. Moreover, UBR5 interacts with the calcium and integrin-binding protein (CIB) in a DNA-damage-dependent manner. It functions as the substrate of the extracellular signal-regulated kinases (ERKs) 1 and 2. It also acts as a ubiquitin ligase that controls the levels of poly(A)-binding protein-interacting protein 2. In addition, UBR5 ubiquitinates and up-regulates beta-catenin, regulates transcription, and activates smooth-muscle differentiation through its ability to stabilize myocardin. UBR5 contains an N-terminal CUE domain, a zinc-finger-like domain termed the ubiquitin-recognin (UBR) box, a MLLE (mademoiselle) domain, and a C-terminal catalytic HECT domain.
81TIGR01280543921.12.1E+02[ ----------------                                ]xseBexodeoxyribonuclease VII, small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides.
82pfam04737971920.976[                      --------                   ]Lant_dehyd_NLantibiotic dehydratase, N terminus. Lantibiotics are antimicrobial agents derived from ribosomally synthesized peptides. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues. This family constitutes the N-terminus of the enzyme proposed to catalyse the dehydration step.
83TIGR045492613320.985[                      ---------------------      ]LP_HExxH_w_tonBsubstrate import-associated zinc metallohydrolase lipoprotein. Members of this family are lipoproteins with the typical zinc metallohydrolase HExxH motif and additional similarities to a better-documented zinc peptidase family, pfam06167. The seed alignment begins immediately after the lipoprotein motif Cys residue. Up to five members of this protein family occur per genome, in the context of certain gene pairs related to RagA and RagB, or to SusC and SusD. Those gene pairs, like the present family, are restricted to the Bacteroidetes, may number up to 100 pairs per genome, and are linked to TonB-dependent uptake of biopolymer-derived nutrients such as glycans. A possible function for this lipoprotein is to hydrolyse larger molecules to prepare substrates for import and utilization.
84pfam086372906620.75E+02[      ----------------------------               ]NCA2ATP synthase regulation protein NCA2. NCA2 has been shown to be required for the regulation of ATP synthase subunits Atp6p and Atp8p in Saccharomyces cerevisiae.
85pfam10743872420.586[                      ---------------            ]Phage_CoxRegulatory phage protein cox. This family of phage Cox proteins is expressed by Enterobacteria phages. The Cox protein is a 79-residue basic protein with a predicted strong helix-turn-helix DNA-binding motif. It inhibits integrative recombination and it activates site-specific excision of the HP1 genome from the Haemophilus influenzae chromosome, Hp1. Cox appears to function as a tetramer. Cox binding sites consist of two direct repeats of the consensus motif 5'-GGTMAWWWWA, one Cox tetramer binding to each motif. Cox binding interferes with the interaction of HP1 integrase with one of its binding sites, IBS5. This competition is central to directional control. Both Cox binding sites are needed for full inhibition of integration and for activating excision, because Cox plays a positive role in assembling the nucleoprotein complexes that produce excisive recombination, by inducing the formation of a critical conformation in those complexes.
86pfam03622642220.31.5E+02[---------                                        ]IBV_3BIBV 3B protein. Product of ORF 3B from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown.
87COG4453955420.33.2E+02[  -------------------------                      ]COG4453Uncharacterized conserved protein, DUF1778 family
88pfam065331514220.22.4E+02[        --------------------                     ]DUF1110Protein of unknown function (DUF1110). This family consists of hypothetical proteins specific to Oryza sativa. One sequence appears to be tandemly repeated.