{QDOC50005}
{PS50005; TPR_REPEAT}
{Status=preliminary}
{BEGIN}
***************
* TPR repeat. *
***************

A repeat structure of typically 34 amino acids, first described in the yeast
cell cycle regulator Cdc23p [1] and later found to occur in a large number of
proteins [2,3]. The function of this repeat seems to be protein-protein
interaction, but common features in the interaction partners have not been
defined. It has been proposed that TPR proteins preferably interact with WD-40
repeat proteins, but in many instances several TPR-proteins seem to aggregate
to multi-protein complexes. Prominent examples of TPR-proteins include:

 - Cdc16p, Cdc23p and Cdc27p, all components of the cyclosome/APC
 - Pex5p/Pas10p, the receptor for peroxisomal targeting signals
 - Tom70p, a co-receptor for mitochondrial targeting signals
 - Ser/Thr phosphatase 5C
 - O-GlcNAc transferase, p110 subunit

[1] UI:90124639
[2] UI:91352828
[3] UI:95397415
[4] UI:92354695

[D1] INTERPRO:IPR001440
[D2] PFAM:PF00515
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50015}
{PS50015; SAP_B}
{Status=preliminary}
{BEGIN}
**************************
* Saposin type B domain. *
**************************

Saposins are small lysosomal proteins that serve as activators of various
lysosomal lipid-degrading enzymes [1]. They probably act by isolating the
lipid substrate from the membrane surroundings, thus making it more accessible
to the soluble degradative enzymes. All mammalian saposins are synthesized as
a single precursor molecule (prosaposin) which contains four Saposin-B
domains, yielding the active saposins after proteolytic cleavage, and two
Saposin-A domains that are removed in the activation reaction. Recently, it
was recognized that Saposin-B domains also occur in other proteins, many of
them active in the lysis of membranes [2,3].
The following protein classes were found to contain Saposin-B domains:

 - Saposins Sap-A, Sap-B, Sap-C, Sap-D
 - Mammalian pulmonary surfactant protein PSP-B
 - Mammalian acid sphingomyelinase
 - Mammalian Acyloxyacyl-hydrolase
 - Natural Killer Cell lytic protein NK-lysin
 - Amoebapore protein
 - Schistosoma LGG protein
 - A group of plant aspartic proteases related to cyprosin. These proteins
   have a peculiar SAP-B domain where the two halves are 'swapped' [4].
 - A large group of nematode proteins of unknown function

The 3D-structure of NK-lysin has recently been determined [5] and found to be
very different from the one predicted in [1].

[1] UI:96048294
[2] UI:94272336
[3] UI:97021725
[4] UI:95334819
[5] UI:97475218

[D1] INTERPRO:IPR000004
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50022}
{PS50022; DS_DOMAIN}
{Status=empty}
{BEGIN}
*******************************************
* Discoidin domain (FA5/8 type C domain). *
*******************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001092
[D2] PFAM:PF00010
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50028}
{PS50028; HIST_TAF}
{Status=preliminary}
{BEGIN}
***************************************
* Histone-fold/TFIID-TAF/NF-Y domain. *
***************************************

The core histones together with some other DNA binding proteins appear to form
a superfamily defined by a common fold and distant sequence similarities
[1-3]. Some proteins contain local homology domains related to this assumed
histone fold. Proteins belonging to this superfamily include:

 - Histones H2A and H2B (PDOC00045,PDOC00308)
 - Histones H3 (PDOC00287)
 - Histones H4 (PDOC00046)
 - Several histone-like proteins from various species
 - NF-Y subunits CBF-A and CBF-C and their yeast homologues Hap3p and Hap5p
   (PDOC00578).
 - Human TFIID subunits TAF15, TAF20, TAFII28, TAFII31, TAFII70 and their
   homologues from other species.
 - Centromere binding protein CENPA
 - DR1-associated corepressor DRAP1
 - Son of sevenless (SOS) and related proteins

[1] UI:95380285
[2] UI:97169409
[3] http://www.nhgri.nih.gov/DIR/GTB/HISTONES

[D1] INTERPRO:IPR000166
[D2] PFAM:PF00125
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50030}
{PS50030; UBA}
{Status=preliminary}
{BEGIN}
********************************
* Ubiquitin-associated domain. *
********************************

The UBA-domain is a novel sequence motif found in several proteins having
connections to ubiquitin and the ubiquitination pathway [1,2]. The UBA domain
is probably a non-covalent ubiquitin binding domain. Proteins known to contain
UBA domains include:

 - Bovine E2-25K and several other ubiquitin-conjugating enzymes (E2),
   catalyzing the second step in protein ubiquitination.
 - Drosophila hyperplastic discs protein, a putative ubiquitin-protein ligase
   (E3), catalyzing the third and final step in protein ubiquitination.
 - Ubiquitin isopeptidase T and several other ubiquitin C-terminal
   hydrolases, catalyzing regulatory protein-deubiquitination.
 - S.cerevisiae RAD23 and its mammalian homologues. These proteins act in UV
   excision repair and contain an N-terminal domain similar to ubiquitin
   itself.
 - S.cerevisiae DSK2 protein required for spindle pole body duplication. This
   protein contains an N-terminal domain similar to ubiquitin itself.
 - Mammalian proto-oncogene c-cbl, a protein interacting with multiple signal
   transduction factors. Cbl has recently been shown to undergo regulatory
   ubiquitination upon macrophage stimulation.
 - Drosophila REF(2)P protein and its mammalian homologues. The murine
   REF(2)P homolog p62 has been shown to bind to ubiquitin via its UBA
   domain [3].
 - A large family of SNF1-like kinases from various plants and other
   organisms.

-Expert(s) to contact by email: Hofmann K.
(khofmann@isrec.unil.ch)

[1] http://ulrec3.unil.ch/domains/uba
[2] UI:97025177
[3] UI:96355343

[D1] INTERPRO:IPR000449
[D2] PFAM:PF00627
[D3] PRINTS:
[D4] SMART:UBA
{END}
{QDOC50032}
{PS50032; KA1}
{Status=empty}
{BEGIN}
*******************************
* Kinase associated domain 1. *
*******************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001772
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50033}
{PS50033; UX_DOMAIN}
{Status=preliminary}
{BEGIN}
**************************
* UBA-associated domain. *
**************************

The UX (UBX) domain is a novel sequence motif found in several proteins having
connections to ubiquitin and the ubiquitination pathway, and it occasionally
occurs together with UBA domains [1]. The UX domain might be distantly related
to ubiquitin-like domains. Proteins known to contain UX domains include:

 - Mouse Fas associated factor FAF1.
 - Mammalian P97/VCP associated protein p47, involved in homotypic ER and
   Golgi fusion control.
 - Human REP-8 protein.
 - Human undulin, a matrix glycoprotein.
 - Yeast protein Shp1.
 - Yeast hypothetical proteins Ybr273c, Ydl091c, Ydr330w, Yjl048c, Ymr067c.

-Expert(s) to contact by email: Hofmann K. (khofmann@isrec.unil.ch)

[1] UI:97025177

[D1] INTERPRO:IPR001012
[D2] PFAM:PF00789
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50043}
{PS50043; HTH_LUXR}
{Status=preliminary}
{BEGIN}
********************************************************************
* Helix-turn-helix domain, luxR and related types (lysR etc)       *
********************************************************************

The luxR family of bacterial transcriptional regulators is described in
PDOC00542. Several subclasses of bacterial helix-turn-helix motifs are related
to each other; in several cases the subclassification even appears artificial.
The luxR family is related to the lysR family (described in PDOC00043).
This profile, based on accepted luxR proteins, also finds a substantial number
of lysR proteins with high significance.

[D1] INTERPRO:IPR000792
[D2] PFAM:PF00196
[D3] PRINTS:PR00038
[D4] SMART:
{END}
{QDOC50079}
{PS50079; NLS_BP}
{Status=incomplete}
{BEGIN}
******************************************
* Bipartite nuclear localization signal. *
******************************************

This is the profile version of the prosite rule NUCLEAR (PS00015). See
PDOC00015 for more explanation.

WARNING: This profile is a frequent hit producer. In the absence of other
evidence (experimental, co-occurrence with DNA-binding domains, etc.) a match
to this entry should only be taken as a weak indication of nuclear
localization.

-Last Update: August 1999 (M. Pagni, P. Bucher).

[D1] INTERPRO:IPR001472
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50083}
{PS50083; SPEC_REPEAT}
{Status=empty}
{BEGIN}
********************
* Spectrin repeat. *
********************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR002017
[D2] PFAM:PF00435
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50085}
{PS50085; RAP_GAP_3}
{Status=empty}
{BEGIN}
****************
* Rap/ran-GAP. *
****************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR000331
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50086}
{PS50086; RAB_GAP}
{Status=empty}
{BEGIN}
**********************
* RabGAP/TBC domain. *
**********************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR000195
[D2] PFAM:PF00566
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50091}
{PS50091; GELS}
{Status=empty}
{BEGIN}
********************
* Gelsolin repeat. *
********************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001974
[D2] PFAM:PF00626
[D3] PRINTS:PR00597
[D4] SMART:
{END}
{QDOC50095}
{PS50095; LH2}
{Status=empty}
{BEGIN}
***********************************************
* Lipoxygenase homology 2 region (contact ... *
***********************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001024
[D2] PFAM:PF01477
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50098}
{PS50098; LRI}
{Status=empty}
{BEGIN}
*********************************************
* A-latrotoxin receptor interaction domain. *
*********************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR002149
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50100}
{PS50100; DA_BOX}
{Status=empty}
{BEGIN}
******************************************************************
* 2nd half motif for nucleotide binding, associated with P-loop. *
******************************************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001051
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50103}
{PS50103; ZF_CCCH}
{Status=preliminary}
{BEGIN}
******************************
* Zinc finger CCCH signature *
******************************

The well conserved structure (C-x(8)-C-x(5)-C-x(3)-H) of the CCCH zinc finger
is present in many eukaryotic proteins from yeast to mammals. It has been
shown that different CCCH zinc finger proteins interact with the 3'
untranslated region of various mRNAs [1,2]. The zinc finger is very often
present in two copies. The proteins currently known to contain one or more
copies of a CCCH zinc finger are listed below.

 - Mammalian tristetraprolin (TTP) proteins, the prototype of the CCCH zinc
   finger proteins, which inhibit TNF alpha production from macrophages by
   destabilizing its messenger RNA [2] (2 copies).
 - Eukaryotic TIS/CTH family. Regulatory proteins involved in regulating the
   response to growth factors, similar to TTP (2 copies).
 - Eukaryotic U2AG, U2R1, U2R2 proteins. Small nuclear ribonucleoprotein
   (1 or 2 copies).
 - Caenorhabditis elegans PIE1 protein. Required for specification of the
   embryonic germ cell lineage and for the lack of mRNA transcription in
   these cells (2 copies).
 - Caenorhabditis elegans POS-1 protein. Cytoplasmic protein similar to
   TIS11, essential for germ line specification in C. elegans (2 copies).
 - Drosophila CPSF 30K/Clipper protein. The multisubunit cleavage and
   polyadenylation specific factor is required for cleavage of the mRNA
   precursor as well as polyadenylation (3 copies).
 - Schizosaccharomyces pombe ZFS1 protein. Isolated as a suppressor of the
   sterility caused by overexpression of a double-stranded RNase.
 - Human Z183 protein. Contains one RING finger and one CCCH zinc finger.
 - Yeast hypothetical YL23 protein (2 copies).
 - Caenorhabditis elegans hypothetical YP25 protein (2 copies).
 - Human hypothetical Y054 protein.

-Consensus pattern: C-x(8)-C-x(5)-C-x(3)-H
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Expert(s) to contact by email:
           Hofmann K.O Kay.Hofmann@memorec.com
-Last update: April 1999 / First entry (N. Hulo, C. Sigrist).

[ 1] Carballo E., Lai W.S., Blackshear P.J.
     Science 281:1001-1005(1998).
[ 2] Lai W.S., Carballo E., Strum J.R., Kennington E.A., Phillips R.S.,
     Blackshear P.J.
     Mol. Cell. Biol. 19:4311-4323(1999).

[D1] INTERPRO:IPR000571
[D2] PFAM:PF00642
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50116}
{PS50116; HTH_FIS_FAMILY}
{Status=incomplete}
{BEGIN}
******************************
* Helix-Turn-Helix fis-type. *
******************************

Another subgroup of bacterial Helix-Turn-Helix transcriptional regulators.
This subgroup is named after the "factor for inversion stimulation", whose
crystal structure has been solved to 2.0 Angstrom [1]. Other proteins
belonging to this subfamily include:

 - E.coli: atoC, hydG, ntrC, fhlA, tyrR
 - Rhizobium: ntrC, nifA, dctD

[1] UI:92318262

[D1] INTERPRO:IPR002197
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50118}
{PS50118; HMG}
{Status=preliminary}
{BEGIN}
*************************************
* High-mobility group (HMG) family. *
*************************************

The high-mobility group domain occurs in a large family of DNA binding
proteins [1,2]. This entry is based on the HMG1/2 family described in
PDOC00305 but is much more general. In addition to the HMG1 and HMG2 proteins,
HMG-domains occur in single or multiple copies in the following protein
classes:

 - HMG1, HMG2 and related proteins
 - The SOX family of transcription factors
 - SRY Sex determining region Y protein and related proteins
 - LEF1 Lymphoid enhancer binding factor 1
 - SSRP Recombination signal recognition protein
 - MTF1 Mitochondrial transcription factor 1
 - UBF1/2 Nucleolar transcription factors
 - Abf2 yeast ARS-binding factor
 - Yeast transcription factors Ixr1, Rox1, Nhp6a, Nhp6b, Spp41 and several
   other proteins.

The phylogeny of HMG-box domains has been extensively studied [3,4] and
several 3D-structures have been determined, e.g. [5,6].

[1] UI:94233631
[2] UI:95303647
[3] UI:94016623
[4] UI:93281400
[5] UI:93223672
[6] UI:93223672

[D1] INTERPRO:IPR000910
[D2] PFAM:PF00505
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50120}
{PS50120; HHH}
{Status=preliminary}
{BEGIN}
******************************
* Helix-hairpin-helix motif. *
******************************

WARNING: Due to the heterogeneity of this family not all HhH are recognized by
this profile.

The HhH motif is a domain of around 20 amino acids present in prokaryotic and
eukaryotic non-sequence-specific DNA binding proteins [1, 2, 3]. The HhH motif
is similar to, but distinct from, the HtH motif. Both of these motifs have two
helices connected by a short turn. In the HtH motif the second helix binds to
DNA with the helix in the major groove. This allows contact between specific
bases and residues throughout the protein. In the HhH motif the second helix
does not protrude from the surface of the protein and therefore cannot lie in
the major groove of the DNA.
Crystallographic studies suggest that the interaction of the HhH domain with
DNA is mediated by amino acids located in the strongly conserved loop
(L-P-G-V) and at the N-terminal end of the second helix [1]. This interaction
could involve the formation of hydrogen bonds between protein backbone
nitrogens and DNA phosphate groups [4].

The structural difference between the HtH and HhH domains is reflected at the
functional level: whereas the HtH domain, found primarily in gene regulatory
proteins, binds DNA in a sequence specific manner, the HhH domain is rather
found in proteins involved in enzymatic activities and binds DNA with no
sequence specificity [4].

Proteins currently known to include the HhH motif are listed below.

 - Prokaryotic UvrC protein. One of the three subunits of the ABC excision
   nuclease, a DNA repair enzyme that catalyzes the excision reaction of
   UV-damaged nucleotide segments.
 - Bacterial DNA ligase protein.
 - Eukaryotic ERCC1/RAD10 family, structure-specific DNA repair endonuclease
   responsible for the 5-prime incision during DNA repair.
 - Bacterial RecA proteins, which bind to single and double stranded DNA and
   exhibit DNA-dependent ATPase activity. They underwind duplex DNA.
 - Bacterial RadC, involved in DNA repair.
 - Yeast SW10 protein, involved in termination of copy-synthesis during
   mating-type switching. Involved in nucleotide excision repair of DNA
   damaged with UV light, bulky adducts, or cross-linking agents. Along with
   rad16 forms an endonuclease that specifically degrades single-stranded
   DNA.
 - Bacillus subtilis Cme1 protein. An integral membrane protein required for
   genetic transformation that is needed for both DNA binding and transport.
 - Escherichia coli endonuclease III, which possesses both an apurinic and/or
   apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It
   has been studied by crystallography [1].
 - Bacillus subtilis ComEA protein.
   This bitopic membrane protein is a DNA receptor for transformation of
   competent Bacillus subtilis [4].

-Consensus pattern: profile.
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Expert(s) to contact by email:
           Hofmann K.O Kay.Hofmann@memorec.com
-Last update: April 1999 / First entry (N. Hulo, C. Sigrist).

[ 1] Thayer M.M., Ahern H., Xing D., Cunningham R.P., Tainer J.A.
     EMBO J. 14:4108-4120(1995).
[ 2] Aravind L., Walker D.R., Koonin E.V.
     Nucleic Acids Res. 27:1223-1242(1999).
[ 3] Provvedi R., Dubnau D.
     Mol. Microbiol. 31:271-280(1999).
[ 4] Doherty A.J., Serpell L.C., Ponting C.P.
     Nucleic Acids Res. 24:2488-2497(1996).

[D1] INTERPRO:IPR000445
[D2] PFAM:PF00633
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50121}
{PS50121; SBP_GLUR}
{Status=empty}
{BEGIN}
*****************************************************
* Solute binding protein/glutamate receptor domain. *
*****************************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001311
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50124}
{PS50124; MET_TRANS}
{Status=empty}
{BEGIN}
***************************************
* Generic methyl-transferase profile. *
***************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001601
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50128}
{PS50128; SURP}
{Status=empty}
{BEGIN}
*********************************************
* SURP module found in splicing regulators. *
*********************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR000061
[D2] PFAM:PF01805
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50129}
{PS50129; PABP}
{Status=empty}
{BEGIN}
**************************************************
* Poly-adenylate binding protein, unique domain. *
**************************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR002004
[D2] PFAM:PF00658
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50130}
{PS50130; SKP1_NT}
{PS50131; SKP1_CT}
{Status=incomplete}
{BEGIN}
*******************************************
* SKP1 N-terminal and C-terminal domains. *
*******************************************

SKP1 (together with SKP2) was identified as an essential component of the
cyclin A-CDK2 S phase kinase complex [1]. It was found to bind several F-box
(QDOC500181) containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be
involved in the ubiquitin protein degradation pathway [2,6]. A yeast homologue
of SKP1 (CB34_YEAST) was identified in the centromere bound kinetochore
complex [3] and is also involved in the ubiquitin pathway [5]. In the slime
mold, FP21 was shown to be glycosylated in the cytosol and has homology to
SKP1 [4].

-Expert(s) to contact by email:
           Hofmann K.O Kay.Hofmann@memorec.com
-Last Update: May 1999 (L. Falquet).

[1] UI:96016087
[2] UI:96319729
[3] UI:96312958
[4] UI:95155385
[5] UI:98050924
[6] UI:98190440

[D1] INTERPRO:IPR001232
[D2] PFAM:PF01466
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50134}
{PS50134; ZF_TAZ}
{Status=preliminary}
{BEGIN}
*******************
* TAZ zinc finger *
*******************

CBP and the related protein p300 are large nuclear molecules that interact
with transcriptional activators and repressors. They belong to a class of
proteins containing a histone acetyltransferase activity, which suggests a
role in chromatin remodeling. They have been implicated in biological
functions as diverse as cell growth, differentiation, or apoptosis [1].

CBP/P300 proteins contain in their N and C terminal parts the so called
transcriptional adaptor zinc finger (TAZ finger). Each TAZ domain is a domain
of around 100 amino acids that shows an internal triplication of a
Cys-x4-Cys-x8-His-x3-Cys module, although some of the repeats are imperfect
[2].
The binding sites for YY1, E1A and TFIIB in CBP and P300 proteins have been
mapped in the region that contains the TAZ zinc finger, suggesting a possible
protein-binding function for this motif. It has been shown that each module is
capable of forming stable secondary structure in the presence of zinc [3]. The
integrity of at least two modules is essential for the interaction with some
CBP partners. This domain has been identified only in proteins belonging to
the CBP/P300 family.

As a signature pattern for the TAZ zinc finger we selected a region that
covers the last two modules. We have also developed a profile that spans a
region slightly bigger on both sides.

-Consensus pattern: C-x(4)-C-x(8)-H-x(3)-C-x(4,16)-C-x(2,4)-C-x(8,9)-H-x(3)-C
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.

-Note: this documentation entry is linked to both a signature pattern and a
 profile. As the profile is much more sensitive than the pattern, you should
 use it if you have access to the necessary software tools to do so.

[ 1] Giles R.H., Peters D.J., Breuning M.H.
     Trends Genet. 14:178-183(1998).
[ 2] Ponting C.P., Blake D.J., Davies K.E., Kendrick-Jones J., Winder S.J.
     Trends Biochem. Sci. 21:11-13(1996).
[ 3] Newton A.L., Sharpe B.K., Kwan A., MacKay J.P., Crossley M.
     J. Biol. Chem. 275:15128-15134(2000).

-Last Update: April 2000 (N. Hulo).

[D1] INTERPRO:IPR000197
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50140}
{PS50140; HSF_ETS}
{Status=preliminary}
{BEGIN}
*******************************
* HSF/ETS DNA-binding domain. *
*******************************

The heat shock factor (HSF) type and ETS-type DNA binding domains are
distantly related, as shown by profile-analysis.
This similarity is supported by the 3D-structures of the HSF-domain [1] and of
the ETS domain [2], which show that both domains belong to the class of
winged-helix DNA binding domains. Other members of this structural class are
the forkhead domain (PDOC00564) and the catabolite activator protein domain.

This profile finds members of the two families:

   PDOC00381  HSF-domain
   PDOC00374  ETS-domain

[1] UI:94112547
[2] UI:96176767

[D1] INTERPRO:IPR002341
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50144}
{PS50144; TRAF}
{Status=preliminary}
{BEGIN}
***********************************
* TRAF-domain (including meprin). *
***********************************

TRAF proteins were isolated by their ability to interact with TNF receptor
[1]. They promote cell survival by activating downstream protein kinases and,
ultimately, transcription factors of the NF-kB and AP-1 families. The TRAF
proteins are composed of 3 structural domains: a RING finger in the N terminal
part of the protein, one to seven repeats of the TRAF zinc finger in the
middle and the TRAF domain in the C terminal part [1]. The TRAF domain is
necessary and sufficient for self-association and receptor interaction. From
the structural analysis two consensus sequences recognised by the TRAF domain
have been defined: a major one, [PSAT]x[QE]E and a minor one, PxQxxD [2].

Some of the proteins containing a TRAF domain are listed below:

 - Mammalian meprin A beta subunit
 - Mammalian TNF receptor associated factor 1 (TRAF1).
 - Mammalian TNF receptor associated factor 2 (TRAF2).
 - Mammalian TNF receptor associated factor 3 (TRAF3).

We have developed a profile that covers the whole domain.

-Last Update: December 2000 (N. Hulo).

[ 1] Rothe M., Wong S.C., Henzel W.J., Goeddel D.V.
     Cell 78:681-692(1994).
[ 2] Ye H., Park Y.C., Kreishman M., Kieff E., Wu H.
     Mol. Cell 4:321-330(1999).
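
-Note: the two receptor consensus sequences quoted in this entry,
 [PSAT]x[QE]E and PxQxxD, can be read as simple regular expressions in which
 'x' stands for any residue. The sketch below is only an illustration of that
 reading, not part of the PROSITE tools: the function name find_motifs and the
 example peptide are hypothetical, and a match says nothing by itself about
 actual TRAF binding.

```python
import re

# Consensus sequences as written in the entry text; 'x' means any residue.
TRAF_MOTIFS = {
    "major": "[PSAT]x[QE]E",
    "minor": "PxQxxD",
}


def find_motifs(sequence: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 1-based start, matched peptide) for every hit."""
    hits = []
    for name, motif in TRAF_MOTIFS.items():
        # Replace the shorthand 'x' by the regex wildcard '.'; the lookahead
        # lets overlapping occurrences be reported as well.
        pattern = re.compile(f"(?=({motif.replace('x', '.')}))")
        for match in pattern.finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Made-up peptide used only for illustration, not a real receptor sequence.
example = "GSPVQETLAAPEQLTD"
for hit in find_motifs(example):
    print(hit)
```

 Positions are reported 1-based, as is usual for protein sequences. The same
 substitution does not cover full PROSITE pattern syntax such as x(4,16),
 which would need a dedicated converter.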

[D1] INTERPRO:
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:MATH
{END}
{QDOC50146}
{PS50146; DAGK}
{Status=empty}
{BEGIN}
**********************
* DAG-kinase domain. *
**********************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001206
[D2] PFAM:PF00781
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50147}
{PS50147; SNF4_REP}
{Status=empty}
{BEGIN}
****************
* SNF4 repeat. *
****************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR000644
[D2] PFAM:PF00571
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50148}
{PS50148; PALP_1}
{Status=preliminary}
{BEGIN}
********************************************************
* Beta family of pyridoxalphosphate dependent enzymes. *
********************************************************

Pyridoxal phosphate (PLP)-dependent enzymes (B6 enzymes) catalyze a wider
variety of different reactions than those containing any other cofactor. With
the exception of glycogen phosphorylase, PLP enzymes catalyze manifold
reactions in the metabolism of amino acids [1]. The common features of PLP
catalysis underlying these diverse reactions are:

 (1) A Schiff base is formed by the amino acid substrate (the amine
     component) and PLP (the carbonyl component).
 (2) The protonated form of PLP acts as an 'electron sink' to stabilize
     catalytic intermediates that are negatively charged - the ring nitrogen
     of PLP attracts electrons from the amino acid substrate.
 (3) The product Schiff base is then hydrolyzed.

In the absence of substrate, the aldehyde group of PLP is in Schiff base
linkage with the epsilon-amino group of a specific lysine residue at the
active site [2]. The particular environment provided by the protein part of
the various PLP enzymes directs the catalytic properties of the coenzyme so as
to provide the holoenzyme with its own substrate and reaction specificity [1].
Sequence analyses have shown that the PLP enzymes can be grouped into three
families of homologous proteins, one of which is the beta family including
[3, 4, 5]:

 - L-serine dehydratase (EC 4.2.1.13) (L-serine deaminase).
 - D-serine dehydratase (EC 4.2.1.14) (D-serine deaminase).
 - Threonine dehydratase biosynthetic (EC 4.2.1.16) (threonine deaminase).
 - Tryptophan synthase beta chain (EC 4.2.1.20)
 - Threonine synthase (EC 4.2.99.2).
 - Cysteine synthase (EC 4.2.99.8) (O-acetylserine sulfhydrylase)
   (O-acetylserine (thiol)-lyase) (CSase).
 - Cystathionine beta-synthase (EC 4.2.1.22) (serine sulfhydrase)
   (beta-thionase).
 - 1-aminocyclopropane-1-carboxylate deaminase (EC 4.1.99.4) (ACC deaminase).
 - Putative diaminopropionate ammonia-lyase (EC 4.3.1.15)
   (diaminopropionatase) (alpha,beta-diaminopropionate ammonia-lyase).

These enzymes generally catalyze elimination and replacement reactions at the
beta-carbon atom of amino acid substrates. Their PLP-binding lysine residue is
positioned in the N-terminal segment of the polypeptide chain [3].

The 3D structure of the beta-subunit of tryptophan synthase has been solved.
The subunit has two domains that are approximately the same size and similar
to each other in folding pattern. Each has a core containing a four-stranded
parallel beta-sheet with three helices on its inner side and one on the outer
side. The cofactor is bound at the interface between the domains [1].

[ 1] John R.A.
     Biochim. Biophys. Acta 1248:81-96(1995).
[ 2] Stryer L.
     Biochemistry, Third Edition, Freeman, New-York (1988).
[ 3] Alexander F.W., Sandmeier E., Mehta P.K., Christen P.
     Eur. J. Biochem. 219:953-960(1994).
[ 4] Kery V., Poneleit L., Meyer J.D., Manning M.C., Kraus J.P.
     Biochemistry 38:2716-2724(1999).
[ 5] Minami R., Uchiyama K., Murakami T., Kawai J., Mikami K., Yamada T.,
     Yokoi D., Ito H., Matsui H., Honma M.
     J. Biochem. 123:1112-1118(1998).

-Last Update: March 2000 (C.J.A. Sigrist).

[D1] INTERPRO:IPR001926
[D2] PFAM:PF00291
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50150}
{PS50150; RFC}
{Status=preliminary}
{BEGIN}
*****************************************
* Replication factor C conserved domain *
*****************************************

DNA replication or DNA repair requires the concerted action of many enzymes,
together with other proteins or cofactors. Among them three main accessory
proteins, replication factor C (RF-C), proliferating-cell nuclear antigen
(PCNA) and replication protein A (RP-A), are essential for accurate and
processive DNA synthesis by DNA polymerases. RF-C is a multiprotein complex
consisting of one large and four small subunits with distinct functions. RF-C
can bind to a template-primer junction and, in the presence of ATP, load the
PCNA clamp onto DNA, thereby recruiting DNA polymerases to the site of DNA
synthesis [1].

Each of the five RF-C subunits as well as the prokaryotic clamp loader share
seven regions of high similarity (termed RFC boxes II to VIII). These boxes
are between 3 and 16 amino acids in length and the distances between them are
similar in all subunits. In the four small subunits, these boxes are clustered
within the N-terminal half of the polypeptide while they are centrally located
in the large subunit [2].

Proteins currently known to include this motif are listed below.

 - Yeast Replication Factor C (RFC), RFC1, RFC2, RFC3, RFC4, RFC5.
 - Eukaryotic Activator 1 (AC1) protein. AC11, AC12, AC13, AC14, AC15. RFC
   homologue.
 - Bacterial DNA Polymerase III delta' subunit.
 - Bacterial DNA Polymerase III TAU subunit (DP3X).
 - Drosophila Germline transcription factor 1 (GFN1).
 - Yeast RAD17/24 proteins. Participate in checkpoint pathways that arrest
   the cell cycle, a mechanism that allows the DNA repair pathways to restore
   the integrity of the DNA prior to DNA synthesis or separation of the
   replicated chromosomes.
 - Yeast protein CHL12, essential for the fidelity of chromosome
   transmission.
 - Bacteriophage DPA4 proteins, required for elongation of primed templates
   by DNA polymerase.

Our RFC profile is directed against a region covering modules VII and VIII.
This region has an alpha-beta-alpha-beta-alpha-alpha-alpha structure and
could, according to mutant phenotype, amino acid conservation and crystal
structure, contain two ATP sensors being part of a nucleotide-binding pocket
[3,4].

-Consensus pattern: profile
-False positives: LON2_BACSU, ATP-dependent protease la homologue, also known
 as the lon family of ATP-dependent proteases. Score 11.8
-Expert(s) to contact by email:
           Hofmann K.O Kay.Hofmann@memorec.com
-Last update: April 1999 / First entry (N. Hulo, C. Sigrist).

[ 1] Mossi R., Hubscher U.
     Eur. J. Biochem. 254:209-216(1998).
[ 2] Cullmann G., Fien K., Kobayashi R., Stillman B.
     Mol. Cell. Biol. 15:4661-4671(1995).
[ 3] Guenther B., Onrust R., Sali A., O'Donnell M., Kuriyan J.
     Cell 91:335-345(1997).
[ 4] Beckwith W.H., Sun Q., Bosso R., Gerik K.J., Burgers P.M., McAlear M.A.
     Biochemistry 37:3711-3722(1998).

[D1] INTERPRO:IPR000862
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50153}
{PS50153; PAP}
{Status=empty}
{BEGIN}
******************************************
* PAP/25A domain (polyA-related domain). *
******************************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001201
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50154}
{PS50154; PAP_CORE}
{Status=empty}
{BEGIN}
************************
* PAP/25A core domain. *
************************

Sorry, this documentation entry has not yet been written.

[D1] INTERPRO:IPR001201
[D2] PFAM:
[D3] PRINTS:
[D4] SMART:
{END}
{QDOC50155}
{PS50155; PAP_ASSOCIATED}
{Status=empty}
{BEGIN}
******************************
* PAP/25A associated domain. *
******************************

Sorry, this documentation entry has not yet been written.
[D1] INTERPRO:IPR002058 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50159} {PS50159; RIBO_S13} {Status=empty} {BEGIN} ************************************* * Ribosomal protein S13/S18 family. * ************************************* Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001892 [D2] PFAM:PF00416 [D3] PRINTS: [D4] SMART: {END} {QDOC50164} {PS50164; UVRC_1} {PS50165; UVRC_2} {Status=preliminary} {BEGIN} ********************************* * UvrC homology region 1 and 2 * ********************************* During the process of E. coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products [1]. The UvrC protein contains 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a Helix hairpin Helix (HhH) domain important for 5' incision of damaged DNA, and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases. The proteins that contain the UvrC homology region 1 are listed below. - Prokaryotic UvrC proteins. - Bacteriophage T4 END2 protein. Small subunit of ribonucleotide reductase enzyme. - Bacteriophage T4 TEV1 protein. Endonuclease specific to the thymidylate synthase (td) gene splice junction; involved in intron homing. - Mycobacterium hypothetical protein Y002. Exonuclease by similarity. - Bacillus subtilis hypothetical protein YURQ. -Consensus pattern: profile. -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last update: April 1999 / First entry (N. Hulo, C. Sigrist). [ 1] Van Houten B., Snowden A. Bioessays 15:51-59(1993). [ 2] Sancar A. Annu. Rev. Biochem. 65:43-81(1996). 
[D1] INTERPRO:IPR000305 [D2] PFAM:PF01541 [D3] PRINTS: [D4] SMART: {END} {QDOC50167} {PS50167; GLYC_TRANS} {Status=empty} {BEGIN} *************************************** * General Glycosyltransferase domain. * *************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001173 [D2] PFAM:PF00535 [D3] PRINTS: [D4] SMART: {END} {QDOC50169} {PS50169; PP2C_1} {Status=empty} {BEGIN} ********************************** * Protein phosphatase 2C, box 1. * ********************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001932 [D2] PFAM:PF00481 [D3] PRINTS: [D4] SMART: {END} {QDOC50170} {PS50170; PP2C_2} {Status=empty} {BEGIN} ********************************** * Protein phosphatase 2C, box 2. * ********************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001932 [D2] PFAM:PF00481 [D3] PRINTS: [D4] SMART: {END} {QDOC50173} {PS50173; UMUC_DOMAIN} {Status=preliminary} {BEGIN} *************** * UMUC domain * *************** In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through DNA lesions, and therefore avoids lethal interruption of DNA replication after DNA damage [1]. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast. We developed a profile against the first 200 aa. Proteins currently known to belong to this family are listed below. - Escherichia coli MucB protein. Plasmid-borne analog of the UmuC protein. - Yeast Rev1 protein. Homologue of UmuC also required for normal induction of mutations by physical and chemical agents. - Salmonella typhimurium ImpB protein. Plasmid-borne analog of the UmuC protein. - Bacterial UmuC protein. - Escherichia coli DNA-damage-inducible protein P (DinP). 
- Salmonella typhimurium SamB protein. Plasmid-associated homologue of UmuC. -Consensus pattern: profile -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last update: April 1999 / First entry (N. Hulo, C. Sigrist). [ 1] Smith B.T., Walker G.C. Genetics 148:1599-1610(1998). [ 2] Walker G.C. Trends Biochem. Sci. 20:416-420(1995). [D1] INTERPRO:IPR001126 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50180} {PS50180; G_ADAPT_CT} {Status=empty} {BEGIN} ************************************ * Gamma-adaptin C-terminal domain. * ************************************ Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001121 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50182} {PS50182; 53EXO_N_DOMAIN} {PS50183; 53EXO_I_DOMAIN} {Status=incomplete} {BEGIN} ***************************** * 5'3'-Exonuclease domains. * ***************************** The N-terminal domain 53EXO_N_DOMAIN and the internal domain 53EXO_I_DOMAIN are commonly found together. They are most often associated with 5' to 3' nuclease activities. The XPG protein signatures (PS00841, PS00842, PDOC00658) are never found outside the "53EXO" domains. The latter are found in more diverse proteins [1,2,3]. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs, allow the diagnosis of several protein families. - In the eubacterial type A DNA-polymerases, the domains 53EXO_N_DOMAIN and 53EXO_I_DOMAIN are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A (PS00447) is always present towards the C-terminus. - Several eucaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids. The XPG protein signatures are always present. - In several proteins from herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. 
These proteins are implicated in the inhibition of the expression of the host genes. - Eucaryotic DNA-repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: May 1999 (M. Pagni). [1] Harrington JJ & Lieber MRH Functional domains within FEN-1 and RAD2 define a family of structure-specific endonucleases: implications for nucleotide excision repair. Genes Dev. 1994 Jun 1;8(11):1344-55. UI:95011546. [2] Mueser TC, Nossal NG & Hyde CC Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell 1996 Jun 28;85(7):1101-12. UI:96270512. [3] Carr AM, Sheldrick KS, Murray JM, al-Harithy R, Watts FZ & Lehmann AR Evolutionary conservation of excision repair in Schizosaccharomyces pombe: evidence for a family of sequences related to the Saccharomyces cerevisiae RAD2 gene. Nucleic Acids Res. 1993 Mar 25;21(6):1345-9. UI:93219111. [D1] INTERPRO:IPR000513 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50185} {PS50185; PHOSPHO_ESTER} {Status=empty} {BEGIN} ********************************** * Metallo-phosphoesterase motif. * ********************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000934 [D2] PFAM:PF00149 [D3] PRINTS:PR00114 [D4] SMART: {END} {QDOC50187} {PS50187; ESTERASE} {Status=empty} {BEGIN} **************************************************** * Esterase/lipase/thioesterase active site serine. * **************************************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR000379 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50188} {PS50188; B302} {Status=preliminary} {BEGIN} ***************************** * B30.2-like domain profile * ***************************** The B30.2-like domain is a conserved domain of 160-170 amino acids which is found in nuclear and cytoplasmic proteins, as well as transmembrane and secreted proteins. It was named after the B30-2 exon which maps within the human class I histocompatibility complex region and codes for a 166-amino-acid peptide similar to the C-terminal domain of human Sjoegren's syndrome nuclear antigen A/Ro (SS-A/Ro) and ret finger protein (RFP), Xenopus nuclear factor 7 (XNF7), and bovine butyrophilin [1]. The B30.2-like domain is found associated with different N-terminal domains: immunoglobulin domain in the case of butyrophilin, zinc-binding B-box domain in the case of RFP and SS-A/Ro and leucine zipper in the case of enterophilin. The function of the B30.2-like domain is not known, but the cytoplasmic B30.2-like domain of butyrophilin has been shown to interact with xanthine oxidase [2,3,4]. The B30.2-like domain contains three highly conserved motifs (LDP, WEVE and LDYE) [3]. The most probable fold for the B30.2-like domain consists of two distinct beta-domains involving 15 beta-strands, with the 5th beta-strand corresponding to the WEVE motif [5]. Some proteins known to contain a B30.2-like domain are listed below: - Proteins of the RBCC (RING-finger, B-box and coiled-coil domain) family, such as Ro/SS-A, RFP, XNF7 and pyrin/marenostrin. - Proteins of the butyrophilin-related family. Butyrophilin is a membrane protein expressed in milk fat globule membrane. Its B30.2 domain is linked to two external immunoglobulin-like motifs by a single transmembrane segment. - Enterophilins, a family of leucine zipper proteins associated with enterocyte differentiation [6]. 
- The alpha and beta subunits of stonustoxin (STNX), a secreted protein that was purified from the venom of stonefish (Synanceja horrida). - SPRY domain-containing proteins with a SOCS box (SSB) [4]. The SOCS proteins appear to form part of a classical negative feedback loop that regulates cytokine signal transduction. - Vitamin-K-dependent gamma-carboxylases [4]. -Note: The B30.2-like domain contains a subdomain called SPRY (see ) and can be considered as a subclass of the SPRY domain family. The SPRY domain has an N-terminal deletion when compared to the B30.2-like domain [5]. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Vernet C., Boretto J., Mattei M.G., Takahashi M., Jack L.J., Mather I.H., Rouquier S., Pontarotti P. J. Mol. Evol. 37:600-612(1993). [ 2] Henry J., Ribouchon M.-T., Depetris D., Mattei M.-G., Offer C., Tazi-Ahnini R., Pontarotti P. Immunogenetics 46:383-395(1997). [ 3] Henry J., Ribouchon M.-T., Offer C., Pontarotti P. Biochem. Biophys. Res. Commun. 235:162-165(1997). [ 4] Henry J., Mather I.H., McDermott M.F., Pontarotti P. Mol. Biol. Evol. 15:1696-1705(1998). [ 5] Seto M.H., Liu H.-L.C., Zajchowski D.A., Whitlow M. Proteins 35:235-249(1999). [ 6] Gassama-Diagne A., Hullin-Matsuda F., Li R.Y., Nauze M., Ragab A., Pons V., Delagebeaudeuf C., Simon M.-F., Fauvel J., Chap H. J. Biol. Chem. 276:18352-18360(2001). [D1] INTERPRO:IPR001870 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50189} {PS50189; NETRIN_CT} {Status=empty} {BEGIN} *************************************************************** * Netrin C-terminal domain (also in complement factors 3/4/5) * *************************************************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR001134 [D2] PFAM:PF01759 [D3] PRINTS: [D4] SMART: {END} {QDOC50193} {PS50193; SAM_BIND} {Status=preliminary} {BEGIN} ********************* * SAM binding motif * ********************* Methyl transfer from S-adenosyl-L-methionine (SAM) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalyzed by methyltransferases (MTases) and modifies DNA, RNA, proteins and small molecules like catechol [1]. The catalytic domain of SAM-MTases is of the alpha/beta type with a central mixed beta-sheet around which several alpha-helices are arranged. Topologically it can be divided into two halves. The first half, formed by beta1-alphaA-beta2-alphaB-beta3-alphaC, is mainly responsible for SAM binding. The second half, beta4-alphaD-beta5-alphaE-beta6-beta7, is primarily responsible for catalysis [2]. According to the sequential order of these two sites, the SAM-MTases can be divided into three families. We developed a profile against the SAM binding site and part of the catalytic site. Consequently, this profile recognizes members of the gamma family of SAM-MTases in which a connection between helix alphaC and strand beta4 links the two regions [3]. Proteins currently known to include a SAM binding motif and recognized by our profile are listed below. - Erm: Erythromycin resistance methyltransferase. rRNA adenine N-6-methyltransferase (EC 2.1.1.48) and other rRNA methyltransferases. N6 methylation of a specific adenine in the peptidyl transferase loop of 23S ribosomal RNA confers resistance against the widely used antibiotic erythromycin. - Spermine synthase (EC 2.5.1.22). - Protein-L-isoaspartate(D-aspartate) O-methyltransferase (EC 2.1.1.77). 
- 3-demethylubiquinone-9 3-methyltransferase (EC 2.1.1.64) - Adenine-specific methyltransferase activity (EC 2.1.1.72) - Catechol O-methyltransferase, membrane-bound form (EC 2.1.1.6) - Caffeoyl-CoA O-methyltransferase (EC 2.1.1.104) - N2,N2-dimethylguanosine tRNA methyltransferase precursor (EC 2.1.1.32). - Hydroxyindole O-methyltransferase (EC 2.1.1.4) - Cyclopropane-fatty-acyl-phospholipid synthase 2 (EC 2.1.1.79) - Delta(24)-sterol C-methyltransferase (EC 2.1.1.41). - Chemotaxis protein methyltransferase (EC 2.1.1.80). - Glycine N-methyltransferase (EC 2.1.1.20) - Sterigmatocystin 7-O-methyltransferase precursor (EC 2.1.1.110). - Precorrin-6Y C5,15-methyltransferase [decarboxylating] (EC 2.1.1.132) - Multifunctional cyclase-dehydratase-3-O-methyl transferase TcmN. - Phosphatidylethanolamine N-methyltransferase (EC 2.1.1.17). - Myo-inositol 4-O-methyltransferase (EC 2.1.1.129) - O-demethylpuromycin-O-methyltransferase (EC 2.1.1.38). - Hexaprenyldihydroxybenzoate methyltransferase precursor (EC 2.1.1.114) - rRNA (guanine-N1-)-methyltransferase (EC 2.1.1.51) - Histamine N-methyltransferase (EC 2.1.1.8) -Consensus pattern: profile -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last update: April 1999 / First entry (N. Hulo, C. Sigrist). [ 1] Schluckebier G., O'Gara M., Saenger W., Cheng X. J. Mol. Biol. 247:16-20(1995). [ 2] Cheng X. Curr. Opin. Struct. Biol. 5:4-10(1995). [ 3] Malone T., Blumenthal R.M., Cheng X. J. Mol. Biol. 253:618-632(1995). [D1] INTERPRO:IPR000051 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50200} {PS50200; RA_DOMAIN} {Status=empty} {BEGIN} ************************** * Ras-associated domain. * ************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR000159 [D2] PFAM:PF00788 [D3] PRINTS: [D4] SMART: {END} {QDOC50202} {PS50202; MSP_DOMAIN} {Status=empty} {BEGIN} ****************************************** * SCS2/MSP (Major sperm protein) domain. * ****************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000535 [D2] PFAM:PF00635 [D3] PRINTS: [D4] SMART: {END} {QDOC50203} {PS50203; CYS_PROT_CALPAIN} {Status=empty} {BEGIN} *********************************** * Calpain-type cysteine-protease. * *********************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001300 [D2] PFAM:PF00648 [D3] PRINTS:PR00704 [D4] SMART: {END} {QDOC50204} {PS50204; UBA_NAD} {Status=preliminary} {BEGIN} *************************************** * UBA/THIF-type NAD/FAD binding fold. * *************************************** The family of Ubiquitin-activating enzymes (described in PDOC00463) shares in its catalytic domain significant similarity with a large family of NAD/FAD-binding proteins. This entry is based on the common NAD/FAD-binding fold and finds members of the following families: - UBA ubiquitin activating enzymes (PDOC00463) - hesA/moeB/thiF family - NADH peroxidases - LDH family (PDOC00062) - Sarcosine oxidase - Phytoene dehydrogenases (PDOC00755) - Alanine dehydrogenases (PDOC00654) - Hydroxyacyl-CoA dehydrogenases (PDOC00065) and many other NAD/FAD dependent dehydrogenases and oxidases. [D1] INTERPRO:IPR000594 [D2] PFAM:PF00899 [D3] PRINTS: [D4] SMART: {END} {QDOC50205} {PS50205; NAD_BINDING} {Status=empty} {BEGIN} ********************* * NAD binding site. * ********************* Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000205 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50206} {PS50206; RHODANESE} {Status=preliminary} {BEGIN} ************************* * Rhodanese/cdc25 fold. 
* ************************* Rhodanese, a sulfurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins [1,2], including - Cdc25 phosphatase catalytic domain - non-catalytic domains of eukaryotic dual-specificity MAPK-phosphatases - non-catalytic domains of yeast PTP-type MAPK-phosphatases - non-catalytic domains of yeast Ubp4, Ubp5, Ubp7 - non-catalytic domains of mammalian Ubp-Y - Drosophila heat shock protein HSP-67BB - several bacterial cold-shock and phage shock proteins - plant senescence associated proteins - catalytic and non-catalytic domains of rhodanese (PDOC00322) [1] K.Hofmann, P.Bucher, A.Kajava (submitted) [2] http://ulrec3.unil.ch/domains/rhodanese/index.html [D1] INTERPRO:IPR001763 [D2] PFAM:PF00581 [D3] PRINTS: [D4] SMART: {END} {QDOC50210} {PS50210; AKAP_DOMAIN} {Status=incomplete} {BEGIN} ************************************** * Protein kinase A anchoring domain. * ************************************** This domain is found in scaffold proteins that bind the regulatory subunit (RII) of protein kinase A with high affinity [1]. Each of the AKAP proteins (more than 36) allows subcellular targeting of protein kinase A through association with structural proteins, membranes or cellular organelles [1]. Gravin, an autoantigen recognized by serum from myasthenia gravis patients, contains 3 repeats of this domain [2]. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: May 1999 (L. Falquet). [1] UI:97123248 [2] UI:97153077 [D1] INTERPRO:IPR001573 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50211} {PS50211; DENN_DOMAIN} {Status=empty} {BEGIN} ********************** * DENN/aex-3 domain. * ********************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001194 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50212} {PS50212; LTE_DOMAIN} {Status=incomplete} {BEGIN} ********************************** * LTE1/rasGRF-associated domain. 
* ********************************** The LTE_DOMAIN domain is made of three alpha helices in the human son of sevenless (Sos) protein [1]. This domain is found in proteins of the guanine-nucleotide dissociation stimulators CDC25 family. It is usually located on the N-terminal side of GRF_CDC25. More information about this family is available in {PDOC00594}. [ 1] Boriack-Sjodin P.A., Margarit S.M., Bar-Sagi D. & Kuriyan J. Nature 394:337-343 (1998). -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: February 2000 (M. Pagni). [D1] INTERPRO:IPR000963 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50213} {PS50213; BIGH3_DOMAIN} {Status=incomplete} {BEGIN} ******************************** * Beta-Ig-H3/Fasciclin domain. * ******************************** The BIGH3 domain is an extracellular module of about 140 amino acids occurring as multiple repeats in a limited number of proteins, including Drosophila fasciclin I, TGF-beta induced protein Ig-H3, osteoblast-specific factor 2 (OSF-2) [1], as well as several hypothetical proteins from plants, fungi, and protozoans. Interestingly, a few bacterial proteins also contain one copy of this domain, e.g. MPB70 from Mycobacterium bovis which has been implicated in vaccination-induced osteitis [2]. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: May 1999 (P. Bucher). [ 1] UI:93371373 [ 2] UI:95122204 [D1] INTERPRO:IPR000782 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50218} {PS50218; FIZZY_DOMAIN} {Status=empty} {BEGIN} *********************** * FIZZY/CDC20 domain. * *********************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000002 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50219} {PS50219; ROM_MOTIF} {Status=preliminary} {BEGIN} *************************** * Citron homology domain. 
* *************************** Based on sequence similarities a domain of homology has been identified in the following proteins [1]: - Citron and Citron kinase. These two proteins interact with the GTP-bound forms of the small GTPases Rho and Rac but not with Cdc42. - Myotonic dystrophy kinase-related Cdc42-binding kinase (MRCKalpha). This serine/threonine kinase interacts with the GTP-bound form of the small GTPase Cdc42 and to a lesser extent with that of Rac. - NCK Interacting Kinase (NIK), a serine/threonine protein kinase. - ROM-1 and ROM-2, from yeast. These proteins are GDP/GTP exchange proteins (GEPs) for the small GTP binding protein Rho1. This domain, called the citron homology domain, is often found after cysteine rich and pleckstrin homology (PH) domains at the C-terminal end of the proteins [1]. It acts as a regulatory domain and could be involved in macromolecular interactions [1, 2]. [ 1] Chen X.Q., Tan I., Leung T., Lim L. J. Biol. Chem. 274:19901-19905(1999). [ 2] Su Y.C., Han J., Xu S., Cobb M., Skolnik E.Y. EMBO J. 16:1279-1290(1997). -Last Update: March 2000 (C.J.A. Sigrist). [D1] INTERPRO:IPR001180 [D2] PFAM:PF00780 [D3] PRINTS: [D4] SMART:CNH {END} {QDOC50220} {PS50220; REM_REPEAT} {Status=incomplete} {BEGIN} ********************************************* * PKN/rhophilin/rhotekin rho-binding repeat * ********************************************* The REM repeat, which is also called rho effector or HR1 domain, was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN) [1]. The first two of these repeats were later shown to bind the small G protein rho [2,3] known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal REM repeat complexed with RhoA has been determined by X-ray crystallography [4]. 
It forms an antiparallel coiled-coil fold termed an ACC finger. -Sequences known to belong to this class detected by the profile: ALL -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: February 2000 (P. Bucher). [ 1] UI:95154310 [ 2] UI:96213692 [ 3] UI:98112814 [ 4] UI:20085746 [E1] http://smart.embl-heidelberg.de/smart/do_annotation.pl?DOMAIN=HR1 [D1] INTERPRO:IPR000861 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50224} {PS50224; SPRY} {Status=preliminary} {BEGIN} *********************** * SPRY domain profile * *********************** The SPRY domain is a domain of around 140 amino acids which was originally identified in the SPlA kinase of Dictyostelium and the rabbit RYanodine receptor (RyR). The SPRY domain is found in one or three copies (like in splA and RyR) in eucaryotic nuclear and cytoplasmic proteins, as well as in transmembrane and secreted proteins. The suggested functions for some SPRY domain containing proteins are RNA-binding, cell growth and differentiation. It has been proposed that the SPRY domain might play a role in RNA-binding or protein-protein interaction [1,2,3]. Secondary structure predictions suggest that the SPRY domain contains ten beta-strands and it has been proposed that the SPRY domain could have a C-terminal Immunoglobulin (Ig)-like fold and could be a member of the Ig superfamily [2]. Some proteins known to contain a SPRY domain are listed below: - Dictyostelium discoideum splA, a dual-specificity kinase that regulates spore cell differentiation. - Ryanodine receptors (RyRs), involved in the release of Ca2+ ions from intracellular stores. - SPRY domain-containing proteins with a SOCS box (SSB) [4]. The SOCS proteins appear to form part of a classical negative feedback loop that regulates cytokine signal transduction. - Proteins of the RBCC (RING-finger, B-box and coiled-coil domain) family, such as Sjoegren syndrome type-A or Ro/SS-A antigen, Xenopus nuclear factor 7 (Xnf7) and pyrin/marenostrin. 
- Proteins of the butyrophilin-related family. Butyrophilin is a membrane protein expressed in milk fat globule membrane. Its SPRY domain is linked to two external immunoglobulin-like motifs by a single transmembrane segment. - Enterophilins, a family of leucine zipper proteins associated with enterocyte differentiation [5]. - The alpha and beta subunits of stonustoxin (STNX), a secreted protein that was purified from the venom of stonefish (Synanceja horrida). -Note: The SPRY domain is a subdomain of the B30.2-like domain, which can be considered as a subclass of the SPRY domain family. The SPRY domain has an N-terminal deletion when compared to the B30.2-like domain [2]. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Ponting C., Schultz J., Bork P. Trends Biochem. Sci. 22:193-194(1997). [ 2] Seto M.H., Liu H.-L.C., Zajchowski D.A., Whitlow M. Proteins 35:235-249(1999). [ 3] Schenker T., Trueb B. Biochim. Biophys. Acta 1493:255-258(2000). [ 4] Hilton D.J., Richardson R.T., Alexander W.S., Viney E.M., Willson T.A., Sprigg N.S., Starr R., Nicholson S.E., Metcalf D., Nicola N.A. Proc. Natl. Acad. Sci. U.S.A. 95:114-119(1998). [ 5] Gassama-Diagne A., Hullin-Matsuda F., Li R.Y., Nauze M., Ragab A., Pons V., Delagebeaudeuf C., Simon M.-F., Fauvel J., Chap H. J. Biol. Chem. 276:18352-18360(2001). [D1] INTERPRO:IPR003878 [D2] PFAM:PF00622 [D3] PRINTS: [D4] SMART:SPRY {END} {QDOC50226} {PS50226; PA_PHOSPHATASE} {Status=empty} {BEGIN} ************************************************** * PA-phosphatase and other phosphomonoesterases. * ************************************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR000326 [D2] PFAM:PF01569 [D3] PRINTS: [D4] SMART: {END} {QDOC50229} {PS50229; RANBP1_WASP} {Status=preliminary} {BEGIN} *********************************************** * RanBP1-WASP or WASP homology domain 1 (WH1) * *********************************************** The RanBP1-WASP domain is found in proteins implicated in a diverse range of signaling, nuclear transport and cytoskeletal events. This domain of around 130 amino acids is present in species ranging from yeast to mammals. It seems to be a protein-protein interaction module, as it has been shown that the RanBP1-WASP domain in cytoskeletal proteins binds the proline-rich motif FPPPP [1]. Five proteins containing the FPPPP sequence are so far known to bind WH1 domains: the actin cytoskeleton-related proteins ActA, Vinculin, Zyxin, the WIP protein, and the cytoplasmic domain of metabotropic glutamate receptors [2]. Proteins of the RanBP1 family contain a WH1 domain in their N-terminal region, which seems to bind a different sequence motif present in the C-terminal part of RanGTP protein [3]. The tertiary structure of the WH1 domain of the Mena protein revealed structural similarities with the pleckstrin homology (PH) domain (PDOC50003) and suggests that the WH1 domain could also be involved in phospholipid binding [4]. Note: We also developed a profile which allows the specific detection of the WH1 domain in the RanBP1 family (see PS50196). Some of the proteins in which a WH1 domain is found are listed below. - Mammalian Ran Binding Protein 1 (RanBP1). May act in an intracellular signaling pathway which may control the progression through the cell cycle by regulating the transport of protein and nucleic acids across the nuclear membrane. - Mammalian vasodilator-stimulated phosphoprotein (VASP). Actin- and profilin-binding microfilament-associated protein. May act in concert with profilin to convey signal transduction to actin filament production. - Mammalian Wiskott-Aldrich syndrome protein (WASP). 
Possible regulator of lymphocyte and platelet function. Defects in WASP are the cause of Wiskott-Aldrich syndrome (WAS), an X-linked recessive immunodeficiency characterized by eczema, thrombocytopenia, recurrent infections, and bloody diarrhea. - Mammalian Ran Binding Protein 2 or Nup358. Giant nucleoporin, component of the fibrils that extend in the cytosol from the cytoplasmic face of the nuclear pore and associates with a form of RanGAP that is modified by a small ubiquitin-like protein SUMO-1. - Mammalian enable protein (Mena). Adapter protein implicated in the spatial control of actin assembly. - Yeast yrb1 protein (RanBP1 homologue). Essential for viability. Mutations display defects in both nuclear protein import and RNA export. - Yeast nucleoprotein 2 (NUP2). Component of nuclear pore complex. Nucleoporins may be involved in both binding and translocation of proteins during nucleocytoplasmic transport. [1] Niebuhr K., Ebel F., Frank R., Reinhard M., Domann E., Carl U.D., Walter U., Gertler F.B., Wehland J., Chakraborty T. EMBO J. 16:5433-5444(1997). [2] Callebaut I., Cossart P., Dehoux P. FEBS Lett. 441:181-185(1998). [3] Beddow A.L., Richards S.A., Orem N.R., MacAra I.G. Proc. Natl. Acad. Sci. U.S.A. 92:3328-3332(1995). [4] Prehoda K.E., Lee D.J., Lim W.A. Cell 97:471-480(1999). -Last Update: March 2000 (N. Hulo). [D1] INTERPRO:IPR000697 [D2] PFAM: [D3] PRINTS: [D4] SMART:WH1 {END} {QDOC50230} {PS50230; HEMOLYSIN_PORE} {Status=preliminary} {BEGIN} *********************************************** * Combined Leukocidin/ASH4 hemolysin profile. * *********************************************** Pore forming toxins represent the most potent and versatile weapons with which invading microbes damage the host macroorganism. They are produced as water soluble single-chain polypeptides that generally oligomerize and insert into target membranes to form water-filled channels that lead to cell death and lysis [1]. 
The different pore forming toxins do not form one unique class of molecules as they differ in sequence and structure. This profile is directed against two highly divergent families of pore forming toxins. Proteins of the first family generally have a molecular mass of 32-34 kDa, a basic pI, a sequence with no prominent hydrophobic segments, a glycine-rich region intermediate between the amino and carboxyl termini, and a preponderance of aromatic residues in the carboxyl terminal half of the protein [1, 3]. The following proteins belong to this family: - alpha-hemolysin from Staphylococcus aureus. The homoheptameric pore formed by this protein comprises the cap, rim and stem domains. The stem domain is a 14-strand antiparallel beta-barrel that defines the transmembrane channel. The cap domain protrudes from the extracellular surface and forms a large hydrophilic domain, and the 7 rim domains define the underside of the cap and are in close proximity to, if not in direct contact with, the outer leaflet of the cell membrane (see ). - gamma-hemolysins and leukocidins from Staphylococcus aureus. These toxins consist of a group of proteins (HlgA, HlgB, HlgC, LukF, LukS, etc.), each belonging to a class F or class S subtype. As the active gamma-hemolysin or leukocidin toxins are composed of class F and S components, they have been called bi-component toxins. Resolution of the 3D structure of the LukF monomer has shown that its core is very similar to that of alpha-hemolysin despite low sequence identity [4] (see ). - beta-toxin from Clostridium perfringens. - hemolysin II from Bacillus cereus. This protein is substantially larger than the other members of the group. Proteins belonging to the second family are found in bacteria of the genera Vibrio and Aeromonas [5, 6]. They are larger than the proteins of the first family and they contain a ricin B lectin domain (see ) [7]. This family contains: - hemolysin HlyA from Vibrio cholerae. 
It is produced as a precursor form (pro-HlyA) that is processed and activated after cleavage of an N-terminal polypeptide [8]. The pore is thought to be a pentamer [9]. - hemolysin VmhA from Vibrio mimicus [10]. - hemolysin VAH1 from Vibrio anguillarum. - cytolysin-hemolysin VVHA from Vibrio vulnificus. - hemolysin HlyA from Aeromonas hydrophila. - hemolysin AHH1 from Aeromonas hydrophila. - hemolysin ASH4 from Aeromonas salmonicida. [ 1] Bhakdi S., Bayley H., Valeva A., Walev I., Walker B., Kehoe M., Palmer M. Arch. Microbiol. 165:73-79(1996). [ 2] Gouaux E., Hobaugh M., Song L. Protein Sci. 6:2631-2635(1997). [ 3] Gouaux E. J. Struct. Biol. 121:110-122(1998). [ 4] Olson R., Nariya H., Yokota K., Kamio Y., Gouaux E. Nat. Struct. Biol. 6:134-140(1999). [ 5] Hirono I., Aoki T. Microb. Pathog. 15:269-282(1993). [ 6] Hirono I., Masuda T., Aoki T. Microb. Pathog. 21:173-182(1996). [ 7] Sigrist C.J.A., Unpublished observation (2000). [ 8] Nagamune K., Yamamoto K., Naka A., Matsuyama J., Miwatani T., Honda T. Infect. Immun. 64:4655-4658(1996). [ 9] Zitzer A., Zitzer O., Bhakdi S., Palmer M. J. Biol. Chem. 274:1375-1380(1999). [10] Kim G.T., Lee J.Y., Huh S.H., Yu J.H., Kong I.S. Biochim. Biophys. Acta 1360:102-104(1997). -Last Update: April 2000 (C.J.A. Sigrist). [D1] INTERPRO:IPR001340 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50236} {PS50236; CLATHRIN_REPEAT} {Status=preliminary} {BEGIN} **************************************************** * 7-fold repeat in Clathrin, also in VPS proteins. * **************************************************** Each repeat is about 140 amino acids long and can be found in two types of arrangement: a) 7-fold repeat in the arm region of the clathrin heavy chain b) 1 repeat in VPS (vacuolar membrane protein) Both clathrins and VPS are involved in biogenesis and maintenance of vacuoles, coated pits and vesicles. They are located on the cytoplasmic face of those vacuoles. 
The 7-fold repeat of clathrin is involved in the binding of ankyrin repeat D4 [1], as well as in the self-assembly of clathrin heavy chains [2]. The clathrin triskelion is a trimer of heavy-chain subunits, each binding a single light-chain subunit [3]. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last update: May 2000 / First entry (L. Falquet). [1] Michaely P, Kamal A, Anderson RG, Bennett V. A requirement for ankyrin binding to clathrin during coated pit budding. J Biol Chem. 1999 Dec 10;274(50):35908-13. UI: 20054478 [2] Ybe JA, Brodsky FM, Hofmann K, Lin K, Liu SH, Chen L, Earnest TN, Fletterick RJ, Hwang PK. Clathrin self-assembly is mediated by a tandemly repeated superhelix. Nature. 1999 May 27;399(6734):371-5. UI: 99287321 [3] Nathke IS, Heuser J, Lupas A, Stock J, Turck CW, Brodsky FM. Folding and trimerization of clathrin subunits at the triskelion hub. Cell. 1992;68:899-910. UI: 92191269 [3D] 1b89 [D1] INTERPRO:IPR000547 [D2] PFAM:PF00637 [D3] PRINTS: [D4] SMART: {END} {QDOC50238} {PS50238; RHO_GAP} {Status=empty} {BEGIN} ****************** * RhoGAP domain. * ****************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000198 [D2] PFAM:PF00620 [D3] PRINTS: [D4] SMART: {END} {QDOC50239} {PS50239; GLYCEROL_ACYLTRANS} {Status=empty} {BEGIN} ******************************************************************** * Phospholipid and glycerol acyltransferase (from 'motifs_6.msf'). * ******************************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR002123 [D2] PFAM:PF01553 [D3] PRINTS: [D4] SMART: {END} {QDOC50242} {PS50242; SUR2_DOMAIN} {Status=empty} {BEGIN} ****************************************************** * SUR2-type hydroxylase/desaturase catalytic domain. * ****************************************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR001541 [D2] PFAM:PF01598 [D3] PRINTS: [D4] SMART: {END} {QDOC50243} {PS50243; RASGAP_CTERM} {Status=empty} {BEGIN} *********************************************************** * Domain frequently found at C-terminus of rasGAP domain. * *********************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000593 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50246} {PS50246; UBACT_REPEAT} {Status=incomplete} {BEGIN} ************************************************** * Repeat in ubiquitin-activating (UBA) proteins. * ************************************************** Ubiquitin-activating enzymes (E1 or UBA) are the first enzymes in the ubiquitin protein degradation pathway [1]. This domain is found twice in each member of the ubiquitin-activating enzymes. It is located downstream of the active site cysteine [2]. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: May 1999 (L. Falquet). [1] UI:98431658 [2] UI:92340519 [D1] INTERPRO:IPR000127 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50247} {PS50247; PROTEASOME_PROTEASE} {Status=preliminary} {BEGIN} ********************************************** * Multispecific proteases of the proteasome. * ********************************************** The eukaryotic 20S proteasome, which is the catalytic core proteasome, consists of 7 different A-type and 7 different B-type subunits. The prokaryotic proteasome uses multiple copies of a single A-type and a single B-type subunit. The A-type and B-type subunits are clearly related. 
This entry finds both types, which are currently described in two different PROSITE documentation entries: PDOC00326 A-type PDOC00668 B-type [D1] INTERPRO:IPR001353 [D2] PFAM:PF00227 [D3] PRINTS: [D4] SMART: {END} {QDOC50248} {PS50248; APC_SEN3_REPEAT} {Status=preliminary} {BEGIN} ****************************************************** * Repeat in APC and proteasome components (3 copies) * ****************************************************** A weakly conserved repeat module of unknown function, which occurs in two regulatory subunits of the 26S-proteasome and in one subunit of the APC-complex (cyclosome) [1]. The proteasome subunits containing this repeat are S1 (Sen3p in yeast) and S2 (p97 in human, Nas1p in yeast). The APC/cyclosome subunit containing this repeat is called Apc1 in yeast, BimE in Aspergillus, tsg24 in human and Cut4 in S.pombe. [1] UI:97348748 [D1] INTERPRO:IPR002015 [D2] PFAM:PF01851 [D3] PRINTS: [D4] SMART: {END} {QDOC50249} {PS50249; MPN_DOMAIN} {Status=preliminary} {BEGIN} ********************************************************* * Domain in subunits of the 26S proteasome and of eIF3 * ********************************************************* A domain of unclear function in the N-terminus of various regulatory components of the proteasome and possibly also other proteins [1,5]. Proteins containing this domain include: - yeast 26S-proteasome regulatory component Rpn11(Mpr1) [2] - yeast 26S-proteasome regulatory component Rpn8 [2] - yeast ORF Ydl216c - S.pombe Pad1, positive regulator of transcription factor Pap1, probably proteasome component [3]. - Mammalian POH1, a Pad1 homologue and probably a proteasome component [4] - Human C6.1a, a gene of unknown function in Xq28 translocation region - Mammalian proteasome regulatory subunit p40 (MOV34) - Human translation initiation factor 3 47 kDa subunit [5] - Human translation initiation factor 3 40 kDa subunit [5] [1] Hofmann, K., Bucher, P. Trends in Biochem. Sci. 
23:204-205 (1998) [2] Glickman M., Rubin, D., Fried, V., and Finley, D. Mol.Cell.Biol. in press [3] UI:95286704 [4] UI:98043754 [5] UI:98001678 [D1] INTERPRO:IPR000555 [D2] PFAM:PF01398 [D3] PRINTS: [D4] SMART: {END} {QDOC50250} {PS50250; PCI_DOMAIN} {Status=preliminary} {BEGIN} ***************************************************************** * Domain in components of the proteasome, COP9-complex and eIF3 * ***************************************************************** A homology domain of unclear function, occurring in the C-terminal region of several regulatory components of the 26S proteasome as well as in other proteins [1]. Apparently, all of the characterized proteins containing PCI domains are parts of larger multi-protein complexes. Proteins with PCI domains include: - Budding yeast proteasome regulatory components Rpn3(Sun2), Rpn5, Rpn6, Rpn7, Rpn9 [2] - Mammalian proteasome regulatory components p55, p58 and p44.5 - Budding yeast Rpg1 and Nip1, subunits of the translation initiation factor 3 complex [3,4]. - Mammalian p110 and INT6, subunits of the mammalian translation initiation factor 3 complex [3,4] - Arabidopsis COP9 and FUS6/COP11, components of the plant COP9-complex, involved in light signaling [5]. - Mammalian G-protein pathway suppressor GPS1 - Subunits of the "signalosome" and of a multi-protein complex phosphorylated by Src. - Budding yeast ORF Yil071c - Several uncharacterized ORFs from plants, nematodes and mammals. It cannot be excluded that several of the multi-protein complexes mentioned in the above list are similar or even identical. The complete homology domain comprises approx. 200 residues; the highest conservation is found in the C-terminal half. Several of the proteins mentioned above have no detectable homology to the N-terminal half of the domain. [1] Hofmann, K., Bucher, P. Trends in Biochem. Sci. 23:204-205 (1998) [2] Glickman M., Rubin, D., Fried, V., and Finley, D. Mol.Cell.Biol. 
(in press) [3] UI:97150873 [4] UI:98001678 [5] UI:96291404 [D1] INTERPRO:IPR000717 [D2] PFAM:PF01399 [D3] PRINTS: [D4] SMART: {END} {QDOC50251} {PS50251; INTRON_ENDONUC} {Status=empty} {BEGIN} ********************************************************* * Domain in intron-encoded endonucleases and maturases. * ********************************************************* Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001982 [D2] PFAM:PF00961 [D3] PRINTS: [D4] SMART: {END} {QDOC50258} {PS50258; NOTCH_LIN_REPEAT} {Status=preliminary} {BEGIN} *********************** * Lin-12/Notch repeat * *********************** The Lin-12/Notch repeat (LNR) region is present only in Notch related proteins. The lin-12/Notch proteins act as transmembrane receptors for intercellular signals that specify cell fates during animal development. In response to ligand, proteolytic cleavages release the intracellular domain of Notch, which then gains access to the nucleus and acts as a transcriptional co-activator [3]. The LNR region is thought to negatively regulate the activity of the Lin-12/Notch proteins. It is a triplication of a module of about 35-40 amino acids, present on the extracellular part of the protein [1, 2]. Each module contains six cysteine residues engaged in three disulfide bonds and three conserved aspartate and asparagine residues [3]. The biochemical characterization of a recombinantly expressed LIN-12.1 module from the human Notch1 receptor indicates that the disulfide bonds are formed between the first and fifth, second and fourth, and third and sixth cysteines. The formation of this particular disulfide isomer is favored by the presence of Ca++, which is also required to maintain the structural integrity of the rLIN-12.1 module. The conserved aspartate and asparagine residues are likely to be important for Ca++ binding, and thereby contribute to the native fold [3]. Proteins currently known to include the LNR region are listed below. 
- Drosophila Notch protein. - Vertebrate Notch proteins. - Caenorhabditis elegans Lin-12 protein. - Caenorhabditis elegans GLP1 protein. -Consensus pattern: C-x(4,7)-C-x(5)-[DN]-x(2)-C-x(3)-C-x(4)-C-x-[WYF]-D-x(2)-D-C -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last update: April 1999 / First entry (N. Hulo, C. Sigrist). [ 1] Greenwald I. Genes Dev. 12:1751-1762(1998). [ 2] Greenwald I. Curr Opin Genet Dev 4:556-562(1994). [ 3] Aster J. C., Simms, W. B., Zavala-Ruiz, Z., Patriub, V., North, C. L., Blacklow S. C. Biochemistry 38:4736-4742(1999). [D1] INTERPRO:IPR000800 [D2] PFAM:PF00066 [D3] PRINTS: [D4] SMART: {END} {QDOC50260} {PS50260; 7TM_NEMATODE} {Status=preliminary} {BEGIN} **************************************************************************** * 7-Helix G-protein coupled receptor, nematode (probably olfactory) family * **************************************************************************** The nematode C. elegans contains a large family of highly divergent 7-transmembrane G-protein coupled receptors, some of which are known to be expressed in sensory neurons [1]. One member (odr-10) was later shown to function as a chemoreceptor for diacetyl [2]. Five subfamilies are currently distinguished: sra, srb, srd, sre, and srg. It is assumed that all members of these subfamilies function as olfactory receptors [3]. The profile developed for this family covers all seven transmembrane helices. -Sequences known to belong to this class detected by the profile: ALL -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: June 2000 (C.J.A. Sigrist). [ 1] Medline:96028095. [ 2] Medline:96182650. [ 3] Medline:98248686. 
[D1] INTERPRO:IPR000168 [D2] PFAM:PF01461 [D3] PRINTS: [D4] SMART: {END} {QDOC50264} {PS50264; FMN_ENZYMES} {Status=empty} {BEGIN} ********************************************************************* * Proteins binding FMN and related compounds (core region profile). * ********************************************************************* Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000902 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50265} {PS50265; CHANNEL_PORE_K} {PS50273; CHANNEL_PORE_CA_NA} {PS50266; CATION_CHANNEL_TM} {PS50272; CATION_CHANNEL_TRPL} {Status=incomplete} {BEGIN} ************************************** * Pore region of potassium channels. * ************************************** *************************************************** * Cation channels, 6TM region (non-ligand gated). * *************************************************** *********************************************************************** * Cation channels, 6TM region (transient receptor potential subtype). * *********************************************************************** *************************************************** * Calcium and sodium channel pore region (S4-S6). * *************************************************** Ion channels are integral membrane proteins involved in membrane excitability, muscle contraction and synaptic transmission [1,2]. The six-transmembrane region is covered by both profiles, CATION_CHANNEL_TM and CATION_CHANNEL_TRPL. The latter is usually associated with the former, although the converse is not true. The domains CHANNEL_PORE_K and CHANNEL_PORE_CA_NA are, in principle, mutually exclusive. [1] Methods in Enzymology (19??) Vol. 207 [2] Methods in Enzymology (1999) Vol. 
294 {END} {QDOC50266} {PS50266; CATION_CHANNEL_TM} {Status=empty} {BEGIN} *************************************************** * Cation channels, 6TM region (non-ligand gated). * *************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000636 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50271} {PS50271; ZF_UBP} {Status=preliminary} {BEGIN} ************************************** * Zn-finger in ubiquitin-hydrolases * ************************************** This domain displays some similarities with the Zn binding domain of the Insulinase family [1]. It is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP) [2,3]. All members of this subfamily are Isopeptidase-T enzymes, which are known to cleave isopeptide bonds between ubiquitin moieties. Some of the proteins containing a UBP zinc finger are listed below: - Human Deubiquitinating enzyme 13 (UBPD). - Human Deubiquitinating enzyme 5 (UBP5). - Dictyostelium discoideum Deubiquitinating enzyme A (UBPA). - Yeast Deubiquitinating enzyme 8 (UBP8). - Yeast Deubiquitinating enzyme 14 (UBP14). We have developed a profile that covers the whole domain. -Expert(s) to contact by email: Hofmann K.O Kay.Hofmann@memorec.com -Last Update: May 1999 (L. Falquet, N. Hulo). [ 1] Becker A.B., Roth R.A. Meth. Enzymol. 248:693-703(1995). Medline:95405296 [ 2] Hershko A., Ciechanover A. Annu. Rev. Biochem. 67:425-479(1998). Medline:98431658 [ 3] Wilkinson K.D. FASEB J. 11:1245-1256(1997). Medline:98072201 [D1] INTERPRO:IPR001607 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50272} {PS50272; CATION_CHANNEL_TRPL} {Status=empty} {BEGIN} *********************************************************************** * Cation channels, 6TM region (transient receptor potential subtype). * *********************************************************************** Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR002111 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50273} {PS50273; CHANNEL_PORE_CA_NA} {Status=empty} {BEGIN} *************************************************** * Calcium and sodium channel pore region (S4-S6). * *************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001682 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50282} {PS50282; TRANSP_LYSE_YGGA} {Status=empty} {BEGIN} *********************************************************** * Bacterial transmembrane transporters, lysE/yggA family. * *********************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001123 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50284} {PS50284; TRANSP_CYT_PUR} {Status=empty} {BEGIN} *************************************************************** * Permeases for cytosine/purines, uracil, thiamine, allantoin * *************************************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR001248 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50288} {PS50288; COLLAGEN_REP} {Status=empty} {BEGIN} ********************************************** * Collagen repeat (G-x-x), circular profile. * ********************************************** Sorry, this documentation entry has not yet been written. [D1] INTERPRO:IPR000087 [D2] PFAM:PF01391 [D3] PRINTS: [D4] SMART: {END} {QDOC50292} {PS50292; PEROXIDASE_3} {Status=preliminary} {BEGIN} ************************************************************************ * Myeloperoxidase, thyroid peroxidase, cyclooxygenase catalytic domain * ************************************************************************ A subfamily of peroxidases (PDOC00394) share an extended similarity over the whole catalytic domain. 
These enzymes include: - Myeloperoxidase - Thyroid peroxidase - Eosinophil peroxidase - Lactoperoxidase - Melanogenic peroxidase - Prostaglandin G/H synthase (cyclooxygenase) All of these peroxidases contain the heme-attachment site discussed in the PROSITE documentation entry PDOC00394. [D1] INTERPRO:IPR001536 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50298} {PS50298; TGFB_RECEPTOR} {Status=preliminary} {BEGIN} ************************************************** * TGF-beta receptor family, extracellular domain * ************************************************** Transforming growth factor-beta (TGF-beta) forms a family with other growth factors described in PDOC00223. The receptors for most of the members of this growth factor family are related. These proteins are receptor-type kinases of Ser/Thr type (PDOC00100), which have a single transmembrane domain and a specific Cys-rich extracellular domain [1-3]. Proteins belonging to this family include - TGF-beta receptors - Activin receptors - Bone morphogenetic protein (BMP) receptors - Anti-Mullerian hormone receptors - Drosophila saxophone (Sax) protein - Drosophila punt protein - Nematode daf-1 and daf-4 proteins The profile spans the conserved C-terminal part of this extracellular domain. Some of the receptors of this family contain subclass-specific N-terminal extensions of this homology domain. [1] UI:97175397 [2] UI:94322910 [3] UI:97066313 [D1] INTERPRO:IPR000472 [D2] PFAM:PF01064 [D3] PRINTS: [D4] SMART: {END} {QDOC50299} {PS50299; CR2A} {Status=empty} {BEGIN} ************************************* * Cytokine receptor class 2 family. * ************************************* Sorry, this documentation entry has not yet been written. 
[D1] INTERPRO:IPR000282 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50302} {PS50302; PUM-REPEATS} {PS50303; PUM_REP} {Status=incomplete} {BEGIN} ****************************** * Pumilio RNA-binding domain * ****************************** The RNA-binding domain of the Drosophila Pumilio protein regulates mRNA translation and intracellular localization by binding to a specific sequence motif in the 3'UTR. It consists of 8 imperfect repeats of about 35 amino acids. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The circular PROSITE profile PS50302/PUM_REPEATS characterises an entire RNA-binding domain. The PROSITE profile PS50303/PUM_REP characterises a single repeat unit. [1] UI:98067397 [D1] INTERPRO:IPR002997 [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50330} {PS50330; UIM} {Status=preliminary} {BEGIN} **************** * UIM profile. * **************** The Ubiquitin Interacting Motif (UIM) was first described in the 26S proteasome subunit PSD4/RPN-10 [1]. It is known to bind multiple ubiquitin and was also found in many proteins involved in the endocytic pathway [2]. - PSD4/RPN-10/S5a multiubiquitin binding subunit of the 26S proteasome - VPS27 vacuolar sorting protein - Ataxin-3 protein involved in ataxia disease. [ 1] Young P., Deveraux Q., Beal R.E., Pickart C.M., Rechsteiner M. J. Biol. Chem. 273:5461-5467(1998). [ 2] Hofmann K., Falquet L. Trends Biochem. Sci. 26:347-350(2001). [D1] INTERPRO: [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50401} {PS50404; GST_N} {PS50405; GST_C} {BEGIN} {Status=preliminary} ****************************** * Glutathione S-transferase. * ****************************** In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. Other proteins, with diverse functions, also contain a GST structural domain. 
These include the elongation factors gamma, the HSP26 family of stress-related proteins (auxin-regulated proteins in plants) and some ion channel modulators. The major lens polypeptide of Cephalopoda is also a GST [1,2,3,4,5]. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family [6,7]. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol. Most GSTs are homodimeric enzymes that display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain. A multiple sequence alignment was constructed using the known structures of 21 GSTs. A small set of bacterial proteins was added to this alignment to improve the coverage of the diversity of GSTs. The two most conserved regions, one from the N-terminal and the other from the C-terminal sub-domain, 89 and 58 a.a. long respectively, were selected and converted into the generalised profiles GST_N and GST_C. Matches by the individual profiles with a normalized score greater than 8.5 retrieve most known GSTs. It is however more sensitive to look for simultaneous and adjacent matches by the two profiles, while lowering their individual cutoff scores to 6.5. This strategy (metamotif) increases the search sensitivity, while efficiently preventing the detection of false positives. Several sub-families of GSTs, including the ion channel modulators (CLIC proteins), are revealed by this technique. An additional benefit is that it allows the detection of a few GSTs in which the sub-domains are swapped. [1] Armstrong, R. N. (1997). 
Structure, catalytic mechanism, and evolution of the glutathione transferases. Chem. Res. Toxicol. 10:2-18. [2] Board, P. G., Coggan, M., Chelvanayagam, G., Easteal, S., Jermiin, L. S., Schulte, G. K., Danley, D. E., Hoth, L. R., Griffor, M. C., Kamath, A. V., Rosner, M. H., Chrunyk, B. A., Perregaux, D. E., Gabel, C. A., Geoghegan, K. F. and Pandit, J. (2000). Identification, characterization and crystal structure of the Omega class glutathione transferases. J. Biol. Chem. 275:24798-24806. [3] Droog, F. (1997). Plant glutathione S transferases, a tale of theta and tau. J. Plant Growth Regul. 16:95-107. [4] Dulhunty, A., Gage, P., Curtis, S., Chelvanayagam, G. and Board, P. (2001). The glutathione transferase structural family includes a nuclear chloride channel and a ryanodine receptor calcium release channel modulator. J. Biol. Chem. 276:3319-3323. [5] Eaton, D. L. and Bammler, T. K. (1999). Concise review of the glutathione S-transferases and their significance to toxicology. Toxicol. Sci. 49:156-164. [6] Polekhina, G., Board, P. G., Blackburn, A. C. and Parker, M. W. (2001). Crystal structure of maleylacetoacetate isomerase/glutathione transferase zeta reveals the molecular basis for its remarkable catalytic promiscuity. Biochemistry 40:1567-1576. [7] Vuilleumier, S. (1997). Bacterial glutathione S-transferases: what are they good for? J. Bacteriol. 179:1431-1441. {END} {QDOC50500} {PS50500; LRR_BACT_20} {PS50501; LRR_CC} {PS50502; LRR_PS} {PS50503; LRR_RI} {PS50504; LRR_SDS22} {PS50505; LRR_TP} {PS50506; LRR_TYPICAL} {Status=preliminary} {BEGIN} *************************************** * Leucine-rich repeat region profiles * *************************************** Leucine-rich repeats (LRR) are tandemly repeated modules of about 24 amino acids. They occur in a large number of functionally diverse proteins [1, 2]. Many LRR regions are known to function as protein-protein interaction domains. 
The fold of the LRR repeat units is known from several crystal structures, e.g. from: - ribonuclease inhibitor [3], - U2a small nuclear protein [4], - internalin B [5], - Rab geranylgeranyltransferase [6]. One LRR corresponds to a beta-strand followed by either an alpha-helix, a 3(10)-helix or a polyproline II helix [2]. The LRR proteins have been described as horseshoe-shaped molecules with a curved parallel beta-sheet lining the inner circumference of the horseshoe and the helices flanking its outer circumference. At least seven subfamilies of LRR proteins, characterized by different lengths and consensus sequences of the repeats, have been identified [7, 8]. Seven profiles were developed, one for the LRRs of each of these subfamilies: - Typical LRRs - Ribonuclease inhibitor (RI)-like LRRs - Cysteine-containing (CC) LRRs - Plant-specific (PS) LRRs - SDS22(+)-like LRRs - Bacterial 20-residue long LRRs - Treponema pallidum (Tp) LRRs ! Each profile has a length of three LRRs. -Expert(s) to contact by email: Kajava A.V.; kajava@helix.nih.gov -Last update: October 2001 / First entry. [ 1] Kobe B., Deisenhofer J. Trends Biochem Sci 19:415-421(1994). [ 2] Kobe B., Kajava A.V. Curr Opinion Str Biol (in press)(2001) [ 3] Kobe B., Deisenhofer J. Nature 366:751-756(1993). [ 4] Price S.R., Evans P.R., Nagai K. Nature 394:645-650(1998). [ 5] Marino M., Braun L., Cossart P., Ghosh P. Mol Cell 4:1063-1072(1999). [ 6] Zhang H., Seabra M.C., Deisenhofer J. Structure Fold Des 8:241-251(2000). [ 7] Kajava A.V., Vassart G., Wodak S.J. Structure 3:867-877(1995). [ 8] Kajava A.V. J Mol Biol 277:519-527(1998). [D1] INTERPRO: [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50520} {PS50521} {PS50507} {PS50523} {PS50524} {PS50525} {PS50526} {PS50527} {BEGIN} ******************************** * RNA-directed RNA polymerase. * ******************************** RNA-directed RNA polymerase (RdRp) (EC 2.7.7.48) is an essential protein encoded in the genomes of all RNA-containing viruses with no DNA stage. 
It catalyses synthesis of the RNA strand complementary to a given RNA template. RdRp's of many viruses are products of processing of polyproteins. Some RdRp's consist of one polypeptide chain, others are complexes of several subunits. The domain organization [1] and the 3D structure of the catalytic center of a wide range of RdRp's, even those with a low overall sequence homology, are conserved. The catalytic center is formed by several motifs containing a number of conserved amino-acid residues. We developed a set of profiles (PS50521, PS50507, PS50522, PS50523, PS50524, PS50525, PS50526) for detecting RdRp's (or subunits containing the catalytic center of RdRp's) of viruses. There are 4 superfamilies of viruses that cover all RNA-containing viruses with no DNA stage: 1. Viruses containing positive-strand RNA or double-strand RNA, except for retroviruses and the Birnaviridae family. The profile PS50521 corresponds to the segment of 100 - 150 aa of RdRp, which contains three motifs putatively forming the catalytic center. Viruses whose RNA-directed RNA polymerases are described by the profile PS50521 are (see [2,3]): - All positive-strand RNA viruses with no DNA stage: Arteriviridae, Bromoviridae, Caliciviridae, Comoviridae, Coronaviridae, Flaviviridae, Leviviridae, Luteoviridae, Picornaviridae, Potyviridae, Togaviridae, Tombusviridae, Capilloviruses, Carlaviruses, Potexviruses, Tobamoviruses, Tobraviruses, Trichoviruses, Tymoviruses, Hepatitis E-like viruses, Allexivirus, Sobemovirus. - double-strand RNA viruses, families Cystoviridae, Reoviridae, Hypoviridae, Partitiviridae, Totiviridae. 2. Order Mononegavirales (negative-strand RNA viruses with non-segmented genome). The profile PS50524 corresponds to the segment of 128 - 141 aa of RdRp, which contains three motifs putatively forming the catalytic center. RdRp's of these viruses may be described as: "Large protein", "L protein", "RNA polymerase beta subunit", "Polymerase subunit L". 3. 
Negative-strand RNA viruses with segmented genome, i.e., Orthomyxoviruses (including influenza A, B, and C viruses, Thogotoviruses, and the infectious salmon anemia virus), Arenaviruses, Bunyaviruses, Hantaviruses, Nairoviruses, Phleboviruses, Tenuiviruses, and Tospoviruses. The profile PS50525 corresponds to a relatively conserved segment of 147 - 180 aa of RdRp or its catalytic subunit. The proteins detected by this profile are: - RNA polymerase PB1 subunits of Orthomyxoviruses - RNA polymerases (L proteins) of Arenaviruses, Bunyaviruses, Hantaviruses, Nairoviruses, Phleboviruses, Tenuiviruses, and Tospoviruses. 4. Birnaviridae family of dsRNA viruses. The profile PS50526 corresponds to a conserved segment of 105 aa near the middle of the polypeptide chain of RdRp. We also developed profiles for RdRp's of the following three subgroups of the above superfamily 1: - All positive-strand RNA eukaryotic viruses with no DNA stage, profile PS50507. - All RNA-containing bacteriophages, profile PS50522. There are two families of RNA-containing bacteriophages: Leviviridae (positive ssRNA phages) and Cystoviridae (dsRNA phages). - Reoviridae family of dsRNA viruses, profile PS50523. RdRp's of Orthoreoviruses (Reoviridae family) are known as "minor core proteins lambda 3" [4]. There are other proteins of Orthoreoviruses, sigma NS proteins, which are also annotated as RNA-directed RNA polymerases. Sigma NS are relatively small proteins, 366 aa residues long, while other RdRp's of Reoviridae are 1088 to 1444 aa residues long. Sigma NS proteins of Orthoreoviruses are not described by the profiles. The RNA polymerase gene of Coronaviridae contains two overlapping reading frames, ORF1A and ORF1B. Only the products of ORF1B are described by the profiles. - Sequences known to belong to this class detected by the profiles: ALL, except for ORF1A of Coronaviridae and sigma NS proteins of Orthoreoviruses. - Other sequence(s) detected in SWISS-PROT: NONE. 1. 
O'Reilly E.K., Kao C.C. Analysis of RNA-dependent RNA polymerase structure and function as guided by known polymerase structures and computer predictions of secondary structure. Virology 252(2):287-303 (1998) PMID: 98786 2. NCBI taxonomy browser (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?name=Viruses) 3. Index Virum (http://life.anu.edu.au/viruses/Ictv/) 4. Starnes M.C., Joklik W.K. Reovirus protein lambda 3 is a poly(C)-dependent poly(G) polymerase. Virology 193(1):356-366 (1993) {END} {QDOC50530} {PS50531} {PS50532} {BEGIN} ***************************************** * HTH domains of IS21-like transposases * ***************************************** Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, called transposase, required for excising and inserting the mobile element. On the basis of sequence similarities, transposases can be grouped into various families. One of these families is called the IS21 family or IS21/IS408/IS1162 family ([1], [2]). These proteins consist of 315 to 582 amino acids and contain putative DNA-binding domains (HTH domains, i.e. domains with a helix-turn-helix structural motif) 60 to 84 aa long. On the basis of similarity between the putative HTH domains, the family can be divided into two subfamilies. The first subfamily, whose HTH domain is described by the profile PS50531, contains transposases from the following elements: - IS21 from Pseudomonas aeruginosa. - IS1326 from Pseudomonas aeruginosa. - IS640 from Shigella sonnei. - IS5376 from Bacillus stearothermophilus. - IS232 from Bacillus thuringiensis. - IS21-like from Bacteroides fragilis. - A potential insertion element from Chelatobacter heintzii. The HTH domain of the second subfamily is described by the profile PS50532. This subfamily contains: - a putative transposase Y4UI from Rhizobium sp. (strain NGR234) - a putative transposase Y4BL/Y4KJ/Y4TB from Rhizobium sp. 
(strain NGR234) - a transposase from IS408 from Burkholderia cepacia (syn. Pseudomonas cepacia) - a transposase from IS1162 from Pseudomonas fluorescens. Putative transposase RV3428c from Mycobacterium tuberculosis is usually assigned to this family, but is not matched by either of these two profiles; it probably contains no HTH domain. - Sequences known to belong to this class detected by the profiles: ALL, except for RV3428c from Mycobacterium tuberculosis. - Other sequence(s) detected in SWISS-PROT: NONE 1. IS database: http://www-is.biotoul.fr/is/IS_infos/IS21_family.html 2. Berger B., Haas D. Transposase and cointegrase: specialized transposition proteins of the bacterial insertion sequence IS21 and related elements. Cell Mol. Life Sci. 58 no. 3 (2001), 403-419 {END} {QDOC50550} {PS50550; POU_HOMEODOMAIN} {Status=preliminary} {BEGIN} ******************************** * POU-homeodomain profile * ******************************** POU-proteins are eukaryotic transcription factors containing a bipartite DNA-binding domain referred to as the POU domain. The acronym POU (pronounced 'pow') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU-domain genes have been described in organisms as divergent as C.elegans, Drosophila, Xenopus, zebrafish and human but have not yet been identified in plants and fungi. The various members of the POU family have a wide variety of functions; all these functions are related to the development of an organism [1]. The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain. Two signatures have been created for it (, ). The other subunit is a homeodomain, see . 3D structures of complexes including both POU subdomains bound to DNA are available. 
Both subdomains contain a 'helix-turn-helix' structural motif, which directly associates with the two components of bipartite DNA-binding sites. The subdomains are connected by a flexible linker [2,3,4,5,6]. In proteins a POU-specific domain is always accompanied by a homeodomain. Despite the lack of sequence homology, the 3D structure of POUs is similar to the 3D structure of bacteriophage lambda repressor and other members of the HTH_3 family [2,3]. In the wide family of well-conserved homeodomains those that occur in POU-proteins form the POU-homeodomain subfamily. POU-homeodomains, the homeodomains that are part of the bipartite POU domain, are highly conserved. We developed the profile for the POU-homeodomain subfamily of the homeodomain family. Proteins known to belong to the family are all POU-proteins; they are listed in . -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Note: for full-length sequences the set of hits of this profile coincides with the set of hits of patterns POU_1 and POU_2 . -Expert(s) to contact by email: A.Alexeevski; aba@belozersky.msu.ru S.Spirin; sas@belozersky.msu.ru -Last update: June 2002 / First entry [ 1] Andersen B., Rosenfeld M.G. Endocr Rev. 22(1):2-35(2001) [ 2] Phillips, K. and Luisi, B. J. Mol. Biol. 302:1023-1039(2000) [ 3] Klemm, J.D., Rould, M.A., Aurora, R., Herr, W., and Pabo, C.O. Cell 77:21-32(1994) [ 4] Jacobson, E.M., Li, P., Leon-del-Rio, A., Rosenfeld, M.G., and Aggarwal, A.K., Genes Dev. 11(2):198-212(1997) [ 5] Chasman, D.I., Cepek, K., Sharp, P.A., and Pabo, C.O., Genes Dev. 13(20):2650-2657(1999) [ 6] Remenyi A, Tomilin A, Pohl E, Lins K, Philippsen A, Reinbold R, Scholer HR, Wilmanns M. Cell 103(6):853-864 (2000) {END} ************************************************************************************************ SWISS-PROT release number: 40.19, total number of sequence entries in that release: 109708. 
Total number of hits in SWISS-PROT: 72 hits in 72 different sequences Number of hits on proteins that are known to belong to the set under consideration: 72 hits in 72 different sequences Number of hits on proteins that could potentially belong to the set under consideration: 0 hits in 0 different sequences Number of false hits (on unrelated proteins): 0 hits in 0 different sequences Number of known missed hits: 0 Number of partial sequences which belong to the set under consideration, but which are not hit by the pattern or profile because they are partial (fragment) sequences: 6 Precision (true hits / (true hits + false positives)): 100.00 % Recall (true hits / (true hits + false negatives)): 100.00 % ***************************************************************************************** Proteins known to belong to the family are all POU-proteins; they are listed in . Here they are: - Oct-1 (or OTF-1, NF-A1) (gene POU2F1), a transcription factor for small nuclear RNA and histone H2B genes. - Oct-2 (or OTF-2, NF-A2) (gene POU2F2), a transcription factor that specifically binds to the immunoglobulin promoters octamer motif and activates these genes. - Oct-3 (or Oct-4, NF-A3) (gene POU5F1), a transcription factor that also binds to the octamer motif. - Oct-6 (or OTF-6, SCIP) (gene POU3F1), an octamer-binding transcription factor thought to be involved in early embryogenesis and neurogenesis. - Oct-7 (or N-Oct 3, OTF-7, Brn-2) (gene POU3F2), a nervous-system specific octamer-binding transcription factor. - Oct-11 (or OTF-11) (gene POU2F3), an octamer-binding transcription factor. - Pit-1 (or GHF-1) (gene POU1F1), a transcription factor that activates growth hormone and prolactin genes. - Brn-1 (or OTF-8) (gene POU3F3). - Brn-3A (or RDC-1) (gene POU4F1), a probable transcription factor that may play a role in neuronal tissue differentiation. 
- Brn-3B (gene POU4F2), a probable transcription factor that may play a role in determining or maintaining the identities of a small subset of visual system neurons. - Brn-3C (gene POU4F3). - Brn-4 (or OTF-9) (gene POU3F4), a probable transcription factor which exerts its primary action widely during early neural development and in a very limited set of neurons in the mature brain. - Mpou (or Brn-5, Emb) (gene POU6F1), a transcription factor that binds preferentially to a variant of the octamer motif. - Skn, which activates cytokeratin 10 (k10) gene expression. - Sprm-1, a transcription factor that binds preferentially to the octamer motif and that may exert a regulatory function in meiotic events that are required for terminal differentiation of male germ cells. - Unc-86, a Caenorhabditis elegans transcription factor involved in cell lineage and differentiation. - Cf1-a, a Drosophila neuron-specific transcription factor necessary for the expression of the dopa decarboxylase gene (dcc). - I-POU, a Drosophila protein that forms a stable heterodimeric complex with Cf1-a and inhibits its action. - Drosophila protein nubbin/twain (PDM-1 or DPou-19). - Drosophila protein didymous (PDM-2 or DPou-28) that may play multiple roles during development. - Bombyx mori silk gland factor 3 (SGF-3). - Xenopus proteins Pou1, Pou2, and Pou3. - Zebrafish proteins Pou1, Pou2, Pou[C], ZP-12, ZP-23, ZP-47 and ZP-50. - Caenorhabditis elegans protein ceh-6. - Caenorhabditis elegans protein ceh-18. ************************************************************************************* References. 1. Andersen B., Rosenfeld M.G. POU Domain Factors in the Neuroendocrine system: Lessons from Developmental Biology Provide Insights into Human Disease. Endocr Rev. 2001 Feb;22(1):2-35. Review. PMID: 11159814 [PubMed - indexed for MEDLINE] 2. Phillips, K. and Luisi, B. The virtuoso of versatility: POU proteins that flex to fit. J. Mol. Biol., 2000, vol. 302, pp. 1023--1039. PMID: 11183772 3. 
Klemm, J.D., Rould, M.A., Aurora, R., Herr, W., and Pabo, C.O. Crystal structure of the Oct-1 POU domain bound to an octamer site: DNA recognition with tethered DNA-binding modules. Cell, 1994, vol. 77, pp. 21--32. PMID: 8156594 4. Jacobson, E.M., Li, P., Leon-del-Rio, A., Rosenfeld, M.G., and Aggarwal, A.K., Structure of Pit-1 POU domain bound to DNA as a dimer: unexpected arrangement and flexibility. Genes Dev. 1997 Jan 15;11(2):198-212. PMID: 9009203 5. Chasman, D.I., Cepek, K., Sharp, P.A., and Pabo, C.O., Crystal structure of an OCA-B peptide bound to an Oct-1 POU domain/octamer DNA complex: specific recognition of a protein-DNA interface. Genes Dev. 1999 Oct 15;13(20):2650-7. PMID: 10541551 6. Remenyi A, Tomilin A, Pohl E, Lins K, Philippsen A, Reinbold R, Scholer HR, Wilmanns M. Synergism with the coactivator OBF-1 (OCA-B, BOB-1) is mediated by a specific POU dimer configuration. Cell. 2000 Dec 8;103(6):853-64. PMID: 11136971 {QDOC50552} {PS50552; PAIRED-LIKE_HOMEODOMAIN} {Status=preliminary} {BEGIN} *********************************** * Paired-like homeodomain profile * *********************************** The wide family of eukaryotic homeobox transcription factors can be divided into several subfamilies. One of them is the paired-like family of homeobox proteins. The word 'paired' originates from the Drosophila paired segmentation gene [1]. The Paired protein, the product of the Drosophila paired gene, contains two structural domains: a paired domain [2], also known as the Pax (paired axial) domain in mammals, and a homeodomain. The tandem 'paired domain - homeodomain' was found in a variety of eukaryotic transcription factors. Homeodomains from the tandem are highly conserved. They give rise to the paired-like family of homeodomains [4]. The family is defined on the basis of sequence homology and also includes homeoproteins that have no paired domain. Similarly, several paired/Pax proteins contain a paired domain and lack a homeodomain. 
Besides paired domains, other conserved motifs are known in the family of proteins containing a paired-like homeodomain. A group of paired-like homeoproteins contains a well-conserved 14 amino acid motif of unknown function [10], the OAR domain, named after the initials of the homeoproteins otp, aristaless and rax [3,4]. An OAR domain is located C-terminal to a homeodomain. All known OAR-containing proteins also contain a paired-like homeodomain and lack a paired domain. In a number of proteins a paired-like homeodomain is preceded by a (more or less) conserved octapeptide, also identified in homeoproteins of other subfamilies [4]. A systematic analysis of prd-like genes (genes encoding paired-like homeodomains) and of their evolutionary relationships can be found in [4]. The identification of prd-like genes in Cnidarians [5] is consistent with an early origin for this family. Prd-like genes encode heterogeneous proteins, mainly exerting key developmental functions. We developed the profile for the paired-like homeodomain subfamily of the homeodomain family. Some of the proteins known to contain a paired-like homeodomain are listed below. All homeoproteins that contain a paired domain: - Mammalian protein Pax3. Pax3 is expressed during early neurogenesis. In humans, defects in Pax3 are the cause of Waardenburg's syndrome (WS), an autosomal dominant combination of deafness and pigmentary disturbance. - Mammalian protein Pax4. - Mammalian protein Pax6 (oculorhombin). Pax6 is probably a transcription factor with important functions in eye and nasal development. In humans, defects in Pax6 are the cause of aniridia type II (AN2), an autosomal dominant disorder characterized by complete or partial absence of the iris. - Mammalian protein Pax7 [9]. - Drosophila segmentation polarity class proteins gooseberry proximal (gsb-p) and gooseberry distal (gsb-d). All proteins that contain an OAR domain: - Human RIEG. 
Defects in this protein are the cause of Rieger syndrome, an autosomal dominant disorder that includes anomalies of the anterior chamber of the eye, dental hypoplasia and a protuberant umbilicus [10]. - Human OG12X and murine Og12x. The function of these proteins is not yet known [11]. - Vertebrate Rax, encoded by the retina and anterior neural fold homeobox gene. This protein plays a role in the proliferation and/or differentiation of retinal cells [3]. - Drosophila DRX. It could be a homolog of vertebrate Rax. It appears to be important in brain development [14]. - Human SHOX, encoded by the short stature homeobox-containing gene. Defects or lack of this protein are the cause of short stature associated with Turner syndrome [13]. - Human PITX3 and murine Pitx3. They appear to be involved in normal eye anterior-chamber and lens development [15,12]. In humans, defects in PITX3 are associated with anterior segment mesenchymal dysgenesis (ASMD) and autosomal-dominant congenital cataracts (ADCC) [12]. Other: -Drosophila orthodenticle (otd) protein. Mutational inactivation of the otd gene results in defects in head structures and deletions in anterior parts of the brain as well as in ventral nerve cord defects [7]. -Vertebrate Otx proteins, orthologs of Drosophila orthodenticle. Otx genes contribute to the genetic program required for the specification of the development of the vertebrate head [8,6]. -Vertebrate atrial natriuretic factor (ANF). The ANF gene is specifically expressed in the developing chamber myocardium and is one of the first hallmarks of chamber formation [16]. ------------- -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: A.Alexeevski; aba@belozersky.msu.ru -Last update: July 2002 / First entry ------------- [ 1] Frigerio G., Burri M., Bopp D., Baumgartner S., Noll M. Cell 47:735-746(1986). 
PMID: 2877746 [ 2] Bopp D., Burri M., Baumgartner S., Frigerio G., Noll M. Cell 47:1033-1040(1986). PMID: 2877747 [ 3] Furukawa T., Kozak C.A., Cepko C.L. Proc. Natl. Acad. Sci. U.S.A. 94:3088-3093(1997). PMID: 9096350 [ 4] Galliot B., de Vargas C., Miller D. Dev. Genes Evol. 209:186-197(1999) PMID: 10079362 [ 5] Miller D.J., Hayward D.C., Reece-Hoyes J.S., Scholten I., Catmull J., Gehring W.J., Callaerts P., Larsen J.E., Ball E.E. Proc Natl Acad Sci USA. 97(9):4475-80(2000). PMID: 10781047 [ 6] Montalta-He H, Leemans R, Loop T, Strahm M, Certa U, Primig M, Acampora D, Simeone A, Reichert H. Genome Biol 2002;3(4):RESEARCH0015 PMID: 11983056 [ 7] Finkelstein R, Smouse D, Capaci TM, Spradling AC, Perrimon N. Genes Dev 1990 Sep;4(9):1516-27 PMID: 1979296 [ 8] Acampora D, Simeone A. Trends Neurosci 1999 Mar;22(3):116-22(1999) PMID: 10199636 [ 9] Schafer B.W., Czerny T., Bernasconi M., Genini M., Busslinger M. Nucleic Acids Res. 22:4574-4582(1994). PMID: 7527137 [10] Semina E.V., Reiter R., Leysens N.J., Alward W.L., Small K.W., Datson N.A., Siegel-Bartelt J., Bierke-Nelson D., Bitoun P., Zabel B.U., Carey J.C., Murray J.C. Nat. Genet. 14:392-399(1996). PMID: 8944018 [11] Semina E.V., Reiter R.S., Murray J.C. Hum. Mol. Genet. 7:415-422(1998). PMID: 9466998 [12] Semina E.V., Ferrell R.E., Mintz-Hittner H.A., Bitoun P., Alward W.L., Reiter R.S., Funkhauser C., Daack-Hirsch S., Murray J.C. Nat. Genet. 19:167-170(1998). PMID: 9620774 [13] Rao E., Weiss B., Fukami M., Rump A., Niesler B., Mertz A., Muroya K., Binder G., Kirsch S., Winkelmann M., Nordsiek G., Heinrich U., Breuning M.H., Ranke M.B., Rosenthal A., Ogata T., Rappold G.A. Nat. Genet. 16:54-63(1997). PMID: 9140395 [14] Eggert T., Hauck B., Hildebrandt N., Gehring W.J., Walldorf U. Proc. Natl. Acad. Sci. U.S.A. 95:2343-2348(1998). PMID: 9482887 [15] Semina E.V., Reiter R.S., Murray J.C. Hum. Mol. Genet. 6:2109-2116(1997). 
PMID: 9328475 [16] Habets PE, Moorman AF, Clout DE, van Roon MA, Lingbeek M, van Lohuizen M, Campione M, Christoffels VM. Genes Dev 16(10):1234-46(2002) PMID: 12023302 {END} {QDOC50553} {PS50553; SIX_DOMAIN} {PS50554; SIX_HOMEODOMAIN} {Status=preliminary} {BEGIN} ******************************* * SIX family of homeoproteins * ******************************* The Six (Drosophila sine oculis homeobox homologue) protein family [1] consists of eukaryotic transcription factors that include a Six domain adjacent to a homeodomain. A number of studies suggest important roles for the Six genes in the development of the anterior part of the vertebrate CNS and eye, in myogenesis and perhaps also in the development of the auditory system, kidneys, digits and connective tissue [1,4]. Several members of the Six family of genes have been identified in each of the C.elegans, Drosophila, mouse and human genomes (named Six1, Six2, Six3, Six4, Six5, Six6 in vertebrates). Six family proteins show extensive sequence similarity to one another in the sine oculis-homologous region (Six domain and homeodomain) but differ greatly in structure in some other regions. The Six domain is a motif of approximately 120 amino acids adjacent to the N-terminus of the homeodomain. Both domains are essential for specific DNA binding [1]. The 3D structure of the Six domain has not yet been determined. We developed two profiles for the Six family of proteins. The first profile describes the Six domain, the second describes the Six/sine subfamily of homeodomains. Selected proteins known to belong to the family are listed below. -Drosophila sine oculis protein (so). The sine oculis gene is an essential gene high in the hierarchy of the developmental pathway of the fly visual system [5]. - Human (and other vertebrate) SIX3 protein. SIX3 is considered to be the functional homologue of the Drosophila so gene. SIX3 is a transcription factor essential for vertebrate eye development [3]. 
SIX3 has been found to be mutated in some patients with holoprosencephaly type 2 (HPE2), suggesting that SIX3 has wide implications for head development [7,6]. - Human SIX5 protein (previously known as myotonic dystrophy associated homeodomain protein, DMAHP). Dysfunction of the SIX5 gene in humans contributes to myotonic dystrophy type I syndrome [10]. - Human (and mouse) SIX1 (Sine oculis homeobox homolog 1). - Human (and mouse) SIX2 (Sine oculis homeobox homolog 2) [8]. - Human (and mouse) SIX4 (Sine oculis homeobox homolog 4). - C.elegans protein ceh-32; it plays a role in head morphogenesis. - C.elegans protein ceh-33. - C.elegans protein ceh-34. ------------------------- -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Note: for full-length protein sequences both profiles match the same set of entries -Expert(s) to contact by email: A.Alexeevski; aba@belozersky.msu.ru S.Spirin; sas@belozersky.msu.ru -Last update: July 2002 / First entry --------------------------- [ 1] Kawakami K, Sato S, Ozaki H, Ikeda K. Bioessays 22(7):616-26(2000) PMID: 10878574 [ 3] Granadino B, Gallardo ME, Lopez-Rios J, Sanz R, Ramos C, Ayuso C, Bovolenta P, Rodriguez de Cordoba S. Genomics 55(1):100-5(1999) PMID: 9889003 [ 4] Cordoba SR, Gallardo ME, Lopez-Rios J., Bovolenta P. Current Genomics 2(3):231-242(2001) [ 5] Serikaku M.A., O'Tousa J.E., Genetics 138:1137 (1994) PMID: 7896096 [ 6] Pasquier L, Dubourg C, Blayau M, Lazaro L, Le Marec B, David V, Odent S. Eur J Hum Genet 8(10):797-800(2000) [ 7] Wallis DE, Roessler E, Hehr U, Nanni L, Wiltshire T, Richieri-Costa A, Gillessen-Kaesbach G, Zackai EH, Rommens J, Muenke M. Nat Genet 22(2):196-8(1999) PMID: 10369266 [ 8] Boucher CA, Winchester CL, Hamilton GM, Winter AD, Johnson KJ, Bailey ME. Gene 247(1-2):145-51(2000) PMID: 10773454 [10] Harris SE, Winchester CL, Johnson KJ. Nucleic Acids Res. 28(9):1871-8 (2000). 
PMID: 10756185 {END} {QDOC50558} {PS50558; LIM_HOMEODOMAIN} {Status=preliminary} {BEGIN} **************************** * LIM-homeodomain profile * **************************** Homeobox transcription factors form one of the largest families of specifically eukaryotic transcription factors. They are characterized by the presence of a homeodomain, which is a conserved DNA-binding domain. The wide family of homeodomains can be divided into several subfamilies. One of them is the LIM-homeodomain subfamily. LIM-homeodomain proteins contain two tandem LIM domains N-terminal to a homeodomain. The tandem LIM domains and the homeodomain are connected by a spacer variable in length (10-100 aa) and sequence. A variety of LIM-homeodomain proteins are involved in the regulation of patterning or the specification and differentiation of different cell types during embryonic development [1]. Whereas homeodomains are DNA-binding domains, LIM domains are essential for regulating the activity of LIM-homeodomain proteins by interacting with other proteins. LIM domains are also found in proteins that do not contain homeodomains [2]. LIM homeodomains are well conserved. We developed the profile for LIM-homeodomain sequences (57 aa). The profile matches proteins from the LIM-homeodomain family. Selected LIM-homeodomain proteins are listed below. - C. elegans Lin-11, required for the asymmetric division of a vulval precursor cell type [ 3]. - Vertebrate insulin gene enhancer binding protein Isl-1. Isl-1 binds to one of the two cis-acting protein-binding domains of the insulin gene [ 4]. - C. elegans Mec-3, required for the differentiation of the set of six touch receptor neurons in this nematode [ 5]. - Vertebrate Lmx-1, which acts as a transcriptional activator by binding to the FLAT element, a beta-cell-specific transcriptional enhancer found in the insulin gene [7]. - Drosophila protein apterous, required for the normal development of the wing and haltere imaginal discs. - C. 
elegans homeobox protein Ceh-14; Ceh-14 confers thermosensory function to neurons [ 6]. - Vertebrate homeobox proteins Lim-1, Lim-2 (Lim-5) and Lim-3. ---------------------------- -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: A.Alexeevski; aba@belozersky.msu.ru S.Spirin; sas@belozersky.msu.ru -Last update: July 2002 / First entry [ 1] Hobert O, Westphal H. Trends Genet 16(2):75-83 (2000) PMID: 10652534 [ 2] Dawid IB, Breen JJ, Toyama R. Trends Genet 14(4):156-62 (1998) PMID: 9594664 [ 3] Freyd G, Kim SK, Horvitz HR. Nature 344(6269):876-9 (1990) PMID: 1970421 [ 4] Karlsson O, Thor S, Norberg T, Ohlsson H, Edlund T. Nature 344(6269):879-82 (1990) PMID: 1691825 [ 5] Way JC, Chalfie M. Cell 1988 Jul 1;54(1):5-16 PMID: 2898300 [ 6] Cassata G, Kagoshima H, Andachi Y, Kohara Y, Durrenberger MB, Hall DH, Burglin TR. Neuron 25(3):587-97 (2000) PMID: 10774727 [ 7] German MS, Wang J, Chadwick RB, Rutter WJ. Genes Dev 6(11):2165-76 (1992) PMID: 1358758 {END} {QDOC50560} {PS50560; CUT_HOMEODOMAIN} {Status=preliminary} {BEGIN} ******************************** * Cut-homeodomain profile * ******************************** Homeobox transcription factors form one of the largest families of specifically eukaryotic transcription factors. They are characterized by the presence of a homeodomain, which is a conserved DNA-binding domain. The wide family of homeodomains can be divided into several subfamilies. One of them is the cut-homeodomain subfamily. The word 'cut' originates from the Drosophila cut gene, which encodes a cut-homeodomain transcription factor. Mammalian counterparts of the Drosophila cut homeoprotein are human CCAAT displacement protein (CDP) and murine Cux proteins [1]. Cut-homeodomain proteins contain one, two or three cut domains and one homeodomain downstream of the cut domains. Cut domains as well as homeodomains are DNA-binding domains. 
They have not yet been found in proteins that do not contain a homeodomain. See [2] for domain organisation and classification of cut-homeoproteins. CDP/Cux/Cut proteins form an evolutionarily conserved family. In Drosophila melanogaster the cut gene functions as a determinant of cell-type specification in several tissues, notably in the peripheral nervous system, the wing margin and the Malpighian tubule. In vertebrates, the same functions appear to be fulfilled by two cut-related genes with distinct patterns of expression. The human CCAAT-displacement protein (CDP) was later found to be the DNA binding protein of the previously characterized histone nuclear factor D (HiNF-D). Various combinations of Cut repeats and the Cut homeodomains can generate distinct DNA binding activities. These activities are elevated in proliferating cells and decrease during terminal differentiation. The role of CDP/Cux/Cut proteins is reviewed in [1]. We have developed a profile for the cut-homeodomain family. The profile spans the whole homeodomain (59-61 aa in total). Proteins known to belong to the family are listed below: - Mammalian Cux/CDP, the CCAAT displacement protein; the CUTL1 gene encoding CDP in human was mapped to 7q22, a chromosomal region that is frequently rearranged in various cancers [3] - The Drosophila onecut protein, which is a neural-specific transcriptional regulator [4] - Hepatocyte nuclear factor 6 (HNF-6) (ONECUT family member) [6] - Human ONECUT-2 transcription factor (OC-2); OC-2 participates in the network of transcription factors required for liver differentiation and metabolism [5] - Mammalian protein Cux-2 (Cut-like 2), a transcription factor which is involved in neural specification in mammals [7] - C.elegans protein ceh-21 - C.elegans protein ceh-38 - C.elegans protein ceh-39 - DNA-binding protein SATB1 (Special AT-rich sequence binding protein 1) [2]. --------------------------- -Sequences known to belong to this class detected by the profile: ALL. 
-Other sequence(s) detected in SWISS-PROT: NONE. -Expert(s) to contact by email: A.Alexeevski; aba@belozersky.msu.ru S.Spirin; sas@belozersky.msu.ru -Last update: August 2002 / First entry [ 1] Nepveu, A. Gene 270(1-2):1-15 (2001) PMID: 11403998 [ 2] Burglin TR, Cassata G. Int J Dev Biol 46(1):115-23 (2002) PMID: 11991046 [ 3] Zeng WR, Scherer SW, Koutsilieris M, Huizenga JJ, Filteau F, Tsui LC, Nepveu A. Oncogene 14(19):2355-65 (1997) [ 4] Nguyen DN, Rohrbaugh M, Lai Z. Mech Dev 97(1-2):57-72 (2000) PMID: 11025207 [ 5] Jacquemin P., Lannoy V., Rousseau G.G., Lemaigre F.P. J. Biol. Chem. 274:2665-2671(1999) PMID: 9915796 [ 6] Lannoy VJ, Burglin TR, Rousseau GG, Lemaigre FP J Biol Chem 1998 May 29;273(22):13552-62 PMID: 9593691 [ 7] Quaggin SE, Heuvel GB, Golden K, Bodmer R, Igarashi P. J Biol Chem 13;271(37):22624-34 (1996) PMID: 8798433 {END} {QDOC50800} {PS50800; SAP_MOTIF} {Status=preliminary} {BEGIN} ************** * Sap motif. * ************** This motif of 35 amino acid residues is found in a variety of nuclear proteins involved in transcription, DNA repair, RNA processing or apoptotic chromatin degradation. It was named SAP after SAF-A/B, Acinus and PIAS, three proteins known to contain it [1]. A multiple alignment of the SAP motif reveals a bipartite distribution of strongly conserved hydrophobic, polar and bulky amino acids separated by a region that contains a glycine [1]. Secondary structure predictions suggest that the SAP motif could form two alpha helices separated by a turn [1]. As the sap motif of SAF-A has been shown to be essential for specific DNA binding activity, it has been proposed that it could be a DNA-binding motif [1, 2]. The following proteins have been shown to contain a SAP motif: - The scaffold attachment factors A and B (SAF-A/B). These two proteins are heterogeneous nuclear ribonucleoproteins (hnRNPs) that bind to AT-rich chromosomal region. It has been proposed that they couple RNA metabolism to nuclear organization [2, 3, 4]. 
The SAF-A protein is cleaved by caspase-3 during apoptosis [2, 5]. - Acinus, a mammalian protein which induces apoptotic chromatin condensation after cleavage by caspase-3 [6]. Acinus also contains a RNA-recognition motif. - The eukaryotic proteins of the PIAS (protein inhibitor of activated STAT) family. These proteins interact with phosphorylated STAT dimers and inhibit STAT mediated gene activation. Deletion of the first 50 amino acid residues containing the SAP domain allows the interaction of PIAS1 with STAT1 monomer [7]. - Plant poly(ADP-ribose) polymerase (PARP). PARP is a nuclear protein that catalyzes the poly(ADP-ribosyl)ation of proteins. It is involved in responses to mild and severe oxidative stresses, by mediating DNA repair and programmed cell death processes, respectively [8]. PARP is tightly bound to chromatin or nuclear matrix. - Arp, an apurinic endonuclease-redox protein from Arabidopsis thaliana. - Tho1p, a yeast protein that could be involved in the regulation of transcriptional elongation by RNA polymerase II [9]. - Ku70. Together with Ku86, it forms a DNA ends binding complex that is involved in repairing DNA double-strand breaks. - RAD18, a yeast protein involved in DNA repair. - UVS-2, a protein from Neurospora crassa that is homologous to RAD18. [ 1] Aravind L., Koonin E.V. Trends Biochem. Sci. 25:112-114(2000). [ 2] Gohring F., Schwab B.L., Nicotera P., Leist M., Fackelmayer F.O. EMBO J. 16:7361-7371(1997). [ 3] Weighardt F., Cobianchi F., Cartegni L., Chiodi I., Villa A., Riva S., Biamonti G. J. Cell Sci. 112:1465-1476(1999). [ 4] Nayler O., Stratling W., Bourquin J.P., Stagljar I., Lindemann L., Jasper H., Hartmann A.M., Fackelmayer F.O., Ullrich A., Stamm S. Nucleic Acids Res. 26:3542-3549(1998). [ 5] Kipp M., Schwab B.L., Przybylski M., Nicotera P., Fackelmayer F.O. J. Biol. Chem. 275:5031-5036(2000). [ 6] Sahara S., Aoto M., Eguchi Y., Imamoto N., Yoneda Y., Tsujimoto Y. Nature 401:168-173(1999). [ 7] Liao J., Fu Y., Shuai K. Proc. 
Natl. Acad. Sci. U.S.A. 97:5267-5272(2000). [ 8] Amor Y., Babiychuk E., Inze D., Levine A. FEBS Lett. 440:1-7(1998). [ 9] Piruat J.I., Aguilera A. EMBO J. 17:4859-4872(1998). -Last Update: June 2000 (C.J.A. Sigrist). [D1] INTERPRO:IPR003034 [D2] PFAM:PF02037 [D3] PRINTS: [D4] SMART: {END} {QDOC50807} {PS50807; GCM} {Status=preliminary} {BEGIN} ********************* * GCM motif profile * ********************* The GCM motif is a domain of around 150 amino acid residues that has been identified in the N-terminal part of proteins belonging to a family of transcriptional regulators which comprises Drosophila GCM and its mammalian homologs [1,2]. The GCM motif has been shown to be a DNA binding domain that preferentially recognizes the nonpalindromic octamer 5'-ATGCGGGT-3' [1,2,3,4]. The GCM motif contains many conserved basic amino acid residues, seven cysteine residues, and four histidine residues [1]. The conserved cysteines are involved in shaping the overall conformation of the domain, in the process of DNA binding and in the redox regulation of DNA binding [3]. Proteins known to contain a GCM motif are listed below: - Drosophila glial cell missing (GCM) protein, which functions as an important switch during early neurogenesis by committing cells to the glial cell fate [1,2]. - Mammalian GCMa (or GCM1) protein. GCMa is primarily expressed in trophoblasts of the placenta and is possibly involved in the expression of multiple placenta-specific genes [4,5]. - Mammalian GCMb (or GCM2) protein. The function of this protein, which is selectively detected in the forming parathyroid gland, is not yet known [4]. The profile covers the entire GCM motif. -Sequences known to belong to this class detected by the repeat profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry. [ 1] Akiyama Y., Hosoya T., Poole A.M., Hotta Y. Proc. Natl. Acad. Sci. U.S.A. 93:14912-14916(1996). [ 2] Schreiber J., Sock E., Wegner M. Proc. Natl. Acad. Sci. U.S.A. 
94:4739-4744(1997). [ 3] Schreiber J., Enderich J., Wegner M. Nucleic Acids Res. 26:2337-2343(1998). [ 4] Tuerk E.E., Schreiber J., Wegner M. J. Biol. Chem. 275:4774-4782(2000). [ 5] Yamada K., Ogawa H., Honda S., Harada N., Okazaki T. J. Biol. Chem. 274:32279-32286(1999). {END} {QDOC50813} {PS50813; GAF} {Status=preliminary} {BEGIN} ********************** * GAF domain profile * ********************** GAF domains are ubiquitous motifs present in one, two, or three degenerate copies in cyclic GMP (cGMP)-regulated cyclic nucleotide phosphodiesterases (PDEs), certain adenylyl cyclases, the bacterial transcription factor FhlA, and hundreds of other signaling and sensory proteins from all three kingdoms of life. Given the diverse evolutionary and functional contexts within which GAF domains are found, it has been proposed that not all these domains possess identical functions. For example, the GAF domains of cGMP-regulated PDEs and some adenylyl cyclases contain a (R/K)X(m)NKX(n)D motif implicated in cGMP binding, whereas the GAF domains of plant and cyanobacterial phytochromes (see ) contain a cysteine that mediates chromophore attachment [1,2]. Resolution of the crystal structure of the Saccharomyces cerevisiae YKG9 GAF domain has revealed that the GAF domain forms dimers and consists of a central antiparallel six-stranded beta-sheet and two outer layers. One of the outer layers is made of four alpha-helices whereas the opposite outer layer, called the 'irregular' layer, is a mixture of loops and a short alpha-helix. Despite a lack of sequence homology to any previously determined structure, the GAF domain shows structural similarities to PAS domains (see ) and profilin (see ). The GAF fold is a hybrid between the profilin fold, manifested in the helical and central layers, and the PAS fold, manifested in the central and irregular layers [2]. Some proteins known to contain a GAF domain are listed below: - Vertebrate and invertebrate cGMP-stimulated PDEs. 
- Plant and cyanobacterial phytochromes. Phytochromes are a major class of signal transducing photoreceptors in plants. - Plant ETR1, an ethylene receptor which is similar in domain architecture to phytochromes. - Bacterial FhlA, required for induction of expression of the formate dehydrogenase H and hydrogenase-3 structural genes. - Bacterial nifA, a transcriptional activator required for activation of most nif operons, which are directly involved in nitrogen fixation. nifA interacts with sigma-54. - Cyanobacterial adenylate cyclases (EC 4.6.1.1). -Last update: January 2000 (C.J.A. Sigrist). [ 1] Aravind L., Ponting C.P. Trends Biochem. Sci. 22:458-459(1997). [ 2] Ho Y.-S.J., Burden L.M., Hurley J.H. EMBO J. 19:5288-5299(2000). [D1] INTERPRO:IPR003018 [D2] PFAM:PF01590 [D3] PRINTS: [D4] SMART:GAF {END} {QDOC50814} {PS50814; WIF} {Status=preliminary} {BEGIN} ********************** * WIF module profile * ********************** The WIF module is a domain of around 150 amino acids which has been identified in the extracellular region of the Ryk receptor tyrosine kinases and in the secreted Wnt-inhibitory-factor-1 (WIF-1) proteins. As the WIF module is both necessary and sufficient for Wnt binding by the WIF-1 proteins, it has been proposed that the WIF module of Ryk receptors might also serve to bind to Wnt proteins or related ligands. The WIF modules found in Ryk receptor tyrosine kinases and WIF-1 proteins contain conserved sequence motifs, including two conserved cysteines which might form a disulphide bridge [1]. Some proteins known to contain a WIF module are listed below: - Mammalian Ryk proteins. Although mammalian Ryk proteins are found in most tissues, little is known about their function. They are believed to be involved in cellular recognition processes. - Drosophila melanogaster derailed (drl) and doughnut (dnt) proteins. The drl protein plays a crucial role in cell-recognition processes controlling nervous system development and muscle development. 
The dnt receptor has a related and significantly overlapping biochemical function. - Caenorhabditis elegans Ryk protein, encoded by the lin-18 gene. It is required for cell-cuticle recognition. - Animal WIF-1 proteins. WIF-1s are secreted proteins, consisting of an N-terminal WIF domain and five EGF-like repeats (see ). They bind to Wnt proteins and inhibit their activities. The profile we developed covers the entire WIF domain. -First entry: January 2001 (C.J.A. Sigrist). [ 1] Patthy L. Trends Biochem. Sci. 25:12-13(2000). [D1] INTERPRO:IPR003306 [D2] PFAM:PF02019 [D3] PRINTS: [D4] SMART: {END} {QDOC50815} {PS50815} {Status=preliminary} {BEGIN} ************************ * HORMA domain profile * ************************ The HORMA domain (for Hop1p, Rev7p and MAD2) is a region of about 180-240 amino acids containing several conserved motifs. Whereas the MAD2 and the Rev7p proteins are almost entirely made up of HORMA domains, Hop1p contains a HORMA domain in its N-terminal region and a Zn-finger domain, whose general arrangement of metal-chelating residues is similar to that of the PHD finger, in the C-terminal region. The HORMA domain is found in proteins showing a direct association with chromatin of all crown group eucaryotes. It has been suggested that the HORMA domain recognizes chromatin states that result from DNA adducts, double-stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins involved in repair [1]. Secondary structure prediction suggests that the HORMA domain is globular and could potentially form one or more complex beta-sheets with associated alpha-helices [1]. Some proteins known to contain a HORMA domain are listed below: - Eucaryotic Hop1p, a conserved protein that is involved in meiotic-synaptonemal-complex assembly. - Eucaryotic mitotic-arrest-deficient 2 protein (MAD2), a key component of the mitotic-spindle-assembly checkpoint. 
- Eucaryotic Rev7p, a subunit of the DNA polymerase zeta that is involved in translesion, template-independent DNA synthesis. We have developed a profile that covers the entire HORMA domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -First entry: January 2001 (C.J.A. Sigrist). [ 1] Aravind L., Koonin E.V. Trends Biochem. Sci. 23:284-286(1998). {END} {QDOC50816} {PS50816; NAF} {Status=preliminary} {BEGIN} ********************** * NAF domain profile * ********************** The NAF domain is a 24 amino acid domain that is found in a plant-specific subgroup of serine-threonine protein kinases (CIPKs) that interact with calcineurin B-like calcium sensor proteins (CBLs). Whereas the N-terminal part of CIPKs comprises a conserved catalytic domain typical of Ser-Thr kinases, the much less conserved C-terminal domain appears to be unique to this subgroup of kinases. The only exception is the NAF domain that forms an 'island of conservation' in this otherwise variable region. The NAF domain has been named after the prominent conserved amino acids Asn-Ala-Phe. It represents a minimum protein interaction module that is both necessary and sufficient to mediate the interaction with the CBL calcium sensor proteins [1]. The secondary structure of the NAF domain is currently not known, but secondary structure computation of the C-terminal region of Arabidopsis thaliana CBL-interacting protein kinase 1 revealed a long helical structure [1]. -Last update: March 2001 (C.J.A. Sigrist). [ 1] Albrecht V., Ritz O., Linder S., Harter K., Kudla J. EMBO J. 20:1051-1063(2001). [D1] INTERPRO: [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50825} {PS50825; HYR} {Status=preliminary} {BEGIN} ********************** * HYR domain profile * ********************** The HYR (HYalin Repeat) domain is an extracellular domain of about 80-100 amino acids. 
It has been named after the hyalin protein, which is composed exclusively of repeats of this domain. The HYR domain is found in several eucaryotic proteins, either in multiple copies as in hyalin or in association with other domains like the CCP (sushi) domain, the von Willebrand factor type A (VWA) domain (see ), the EGF-like domain (see ), the calcium-binding EGF-like domain (see ), the pentraxin domain (see ), the CUB domain (see ), the LDL-receptor class A domain (see ), the C-type lectin domain (see ) or the discoidin domain. As the HYR domains of hyalin have been shown to contain the ligand for the hyalin cell surface receptor, the HYR domain can also be expected to play a direct role in cellular adhesion in other proteins in which it is present [1]. Secondary structure predictions of the HYR domain indicate an all-beta fold including seven beta-strands. The HYR domain shares clear sequence similarities, limited to the C-terminal regions, with the Fn3 and PKD domains and is believed to belong to the immunoglobulin-like fold [1]. Some proteins containing a HYR domain are listed below [1]: - Hyalin, a protein of the echinoderm extra-embryonic matrix. - Polydom, a secreted protein from mouse with pentraxin, complement control protein, epidermal growth factor and von Willebrand factor A domains. - SRPX, a sushi-repeat-containing protein from mammals. In human, the gene encoding SRPX is deleted in patients with X-linked retinitis pigmentosa. - F55H12.3, F47C12.10 and W02C12.1 proteins from Caenorhabditis elegans. The profile we developed covers the entire HYR domain. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Callebaut I., Gilges D., Vigon I., Mornon J.-P. Protein Sci. 9:1382-1390(2000). 
[D1] INTERPRO:IPR003410 [D2] PFAM:PF02494 [D3] PRINTS: [D4] SMART: {END} {QDOC50826} {PS50826; RUN} {Status=preliminary} {BEGIN} ********************** * RUN domain profile * ********************** The RUN domain, named after RPIP8, UNC-14 and NESCA, is organized into six conserved blocks (A-F), which are predicted to constitute the 'core' of a globular structure tolerating insertions of considerable length between the conserved blocks. The RUN domain is found in one or two copies in several proteins that are linked particularly to the functions of GTPases in the Rap and Rab families. RUN domains can be associated with TBC, FYVE, DENN, SH3 (see ), C1, PLAT/LH2, GST, PH (see ) or PX domains. The RUN domain probably functions as a specific effector for some proteins of the Ras superfamily, although a catalytic function cannot be excluded [1]. The predicted secondary structures of the RUN domain core indicate a predominantly alpha fold [1]. Some proteins known to contain a RUN domain are listed below: - Mammalian Rap2 interacting protein 8 (RPIP8). A probable specific effector of the small GTP-binding protein Rap2 in cells exhibiting neuronal properties. - Human Nesca (new molecule containing SH3 at the C-terminus), a ubiquitously expressed protein. - Caenorhabditis elegans UNC-14, a protein required for axonal elongation and guidance that interacts with the serine/threonine kinase UNC-51. - Mouse GTP-binding protein-associated protein B (GBPAP-B), a protein found in a yeast two-hybrid screen with Rab6 and which specifically interacts with this GTPase bound to GTP. It contains two RUN domains. The profile we developed covers the entire RUN domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Callebaut I., de Gunzburg J., Goud B., Mornon J.-P. Trends Biochem. Sci. 26:79-83(2001). 
[D1] INTERPRO: [D2] PFAM:PF02759 [D3] PRINTS: [D4] SMART: {END} {QDOC50827} {PS50827; DDT} {Status=preliminary} {BEGIN} ********************** * DDT domain profile * ********************** The DDT domain has been named after the better characterized DNA-binding homeobox-containing proteins and the Different Transcription and chromatin remodeling factors in which it is found. It is a domain of generally about 60 amino acids which is exclusively associated with nuclear domains like AT-Hook (see ), PHD finger, methyl-CpG-binding domain, bromodomain (see ) and DNA-binding homeodomain (see ). The DDT domain is characterized by a number of conserved aromatic and charged residues and is predicted to consist of three alpha helices. A DNA-binding function for the DDT domain has been proposed [1]. Proteins known to contain a DDT domain are listed below: - Bromodomain PHD finger transcription factors (BPTFs) from Caenorhabditis elegans, Drosophila and human. The human protein FALZ or FAC1 is believed to play a role in Alzheimer's disease. - Chromatin remodeling factors of the BAZ (bromodomain adjacent to zinc finger proteins) family from Caenorhabditis elegans, Drosophila and human. The human BAZ protein WSTF is implicated in Williams syndrome, a complex developmental disorder with multisystemic defects. - Hox-domain-containing proteins from Arabidopsis thaliana. - Hypothetical PHD-domain-containing protein from Arabidopsis thaliana. The profile we developed covers the entire DDT domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Note: The DDT domain was first identified in the BAZ family as part of a larger LH (Leucine-rich Helical) domain [2]. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Doerks T., Copley R., Bork P. Trends Biochem. Sci. 26:145-146(2001). [ 2] Jones M.H., Hamana N., Nezu J.I., Shimane M. Genomics 63:40-45(2000). 
[D1] INTERPRO: [D2] PFAM:PF02791 [D3] PRINTS: [D4] SMART:DDT {END} {QDOC50828} {PS50828; SMR} {Status=preliminary} {BEGIN} ********************** * Smr domain profile * ********************** The Smr domain is a domain of around 90 residues found in: - the C-terminal region of the mutS2 proteins from bacteria and plants, - the small mutS related (smr) proteins from bacteria and eukaryotes. These proteins could be involved in mismatch repair (MMR) and/or chromosome crossing-over and segregation. It has been proposed that the Smr domain acts as a nicking endonuclease [1,2]. The profile we developed spans the entire Smr domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Moreira D., Philippe H. Trends Biochem. Sci. 24:298-300(1999). [ 2] Malik H.S., Henikoff S. Trends Biochem. Sci. 25:414-418(2000). [D1] INTERPRO:IPR002625 [D2] PFAM:PF01713 [D3] PRINTS: [D4] SMART:SMR {END} {QDOC50829} {PS50829; GYF} {Status=preliminary} {BEGIN} ********************** * GYF domain profile * ********************** The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids (aa) which contains a conserved GPY[or F]xxxxM[or V]xxWxxxG[or N]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eucaryotic proteins of unknown function [1]. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition [2]. 
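As an illustration only, the conserved GYF motif quoted above can be written as a regular expression. The sketch below assumes that each "x" stands for any single residue and that the bracketed alternatives are the only substitutions allowed; the helper name find_gyf_motif and the test sequence are hypothetical, and the profile itself is a far more sensitive detector than this literal motif match.

```python
import re

# Consensus quoted in the text: GPY[or F]xxxxM[or V]xxWxxxG[or N]YF
# Assumption: "x" means "any one residue".
GYF_MOTIF = re.compile(r"GP[YF].{4}[MV].{2}W.{3}[GN]YF")


def find_gyf_motif(sequence):
    """Return (start, end) of the first motif match, or None if absent."""
    match = GYF_MOTIF.search(sequence.upper())
    return (match.start(), match.end()) if match else None


# Synthetic sequence built to satisfy the consensus; it is not a real protein.
toy = "KK" + "GPYAAAAMAAWAAAGYF" + "KK"
print(find_gyf_motif(toy))
```

A literal motif search of this kind only flags candidate regions; it says nothing about whether the surrounding sequence folds as a GYF domain.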
Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site [2]. The profile we developed spans the entire GYF domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Nishizawa K., Freund C., Li J., Wagner G., Reinherz E.L. Proc. Natl. Acad. Sci. U.S.A. 95:14897-14902(1998). [ 2] Freund C., Doetsch V., Nishizawa K., Reinherz E.L., Wagner G. Nat. Struct. Biol. 6:656-660(1999). [D1] INTERPRO:IPR003169 [D2] PFAM:PF02213 [D3] PRINTS: [D4] SMART:GYF {END} {QDOC50831} {PS50831; SOHO} {Status=preliminary} {BEGIN} *********************** * SoHo domain profile * *********************** The c-Cbl-associated protein (CAP), ArgBP2 and vinexin-alpha each contain three C-terminal SH3 domains (see ) and an N-terminal region with similarity to the gut peptide sorbin, termed the sorbin homology (SoHo) domain [1,2,3]. Whereas the SH3 domains of these proteins can bind to different signaling or cytoskeletal molecules, the SoHo domains of CAP and vinexin have been shown to interact specifically with the lipid raft-associated protein flotillin. Thus these proteins serve as adapters that link signaling or cytoskeletal proteins to the lipid raft, a microdomain of the plasma membrane enriched in cholesterol and sphingolipids that concentrates certain signaling molecules [4]. The profile we developed covers the region of the SoHo domain necessary for the interaction with flotillin. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. 
-Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Sparks A.B., Hoffman N.G., McConnell S.J., Fowlkes D.M., Kay B.K. Nat. Biotechnol. 14:741-744(1996). [ 2] Ribon V., Printen J.A., Hoffman N.G., Kay B.K., Saltiel A.R. Mol. Cell. Biol. 18:872-879(1998). [ 3] Kawabe H., Hata Y., Takeuchi M., Ide N., Mizoguchi A., Takai Y. J. Biol. Chem. 274:30914-30918(1999). [ 4] Kimura A., Baumann C.A., Chiang S.-H., Saltiel A.R. Proc. Natl. Acad. Sci. U.S.A. 98:9098-9103(2001). [D1] INTERPRO:IPR003127 [D2] PFAM:PF02208 [D3] PRINTS: [D4] SMART:Sorb {END} {QDOC50833} {PS50833; BRIX} {Status=preliminary} {BEGIN} *********************** * Brix domain profile * *********************** Analysis of the Brix (biogenesis of ribosomes in Xenopus) protein led to the identification of a region of 150-180 residues in length, called the Brix domain, which is found in six protein families: one archaeal family (I) including hypothetical proteins (one per genome); and five eukaryote families, each named according to a representative member and including close homologues of this prototype: (II) Peter Pan (D. melanogaster) and Ssf1/2 (S. cerevisiae); (III) yhr088wp/Rpf1p (S. cerevisiae); (IV) IMP4 (S. cerevisiae); (V) Brix (X. laevis) and yol077c/brx1 (S. cerevisiae); and (VI) ykr081cp (S. cerevisiae). Typically, a protein sequence belonging to the Brix domain superfamily contains a highly charged N-terminal segment (about 50 residues) followed by a single copy of the Brix domain and another highly charged C-terminal region (about 100 residues). The archaeal sequences have two unique characteristics: (1) the charged regions are totally absent at the N-terminus and are reduced to about 10 residues at the C-terminus; and (2) the C-terminal part of the Brix domain itself is minimal. Two eucaryote groups have large insertions within the C-terminal region: about 70 residues in group III and about 120 in group II. 
Biological data for some proteins in this superfamily suggest a role in ribosome biogenesis and rRNA binding [1,2]. The profile we developed spans the entire Brix domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Eisenhaber F., Wechselberger C., Kreil G. Trends Biochem. Sci. 26:345-347(2001). [ 2] Mayer C., Suck D., Poch O. Trends Biochem. Sci. 26:143-144(2001). [D1] INTERPRO:IPR002799 [D2] PFAM:PF01945 [D3] PRINTS: [D4] SMART: {END} {QDOC50835} {PS50835; IG_LIKE} {Status=preliminary} {BEGIN} ************************** * Ig-like domain profile * ************************** The Ig-like domain is probably the most widespread domain, at least in animals. This domain can be considered a heterogeneous group built on a common fold. Proteins containing an Ig-like domain differ in their tissue distribution, amino acid composition, and biological role. All Ig-like domains appear to be involved in binding functions. The ligands of the Ig-like domains range from small molecules (antigens, chromophores), to hormones (growth hormone, interferons, prolactin, etc.), up to giant molecules (muscle proteins). Binding sites are localized either in the loop regions (the most variable parts of the immunoglobulins) or in strands. For instance, distinct areas of the sheets are used to bind the ligands of the MHC, CD8, CD4, and PapD molecules or of the growth hormone receptor (GHR). These binding sites may be formed by a single chain (CD2, CD4), by homodimers (GHR, CD8), or by heterodimers. Classical Ig-like domains are composed of 7 to 10 beta strands, distributed between two sheets with typical topology and connectivity. Ig-like domains have similar general shapes, but differ significantly in their sizes, owing to the high variability of the loops. 
While a classical domain contains about 100 residues, smaller ones (74-90 residues) have been observed in several Ig-related molecules (CD2, CD4). Large decorations within loops, sometimes including extra domains, are found in hemocyanin (238 amino acids), transcription factor NFkappaB (201 amino acids) and cytochrome f (214 amino acids). The schematic representation of the structure of a typical Ig-like domain is shown below: [schematic: beta-strands drawn from top to bottom as D (=>), E (<=), B (=>), A (<=), G (<=), F (=>), C (<=), joined by connecting loops] '=>': indicates the direction of the beta-strands 'A' to 'G'. Ig-like domains can be classified according to the number of beta strands: C1-type: classical Ig-like domain, described in the schematic representation. Sheet I: ABED, sheet II: CFG. - Domain C1 is found only in the molecules involved in the immune system: Immunoglobulins (Ig), T-cell receptors (TcR) and Major Histocompatibility Complex (MHC) molecules. C2-type: strand D is deleted and replaced by strand C' directly connected to strand E. Sheet I: ABE, sheet II: C'CFG. - second domain of the vascular cell adhesion molecule-1. - neural cell adhesion molecule 2. - vascular endothelial growth factor receptor 3. - fibroblast growth factor receptor 4. - interleukin-6 receptor alpha chain. V-type: extra strands C' and C" between strand C and D. Sheet I: ABED, sheet II: C"C'CFG. - variable domain of the immunoglobulin heavy chain. - T-cell surface glycoprotein CD8 alpha chain. - viral hemagglutinin. - Programmed cell death protein 1. - Neurocan core protein. - Myelin protein zero. H-type: extra strand C' between strand C and D. Sheet I: ABE, sheet II: CFG. Strand C'/D links sheet I and II. - cellulase c. We developed a profile based on structural alignment that covers the whole domain. -Sequences known to belong to this class detected by the profile: ALL. 
-Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2001 / First entry. [ 1] Bork P., Holm L., Sander C. J. Mol. Biol. 242:309-320(1994). [ 2] Halaby D.M., Poupon A., Mornon J.P. Protein Eng. 12:563-571(1999). [ 3] Halaby D.M., Mornon J.P. J. Mol. Evol. 46:389-400(1998). [D1] INTERPRO:IPR003006 [D2] PFAM:PF00047 [D3] PRINTS: [D4] SMART: {END} {QDOC50836} {PS50836; DOMON} {Status=preliminary} {BEGIN} ************************ * DOMON domain profile * ************************ The DOMON domain is a 110-125 residue long domain which has been identified in the physiologically important enzyme dopamine beta-monooxygenase and in several other secreted and transmembrane proteins from both plants and animals. It has been named after DOpamine beta-MOnooxygenase N-terminal domain. The DOMON domain can be found in one to four copies and in association with other domains, such as the Cu-ascorbate dependent monooxygenase domain, the epidermal growth factor domain (see ), the trypsin inhibitor-like domain (TIL), the SEA domain (see ), and the Reelin domain. The architectures of the DOMON domain proteins strongly suggest a function in extracellular adhesion [1]. The sequence conservation is predominantly centered around patches of hydrophobic residues. The secondary structure prediction of the DOMON domain points to an all-beta-strand fold with seven or eight core strands supported by a buried core of conserved hydrophobic residues. There is a characteristic motif with two small positions (Gly or Ser) corresponding to a conserved turn immediately C-terminal to strand three. It has been proposed that the DOMON domain might form a beta-sandwich structure, with the strands distributed into two beta sheets as is seen in many extracellular adhesion domains such as the immunoglobulin, fibronectin type III, cadherin and PKD (see ) domains [1]. Some proteins known to contain a DOMON domain are listed below: - Mammalian dopamine beta-monooxygenase (DM). 
It is involved in the conversion of dopamine to the catecholamine noradrenaline, a crucial physiological modulator of the sympathetic nervous system, T-cell-mediated immunity and fetal development. - Drosophila tyramine-beta-hydroxylase (TBH), an orthologue of DM. It is needed for the biosynthesis of the neurotransmitter octopamine from tyramine. - Human brain protein CG-6. - Mouse SDR2 protein. - Botryllus schlosseri PAR protein, a soluble immunoglobulin molecule homologue. - Caenorhabditis elegans uncharacterized protein C09F9.2. - Arabidopsis thaliana uncharacterized protein MBG8.9. The profile we developed covers the entire DOMON domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: September 2000 / First entry (C.J.A. Sigrist). [ 1] Aravind L. Trends Biochem. Sci. 26:524-526(2001). [D1] INTERPRO: [D2] PFAM: [D3] PRINTS: [D4] SMART: {END} {QDOC50840} {PS50840; PA} {BEGIN} ************************************** * Protease-associated domain profile * ************************************** The protease-associated (PA) domain is a region of about 120 amino acids found in proteins belonging to different protease superfamilies, the subtilases and the Zn-containing metalloproteases, and in two classes of plant transmembrane proteins, which are thought to be vacuolar sorting receptors. In the proteins belonging to the subtilase family (peptidase family S8), the PA domain is inserted into the catalytic domain. The PA domain can be found associated with fibronectin type-III, subtilase, carboxypeptidase, EGF (see ) and RING-finger (see ) domains. The role of the PA domain in the proteases and the sorting receptors remains somewhat elusive. However, in the human transferrin receptor, a catalytically inactive protein, the apical PA domain is positioned as a lid covering the remnants of the active site. Proteins known to contain a PA domain are listed below: - Streptococcal C5a peptidase. 
- Lactococcal lactocepin, a cell wall protease. - Several plant subtilases of the P69 family. - Plant BP-80, AtELP and PV72 proteins. They are thought to be involved in targeting multiple proteins to the plant lytic vacuoles by recognizing an N-terminal pro-sequence. - Arabidopsis thaliana ReMembR-H2 protein. It targets proteins to the storage vacuoles. - Yeast aminopeptidase Y. - Human glutamate-releasing NAALADase. - Mammalian transferrin receptors TR1 and TR2. They are related to the NAALADases but have lost their metal-coordinating residues and thus their catalytic activity. - Mammalian prostate-specific membrane antigen (PSM). It has folylpoly-gamma-glutamate carboxypeptidase activity. The profile we developed covers the entire PA domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: October 2001 / First entry (C.J.A. Sigrist). [ 1] Luo X., Hofmann K. Trends Biochem. Sci. 26:147-148(2001). [D1] INTERPRO:IPR003137 [D2] PFAM:PF02225 [D3] PRINTS: [D4] SMART: {END} {QDOC50841} {PS50841; DIX} {BEGIN} ********************** * DIX domain profile * ********************** Proteins of the dishevelled family (Dsh and Dvl) play a key role in the transduction of the Wg/Wnt signal (see ) from the cell surface to the nucleus: in response to the Wnt signal, they block the degradation of beta-catenin by interacting with the scaffolding protein axin. The N-terminus of proteins of the dishevelled family and the C-terminus of proteins of the axin family share a region of homology of about 80 amino acids, which has been called DIX for DIshevelled and aXin [1]. The DIX domain is found associated with PDZ (see ) and DEP (see ) domains in proteins of the dishevelled family and with an RGS domain (see ) in proteins of the axin family. 
The DIX domain has been shown to be a protein-protein interaction domain that is important for homo- and hetero-oligomerization of proteins of the dishevelled and axin families [2,3,4,5]. The profile we developed covers the entire DIX domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: October 2001 / First entry (C.J.A. Sigrist). [ 1] Cadigan K.M., Nusse R. Genes Dev. 11:3286-3305(1997). [ 2] Kishida S., Yamamoto H., Hino S.-I., Ikeda S., Kishida M., Kikuchi A. Mol. Cell. Biol. 19:4414-4422(1999). [ 3] Sakanaka C., Williams L.T. J. Biol. Chem. 274:14090-14093(1999). [ 4] Fukui A., Kishida S., Kikuchi A., Asashima M. Dev. Growth Differ. 42:489-498(2000). [ 5] Julius M.A., Schelbert B., Hsu W., Fitzpatrick E., Jho E., Fagotto F., Costantini F., Kitajewski J. Biochem. Biophys. Res. Commun. 276:1162-1169(2000). [D1] INTERPRO:IPR001158 [D2] PFAM:PF00778 [D3] PRINTS: [D4] SMART:SM0021 {END} {QDOC50849} {PS50849; CUPIN} {BEGIN} ************************ * Cupin domain profile * ************************ The term cupin (from the Latin term 'cupa', for a small barrel or cask) has been given to a beta-barrel structural domain identified in a superfamily of prokaryotic and eukaryotic proteins. The cupin domain is found in one or two copies in proteins whose functions vary from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. The characteristic cupin domain (see ) comprises two conserved motifs, each corresponding to two beta strands. For the first conserved motif, the characteristic conserved sequence is G-x(5)-H-x-H-x(3,4)-E-x(6)-G, and for the second conserved motif it is G-x(5)-P-x-G-x(2)-H-x(3)-N. 
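The two cupin motifs are written in a PROSITE-style pattern notation, where "x" is any residue and a parenthesised number or range gives a repeat count. A minimal sketch of how such a pattern could be turned into a regular expression is given below; the function name prosite_to_regex is hypothetical, and it deliberately handles only the notation used in the two motifs above (no bracketed alternatives, exclusions or anchors).

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regex string.

    Supported elements: a single residue letter, 'x' for any residue,
    and an optional repeat count such as x(5) or x(3,4).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"([A-Za-z])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, low, high = m.groups()
        # 'x' matches any residue; anything else is taken literally.
        token = "." if residue.lower() == "x" else residue.upper()
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        parts.append(token)
    return "".join(parts)


# The two motifs quoted in the text.
MOTIF_1 = prosite_to_regex("G-x(5)-H-x-H-x(3,4)-E-x(6)-G")
MOTIF_2 = prosite_to_regex("G-x(5)-P-x-G-x(2)-H-x(3)-N")
print(MOTIF_1)
print(MOTIF_2)
```

The resulting strings can be passed to re.search to scan a sequence; such literal matching is stricter than the profile, which tolerates deviations from the consensus.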
Between these two motifs (usually His containing) is a less conserved region composed of two beta strands with an intervening loop of variable length. It has been proposed that the compact beta-barrel structure that makes up the cupin core has been coopted for a variety of purposes, many or all of which require a thermostable, pepsin-resistant framework, usually containing metal-binding ligands [1,2,3]. Some proteins known to contain a cupin domain are listed below: - Various eukaryotic zinc finger transcription factors. - Eukaryotic centromeric proteins (CENP-C). - Higher plant germin and germin-like proteins (see ). - Plant sucrose-binding proteins. - Plant seed storage proteins. - Physarum polycephalum spherulins. - Several types of dioxygenase enzymes. - Microbial type II phosphomannose isomerases (PMIs) (EC 5.3.1.8). - Microbial epimerases involved in the synthesis of bacterial and archaeal cell wall components. - Microbial oxalate decarboxylases. - Bacterial polyketide synthases (putative cyclases). - A subset of the bacterial AraC transcription factors. The profile we have developed covers the entire cupin domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: January 2002 / First entry (C.J.A. Sigrist). [ 1] Dunwell J.M., Culham A., Carter C.E., Sosa-Aguirre C.R., Goodenough P.W. Trends Biochem. Sci. 26:740-746(2001). [ 2] Khuri S., Bakker F.T., Dunwell J.M. Mol. Biol. Evol. 18:593-605(2001). [ 3] Dunwell J.M., Khuri S., Gane P.J. Microbiol. Mol. Biol. Rev. 64:153-179(2000). {END} {QDOC50850} {PS50850; MFS} {BEGIN} *********************************************** * Major Facilitator Superfamily (MFS) profile * *********************************************** Among the different families of transporters, only two occur ubiquitously in all classifications of organisms. These are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). 
The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients [1,2]. The MFS family contains members that function as uniporters, symporters or antiporters. In addition, their solute specificities are also diverse. MFS proteins contain 12 transmembrane regions (with some variations). Some proteins known to belong to the MFS superfamily are listed below: - Sugar transporter. - Drug: H+ antiporter. - Organophosphate: Pi antiporter. - Oligosaccharide: H+ symporter. - Metabolite: H+ symporter. - Nitrate/nitrite symporter. - Phosphate: H+ symporter. - Nucleoside: H+ symporter. - Oxalate/formate antiporter. - Sialate: H+ symporter. - Monocarboxylate porter. - Anion:cation symporter. - Aromatic acid: H+ symporter. - Cyanate permease. - Proton-dependent oligopeptide transporter. The profile we developed covers the 12 transmembrane regions. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in SWISS-PROT: NONE. -Last update: January 2002 / First entry. [ 1] Pao S.S., Paulsen I.T., Saier M.H. Jr. Microbiol. Mol. Biol. Rev. 62:1-34(1998). [ 2] Walmsley A.R., Barrett M.P., Bringaud F., Gould G.W. Trends Biochem. Sci. 23:476-481(1998). {END}