------------------------------------------------------------------------------
SWISS-PROT Protein Sequence Data Bank.
Release 36.0, July 1998
------------------------------------------------------------------------------
Classification of metallothioneins and index of the entries in SWISS-PROT
------------------------------------------------------------------------------
Amos Bairoch                          Email: bairoch@medecine.unige.ch
Swiss Institute of Bioinformatics and University of Geneva
Switzerland
------------------------------------------------------------------------------
Document name: METALLO.TXT
------------------------------------------------------------------------------

This document provides both a brief description of the classification of
metallothioneins into families as developed by Pierre-Alain Binz and
Jeremias H.R. Kagi as well as the list, classified by family, of the
metallothionein entries in SWISS-PROT.

The definition of the metallothionein superfamily and its subdivisions is
presented at the bottom of this file. The description of the criteria
characterising all families and subfamilies is to be found at the web site

   http://www.unizh.ch/~mtpage/classif.html

Comments about the metallothionein families classification should be sent to:

   Pierre-Alain Binz
   Medical Biochemistry Department
   University of Geneva
   1, Rue Michel Servet
   1211 Geneva 4
   Switzerland
   E-mail: binz@dim.hcuge.ch

========================
Metallothionein families
========================

Introduction
------------

Metallothioneins (MTs) constitute a superfamily of low molecular weight
cysteine-rich metalloproteins and metallopeptides responsible for regulating
the intracellular supply of the biologically essential zinc and copper ions
and for protecting cells from the deleterious effects of exposure to elevated
amounts of these and non-essential polarizable transition and posttransition
metal ions such as Cd2+, Hg2+ and others.
Since its discovery as a cadmium and zinc containing protein in horse kidney
by Margoshes and Vallee [1], MT has been an object of intensive study in
various branches of the life sciences [2,3]. MT is now known to occur in all
animal phyla examined so far as well as in certain fungi, plants and
cyanobacteria. In mammals, 20 totally conserved cysteine residues (Cys) bind,
in the reduced form, a complement of 7 equivalents of polarizable bivalent
metal ions, giving rise to two unique metal-thiolate clusters with
characteristic spectroscopic features [4]. The spontaneous refolding of the
native structure upon metal addition attests to a guiding role of the
positions of the Cys and other amino acids conserved in the polypeptide
chain. The invertebrate holo-MTs studied thus far display similar clusters
with structural and compositional variations due to different numbers and
relative positions of the Cys residues on the polypeptide chain.

An empirical classification into three classes was proposed by Fowler et al.
[5] and Kojima [6]. Members of Class I are defined to include polypeptides
related in the positions of the Cys to equine MT-1B, while those of Class II
display no or only very distant correspondence in this respect. Besides the
vertebrate MTs, Class I subsumes all presently known crustacean and molluscan
sequences. Class III MTs are polyisopeptides composed of atypical
gamma-glutamylcysteinyl units and are therefore not direct gene products [5].

Since this classification system does not allow patterns of structural
similarity to be clearly differentiated, we have now grouped all
proteinaceous MT sequences (Class I and Class II) into families of
phylogenetically related and, accordingly, alignable sequences. The
subclassification discriminating subfamilies and subgroups of MT sequences
is based on phylogenetic analyses inferred from both amino-acid and
polynucleotide sequences [7-9].
The web page containing all details of the classification system, including
strategy and results is to be found at:

   http://www.unizh.ch/~mtpage/MT.html

References
----------

[1] Margoshes M., Vallee B.L.
    A cadmium protein from equine kidney cortex.
    J. Am. Chem. Soc. 79:4813-4814(1957).
[2] Kagi J.H.R.
    Overview of Metallothioneins.
    Meth. Enzymol. 205:613-626(1991).
[3] Kagi J.H.R., Kojima Y.
    Chemistry and biochemistry of metallothionein.
    Experientia Suppl. 52:25-61(1987).
[4] Kagi J.H.R., Schaffer A.
    Biochemistry of metallothionein.
    Biochemistry 27:8509-8515(1988).
[5] Fowler B.A., Hildebrand C.E., Kojima Y., Webb M.
    Nomenclature of metallothionein.
    Experientia Suppl. 52:21(1987).
[6] Kojima Y.
    Definitions and nomenclature of metallothioneins.
    Meth. Enzymol. 205:8-10(1991).
[7] Binz P.-A.
    Metallothioneins: studies on molecular evolution and on the structural
    and chiroptical features of their metal thiolate clusters
    PhD thesis, University Zurich, Switzerland (1996).
[8] Binz P.-A., Kagi J.H.R.
    Molecular evolution of the metallothionein. Suggestions for a natural
    classification system
    Fourth International Meeting on Metallothionein, Kansas City (USA), 1997.
[9] Kojima Y., Binz P.-A., Kagi J.H.R.
    Metallothionein: Nomenclature and Classification
    (submitted)

------------------------------------------------------------------------------

=============================================
SWISS-PROT entries for metallothioneins (MTs)
=============================================

Notes:

Subdiv. = subdivision
          Subdivisions are described for each family
Rem.    = remarks:
          v: variant(s) known
          c: conflict(s) known
          p: partial sequence
          i: sequence identical in more than one species
Source  = molecular source:
          AA: protein sequence
          NA: nucleic acid sequence

The criteria used for the classifications are described at the bottom of this
document.
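
The column conventions in the notes above can be decoded mechanically. The
following is a minimal sketch, not part of SWISS-PROT: it assumes that each
index row consists of whitespace-separated fields in which the first field is
the entry name, the second the subdivision and the last the source, with any
fields in between being remark codes. The names IndexRow and parse_index_row
are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Code tables copied from the "Notes" section of this document.
REMARKS = {
    "v": "variant(s) known",
    "c": "conflict(s) known",
    "p": "partial sequence",
    "i": "sequence identical in more than one species",
}
SOURCES = {
    "AA": "protein sequence",
    "NA": "nucleic acid sequence",
}


@dataclass
class IndexRow:
    """One row of a family index table (hypothetical helper type)."""
    entry: str
    subdivision: str
    remarks: List[str]
    sources: List[str]


def parse_index_row(line: str) -> IndexRow:
    """Split an index row into its four columns.

    Assumption: fields are whitespace-separated, the remark column may be
    empty, and the source column is "AA", "NA" or "AA/NA".
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got: {line!r}")
    entry, subdivision, *remark_codes, source = fields
    for code in remark_codes:
        if code not in REMARKS:
            raise ValueError(f"unknown remark code {code!r} in: {line!r}")
    sources = source.split("/")
    for code in sources:
        if code not in SOURCES:
            raise ValueError(f"unknown source code {code!r} in: {line!r}")
    return IndexRow(entry, subdivision, remark_codes, sources)


if __name__ == "__main__":
    row = parse_index_row("MT1A_BOVIN  m1       c i   AA/NA")
    print(row.entry, row.subdivision)
    print([REMARKS[code] for code in row.remarks])
    print([SOURCES[code] for code in row.sources])
```

Because the sketch relies only on field order, it tolerates any column
spacing, but it would misread a row whose subdivision column is empty.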
========================
Family 1: vertebrate MTs
========================

Pattern: K-x(1,2)-C-C-x-C-C-P-x(2)-C
3D structure status: available for human Cd7-MT-2 (PDB: 1hmu, 2hmu)
                                   rabbit Cd7-MT-2 (PDB: 1mrb, 2mrb)
                                   rat Cd7-MT-2 (PDB: 1mrt, 2mrt)
                                   rat Cd5Zn2-MT-2 (Braun et al. 1992,
                                   PNAS USA 89:10124-10128, not in PDB)
Known taxonomic range: vertebrata
Multiple sequence alignment: yes
Phylogenetic trees: yes
Subdivisions: m1: mammalian MT-1
              m2: mammalian MT-2
              m3: mammalian MT-3
              m4: mammalian MT-4
              m : mammalian MT
              a1: avian MT-1
              a2: avian MT-2
              a : avian MT
              t : teleost MT
              b : batrachian MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_CANFA   m1             NA
MT1_CRIGR   m1             NA
MT2_CRIGR   m2             NA
MT2_MESAU   m2             NA
MT1A_HORSE  m1       v     AA
MT1B_HORSE  m2             AA
MT1_MOUSE   m1       c     AA/NA
MT2_MOUSE   m2       c     AA/NA
MT1_PIG     m1             NA
MT1A_RABIT  m1             AA/NA
MT2A_RABIT  m              AA
MT2B_RABIT  m              AA
MT2C_RABIT  m              AA
MT2D_RABIT  m              AA
MT2E_RABIT  m              AA
MT1_RAT     m1             AA/NA
MT2_RAT     m2             AA/NA
MT1A_BOVIN  m1       c i   AA/NA
MT1H_BOVIN  m              AA
MT1B_SHEEP  m1             NA
MT1C_SHEEP  m1             NA
MT2_BOVIN   m2       i     AA/NA
MT2H_BOVIN  m              AA
MT2_STECO   m1             AA
MT1_CERAE   m1             NA
MT2_CERAE   m2       i     NA
MT1A_HUMAN  m1             NA
MT1B_HUMAN  m1             NA
MT1E_HUMAN  m1             AA/NA
MT1F_HUMAN  m1             AA/NA
MT1G_HUMAN  m1             NA
MT1H_HUMAN  m1             AA/NA
MT1I_HUMAN  m1             AA
MT1K_HUMAN  m1             AA
MT1L_HUMAN  m1             AA/NA
MT1R_HUMAN  m1             NA
MT2_HUMAN   m2             AA/NA
MT3_BOVIN   m3             AA
MT3_HORSE   m3             AA
MT3_HUMAN   m3             AA/NA
MT3_MOUSE   m3             NA
MT3_PIG     m3             NA
MT3_RAT     m3             NA
MT4_HUMAN   m4             NA
MT4_MOUSE   m4             NA
MT_CHICK    a1       i     AA/NA
MT1_COLLI   a1             AA
MT2_COLLI   a2             AA
MTA_COLVI   a        p     NA
MTB_COLVI   a        p     NA
MT_BRARE    t              NA
MT_CARAU    t        c     NA
MT_CHAAC    t              NA
MT_ESOLU    t              NA
MT_GADMO    t        c     NA
MT_NOEBA    t              NA
MT_PERFL    t              NA
MTA_ONCMY   t        i     NA
MTB_ONCMY   t        i     NA
MTB_SALSA   t              NA
MT_OREMO    t              NA
MT_PLEPL    t              AA/NA
MT_PSEAM    t              AA/NA
MT_RUTRU    t              AA
MTA_SPAAU   t              NA
MTA_THECR   t              NA
MT_TREBE    t        i     NA
MT_ZOAVI    t              NA
MT_XENLA    b              NA

=====================
Family 2: mollusc MTs
=====================

Pattern: C-x-C-x(3)-C-T-G-x(3)-C-x-C-x(3)-C-x-C-K
3D structure status: not available
Known taxonomic range: pelecypoda, gastropoda
Multiple sequence alignment: yes
Phylogenetic trees: yes
Subdivisions: mo1: mussel MT-1
              mo2: mussel MT-2
              mog: gastropod MT
              mo : other mollusc MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT_CRAVI    mo             AA/NA
MT_DREPO    mo             NA
MT11_MYTED  mo1            AA
MT12_MYTED  mo1            AA
MT13_MYTED  mo1            AA
MT14_MYTED  mo1            AA
MT21_MYTED  mo2      v     AA
MT22_MYTED  mo2            AA
MT23_MYTED  mo2      v     AA
MT_ARIAR    mog      v     AA
MTCD_HELPO  mog            AA
MTCU_HELPO  mog            AA

========================
Family 3: crustacean MTs
========================

Pattern: P-[GD]-P-C-C-x(3,4)-C-x-C
3D structure status: available for blue crab Cd6-MT-I (PDB: 1dmc, 1dmd, 1dme)
                               for mud crab Cd6-MT1 (Not yet in PDB)
Known taxonomic range: crustacea
Multiple sequence alignment: yes
Phylogenetic trees: yes
Subdivisions: c : crustacean MT
              c1: crustacean MT-1
              c2: crustacean MT-2

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT_CARMA    c2       v     AA
MT1_CALSI   c1       v     AA
MT2_CALSI   c2       v     AA
MT1_SCYSE   c1             AA
MT2_SCYSE   c2             AA
MT1_HOMAM   c              AA
MT_ASTFL    c        v     AA
MT_POTPO    c              AA

===========================
Family 4: echinodermata MTs
===========================

Pattern: P-D-x-K-C-V-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C
3D structure status: available for sea urchin Cd7-MTA (Wang Y, PhD thesis,
                                   Univ. Zurich, 1996, Not yet in PDB)
Known taxonomic range: echinoidea
Multiple sequence alignment: yes
Phylogenetic trees: yes
Subdivisions: e1: echinodermata MT type 1
              e2: echinodermata MT type 2

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT_PARLI    e2             AA
MTA_STRPU   e1       c     NA
MTB_STRPU   e1             NA
MTA_SPHGR   e2       p     NA
MTB_SPHGR   e2       p     NA
MT_STENE    e1             NA

=====================
Family 5: diptera MTs
=====================

Pattern: C-G-x(2)-C-x-C-x(2)-Q-x(5)-C-x-C-x(2)-D-C-x-C
3D structure status: not available
Known taxonomic range:
Multiple sequence alignment: yes
Phylogenetic trees: no
Subdivisions: d1: diptera MT type 1
              d2: diptera MT type 2

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_DROME   d1             NA
MT1_DROSI   d1       i     NA
MT2_DROME   d2             NA

======================
Family 6: nematoda MTs
======================

Pattern: K-C-C-x(3)-C-C
3D structure status: not available
Known taxonomic range: secernentea
Multiple sequence alignment: yes
Phylogenetic trees: no
Subdivisions: n1: nematoda MT type 1
              n2: nematoda MT type 2

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_CAEEL   n1       c     NA
MT2_CAEEL   n2             NA

=====================
Family 7: ciliata MTs
=====================

Pattern: -
3D structure status: not available
Known taxonomic range: ciliata (protozoa)
Multiple sequence alignment: no
Phylogenetic trees: no
Subdivisions: ci: ciliata MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_TETPI   ci       i v   AA

=====================
Family 8: fungi-I MTs
=====================

Pattern: C-G-C-S-x(4)-C-x-C-x(3,4)-C-x-C-S-x-C
3D structure status: available for N.crassa Cu6-MT (Not yet in PDB)
Known taxonomic range: basidiomycotina, deuteromycotina, ascomycotina
Multiple sequence alignment: yes
Phylogenetic trees: no
Subdivisions: f1: fungi-I MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT_AGABI    f1             AA
MT1_COLGL   f1             NA
MT2_COLGL   f1             NA
MT_NEUCR    f1       c     AA/NA

======================
Family 9: fungi-II MTs
======================

Pattern: -
3D structure status: not available
Known taxonomic range: deuteromycotina
Multiple sequence alignment: no
Phylogenetic trees: no
Subdivisions: f2: fungi-II MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_CANGA   f2             AA/NA

========================
Family 10: fungi-III MTs
========================

Pattern: -
3D structure status: not available
Known taxonomic range: deuteromycotina
Multiple sequence alignment: no
Phylogenetic trees: no
Subdivisions: f3

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT2_CANGA   f3       v     AA/NA

=======================
Family 11: fungi-IV MTs
=======================

Pattern: C-X-K-C-x-C-x(2)-C-K-C
3D structure status: not available
Known taxonomic range: ascomycotina
Multiple sequence alignment: no
Phylogenetic trees: no
Subdivisions: f4: fungi-IV MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_YARLI   f4             NA
MT2_YARLI   f4             NA

======================
Family 12: fungi-V MTs
======================

Pattern: -
3D structure status: PDB: 1aoo, 1aqq, 1aqr, 1aqs
Known taxonomic range: ascomycotina
Multiple sequence alignment: yes
Phylogenetic trees: no
Subdivisions: f5: fungi-V MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MTC_YEAST   f5             AA/NA

=======================
Family 13: fungi-VI MTs
=======================

Pattern: -
3D structure status: not available
Known taxonomic range: ascomycotina
Multiple sequence alignment: no
Phylogenetic trees: no
Subdivisions: f6: fungi-VI MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
CRS5_YEAST  f6             NA

=========================
Family 14: prokaryota MTs
=========================

Pattern: K-C-A-C-x(2)-C-L-C
3D structure status: not available
Known taxonomic range: cyanobacteria
Multiple sequence alignment: yes
Phylogenetic trees: no
Subdivisions: pr: prokaryota MT

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT_SYNP7    pr             AA/NA
MT_SYNSP    pr             AA
MT_SYNVU    pr             NA

====================
Family 15: plant MTs
====================

Pattern: [YFH]-x(5,25)-C-[SKD]-C-[GA]-[SDPAT]-x(0,1)-C-x-[CYF]
Note: yields all plant sequences, but also MTCU_HELPO and the non-MT
      ITB3_HUMAN
3D structure status: not available
Known taxonomic range: angiospermae (magnoliophyta)
Multiple sequence alignment: yes
Phylogenetic trees: yes
Subdivisions: p1 : plant MT type 1
              p2 : plant MT type 2
              p2v: plant MT type 2 variant, described as a clan of p2
              p3 : plant MT type 3
              p21: plant MT type 2x1
              pec: plant EC MT-like protein

SWISS-PROT  Subdiv.  Rem.  Source
----------  -------  ----  ------
MT1_CASGL   p1             NA
MT1_CICAR   p1             NA
MT1_HORVU   p1             NA
MT1_MAIZE   p1             NA
MT1_ORYSA   p1             NA
MT1_PEA     p1             NA
MTB_TRIRP   p1             NA
MT1A_VICFA  p1             NA
MT1B_VICFA  p1             NA
MT1_WHEAT   p1             NA
MT2_ACTCH   p2             NA
MT2A_ARATH  p2       c     NA
MT2B_ARATH  p2             NA
MT21_BRAJU  p2             NA
MT23_BRAJU  p2             NA
MT25_BRAJU  p2             NA
MT2_BRANA   p2       p     NA
MT2_BRARA   p2       c i   NA
MT2_BRARP   p2             NA
MT2_CICAR   p2       p     NA
MT1_COFAR   p2             NA
MT2_FRAAN   p2v            NA
MT2A_LYCES  p2             NA
MT2B_LYCES  p2       c     NA
MT2X_LYCES  p2             NA
MT2Y_LYCES  p2             NA
MT2Z_LYCES  p2v            NA
MT2_MALDO   p2             NA
MT2_MUSAC   p2             NA
MT2_NICGU   p2v            NA
MT2_NICPL   p2             NA
MT21_ORYSA  p2             NA
MT22_ORYSA  p2v            NA
MT2_RICCO   p2             NA
MTA_TRIRP   p2             NA
MT2_VICFA   p2             NA
MT1_MIMGU   p2v      c     NA
MT1A_ARATH  p21            NA
MT1B_ARATH  p21            NA
MT1C_ARATH  p21            NA
MT54_BRANA  p21            NA
MT3_ACTCH   p3             NA
MT3_CARPA   p3             NA
MT3_MALDO   p3             NA
MT3_MUSAC   p3             NA
MT3_PICGL   p3             NA
MT3_PRUAV   p3             NA
EC1_ARATH   pec            NA
EC2_ARATH   pec      p     NA
EC3_ARATH   pec      p     NA
EC_MAIZE    pec            NA
EC1_WHEAT   pec      c     AA/NA
EC3_WHEAT   pec            NA

==========================================================================
Family 99: phytochelatins and other non-proteinaceous MT-like polypeptides
==========================================================================

Note: gammaglutamyl-cysteinyl units, these are not proteins.
3D structure status: not available
Known taxonomic range: planta
Subdivisions: not defined

------------------------------------------------------------------------------

==================================================================
Definition of the metallothionein superfamily and its subdivisions
==================================================================

The metallothionein superfamily is defined phenomenologically as comprising
all polypeptides which resemble equine renal metallothionein in several of
their features (Nordberg & Kojima 1979, Fowler et al. 1987).
Such general features are low molecular weight, high metal content,
characteristic amino acid composition (high Cys content, low content of
aromatic amino acid residues), unique amino acid sequence with characteristic
distribution of Cys, e.g. Cys-X-Cys, and spectroscopic manifestations
characteristic of metal thiolate clusters.

An MT family subsumes MTs which share a particular set of sequence-specific
characters. An MT can belong to only one family, and the members of a family
are thought to be evolutionarily related. The inclusion of an MT in a family
presupposes that its amino acid sequence is alignable with that of all
members. A common and exclusive sequence pattern, a profile and a
phylogenetic tree can therefore be connected with each family. Each family
is identified by its number and its taxonomic range. An example is
Family 1: vertebrate MTs.

An MT subfamily contains MTs which, in addition to the family characters,
share a set of more stringent phylogenetic features. These extra criteria are
usually specific monophyletic relationships among the sequences of proteins
and/or of nucleotide segments in the genes (5' or 3' untranscribed portion of
the genes, 5' or 3' untranslated portions of the nucleotide sequences, exons,
introns). If relevant, other differentiating criteria can also be included,
such as presence, conservation or repetition of sequence patterns. A
subfamily is usually abbreviated with a letter followed, if necessary, by an
Arabic numeral. An example is m1: mammalian MT-1.

An MT subgroup represents, as a result of statistically validated
phylogenetic analyses, a branch of MT sequences of a subfamily which is
clearly distinguishable in a tree by its monophyletic character. An example
is m2U2: ungulate MT-2, subgroup of the m2 subfamily.

Isoforms or allelic forms are specifiable as members of subgroups,
subfamilies and families. They are named using the nomenclature system
defined in Kojima et al. (submitted), e.g. human MT-1E.

In addition, in cases where it is justifiable, one can define clans. A clan
is a set of partial or total amino acid or polynucleotide sequences,
subgroups, subfamilies, families or combinations of them which share
characters not defined by the above classification criteria. They can be
related to common spatial structure, thermodynamic properties, metal binding
properties, functionally related characters or other relevant features. A
clan is defined by the property common to its members. The abbreviation
should reflect this property.

------------------------------------------------------------------------------
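
The "Pattern:" lines given for the families in this document appear to follow
PROSITE-style pattern syntax: elements separated by "-", "x" for any residue,
"[...]" for a set of allowed residues and "(n)" or "(n,m)" for repetition.
Under that assumption, the following minimal sketch translates such a pattern
into a regular expression. The function name pattern_to_regex is hypothetical,
the sequence used in the demonstration is a made-up toy string rather than a
real SWISS-PROT sequence, and treating the upper-case "X" of the Family 11
pattern as a wildcard is likewise an assumption.

```python
import re

# One pattern element: a single residue letter, "x", or a bracketed set,
# optionally followed by a repeat count "(n)" or a range "(n,m)".
_ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string.

    Raises ValueError for anything outside the small subset handled here,
    including the placeholder "-" used by families without a pattern.
    """
    parts = []
    for element in pattern.strip().split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue in ("x", "X"):
            residue = "."  # any residue (upper-case X: assumption)
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(residue + repeat)
    return "".join(parts)


if __name__ == "__main__":
    family1 = pattern_to_regex("K-x(1,2)-C-C-x-C-C-P-x(2)-C")
    print(family1)
    toy_sequence = "AAKSCCACCPAGCAA"  # invented string, not a real protein
    hit = re.search(family1, toy_sequence)
    print(hit.group(0) if hit else "no match")
```

A match found this way is only a hint of family membership: as the note under
Family 15 records, the plant pattern also yields MTCU_HELPO and the non-MT
ITB3_HUMAN.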