METAL COORDINATION GROUPS IN PROTEINS - CONSTITUTION AND SEQUENCE

(New April 2004)

The preparation of lists such as these, and comparisons and comments based on them, are described in 'The Architecture of Metal Coordination Groups in Proteins', Acta Cryst. D60 (2004), in the press. The tables presented with that paper were based on structures in the PDB up to July 2001. The same procedures have been used to generate the new tables here, which include structures in the PDB up to September 2003.

These tables can be studied by simply inspecting the web pages, or they can be downloaded and imported into a spreadsheet program (e.g. Microsoft Excel), where much more manipulation is possible.

FULL LISTING OF CONSTITUTION ETC FOR METAL COORDINATION GROUPS in representative sets

Choose metal: . . . Ca . . . Mg . . . Mn . . . Fe . . . Cu . . . Zn . . . Na . . . K . . . Co . . . Ni . . .

SUMMARIES OF METAL COORDINATION GROUPS (all proteins, resolution < 2.5 Å)

Choose metal: . . . Ca . . . Mg . . . Mn . . . Fe . . . Cu . . . Zn . . . Na . . . K . . . Co . . . Ni . . .

In these lists metal coordination groups are categorised by the sequence of amino-acid donors and the relative positions of these amino-acids in the polypeptide chain, together with the total coordination number and information on any non-protein donors present.

For example, CHCC Zn 2 18 3 describes a coordination group in which Zn is coordinated to the thiolate S of cys(n), where n is the amino-acid residue number, an imidazole N of his(n+2), and the thiolate S atoms of cys(n+2+18) and cys(n+2+18+3). Using one-letter codes for amino-acids, the donors are thus C, H, C, C, the total coordination number is 4, and the sequence separations, seqdif, are 2, 18 and 3.

Similarly, HDDD Mn 23 4 104 describes a coordination group in which Mn is coordinated to his(n), asp(n+23), asp(n+23+4) and asp(n+23+4+104); here the coordination group also contains one water molecule, so the total coordination number is 5. Coordination groups have been sorted into alphabetical order of the donors to allow similar or identical groups in different proteins to be recognised.
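The cumulative arithmetic in these descriptors can be illustrated with a short Python sketch. It assumes the descriptor is available as a whitespace-separated string laid out as in the examples above (donor letters, metal, then the seqdifs); the function name `parse_descriptor` is hypothetical, and the special seqdif values -99 and -1 described in the column notes are not handled here.

```python
from itertools import accumulate


def parse_descriptor(descriptor):
    """Split a descriptor such as 'CHCC Zn 2 18 3' into its parts.

    Returns (donors, metal, offsets), where offsets are the residue
    positions of the donors relative to the first donor, residue n.
    Assumes every seqdif is an ordinary positive separation.
    """
    fields = descriptor.split()
    donors, metal = fields[0], fields[1]
    seqdifs = [int(value) for value in fields[2:]]
    if len(seqdifs) != len(donors) - 1:
        raise ValueError("expected one seqdif between each pair of donors")
    # Running sum of the separations: n, n+2, n+2+18, n+2+18+3, ...
    offsets = [0] + list(accumulate(seqdifs))
    return donors, metal, offsets


donors, metal, offsets = parse_descriptor("CHCC Zn 2 18 3")
for letter, offset in zip(donors, offsets):
    print(f"{metal}: donor {letter} at residue n+{offset}")
```

The last offset is also the span between the first and last donors for a group of this simple kind.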

To get representative sets for each metal, the full lists of protein structures containing each metal were culled using the PISCES procedure (G. Wang and R. L. Dunbrack, Jr., Bioinformatics, submitted (2002)), setting sequence identity < 0.30, resolution 0-2.3 Å and R factor 0-0.25. (Note that this is better than using a representative set of proteins, as was done previously.) These tables can be downloaded, sorted or manipulated using, for example, Microsoft Excel.

Each coordination group is named by the pdbcode and the residue number of the first donor amino-acid.

nsp is nspan, the difference between the residue numbers of the last and first donors.

np is the number of donors from the protein chain.

nw is the number of water molecules.

nn is the number of non-protein donor groups.

dons are the amino-acid donor groups in the order in which they occur in the polypeptide chain, using normal one-letter codes for amino-acids, O for main chain carbonyl oxygen, # for terminal -NH2.

sd1 to sd7 are the seqdifs (-99 signifies donors are from two different polypeptide chains, -1 is given when the second donor is water or other non-amino-acid donor). Warning: when the residue numbers of donors have been assigned negative values in the PDB file, these may be in error.

his indicates for each donor histidine whether coordination is by ND or NE (each non-histidine donor is represented by a dot).

cn is the total number of donor groups, including water molecules and small-molecule ligands, always treating a carboxylate as one group (the coordination number, as a chemist would define it, is then the number of donor groups + the number of bidentate carboxylate groups).

cn2 is the change in coordination number if the coordination sphere is extended by 0.20 Å. This number is also used to give coded information about whether the metal might be on a rotation axis: if the number is < 20, the metal atom is not near a rotation axis; if it is 20-29, then 20 has been added, signifying that a 2-fold axis is possible; if it is 30-39, then 30 has been added, signifying that a 3-fold axis is possible. To establish this with certainty it is necessary to look at the space group and coordinates.
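The coding of cn2 can be unpacked as in the following Python sketch. The function name `decode_cn2` and the returned tuple are hypothetical conveniences, and the sketch assumes cn2 is a non-negative integer; values of 40 or more are not described above and are rejected.

```python
def decode_cn2(cn2):
    """Separate the two pieces of information packed into cn2.

    Returns (change, axis): change is the change in coordination number
    when the sphere is extended by 0.20 A, and axis is None, 2 or 3 for
    no nearby rotation axis, a possible 2-fold or a possible 3-fold axis.
    """
    if cn2 < 0:
        raise ValueError("negative cn2 values are not described")
    if cn2 < 20:
        return cn2, None
    if cn2 <= 29:
        return cn2 - 20, 2
    if cn2 <= 39:
        return cn2 - 30, 3
    raise ValueError("cn2 values of 40 or more are not described")


print(decode_cn2(1))   # (1, None)
print(decode_cn2(21))  # (1, 2)
print(decode_cn2(30))  # (0, 3)
```

A non-None axis only flags a possibility; the space group and coordinates still have to be checked.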

rms is the r.m.s. deviation of metal to donor atom distances within the coordination sphere from target distances - a useful indicator of quality (0 is good, 0.5 is poor).
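As an illustration of what such an r.m.s. value measures, here is a minimal Python sketch. It assumes the conventional definition (square root of the mean squared difference over all donors in the group); the exact formula used to produce the tables is not stated here, and the distances in the example are made-up numbers, not values from the tables or from the target-distance paper.

```python
import math


def rms_deviation(observed, targets):
    """R.m.s. deviation of observed metal-donor distances from targets.

    observed and targets are equal-length sequences of distances,
    one pair per donor atom in the coordination sphere.
    """
    if len(observed) != len(targets) or not observed:
        raise ValueError("need matching, non-empty distance lists")
    squared = [(o - t) ** 2 for o, t in zip(observed, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative numbers only: two donors, each 0.1 away from its target.
print(round(rms_deviation([2.0, 2.2], [2.1, 2.1]), 3))  # 0.1
```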

res is the resolution (Å) of the structure determination.

carbi indicates bidentate carboxylate groups, e.g. ..b. indicates that the third of four donor groups appears to be a bidentate carboxylate.
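Combining carbi with cn as defined above gives the coordination number in the chemist's sense. The Python sketch below assumes that each bidentate carboxylate is marked by the letter b in the carbi string, as in the example ..b.; the function name is hypothetical.

```python
def chemical_coordination_number(cn, carbi):
    """Coordination number as a chemist would define it.

    cn counts each carboxylate as one donor group, so one is added
    for every group flagged as bidentate ('b') in carbi.
    """
    return cn + carbi.count("b")


# Four donor groups, the third a bidentate carboxylate.
print(chemical_coordination_number(4, "..b."))  # 5
```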

othdon indicates the type of other donor groups present; W is a water molecule; O, N, X indicate O, N, or other donors in non-protein (small) molecules or ions; other 'donors' may occasionally be close metal atoms.

header is the first part of the header name in the pdb file.

ecno is the E.C. enzyme number, when it is given in the PDB file.

metal is the name of the metal in the pdb file.

startaa is the name in the pdb file of the first donor amino acid.

conformation is the sequence of residue conformations through the chelate loops - see Acta Cryst. D60 (2004) XXXX for more details.

The constitution of metal coordination groups is carefully defined; an atom is defined as a donor when its distance from the metal atom is within target distance + 0.75 Å; the target distances are those derived in Harding, Acta Cryst. D57, 401-411 (2001), and the software used for the listing of the coordination groups has been developed from software described there.
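The donor criterion amounts to a simple distance test, sketched below in Python. The 0.75 Å tolerance is taken from the text; the function name, the inclusive comparison and the example distances are assumptions for illustration, and the real target distances must be taken from the 2001 paper cited above.

```python
DONOR_TOLERANCE = 0.75  # Angstrom, the margin quoted in the text


def is_donor(distance, target_distance, tolerance=DONOR_TOLERANCE):
    """True if an atom at this distance from the metal counts as a donor.

    distance and target_distance are in Angstrom; the comparison is
    taken as inclusive, which is an assumption about 'within'.
    """
    return distance <= target_distance + tolerance


# Hypothetical target distance of 2.1 A, for illustration only.
print(is_donor(2.8, 2.1))  # True
print(is_donor(2.9, 2.1))  # False
```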
