Amino acid sequences of seven subfamilies of cytochromes c (mitochondrial cytochromes c, c(1); chloroplast cytochromes c(6), c(f); bacterial cytochromes c(2), c(550), c(551); in total 164 sequences) have been compared. Despite extensive homology within eukaryotic subfamilies, homology between different subfamilies is very weak. Other than the three heme-binding residues (Cys13, Cys14, His18, in the numbering of horse cytochrome c), there are only four positions that are conserved in all subfamilies: Gly/Ala6, Phe/Tyr10, Leu/Val/Phe94 and Tyr/Trp/Phe97. In all 17 cytochromes c with known 3D structures, these residues form a network of conserved contacts (6-94, 6-97, 10-94, 10-97 and 94-97). Especially strong is the contact between the aromatic groups in positions 10 and 97, which corresponds to 13 interatomic contacts. As residues 6, 10 and residues 94, 97 are in (i, i + 4) and (i, i + 3) positions in the N and C-terminal helices, respectively, the above-mentioned system of conserved contacts consists mainly of contacts between one turn of the N-terminal helix and one turn of the C-terminal helix. The importance of the contacts between the interfaces of these helices has been confirmed by the existence of these contacts in both equilibrium and kinetic molten globule-like folding intermediates, as well as by mutational evidence that these contacts are involved in tight packing between the N and C-helices. Since these four residues are not involved in heme binding and have no other apparent functional role, their conservation in highly diverged cytochromes c suggests that they are of critical importance for protein folding. The author assumes that they are involved in a common folding nucleus of all subfamilies of c-type cytochromes. (C) 1998 Academic Press Limited.
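
The geometry of the conserved contact network can be checked with a minimal sketch. It uses only the positions and contact pairs listed in the abstract; the names `N_HELIX`, `C_HELIX`, `CONTACTS`, `helix_of` and `classify` are hypothetical and introduced here for illustration, not taken from the paper.

```python
# Conserved non-heme positions (horse cytochrome c numbering), as listed in the abstract.
N_HELIX = (6, 10)    # positions in the N-terminal helix
C_HELIX = (94, 97)   # positions in the C-terminal helix

# Conserved contacts reported in the abstract.
CONTACTS = [(6, 94), (6, 97), (10, 94), (10, 97), (94, 97)]


def helix_of(position):
    """Return 'N' or 'C' depending on which helix holds the position."""
    if position in N_HELIX:
        return "N"
    if position in C_HELIX:
        return "C"
    raise ValueError(f"position {position} is not one of the conserved residues")


def classify(contacts):
    """Split contacts into inter-helix and intra-helix pairs."""
    inter = [c for c in contacts if helix_of(c[0]) != helix_of(c[1])]
    intra = [c for c in contacts if helix_of(c[0]) == helix_of(c[1])]
    return inter, intra


inter, intra = classify(CONTACTS)

# Sequence separation within each helix: (i, i + 4) and (i, i + 3).
n_spacing = N_HELIX[1] - N_HELIX[0]
c_spacing = C_HELIX[1] - C_HELIX[0]

print("inter-helix contacts:", inter)
print("intra-helix contacts:", intra)
print("N-helix spacing:", n_spacing, "| C-helix spacing:", c_spacing)
```

Four of the five listed contacts bridge the two helices, which is the sense in which the network consists mainly of contacts between one turn of each helix.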