Discovery of Recurrent Sequence Motifs in Saccharomyces cerevisiae Cell Wall Proteins
This paper describes a procedure for discovering recurrent substrings in the amino acid sequences of proteins and applies it to fungal cell wall proteins. The evolutionary origin of fungal cell walls is an open biological question. It can be approached by studying similarity among the sequences and sub-sequences of fungal wall proteins and by comparing them with proteins in animals. We describe here how we discovered building blocks, represented as recurrent sequence motifs (sub-sequences), within fungal cell wall proteins. These motifs have not been systematically identified before, because the low Shannon entropy of cell wall sequences has hindered alignment-based searches for local sequence similarity. Our new composition-based scoring matrices for local alignment searches now support statistically valid alignments of such low-entropy sequences (Coronado et al. 2006. Euk. Cell 5: 628-637). With these matrices, we searched for similarities in a set of 171 known and putative cell wall proteins from baker's yeast, Saccharomyces cerevisiae. The aligned segments were repeatedly subdivided and catalogued, which identified 217 recurrent sequence motifs of length 8 amino acids or greater. Of these motifs, 95% occur in more than one cell wall protein. The median motif length is 22 amino acid residues, considerably shorter than protein domains. For many cell wall proteins, the motifs collectively account for more than half of the amino acids. The prevalence of these motifs supports the view that fungal cell wall proteins are assemblies of recurrent building blocks.
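To illustrate what low Shannon entropy means for a protein sequence, the following minimal sketch computes the per-residue entropy (in bits) from the amino acid composition. It is not the procedure used in this work; the function name `shannon_entropy` and both example sequences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Per-residue Shannon entropy (bits) of the residue composition."""
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    # H = -sum(p_i * log2(p_i)) over the residues present in the sequence
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sequences, not taken from any real protein:
# a repetitive stretch built from two residues, and a stretch
# in which 16 different residues each occur once.
low_complexity = "SSTTSSTTSSTTSSTT"
diverse = "ACDEFGHIKLMNPQRS"

print(shannon_entropy(low_complexity))
print(shannon_entropy(diverse))
```

A sequence dominated by a few residue types yields a much lower value than one that uses many residue types evenly. The maximum for the 20 standard amino acids is log2(20), about 4.32 bits.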