posted on 2006-09-01, 00:00 authored by Rose Hoberman, Roni Rosenfeld, Judith Klein-Seetharaman
This work attempts to understand and explain positional selection pressure in terms of underlying physical and
chemical properties. We propose a set of constraining assumptions about how these pressures behave, then describe
a procedure for analyzing and explaining the distribution of residues at a particular position in a multiple sequence
alignment. In contrast to previous approaches, our model takes into account both amino acid frequencies and a large
number of physical-chemical properties. By analyzing each property separately, we are able to identify positions
where an unusual conservation pattern is present. In addition, our model can easily incorporate sequence weights that
adjust for bias in the sample sequences. Finally, we provide a measure of statistical significance for our conservation
measure. We demonstrate the applicability of our method on two HIV-1 proteins: Nef and Env. The data
and results presented in this paper are available at http://flan.blm.cs.cmu.edu
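
The abstract does not give the model itself, so the following is only a minimal illustrative sketch of the general idea of property-based conservation, not the authors' method. It assumes (hypothetically) that conservation of one property at one alignment column is scored by the weighted variance of that property, and that significance is estimated by comparing against columns sampled from a background residue distribution. The function names, the uniform background, and the choice of the Kyte-Doolittle hydropathy scale as the example property are all assumptions made for illustration.

```python
import random

# Kyte-Doolittle hydropathy scale, used here as one example property.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def weighted_variance(values, weights):
    """Variance of `values` where each value counts with its sequence weight."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total


def property_conservation(column, weights, scale, background,
                          n_samples=2000, seed=0):
    """Score one alignment column for conservation of one property.

    Returns (observed_variance, empirical_p_value). The p-value is the
    fraction of background-sampled columns whose property variance is at
    least as small as the observed one (hypothetical null model).
    """
    rng = random.Random(seed)
    observed = weighted_variance([scale[r] for r in column], weights)
    residues = list(background)
    probs = [background[r] for r in residues]
    hits = 0
    for _ in range(n_samples):
        sample = rng.choices(residues, weights=probs, k=len(column))
        null_var = weighted_variance([scale[r] for r in sample], weights)
        if null_var <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_samples + 1)


# Example: a column of mixed but uniformly hydrophobic residues.
uniform_background = {r: 1 / 20 for r in KD_HYDROPATHY}
column = list("IIVLLIVVLI")
weights = [1.0] * len(column)  # equal sequence weights (assumption)
variance, p_value = property_conservation(
    column, weights, KD_HYDROPATHY, uniform_background
)
```

In this toy column the residue identity varies (I, V, L) while hydropathy stays nearly constant, which is the kind of pattern a frequency-only measure would understate. Non-uniform sequence weights can be passed in place of the equal weights to down-weight over-represented sequences.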