posted on 2008-06-01, 00:00 authored by K. Arun, Christopher J. Langmead
Protein nuclear magnetic resonance (NMR) chemical shifts are among the most accurately measurable
spectroscopic parameters and are closely correlated with protein structure because of their dependence
on the local electronic environment. The precise nature of this correlation remains largely
unknown. Accurate prediction of chemical shifts from the atomic coordinates of existing structures will
permit close study of this relationship. This paper presents a novel nonlinear regression-based approach
to chemical shift prediction from protein structure. The regression model combines
quantum, classical, and empirical variables and provides a statistically significant improvement in
prediction accuracy over existing chemical shift predictors across protein backbone atom types.
The results presented here were obtained using the Random Forest regression algorithm on a protein
entry data set derived from the RefDB re-referenced chemical shift database.
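
To make the regression setting concrete, the following is a minimal sketch of Random Forest regression in plain Python. It is an illustration only, not the model used in the paper: the three input features, the synthetic target, the tree depth, the number of trees, and all function names (`make_toy_data`, `build_tree`, `fit_forest`, `predict_forest`, `rmse`) are assumptions made for this example, and no real chemical shift data is involved.

```python
import math
import random
from statistics import mean

def make_toy_data(n=120, seed=1):
    """Synthetic stand-in data: 3 hypothetical features and a nonlinear target."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [math.sin(3 * a) + b * b + 0.5 * c + rng.gauss(0, 0.05) for a, b, c in X]
    return X, y

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, max_depth=4, min_leaf=3):
    """Grow one regression tree, trying a random feature subset at each split."""
    if max_depth == 0 or len(y) < 2 * min_leaf:
        return mean(y)  # leaf: predict the mean target of its samples
    n_features = len(X[0])
    k = max(1, n_features // 3)  # assumed subset size; a common heuristic
    best = None
    for f in rng.sample(range(n_features), k):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:
        return mean(y)
    _, f, t, left, right = best
    return (
        f,
        t,
        build_tree([X[i] for i in left], [y[i] for i in left], rng, max_depth - 1, min_leaf),
        build_tree([X[i] for i in right], [y[i] for i in right], rng, max_depth - 1, min_leaf),
    )

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=20, seed=0):
    """Fit each tree on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in forest)

def rmse(pred, true):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(mean((p - t) ** 2 for p, t in zip(pred, true)))
```

In a setting like the one described above, each row of `X` would hold the descriptors computed for one atom and `y` the corresponding observed shifts; comparing `rmse` for the forest against a baseline predictor is one simple way to check for an accuracy gain.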