Abstract
Protein conformational landscapes encode the functionally relevant information needed to understand biological processes. Mapping these landscapes provides valuable insight into protein behavior and biological phenomena, and is relevant to therapeutic design. While experimental structural biology techniques (X-ray crystallography, cryo-EM, NMR) can provide high-resolution structures, they struggle to capture the full conformational landscapes of biomolecules. Molecular dynamics (MD) simulations are a powerful tool for exploring these landscapes at atomic resolution. However, inferring functionally relevant information, such as the full conformational pathway of long-timescale processes, the impact of mutations on binding, or allosteric coupling between distant residues, requires more extensive sampling than a single MD simulation may achieve. This sampling limitation can be circumvented by generating datasets of parallel molecular simulations, a powerful approach for sampling long-timescale events and studying complex biological phenomena. Here, we discuss recent advances and present a practical guide to generating massively parallel molecular dynamics datasets. We start by detailing the practical considerations that precede dataset generation, ranging from storage needs to the timescales addressed by the dataset, as well as modern simulation engines. Subsequently, we discuss how to analyze these datasets to build unified models of conformational space, including future insights that distributed simulation architectures will make possible.
Supplementary weblinks
Title
GitHub repository with code for sample figures and toy landscapes
Description
GitHub repository with sample figures and toy landscapes for Figures 1 and 3. This repository may also be updated with sample code in the future.