Abstract
Molecular dynamics (MD) is a core methodology of molecular modeling and computational design for studying the dynamics and temporal evolution of molecular systems. MD simulations have benefited particularly from the rapid growth in computational power over the past decades of computational chemistry research, being the first method to be successfully migrated to GPU infrastructure. While new-generation MD software packages can deliver simulations on an ever-increasing scale, comparatively little effort has been invested in developing post-processing methods that can keep up with the rapidly expanding volumes of data being generated. Here, we introduce a new approach for sampling frames from large MD trajectories, based on the recently introduced framework of extended similarity indices. Our approach offers a linearly scaling alternative to the traditional approach of applying a clustering algorithm, which usually scales quadratically with the number of frames. In case studies with different system sizes and simulation lengths, we observed speedups of up to two orders of magnitude compared to traditional clustering algorithms. The conformational diversity of the selected frames is also noticeably higher, which is a further advantage for certain applications, such as selecting structural ensembles for ligand docking. The method is available as open source at: https://github.com/ramirandaq/MultipleComparisons.
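To illustrate why comparing all frames at once can scale linearly while pairwise comparison scales quadratically, the following minimal sketch may help. It is not the extended similarity index implementation from the repository above. It uses a simpler, well-known identity (the mean pairwise squared Euclidean distance of a set can be computed from column sums alone) as a stand-in for the general idea. The function names and the random test data are hypothetical, and each frame is assumed to be already flattened into a row of coordinates.

```python
import numpy as np


def mean_pairwise_sq_dist_quadratic(X):
    """O(N^2) reference: mean squared Euclidean distance over all ordered frame pairs."""
    n = X.shape[0]
    total = 0.0
    for i in range(n):
        diff = X - X[i]  # differences between frame i and every frame
        total += np.sum(diff * diff)
    return total / (n * n)


def mean_pairwise_sq_dist_linear(X):
    """O(N) equivalent: needs only per-column sums and per-column sums of squares."""
    n = X.shape[0]
    col_sum = X.sum(axis=0)           # one pass over the frames
    col_sq_sum = (X * X).sum(axis=0)  # one pass over the frames
    # Identity: mean_{i,j} |x_i - x_j|^2 = 2 * sum_k (E[x_k^2] - E[x_k]^2)
    return float(np.sum(2.0 * (col_sq_sum / n - (col_sum / n) ** 2)))


if __name__ == "__main__":
    # Hypothetical data: 200 frames, each flattened to 30 coordinates.
    frames = np.random.default_rng(0).random((200, 30))
    print(mean_pairwise_sq_dist_quadratic(frames))
    print(mean_pairwise_sq_dist_linear(frames))
```

Both functions return the same value up to floating-point error, but the second touches each frame only once. The cost of the first grows with the square of the number of frames, which is the kind of gap the abstract describes.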
Supplementary materials
Supplementary Information
Supplementary weblinks
MultipleComparisons repo: Repository for the calculation of extended similarity indices and diversity pickers