Abstract
Polymer and chemically modified biopolymer systems present unique challenges to traditional molecular simulation preparation workflows. First, typical polymer and biomolecular input formats, such as Protein Data Bank (PDB) files, lack chemical information, such as bond orders, that is needed to parameterize new chemistries. Second, polymers are typically too large for accurate partial charge generation methods to be applied to the whole molecule. In this work, we use direct chemical perception through the Open Force Field toolkit to build a flexible simulation workflow for organic polymers, from biopolymers to soft materials. We propose and test a new input specification for monomer information that, together with a 3D conformational geometry, allows most soft material systems to be parameterized and simulated within the same workflow used for smaller ligands. The monomer format uses a subset of the SMIRKS substructure query language to uniquely identify chemical information and repeating charges in under-specified systems by matching atomic connectivity. This workflow is combined with several approaches for automatic partial charge generation for larger systems. As an initial proof of concept, we parameterized a diverse set of polymeric systems with the Open Force Field toolkit, including functionalized proteins, DNA, homopolymers, cross-linked systems, and sugars. Additionally, we computed selected properties of PEG, PNIPAAm, and PAAm homopolymers to demonstrate a start-to-finish workflow for high-throughput simulation and property prediction. We expect this work to expedite day-to-day computational research on soft matter and to provide a robust atomic-scale polymer specification that can be used in conjunction with existing polymer structural notations.
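
The connectivity-matching idea summarized above can be illustrated with a small, self-contained sketch. It is not the Open Force Field toolkit API and not the monomer format proposed in this work. It is a plain-Python toy in which a hypothetical repeat-unit template (element symbols, bonds, and made-up partial charges) is matched onto an under-specified polymer graph that carries only elements and connectivity. All function names, the O-C-C heavy-atom template, and the charge values are assumptions chosen for illustration.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Bond = Tuple[int, int]


def adjacency(n_atoms: int, bonds: Sequence[Bond]) -> List[set]:
    """Build a neighbor set for every atom index."""
    adj = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_matches(t_elems, t_bonds, p_elems, p_bonds) -> Iterator[Tuple[int, ...]]:
    """Yield tuples m where m[i] is the polymer atom matched to template atom i.

    A match must preserve element symbols and every template bond.
    """
    t_adj = adjacency(len(t_elems), t_bonds)
    p_adj = adjacency(len(p_elems), p_bonds)

    def extend(mapping: List[int]) -> Iterator[Tuple[int, ...]]:
        k = len(mapping)  # index of the next template atom to place
        if k == len(t_elems):
            yield tuple(mapping)
            return
        for cand in range(len(p_elems)):
            if cand in mapping or p_elems[cand] != t_elems[k]:
                continue
            # Template neighbors placed so far must also be bonded in the polymer.
            if all(mapping[j] in p_adj[cand] for j in t_adj[k] if j < k):
                yield from extend(mapping + [cand])

    yield from extend([])


def assign_charges(t_elems, t_bonds, t_charges, p_elems, p_bonds) -> Dict[int, float]:
    """Greedily tile the polymer with non-overlapping template matches."""
    charges: Dict[int, float] = {}
    for match in find_matches(t_elems, t_bonds, p_elems, p_bonds):
        if any(atom in charges for atom in match):
            continue  # skip matches that overlap an already-labeled repeat unit
        for t_idx, p_idx in enumerate(match):
            charges[p_idx] = t_charges[t_idx]
    return charges


# Hypothetical heavy-atom repeat unit O-C-C with made-up charges summing to zero.
monomer_elems = ["O", "C", "C"]
monomer_bonds = [(0, 1), (1, 2)]
monomer_charges = [-0.4, 0.2, 0.2]

# Under-specified trimer: elements and connectivity only, as in a PDB-like input.
polymer_elems = ["O", "C", "C"] * 3
polymer_bonds = [(i, i + 1) for i in range(len(polymer_elems) - 1)]

charges = assign_charges(monomer_elems, monomer_bonds, monomer_charges,
                         polymer_elems, polymer_bonds)
print([charges[i] for i in range(len(polymer_elems))])
```

In this toy the template matches the chain in five ways, because O-C-C can also be read backwards across a repeat-unit boundary. The greedy tiling keeps only the first non-overlapping matches, which is sufficient here but is not a general solution. This ambiguity is the kind of problem that a more expressive query language such as SMIRKS, with explicit connectivity descriptors, is meant to resolve.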