Abstract
Efficient machine-learning models for predicting specific properties are of great importance for innovation in chemistry and materials science. However, extending predictions of global electronic-structure properties, such as the energies of the frontier molecular orbitals (HOMO and LUMO) and the HOMO-LUMO gap, from data on small molecules to larger molecules remains a challenge. Here we develop a multi-level attention neural network, named DeepMoleNet, that fuses chemically interpretable insights into multi-task learning by (1) weighting the contributions from different atoms and (2) taking the atom-centered symmetry functions (ACSFs) as the teacher descriptor. The network efficiently predicts 12 properties, including the dipole moment, HOMO energy, and Gibbs free energy, within chemical accuracy on multiple benchmarks at both equilibrium and non-equilibrium geometries, including up to 110,000 records in QM9, 400,000 records in MD17, and 280,000 records in ANI-1ccx for random-split evaluation. Good transferability to larger molecules outside the training set is demonstrated on both the equilibrium QM9 and Alchemy datasets at the density functional theory (DFT) level. Additional tests on non-equilibrium molecular conformations from the DFT-based MD17 dataset and the ANI-1ccx dataset with coupled-cluster accuracy, as well as on public test sets of singlet fission molecules, biomolecules, long oligomers, and protein with up to 140 atoms, show reasonable predictions of thermodynamic and electronic-structure properties. The proposed multi-level attention neural network is applicable to high-throughput screening of numerous chemical species in both equilibrium and non-equilibrium molecular spaces, to accelerate the rational design of drug-like molecules, material candidates, and chemical reactions.
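
As a rough illustration of the kind of descriptor named in point (2), the sketch below computes a radial atom-centered symmetry function in the widely used Behler-Parrinello G2 form, a sum over neighbors of a Gaussian in the interatomic distance multiplied by a cosine cutoff. It is not taken from DeepMoleNet: the function names, the parameter values (eta, r_s, r_c), and the water-like geometry are assumptions chosen only for illustration.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def radial_acsf(coords, i, eta=1.0, r_s=0.0, r_c=6.0):
    """Radial (G2-type) symmetry function of atom i.

    coords is a list of (x, y, z) tuples; eta, r_s, and r_c are
    illustrative defaults, not values from the paper.
    """
    total = 0.0
    for j, pos in enumerate(coords):
        if j == i:
            continue  # an atom is not its own neighbor
        r_ij = math.dist(coords[i], pos)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return total


# Hypothetical water-like geometry in angstroms (O, H, H).
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
descriptor = [radial_acsf(water, i) for i in range(len(water))]
```

Because the function depends only on interatomic distances, the resulting per-atom values are unchanged when the molecule is translated or rotated or when neighbors are reordered, which is what makes descriptors of this type a plausible auxiliary target in a multi-task setting.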