Abstract
This work presents a state-of-the-art hybrid kernel for molecular property prediction. The hybrid kernel consists of a marginalized graph kernel that operates on molecular graphs and radial basis function (RBF) kernels that operate on global molecular features. A directed message passing neural network (D-MPNN) with global molecular features serves as a strong baseline. After tuning the hyperparameters with Bayesian optimization, we benchmark the models on 11 publicly available data sets. Our results show that the predictions of the graph kernel model are correlated with those of D-MPNN, which suggests that the molecular representation learned by D-MPNN is close to the reproducing kernel Hilbert space generated by the hybrid kernel. These results may provide clues for research on the interpretability of graph neural networks. In addition, ensembling the graph kernel models with D-MPNN yields the best performance. The advantage of D-MPNN lies in its computational efficiency, whereas the advantage of the graph kernel model lies in the inherent uncertainty quantification of Gaussian process regression. All code for the graph kernel machines used in this work is available at https://github.com/Xiangyan93/Chem-Graph-Kernel-Machine.
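As a rough illustration of the idea summarized above, the following minimal sketch combines a graph-level kernel matrix with an RBF kernel on global features and feeds the result to Gaussian process regression, which returns a predictive standard deviation alongside each prediction. Several things here are assumptions made only for illustration and are not taken from this work: the two kernels are combined by an elementwise product, a cosine similarity between random vectors stands in for the marginalized graph kernel, and the function names (`rbf_kernel`, `hybrid_kernel`, `gpr_predict`), the data, and the hyperparameter values are all hypothetical. It does not use the API of the linked repository.

```python
import numpy as np


def rbf_kernel(X, Y, length_scale=1.0):
    """RBF kernel between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def hybrid_kernel(K_graph, X, Y, length_scale=1.0):
    """Assumed combination: elementwise product of a precomputed graph
    kernel matrix and an RBF kernel on global features. A product of two
    positive semidefinite kernels is again positive semidefinite."""
    return K_graph * rbf_kernel(X, Y, length_scale)


def gpr_predict(K_train, K_cross, K_test_diag, y_train, noise=1e-2):
    """Gaussian process posterior mean and standard deviation computed
    from precomputed kernel matrices."""
    n = K_train.shape[0]
    L = np.linalg.cholesky(K_train + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_cross @ alpha
    v = np.linalg.solve(L, K_cross.T)
    var = K_test_diag - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Toy data (hypothetical): 6 "molecules", 4 for training and 2 for testing.
rng = np.random.default_rng(0)
G = rng.normal(size=(6, 4))                    # stand-in graph descriptors
G /= np.linalg.norm(G, axis=1, keepdims=True)
F = rng.normal(size=(6, 3))                    # global molecular features
y = rng.normal(size=6)                         # target property

K_graph_full = G @ G.T                         # placeholder, NOT a marginalized graph kernel
K_full = hybrid_kernel(K_graph_full, F, F)

train, test = slice(0, 4), slice(4, 6)
mean, std = gpr_predict(
    K_full[train, train],
    K_full[test, train],
    np.diag(K_full)[test],
    y[train],
)
print("predictions:", mean)
print("predictive std:", std)
```

In a real setting, `K_graph_full` would be replaced by a kernel matrix computed on molecular graphs, and the length scale and noise level would be tuned rather than fixed.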