Abstract
In structure-based virtual screening (SBVS), it is critical that scoring functions capture protein-ligand atomic interactions. Focusing on the local domains of ligand-binding pockets, we developed a standardized pocket Pfam-based clustering (Pfam-cluster) approach to assess the cross-target generalization ability of machine-learning scoring functions (MLSFs). We then evaluated 11 typical MLSFs using random cross-validation (Random-CV), protein sequence similarity-based cross-validation (Seq-CV), and pocket Pfam-based cross-validation (Pfam-CV). Surprisingly, the performance of all tested models decreased from Random-CV to Seq-CV to Pfam-CV, and none showed satisfactory generalization capacity. Our interpretability analysis suggested that MLSF predictions on novel targets depended on buried solvent-accessible surface area (SASA)-related features of the complex structures, with larger binding affinities predicted for complexes with larger protein-ligand interfaces. By combining buried SASA-related features with target-specific patterns shared only among structurally similar compounds in the same cluster, random forest (RF)-Score attained good performance in the Random-CV test. Based on these findings, we strongly advise assessing the generalization ability of MLSFs with the Pfam-cluster approach and treating the features learned by MLSFs with caution.
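The difference between Random-CV and a cluster-based scheme such as Seq-CV or Pfam-CV can be illustrated with a minimal sketch. It assumes each complex already carries a cluster label and assigns whole clusters to folds, so that no cluster is split between training and test sets. The function names, the greedy size-balancing rule, and the toy labels are illustrative assumptions; this is not the authors' released implementation.

```python
from collections import defaultdict


def grouped_k_fold(cluster_labels, k=3):
    """Assign whole clusters to k folds, balancing fold sizes greedily.

    cluster_labels: one label per complex (hypothetical cluster IDs).
    Returns a list of k lists of sample indices.
    """
    members = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        members[label].append(idx)

    folds = [[] for _ in range(k)]
    # Place the largest clusters first; each goes to the currently smallest fold.
    ordered = sorted(members, key=lambda c: (-len(members[c]), str(c)))
    for label in ordered:
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(members[label])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


# Toy data: eight complexes in four hypothetical pocket clusters.
labels = ["cluster_A", "cluster_A", "cluster_A", "cluster_B",
          "cluster_B", "cluster_C", "cluster_C", "cluster_D"]
folds = grouped_k_fold(labels, k=3)
for train_idx, test_idx in train_test_splits(folds):
    held_out = sorted({labels[i] for i in test_idx})
    print(f"test clusters: {held_out}, n_train={len(train_idx)}")
```

In a Random-CV split the indices would be shuffled without regard to the labels, so complexes from the same cluster could appear on both sides of a split. Keeping clusters intact is what makes the held-out fold a test of cross-target generalization.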
Supplementary materials
Title
supplementary figures
Description
supplementary figures
Title
supplementary tables
Description
supplementary tables
Supplementary weblinks
Title
scripts of MLSF generalization ability benchmark
Description
The complete Pfam-cluster approach, the 3-fold dataset split, and the SHAP analysis workflow are available at https://github.com/hnlab/generalization_benchmark. All other data are available upon request.