Abstract
The identification of functional enzymes that catalyze specific biochemical reactions is a major bottleneck in the de novo design of biosynthesis and biodegradation pathways. Conventional methods based on microbial screening and functional metagenomics require long verification periods and incur high experimental costs, whereas recent data-driven methods are applicable only to a few common substrates. To enable rapid, high-throughput identification of enzymes for complex and less-studied substrates, we propose a robust enzyme promiscuity prediction model based on positive-unlabeled learning, which shortens the time needed for new enzyme discovery from several years to 29 days. Using this model, we identified 15 new degrading enzymes specific for the mycotoxins ochratoxin A and zearalenone, six of which degraded more than 90% of the mycotoxin content within 3 h. We anticipate that this model will become indispensable for the identification of new functional enzymes, thereby advancing the fields of synthetic biology, metabolic engineering, and pollutant biodegradation.