CatEmbed: A Machine-Learned Representation Obtained via Categorical Entity Embedding for Predicting Adsorption and Reaction Energies on Bimetallic Alloy Surfaces

22 May 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine-learning models for predicting adsorption energies on metallic surfaces often rely on basic elemental properties and on electronic and geometric descriptors. Here, we apply categorical entity embedding, a featurization method inspired by natural language processing techniques, to predict adsorption energies on bimetallic alloy surfaces using categorical descriptors. Using this method, we develop a machine-learned representation from categorical descriptors (e.g., surface composition, adsorbate type, and site type) of the slab/adsorbate complex. By combining this representation with numerical features (e.g., slab metal stoichiometric ratios), we create the CatEmbed representation. Remarkably, decision tree models trained using CatEmbed, which includes no explicit geometric information, achieve a mean absolute error (MAE) of 0.12 eV. Additionally, we extend this technique to predict reaction energies on bimetallic surfaces, creating the CatEmbed-React representation, which achieves an MAE of 0.08 eV. These findings highlight the effectiveness of categorical entity embedding for predicting adsorption and reaction energies on bimetallic alloy surfaces.
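
To make the featurization idea in the abstract concrete, the following is a minimal sketch of the lookup-and-concatenate step only, not the authors' implementation. Each categorical descriptor is mapped to a row of a small embedding table, and the rows are concatenated with a numerical feature. The toy records, the alloy and site labels, the embedding dimension, and the function names `build_vocab` and `featurize` are all assumptions made for illustration. The tables here are randomly initialised; in an entity-embedding network they would be learned while fitting the target energy.

```python
import numpy as np

# Hypothetical toy records (illustrative labels, not data from the paper):
# (surface composition, adsorbate type, site type, stoichiometric fraction of metal A)
records = [
    ("Pt3Ni", "CO", "top", 0.75),
    ("Pt3Ni", "H", "hollow", 0.75),
    ("CuPd", "CO", "bridge", 0.50),
]

def build_vocab(values):
    """Map each distinct category to an integer index."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

def featurize(records, emb_dim=2, seed=0):
    """Concatenate per-column embedding rows with the numerical feature."""
    rng = np.random.default_rng(seed)
    columns = list(zip(*records))
    cat_columns = columns[:3]
    numeric = np.asarray(columns[3], dtype=float)[:, None]
    blocks = []
    for col in cat_columns:
        vocab = build_vocab(col)
        # One row per category. Randomly initialised here (assumption);
        # a trained network would supply learned rows instead.
        table = rng.normal(scale=0.1, size=(len(vocab), emb_dim))
        idx = np.array([vocab[v] for v in col])
        blocks.append(table[idx])
    return np.hstack(blocks + [numeric])

X = featurize(records)
print(X.shape)  # (3, 7): three categorical columns x 2 dims, plus 1 numerical column
```

Records that share a category share the corresponding embedding columns, so the first two rows above have identical surface-composition entries. A matrix of this form could then be passed to a tree-based regressor, which is the kind of downstream model the abstract describes.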

Keywords

computational catalysis
catalysis screening
natural language processing
machine learning
feature engineering
adsorption energies

Supplementary materials

Title: Supporting Information
Description: Summary of each feature representation presented in the main text, details on model and feature selection, performance of each model used to calculate the average MAEs presented in the text, and a comparison of entity embedding network vs. CatBoost model performance.
