Outlier-Based Domain of Applicability Identification for Materials Property Prediction Models

Gihan Panapitiya; Emily Saldanha

doi:10.26434/chemrxiv-2023-pmrfw

Materials Science

01 February 2023, Version 1

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine learning models have been widely applied to materials property prediction. However, their practical use can be hindered by a lack of information about how well they will perform on previously unseen types of materials. Because model predictions depend on the quality of the available training data, such models predict different domains of the materials feature space with different levels of accuracy. Identifying these domains makes it possible to assign a confidence level to each prediction, to decide when and how the model should be employed given the accuracy requirements of different tasks, and to improve the model in domains with high errors. In this work, we propose a method to find domains of applicability using a large feature space, and we introduce analysis techniques that give more insight into the detected domains and subdomains.
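The abstract does not describe the detection procedure itself. As a rough illustration of the general idea named in the title, the sketch below flags query points as outside a domain of applicability when they are unusually isolated from the training data. It uses a k-nearest-neighbour distance score, which is an assumption chosen for simplicity and not necessarily the detector used in the paper; the function names, the 0.95 quantile cutoff, and the toy feature values are all hypothetical.

```python
import math


def knn_score(point, reference, k=3, skip_self=False):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    dists = sorted(math.dist(point, ref) for ref in reference)
    if skip_self:
        dists = dists[1:]  # drop the zero distance of a training point to itself
    return sum(dists[:k]) / k


def fit_threshold(train, k=3, quantile=0.95):
    """Score each training point against the others; use a high quantile as the cutoff."""
    scores = sorted(knn_score(p, train, k, skip_self=True) for p in train)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def in_domain(point, train, threshold, k=3):
    """Treat a query as inside the domain if it is no more isolated than the cutoff."""
    return knn_score(point, train, k) <= threshold


# Toy 2-D feature vectors standing in for molecular descriptors (made-up values).
train_features = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
threshold = fit_threshold(train_features)

near_query = (0.22, 0.18)  # lies inside the training cloud
far_query = (3.0, 3.0)     # lies far from every training point

near_ok = in_domain(near_query, train_features, threshold)
far_ok = in_domain(far_query, train_features, threshold)
print(near_ok, far_ok)
```

In a real workflow one would then compare the property model's errors on in-domain and out-of-domain points to check whether the flag actually tracks prediction accuracy.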

Keywords

Domain of Applicability

Machine Learning

Outlier Detection

Supplementary materials

Title

Supporting Information: Outlier-Based Domain of Applicability Identification for Materials Property Prediction Models

Description

Data preparation details, KMeans clustering, unsupervised anomaly detection, complexities of the PNNL, Cui, and Delaney datasets, and sample molecules from subdomains.

Supplementary weblinks

Title

Code for Domain of Applicability analysis.

Description

This repository contains code and the example scripts required to perform Domain of Applicability analysis.

Version History

Feb 03, 2023 Version 2

Feb 01, 2023 Version 1

License

The content is available under CC BY NC ND 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
