TransPTM: a Transformer-Based Model for Non-Histone Acetylation Site Prediction

04 October 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Protein acetylation is one of the most extensively studied post-translational modifications (PTMs) owing to its significant roles across a myriad of biological processes. Although many computational tools for acetylation site identification have been developed, benchmark datasets and bespoke predictors for non-histone acetylation site prediction are lacking. To address these problems, we contribute both a dataset and a predictor benchmark in this study. First, we construct a non-histone acetylation site benchmark dataset, NHAC, which comprises 11 subsets organized by sequence length, ranging from 11 to 61 amino acids. Each sequence-length subset contains 886 positive samples and 4707 negative samples. Second, we propose a transformer-based neural network model, TransPTM, for non-histone acetylation site prediction. Our model uses a pre-trained protein language model, ProtT5, to construct the site's feature space. The GNN framework consists of three TransformerConv layers for feature extraction and a multilayer perceptron (MLP) module for classification. In experiments, TransPTM achieves competitive performance on non-histone acetylation site prediction compared with three state-of-the-art tools. It improves our understanding of PTM mechanisms and provides a theoretical basis for developing drug targets for diseases. Moreover, the created PTM dataset fills the gap in non-histone acetylation site datasets and benefits the related communities. The source code and data used by TransPTM are accessible at https://www.github.com/TransPTM.
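As an illustration of how length-specific subsets of this kind could be derived, the following minimal sketch extracts site-centered sequence windows. It rests on several assumptions that the abstract does not confirm: that the 11 lengths are evenly spaced between 11 and 61 (giving a step of 5), that each candidate site is a lysine (K) placed at the window center, and that windows running past a sequence end are padded with "X". The function names `window_lengths` and `extract_window` are hypothetical and do not come from the TransPTM code.

```python
def window_lengths(start=11, stop=61, count=11):
    """Return `count` evenly spaced window lengths from `start` to `stop`.

    Assumption: even spacing; the abstract only gives the range and the count.
    """
    step = (stop - start) // (count - 1)
    return [start + i * step for i in range(count)]


def extract_window(sequence, site_index, length, pad="X"):
    """Return a window of odd `length` centered on the residue at `site_index`.

    Assumption: the site is a lysine (K) and short flanks are padded with `pad`.
    """
    if length % 2 == 0:
        raise ValueError("window length must be odd so the site is centered")
    if sequence[site_index] != "K":
        raise ValueError("candidate site is expected to be a lysine (K)")
    half = length // 2
    # Pad both ends so that sites near a terminus still yield full windows.
    padded = pad * half + sequence + pad * half
    # The site sits at site_index + half in the padded string,
    # so the window starts at site_index.
    return padded[site_index:site_index + length]


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQR"  # toy sequence, not taken from NHAC
    for n in window_lengths():
        print(n, extract_window(toy_sequence, 1, n))
```

Padding keeps every window the same length within a subset, so each subset can be fed to a fixed-size feature extractor; whether the NHAC dataset handles terminal sites this way is not stated in the abstract.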

Keywords

Non-histone acetylation
Deep learning
Transformer
Protein language model
