NETWORK ANALYSIS OF THE ORGANIC CHEMISTRY IN PATENTS, LITERATURE, AND PHARMACEUTICAL INDUSTRY

03 October 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Chemical reactions can be connected in large networks such as knowledge graphs. Prior work has used such networks to draw meaningful conclusions about the structure and properties of the organic chemistry they contain. However, that research has focused on public sources of organic chemistry, which might lack the intricate details of the synthesis routes used in in-house drug discovery. In this work, we expand on previous analyses to also include an in-house electronic lab notebook (ELN), so that important differences between the network architectures can be investigated. Three chemical reaction knowledge graphs were constructed, from the US Patent and Trademark Office (USPTO), Reaxys, and an in-house ELN, respectively, and then compared. We found that the Reaxys knowledge graph is the most interconnected, whereas the USPTO and ELN knowledge graphs appear to be organized around a few central nodes. These differences might be attributed to the different origins of the data in the three sources.
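
To make the idea of a reaction network concrete, the following is a minimal sketch, not the construction used in this work. It assumes a bipartite layout in which molecule nodes and reaction nodes are linked as reactant -> reaction -> product. The toy reactions, the function names, and the `top_share` measure are all hypothetical and chosen only to illustrate how a graph arranged around a few central nodes would differ from a more evenly interconnected one.

```python
from collections import defaultdict


def build_reaction_graph(reactions):
    """Build a directed bipartite graph: reactant -> reaction -> product.

    `reactions` maps a reaction id to a (reactants, products) pair.
    This layout is an assumption made for illustration only.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for rxn_id, (reactants, products) in reactions.items():
        rxn_node = ("reaction", rxn_id)
        for mol in reactants:
            mol_node = ("molecule", mol)
            successors[mol_node].add(rxn_node)
            predecessors[rxn_node].add(mol_node)
        for mol in products:
            mol_node = ("molecule", mol)
            successors[rxn_node].add(mol_node)
            predecessors[mol_node].add(rxn_node)
    return successors, predecessors


def molecule_degrees(successors, predecessors):
    """Count, for each molecule, the reactions it takes part in (in + out)."""
    nodes = set(successors) | set(predecessors)
    return {
        node[1]: len(successors.get(node, ())) + len(predecessors.get(node, ()))
        for node in nodes
        if node[0] == "molecule"
    }


def top_share(degrees, k=1):
    """Fraction of all molecule-reaction links held by the k best-connected
    molecules (a hypothetical, illustrative concentration measure)."""
    total = sum(degrees.values())
    if total == 0:
        return 0.0
    return sum(sorted(degrees.values(), reverse=True)[:k]) / total


# Hypothetical toy data: molecule "A" is a reactant in every reaction.
toy_reactions = {
    "r1": (["A", "B"], ["C"]),
    "r2": (["A", "D"], ["E"]),
    "r3": (["A", "C"], ["F"]),
}

successors, predecessors = build_reaction_graph(toy_reactions)
degrees = molecule_degrees(successors, predecessors)
hub_share = top_share(degrees, k=1)
```

In this toy graph a single molecule takes part in every reaction, so a large share of the links runs through one node. A graph in which the links are spread more evenly across molecules would give a lower value for the same measure.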
