Abstract
Connecting chemistry to pharmacology (c2p) has been an objective of GtoPdb and its precursor
IUPHAR-DB since 2003. This has been achieved by populating our database with
expert-curated relationships between documents, assays, quantitative results,
chemical structures, their locations within the documents and the protein
targets in the assays (D-A-R-C-P). This perspective describes a wide range of
challenges associated with this curation, using illustrative examples from
GtoPdb entries. Our selection process begins with judgements
of pharmacological relevance and scientific quality. Even though we have a stringent focus for our
small-data extraction, we note that assessing the quality of papers has become
more difficult over the last 15 years. We discuss the ambiguities encountered when
resolving authors’ descriptions of A-R-C-P entities to standardised
identifiers. We also describe developments that have made this somewhat easier
over the same period, both in the publication ecosystem and through recent
enhancements of our internal processes.
This perspective concludes with a look at challenges for the future
including the wider capture of mechanistic nuances and possible impacts of text
mining on automated entity extraction.