Drug discovery is the process by which new candidate medications are identified. It combines medicine, biotechnology, chemistry, and pharmacology to create safe and effective new medications and treatments. Traditional approaches involve identifying medicinal properties and compounds found in plants and other traditional remedies. Developments in software, artificial intelligence, biotechnology, medicinal chemistry, and manufacturing have widened the possibilities of drug discovery beyond these traditional methods.
The drug discovery process begins with the identification of unmet medical needs through market analysis and input from patients, medical practitioners, therapeutic researchers, and scientific conferences. Existing treatment options must be sufficiently unsatisfactory to justify initiating the drug discovery process, or a novel drug must have the potential to offer substantial advantages over existing treatments, such as improved therapeutic efficacy, fewer adverse side effects, better patient compliance, fewer drug interactions, and improvements in overall patient quality of life.
After an unmet medical need is established, the drug discovery process moves on to identifying potential drug targets that address it, through techniques such as phenotypic screening, genetic association, transgenic organisms, and medical imaging. Identifying unmet medical needs is an ongoing process of understanding available therapeutic treatment options, disease etiology, and epidemiology. This continual reassessment produces an ever-changing gap analysis of the perceived value and potential of drug candidates, so that certain candidates take priority over others to better address particular medical needs.
Before clinical trials begin, a drug candidate typically needs to show evidence of favourable properties in the following areas:
- Safety and toxicity
Machine learning, artificial intelligence (AI), and other software developments have made it possible to improve the speed, cost, and overall effectiveness of the drug discovery process by enhancing pattern recognition. Thomas Chittenden, a team leader at the drug discovery company Wuxi NextCODE, commented on the role AI is playing in the drug discovery process in a 2018 interview with the scientific journal Nature:
AI is going to lead to the full understanding of human biology and give us the means to fully address human disease. The way we develop drugs and assess them in clinical trials will all come down to very sophisticated pattern recognition.
Artificial intelligence for drug discovery does face data challenges, such as the scale, growth, diversity, and uncertainty of the data. The data sets available for drug development in pharmaceutical companies can involve millions of compounds, and traditional machine learning tools might not be capable of dealing with some types of data.
Pharmaceutical companies using artificial intelligence
The company has developed an AI platform that feeds data from sources such as research papers, patents, clinical trials, and patient records. This forms a representation of more than one billion known and inferred relationships between biological entities such as genes, symptoms, diseases, proteins, tissues, species, and candidate drugs. The data can also be queried like a search engine to produce knowledge graphs of different medical conditions or possible treatments.
The biotechnology company has developed a model to identify unknown cancer mechanisms using tests on more than 1,000 cancerous and healthy human cell samples. The AI platform generates and analyses large amounts of biological and outcome data from patients to highlight key differences and identify potential treatments.
The Roche subsidiary is using an AI system from GNS Healthcare in Cambridge, Massachusetts to help drive the company's search for cancer treatments.
The company grew out of research at Stanford on molecular machine learning that produced state-of-the-art results. It combines machine learning and simulation to generate novel molecules and predict molecular properties.
The company is using IBM's Watson to power the search for immuno-oncology drugs.
Reverie combines in-house data generation with machine learning to design novel kinase inhibitors for oncology.
The company signed a deal to use Exscientia's artificial intelligence platform to hunt for metabolic-disease therapies.
The company uses AI as part of an approach to classify genes according to roles and other attributes in order to search for connections between RNA-sequence variations, expression levels, molecular functions, and gene location towards understanding both tumor growth and cardiovascular disease.
There are multiple approaches to using AI in drug discovery. All of them face similar problems, such as small training sets, experimental errors in training data, and a lack of experimental validation. To address these challenges, further AI approaches have been developed, such as deep learning (DL) and other modeling studies, which can be used to evaluate the safety and efficacy of drug molecules based on big data and modeling analysis.
Chemist Anthony Bradley at the University of Oxford has used the Diamond Light Source synchrotron to screen compounds for small chemical fragments that bind to molecular targets, even if only weakly, with the aim of improving their binding strength and producing new therapies. Bradley is also a member of a research group that uses artificial neural networks in a structure-based drug-design project with the Oxford Protein Informatics Group. The project aims to use publicly available data on the structure and chemical activity of small molecules to teach the system to identify those that act on protein drug targets.
Virtual chemical space can be pictured as a geographical map of molecules that illustrates how molecules and their properties are distributed. The idea is to collect positional information about molecules within the space in order to search for bioactive compounds; virtual screening can then help select appropriate molecules for further testing.
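One common way to make "position in chemical space" concrete is to compare molecules by the similarity of their fingerprints. The sketch below assumes each fingerprint is simply a set of on-bit indices and uses Tanimoto similarity to screen a small library against a query. The compound names, bit indices, and the 0.5 threshold are hypothetical values chosen for illustration, not data from any real screen.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def screen(query_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each set lists the structural features that are present.
query = {1, 4, 7, 9}
library = {
    "cmpd_A": {1, 4, 7, 9},   # identical to the query
    "cmpd_B": {1, 4, 7, 12},  # shares three of four features
    "cmpd_C": {2, 3, 5},      # shares no features
}

hits = screen(query, library, threshold=0.5)
for name, score in hits:
    print(name, round(score, 2))
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets, and the threshold would be tuned to the target and fingerprint type.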
Numerous in silico methods exist for virtually screening compounds from virtual chemical spaces. Together with structure-based and ligand-based approaches, they provide a profile analysis that allows faster elimination of non-lead compounds and selection of drug molecules at reduced expenditure. Drug design algorithms, such as Coulomb matrices and molecular fingerprint recognition, consider the physical, chemical, and toxicological profiles to select a lead compound. Various parameters, such as predictive models, the similarity of molecules, the molecule generation process, and the application of in silico approaches, can be used to predict the desired chemical structure of a compound.
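As an illustration of the Coulomb matrix representation mentioned above, the following sketch builds the matrix from nuclear charges and three-dimensional coordinates using the commonly cited definition: diagonal entries of 0.5 * Z_i^2.4 and off-diagonal entries of Z_i * Z_j / |R_i - R_j|. The function name and the example geometry (a hydrogen molecule with an approximate bond length of 1.4 bohr) are assumptions made for this example, and coordinates are assumed to be in one consistent length unit.

```python
import math


def coulomb_matrix(charges, coords):
    """Build the Coulomb matrix of a molecule.

    charges: nuclear charges Z_i, one per atom
    coords:  (x, y, z) positions in the same atom order
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Coulomb repulsion between nuclei i and j
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Hypothetical input: two hydrogen atoms about 1.4 bohr apart.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
for row in h2:
    print([round(value, 4) for value in row])
```

The resulting matrix is symmetric and depends only on interatomic distances, so it does not change when the molecule is translated or rotated. It does change with atom ordering, which is why practical pipelines usually sort or otherwise canonicalize it before feeding it to a model.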
One artificial intelligence system, DeepVS, was used for the docking of 40 receptors and 2,950 ligands, and showed good performance when 95,000 decoys were tested against these receptors. Another approach applied a multi-objective automated replacement algorithm to optimize the potency profile of a cyclin-dependent kinase-2 inhibitor by assessing its shape, similarity, biochemical activity, and physicochemical properties.
Quantitative structure-activity relationship (QSAR) computational models can generate predictions for large numbers of compounds or for simple physicochemical parameters, and QSAR modeling tools have been used to identify potential drug candidates. In 2012, Merck supported a QSAR ML challenge to observe the advantages of DL in the drug discovery process. DL models showed significant predictivity compared with traditional ML approaches for 15 absorption, distribution, metabolism, excretion, and toxicity data sets of drug candidates. AI-based approaches such as linear discriminant analysis (LDA), support vector machines (SVMs), random forest (RF), and decision trees can be applied to speed up QSAR analysis. When six AI algorithms were used to rank anonymous compounds by biological activity, the difference from the rankings produced by traditional approaches was statistically negligible.
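To show the basic idea behind a QSAR model, the sketch below fits the simplest possible case: an ordinary least-squares line relating one molecular descriptor to measured activity. This is a deliberately reduced stand-in, not one of the methods listed above (LDA, SVMs, random forests, decision trees, or DL), and the descriptor and activity values are hypothetical numbers invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict activity for a descriptor value using a fitted (slope, intercept)."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training set: one descriptor (for example logP) per compound
# and a measured activity (for example pIC50).
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

model = fit_line(descriptor, activity)
print("slope and intercept:", model)
print("predicted activity at descriptor 5.0:", predict(model, 5.0))
```

Real QSAR models use many descriptors, held-out validation sets, and nonlinear learners, but the workflow is the same: fit on compounds with known activity, then predict for untested compounds.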
Many terms have been used since the introduction of algorithms for drug discovery to describe different technologies and approaches. These have included:
- Computer-aided drug design (CADD)
- Structure-activity relationship (SAR) analysis
- Chemoinformatics (which extends SAR analysis to large compound sets, using different methods, and across different bioactivity classes)
- Machine learning and artificial intelligence
Despite the different terminology, what matters for each algorithmic approach is the data being analyzed and the methods used to analyze it. In the AI context, attention in drug discovery tends to focus on methodological and computational aspects, but it has been argued that the question of what data is being used to achieve a goal (and whether that data allows the question to be answered) should be considered before deciding which algorithmic or computational method to apply.
There is a belief that AI will allow researchers and pharmaceutical companies to pinpoint previously unknown causes of disease and accelerate the trend towards treatments designed for patients with specific biological profiles. Another belief is that the proliferation of AI will change academic training for researchers: specifically, that they will study a broader curriculum with less focus on particular gene mutations or other specializations, broadening their core areas of expertise and their general knowledge and understanding, while computational models such as AI remove the need for specialized expertise.
Clustered regularly interspaced short palindromic repeats (CRISPR) technology gives drug discovery researchers the ability to alter the sequences and expression levels of genes to help them identify therapeutic drug targets and test the therapeutic efficacy of drug candidates. CRISPR has become more useful in the drug discovery process as the cost of using CRISPR and DNA sequencing technologies has decreased due to technological improvements and economies of scale. CRISPR can play a role in several stages of the drug discovery process as outlined in the following diagram:
CRISPR is used systematically during the target identification and validation process to knock out, inhibit, and alter the expression levels of certain genes. It helps researchers understand how genes and their expression can improve or worsen disease states, and helps them better predict whether drug candidates may produce beneficial or harmful patient outcomes. Both in vivo and in vitro studies use CRISPR technology to facilitate gene knockout and expression level studies in the drug discovery process.