Drug discovery

Drug discovery is the process by which new candidate medications are discovered. It combines the fields of medicine, biotechnology, chemistry, and pharmacology to create new safe and effective medications and treatments. Traditional approaches to drug discovery involve identifying medicinal properties and compounds found in plants and other traditional medical remedies. Advances in software, artificial intelligence, biotechnology, medicinal chemistry, and manufacturing have widened the possibilities of drug discovery beyond these traditional methods.

Overview of the drug discovery process
Identification of an unmet medical need

The drug discovery process begins with the identification of unmet medical needs through market analysis and input from patients, medical practitioners, therapeutic researchers, and scientific conferences. To justify initiating the drug discovery process, either the existing treatment options for the medical need must be unsatisfactory, or a novel drug must have the potential to offer substantial advantages over existing treatments, such as improved therapeutic efficiency, fewer adverse side effects, better patient compliance, fewer drug interactions, and improvements in overall patient quality of life.

After an unmet medical need is established, the drug discovery process moves on to identifying potential drug targets that could address it, through techniques such as phenotypic screening, genetic association, transgenic organisms, and medical imaging. Identifying unmet medical needs is an ongoing process of understanding available therapeutic treatment options, disease etiology, and epidemiology. This continual reassessment produces an ever-changing gap analysis of the perceived value and potential of drug candidates, which leads to certain candidates taking priority over others to better address particular medical needs.

Selecting a clinical candidate

A drug candidate for clinical trials typically needs to demonstrate desirable properties in several areas before clinical trials begin. The drug candidate should show evidence of having the following favourable properties:

  1. Chemical
  2. Physicochemical
  3. Pharmacological
  4. Pharmacokinetic
  5. Safety and toxicity

Artificial intelligence

Machine learning, artificial intelligence (AI), and other software developments have made it possible to improve the speed, cost, and overall effectiveness of the drug discovery process by enhancing pattern recognition. Thomas Chittenden, a team leader at the drug discovery company WuXi NextCODE, commented on the role AI is playing in the drug discovery process in a 2018 interview with the scientific journal Nature:

AI is going to lead to the full understanding of human biology and give us the means to fully address human disease. The way we develop drugs and assess them in clinical trials will all come down to very sophisticated pattern recognition.

Artificial intelligence for drug discovery does face data challenges, such as the scale, growth, diversity, and uncertainty of the data. The data sets available for drug development in pharmaceutical companies can involve millions of compounds, and traditional machine learning tools might not be capable of dealing with some types of data.

Pharmaceutical companies using artificial intelligence

Pharmaceutical company
Artificial intelligence system

The company has developed an AI platform that feeds data from sources such as research papers, patents, clinical trials, and patient records. This forms a representation of more than one billion known and inferred relationships between biological entities such as genes, symptoms, diseases, proteins, tissues, species, and candidate drugs. The data can also be queried like a search engine to produce knowledge graphs of different medical conditions or possible treatments.

The biotechnology company has developed a model to identify unknown cancer mechanisms using tests on more than 1,000 cancerous and healthy human cell samples. The AI platform generates and analyses large amounts of biological and outcome data from patients to highlight key differences and identify potential treatments.

The Roche subsidiary is using an AI system from GNS Healthcare in Cambridge, Massachusetts to help drive the company's search for cancer treatments.

The company is using IBM's Watson to power the search for immuno-oncology drugs.

The company signed a deal to use Exscientia's artificial intelligence platform to hunt for metabolic-disease therapies.

The company uses AI as part of an approach to classify genes according to roles and other attributes in order to search for connections between RNA-sequence variations, expression levels, molecular functions, and gene location towards understanding both tumor growth and cardiovascular disease.
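The knowledge-graph platform in the first entry above stores relationships between biological entities and lets them be queried. The following is a minimal sketch of that idea, not the company's actual system: the triples, entity names, relation names, and the candidate_drugs helper are all hypothetical and carry no real biomedical meaning.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples. The names are invented
# for illustration and do not describe real biomedical findings.
TRIPLES = [
    ("GENE_A", "associated_with", "DISEASE_X"),
    ("GENE_A", "encodes", "PROTEIN_A"),
    ("DRUG_1", "inhibits", "PROTEIN_A"),
    ("GENE_B", "associated_with", "DISEASE_Y"),
    ("GENE_B", "encodes", "PROTEIN_B"),
    ("DRUG_2", "inhibits", "PROTEIN_B"),
]


def candidate_drugs(disease, triples):
    """Follow disease <- gene -> protein <- drug paths and return the drugs found."""
    incoming = defaultdict(set)  # (relation, object) -> subjects
    outgoing = defaultdict(set)  # (subject, relation) -> objects
    for subject, relation, obj in triples:
        incoming[(relation, obj)].add(subject)
        outgoing[(subject, relation)].add(obj)

    drugs = set()
    for gene in incoming[("associated_with", disease)]:
        for protein in outgoing[(gene, "encodes")]:
            drugs |= incoming[("inhibits", protein)]
    return sorted(drugs)


if __name__ == "__main__":
    print(candidate_drugs("DISEASE_X", TRIPLES))
```

A production knowledge graph would hold inferred as well as curated relationships and would rank paths by evidence; this sketch only shows the path-following query pattern.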

Approaches to AI in drug discovery

Different roles of AI in drug discovery.

There are multiple approaches to using AI in drug discovery. All of the approaches face similar problems, such as small training sets, experimental data errors in training sets, and a lack of experimental validation. To address these challenges, other AI approaches have been developed, such as deep learning (DL) and other modeling studies, which can be applied to safety and efficacy evaluations of drug molecules based on big data and modeling analysis.

Structure-based drug-design

Chemist Anthony Bradley at the University of Oxford has used the Diamond Light Source synchrotron to screen compounds for small chemical fragments that bind to molecular targets, even if only weakly, with the aim of improving their binding strengths and producing new therapies. Bradley is also a member of a research group using artificial neural networks as part of a structure-based drug-design project with the Oxford Protein Informatics Group. The project aims to use publicly available data on the structure and chemical activity of small molecules to teach the system to identify those that act on protein drug targets.

Virtual chemical space

Virtual chemical space can be thought of as a geographical map of molecules that illustrates how molecules and their properties are distributed. The idea behind mapping chemical space is to collect positional information about molecules within the space in order to search for bioactive compounds; virtual screening can then help select appropriate molecules for further testing.
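One way to picture positional information in chemical space is to place each molecule at coordinates given by its properties and look for neighbours of a known bioactive compound. The sketch below assumes a hypothetical two-dimensional descriptor space with invented compound names and coordinates; real chemical spaces use many more dimensions.

```python
import math

# Hypothetical compounds in a two-dimensional descriptor space. The two
# coordinates stand in for scaled molecular properties; values are invented.
CHEMICAL_SPACE = {
    "cmpd_1": (0.10, 0.20),
    "cmpd_2": (0.15, 0.25),
    "cmpd_3": (0.80, 0.90),
    "cmpd_4": (0.50, 0.10),
}


def nearest_neighbours(query, space, k=2):
    """Return the names of the k compounds closest to the query position."""
    ranked = sorted(space, key=lambda name: math.dist(query, space[name]))
    return ranked[:k]


if __name__ == "__main__":
    # Position of a (hypothetical) known bioactive compound.
    print(nearest_neighbours((0.12, 0.22), CHEMICAL_SPACE))
```

Euclidean distance is used here only for simplicity; the choice of descriptors and distance measure is an assumption that strongly affects which molecules appear to be neighbours.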

In silico method

There are numerous in silico methods for virtually screening compounds from virtual chemical spaces, along with structure-based and ligand-based approaches that provide a profile analysis. These methods allow faster elimination of non-lead compounds and selection of drug molecules at reduced cost. Drug design algorithms, such as Coulomb matrices and molecular fingerprint recognition, consider the physical, chemical, and toxicological profiles to select a lead compound. Various parameters, such as predictive models, the similarity of molecules, the molecule generation process, and the application of in silico approaches, can be used to predict the desired chemical structure of a compound.
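Molecular similarity based on fingerprints is often measured with the Tanimoto coefficient, the number of shared "on" bits divided by the number of bits set in either fingerprint. The sketch below assumes fingerprints are already available as sets of bit positions; the compound names, bit values, and the 0.5 threshold are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def screen(reference_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(reference_fp, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each set lists the positions of the bits set to 1.
REFERENCE = {1, 4, 7, 9}
LIBRARY = {
    "lig_a": {1, 4, 7, 8},
    "lig_b": {2, 3, 5},
    "lig_c": {1, 4, 7, 9, 12},
}

if __name__ == "__main__":
    for name, score in screen(REFERENCE, LIBRARY):
        print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than being written by hand, and the similarity cut-off would be tuned for the target and fingerprint type.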


One deep learning system, DeepVS, was used for the docking of 40 receptors and 2,950 ligands, and showed good performance when 95,000 decoys were tested against these receptors. Another approach applied a multi-objective automated replacement algorithm to optimize the potency profile of a cyclin-dependent kinase-2 inhibitor by assessing its shape, similarity, biochemical activity, and physicochemical properties.
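Performance against decoys is commonly summarized by how well a scoring method ranks true ligands above decoys, for example with the area under the ROC curve (AUC). The sketch below is a generic illustration of that metric and not DeepVS's own evaluation code; the scores are invented.

```python
def roc_auc(active_scores, decoy_scores):
    """Fraction of (active, decoy) pairs where the active scores higher.

    Ties count as half a win. A value of 1.0 means every active outranks
    every decoy; 0.5 corresponds to random ranking.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active > decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


if __name__ == "__main__":
    # Hypothetical docking scores where higher means predicted stronger binding.
    actives = [0.9, 0.8, 0.4]
    decoys = [0.7, 0.3, 0.2, 0.1]
    print(f"AUC = {roc_auc(actives, decoys):.3f}")
```

The pairwise loop is quadratic in the number of compounds, which is fine for a sketch; with tens of thousands of decoys a rank-based computation would be preferable.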

Quantitative structure-activity relationship (QSAR)

Quantitative structure-activity relationship (QSAR) computational models can generate predictions for large numbers of compounds or for simple physicochemical parameters. In 2012, Merck supported a QSAR machine learning (ML) challenge to assess the advantages of deep learning (DL) in the drug discovery process. DL models showed significantly better predictivity than traditional ML approaches on 15 absorption, distribution, metabolism, excretion, and toxicity (ADMET) data sets of drug candidates. QSAR modeling tools have also been used to identify potential drug candidates. AI-based approaches such as linear discriminant analysis (LDA), support vector machines (SVMs), random forests (RF), and decision trees can be applied to speed up QSAR analysis. When the rankings of anonymous compounds by biological activity produced by six AI algorithms were compared with the rankings from traditional approaches, the statistical difference was found to be negligible.
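At its simplest, a QSAR model relates a numerical descriptor of each compound to its measured activity and uses the fitted relationship to predict untested compounds. The sketch below fits a one-descriptor least-squares line; it is far simpler than the DL, SVM, or RF models mentioned above, and the descriptor and activity values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict activity for a descriptor value using a fitted (slope, intercept)."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: one descriptor value and one activity value
# per compound. The numbers are invented for illustration.
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
ACTIVITY = [5.1, 5.9, 7.1, 7.9]

if __name__ == "__main__":
    model = fit_line(DESCRIPTOR, ACTIVITY)
    print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}")
    print(f"predicted activity at 2.5: {predict(model, 2.5):.2f}")
```

Real QSAR models use many descriptors, held-out validation, and an applicability domain check; a single-variable fit is shown only to make the structure-to-activity mapping concrete.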

Application of algorithms in drug discovery

Many terms have been used since the introduction of algorithms for drug discovery to describe different technologies and approaches. These have included:

  • Computer-aided drug design (CADD)
  • Structure-activity relationship (SAR) analysis
  • Chemoinformatics (which extends SAR analysis to large compound sets, using different methods, and across different bioactivity classes)
  • Machine learning and artificial intelligence

Despite the different terminology, what matters for each algorithmic approach is the data being analyzed and the methods used to analyze it. Discussion of AI in drug discovery tends to focus on methodological and computational aspects, but it has been argued that the question of what data is being used to achieve a goal (and whether that data allows the question to be answered) should be considered before deciding which algorithmic or computational method to apply.

There is a belief that AI will allow researchers and pharmaceutical companies to pinpoint previously unknown causes of disease and accelerate the trend towards treatments designed for patients with specific biological profiles. Another view is that the proliferation of AI will change academic training for researchers: a broader curriculum with less focus on particular gene mutations or other specializations, and a widening of researchers' core areas of expertise and general knowledge, as computational models such as AI reduce the need for specialized expertise.

Clustered regularly interspaced short palindromic repeats (CRISPR)

Clustered regularly interspaced short palindromic repeats (CRISPR) technology gives drug discovery researchers the ability to alter sequences and expression levels of genes to help them identify therapeutic drug targets and test the therapeutic efficiency of drug candidates. CRISPR has become more useful in the drug discovery process as the cost of using CRISPR and DNA sequencing technologies has decreased due to technological improvements and economies of scale. CRISPR can play a role in several stages of the drug discovery process as outlined in the following diagram:

Role of CRISPR in the stages of Drug Discovery.

CRISPR is used systematically during the target identification and validation process to knock out, inhibit, and alter the expression levels of certain genes. CRISPR helps researchers understand how genes and their expression can improve or diminish disease states, and identify and help better predict how drug candidates may produce beneficial or harmful patient outcomes. Both in vivo and in vitro studies are done using CRISPR technology to facilitate gene knockout and expression level studies in the drug discovery process.
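A gene knockout experiment starts by choosing where the nuclease should cut. As an illustration, the widely used SpCas9 enzyme requires a 20-nucleotide target immediately followed by an NGG motif (the PAM). The sketch below scans the forward strand of a sequence for such sites; it assumes SpCas9, ignores the reverse strand and off-target scoring, and uses an invented sequence.

```python
def find_target_sites(sequence, protospacer_len=20):
    """Return (start, protospacer, pam) for each forward-strand NGG site.

    Assumes an SpCas9-style PAM: any base followed by two guanines,
    located directly 3' of the protospacer.
    """
    sequence = sequence.upper()
    sites = []
    # Last valid start leaves room for the protospacer plus a 3-base PAM.
    for start in range(len(sequence) - protospacer_len - 2):
        pam_start = start + protospacer_len
        pam = sequence[pam_start:pam_start + 3]
        if pam[1:] == "GG":
            sites.append((start, sequence[start:pam_start], pam))
    return sites


if __name__ == "__main__":
    # Invented sequence: 20 bases, then a TGG PAM, then padding.
    example = "ACGTACGTACGTACGTACGT" + "TGG" + "CCC"
    for start, protospacer, pam in find_target_sites(example):
        print(start, protospacer, pam)
```

Practical guide design tools also check both strands, score predicted off-target sites, and prefer cuts in early coding exons; none of that is modeled here.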

Further reading

  • Drug Discovery and Development: An Overview. Sandeep Sinha and Divya Vahora.
  • How artificial intelligence is changing drug discovery. Nic Fleming. May 30, 2018.
  • How CRISPR Is Accelerating Drug Discovery. Brittany L. Enzmann, PhD, Ania Wronski, PhD. January 11, 2019.
  • How CRISPR is transforming drug discovery. Andrew Scott. March 7, 2018.

Text is available under the Creative Commons Attribution-ShareAlike 4.0; additional terms apply. By using this site, you agree to our Terms & Conditions.