Machine learning (ML) is a subfield of artificial intelligence. It studies how machines can learn to perform intelligent tasks proficiently without being explicitly programmed for them. Recently, artificial intelligence systems have approached human performance in some tasks, such as games and image recognition, but only in very narrow, well-defined domains. Nevertheless, various forms of artificial intelligence have been applied successfully in a wide range of fields, from robotics, speech translation and image analysis to the molecular design of drugs.
Artificial intelligence can play a key role in drug discovery, and artificial neural networks in particular, such as deep neural networks and recurrent networks, are driving the field forward. Numerous applications in property and activity prediction support this, including the prediction of physicochemical and ADMET properties and quantitative structure-property relationship (QSPR) or quantitative structure-activity relationship (QSAR) modeling. Artificial intelligence can also guide the design of biologically active molecules toward desired properties. Combined with synthesis planning and the assessment of synthetic feasibility, this makes automated, computer-driven drug discovery increasingly plausible.
In drug discovery, a clinical candidate molecule must meet a whole set of different criteria. In addition to sufficient potency on the biological target, the compound must be reasonably selective against off-targets and have good physicochemical and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties. Compound optimization is therefore a multi-dimensional challenge. Many in silico prediction methods are used during this optimization; in particular, machine learning techniques such as support vector machines (SVM), random forests (RF) and Bayesian learning have been applied successfully.
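To make the multi-dimensional nature of the problem concrete, the following is a minimal sketch of one common way to fold several predicted properties into a single ranking score: each property is mapped to a desirability between 0 and 1, and the desirabilities are combined with a geometric mean. This scheme is an assumption chosen for illustration and is not described in the text above; the property names, predicted values and thresholds are hypothetical, and in practice the predictions would come from models such as those listed above.

```python
import math


def desirability(value, worst, best):
    """Map a predicted property linearly onto [0, 1].

    Returns 0.0 at `worst` (or beyond) and 1.0 at `best` (or beyond).
    Passing worst > best handles properties where lower is better.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    scaled = (value - worst) / (best - worst)
    return min(1.0, max(0.0, scaled))


def mpo_score(desirabilities):
    """Combine per-criterion desirabilities with a geometric mean."""
    if not desirabilities:
        raise ValueError("need at least one criterion")
    if any(d <= 0.0 for d in desirabilities):
        return 0.0  # one completely failed criterion rejects the compound
    log_sum = sum(math.log(d) for d in desirabilities)
    return math.exp(log_sum / len(desirabilities))


# Hypothetical model predictions for one compound (illustrative values only).
compound = {"pIC50": 7.0, "selectivity_log_ratio": 1.5, "logS": -4.0}

# Hypothetical (worst, best) thresholds for each criterion.
thresholds = {
    "pIC50": (5.0, 9.0),
    "selectivity_log_ratio": (0.0, 3.0),
    "logS": (-6.0, -2.0),
}

scores = [desirability(compound[k], *thresholds[k]) for k in thresholds]
overall = mpo_score(scores)
```

The geometric mean is used here because a compound that fails one criterion outright receives an overall score of zero, which reflects the requirement that a candidate satisfy all criteria at once rather than excel in only one.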
De novo design, that is, designing new active molecules from scratch without reference compounds, has been pursued for about 25 years. Advances in artificial intelligence have recently brought new developments to this area. One interesting method is the variational autoencoder (Figure 4), which consists of two neural networks, an encoder and a decoder. The encoder converts a chemical structure, given as a SMILES string, into a real-valued continuous vector in a latent space. The decoder converts a vector from this latent space back into a chemical structure.
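The text does not state how the two networks are trained. In the standard variational autoencoder formulation, which is assumed here, both are trained jointly by maximizing a lower bound on the log-likelihood of the data. The symbols are introduced for illustration only: x is a SMILES string, z is its latent vector, q_phi(z | x) is the encoder, p_theta(x | z) is the decoder, and p(z) is a prior over the latent space, usually a standard normal distribution.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

The first term rewards the decoder for reconstructing the input structure from its latent vector. The second term keeps the encoded distribution close to the prior, so that points sampled or interpolated in the latent space tend to decode into plausible structures.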
Organic synthesis is a key stage of any small-molecule drug discovery program. New molecules are synthesized to advance along the compound optimization path and to identify molecules with improved properties. In some cases, synthetic challenges limit the chemical space available for molecule design. Synthesis planning is therefore a critical step in drug discovery, and many computational methods have been developed to assist it. They address several tasks: predicting the outcome of a reaction for a given set of reactants, predicting the yield of a chemical reaction, and retrosynthesis planning. Retrosynthesis planning has been dominated by knowledge-based systems built on rules that are either derived from experts or extracted automatically from reaction databases.
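The idea behind a rule-based retrosynthesis planner can be sketched as a recursive search: rules are applied backwards from the target until every precursor is a purchasable starting material. The example below is a toy illustration under strong assumptions. The rule table, the compound-class labels and the set of purchasable materials are all hypothetical, and real systems operate on chemical structures and reaction templates rather than on plain labels.

```python
# Hypothetical rule table: each target maps to alternative precursor sets.
RULES = {
    "amide": [("carboxylic_acid", "amine")],
    "carboxylic_acid": [("ester",)],
}

# Hypothetical set of starting materials that can simply be bought.
PURCHASABLE = {"amine", "ester"}


def plan(target, rules, purchasable, depth=5):
    """Return a route as (target, [sub-routes]), or None if none is found.

    A purchasable target is a leaf with no sub-routes. Otherwise each rule
    for the target is tried in order, and the first rule whose precursors
    can all be planned is accepted. `depth` bounds the recursion.
    """
    if target in purchasable:
        return (target, [])
    if depth == 0:
        return None
    for precursors in rules.get(target, []):
        subroutes = [plan(p, rules, purchasable, depth - 1) for p in precursors]
        if all(route is not None for route in subroutes):
            return (target, subroutes)
    return None


route = plan("amide", RULES, PURCHASABLE)
print(route)
```

For the toy data above, the search disconnects the amide into a carboxylic acid and an amine, and then traces the acid back to an ester. A target with no applicable rule, or one whose precursors cannot all be resolved, yields None.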
Artificial intelligence has attracted much attention in recent years and has successfully entered the field of drug discovery. Many machine learning methods, such as support vector machines and random forests, are well established in the drug discovery process, particularly for QSAR modeling. Newer algorithms based on neural networks, such as deep neural networks, offer further improvements in property prediction, as shown in many benchmark studies comparing deep learning with classical machine learning. These algorithms have proven applicable to many different prediction tasks, including physicochemical properties, biological activity and toxicity. In recent years, the range of applications of artificial intelligence has expanded considerably to include de novo design and retrosynthetic analysis, which suggests that more and more applications will appear in areas where large data sets are available. With progress in these different fields, computers can be expected to play a growing role in automated drug discovery, and the large advances in robotics in particular will accelerate this trend. However, artificial intelligence is far from perfect. Other techniques with a sound theoretical basis remain important, in particular because they also benefit from increased computing power, which allows larger systems to be simulated with more accurate methods.