3. THE NEW LANGUAGE OF ARTIFICIAL INTELLIGENCE (AI)
Artificial Intelligence can be defined as the transfer of intelligent human behavior into an artificial system: the simulation of human intelligence through computers, in which the human language is translated into a computational one so that machines and computers can be taught to think and act like humans (Mariano, 2020). The aim of AI is to build machines and computer models able to reproduce the processing of information that the human brain performs in nature; in other words, Artificial Intelligence is the science of producing intelligent machines and computer programs capable of imitating human thinking and planning (Sharma et al., 2022). These systems are indeed intelligent: AI allows machine models to reason, learn and communicate like humans (Mariano, 2020). Artificial Intelligence can also be seen as a new technical language that employs computer technology to mimic and extend knowledge, a term that denotes the acquired ability of technologies to simulate human thinking (Amisha et al., 2019; Liu et al., 2021). AI can improve the identification of hit and lead compounds: a hit is a compound that binds to a biological target (for example, a receptor), while a lead is a compound with pharmacological activity that still requires optimization of its therapeutic effect and safety before use; better identification of both leads to improved management of drug development (Debleena et al., 2021; Patrick, 2017). The design of new drugs relies on the tools of Computer-Aided Drug Design (CADD), a set of computational methods related to AI for developing new compounds, capable of efficiently reducing the costs and time of data processing. The in silico method is a CADD procedure in which the safety and efficacy of the chemical molecule of interest are evaluated by computers. Screening is another CADD tool: it searches chemical libraries (databases containing information on chemical molecules) for suitable drug candidates, detecting the pharmacological features a candidate must possess before it can be approved for use; this approach can also analyze the potential adverse effects of drug candidates in order to optimize hit and lead compounds (Arya et al., 2021). Since these techniques are computational, AI has reconsidered and improved them, replacing the traditional tools with enhanced versions:
Fig. 1 – Artificial Intelligence in drug discovery and development (Debleena et al., 2021)
The 'in silico' procedure is used by AI to handle big data, which have three defining characteristics: a great volume, related to the quantity of information produced; a high velocity, referring to the speed at which the information must be processed and the time needed to obtain results; and a wide variety of knowledge. Computers exploit these in silico elements to select the drug candidates that meet specific requisites (Gupta et al., 2021; Ekins et al., 2007; Patrick, 2017). All of this is done in order to analyze the features and interactions of drug molecules, such as safety and efficacy, which are fundamental for drug discovery because it is important to minimize adverse effects and maximize therapeutic benefit (Pantelidis et al., 2022). AI offers many advantages, such as the ability to recognize the most reliable data through high-velocity processing and to manage information with high efficiency and low costs (Tripathi et al., 2021). On the other hand, AI also has disadvantages, such as the possible unreliability of the data: margins of error or approximations can arise during processing because the amount of data to analyze is enormous (Debleena et al., 2021). AI can present further drawbacks, such as data sparseness, insufficient instruments and lack of expertise: drug design can no longer be satisfied by personal computers but requires supercomputers, which are far more efficient and can perform activities that ordinary technologies carry out with difficulty, at the price of high costs and long processing times (Pantelidis et al., 2022).
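The screening idea described above can be sketched in a few lines of Python. This is a purely illustrative example, not taken from the cited sources: the molecules, property names and cut-off values are invented, and real CADD pipelines rely on dedicated chemistry toolkits rather than hand-written filters.

```python
# Illustrative sketch: how an in silico screen might filter a chemical
# library against simple drug-likeness thresholds (all values invented).

# A "chemical library" as a list of records: name, molecular weight, logP.
library = [
    {"name": "cmpd_A", "mol_weight": 320.4, "logp": 2.1},
    {"name": "cmpd_B", "mol_weight": 610.7, "logp": 5.9},  # too large, too lipophilic
    {"name": "cmpd_C", "mol_weight": 180.2, "logp": 1.3},
]

def passes_filter(mol, max_weight=500.0, max_logp=5.0):
    """Keep only molecules whose properties fall inside drug-like ranges."""
    return mol["mol_weight"] <= max_weight and mol["logp"] <= max_logp

# The screen discards candidates that do not meet the requested requisites.
candidates = [mol["name"] for mol in library if passes_filter(mol)]
print(candidates)  # cmpd_B is rejected by the filter
```

A real screen would compute such properties from molecular structures and combine many more criteria, but the principle is the same: the computer traverses the whole library and retains only the candidates that satisfy the requisites.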
4. BASIC ROOTS OF AI: MACHINE LEARNING AND DEEP
LEARNING
Artificial Intelligence is an important tool that exploits the capacity of computers to acquire information from already known data, and it can be pictured as a 'box' that incorporates many approaches, among which Machine Learning (ML) is one of the most important techniques AI uses (Dara et al., 2021; Gupta et al., 2021). Machine Learning uses machines to collect knowledge through a cognitive approach: it is like a superior technological mind that learns information in a way similar to human logic, a science that enables computers to perform tasks requiring intelligence, simulating those carried out by humans (Mariano, 2020; Mouchlis et al., 2021). ML can create software that learns in a suitable and immediate way, acquiring knowledge by observing and elaborating data; it is an automated system able to extrapolate features from the information it obtains (Gupta et al., 2021; Debleena et al., 2021; Koski et al., 2021; Mouchlis et al., 2021). The ML model can be explained through the example of a newborn: starting with no knowledge, he observes reality, learns information from it and transforms it into rules to follow (Mariano, 2020). Machine Learning can therefore also be described as a model of automatic learning. Another way to define it is as a correspondence between the input and output values of a system, representable as a nonlinear function y(x; w) whose parameters w are modified according to the input data x (when one changes, so does the other) (Mariano, 2020). ML modeling is adopted by scientists in the planning and invention of new drugs and in the analysis of possible unwanted adverse effects due to drug interactions: the algorithms used in Computer-Aided Drug Design (CADD) and compound libraries have made the discovery of millions of chemical molecules more feasible through Virtual Screening, a process described in the next paragraphs (Gupta et al., 2021; Debleena et al., 2021). ML includes Unsupervised Techniques (UTs), learning procedures that automatically acquire the features of a chemical structure without classifying or labelling them, as in the example of a boy who studies without the guidance of any teacher (Debleena et al., 2021; Pantelidis et al., 2022; Brown, 2021; Mariano, 2020). There are also Supervised Techniques (STs), which can be exemplified by an object together with a label that describes its characteristics and classifies them (Brown, 2021; Pantelidis et al., 2022). The purpose of ML is to learn a general rule that maps the input data into the output, as if a teacher were supervising a boy while he learns his lessons, eventually transforming the input data into images, for example, or into something else that can be reinterpreted (Pantelidis et al., 2022; Mariano, 2020; Sharma et al., 2016).
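The two settings described above can be sketched in code. The following minimal Python example is an invented illustration, not an implementation from the cited sources: the supervised part fits the parameters w of a model y(x; w) to teacher-provided labels, while the unsupervised part groups unlabelled values using only their own structure.

```python
# Minimal sketch of the two ML settings described above (invented data).

# --- Supervised learning: labelled pairs (x, y) are used to adjust the
# parameters w of a model y(x; w) = w0 + w1*x until inputs map to outputs.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # "teacher-provided" labels: y = 1 + 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Closed-form least-squares estimates of the parameters w.
w1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
w0 = mean_y - w1 * mean_x

def y_model(x):
    """The learned rule that maps an input x into an output."""
    return w0 + w1 * x

print(y_model(4.0))                 # the fitted rule generalizes to new x: 9.0

# --- Unsupervised learning: no labels are given; the values are grouped
# by their own structure alone (here, a trivial one-dimensional split).
values = [0.2, 0.3, 0.1, 5.1, 5.3, 4.9]
center = sum(values) / len(values)
groups = [0 if v < center else 1 for v in values]
print(groups)                       # two clusters emerge without any labels
```

In real drug-design applications the function y(x; w) is far more complex and x encodes molecular information, but the principle is identical: the parameters w change together with the data x until the input-to-output rule is learned.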
The advantages of using ML include the reduced need for human presence, since the process is automated; the continuous learning of logic schemes and improvement of its own functions; the better management of huge and complex data; the heterogeneity of models, which can be used in many different situations and provide immediate assistance; high velocity in drug discovery and development; the reduced time needed to place a drug on the market; and the rapid diagnosis of diseases (Mariano, 2020; Sharma et al., 2022; Liu et al., 2021; Amisha et al., 2019). The situations in which ML can make the greatest difference are those in which general statistical equations and rules are too complicated to analyze: facial recognition, where all the physiognomic traits of a person must be recognized; the analysis of changing events such as viruses and potential pandemics; the study of diseases that are difficult to treat and the search for cancer cures; the statistical analysis and prediction of the outcomes of the safety testing of drug candidates; the safe collection of patients' medical data; the reduction of human presence in work and in the monitoring of huge amounts of data; and improved drug design (Mariano, 2020; Vamathevan et al., 2019). The disadvantages of ML are, instead, that it may find only a single solution to a problem, so that several trials are necessary for an operation to succeed; the high-quality data and expert personnel it requires; the lack of data transparency and precision, because the process through which ML reaches a solution is sometimes unclear; the high cost of the hardware to be employed; the lack of contact with patients; and the possible loss of workplaces, because it could replace humans (Mariano, 2020; Vamathevan et al., 2019).
Deep Learning (DL) is a subdivision of Machine Learning, a part of it with specific functions performed through Artificial Neural Networks, and the two have radically different ways of acquiring the information that describes the features of a hypothetical object: ML examines the features one by one in detail, whereas DL builds an overview of them. Given any object, Machine Learning extrapolates characteristics such as its color, shape, size and position and feeds them all as input to a classifier in order to obtain the answer (Mariano, 2020; Dara et al., 2021). Deep Learning, on the contrary, is able to extract a wide series of features by working directly on the provided image, in an automatic procedure whose output is elaborated by the successive Neural Networks until the answer is obtained. Summing up, Machine Learning works on the single characteristics, while Deep Learning works on the entire image (Tripathi et al., 2021; Mariano, 2020; Mouchlis et al., 2021).
Essentially, the main difference between ML and DL lies in the method through which the information is extrapolated (Tripathi et al., 2021; Mariano, 2020; Mouchlis et al., 2021).
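This difference can be sketched with a toy example. The code below is an invented illustration, not an implementation from the cited sources: the "ML-style" path receives hand-extracted characteristics, while the "DL-style" path receives the raw image and derives its own internal features from it (crudely mimicked here by simple sums rather than real network layers).

```python
# Toy contrast between the two approaches (invented example).
# A tiny 4x4 "image" of a bright square on a dark background.
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

# --- ML style: a human first extracts characteristics one by one...
def extract_features(img):
    bright = [(r, c) for r, row in enumerate(img)
              for c, v in enumerate(row) if v > 5]
    size = len(bright)
    mean_row = sum(r for r, _ in bright) / size
    mean_col = sum(c for _, c in bright) / size
    return {"size": size, "position": (mean_row, mean_col)}

# ...and only those characteristics reach the classifier.
def ml_classifier(features):
    return "square" if features["size"] == 4 else "other"

# --- DL style: the raw image goes in, and internal features are derived
# automatically (here, naive per-row sums stand in for learned layers).
def dl_classifier(img):
    internal = [sum(row) for row in img]
    return "square" if sum(internal) == 36 else "other"

print(ml_classifier(extract_features(image)))  # works on single characteristics
print(dl_classifier(image))                    # works on the entire image
```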
Fig. 2 – Advances in de novo drug design: from conventional to Machine
Learning methods (Varnavas Mouchlis et al., 2021)
5. THE LANGUAGE OF DEEP LEARNING: ARTIFICIAL NEURAL
NETWORK (ANN)
As mentioned before, Deep Learning can be seen as a sort of duplicate or copy of the human brain, formed by the linkage of computational processing units called neurons, built from computer hardware and components, whereas the real Neural Network is obviously formed by biological neurons: despite their different origin, the two ultimately perform the same function, so an Artificial Neural Network is a structural and functional imitation of the human neuron system (Alzubaidi et al., 2021; Kukreja et al., 2016). The Network in question is made from the connection of many neurons one after the other, each of them representing a feature of the object under examination (Alzubaidi et al., 2021; Kukreja et al., 2016). The concept of Neural Networks sinks its roots in the functioning of biological neurons. The animal nervous system is made of the brain and the spinal cord, which manage the organic functions through messages that travel along the nerves and reach the target organs in the form of electrochemical nervous impulses (Kukreja et al., 2016; Sherwood, 2016; Tortora, 2014). The neuron is the anatomical unit of the nervous system: an electrically active cell capable of exchanging signals through links with thousands of other neurons. Every neuron is made of a central body called the soma and of an axon, an extension that originates from the soma itself; structures called dendrites connect the soma to the axon of another neuron, while the synapses represent the communication between neurons, which can have an excitatory or inhibitory action through the secretion of chemical substances called neurotransmitters that favour or stop the activity of neurons (Kukreja et al., 2016; Sherwood, 2016; Tortora, 2014). Dendrites deliver electrical signals as input to every neuron, which activates itself and responds by converting them into other electrical impulses, switching between an activation and an inactivation state (Kukreja et al., 2016; Tortora, 2014; Sherwood, 2016). The nervous system is divided into the central nervous system (CNS), made of the brain and the spinal cord, and the peripheral nervous system (PNS), which instead encloses the nerve fibers able to transfer information in the form of input to the other parts of the body, which represent the periphery. The PNS is divided into afferent and efferent neurons: the afferent neurons collect the input information coming from the external environment toward the CNS through a sensory receptor, while the efferent neurons transport this information to the other organs of the body (defined as effector organs) through the