FRR 2020-02 in Silico Toxicity Prediction Using An Integrative Multimodel Approach

Vol. 5, 2020-02
In silico Toxicity Prediction using an Integrative Multimodel Approach
Hugo Hernandez① and Livy Shivraj②
① ForsChem Research, 050030 Medellin, Colombia

hugo.hernandez@forschem.org
② SLS Cell Cure Technologies Pvt.Ltd, Secunderabad-500026, India

drlivys@yahoo.com
doi: 10.13140/RG.2.2.13825.20320
Abstract
There is a continual need for new and innovative chemical products to be used in cosmetics,
agrochemicals, pharmaceutical products, cleaning products, and many other applications,
which ultimately result in some form of contact with humans and other living beings. One of the
principles of Green Chemistry encourages designing safer chemicals, which must preserve or
improve the efficacy of their function while being less toxic than the current alternatives.
However, demonstrating the safety of new compounds in animal models requires a great deal
of time and comes at great expense. Alternative approaches employing computational
predictive models can minimize this burden by providing rapid in silico screening of candidate
compounds, thus quickly eliminating molecules that pose excessive toxicity risk before
devoting development efforts in the lab. In this report, an integrative multimodel approach for
predicting toxicity of chemical compounds is presented, which applies hierarchical toxicity
criteria to provide a robust yet computationally efficient assessment. The application of
the proposed method is illustrated using Cyproconazole as an example compound.
Keywords
Computational Biology, Cyproconazole, Green Chemistry, Molecular Docking, Pharmacokinetic
Modeling, QSAR, Retinoic Acid, Toxicity.
04/02/2020 ForsChem Research Reports Vol. 5, 2020-02 (1 / 32)

www.forschem.org
1. Introduction
New chemical compounds for different applications (including cosmetics, pharmaceutical
products, agrochemicals, etc.) are in constant demand, but extensive safety profiling is
required before they can be approved for use, in order to minimize the risks to human
health.[1] Conventional safety assessments heavily rely upon studies carried out in vivo with
laboratory animals, but this is a lengthy and expensive process, requiring significant quantities
of the compound of interest, as well as large numbers of research animals.[2] Moreover, this
process is fraught with regulatory and ethical complications.[3]
Computational solutions providing the reliable prediction of chemical toxicity represent a
significant breakthrough, enabling rapid in silico screening of large numbers of compounds
without the cost of testing them or even synthesizing them.[4] With the growing availability of
biochemical and ‘omics data in public databases, and simplified implementation of machine
learning algorithms, these types of solutions have become vastly more accessible. In addition,
application programming interfaces (APIs) are widely available for various computational tools,
such as molecular docking applications, to facilitate the integration of multiple algorithms and
orthogonal analytical outputs.
In this report, we provide a brief overview of general modeling approaches and present a
detailed description of a proposed modeling strategy consisting of a multimodel
computational pipeline for the in silico prediction of the toxicity of novel compounds. We demonstrate
the application of the proposed method in the analysis of an example compound,
Cyproconazole, to illustrate the step-wise assessment of multiple risk factors involved in the
toxicity of a chemical compound. The results obtained at different stages of the method are
captured by toxicity scores that provide simple indicators of toxicity risk.
2. General Modelling Approaches
Computational methods aim to complement in vitro and in vivo toxicity tests, predicting the
toxic potential of a new chemical compound while minimizing the need for animal testing and
reducing the cost and time of toxicity tests. In silico methods have the distinct advantage of
being able to assess chemicals for toxicity even before they are synthesized, much earlier in the
research and development pipeline.[5]
Generally speaking, developing a prediction model involves five major steps:
1. Gathering biological data from experiments that establish associations between
chemicals and toxicity endpoints
2. Calculating molecular descriptors of the chemicals
3. Generating a prediction model
4. Validating the model
5. Interpreting the model
However, the predictive power of these models depends directly on the accuracy
and reliability of the available data. In general, these methods depend heavily on one or more of
the following:
• Databases that provide chemical-property and biological information [6,7]
• Software for generating molecular descriptors [8]
• Simulation tools for systems biology and molecular dynamics [9]
• Modeling methods for toxicity prediction [10]
• Modeling tools such as statistical packages and software for generating prediction
models [11]
• Expert systems that include pre-built models in web servers [5]
• Standalone applications for predicting toxicity and visualization tools [12]
This section describes the current state of the art of such modeling approaches.
2.1. Structural Alerts
Structural alerts (SAs) [13] are chemical structures (molecular fragments) associated with
toxicity.[14] SAs can consist of only one atom or several connected atoms. A compound or
mixture containing several such sub-structures combines multiple SAs, which together may
contribute more to toxicity than a single SA.[8,10]
There are two main types of rule-based models: human-based rules (HBRs) and induction-based
rules (IBRs).[15] HBRs are derived from human knowledge or from the literature, whereas IBRs
are derived computationally through artificial intelligence. HBRs are more accurate but are
limited to human knowledge, which may be incomplete or biased.[8]
The existing knowledge base grows continuously, which makes it impractical to keep HBRs up
to date. In contrast, the availability of large datasets helps in
the generation of IBRs. Humans cannot easily trace the complex networks of biological
systems, but in silico methods can quickly find associations between chemical structural
properties and toxicity endpoints.

Sometimes the parent compound may be toxic or its metabolites become toxic as a result of
biotransformation. Using structural alerts to predict toxicity allows identifying the structure of
potential metabolites.[13] However, SAs have a number of limitations. SAs use only binary
features (e.g., chemical structures are either present or absent) and only qualitative endpoints
(e.g., carcinogenic or non-carcinogenic). SAs do not provide insights into the biological
pathways of toxicity and may not be sufficient for predicting toxicity.[16] Depending on the
concurrent absence or presence of other chemical properties, toxicity may decrease or
increase. The list of SAs and rules has to be updated continually to give accurate
predictions, which is often not possible because of poor data organization and a lack of
literature analysis.
It is necessary to understand how to interpret the output of SA models. If SAs are not found for
a chemical or it does not match any toxicity rules, this does not indicate non-toxicity. This is
especially true for HBRs that usually include SAs or rules that indicate toxicity but do not
include SAs or rules that indicate non-toxicity. Therefore, in developing such models, it is
necessary to ensure that the list of SAs and rules is comprehensive and that it is
continuously updated when more experimental data become available.
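The interpretation rule above (no alert found does not mean non-toxic) can be illustrated with a minimal Python sketch. The alert table and the function name `screen` are hypothetical, and the matching is a naive substring test on SMILES strings, which is only a stand-in for the SMARTS-based substructure search that a real cheminformatics toolkit would perform.

```python
# Hypothetical alert table (illustrative only): alert name -> SMILES fragment.
# ASSUMPTION: naive substring matching on SMILES is NOT a real substructure
# search; it misses alternative SMILES spellings of the same fragment.
ALERTS = {
    "nitro group": "[N+](=O)[O-]",
    "epoxide": "C1OC1",
    "azide": "N=[N+]=[N-]",
}


def screen(smiles):
    """Return (matched alert names, verdict) for one SMILES string."""
    hits = [name for name, fragment in ALERTS.items() if fragment in smiles]
    # An empty hit list is reported as "inconclusive", never as "non-toxic":
    # the alert list only encodes evidence of toxicity.
    verdict = "alert" if hits else "inconclusive"
    return hits, verdict


if __name__ == "__main__":
    print(screen("c1ccccc1[N+](=O)[O-]"))  # nitrobenzene, written with a charged nitro group
    print(screen("CCO"))                   # ethanol: no alert in this toy table
```

Returning "inconclusive" rather than "safe" keeps the binary, evidence-of-toxicity nature of SAs explicit in the output.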
2.2. QSAR
Structure–Activity Relationship (SAR) and Quantitative Structure–Activity Relationship (QSAR)
models are a family of models that use molecular descriptors to predict the toxicity of chemical
compounds.
SAR models are based on non-continuous data, such as identifying active vs. inactive chemicals
based on the presence or absence of specific structural features or properties (e.g. screening
tools that identify whether a chemical will bind or not bind to a receptor). Examples of SAR
approaches include prioritization and ranking of chemical lists [17-19] or data gap analysis using
read-across approaches.[20]
QSAR models are methods that relate structural features of molecules to an activity in a
quantitative manner. QSARs are based on continuous data and result in a quantitatively derived
prediction of activity (e.g., the half maximal effective concentration, EC50), or a chemical,
physical or structural property (e.g., the endocrine receptor relative binding affinity ER-BA
endpoint).[21,22] These relationships rely on information on many chemicals to predict the
activity of a single chemical lacking data. The goal is to quantify “structural similarity” imparting
biological activity by defining structural analogs or chemical categories that may act “similarly”.
Assuming that similar structures result in similar activity, it can be inferred that an untested
chemical that is similar in structure may produce the same activity.
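As a rough illustration of how a quantitative relationship is derived from several tested chemicals and then applied to an untested one, the sketch below fits a one-descriptor linear model by ordinary least squares. The descriptor values, activity values, and function names are hypothetical placeholders, not data from this report; real QSAR models typically use many descriptors and a validation step.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, descriptor):
    """Predicted activity of an untested chemical from its descriptor value."""
    return slope * descriptor + intercept


if __name__ == "__main__":
    # HYPOTHETICAL training set: one descriptor (e.g. a lipophilicity index)
    # and one continuous activity (e.g. log(1/EC50)) for four tested chemicals.
    descriptor = [1.0, 2.0, 3.0, 4.0]
    activity = [2.1, 2.9, 4.2, 4.8]
    slope, intercept = fit_line(descriptor, activity)
    print(slope, intercept, predict(slope, intercept, 2.5))
```

The prediction is only meaningful for chemicals whose descriptor values fall within the range covered by the training set, which reflects the similarity assumption stated above.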

There are many tools that provide pre-built QSAR models, such as the OECD QSAR Toolbox,
TopKat, Derek Nexus, HazardExpert, VEGA, and METEOR.[5] QSAR models have several
advantages. They are easy to interpret if the descriptors are meaningful. They can handle
both categorical and continuous toxicity endpoints and molecular descriptors, and can be built
from both toxic and non-toxic chemicals. Using different types of descriptors allows for modeling complex endpoints.
However, some of the limitations of QSARs include:
• QSARs require a large number of chemicals in model development to achieve statistical
significance.
• QSARs require using feature selection to identify the most significant and independent
molecular descriptors, and a large number of descriptors makes the multidimensional
space complex and fragmented.
• QSARs cannot be used for extrapolation between species, routes of exposure, or doses
unless biological data is used.
• QSARs may not be biologically interpretable.
• QSARs do not take dose, duration, or metabolites into consideration.
2.3. Read-Across
Read-across is a method of predicting the unknown toxicity of a molecule using analogous
compounds with known toxicity from the same chemical category. Trend analysis is a method
of predicting toxicity of a chemical by analyzing toxicity trends (increase, decrease, or
constant) of tested chemicals (e.g. when carbon chain length increases, acute aquatic toxicity
increases).[23,24]
The read-across method is developed using either an analogue approach (AN) (also called
one-to-one), which uses one or a few analogous compounds, or a category approach (CA) (also
called many-to-one), which uses many analogous molecules. AN may be sensitive to outliers because two
analogous compounds may have different toxicity profiles.
CA is more advantageous within a category and may increase confidence in the toxicity
predictions. CA requires defining a category boundary to determine if a chemical belongs to the
category and implementing a combination of predictions for compounds that have conflicting
toxicity profiles. A combination of predictions can be done using different mathematical
functions including: minimum, maximum, mode, median, average, linear, quadratic, or other
nonlinear combinations of the predictions. Read-across can be either qualitative or
quantitative.[25]
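The combination step of the category approach can be sketched in Python as follows. The function name `read_across`, the dictionary of combiners, and the example values are illustrative assumptions; only the simple combination functions named above (minimum, maximum, mode, median, average) are covered.

```python
import statistics

# Simple combination functions for the toxicity values of the analogues.
COMBINERS = {
    "minimum": min,
    "maximum": max,
    "mode": statistics.mode,
    "median": statistics.median,
    "average": statistics.fmean,
}


def read_across(analogue_values, how="median"):
    """Combine known toxicity values of analogues into one prediction."""
    if not analogue_values:
        raise ValueError("at least one analogue value is required")
    return COMBINERS[how](analogue_values)


if __name__ == "__main__":
    # HYPOTHETICAL endpoint values (e.g. LD50 in mg/kg) for three analogues.
    values = [10.0, 20.0, 40.0]
    for how in COMBINERS:
        print(how, read_across(values, how))
```

For an endpoint such as LD50, where lower values mean higher toxicity, the minimum is the most conservative choice, while the median is less sensitive to a single outlying analogue.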

Predictions of adversity for data-poor chemicals could then be obtained by read-across from
similar data-rich chemicals in the same group. Predictions can also be expressed as probability
statements (likelihood that a chemical in a given category has an adverse effect).
There are several advantages of read-across. Read-across is transparent and easy to interpret
and implement. Read-across allows for a wide range of types of descriptors and similarity measures
to be used to express similarity between chemicals.
Read-across uses small datasets compared to other approaches such as QSAR because there
are usually only a few analogous compounds for a given chemical. Examples of tools
implementing read-across include the OECD QSAR Toolbox, Toxmatch, ToxTree, AMBIT,
AmbitDiscovery, AIM, DSSTox, and ChemIDplus.[5]
2.4. Systems Biology
Systems biology is the computational and mathematical analysis and modeling of complex
biological systems using metabolic and cell signaling networks. It involves the development of
mechanistic models, such as the reconstruction of dynamic systems from the quantitative
properties of their elementary molecules.[26]
Creating a simple generic model for the diversity of biological systems becomes a big challenge
considering the system-specific type of information and the limited availability of data.
Modeling methods must be tailored not only to make the best use of the information available
but also to answer specific biological questions, ranging from the understanding of the
function of a pathway and its evolution to the molecular mechanisms underlying cellular events
such as cell differentiation.[27] The Adverse Outcome Pathway (AOP) concept in systems
toxicology has been extremely useful in understanding adverse events in early and late
stages of research and development of novel chemical compounds.[28]
Predictive ability cannot be left solely to the rule-based computing power of man-made models. The
type of data must be chosen carefully, driven by scientific rationale. Although all
sources of experimental data are potentially useful for modeling a system, modeling
a signaling pathway will have different requirements, in terms of the data types to be
integrated, than modeling a metabolic pathway.
A stringent homeostatic control has to be maintained during the development of a eukaryotic
organism, which requires the coordinated working of many molecular and cellular processes,
such as cell division and differentiation, as well as metabolism. Without such homeostatic capacity,
the viability of the organism would be compromised. Cellular processes are finely controlled by
regulatory molecules such as transcription factors, which play a critical role in development and
morphogenesis. Advances in research have provided large amounts of data on gene-gene and
protein-protein interactions. With the rise of new sequencing technologies, genomes are being
sequenced very rapidly, which challenges biologists to link
genes and proteins into functional pathways or networks. Computational models have to
address the global dynamical properties of networks, which is a challenging task considering
the ever-increasing volume of biological information.
Data and knowledge derived from in vitro kinetics of isolated enzymes or unicellular cultures are
often not sufficient to understand metabolism. The transposition of such knowledge to
multicellular systems remains uncertain. Since dynamic biological networks are represented by
systems of ordinary differential equations (ODE),[29] large sets of experimental data are
needed to determine the values of the parameters present in the system of equations. The
kinetic parameters already available in databases are often not applicable to
organisms or experimental scenarios other than those for which they were measured. For
example, kinetic constants are usually determined for metabolic reactions catalyzed by enzymes
acting in a test tube, so their appropriateness for modeling in vivo reactions remains to be
proven. They may also have been measured in different species from the one under
consideration. Alternatively, there is the possibility of parameter estimation,[30,31] but such
a methodology is computationally expensive and there is no guarantee that the result is
biologically correct.
Most ODE models make use of continuous functions to describe the kinetic laws. Continuous
functions, however, may not always be appropriate for describing biological processes. For
example, given that molecules are discrete entities, the number of molecules as a function in
time is in reality a discrete function. Hence, it is important to assess if the use of a continuous
function in a given model is a reasonable approximation to reality. As a rule of thumb, if the
experimental error at measuring the real value of a variable is larger than the jump in the
discrete value, then it is usually harmless to replace discrete functions by continuous ones. The
representation of chemical species as a concentration with the use of continuous variables also
assumes that the system is uniform or perfectly mixed. However, in reality biochemical systems
frequently exhibit a large degree of spatial heterogeneity due to processes such as
compartmentalization, molecular association, or restricted diffusion. It has been
mathematically demonstrated that changes in the spatial distribution of molecules have a large
impact on the dynamical behavior of a biochemical system.[32,33] A further limitation of ODE
models is that they are deterministic. At the molecular scale, individual molecules randomly
collide with one another, and a chemical reaction occurs only if the collision energy is
high enough.[34] This randomness becomes noticeable in small volumes: when using deterministic
equations, the smaller the volume, the less accurate the model becomes.[35] Hence the system
under study has to be large enough to avoid stochastic fluctuations, which could otherwise give
false signals.[36]
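The contrast between deterministic and stochastic descriptions can be made concrete with a small Python sketch for a single first-order decay reaction. The stochastic part uses the Gillespie direct method, which is named here as an illustrative choice and is not a method prescribed by this report; all parameter values are hypothetical.

```python
import math
import random


def ode_decay(n0, k, t):
    """Deterministic solution of dN/dt = -k N (continuous, noise-free)."""
    return n0 * math.exp(-k * t)


def gillespie_decay(n0, k, t_end, rng):
    """Stochastic simulation of A -> (nothing) with rate constant k.

    Each molecule decays independently, so the waiting time to the next
    event is exponentially distributed with rate k * n.
    """
    t, n = 0.0, n0
    times, counts = [0.0], [n0]
    while n > 0:
        t += rng.expovariate(k * n)
        if t > t_end:
            break
        n -= 1  # one discrete molecule disappears per event
        times.append(t)
        counts.append(n)
    return times, counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    for n0 in (10, 10000):  # small versus large system (hypothetical sizes)
        _, counts = gillespie_decay(n0, k=1.0, t_end=1.0, rng=rng)
        print(n0, counts[-1] / n0, ode_decay(1.0, 1.0, 1.0))
```

With few molecules, the surviving fraction at the end of a run can differ noticeably from the deterministic value, whereas with many molecules the two tend to agree closely, which is the volume effect described above.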
Probabilistic graphical models integrate different types of biological information, such as gene
expression data, in order to reconstruct the networks and predict their behaviors.
Experimental validation of such models can be done in the wet lab by observing mRNA
expression or protein expression. Testing entire model predictions with wet-lab experiments is
often impossible because of limitations in available technologies, cost and time.[37]
2.5. Pharmacokinetic/Pharmacodynamic models
Pharmacokinetic (PK) models predict the time course of chemical concentration in tissues, and
quantify ADME (Absorption, Distribution, Metabolism, and Excretion) processes.[5,9] PK
models can be compartmental and non-compartmental.[38] A compartment is the whole or
part of an organism in which the concentration of the chemical is uniform. Compartmental
models consist of one or more compartments, and each compartment is usually represented
by its own differential equation. One-compartment models represent the whole body as a
single compartment, under the assumption that the chemical rapidly equilibrates in the
body.[39,40] The concentration $C(t)$ of a certain compound at a given time $t$ can be
determined as:
$$C(t) = C_0 e^{-kt}$$
(2.1)
where $C_0$ is the initial concentration and $k$ is the elimination constant. The plot of the
natural logarithm of concentration versus time results in a straight line of slope $-k$. However, these models do not
consider the distribution time of chemicals. Additionally, concentrations in some organs reach
equilibrium faster than in others. Chemicals perfuse rapidly in some tissues, such as liver and
kidney, but slowly in skin and muscles; hence, different equations are needed for
different tissues. For that reason, the concept of two-compartment or multi-compartment
models emerges. After solving the coupled equations, the overall concentration is the weighted
sum of two exponential terms of time (interpreted as distribution phases). The overall
concentration $C(t)$ based on this model is:
$$C(t) = \phi_1 C_{0,1} e^{-k_1 t} + \phi_2 C_{0,2} e^{-k_2 t}$$
(2.2)
where $\phi$ represents volume fraction, $C_0$ and $k$ have the same meaning as in Eq. (2.1),
and subscripts $1$ and $2$ indicate each compartment
considered. Also, for the two-compartment model,
$$\phi_1 + \phi_2 = 1$$
(2.3)
These models, however, cannot be extrapolated between species or provide any mechanistic
insight.
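The compartmental expressions discussed above can be evaluated with a few lines of Python. This is a minimal sketch assuming first-order elimination for the one-compartment case and a weighted sum of exponential terms for the multi-compartment case; the function names, the `(weight, c0, k)` tuple layout, and all numeric values are illustrative assumptions.

```python
import math


def one_compartment(c0, k, t):
    """Concentration at time t for first-order elimination: c0 * exp(-k t)."""
    return c0 * math.exp(-k * t)


def half_life(k):
    """Time for the one-compartment concentration to fall to half of c0."""
    return math.log(2) / k


def weighted_exponentials(t, terms):
    """Overall concentration as a weighted sum of exponential terms.

    terms: list of (weight, c0, k) tuples, one per compartment.
    ASSUMPTION: the weights are volume fractions that add up to 1.
    """
    total_weight = sum(w for w, _, _ in terms)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * c0 * math.exp(-k * t) for w, c0, k in terms)


if __name__ == "__main__":
    # HYPOTHETICAL parameters: concentration in mg/L, rate constants in 1/h.
    print(one_compartment(10.0, 0.1, half_life(0.1)))
    fast_and_slow = [(0.3, 10.0, 1.0), (0.7, 10.0, 0.1)]
    for t in (0.0, 1.0, 10.0):
        print(t, weighted_exponentials(t, fast_and_slow))
```

In the two-term example the rapidly perfused term dominates the early decline and the slowly perfused term dominates at later times, which is why a single exponential cannot describe both phases.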
In summary, PK models are toxico-kinetic models used to relate chemical concentration in
tissues to the time of toxic responses.
Pharmacodynamic (PD) models, on the other hand, relate a biological response to the
concentration of the chemical in tissue.[41] PD models based on anatomy, physiology,
biochemistry, and/or biology are called physiologically-based pharmacodynamic (PBPD)
models. PD models can be linear or nonlinear. Linear models should be used with caution
because they do not consider the upper limit of responses and assume that responses always
increase when concentrations increase.
Physiologically-based kinetic (PBK) modelling, also known as physiologically-based
pharmaco-kinetic (PBPK) modelling, physiologically-based toxico-kinetic (PBTK) modelling or
physiologically-based bio-kinetic (PBBK) modelling, has contributed to understanding plasma
and tissue dosimetry.[42] The accuracy of the predictions made by these models depends on
the availability and quality of the information on the anatomy and physiology of the target
animal/human as well as biochemical information specific for the toxicant being investigated.
Advances in computational systems biology pathway modelling might be able to support
extrapolation across concentrations and provide concentrations for in vitro/in vivo
extrapolation (IVIVE).[43]
Multiple biological processes occur during pathway perturbations, including absorption, action
at target tissues, molecular events on a cellular level, inhibition or activation of toxicity
pathways, and adaptation to stressors. A chemical may trigger a cascade of events that ultimately
lead to an adverse effect. Information about the exact pathway perturbed by the toxicant and
the specific step in that particular pathway helps in building models with better predictive power.
PBPK modeling based on reverse dosimetry is a reliable method for predicting in
vivo developmental toxicity using in vitro concentration data translated to in vivo
dose-response data.[44] To validate the prediction, benchmark dose (BMD) analysis is performed on
the predicted dose-response data by comparing benchmark dose lower confidence limit (BMDL)
values with the corresponding values derived from the data on the effects of these chemicals using in
vitro assays reported in the literature.[45]
Several researchers have accurately predicted in vivo developmental toxicity for chemicals
using PBK modelling-based reverse dosimetry approaches, translating an in vitro concentration
into an in vivo dose.[46] This approach can be used for various toxicological endpoints. This
method is able to predict the fold levels of dose required to induce teratogenic effects
in vivo. Any form of integrated testing should focus on predicting in vivo dosimetry at target
tissues.[47,48]
A typical PBK model includes separate compartments for blood, liver, fat, rapidly perfused tissues, and
slowly perfused tissues. The values for physiological/anatomical parameters are taken from the
literature. The main drawback of PBK models is that they only describe the kinetics of parent
compounds and not their metabolites. Furthermore, identifying the most influential
parameters of a model is also a major challenge. For that, it is necessary to rely upon in vivo data
gathered from kinetic studies of the chemical in animals.
2.6. Uncertainty factors
Uncertainty factors (UFs) (also called assessment/extrapolation/risk factors) are used for
assessing risk from chemical exposure or the recommended daily intake of chemicals.[49] A UF
model is the simplest form of model for inter-species extrapolation (e.g., from animals to
humans), intra-species extrapolation (e.g., from healthy people to special groups of the
population such as elderly people, pregnant women, children, and fetuses), or exposure
duration extrapolation (e.g., from short exposure to long exposure). It requires two main
factors: the no-observed-adverse-effect level (NOAEL), which is the highest dose not exhibiting
observable toxicity, and a UF, which is a numerical value to account for variability between
species, within species, in exposure duration, or in exposed dose. Extrapolation is done by dividing
NOAEL by UF. However, there are two limitations for using NOAEL:
• The definition of NOAEL indicates the absence of an appreciable risk of toxicity, but it
does not indicate a zero-effect threshold.
• NOAEL values are not constants and can vary depending on experimental design, such
as the number of tested animals, number of doses, and toxicity endpoints.
It was shown that low statistical power (e.g., a small number of tested animals or a small
number of tested doses) would result in a higher NOAEL. However, it is possible to use a
lowest-observed-adverse-effect level (LOAEL, which is the lowest dose or concentration that causes
an observed adverse effect) or to use a benchmark dose level (BMDL, which is ‘the lower statistical
confidence limit of the dose resulting in a predetermined response’) if NOAEL is not
available.[50,51]
In addition to UFs, modifying factors (MFs) are used to account for uncertainties in the data and
the database. Additionally, safety factors (SFs) are used for irreversible effects, such as
teratogenicity and non-genotoxic carcinogenicity. The values of MFs and SFs should not exceed
a value of 10.[50] Although existing UFs account for intra-species variability, the use of
additional factors for child safety is recommended.
UFs are necessary to estimate reference dose (RfD) and reference concentration (RfC). RfD or
RfC provide quantitative information for use in risk assessments for health effects known or
assumed to be produced through a nonlinear (presumed threshold) mode of action. The
reference values are calculated in general as:
$$\text{RfD (or RfC)} = \frac{\text{POD}}{\text{UF}}$$
(2.3)
where POD is the point of departure (e.g., NOAEL, LOAEL, or BMDL). Although a default UF of
100 has been proposed, this default value does not account for the quality of the database, the
nature of the effect, the duration of the exposure, route-to-route extrapolation, and
consideration of special groups of the population.[51]
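The extrapolation described in this section (dividing a point of departure by the uncertainty factor) can be sketched in Python as follows. The function name, the dictionary of sub-factors, and the numeric point of departure are hypothetical; the split of the default factor of 100 into 10 for inter-species and 10 for intra-species differences is the conventional one and is stated here as an assumption, since this section does not give the breakdown.

```python
def reference_value(pod, uncertainty_factors):
    """Reference dose or concentration: point of departure / overall UF.

    pod: NOAEL, LOAEL, or BMDL, in the units of the desired reference value.
    uncertainty_factors: mapping of factor name -> numeric factor (>= 1);
    the overall UF is taken as the product of the individual factors.
    """
    overall_uf = 1.0
    for name, factor in uncertainty_factors.items():
        if factor < 1:
            raise ValueError(f"uncertainty factor {name!r} must be >= 1")
        overall_uf *= factor
    return pod / overall_uf


if __name__ == "__main__":
    # HYPOTHETICAL point of departure: NOAEL of 50 mg/kg/day from an animal study.
    default_ufs = {"inter-species": 10, "intra-species": 10}  # overall UF = 100
    print(reference_value(50.0, default_ufs))

    # An extra (illustrative) factor, e.g. when only a LOAEL is available.
    with_loael = dict(default_ufs, **{"LOAEL-to-NOAEL": 10})
    print(reference_value(50.0, with_loael))
```

Keeping the sub-factors in a named mapping makes it explicit which sources of variability were accounted for, which matters because the default value does not cover all of them.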
There are several advantages of UF models. They are easy to implement and understand. They
provide adequate safety levels for single chemicals and mixtures of chemicals. Additionally,
they account for inter-species and inter-individual as well as PK and PD differences. However,
there are some limitations of UF models. Default UFs or sub-factors are not conservative nor do
they assume the worst-case scenario. Therefore, extrapolated safety levels of chemicals are
not always below the realistic safety threshold for humans. These models cannot be used to
extrapolate toxicity levels of genotoxic carcinogens because these chemicals always cause
toxicity effects that are proportional to the dose.
2.7. Molecular Docking
Molecular docking is an in silico method used to predict the binding mode of any given
molecule interacting with a specific biological target binding site.[52] Molecular docking
predicts the structure of receptor-ligand complexes, where the receptor is usually a protein or
a protein oligomer and the ligand is either a small molecule or another protein.[53] This
technique is still emerging in the field of predictive toxicology. The main goals of molecular
docking include:[54]
• Predicting a preferred orientation (pose) of a given molecule with respect to a target
biomolecule (accurate structural modeling).
• Scoring the strength of the established binding interactions (correct prediction of
activity).

The identification of molecular structures responsible for biological recognition and the
prediction of compound toxicity are complex issues that are often difficult to understand and
simulate. Different models have been proposed for predicting molecular recognition. The lock-
and-key model was first proposed by Emil Fischer in 1894.[55] In this model, recognition occurs
when the shape of the receptor active site is exactly complementary to the ligand shape and
the ligand fits as a key in a lock. The early docking programs were based on the lock-and-key
model by treating the receptor and ligand as rigid bodies. The induced fit model proposed by
Koshland[56] allows both the receptor and ligand to adapt their structures for optimal
binding to each other. Finally, the conformational selection model considers different possible
conformations of both the receptor and the ligand in solution, and identifies the preferential
binding combination.[53]
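The stability of a particular receptor-ligand complex can be measured by determining the
equilibrium binding constant $K_b$, which is directly related to the Gibbs free binding energy $\Delta G_b$:
$$\Delta G_b = -RT \ln K_b$$
(2.4)
where $R$ is the universal gas constant and $T$ is the absolute temperature.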
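The relation between the binding constant and the Gibbs free binding energy can be evaluated numerically with a short Python sketch. It assumes the standard thermodynamic relation between free energy and the logarithm of a dimensionless association constant (referred to a 1 M standard state); the function names and example values are illustrative.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def binding_free_energy(k_bind, temperature=298.15):
    """Gibbs free binding energy in kJ/mol from a dimensionless binding constant."""
    if k_bind <= 0:
        raise ValueError("binding constant must be positive")
    return -R * temperature * math.log(k_bind) / 1000.0


def binding_constant(delta_g_kj, temperature=298.15):
    """Inverse relation: binding constant from a free energy given in kJ/mol."""
    return math.exp(-delta_g_kj * 1000.0 / (R * temperature))


if __name__ == "__main__":
    # HYPOTHETICAL complex with K = 1e6 (micromolar affinity) at 25 degrees C.
    dg = binding_free_energy(1.0e6)
    print(dg)                    # roughly -34 kJ/mol
    print(binding_constant(dg))  # recovers the original constant
```

Because the dependence is logarithmic, an error of a few kJ/mol in a predicted free energy corresponds to roughly an order of magnitude in the binding constant, which is one reason why accurate scoring is demanding.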
Estimating binding free energies accurately is a time-consuming process, particularly
considering all possible poses available. Even when binding conformations are correctly
predicted, it is necessary to differentiate correct poses from incorrect ones. Thus, the need for
fast and reliable estimates has led to the definition of different scoring functions, which are
based on different simplifying assumptions. Empirical scoring functions are logical extensions
of quantitative structure-activity relationships (QSAR), although for docking, those empirical functions
are based on receptor-ligand structure properties rather than on ligand properties alone.
Docking is usually devised as a multi-step process, involving subsequent molecular dynamics
(MD) or Monte Carlo (MC) simulations to sample particular binding modes more thoroughly
and to obtain a more accurate estimate of the binding free energy.
Accuracy of the binding energy predictions, particularly when the quality of sampling of the
conformational and configurational space associated with macromolecular complexes is
questionable, is an important issue in molecular docking. Furthermore, scoring and reliable
ranking of the test compounds are important bottlenecks in structure-based virtual
screening.[54] Usually, molecular docking software uses proprietary scoring methods, which
causes difficulties with validation of the results.

3. Proposed Modelling Strategy
In this section, a general strategy is proposed for assessing the toxicity of chemical compounds
in humans integrating different in silico methods. The most relevant modeling methods
available for estimating toxicity were described in the previous section. Even though
each one has its own advantages, there is no perfect method for accurately predicting
toxicity. The main reason for this is that toxic effects are the result of complex interactions
between a chemical compound or its metabolites, and all the compounds involved in different
metabolic pathways present in the body. Until now, a single model considering all possible
interactions and effects for all metabolic pathways in the human body has not been created.
Thus, we propose integrating different available modeling methods, in order to obtain more
reliable predictions of toxicity. The proposed strategy is summarized in Figure 1:
Figure 1. Flow-diagram of the proposed modeling strategy for assessing toxicity at early stages
of product development.

The proposed strategy integrates four different modeling strategies commonly used to predict
toxicity of chemical compounds:[57] 1) Toxicity estimation from the chemical structure (using
either Quantitative Structure Activity Relationships modeling (QSAR), Structural Alerts (SA), or
any other suitable Machine Learning (ML) model), 2) Molecular modeling (molecular docking),
3) Physiologically-Based Pharmaco-Kinetic modeling (PBPK), and 4) Uncertainty Factors (UF).
The chemical structure of the molecule to be tested is the input for this strategy. The structure
must be converted into a single-line query format (e.g. SMILES,[58] InChI,[59] etc.) in order to
facilitate searching. Alternatively, any existing single-line description of the chemical
structure can be used directly. Chemical Identifier Resolver (CIR) tools are available for
translating the structure into a single-line query, or for converting the query between different
formats.[60-62] They can even convert conventional and trade names into any query format. Some
QSAR tools already incorporate a chemical identifier resolver, in which case this step can be
omitted.
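As an illustration of the conversion step, the following minimal Python sketch builds request URLs for the Cactus CIR service listed in Table 1. The path layout (base URL, then identifier, then requested representation) and the representation keywords "smiles" and "stdinchikey" reflect our understanding of the public service and should be treated as assumptions; the helper name cir_url is hypothetical. No network request is made here.

```python
from urllib.parse import quote

# Base URL of the Cactus Chemical Identifier Resolver (as listed in Table 1).
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier: str, representation: str) -> str:
    """Build a CIR request URL: <base>/<identifier>/<representation>.

    The path layout is an assumption about the public service, not an
    official specification.
    """
    # Percent-encode every reserved character (including '/', '#', '=') so
    # that a SMILES or InChI string stays within a single path segment.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


# A trade name and a CAS number as alternative identifiers.
print(cir_url("cyproconazole", "smiles"))
print(cir_url("94361-06-5", "stdinchikey"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the returned text is the requested single-line representation.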
The next step is the preliminary prediction of toxicity from the chemical structure using
statistical or machine learning (ML) methods. These models rely heavily on training data, so
access to reliable and up-to-date databases is critical. One example of a web-based machine-
learning tool for predicting toxicity is eMolTox.[63] eMolTox incorporates a conformal
prediction confidence measure for improving the predictive capability of conventional QSAR
models.[64] Such a confidence measure can be used as a criterion for deciding whether or not
the molecule is potentially toxic and should be considered for the next modeling stages. For
example, if a 95% confidence level is chosen, all molecules showing a confidence value ≥
0.95 should be considered potentially toxic, and they continue to the next modeling
stage. If the confidence is < 0.95, the molecule can be considered potentially safe. In
this case, the next step would be validation using experimental toxicology
methods.
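The decision rule described above can be sketched in a few lines of Python. The function name and the returned messages are hypothetical and only illustrate the branching; the 0.95 default is the example confidence level used in the text, not a recommended value.

```python
def first_stage_decision(confidence: float, confidence_level: float = 0.95) -> str:
    """Route a molecule after the structure-based (QSAR/ML) stage.

    confidence: confidence value reported for the toxic prediction (0 to 1).
    confidence_level: user-chosen threshold (0.95 in the example of the text).
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    if confidence >= confidence_level:
        # Flagged: the molecule moves on to molecular docking.
        return "potentially toxic: continue with molecular modeling"
    # Not flagged: in silico screening stops, experiments take over.
    return "potentially safe: proceed to experimental validation"


print(first_stage_decision(0.97))
print(first_stage_decision(0.80))
```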
It is important to note that methods based on chemical structure will generate toxicity alerts
whenever similar structures are known to be toxic. It is therefore possible that a safe
compound generates a false positive alert when relatively similar compounds are toxic.
However, the possibility of a false negative is low, unless the molecule in question possesses a
completely new or extremely rare chemical structure, which would be very unlikely.
The fact that a molecule is flagged as potentially toxic because similar structures are toxic does
not necessarily mean that the molecule is toxic. This cannot be determined using predictive
methods based on structural similarities alone. For that reason, additional levels of analysis are
required.
The second stage of the proposed strategy consists of performing molecular simulations for
understanding the potential interaction between the molecules under consideration and
different critical biomolecular targets present in the human body. The selection of those critical
biomolecules, particularly enzymes, proteins and receptors, depends on several factors,
including the metabolic pathway(s) of interest for the evaluation of toxicity, the current
knowledge of the biomolecules involved in each metabolic pathway, and the identification of
the most critical and sensitive biomolecules of the pathway.
Ideally, the structural alerts from the first stage method will provide some insights into the
metabolic pathways that might be potentially affected by the candidate molecule. This is a
valuable input for screening the pathways and target biomolecules to be tested by molecular
simulation. However, it is also possible to consider a standard set of critical targets, covering
the most sensitive pathways. Vedani and coworkers,[65] for example, proposed 16 critical
biomolecules for testing toxicity: 10 nuclear receptors (androgen, estrogen α,
estrogen β, glucocorticoid, liver X, mineralocorticoid, peroxisome proliferator-activated
receptor γ, progesterone, thyroid α, and thyroid β), four members of the cytochrome P450
enzyme family (1A2, 2C9, 2D6, and 3A4), a cytosolic transcription factor (aryl hydrocarbon
receptor) and a potassium ion channel (hERG). Thus, by running different molecular
simulations, the interaction energy between the candidate molecule and each critical
biomolecule can be determined. They also proposed a toxicity index denoted as the Toxic
Potential (TP), which weighs the effect of the chemical compound on the different
biomolecules tested. TP values range from 0 to 1: TP ≤ 0.2 indicates low toxicity, 0.2 <
TP ≤ 0.6 moderate, 0.6 < TP ≤ 0.8 high, and TP > 0.8 extreme. In this case, a critical TP
value of 0.2 can be used to determine whether or not the last stage of toxicity modeling is required.
This approach has been implemented in the VirtualToxLab platform. VirtualToxLab performs
flexible molecular docking calculations for determining the different binding modes between
the candidate molecule and the biomolecule, and then uses a multi-dimensional QSAR method
for estimating the corresponding binding affinities, which then are used for calculating the TP
value.
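The TP ranges quoted above, together with the critical value of 0.2, can be expressed as a small Python sketch. The function names are hypothetical, and treating "TP strictly greater than the critical value" as the trigger for the last modeling stage is our reading of the text rather than a documented rule of VirtualToxLab.

```python
def tp_category(tp: float) -> str:
    """Map a Toxic Potential value (0 to 1) onto the four ranges in the text."""
    if not 0.0 <= tp <= 1.0:
        raise ValueError("TP must lie between 0 and 1")
    if tp <= 0.2:
        return "low"
    if tp <= 0.6:
        return "moderate"
    if tp <= 0.8:
        return "high"
    return "extreme"


def needs_pbpk_stage(tp: float, critical_tp: float = 0.2) -> bool:
    """Assumed rule: continue to PBPK modeling only above the critical TP."""
    return tp > critical_tp


for value in (0.15, 0.45, 0.75, 0.90):
    print(value, tp_category(value), needs_pbpk_stage(value))
```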
It is also possible to estimate docking between a chemical compound and a target biomolecule
using the web-based 1-Click docking tool by mcule. This application allows drawing any
chemical structure and performing molecular docking against a database of more than 9,000
known targets (~3,500 for Homo sapiens); any custom target not included in the
database can also be used. Other on-line molecular docking tools are available (some of them
free), including SwissDock and DockingServer.
Molecular modeling predicts whether or not a certain chemical compound can disrupt a critical
metabolic pathway. However, the bioavailability of the molecule at the site of action has not
yet been considered. To address this shortcoming, PBPK modeling is used as the next step of
analysis. Nevertheless, compounds with a low toxic potential, while considered potentially
safe, are not necessarily benign. Those compounds may still cause adverse effects through
other targets or by other mechanisms. Therefore, in those cases an experimental validation of
toxicity (either in vitro or in animal experiments) is always required.
As previously mentioned, pharmacokinetics (PK) is the study of the time course for the
absorption, distribution, metabolism, and excretion of a chemical substance in a biological
system.[66] In PBPK, the concentration of a chemical in the major organs of the body is directly
modeled, following a specific parameterization of those organs and their interaction with the
compound. A complete PBPK model comprises more than 30 different parameters. Most of
those parameters cannot be directly measured in a human body and thus they must be
estimated from experimental data. However, PBPK software usually includes a recommended
set of parameter values, already fitted from available data for different chemical compounds.
There are some parameters specific to each chemical compound, such as the tissue/blood
partition coefficients. These parameters cannot be predicted exactly from dissociation or lipid
solubility constants alone, and therefore they have relatively large uncertainties. However,
certain rules of thumb can be used for estimating those parameters with reasonable
confidence,[67] good enough for predicting extracellular concentrations and the concentrations
of highly lipid-soluble solutes.[68]
PBPK modeling has been mainly developed for pharmaceutical applications, but it is a useful
technology for predicting the toxicity of any type of chemical compound. PBPK is capable of
relating toxicity data from animals with toxicity to humans, by assuming that toxic effects
occur at the same tissue concentration in both cases. Furthermore, PBPK is usually focused on
a specific organ affected by the compound. The target organ(s) can be determined from the
structural alerts obtained during the structural toxicity evaluation of the chemical. Accurate
estimation of relevant parameters for the compound is usually obtained from animal data.
However, when minimizing animal testing is desirable, only theoretical estimates can be used.
Thus, even though the exact level of toxicity of the compound cannot be predicted, a relatively
fair approximation is expected. Furthermore, it is common to build specialized PBPK models for
each particular case instead of using general PBPK models. However, since the purpose here is
to screen potentially toxic compounds at early stages of development, general models may be
sufficient.
Commercial software available for PBPK modeling includes ADMET Predictor, Simcyp Simulator,
and PK-Sim, among others. These software packages are mainly oriented towards the
pharmaceutical industry.
For educational and illustrative purposes, free PBPK packages are available. An example of
such freeware is PKQuest, which is a versatile tool requiring a minimum of user input
parameters.[69] PKQuest calculations cannot be used as official toxicity information. However,
those results can be used as preliminary estimates of toxicity for screening new chemical
compounds at early stages of development.
Another free alternative for PBPK modeling is the R package httk,[70] which is
freely available.[71] The advantage of this package is that it estimates the specific parameters
for the chemical compound if experimental data are unavailable. Another alternative for
estimating PBPK parameters for a chemical compound from its structure is the
PaDEL-DDPredictor tool.[72]
The last stage of the proposed approach is optional, and is included only if the chemical
product is intended for sensitive population groups, or might indirectly result in their
exposure. In those cases, the estimates of critical doses are corrected using Eq. (2.3).
Table 1 to Table 4 summarize some relevant available tools and resources that can be used
with the proposed strategy.
Table 1. Structure to single-line query conversion
Tool | Type | License | Link
Chemspider | Web-based | Free | http://www.chemspider.com/StructureSearch.aspx
Cactus CIR | Web-based | Free | https://cactus.nci.nih.gov/chemical/structure
CIR for Knime | Plug-in | Free | https://www.knime.com/book/chemical-identifier-resolver-for-knime-trusted-extension
Table 2. Structural toxicity search
Tool | Type | License | Link
eMolTox | Web-based | Free | http://xundrug.cn/moltox
ToxAlerts | Web-based | Free, registration required | http://ochem.eu/alerts
OECD QSAR Toolbox | Desktop-based | Free, download required | http://www.qsartoolbox.org/
Derek Nexus | Desktop-based | Proprietary license | https://www.lhasalimited.org/products/derek-nexus.htm#
QSAR Databank | Web-based | Free | http://qsardb.org/

Table 3. Molecular Docking
Tool | Type | License | Link
Mcule 1-Click docking | Web-based | Free up to 50 docks/month | https://mcule.com/apps/1-click-docking/
VirtualToxLab | Desktop-based | Free license for non-profit orgs. | http://www.biograf.ch/index.php?id=projects&subid=virtualtoxlab
Protein Data Bank | Web-based | Free | https://www.rcsb.org/
SwissDock | Web-based | Free for academic use | http://www.swissdock.ch/docking
DockingServer | Web-based | Limited free access | https://www.dockingserver.com/web
Table 4. PBPK Modeling
Tool | Type | License | Link
PKQuest | Desktop-based | Free | http://www.pkquest.com/
HTTK | R-package | Free | https://cran.r-project.org/package=httk
ADMET Predictor | Desktop-based | Proprietary license | https://www.simulations-plus.com/software/admetpredictor/
Simcyp | Desktop-based | Proprietary license | https://www.certara.com/software/physiologically-based-pharmacokinetic-modeling-and-simulation/simcyp-simulator/
PK-Sim | Desktop-based | Proprietary license | http://www.systems-biology.com/products/pk-sim/
PaDEL-DDPredictor | Desktop-based | Free | https://omictools.com/padel-ddpredictor-tool

4. Example: Cyproconazole
As an example of the proposed strategy, Cyproconazole
(2-(4-Chlorophenyl)-3-cyclopropyl-1-(1H-1,2,4-triazol-1-yl)-2-butanol; CAS No. 94361-06-5) was
tested. It is a pesticide molecule known to disrupt Retinoic Acid (RA) metabolism.[73] RA is an
important signaling molecule in embryonic development, providing transcriptional control over
genes affecting neural tube development, limb formation, and numerous other functions.
Dysregulation and developmental defects can occur either if RA levels are inadequate or if
excessive levels are present at certain stages of development.[74]
The chemical structures of cyproconazole and retinoic acid are presented in Figure 2 and Figure
3, respectively:
Figure 2. Chemical structure of Cyproconazole
Figure 3. Chemical structure of Retinoic Acid
According to the Pesticide Properties Database (PPDB) of the University of Hertfordshire,[75]
cyproconazole is a commonly used fungicide approved for use in different countries. It is a
volatile compound, moderately soluble in water and readily soluble in many organic solvents. It
has a high risk of leaching into groundwater, and can be persistent in both soil and water
systems. It is moderately toxic to mammals, highly toxic to birds, and moderately toxic to most
aquatic organisms, earthworms, and honeybees.

Developmental toxicity of cyproconazole in female Wistar rats was investigated by
Machera,[76] indicating that cyproconazole has embryo-toxicity (from 100 mg/kg), fetal
toxicity (from 50 mg/kg), and teratogenic potential (from 20 mg/kg). The main teratogenic
effects observed were cleft palate, hydrocephaly, and severe retardation of ossification. The
teratogenic hazard of triazoles is probably caused by a combination of bioavailability and
cytochrome P-450 mediated metabolism by the embryo cells to form toxic metabolites.[77]
Cyproconazole was also predicted as a reproductive toxicant (developmental toxicity) using
EPA’s ToxCast.[78]
The first step of the proposed strategy is converting the chemical structure into single-line
query formats. Using Chemspider, the following results were obtained:
SMILES: CC(C1CC1)C(Cn2cncn2)(c3ccc(cc3)Cl)O
InChI=1S/C15H18ClN3O/c1-11(12-2-3-12)15(20,8-19-10-17-9-18-19)13-4-6-14(16)7-5-13/h4-7,9-12,20H,2-3,8H2,1H3
InChIKey: UFNOUKDBUJZYDE-UHFFFAOYSA-N
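The single-line formats also carry basic composition information. As a minimal sketch, the following Python code extracts the molecular formula from the InChI string above (the field after the first "/") and computes the molar mass, which is needed later for converting between molar and mass concentrations. The atomic masses are rounded standard values supplied by us, and the helper names are hypothetical; the simple parser only handles plain formulas such as the one shown here.

```python
import re

# Rounded standard atomic masses in g/mol (only the elements needed here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "Cl": 35.45, "N": 14.007, "O": 15.999}

INCHI = ("InChI=1S/C15H18ClN3O/c1-11(12-2-3-12)15(20,8-19-10-17-9-18-19)"
         "13-4-6-14(16)7-5-13/h4-7,9-12,20H,2-3,8H2,1H3")


def formula_from_inchi(inchi: str) -> str:
    """Return the formula layer, i.e. the field after the first '/'."""
    return inchi.split("/")[1]


def molar_mass(formula: str) -> float:
    """Sum atomic masses for a plain formula such as 'C15H18ClN3O'."""
    total = 0.0
    # Each match is (element symbol, optional count); empty count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


formula = formula_from_inchi(INCHI)
print(formula, round(molar_mass(formula), 2), "g/mol")
```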
Furthermore, Chemspider provides access to the following experimental and predicted toxicity
information:
Experimental data: Organic Compound; Organochloride; Pesticide; Amine; Insecticide;
Preservative; Synthetic Compound Toxin, Toxin-Target Database T3D4492
Predicted (ACD/Labs): #Rule of 5 Violations: 0, LogP: 2.70, LogD (pH 5.5): 2.95, BCF
(pH 5.5): 102.05, LogD (pH 7.4): 2.95, BCF (pH 7.4): 102.62.
Predicted (EPISuite): Log Kow (KOWWIN v1.67 estimate) = 3.25, Bioaccumulation Estimates:
Log BCF from regression-based method = 1.533 (BCF = 34.12).
The single-line query formats can then be used to search for potential toxicity matches in
structural toxicity search tools. For this example, eMolTox was used at this stage, considering a
significance level (expected frequency of errors) of 0.02. The results obtained are summarized
in Figure 4:

Figure 4. eMolTox output results obtained for Cyproconazole
This tool also provides some additional predictions, including LogP = 2.865. eMolTox indicates
a potential toxicity risk for this chemical structure, involving cytochrome P450
2C19.
Since there is a potential toxicity match in QSAR, it is necessary to investigate the toxicity of
the molecule in more detail. This is done by molecular docking. VirtualToxLab already included
cyproconazole in a toxicity prediction set of more than 2000 molecules. Their results,
evaluating molecular docking of the different isomers of cyproconazole with 16 key targets, are
presented in Table 5:
Table 5. VirtualToxLab results for Cyproconazole isomers
Compound | Toxic potential (ToxPot) | ToxPot class | Main target
Cyproconazole (RR/RS/SR/SS) | 0.613 / 0.623 / 0.569 / 0.610 | ** | 3A4
where the ToxPot class is defined as indicated in Table 6.
Thus, molecular simulation successfully predicts an interaction between cyproconazole and
cytochrome P-450 3A4, confirming the interaction with a cytochrome P-450. Furthermore, the
toxicity potential class indicates a risk of high toxicity. mcule was also used to confirm the
interaction with cytochrome P-450 2C19 (not included in VirtualToxLab). The structure of
this target was obtained from the Protein Data Bank.[79] After testing different potential
binding sites, one of the strongest interactions found, with a docking score of -7.2 (more
negative values indicate higher binding affinity), is illustrated in Figure 5. This result, along
with the information from VirtualToxLab, confirms the initial indication of a potential toxic
effect on cytochrome P-450 targets. After testing other potential targets, high affinity was also
found for the 1A28 progesterone receptor (Figure 6, docking score of -7.1), thus also suggesting
relevant reproductive toxicity.
Table 6. VirtualToxLab definition of ToxPot class.[80]
ToxPot class | Symbol | Toxic potential range | Possible interpretation¹ with respect to binding towards one of the five tested protein classes² (Note³)
n/a | | ToxPot ≤ 0.300 | unlikely to show any adverse effect (triggered by the 16 target proteins tested in the VirtualToxLab)
 | | 0.300 < ToxPot ≤ 0.400 | compound may bind weakly to a single target class
0 | ~ | 0.400 < ToxPot ≤ 0.500 | compound may bind modestly to a single target class or weakly to several target classes
I | * | 0.500 < ToxPot ≤ 0.600 | compound binds moderately to one target class (e.g. hERG) or modestly to several classes
II | ** | 0.600 < ToxPot ≤ 0.700 | compound binds strongly to one target class (e.g. hERG) or moderately to several classes
III | *** | 0.700 < ToxPot ≤ 0.800 | compound binds strongly to two target classes (e.g. nuclear receptor classes I, II)
IV | **** | ToxPot > 0.800 | compound binds strongly to three target classes (e.g. nuclear receptor classes I, II and cytochrome P450 enzymes)
¹ Please note that a compound with a ToxPot < 0.5 may still trigger adverse effects, particularly upon
continued exposure.
² Protein classes: nuclear receptor class I (AR, ERα, ERβ, GR, MR, PR), nuclear receptor class II (LXR,
PPARγ, TRα, TRβ), cytochromes (1A2, 2C9, 2D6, 3A4), hERG, and AhR.
³ A more accurate assessment is possible by analyzing the individual binding affinities towards the 16
target proteins (using the VirtualToxLab interface).
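The ToxPot ranges of Table 6 can be applied programmatically. The following Python sketch maps a ToxPot value to its range using the upper bounds listed in the table, and applies it to the four isomer values reported in Table 5. The function name and the label strings are ours; labels for the two lowest ranges are given as plain ranges because the class designations are not fully legible in the table.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of the ToxPot ranges listed in Table 6.
UPPER_BOUNDS = [0.300, 0.400, 0.500, 0.600, 0.700, 0.800]
LABELS = [
    "ToxPot <= 0.300",
    "0.300 < ToxPot <= 0.400",
    "class 0 (~)",
    "class I (*)",
    "class II (**)",
    "class III (***)",
    "class IV (****)",
]


def toxpot_class(toxpot: float) -> str:
    """Return the Table 6 range label for a ToxPot value."""
    # bisect_left keeps a value equal to an upper bound in the lower range,
    # matching the 'lower < ToxPot <= upper' convention of the table.
    return LABELS[bisect_left(UPPER_BOUNDS, toxpot)]


isomers = {"RR": 0.613, "RS": 0.623, "SR": 0.569, "SS": 0.610}
for name, value in isomers.items():
    print(name, value, toxpot_class(value))
```

Applied to the individual values, three of the four isomers fall in the class II range, in line with the "**" entry of Table 5, while the SR value (0.569) lies in the class I range.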

Figure 5. High binding affinity docking pose for the complex cyproconazole-CYP2C19 (mcule)
Figure 6. High binding affinity docking pose for the complex cyproconazole-1A28 progesterone
(mcule)

For the particular case of the RA metabolic pathway, the set of target biomolecules might also
have included: nuclear receptors (RARα, RARβ, RARγ, RXRα, RXRβ, RXRγ), phase 1 oxidation
enzymes, phase 2 conjugation enzymes, phase 3 transporters, and cytochromes CYP26A1,
CYP26B1, CYP26C1.
Even though potential toxicity has been confirmed at the molecular level, the compound must
also be bioavailable to exert a toxic effect. Chemspider already predicted no violations of
Lipinski’s rule of five for orally active drugs, and provided some bioaccumulation estimates
(BCF). To confirm bioavailability and estimate the toxic dose, PBPK modeling is used in the
final stage. PBPK models require parameter values for the organism (and specific internal
organs) and for the chemical compound of interest. In principle, the default parameters of the
organism can be used, which correspond to those of an average person. The specific parameters
of the chemical compound are more difficult to obtain. They are usually obtained from in vivo
data but, since reducing animal experiments is an important goal of this procedure, they are
estimated instead. For that reason, the R package httk is used here as an example for
predicting the toxic dose of cyproconazole in humans. Using the function calc_mc_oral_equiv,
it is possible to predict the oral equivalent dose of a compound from the estimated steady-state
plasma concentration obtained by Monte Carlo/PBPK simulation. The parameters used for the
PBPK model can be obtained from the function parameterize_pbtk. The Monte Carlo component of
this method accounts for uncertainty in the estimation of the different parameters. For
cyproconazole, the following results are predicted by httk:
Table 7. PBPK modeling results obtained using the httk package.
Quantile | 5% | 50% | 95%
Steady state plasma concentration [μM] | 4.44 | 15.30 | 43.54
Equivalent oral dose [mg/(kg.day)] | 0.02254 | 0.00654 | 0.00230
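The two rows of Table 7 are related by reverse dosimetry: if the steady-state plasma concentration is assumed to scale linearly with the dose rate, the oral equivalent dose is the target plasma concentration divided by the steady-state concentration obtained per unit dose rate. The Python sketch below illustrates this relation. Note that the unit dose rate of 1 mg/(kg.day) and the target concentration of 0.1 μM are not stated in this report; they are inferred here from the fact that the product of both rows is approximately 0.1 in every column, and should be read as assumptions.

```python
def oral_equivalent_dose(target_conc_uM: float, css_uM_per_unit_dose: float) -> float:
    """Dose rate [mg/(kg.day)] giving the target steady-state concentration.

    Assumes linear kinetics: Css is proportional to the dose rate, and
    css_uM_per_unit_dose is the Css obtained for 1 mg/(kg.day).
    """
    return target_conc_uM / css_uM_per_unit_dose


# Values from Table 7: quantile -> (Css [uM], reported dose [mg/(kg.day)]).
table7 = {"5%": (4.44, 0.02254), "50%": (15.30, 0.00654), "95%": (43.54, 0.00230)}

ASSUMED_TARGET_UM = 0.1  # inferred from the table, not stated in the report

for quantile, (css, reported) in table7.items():
    estimate = oral_equivalent_dose(ASSUMED_TARGET_UM, css)
    print(quantile, round(estimate, 5), reported)
```

The higher Css quantiles correspond to more sensitive individuals, which is why the equivalent dose decreases from the 5% to the 95% column.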
Other properties that can be determined include clearance (calc_total_clearance), volume of
distribution (calc_vdist), and elimination rate (calc_elimination_rate), among others. For
cyproconazole, the following PK properties are found:
Clearance = 0.0259 L/(kg.h)
Volume of distribution = 1.918 L/kg
Elimination rate = 0.0135 1/h (thus, predicted half-life = ln 2/0.0135 ≈ 51.34 h).
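These three values are mutually consistent under a simple first-order, one-compartment picture, in which the elimination rate is the clearance divided by the volume of distribution and the half-life is ln 2 divided by the elimination rate. The Python sketch below checks this arithmetic; the one-compartment interpretation is our assumption for illustration and is not a description of how httk computes these quantities internally.

```python
import math


def elimination_rate(clearance_l_per_kg_h: float, vdist_l_per_kg: float) -> float:
    """First-order elimination rate constant [1/h] = clearance / volume."""
    return clearance_l_per_kg_h / vdist_l_per_kg


def half_life(rate_per_h: float) -> float:
    """Half-life [h] for first-order elimination: ln(2) / k."""
    return math.log(2) / rate_per_h


k = elimination_rate(0.0259, 1.918)  # values reported above
print(round(k, 4), "1/h")
print(round(half_life(0.0135), 2), "h")
```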

Thus, the relatively low equivalent oral doses required to achieve steady state, along with the
relatively long half-life for elimination of the compound, indicate that human exposure to
cyproconazole may easily result in adverse outcomes, particularly developmental toxicity (as
found in the previous stages of the simulation).
5. Conclusion
Computational toxicology is a rapidly evolving discipline that integrates information and data
from different sources to develop mathematical and computer-based models to better
understand and predict adverse health effects caused by chemicals. Different tools and
methods have been developed for the in silico assessment of toxicity of chemical compounds.
Each method has its own advantages and its drawbacks. Thus, there is no perfect method for
completely providing a toxic profile of a compound. In this report, a strategy for assessing
toxicity is proposed by synergistically combining different models available.
Following the proposed strategy, the toxicity assessment of a chemical compound can be done
within one day, if information on all target molecules of interest and binding centers, as well as
all pharmacokinetic parameters, is available. In a worst-case scenario, if this information is
not readily available, the whole simulation process for a single chemical compound might take
up to half a month to complete. This is a significant time reduction compared to
experimental toxicity evaluations, not to mention that animal experiments are minimized.
However, the proposed strategy DOES NOT replace toxicology experiments for ensuring that a
chemical compound is safe, but it DOES replace toxicology experiments for screening out
potentially toxic compounds in an early stage of development.[81] In the same line of thought,
since all the results are obtained from computational models, it cannot be expected that they
are absolutely certain [82] and thus, they must always be used with caution.
Acknowledgments
The authors gratefully acknowledge helpful discussions with Louis Hom and Debbie Narver, as
well as their support revising an initial version of the manuscript.
This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

References
[1] Kim, K.-H., Kabir, E., & Jahan, S. A. (2017). Exposure to pesticides and the associated human
health effects. Science of The Total Environment, Vol. 575, pp. 525–535.
https://doi.org/10.1016/j.scitotenv.2016.09.009
[2] Höfer, T., Gerner, I., Gundert-Remy, U., Liebsch, M., Schulte, A., Spielmann, H., …, Wettig, K.
(2004). Animal testing and alternative approaches for the human health risk assessment under
the proposed new European chemicals regulation. Archives of Toxicology, 78(10), 549–564.
[3] Ferdowsian, H. R., & Beck, N. (2011). Ethical and scientific considerations regarding animal
testing and research. PloS One, 6(9), e24059.
[4] Mattison, D. R. (2015). Computational Methods for Reproductive and Developmental
Toxicology. CRC Press.
[5] Raies, A. B., & Bajic, V. B. (2016). In silico toxicology: computational methods for the
prediction of chemical toxicity. Wiley Interdisciplinary Reviews. Computational Molecular
Science, 6(2), 147–172.
[6] Benigni, R., Battistelli, C. L., Bossa, C., Tcheremenskaia, O., & Crettaz, P. (2013). New
perspectives in toxicological information management, and the role of ISSTOX databases in
assessing chemical mutagenicity and carcinogenicity. Mutagenesis, 28(4), 401–409.
[7] Zhu, H. (2013). From QSAR to QSIIR: searching for enhanced computational toxicology
models. Methods in Molecular Biology, 930, 53–65.
[8] Venkatapathy, R., & Wang, N. C. Y. (2013). Developmental toxicity prediction. Methods in
Molecular Biology, 930, 305–340.
[9] Jack, J., Wambaugh, J., & Shah, I. (2013). Systems toxicology from genes to organs.
Methods in Molecular Biology, 930, 375–397.
[10] Roncaglioni, A., Toropov, A. A., Toropova, A. P., & Benfenati, E. (2013). In silico methods to
predict drug toxicity. Current Opinion in Pharmacology, Vol. 13, pp. 802–806.
https://doi.org/10.1016/j.coph.2013.06.001
[11] Gatnik, M. F., & Worth, A. P. (2010). Review of software tools for toxicity prediction.
Publications Office of the European Union.
[12] Guha, R. (2013). On Exploring Structure–Activity Relationships. Methods in Molecular
Biology, pp. 81–94. https://doi.org/10.1007/978-1-62703-342-8_6
[13] Blagg, J. (2010). Structural Alerts for Toxicity. Burger’s Medicinal Chemistry and Drug
Discovery. https://doi.org/10.1002/0471266949.bmc128

[14] Limban, C., Nuţă, D. C., Chiriţă, C., Negreș, S., Arsene, A. L., Goumenou, M., Sarigiannis, D.
A. (2018). The use of structural alerts to avoid the toxicity of pharmaceuticals. Toxicology
Reports, 5, 943–953.
[15] Kar, S., & Leszczynski, J. (2019). Exploration of Computational Approaches to Predict the
Toxicity of Chemical Mixtures. Toxics, 7(1). https://doi.org/10.3390/toxics7010015
[16] Alves, V., Muratov, E., Capuzzi, S., Politi, R., Low, Y., Braga, R., … Tropsha, A. (2016). Alarms
about structural alerts. Green Chemistry: An International Journal and Green Chemistry
Resource: GC, 18(16), 4348–4360.
[17] Cronin, M. T. D., & Worth, A. P. (2008). (Q)SARs for Predicting Effects Relating to
Reproductive Toxicity. QSAR & Combinatorial Science, Vol. 27, pp. 91–100.
https://doi.org/10.1002/qsar.200710118
[18] Russom, C. L., Breton, R. L., Walker, J. D., & Bradbury, S. P. (2003). An overview of the use
of quantitative structure-activity relationships for ranking and prioritizing large chemical
inventories for environmental risk assessments. Environmental Toxicology and Chemistry /
SETAC, 22(8), 1810–1821.
[19] Schmieder, P., Mekenyan, O., Bradbury, S., & Veith, G. (2003). QSAR prioritization of
chemical inventories for endocrine disruptor testing. Pure and Applied Chemistry, Vol. 75, pp.
2389–2396. https://doi.org/10.1351/pac200375112389
[20] Hewitt, M., Enoch, S. J., Madden, J. C., Przybylak, K. R., & Cronin, M. T. D. (2013).
Hepatotoxicity: a scheme for generating chemical categories for read-across, structural alerts
and insights into mechanism(s) of action. Critical Reviews in Toxicology, 43(7), 537–558.
[21] Deeb, O., & Goodarzi, M. (2012). In silico quantitative structure toxicity relationship of
chemical compounds: some case studies. Current Drug Safety, 7(4), 289–297.
[22] Devillers, J. (2013). Methods for building QSARs. Methods in Molecular Biology, 930, 3–27.
[23] Benigni, R., Bossa, C., & Tcheremenskaia, O. (2013). Nongenotoxic carcinogenicity of
chemicals: mechanisms of action and early recognition through a new set of structural alerts.
Chemical Reviews, 113(5), 2940–2957.
[24] Modi, S., Hughes, M., Garrow, A., & White, A. (2012). The value of in silico chemistry in the
safety assessment of chemicals in the consumer goods and pharmaceutical industries. Drug
Discovery Today, 17(3-4), 135–142.
[25] Valerio Jr, L. G. (2009). In silico toxicology for the pharmaceutical sciences. Toxicology and
applied pharmacology, 241(3), 356-370. https://doi.org/10.1016/j.taap.2009.08.022
[26] Alberghina, L., & Westerhoff, H. V. (2007). Systems Biology: Definitions and Perspectives.
Springer Science & Business Media.

[27] Yang, N.-S. (2011). Systems and Computational Biology: Bioinformatics and Computational
Modeling. BoD – Books on Demand.
[28] Villeneuve, D. L., Crump, D., Garcia-Reyero, N., Hecker, M., Hutchinson, T. H., LaLone, C. A.,
… Whelan, M. (2014). Adverse outcome pathway (AOP) development I: strategies and
principles. Toxicological Sciences: An Official Journal of the Society of Toxicology, 142(2), 312–
320.
[29] Klipp, E. (2007). Modelling dynamic processes in yeast. Yeast, 24(11), 943–959.
[30] Ashyraliyev, M., Jaeger, J., & Blom, J. G. (2008). Parameter estimation and determinability
analysis applied to Drosophila gap gene circuits. BMC Systems Biology, 2, 83.
[31] Sorribas, A., & Cascante, M. (1994). Structure identifiability in metabolic pathways:
parameter estimation in models based on the power-law formalism. Biochemical Journal, 298
(Pt 2), 303–311.
[32] Zhou, H.-X., Rivas, G., & Minton, A. P. (2008). Macromolecular crowding and confinement:
biochemical, biophysical, and potential physiological consequences. Annual Review of
Biophysics, 37, 375–397.
[33] Zimmerman, S. B., & Minton, A. P. (1993). Macromolecular crowding: biochemical,
biophysical, and physiological consequences. Annual Review of Biophysics and Biomolecular
Structure, 22, 27–65.
[34] Hernandez, H. (2019). Collision Energy between Maxwell-Boltzmann Molecules: An
Alternative Derivation of Arrhenius Equation. ForsChem Research Reports, 4, 2019-13. doi:
10.13140/RG.2.2.21596.33926.
[35] Ellis, R. J. (2001). Macromolecular crowding: an important but neglected aspect of the
intracellular environment. Current Opinion in Structural Biology, 11(1), 114–119.
[36] Fournier, T., Gabriel, J. P., Mazza, C., Pasquier, J., Galbete, J. L., & Mermod, N. (2007).
Steady-state expression of self-regulated genes. Bioinformatics, 23(23), 3185–3192.
[37] Bloomingdale, P., Housand, C., Apgar, J. F., Millard, B. L., Mager, D. E., Burke, J. M., & Shah,
D. K. (2017). Quantitative systems toxicology. Current Opinion in Toxicology, 4, 79–87.
[38] El-Masri, H. (2013). Modeling for regulatory purposes (risk and safety assessment).
Methods in Molecular Biology, 930, 297–303.
[39] Mager, D. E., Wyska, E., & Jusko, W. J. (2003). Diversity of mechanism-based
pharmacodynamic models. Drug Metabolism and Disposition: The Biological Fate of Chemicals,
31(5), 510–518.
[40] Sung, J. H., Srinivasan, B., Esch, M. B., McLamb, W. T., Bernabini, C., Shuler, M. L., &
Hickman, J. J. (2014). Using physiologically-based pharmacokinetic-guided “body-on-a-chip”
systems to predict mammalian response to drug and chemical exposure. Experimental Biology
and Medicine, Vol. 239, pp. 1225–1239. https://doi.org/10.1177/1535370214529397
[41] Meibohm, B., & Derendorf, H. (1997). Basic concepts of
pharmacokinetic/pharmacodynamic (PK/PD) modelling. International Journal of Clinical
Pharmacology and Therapeutics, 35(10), 401–413.
[42] Blaauboer, B. J. (2010). Biokinetic modeling and in vitro-in vivo extrapolations. Journal of
Toxicology and Environmental Health. Part B, Critical Reviews, 13(2-4), 242–252.
[43] Bhattacharya, S., Zhang, Q., Carmichael, P. L., Boekelheide, K., & Andersen, M. E. (2011).
Toxicity Testing in the 21st Century: Defining New Risk Assessment Approaches Based on
Perturbation of Intracellular Toxicity Pathways. PLoS ONE, Vol. 6, p. e20887.
https://doi.org/10.1371/journal.pone.0020887
[44] Louisse, J., de Jong, E., van de Sandt, J. J. M., Blaauboer, B. J., Woutersen, R. A., Piersma,
A. H., … Verwei, M. (2010). The use of in vitro toxicity data and physiologically based kinetic
modeling to predict dose-response curves for in vivo developmental toxicity of glycol ethers in
rat and man. Toxicological Sciences: An Official Journal of the Society of Toxicology, 118(2),
470–484.
[45] Daston, G. P., Chapin, R. E., Scialli, A. R., Piersma, A. H., Carney, E. W., Rogers, J. M., &
Friedman, J. M. (2010). A different approach to validating screening assays for developmental
toxicity. Birth Defects Research. Part B, Developmental and Reproductive Toxicology, 89(6),
526–530.
[46] Zhang, M., van Ravenzwaay, B., Fabian, E., Rietjens, I. M. C. M., & Louisse, J. (2018).
Towards a generic physiologically based kinetic model to predict in vivo uterotrophic responses
in rats by reverse dosimetry of in vitro estrogenicity data. Archives of Toxicology, 92(3),
1075–1088.
[47] Rotroff, D. M., Wetmore, B. A., Dix, D. J., Ferguson, S. S., Clewell, H. J., Houck, K. A., …
Thomas, R. S. (2010). Incorporating human dosimetry and exposure into high-throughput in
vitro toxicity screening. Toxicological Sciences: An Official Journal of the Society of Toxicology,
117(2), 348–358.
[48] Wetmore, B. A., & Thomas, R. S. (2013). Incorporating Human Dosimetry and Exposure
Information with High-Throughput Screening Data in Chemical Toxicity Assessment. High-
Throughput Screening Methods in Toxicity Testing, pp. 77–95.
https://doi.org/10.1002/9781118538203.ch3
[49] Dorne, J. L. C. M. (2010). Metabolism, variability and risk assessment. Toxicology, 268(3),
156–164. https://doi.org/10.1016/j.tox.2009.11.004
[50] Falk-Filipsson, A., Hanberg, A., Victorin, K., Warholm, M., & Wallén, M. (2007). Assessment
factors--applications in health risk assessment of chemicals. Environmental Research, 104(1),
108–127.
[51] Martin, O. V., Martin, S., & Kortenkamp, A. (2013). Dispelling urban myths about default
uncertainty factors in chemical risk assessment--sufficient protection against mixture effects?
Environmental Health: A Global Access Science Source, 12(1), 53.
[52] Trisciuzzi, D., Alberga, D., Leonetti, F., Novellino, E., Nicolotti, O., & Mangiatordi, G. F.
(2018). Molecular Docking for Predictive Toxicology. Methods in Molecular Biology, 1800,
181–197.
[53] Brooijmans, N., & Kuntz, I. D. (2003). Molecular Recognition and Docking Algorithms.
Annual Review of Biophysics and Biomolecular Structure, 32, 335–373.
https://doi.org/10.1146/annurev.biophys.32.110601.142532
[54] Kitchen, D. B., Decornez, H., Furr, J. R., & Bajorath, J. (2004). Docking and scoring in virtual
screening for drug discovery: methods and applications. Nature Reviews. Drug Discovery, 3(11),
935–949.
[55] Fischer, E. (1894). Einfluss der Configuration auf die Wirkung der Enzyme. Berichte Der
Deutschen Chemischen Gesellschaft, 27, 2985–2993.
https://doi.org/10.1002/cber.18940270364
[56] Koshland, D. E. (1958). Application of a Theory of Enzyme Specificity to Protein Synthesis.
Proceedings of the National Academy of Sciences, 44, 98–104.
https://doi.org/10.1073/pnas.44.2.98
[57] Reisfeld, B., & Mayeno, A. N. (2013). Computational Toxicology. Humana Press.
[58] Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction
to methodology and encoding rules. Journal of Chemical Information and Modeling, 28,
31–36. https://doi.org/10.1021/ci00057a005
[59] Heller, S. R., McNaught, A., Pletnev, I., Stein, S., & Tchekhovskoi, D. (2015). InChI, the
IUPAC International Chemical Identifier. Journal of Cheminformatics, 7, 23.
[60] https://www.knime.com/book/chemical-identifier-resolver-for-knime-trusted-extension.
Last access: 30/01/2020.
[61] https://cactus.nci.nih.gov/chemical/structure. Last access: 30/01/2020.
[62] http://www.chemspider.com/StructureSearch.aspx. Last access: 30/01/2020.
[63] Ji, C., Svensson, F., Zoufir, A., & Bender, A. (2018). eMolTox: prediction of molecular
toxicity with confidence. Bioinformatics, 34(14), 2508–2509.
[64] Balasubramanian, V., Ho, S.-S., & Vovk, V. (2014). Conformal Prediction for Reliable
Machine Learning: Theory, Adaptations and Applications. Newnes.
[65] Vedani, A., Dobler, M., & Smieško, M. (2012). VirtualToxLab — A platform for estimating
the toxic potential of drugs, chemicals and natural products. Toxicology and Applied
Pharmacology, 261, 142–153. https://doi.org/10.1016/j.taap.2012.03.018
[66] Campbell, J. L., Jr, Clewell, R. A., Gentry, P. R., Andersen, M. E., & Clewell, H. J., 3rd. (2012).
Physiologically based pharmacokinetic/toxicokinetic modeling. Methods in Molecular Biology,
929, 439–499.
[67] Poulin, P. (2015). A Paradigm Shift in Pharmacokinetic–Pharmacodynamic (PKPD)
Modeling: Rule of Thumb for Estimating Free Drug Level in Tissue Compared with Plasma to
Guide Drug Design. Journal of Pharmaceutical Sciences, 104, 2359–2368.
https://doi.org/10.1002/jps.24468
[68] Levitt, D. G. (2017). Computer Assisted Human Pharmacokinetics: Non-compartmental,
Deconvolution, Physiologically Based, Intestinal Absorption, Non-Linear. University of
Minnesota.
[69] Levitt, D. G. (2009). PKQuest_Java: free, interactive physiologically based pharmacokinetic
software package and tutorial. BMC Research Notes, 2, 158.
[70] Pearce, R. G., Setzer, R. W., Strope, C. L., Wambaugh, J. F., & Sipes, N. S. (2017). httk: R
Package for High-Throughput Toxicokinetics. Journal of Statistical Software, 79(4), 1–26.
[71] https://cran.r-project.org/package=httk. Last access: 30/01/2020.
[72] He, Y., Liew, C. Y., Sharma, N., Woo, S. K., Chau, Y. T., & Yap, C. W. (2013). PaDEL-
DDPredictor: open-source software for PD-PK-T prediction. Journal of Computational
Chemistry, 34(7), 604–610.
[73] Marotta, F., & Tiboni, G. M. (2010). Molecular aspects of azoles-induced teratogenesis.
Expert Opinion on Drug Metabolism & Toxicology, 6(4), 461–482.
[74] Lee, L. M. Y., Leung, C.-Y., Tang, W. W. C., Choi, H.-L., Leung, Y.-C., McCaffery, P. J., …
Shum, A. S. W. (2012). A paradoxical teratogenic mechanism for retinoic acid. Proceedings of
the National Academy of Sciences of the United States of America, 109(34), 13668–13673.
[75] http://sitem.herts.ac.uk/aeru/ppdb/en/Reports/198.htm. Last access: 30/01/2020.
[76] Machera, K. (1995). Developmental toxicity of cyproconazole, an inhibitor of fungal
ergosterol biosynthesis, in the rat. Bulletin of Environmental Contamination and Toxicology,
54(3), 363–369.
[77] Flint, O. P., & Boyle, F. T. (1986). Structure-teratogenicity relationships among the mono-
and bistriazole antifungal agents, using an in vitro test for teratogenic hazard. Food and
Chemical Toxicology, 24, 649. https://doi.org/10.1016/0278-6915(86)90149-3
[78] Martin, M. T., Knudsen, T. B., Reif, D. M., Houck, K. A., Judson, R. S., Kavlock, R. J., & Dix, D.
J. (2011). Predictive model of rat reproductive toxicity from ToxCast high throughput screening.
Biology of Reproduction, 85(2), 327–339.
[79] https://www.rcsb.org/structure/4GQS. Last access: 30/01/2020.
[80] http://www.biograf.ch/data/projects/virtualtoxlab_results.php. Last access: 30/01/2020.
[81] Greene, N., & Naven, R. (2009). Early toxicity screening strategies. Current Opinion in Drug
Discovery & Development, 12(1), 90–97.
[82] Hernandez, H. (2020). Formulation and Testing of Scientific Hypotheses in the presence of
Uncertainty. ForsChem Research Reports, 5, 2020-01. doi: 10.13140/RG.2.2.36317.97767.
