
P.G. Mezey, Computational Aspects of Combinatorial Quantum Chemistry, Journal of Computational Methods in Sciences and Engineering (JCMSE), 1, 99-106 (2001).

Computational Aspects of Combinatorial Quantum Chemistry

Paul G. Mezey a,b,c


a Mathematical Chemistry Research Unit, Department of Chemistry and Department of Mathematics and Statistics, University of Saskatchewan, 110 Science Place, Saskatoon, SK, Canada, S7N 5C9, e-mail: mezey@sask.usask.ca

b Institute for Advanced Study, Collegium Budapest, Szentháromság u. 2, 1014 Budapest, Hungary

c CODATA Secretariat, CODATA (ICSU/UNESCO), Committee for Data in Science and Technology, 51 Bd de Montmorency, 75016 Paris, France

Key words: Combinatorial Quantum Chemistry (CQC), Holographic Electron Density Theorem, Additive Fuzzy Density Fragmentation Principle, Adjustable Density Matrix Assembler (ADMA), The ADMA-CQC Approach, Quantitative Shape Activity Relations (QShAR)

Abstract

The Additive Fuzzy Density Fragmentation (AFDF) principle and the Adjustable Density Matrix Assembler (ADMA) method are proposed for a combinatorial construction of electron density representations of a series of macromolecules, related to one another by combinatorial reassignments of constituent fragments. Some of the fundamental computational aspects of the ADMA-based Combinatorial Quantum Chemistry (ADMA-CQC) approach are discussed, with special emphasis on the constraints provided by the recently proven holographic properties of molecular electron densities.

1. Introduction

Electron density fragmentation methods of computational quantum chemistry, especially the additive fuzzy density fragmentation (AFDF) methods [1], provide the building blocks for a combinatorial construction of large numbers of molecular electron density representations. The Adjustable Density Matrix Assembler (ADMA) method [2-5] serves as a tool for Combinatorial Quantum Chemistry, primarily aimed at macromolecules. The ADMA method generates macromolecular density matrices and is thus suitable for the computation of a large variety of molecular properties in addition to electron density. Its analytical representation of electron densities, in terms of a macromolecular density matrix derived from fuzzy fragment density matrices and an atomic orbital basis set, is advantageous when compared to numerical representations of electron densities on a three-dimensional grid.

An earlier numerical method involving grid representations of fuzzy density fragments, the Molecular Electron Density Loge (or Lego) Assembler method (MEDLA) [6-9], is also suitable for macromolecular electron density computations. However, it is less suitable for combinatorial quantum chemistry applications than the ADMA method, since the numerical grids of individual electron density fragments would require a cumbersome re-alignment resulting in loss of accuracy; furthermore, the numerical density representation is not ideally suited for the computation of macromolecular properties other than electron density. The more advanced ADMA method actually builds macromolecular density matrices, hence it is applicable to the computation of several other macromolecular properties besides electron density, for example, forces acting on individual nuclei of the macromolecule.

The combinatorial approach is equally applicable to representations of fuzzy electron density fragments and to the associated fuzzy fragment density matrices. The ADMA-CQC approach extends the rapid computation of electron densities and analogous molecular properties to a large number of macromolecules related to one another by a combinatorial re-selection of some or all of their molecular fragments from a fuzzy fragment density matrix databank. In this contribution some of the choices concerning the combinatorial construction of macromolecular density matrices are discussed.

2. What Is Combinatorial Quantum Chemistry?

Combinatorial chemistry, in its first form applicable to the combinatorial synthesis of large numbers of molecules, was invented in 1982 by Arpad Furka [10-13]. The use of combinatorial approaches to the production of very large numbers of molecules from specific building blocks has revolutionized synthetic chemistry. The original portioning-mixing method for the synthesis of combinatorial libraries by Furka, and subsequent variations on the basic combinatorial chemistry principle, were applied primarily to peptides, using individual amino acids as building blocks. However, these methodologies have also been adapted to other types of molecules and to a much broader family of potential building blocks linked up according to combinatorial patterns. These advances have transformed the approaches used by the pharmaceutical industry for the production and selection of new molecules for tests and for the optimization of the pharmacological effectiveness of potential drug molecules.

It is an interesting coincidence that within computational quantum chemistry a similar advance has occurred recently. Starting with the introduction of the MEDLA method (Molecular Electron Density Lego [or Loge] Approach) [6-9], fuzzy electron density building blocks have been used within a numerical approach (based on a three-dimensional numerical grid) to construct electron densities for large molecules, including proteins. By appropriate choice of the fuzzy electron density fragments, these building blocks can be combined in a variety of ways, using combinatorial principles, leading to the construction of a large number of molecular electron densities. If the density construction is carried out using parallel processors, relying on a common fuzzy electron density fragment database, then the combinatorial construction of the electron densities of a large number of molecules can be achieved simultaneously. Indeed, the approach is analogous to the synthetic combinatorial chemistry approach: it is an "in silico" version of the original idea of Furka, where there is no need for either "portioning" or "mixing".

However, the MEDLA numerical grid technique was limited to electron density computations in numerical representations, where the variations in local grid alignments when combining two or more fuzzy electron density fragments had disadvantageous effects on the accuracy of the approach. Furthermore, the approach was sensitive to the resolution of the numerical grid, which was important not only in visual displays but in the very construction process of the numerical electron densities.

By contrast, in the more advanced ADMA (Adjustable Density Matrix Assembler) method [1-5], the combination of fuzzy electron density fragments is accomplished by first building a macromolecular density matrix from the fuzzy fragment density matrices. That is, the combinatorial step is carried out on the fuzzy fragment density matrices. This has several advantages. The electron density representation is analytical, relying on the macromolecular density matrix and the associated basis set information; consequently, neither grid alignment problems nor grid resolution problems occur. Furthermore, once the macromolecular density matrix is available, many molecular properties beyond electron density can be calculated. For example, approximate forces acting on individual nuclei of the macromolecule can be computed, which is suitable for the study of folding problems in proteins and other macromolecular conformational problems.

Relying on a fragment density matrix databank, the ADMA approach can be used for the combinatorial selection and subsequent assembly of fuzzy fragment density matrices, forming macromolecular density matrices for a large number of macromolecules related to one another by a combinatorial reassignment of local molecular fragments. Since the resulting macromolecular density matrices can be used both for electron density computations and for the computation of all other properties determined by density matrices and basis set information, the ADMA method provides a versatile basis for Combinatorial Quantum Chemistry (CQC). The corresponding approach, the ADMA-CQC approach, is expected to enhance the computational toolbox of fundamental structural biochemistry and biotechnology research.
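The combinatorial reassignment of fragments described above can be illustrated with a minimal sketch. It only enumerates which fragment is placed at which position; it does not build any density matrix. The fragment labels and the three-position layout are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical databank index: for each position in a chain, the labels of
# the fragments that may be placed there (illustrative names only).
site_options = [
    ["Gly", "Ala"],         # position 1
    ["Ser", "Thr", "Cys"],  # position 2
    ["Leu", "Ile"],         # position 3
]

def enumerate_assemblies(options):
    """Yield every assignment that picks one fragment per position."""
    for choice in product(*options):
        yield choice

assemblies = list(enumerate_assemblies(site_options))
count = len(assemblies)  # product of the option counts: 2 * 3 * 2
```

Each tuple in `assemblies` names one macromolecule of the series. Since the assemblies are independent of one another, they could in principle be processed in parallel against a shared fragment databank, as the text suggests.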

3. Molecular Electron Density Fragmentation Principles

For the generation of fuzzy electron density fragments the following approach is one of the simplest; it is the approach used in the first applications of both the MEDLA and the ADMA methods. Consider a molecule M and its electron density $\rho(\mathbf{r})$, expressed in terms of a density matrix $\mathbf{P}$ of elements $P_{ij}$ with reference to a set of $n$ atomic orbital basis functions $\varphi_i(\mathbf{r})$:

$$\rho(\mathbf{r}) = \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij}\, \varphi_i(\mathbf{r})\, \varphi_j(\mathbf{r}). \tag{1}$$

Furthermore, assume that the set of all nuclei of M is partitioned into $m$ families, denoted by

$$f_1, f_2, \ldots, f_k, \ldots, f_m, \tag{2}$$

where each nuclear family is associated with a fuzzy electron density fragment,

$$F_1, F_2, \ldots, F_k, \ldots, F_m, \tag{3}$$

respectively. The corresponding actual fragment electron density functions are denoted by

$$\rho^1(\mathbf{r}), \rho^2(\mathbf{r}), \ldots, \rho^k(\mathbf{r}), \ldots, \rho^m(\mathbf{r}), \tag{4}$$

respectively.

Their actual definition and the approach used for their computation within the standard Hartree-Fock ab initio framework are described below.

The notation $m_k(i)$ is used for the membership function of atomic orbital (AO) $\varphi_i(\mathbf{r})$ in the set of AOs centered on a nucleus of the nuclear set $f_k$ of fragment $F_k$. That is, this membership function is defined as follows:

$$m_k(i) = \begin{cases} 1 & \text{if AO } \varphi_i(\mathbf{r}) \text{ is centered on one of the nuclei of set } f_k, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$

Based on the membership functions $m_k(i)$, a fuzzy fragment density matrix is defined for each of the fragment density functions $\rho^k(\mathbf{r})$. In particular, the elements $P^k_{ij}$ of the $n \times n$ fragment density matrix $\mathbf{P}^k$ for the $k$-th fragment $F_k$ are defined as

$$P^k_{ij} = 0.5\, [m_k(i) + m_k(j)]\, P_{ij}, \tag{6}$$

where $P_{ij}$ are the elements of the density matrix $\mathbf{P}$ of the molecule M.

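This definition is equivalent to the one given by the condition

$$P^k_{ij} = \begin{cases} P_{ij} & \text{if both } \varphi_i(\mathbf{r}) \text{ and } \varphi_j(\mathbf{r}) \text{ are AOs centered on nuclei of the fragment,} \\ 0.5\, P_{ij} & \text{if only one of } \varphi_i(\mathbf{r}) \text{ and } \varphi_j(\mathbf{r}) \text{ is centered on a nucleus of the fragment,} \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$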

One should note that the fuzzy fragment density matrix $\mathbf{P}^k$ has the same $n \times n$ dimensions as the density matrix $\mathbf{P}$ of the complete molecule M. Using the fragment density matrix $\mathbf{P}^k$ for the $k$-th fragment, the associated fuzzy electron density $\rho^k(\mathbf{r})$ is defined as

$$\rho^k(\mathbf{r}) = \sum_{i=1}^{n} \sum_{j=1}^{n} P^k_{ij}\, \varphi_i(\mathbf{r})\, \varphi_j(\mathbf{r}). \tag{8}$$

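As a consequence of the definition of the fragment density matrix and the partitioning of the nuclei of the molecule, the sum of all such fragment density matrices $\mathbf{P}^k$ is equal to the density matrix $\mathbf{P}$ of the complete molecule. Because the families partition the nuclei, each AO belongs to exactly one family, so $\sum_{k=1}^{m} m_k(i) = 1$ for every $i$, and

$$P_{ij} = \sum_{k=1}^{m} P^k_{ij} \tag{9}$$

holds for the matrix elements; consequently,

$$\mathbf{P} = \sum_{k=1}^{m} \mathbf{P}^k. \tag{10}$$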
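The construction of Eqs. (5), (6) and (10) can be checked numerically on a small example. The following is a minimal sketch in plain Python; the 4 x 4 matrix, the assignment of AOs to nuclei, and the choice of nuclear families are made-up toy values, not data from any actual calculation.

```python
def membership(ao_center, family):
    """Eq. (5): m_k(i) = 1 if AO i sits on a nucleus of the family, else 0."""
    return [1 if c in family else 0 for c in ao_center]

def fragment_matrix(P, m):
    """Eq. (6): P^k_ij = 0.5 * [m_k(i) + m_k(j)] * P_ij."""
    n = len(P)
    return [[0.5 * (m[i] + m[j]) * P[i][j] for j in range(n)] for i in range(n)]

# Toy data (assumed): 4 AOs on 3 nuclei; AO i is centered on nucleus ao_center[i].
ao_center = [0, 0, 1, 2]
P = [
    [1.0, 0.2, 0.3, 0.1],
    [0.2, 0.8, 0.4, 0.0],
    [0.3, 0.4, 1.2, 0.5],
    [0.1, 0.0, 0.5, 0.9],
]
families = [{0}, {1, 2}]  # a partition of the nuclei {0, 1, 2}

fragments = [fragment_matrix(P, membership(ao_center, f)) for f in families]

# Eq. (10): the element-wise sum of the fragment matrices recovers P.
n = len(P)
P_sum = [[sum(Pk[i][j] for Pk in fragments) for j in range(n)] for i in range(n)]
max_dev = max(abs(P_sum[i][j] - P[i][j]) for i in range(n) for j in range(n))
```

Elements coupling an AO inside a family with one outside it are split evenly between the two fragments concerned, which is what makes the fragments "fuzzy" while keeping the sum exact.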

Since both the complete electron density and the fuzzy fragment electron densities are linear in the corresponding density matrices $\mathbf{P}$ and $\mathbf{P}^k$, the sum of all fragment densities $\rho^k(\mathbf{r})$ is equal to the density $\rho(\mathbf{r})$ of the complete molecule:

$$\rho(\mathbf{r}) = \sum_{k=1}^{m} \rho^k(\mathbf{r}). \tag{11}$$
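The density additivity of Eq. (11) follows from linearity and can be verified point by point. The sketch below assumes a toy one-dimensional model with two Gaussian basis functions, one per nucleus; the exponents, positions, and matrix elements are illustrative assumptions only.

```python
import math

def gaussian(center, alpha):
    """Assumed toy basis function: a 1D Gaussian centered at `center`."""
    return lambda r: math.exp(-alpha * (r - center) ** 2)

def density(P, basis, r):
    """Eqs. (1) and (8): sum over i, j of P_ij * phi_i(r) * phi_j(r)."""
    phi = [f(r) for f in basis]
    n = len(phi)
    return sum(P[i][j] * phi[i] * phi[j] for i in range(n) for j in range(n))

# Two AOs, one on each of two nuclei placed at 0.0 and 1.4 (illustrative values).
basis = [gaussian(0.0, 1.0), gaussian(1.4, 1.0)]
P = [[0.6, 0.5], [0.5, 0.6]]

# Fragment matrices obtained from Eq. (6) with one nucleus per family.
P1 = [[0.6, 0.25], [0.25, 0.0]]
P2 = [[0.0, 0.25], [0.25, 0.6]]

points = [-1.0, 0.0, 0.7, 1.4, 3.0]
residuals = [
    density(P, basis, r) - density(P1, basis, r) - density(P2, basis, r)
    for r in points
]

# Fragment 1 is fuzzy: its density does not vanish at the nucleus of fragment 2.
rho1_at_other_nucleus = density(P1, basis, 1.4)
```

The residuals vanish to rounding error at every sample point, while the nonzero value of `rho1_at_other_nucleus` shows that a fragment density has no sharp boundary.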

In this fragmentation scheme both the fragment density matrices $\mathbf{P}^k$ and the fragment densities $\rho^k(\mathbf{r})$ are additive; that is, the above relations describe an additive, fuzzy electron density fragmentation scheme. These fuzzy fragment density matrix additivity rules, as well as the electron density fragment additivity rules, are exact at any given ab initio LCAO level.

If the nuclear geometry and local surroundings of a given nuclear family in one molecule are the same as those of a stoichiometrically identical family in another molecule, then, assuming the use of the same set of basis functions (using the same local coordinate system), the corresponding two fuzzy fragment density functions are very similar. By taking larger and larger "coordination shells" of identical local surroundings, the differences between the two fragment densities can be made smaller than any pre-set positive threshold. This fact can be utilized for building density matrices $\mathbf{P}$ for macromolecules M, and hence also their electron densities $\rho(\mathbf{r})$, by generating a series of fuzzy fragment density matrices $\mathbf{P}^k$ from smaller parent molecules $M_k$ with fragment nuclear arrangements identical to those in the macromolecule M, and also having local "coordination shells" the same as those of the fragments within the macromolecule M. The corresponding fuzzy fragment density matrices $\mathbf{P}^k$ can be assembled into a macromolecular density matrix $\mathbf{P}$ using a scheme described in earlier works [3,5].

If a sufficiently detailed fragment density matrix databank, with a large enough variety of fragment nuclear geometries and coordination shells, is available, then a combinatorial construction of the ADMA macromolecular density matrices of several macromolecules becomes possible. In the following sections the limitations and some practical, computational aspects of the ADMA-CQC approach are discussed.
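The assembly scheme itself is given in the cited works [3,5] and is not reproduced here. As a rough illustration of the bookkeeping such an assembly involves, the sketch below assumes that each fragment matrix from a parent molecule comes with an index map from parent AO indices to macromolecular AO indices, and that parent AOs without a counterpart in the macromolecule are simply skipped. Both the index map and the skipping rule are assumptions of this sketch, as are all numerical values.

```python
def assemble(n_macro, fragments):
    """Accumulate fragment matrices into one n_macro x n_macro matrix.

    fragments: list of (Pk, index_map) pairs, where index_map[a] is the
    macromolecular AO index of parent AO a, or None if that AO has no
    counterpart in the macromolecule (an assumption of this sketch).
    """
    P = [[0.0] * n_macro for _ in range(n_macro)]
    for Pk, index_map in fragments:
        for a, i in enumerate(index_map):
            if i is None:
                continue
            for b, j in enumerate(index_map):
                if j is None:
                    continue
                P[i][j] += Pk[a][b]
    return P

# Toy input: a 3-AO "macromolecule" built from two parent-derived fragments.
Pk1 = [[1.0, 0.2, 0.05],
       [0.2, 0.4, 0.0],
       [0.05, 0.0, 0.0]]   # third parent AO has no counterpart
Pk2 = [[0.4, 0.3],
       [0.3, 1.0]]
P_macro = assemble(3, [(Pk1, [0, 1, None]), (Pk2, [1, 2])])
```

Contributions from different fragments to the same macromolecular element add up, mirroring the additivity of Eq. (10).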

4. The Holographic Electron Density Theorem and Its Combinatorial Consequences

The celebrated Hohenberg-Kohn theorem [14] states that, at least in principle, the three-dimensional molecular electron density cloud contains the complete information about the molecule. How does the complete information compare with the information carried by various local regions of the molecule?

If molecules were finite, closed systems, bounded objects in space, then an early result of Münch and Riess could be applied, where a result analogous to the Hohenberg-Kohn theorem would apply to parts of such artificial molecules [15]. However, molecules are neither finite nor closed; they are not bounded objects in space. Nevertheless, as established by a more recent result for boundaryless, real molecules, a theorem stronger than the Hohenberg-Kohn theorem holds: any small nonzero volume piece of the (boundaryless, nondegenerate ground state) electron density of each molecule M already contains the complete information about the entire molecule [16-18].

This theorem implies that unless an electron density fragment is obtained from the same molecule, it cannot represent exactly the electron density of the fragment within the new molecule. This is a fundamental limitation of the combinatorial quantum chemistry approach. However, excellent approximations are still possible, and the error of the fragment density (manifested in the error of the fuzzy fragment density matrix) can be made smaller than any positive threshold by a suitable enlargement of the coordination shell. In practice, this limitation is not the primary factor affecting the accuracy of the combinatorial quantum chemistry approach. The overall level of the ab initio computation used for the parent molecules and for the generation of the fuzzy fragment density matrix databank is at present a more important limitation.

5. Computational Aspects of the Combinatorial Fragment Assembly Approach

Depending on the level of accuracy required for the representation of the quantum chemical properties of the macromolecules M studied with the ADMA-CQC method, three main approaches can be used in the generation (and possible readjustment) of fuzzy fragment density matrices.

(i) No fragment readjustment. If there is no need for high accuracy, and if a sufficiently detailed fuzzy fragment density matrix databank is available with several variants for density matrices in terms of local nuclear geometries and coordination shells, then by taking the optimum choice for each fragment density matrix from the databank, a reasonable approximation of the series of macromolecular density matrices can be obtained by combinatorial construction.

(ii) Fuzzy fragment deformation by nuclear rearrangement. If somewhat higher accuracy is required, then the optimum fragment density matrix taken from the databank can be readjusted by carrying out a density deformation (accompanied by a change of the density matrix) using the Dimension Expansion - Reduction (DER) method, the Weighted Affine Transformation (WAT) method, or the Löwdin Transform method, the last of which has the advantage of ensuring idempotency for the new density matrix. All three of these methods generate approximate fragment density matrices by changing the local nuclear arrangement into the desired new arrangement, exactly reproducing the arrangement within the given macromolecule M. Details of the DER, WAT, and Löwdin Transform methods can be found in references [1,5].

(iii) Fuzzy fragment recalculation using a new, custom-made parent molecule. This approach, the most accurate but also the most laborious of the three, involves a complete recalculation of the fragment density matrix for the actual nuclear geometry and coordination shell, to match those in the actual macromolecule M. Although this approach is time-consuming, it has the advantage that the resulting new variant of the fuzzy fragment density matrix can be added to the already existing databank, making the databank more useful on the next occasion when option (i) or (ii) is used.
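The interplay of options (i) to (iii) with the databank can be sketched as a small lookup routine. Everything here is hypothetical: the databank layout, the use of a single number as a stand-in for the local nuclear geometry, and the `deform` and `recompute` callables, which merely stand in for a DER, WAT, or Löwdin-type readjustment and for a new parent-molecule calculation.

```python
def closest_variant(variants, geometry):
    """Pick the stored (geometry, matrix) pair nearest to the requested geometry."""
    return min(variants, key=lambda v: abs(v[0] - geometry))

def get_fragment(databank, frag_type, geometry, option, deform=None, recompute=None):
    """Return a fragment density matrix following option 'i', 'ii', or 'iii'."""
    variants = databank.setdefault(frag_type, [])
    if option == "iii":
        # Recalculate from a custom parent molecule and store the new variant.
        matrix = recompute(frag_type, geometry)
        variants.append((geometry, matrix))
        return matrix
    stored_geometry, matrix = closest_variant(variants, geometry)
    if option == "i":
        return matrix  # use the best stored variant unchanged
    if option == "ii":
        return deform(matrix, stored_geometry, geometry)  # readjust to target
    raise ValueError("option must be 'i', 'ii', or 'iii'")

# Toy databank: one fragment type with two stored variants (placeholder matrices).
databank = {"CH2": [(60.0, [[1.0]]), (180.0, [[2.0]])]}
picked = get_fragment(databank, "CH2", 170.0, "i")
fresh = get_fragment(databank, "CH2", 170.0, "iii",
                     recompute=lambda t, g: [[3.0]])
reused = get_fragment(databank, "CH2", 170.0, "i")
```

After the option (iii) call, the databank holds a third variant, so the later option (i) lookup for the same geometry returns the recalculated matrix without further cost.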

With the appropriate level of fragment density matrix selection and readjustment, the ADMA-CQC method is a technique that is applicable for the generation of ab initio quality quantum chemical representations of a large number of macromolecules.

Acknowledgement

The operating and strategic research grant support of the Natural Sciences and Engineering Research Council (NSERC) of Canada, the support of CODATA Task Group on Data Quality and Database Compatibility, and the hospitality of the Institute for Advanced Study, Collegium Budapest, are gratefully acknowledged.

References

1. P.G. Mezey, Functional Groups in Quantum Chemistry, Advances in Quantum Chemistry, 27, 163-222 (1996).

2. P.G. Mezey, Shape Analysis of Macromolecular Electron Densities, Structural Chem., 6, 261-270 (1995).

3. P.G. Mezey, Macromolecular Density Matrices and Electron Densities with Adjustable Nuclear Geometries, J. Math. Chem., 18, 141-168 (1995).

4. P.G. Mezey, Local Shape Analysis of Macromolecular Electron Densities, in "Computational Chemistry: Reviews and Current Trends, Vol. 1", Ed. J. Leszczynski, World Scientific Publ., Singapore, 1996, pp 109-137.

5. P.G. Mezey, Similarity Measures and Löwdin's Transform for Approximate Density Matrices and Macromolecular Forces, Int. J. Quantum Chem., 63, 39-48 (1997).

6. P.D. Walker and P.G. Mezey, Molecular Electron Density Lego Approach to Molecule Building, J. Amer. Chem. Soc., 115, 12423-12430 (1993).

7. P.D. Walker and P.G. Mezey, Ab initio Quality Electron Densities for Proteins: A MEDLA Approach, J. Amer. Chem. Soc., 116, 12022-12032 (1994).

8. P.D. Walker and P.G. Mezey, Realistic, Detailed Images of Proteins and Tertiary Structure Elements: Ab Initio Quality Electron Density Calculations for Bovine Insulin, Can. J. Chem., 72, 2531-2536 (1994).

9. P.D. Walker and P.G. Mezey, A New Computational Microscope for Molecules: High Resolution MEDLA Images of Taxol and HIV-1 Protease, Using Additive Electron Density Fragmentation Principles and Fuzzy Set Methods, J. Math. Chem., 17, 203-234 (1995).

10. A. Furka, Notarized Notes, 1982, see http://www.win.net/kunagota

11. A. Furka, F. Sebestyen, M. Asgedom, and G. Dibo, in "Highlights of Modern Biochemistry, Proc. 14th Internat. Congr. Biochem.", VS Publ., Utrecht, The Netherlands, 1988, Vol. 5, pp 47.

12. A. Furka, F. Sebestyen, M. Asgedom, and G. Dibo, Abstract P-168, in Abstr. 10th Internat. Symp. Medicinal Chem., Budapest, 1988, pp 288.

13. A. Furka, F. Sebestyen, M. Asgedom, and G. Dibo, Int. J. Peptide Protein Res., 37, 487 (1991).

14. P. Hohenberg and W. Kohn, Phys. Rev., 136, B864 (1964).

15. J. Riess and W. Münch, Theor. Chim. Acta, 58, 295 (1981).

16. P.G. Mezey, Mol. Phys., 96, 169 (1999).

17. P.G. Mezey, J. Math. Chem., 23, 65 (1998).

18. P.G. Mezey, J. Chem. Inf. Comp. Sci., 39, 224 (1999).
