Structural Optimization of Molecular Clusters with Density Functional Theory Combined with Basin Hopping

optimization of molecular clusters with density functional theory combined with basin hopping. The Nottingham ePrints service makes this work by researchers of the University of Nottingham available open access under the following conditions. A note on versions: The version presented here may differ from the published version or from the version of record. If you wish to cite this item you are advised to consult the publisher's version. Please see the repository url above for details on accessing the published version and note that access may require a subscription. Identifying the energy minima of molecular clusters is a challenging problem. Traditionally , search algorithms such as simulated annealing, genetic algorithms or basin hopping have been used in conjunction with empirical force fields. We have implemented a basin hopping search algorithm combined with density functional theory to enable the optimization of molecular clusters without the need for empirical force fields. This approach can be applied to systems where empirical potentials are not available or may not be suciently accurate. We illustrate the e↵ectiveness of the method with studies on water, methanol and water + methanol clusters as well as protonated water and methanol clusters at the B3LYP+D/6-31+G* level of theory. A new lowest energy structure for H + (H 2 O) 7 is predicted at the B3LYP+D/6-31+G* level. In all of the protonated mixed water and methanol clusters, We find that H + prefers to combine with methanol rather than water in the lowest-energy structures.

Identifying the energy minima of molecular clusters is a challenging problem. Traditionally, search algorithms such as simulated annealing, genetic algorithms or basin hopping have been used in conjunction with empirical force fields. We have implemented a basin hopping search algorithm combined with density functional theory to enable the optimization of molecular clusters without the need for empirical force fields. This approach can be applied to systems where empirical potentials are not available or may not be su ciently accurate. We illustrate the e↵ectiveness of the method with studies on water, methanol and water + methanol clusters as well as protonated water and methanol clusters at the B3LYP+D/6-31+G* level of theory.
A new lowest energy structure for H + (H 2 O) 7 is predicted at the B3LYP+D/6-31+G* level. In all of the protonated mixed water and methanol clusters, We find that H + prefers to combine with methanol rather than water in the lowest-energy structures. a) nick.besley@nottingham.ac.uk

I. INTRODUCTION
Finding the global minimum on a complex potential energy surface (PES) is a fundamental problem in many disciplines, including physics, chemistry and biology. For example, finding the crystalline ground state structure of a solid or identifying the native state structure of a protein require searching for the lowest energy structure. Characterizing these structures via experiment can be di cult and simulation often provides the most e cient route to determine such structures. The potential energy landscapes of many chemical and biological systems are very complicated with many minima, with similar energies, separated by high-energy barriers. For such systems, an exhaustive search of configuration space is not viable and manual search procedures not reliable. This creates the demand for e cient searching algorithms, and in the last few decades, several algorithms have emerged for global minimization. Genetic algorithms (GAs) [1][2][3][4] are based on evolution theory, and have become successful tools in optimization problems in many fields. GAs belong to the class of stochastic optimization methods, which also comprise many other techniques. Amongst these, one of the most versatile techniques is Monte Carlo (MC) simulated annealing. 5 This method starts by exploring the configurational space at high temperature where the free energy landscape is relatively simple and gradually approaches the free energy global minimum by decreasing the temperature to 0 K, where the global minimum of the PES and the free energy coincide.
Despite its popularity, simulated annealing has several drawbacks. Finding the correct global minimum is only guaranteed for infinitely slow cooling, and this necessitates extensive sampling of conformational space. Furthermore, if the global minimum of free energy changes at low temperatures where dynamical relaxation is slow, the system can get stuck in the structure corresponding to the high temperature free energy global minimum, which corresponds to a local minimum on the PES. Hypersurface deformation methods represent another class of global optimization techniques, in which a transformation to smooth the PES and reduce the number of local minima is performed. 6,7 The global minimum of the deformed PES is mapped back to the original surface under the assumption that it corresponds to the global minimum of the original PES. However, di culties in locating global minima of small Lennard-Jones clusters using this method were reported. 8,9 Other techniques which attempt to reduce the e↵ects of barriers on the PES have also been proposed. [10][11][12] However, in general these techniques are computationally demanding and have not yet been tested for large clusters. In the search for an e cient global optimization algorithm for Lennard-Jones clusters, Wales and Doye proposed the basin hopping (BH) algorithm 13 , which shares the same principle as the MC minimization algorithm of Li and Scheraga 14 . This method has proved to be an e↵ective stochastic global search algorithm. Recently, other variants of the BH method have also been proposed, [15][16][17] which further strengthen the technique.
In addition to a suitable search algorithm, an accurate force field is critical to characterize the global minimum energy structure successfully. The extensive sampling required by all search algorithms means that calculations predominantly use a combination of empirical force field and mixing rules, and this approach has proved successful in a wide range of applications. 18,19 However, there are some problems associated with the reliance on empirical force fields. For example, a suitable force field may not be available for the system of interest, necessitating the development of an appropriate force field before any calculations to determine the global minimum can commence. Furthermore, the transfer of a force field from the framework in which it was parameterized to the system being studied is an approximation, and strictly force field parameters should be computed for all interactions in the system. This has been achieved based on quantum mechanical calculations. [20][21][22] However, the development of such force fields is demanding and their availability is limited.
Another approach is to refine structures determined using empirical force fields by posteriori geometry optimizations using quantum chemical methods. 23 In recent years, the advent of density functional theory (DFT) and the growth in available computational resources has reduced dramatically the time required for accurate quantum chemical calculations. These developments have made it possible to interface search algorithms directly to quantum chemical methods, namely DFT, without reference to any empirical potential. This will be substantially more computationally expensive, but it is possible to conceive of many systems where such an approach is an attractive alternative, for example where an empirical potential is not available, or the complex electronic structure and changes in electronic structure cannot be captured by a simple potential. Some authors have already begun to pursue such approaches, predominantly using DFT in con-junction with GAs with great promise. [23][24][25][26][27][28] . The Gradient Embedded Genetic Algorithm (GEGA) technique 25 was successfully used for identifying the lowest-energy structures of small lithium clusters Li 0/+1/ 1 n (n = 5 7). The method was extended to search for the global minima on the potential energy surface of mixed clusters formed by a radical H atom solvated by one to four water molecules. 26 Recently, Johnston and co-workers have used a genetic algorithm coupled with DFT calculations to perform global optimisations for all compositions of 8-atom Au-Ag bimetallic clusters. 28 In this paper, we describe our implementation that combines the basin hopping search algorithm with DFT, and show that it provides an e↵ective approach for global optimization. DFT is well suited to a basin hopping approach, since e cient geometry optimization routines to find local minima are well established, 29 and in some cases the BH method has outperformed GAs. 30 The structure of atomic and molecular clusters has been studied extensively with computational approaches, and we utilize our method to search for the global minima of (H 2 O) n (where 4  n  11), (CH 3 OH) m (where 4  m  7) and (H 2 O) n (CH 3 OH) m (where 4  n + m  7). In addition, we also explore the protonated (H 2 O) n clusters (where 3  n  9) and (H 2 O) n (CH 3 OH) m clusters (where n = 1 and 2  m  5 ). Water is the most important liquid, and there has been many investigations of water cluster structures using di↵erent optimization techniques and potential functions. 31 The study of water clusters can aid understanding of the formation of ice, clouds, solution chemistry, and many biochemical processes. 32 Methanol is a simple organic molecule that has both hydrogen bonding and hydrophobic interactions, and small methanol clusters have been studied recently. 33 Methanol is miscible with water at all proportions by forming multiple hydrogen bonds, and the nature of how these multiple hydrogen bond networks are formed has been a subject of interest. 34-39 Both water and methanol clusters have been well characterized using either empirical or semi-empirical force fields or quantum chemistry combined with a manual search procedures, which provide good test systems for the method used here. Subsequently, some stable structures of the binary mixture of water and methanol, and some protonated structures are reported for the first time.

II. COMPUTATIONAL METHODS
We have implemented the BH method within the Q-Chem software package, 40 to provide a powerful tool for global energy optimization. In principle, BH can be used in conjunction with any quantum chemical method for which energy and gradients are available. However, in practice DFT provides a good balance between computational cost and accuracy.
Dispersion forces have a significant e↵ect on the structure of water clusters. 43 For the results presented here, DFT augmented with an empirical dispersion correction (DFT-D) is used. More specifically, the B3LYP exchange-correlation functional 44,45 with the +D dispersion correction of Grimme 46 in combination with the 6-31+G* basis set is used. The BH algorithm is essentially a combination of the Metropolis sampling technique 47 and a gradient-based local search method. This has the e↵ect of sampling the energy basins, where an energy basin is a certain part of the configuration space around a minimum on the PES that contains all the configurations that will relax into this minimum using downhill relaxations, instead of sampling the configuration space. The PES is e↵ectively mapped onto a set of interpenetrating staircases with plateaux corresponding to the energy basins. Figure   1 shows a schematic view of the staircase topography that results from this transformation.
To enhance the e ciency of the method, BH with occasional jumping 15 is used, which incorporates a jumping move in addition to the standard MC moves. Jumping is a MC move without local minimization at infinite temperature and, consequently, is always accepted.
When the usual MC moves are rejected a number of times, the system is judged to be trapped at the local minimum. The temperature is raised to T = 1, and the MC jumping moves are executed several times to allow the system to escape from the local minimum ( Figure 1). If the number of the jumping moves is too large, the technique constitutes a random multi-start strategy. Therefore, it is important to keep a partial memory of the previous state by employing a small number of jumping moves. This provides an e cient way to escape from a local minimum and to explore the next basin of the valley when it is separated by high barriers.
The transformed energy landscape is explored using a canonical MC simulation at a constant temperature of 300 K. Initially, the molecules are placed randomly in a cubic box with a density equal 1.5 times lower than that of the liquid phase under ambient conditions. The precise details of the container have little e↵ect on the results, and it is only required to prevent the molecules from evaporating o↵. At each MC step, all molecules are translated and rotated twice on average. The nature of the transformed surface allows for relatively large step sizes, and the maximum displacement in the translation move is set to 2.5Å and the maximum rotation set to ⇡/2. The atoms are not allowed to get closer than 1.5Å to avoid overlapping which can result in the self-consistent field (SCF) calculation failing to converge. Thus, any move that brings at least one atom pair within 1.5Å is discarded, and when a molecule walks out of the box, it will be placed back in the opposite side of the box.
If the MC moves are rejected ten times, a jumping procedure consisting of seven random walks in the configurational space will take place.
For each cluster, ten separate runs, each consisting of 1000 MC steps starting from di↵erent randomly generated configurations of molecules, are performed. To speed up the calculations, the integration grid is reduced from the default SG1 grid to a 25 point Euler-Maclaurin radial grid and 50 point Lebdev angular grid, and the SCF energy convergence criteria is loosened to 10 5 a.u., the root mean squared gradient to fall below 2 ⇥ 10 3 a.u. and the convergence criterion for displacement is set to 2 ⇥ 10 3 a.u. This provides a good balance between the quality of the evaluated energy and e ciency. Once the global minimum is established the structure is refined with optimization with default optimization parameters.

A. Water clusters
The stable structures of (H 2 O) n clusters (where 4  n  11) predicted using B3LYP+D/6-31+G* are shown in Figure 2, and the corresponding binding energies are given in Table   1. Additional binding energies computed with the larger 6-311+G** basis set evaluated using the same structures are also shown in Table 1. An additional set of calculations without the +D dispersion correction has also been performed in order to explore the e↵ect of dispersion on the predicted structures. With the exception of (H 2 O) 6 , the dispersion correction does not alter the overall shape of the lowest energy structures, although there are small di↵erences in the precise intermolecular distances.
Cyclic rings are found for (H 2 O) 4 and (H 2 O) 5 , which agree with previous global optimization of small water clusters using the di↵usion-equation method combined with an empirical potential. 42 In fact, these cyclic rings are widely accepted as the lowest-energy structures in the literature. 43 There is more ambiguity regarding the structure of (H 2 O) 6 . Without the dispersion correction, a book-like structure is preferred, while with the inclusion of dispersion a trigonal prism configuration is found. The trigonal pyramid structure agrees with previous studies using MP2, and DFT coupled with dispersion-enhanced functionals (M05-2X, M06-2X, M06-L) 43 . In fact, the water hexamer is rather an interesting case, as controversies also arise from the experimental determination of its stable structure. For n = 7, an edge-capped trigonal prism (cubelike structure with a corner missing), is the most stable. This structure has ten hydrogen bonds, which is the greatest number of hydrogen bonds that (H 2 O) 7 can possibly form. Other structures which have eight and or nine hydrogen bonds are also found. However, these structures all have higher energy than the edge-capped trigonal prism one. The most stable geometry of (H 2 O) 8 that is found is a cubic shape with D 2d symmetry and 12 hydrogen bonds. A slightly higher energy structure with S 4 symmetry is also observed, which is the structure that was found using the TIP4P force field. 31b However, the cubic D 2d structure is in agreement with experiment. 50 For (H 2 O) 9 , the lowest energy structure has one water molecule hydrogen bonded to the cubic octamer. In order to form this structure, a hydrogen bond within the cube is broken and the two water molecules at the edges of the cube form an additional two hydrogen bonds with the ninth water molecule. This structure has 13 hydrogen bonds and is consistent with the prediction from the experimental studies of Buck et al. 50b For n = 10, a fused pentameric structure has the lowest energy. This structure has two water molecules hydrogen bonded to the corner water molecule of a cube. This structure agrees with all theoretical studies in the literature, 43 but experiments 50b suggest a butterfly structure with 14 hydrogen bonds. However, the butterfly structure has been confirmed to be energetically less stable than the fused pentameric structure in a theoretical study by Maheshwary et al. 31d The most stable structure for (H 2 O) 11 contains 17 hydrogen bonds. For this cluster size, there are no available experimental data and theoretical studies have predicted di↵erent low energy structures. The structure found by Bulusu et al. 51 is consistent with ours, but alternative structures have been suggested by Maheshwary 31d and Wales and Hodges using the TIP4P force fields. 31b However, separate calculations confirm that these structures are energetically higher than the structure shown at the B3LYP+D/6-31+G ⇤ level of theory.
The binding energies for the water clusters are given in Table 1. There is a small variation between the results computed with the two di↵erent basis sets considered, with the binding energies for the smaller basis set being larger. However, if the structures were optimized with the larger basis set, the magnitude of the binding energy for this basis set may increase.
The O-H bonds that participate in the hydrogen bonds are longer than those which are free.
This detail of the structure will not be captured by rigid empirical force fields, revealing an advantage of using methods such as DFT. The binding energies computed here are greater than the previously reported values obtained using Hartree-Fock theory. 31d The di↵erence between the DFT values reported here and the Hartree-Fock values increases with the size of the cluster and is most likely associated with the dispersion interaction, which is included in our calculations but will be totally absent at the Hartree-Fock level.

B. Methanol clusters
Hydrogen bonding is also important in methanol clusters, and the most stable structures maximise the number of hydrogen bonds whilst minimizing steric e↵ects. The most stable structures found by the BH search are shown in Figure 3. For the (CH 3 OH) 4 cluster, a cyclic arrangement with S 4 symmetry with the methyl groups alternating between above and below the plane of the ring is the most stable. Other conformers including a cyclic tetramer with C i symmetry and a cyclic trimer + 1 molecule are also found. However, these isomers are higher in energy than the S 4 symmetry one. A ring structure is also predicted for the pentamer. Since there are an odd number of methanol molecules having the methyl groups, alternating above and below the ring is not possible, and this results in the ring being less planar and a methyl group almost in the plane of the ring. Other local minima with higher energies including a cyclic trimer + 2 and a cyclic tetramer + 1 are also observed. For the larger (CH 3 OH) 6 and (CH 3 OH) 7 clusters, the most stable structures are still cyclic but they deviate significantly from planarity. It is expected that for larger clusters (m 6), there will be attempts to minimize the volume of the systems. The flat ring structures lead to an excluded volume, and it is more favorable for these clusters to form a folded ring structure and fill these excluded volume with a dielectric. 33d It is of interest to explore whether the dispersion energy plays an important role in de- For these structures, the cyclic ring is more planar than the corresponding structures when dispersion in included. For (CH 3 OH) 6 and (CH 3 OH) 7 the tendency for methyl groups to orientate towards the center of the ring is also reduced. The binding energies of the methanol clusters are given in Table 1. The size of the binding energies for the methanol clusters is greater than for the corresponding water cluster, presumably because of the additional dispersion interaction arising from the methyl groups.

C. Water + Methanol clusters
One of the attractive features of performing the calculations entirely using DFT is that  and (CH 3 OH) 4 , the lowest energy structure is the symmetric cyclic S 4 cluster, and it would be expected that the mixed clusters adopt a similar structure. Figure 5 shows that the predicted global minima, do indeed adopt a cyclic arrangement. When there is more than one methanol molecule present, they arrange themselves to be adjacent to each other. This is likely to be a result of the favorable dispersion interaction from this arrangement. For the (H 2 O) 2 + (CH 3 OH) 2 combination, there are two di↵erent low-lying isomers: one where the two methanol molecules are arranged adjacent to one another with the two methyl groups oriented above and below the plane of the ring. In the other isomer, the two methyl groups lie across the ring and are oriented in the same direction with reference to the ring plane.
The first of these isomers has the lower energy, and this agrees with a recent study by Mandal et al. 39 The lowest energy structures of the pentamer mixtures (n + m = 5) are also shown in Figure 5. For all of the combinations studied, the cyclic ring geometry is to be the most energetically favorable structure with the methanol molecules adjacent to each other.
Other structures with higher energies including cyclic trimer + 2 and cyclic tetramer + 1 are also observed.
The lowest energy structures for the hexamer mixtures are shown in Figure 6.

D. Protonated systems
There have been many previous studies of the global minimum structures of protonated water clusters using empirical force fields. 31, 52,53 In this section, we present results for the global minima of H + (H 2 O) n (where n  9) and H + (H 2 O) 1 (CH 3 OH) m (where m  5) clusters from B3LYP+D/6-31+G* in conjunction with BH. As before, ten independent BH runs with di↵erent starting structures, consisting of 1000 MC steps each, were performed. However, the lowest-energy minimum was located in under 1000 steps in all cases. The lowest energy structures for the protonated water clusters are shown in Figure 8 with the binding energies given in Table 3. For n  4, the chain-like structure, including a branched chain for n = 4 are preferred. For n 5, the structures are dominated by four-and five-membered rings, and only for n = 5 is a water molecule is found outside a closed ring. All of the global minima of the protonated water clusters found in this study are qualitatively the same as those reported by Hodges and Wales using an empirical potential 31c , apart from the global minimum of n = 7 cluster. For this cluster, we predict a di↵erent lowest energy structure, which has a binding energy that is 0.515 kcal/mol lower than that of the structure found by Hodges and Wales at the B3LYP+D/6-31+G* level.
The protonated mixed water + methanol clusters raise the interesting question of whether the H + is associated with a water or methanol molecule. We have studied the H + (H 2 O) 1 (CH 3 OH) m clusters where m  5, and the lowest energy structures are shown in Figure 9 with the binding energies given in Table 3. For all of the clusters studied, H + preferentially associated with CH 3 OH rather than H 2 O. For m = 2, a chain-like structure is preferred, while for m = 3 and m = 4 ring-like structures are observed. The chain-like structure is also observed for cluster with m = 3 and m = 4. However, these structures have higher energy than the ring-like structure. For m = 5, a book-like structure similar to the water hexamer cluster is preferred which maximizes the number of hydrogen bonds that can be formed by this mixture cluster. Other low-lying minima including the chain-like structure and pentamer ring (H 2 O stays in the ring) + 1 are also found for this cluster.

IV. CONCLUSION
The basin hopping method has been implemented within the Q-Chem software package to allow the lowest energy structures of complex system to be characterized purely using quantum chemical methods without the need for empirical potential functions. While the computational expense of this approach is considerably greater than using empirical energy functions, it may be useful for systems where empirical potentials are not available or where the complex electronic structure or changes in electronic structure make the system unsuitable to be described by a simple potential energy function. This method has been used to study water, methanol, water + methanol, protonated water, and protonated water + methanol clusters at the B3LYP+D/6-31+G* level of theory. Overall, the predicted structures of the water clusters using our approach agree with those reported in the literature.
Although these systems are largely governed by electrostatic interactions and hydrogen bonding, the inclusion of dispersion does have a significant e↵ect. For the water hexamer cluster, DFT without dispersion correction predicts a book-like structure to be the global minimum, while with the inclusion of dispersion a preference for a trigonal prism configuration is found. Dispersion plays a greater role in the structure of the methanol clusters.
For example, without the dispersion correction, a cyclic ring structure rather than folded structures are observed for (CH 3 OH) 5 and (CH 3 OH) 6 . Furthermore, dispersion also has a large e↵ect on the position of the methyl groups in the methanol clusters.
For water and methanol mixed clusters, a cyclic ring geometry is the most energetically favorable structure for all combinations of the tetramer and pentamer clusters. This is consistent with the corresponding pure H 2 O and CH 3 OH clusters which also have a cyclic ring structure. For the larger mixed clusters, the lowest energy structures resemble the relevant structures for the pure component clusters. Since this approach does not require potentials, it is straightforward to study protonated clusters. For protonated water, a di↵erent lowest energy structure is predicted for H + (H 2 O) 7 , which has a binding energy of 0.515 kcal/mol lower than that of the structure found by Hodges and Wales 31c at the B3LYP+D/6-31+G* level. For the protonated mixed water and methanol clusters, we find that the H + preferentially combines with methanol rather than water in the lowest-energy structures.
The combination of the basin hopping approach with DFT provides a useful tool for identifying minima on complex PESs, which can be readily applied to a wide variety of systems. As the availability of computational resources continues to increase, the range and scope of systems that can be studied using this approach will continue to grow.

V. ACKNOWLEDGMENTS
We