Introduction

Because of the way advances in NDDO developments occurred, in terms of the modifications of the approximations and the extensions to specific elements or groups of elements, there has been an inevitable lack of consistency. The aim of the current work was three-fold: to investigate the incorporation of some of the reported modifications to the core-core approximations into the NDDO methodology; to carry out a systematic global parameter optimization of all the main group elements, with emphasis on compounds of interest in biochemistry; and to extend the methodology by performing a restricted optimization of parameters for the transition metals. This resulted in the development of a new method, consisting of the final set of approximations used and the optimized parameters. This method will be referred to as parametric method number 6, or PM6. The name PM6 was chosen to avoid any confusion with two other unpublished methods, PM4 and PM5.

Theory

Despite the apparent complexity of semiempirical methods, there are only three possible sources of error: reference data may be inaccurate or inadequate, the set of approximations may include unrealistic assumptions or be too inflexible, and the parameter optimization process may be incomplete. In order for a method to be accurate, all three potential sources of error must be carefully examined, and, where faults are found, appropriate corrective action taken.

Reference data

For molecular geometries, gas phase reference data are preferred, but in many instances such data were unavailable, and recourse was made to condensed-phase data. Provided that care was taken to exclude those species whose geometries were likely to be significantly distorted by crystal forces, or which carried a large formal charge, condensed-phase data of the type found in the CSD were regarded as being suitable as reference data.
Because earlier methods used only a limited number of reference data, most of the cases where the method gave bad results were not discovered until after the method was published. In an attempt to minimize the occurrence of such unpleasant surprises, the set of reference data used was made as large as practical. To this end, where there was a dearth or even a complete absence of experimental reference data, recourse was made to high-level calculations. Thus, for the Group VIII elements, there are relatively few stable compounds, and the main phenomena of interest involve rare gas atoms colliding with other atoms or molecules, so reference data representing such collisions were generated from the results of ab-initio calculations. Additionally, there is an almost complete lack of thermochemical data for many types of complexes involving transition metals, so augmenting the few available data with the results of ab-initio calculations was essential.

Use of Ab-Initio results

$$ S = \sum_j \left( \Delta H_j(\text{Ref.}) - 627.51 \left( E_{\text{Tot}} + \sum_i C_i n_i \right) \right)_j^2 \qquad (1) $$

$$ S = \sum_j \left( \Delta H_j(\text{Ref.}) - 627.51 \left( E_{\text{Tot}} + \sum_i C_i n_i + C_x n_x \right) \right)_j^2 \qquad (2) $$

Training set reference data

Use of rules in parameter optimization

$$ \text{H} = -57.8 $$

$$ \text{H} = 10 \left( -5 + \text{H}_{\text{H2O}} + \text{H}_{\text{H2O}} \right) $$

Rules are very useful in defining the parameter hypersurface.
Examples of such tailoring are as follows:

Correcting qualitatively incorrect predictions

Rare gas atoms at sub-equilibrium distances

Use of rules to restrain parameter values

In general, uncharged atoms that are separated by a distance sufficiently large that all overlaps between orbitals on the two atoms are vanishingly small will not interact significantly, and what interaction energy exists would arise from van der Waals (VDW) terms: of their nature, these are mildly stabilizing. Although statements of this type are obviously true, when they are expressed as rules and added to the training set of reference data they can help define the parameter values. For a pair of atoms, A and B, a simple diatomic system would be constructed in which the interatomic separation was the minimum distance at which any overlaps of the atomic orbitals would still be insignificant. The electronic state of such a system would then be the sum of the states of the two isolated atoms. Thus, if both A and B were silicon, then, since the ground state of an isolated silicon atom is a triplet, the combined state would be a quintet. Because the two atoms do not interact significantly, a rule could then be constructed that said “The energy of the diatomic system is equal to the sum of the energies of the two individual systems.” By giving this rule a large weight, any tendency of the method to generate a spurious attraction or repulsion between the atoms would be prevented.

Atomic energy levels

In keeping with the philosophy that a large amount of reference data should be used in the parameter optimization, spin-free atomic energy levels were used for most elements. The exceptions were carbon, nitrogen, and oxygen, where there were enough conventional reference data that the addition of atomic energy levels would not significantly improve the definition of the parameter surface. NDDO approximations do not allow for spin-orbit coupling.
Therefore, spin-free levels were needed. For a few elements, there were insufficient spin states to allow the spin-free energy levels to be calculated. For all the remaining elements, spin-free energy levels were calculated.

$$ E = \frac{1}{(2S + 1)(2L + 1)} \sum_{J = |L - S|}^{L + S} (2J + 1)\, E(S, L, J) \qquad (3) $$

Where there were relatively few other reference data, the singly-ionized, and, in rare cases, the doubly-ionized, spin-free states were also evaluated and used as reference data. Each energy level contributed one reference datum to the training set. Most atoms have a large number of atomic energy levels, so in order to minimize the probability that a level might be incorrectly assigned, each level was labeled with three quantum numbers: the total spin momentum, the total angular momentum, and a principal quantum number distinguishing states that share the first two. These were compared with the corresponding values calculated from the state functions. Since each set of three quantum numbers is unique, the potential for misassignment was minimized. In rare cases, particularly during the early stages of parameter optimization, two states with the same total spin and angular quantum numbers would be interchanged, with the result that the calculated principal quantum number would also be interchanged. All such cases involved the ground state, and were quickly identified and corrected.

Approximations

Most of the approximations used in PM6 are identical to those in AM1 and PM3.
The differences are:

Core-core interactions

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + e^{-\alpha_A R_{AB}} + e^{-\alpha_B R_{AB}} \right) \qquad (4) $$

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + x_{AB} e^{-\alpha_{AB} R_{AB}} \right) \qquad (5) $$

When PM3 parameters for elements of Group IA were being optimized, the MNDO approximation to the core-core expression was found to be unsuitable. In these elements there is only one valence electron so the core charge is the same as that of hydrogen.
A consequence of this was that the apparent size of these elements was also approximately that of a hydrogen atom, in marked contrast with observation. For these elements, diatomic core-core parameters were also found to be essential.

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + x_{AB} e^{-\alpha_{AB} \left( R_{AB} + 0.0003 R_{AB}^{6} \right)} \right) \qquad (6) $$

Unpolarizable core

$$ f_{AB} = c \left( \frac{Z_A^{1/3} + Z_B^{1/3}}{R_{AB}} \right)^{12} \qquad (7) $$

Individual core-core corrections

For a small number of diatomic interactions, the general expression for the core-core interaction was modified in order to correct a specific fault. Because it is desirable to keep the methodology as simple as possible, modifications of the approximations were made only after determining that the existing approximations were inadequate.
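
As an illustration of the general expressions above, the scaling factor of Eq. 6 and the unpolarizable-core term of Eq. 7 can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function and argument names are hypothetical, the two-centre integral ⟨s_A s_A|s_B s_B⟩ is passed in by the caller as `gamma_ss` rather than computed, and no parameter values from the PM6 tables are implied.

```python
import math


def core_core_scaling(r_ab, x_ab, alpha_ab):
    """Bracketed factor of Eq. 6: 1 + x_AB * exp(-alpha_AB * (R_AB + 0.0003 * R_AB**6))."""
    return 1.0 + x_ab * math.exp(-alpha_ab * (r_ab + 0.0003 * r_ab ** 6))


def core_core_energy(z_a, z_b, gamma_ss, r_ab, x_ab, alpha_ab):
    """E_n(A,B) of Eq. 6.

    gamma_ss is a stand-in for the integral <s_A s_A|s_B s_B>; it is an
    input here (assumption), so the result has whatever units gamma_ss has.
    """
    return z_a * z_b * gamma_ss * core_core_scaling(r_ab, x_ab, alpha_ab)


def unpolarizable_core_term(z_a, z_b, r_ab, c):
    """f_AB of Eq. 7: c * ((Z_A**(1/3) + Z_B**(1/3)) / R_AB)**12.

    The constant c is left as an argument; its value is not assumed here.
    """
    return c * ((z_a ** (1.0 / 3.0) + z_b ** (1.0 / 3.0)) / r_ab) ** 12


if __name__ == "__main__":
    # Illustrative (made-up) parameter values, chosen only to show the trend.
    for r in (1.0, 2.0, 4.0):
        print(r, core_core_scaling(r, x_ab=2.0, alpha_ab=1.5))
```

The R⁶ term in the exponent makes the correction die off much faster at large separations than the plain exponential of Eq. 5, while leaving it almost unchanged at bonding distances; the scaling factor therefore tends to 1 as R_AB grows.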
The diatomic specific modifications were:

O–H and N–H

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + R_{AB} e^{-\alpha_A R_{AB}} + R_{AB} e^{-\alpha_B R_{AB}} \right) \qquad (8) $$

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + x_{AB} e^{-\alpha_{AB} R_{AB}^{2}} \right) \qquad (9) $$

C–C

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + x_{AB} e^{-\alpha_{AB} \left( R_{AB} + 0.0003 R_{AB}^{6} \right)} + 9.28 e^{-5.98 R_{AB}} \right) \qquad (10) $$

Si–O

$$ E_n(A,B) = Z_A Z_B \langle s_A s_A | s_B s_B \rangle \left( 1 + x_{AB} e^{-\alpha_{AB} \left( R_{AB} + 0.0003 R_{AB}^{6} \right)} - 0.0007 e^{-\left( R_{AB} - 2.9 \right)^{2}} \right) \qquad (11) $$

$$ \Delta H'_f = \Delta H_f - 0.5 e^{-10\phi} \qquad (12) $$

More elements

$$ \varphi = \frac{(2\xi)^{n + 1/2}}{\left( (2n)! \right)^{1/2}} r^{n - 1} e^{-\xi r} Y_l^m(\theta, \phi) $$

Table 1 Principal quantum numbers for atomic orbitals

| Element | s | p | d | Element | s | p | d |
|---|---|---|---|---|---|---|---|
| H | 1 | | | Kr | 5 | 4 | |
| He | 1 | 2 | | Rb | 5 | 5 | |
| Li | 2 | 2 | | Sr | 5 | 5 | |
| Be | 2 | 2 | | Y | 5 | 5 | 4 |
| B | 2 | 2 | | Zr | 5 | 5 | 4 |
| C | 2 | 2 | | Nb | 5 | 5 | 4 |
| N | 2 | 2 | | Mo | 5 | 5 | 4 |
| O | 2 | 2 | | Tc | 5 | 5 | 4 |
| F | 2 | 2 | | Ru | 5 | 5 | 4 |
| Ne | 3 | 2 | | Rh | 5 | 5 | 4 |
| Na | 3 | 3 | | Pd | 5 | 5 | 4 |
| Mg | 3 | 3 | | Ag | 5 | 5 | 4 |
| Al | 3 | 3 | 3 | Cd | 5 | 5 | |
| Si | 3 | 3 | 3 | In | 5 | 5 | |
| P | 3 | 3 | 3 | Sn | 5 | 5 | |
| S | 3 | 3 | 3 | Sb | 5 | 5 | 5 |
| Cl | 3 | 3 | 3 | Te | 5 | 5 | 5 |
| Ar | 4 | 3 | | I | 5 | 5 | 5 |
| K | 4 | 4 | | Xe | 6 | 5 | |
| Ca | 4 | 4 | | Cs | 6 | 6 | |
| Sc | 4 | 4 | 3 | Ba | 6 | 6 | |
| Ti | 4 | 4 | 3 | La | 6 | 6 | 5 |
| V | 4 | 4 | 3 | Lu | 6 | 6 | 5 |
| Cr | 4 | 4 | 3 | Hf | 6 | 6 | 5 |
| Mn | 4 | 4 | 3 | Ta | 6 | 6 | 5 |
| Fe | 4 | 4 | 3 | W | 6 | 6 | 5 |
| Co | 4 | 4 | 3 | Re | 6 | 6 | 5 |
| Ni | 4 | 4 | 3 | Os | 6 | 6 | 5 |
| Cu | 4 | 4 | 3 | Ir | 6 | 6 | 5 |
| Zn | 4 | 4 | | Pt | 6 | 6 | 5 |
| Ga | 4 | 4 | | Au | 6 | 6 | 5 |
| Ge | 4 | 4 | | Hg | 6 | 6 | |
| As | 4 | 4 | 4 | Tl | 6 | 6 | |
| Se | 4 | 4 | 4 | Pb | 6 | 6 | |
| Br | 4 | 4 | 4 | Bi | 6 | 6 | |

Parameter optimization

Background

$$ S = \sum_i \left( g_i \left( Q_{calc}(i) - Q_{ref}(i) \right) \right)^{2} \qquad (13) $$

Table 2 Default weighting factors for reference data

| Reference data | Weight |
|---|---|
| ΔH_f | … |
| Bond length | … |
| Angle | … |
| Dipole | … |
| I.P. | … |

| Elements | Multiplier |
|---|---|
| Core | 1.0 |
| Organic | 0.9 |
| Main group | 0.8 |
| Transition metals | 0.7 |

Sequence of optimization of parameters

Because the elements H, C, N, and O are of paramount importance in biochemistry, and because large amounts of reference data are available, the starting point for parameter optimization involved the simultaneous optimization of parameters for these four elements. For the purposes of discussion, this set of four elements will be called the “core elements”. Once stable parameters had been obtained, parameters for other elements important in organic chemistry were optimized in two stages. First, the parameters for the core elements were held constant, and parameters for the elements F, P, S, Cl, Br, and I were optimized one at a time. Then all parameters for all ten elements were simultaneously optimized. This set (the organic elements) was then used as the starting point for parameterizing the rest of the main group.

The same sequence was followed for the rest of the main-group elements. That is, parameters for each element were optimized while freezing the parameters for the organic elements. Then, once all the elements had been processed, all parameters for all of the 39 main-group elements, plus zinc, cadmium, and mercury, were optimized simultaneously.

When parameters for the transition metals were being optimized, all parameters for the main group elements were held constant. There were several reasons for this.
Most importantly, the reference data for the transition metals, particularly the thermochemical data, were of lower quality, so one consideration was to prevent the transition metals from having a deleterious effect on the main-group elements. Another important consideration was that most compounds involving transition metals also involved only elements of the organic set. Since parameters for these elements had been optimized using a training set consisting of all the main-group elements, the values of the optimized parameters would likely be relatively insensitive to the influence of the small number of additional reference data involving transition metals.

In general, all parameters for a given element were optimized simultaneously; this was both efficient and convenient. In some optimizations, specifically those involving a new element, only subsets of parameters were used. Three main subsets were used:

Parameters that determine atomic electronic properties

Parameters that determine molecular electronic properties

Parameters that determine geometries

As soon as an initial optimized set of electronic parameters was available, the diatomic and other core-core parameters could be optimized. The most efficient process was to optimize these parameters initially without allowing the electronic parameters or the molecular geometries to optimize. If geometries were allowed to optimize, optimization of the core-core parameters would be slowed considerably, because of the tight dependency of the optimized geometries on the values of the core-core parameters, and vice versa.

Results

Parameters for PM6

Table 3 PM6 parameters for 70 elements

Table 4 Diatomic core−core parameters

Accuracy

Comparison with other semiempirical methods

Table 5 Average unsigned errors in ΔH_f (kcal mol−1)
Element PM6 No. PM5 No. PM3 No. AM1 No.
Hydrogen 7.29 3039 13.89 2340 17.09 2340 21.12 2270 Helium 0.00 1 0.00 1 0.00 1 0.00 1 Lithium 7.98 83 15.31 83 18.02 83 18.84 82 Beryllium 5.92 34 29.06 34 29.58 34 18.51 34 Boron 6.44 122 10.81 120 11.84 120 – – Carbon 7.31 2828 13.03 2155 15.06 2155 19.42 2123 Nitrogen 8.22 1067 16.45 761 20.96 761 24.23 744 Oxygen 8.42 1758 16.59 1243 20.13 1244 27.68 1229 Fluorine 8.49 497 22.31 350 21.25 350 37.40 334 Neon 0.00 1 0.00 1 0.00 1 0.00 1 Sodium 5.72 40 8.57 39 9.47 39 10.77 38 Magnesium 9.84 66 12.07 66 17.94 66 18.71 66 Aluminum 7.61 75 17.49 75 19.15 75 18.99 75 Silicon 6.51 98 9.28 96 12.80 96 17.00 95 Phosphorus 8.20 110 16.01 98 17.36 98 20.06 95 Sulfur 8.81 427 15.40 330 18.44 330 26.38 323 Chlorine 8.28 670 16.69 390 18.71 390 23.06 383 Argon 0.00 1 0.00 1 0.00 1 0.00 1 Potassium 6.53 43 12.33 42 9.36 42 28.38 41 Calcium 11.87 43 28.68 43 43.44 43 63.20 43 Scandium 10.33 52 – – – – – – Titanium 10.20 85 – – – – – – Vanadium 14.29 59 – – – – – – Chromium 14.09 60 – – – – – – Manganese 12.77 44 – – – – – – Iron 18.31 76 – – – – – – Cobalt 15.51 42 – – – – – – Nickel 15.10 51 – – – – – – Copper 13.00 47 – – – – – – Zinc 5.56 54 17.84 54 32.93 54 37.06 54 Gallium 7.51 47 29.12 47 37.58 47 46.87 47 Germanium 9.83 67 12.20 67 15.86 67 19.12 67 Arsenic 6.94 49 15.22 49 16.68 49 17.34 49 Selenium 4.40 25 39.58 25 39.71 25 32.00 25 Bromine 7.37 330 17.20 199 25.04 199 28.22 199 Krypton 0.00 1 0.00 1 0.00 1 0.00 1 Rubidium 10.91 24 16.57 24 21.47 24 29.33 23 Strontium 7.72 38 52.46 38 103.16 38 57.21 38 Yttrium 13.28 51 – – – – – – Zirconium 11.18 46 – – – – – – Niobium 8.57 51 – – – – – – Molybdenum 13.41 70 – – – – 35.77 69 Technetium 15.14 50 – – – – – – Ruthenium 13.87 56 – – – – – – Rhodium 20.92 32 – – – – – – Palladium 11.65 47 – – – – – – Silver 4.67 14 – – – – – – Cadmium 3.49 38 34.66 38 61.92 38 – – Indium 7.33 54 31.53 54 29.83 54 32.16 54 Tin 7.14 77 16.83 77 17.10 77 20.21 77 Antimony 5.41 58 30.98 58 34.61 58 35.00 58 Tellurium 8.20 45 35.66 45 46.80 
45 22.91 45 Iodine 7.23 279 23.77 176 25.90 176 36.55 175 Xenon 0.00 1 0.00 1 0.00 1 0.00 1 Cesium 6.89 40 37.01 40 35.22 40 55.33 39 Barium 12.12 37 98.20 37 154.65 37 161.09 37 Lanthanum 10.37 37 – – – – – – Lutetium 7.68 24 – – – – – – Hafnium 8.52 37 – – – – – – Tantalum 14.37 36 – – – – – – Tungsten 7.38 28 – – – – – – Rhenium 10.40 57 – – – – – – Osmium 6.46 19 – – – – – – Iridium 10.21 25 – – – – – – Platinum 11.61 77 – – – – – – Gold 12.82 32 – – – – – – Mercury 5.94 51 16.39 51 17.67 51 19.75 51 Thallium 10.42 44 32.63 44 73.96 45 73.18 45 Lead 7.92 44 18.08 44 14.18 44 16.71 44 Bismuth 7.74 53 99.88 53 28.95 53 119.23 53 Table 6 Average unsigned errors in bond lengths (Å) Element PM6 No. PM5 No. PM3 No. AM1 No. Hydrogen 0.044 238 0.056 219 0.032 217 0.035 181 Helium 0.251 6 0.459 6 0.182 4 0.655 5 Lithium 0.175 111 0.191 110 0.167 110 0.171 105 Beryllium 0.076 42 0.131 42 0.067 42 0.085 42 Boron 0.027 116 0.043 116 0.066 122 – – Carbon 0.057 1191 0.066 693 0.051 634 0.063 628 Nitrogen 0.090 663 0.145 309 0.124 259 0.163 253 Oxygen 0.095 1163 0.122 625 0.103 577 0.117 571 Fluorine 0.063 396 0.096 246 0.069 251 0.101 228 Neon 0.353 5 0.182 2 0.062 1 0.030 1 Sodium 0.229 33 0.200 33 0.208 30 0.140 29 Magnesium 0.089 106 0.067 106 0.167 105 0.073 106 Aluminium 0.045 77 0.120 72 0.098 70 0.138 70 Silicon 0.039 97 0.056 94 0.074 95 0.077 90 Phosphorus 0.039 141 0.078 92 0.073 92 0.083 87 Sulfur 0.094 359 0.107 216 0.091 207 0.134 200 Chlorine 0.069 672 0.098 283 0.095 284 0.130 285 Argon 0.258 4 0.303 1 – – – – Potassium 0.139 46 0.135 47 0.148 47 0.281 46 Calcium 0.133 67 0.177 69 0.151 67 0.102 60 Scandium 0.053 90 – – – – – – Titanium 0.078 140 – – – – – – Vanadium 0.090 168 – – – – – – Chromium 0.080 89 – – – – – – Manganese 0.083 107 – – – – – – Iron 0.102 117 – – – – – – Cobalt 0.107 100 – – – – – – Nickel 0.065 133 – – – – – – Copper 0.174 130 – – – – – – Zinc 0.076 77 0.084 77 0.098 77 0.142 76 Gallium 0.048 80 0.105 81 0.192 81 0.135 81 Germanium 0.038 
131 0.045 131 0.056 133 0.068 133 Arsenic 0.073 72 0.069 70 0.080 72 0.099 72 Selenium 0.056 56 0.094 55 0.071 54 0.061 54 Bromine 0.104 358 0.106 184 0.146 182 0.136 184 Krypton 0.059 6 0.417 3 0.623 3 0.602 3 Rubidium 0.413 36 0.498 37 0.176 34 0.230 36 Strontium 0.087 56 0.199 55 0.128 32 0.242 47 Yttrium 0.132 69 – – – – – – Zirconium 0.063 65 – – – – – – Niobium 0.060 88 – – – – – – Molybdenum 0.104 89 – – – – 0.095 84 Technetium 0.078 84 – – – – – – Ruthenium 0.073 113 – – – – – – Rhodium 0.162 68 – – – – – – Palladium 0.080 120 – – – – – – Silver 0.151 41 – – – – – – Cadmium 0.159 54 0.179 55 0.121 50 – – Indium 0.039 77 0.085 77 0.155 77 0.102 77 Tin 0.073 96 0.065 96 0.078 96 0.087 94 Antimony 0.060 92 0.169 91 0.083 91 0.135 92 Tellurium 0.070 80 0.162 79 0.123 77 0.122 79 Iodine 0.144 286 0.137 147 0.146 145 0.175 141 Xenon 0.620 8 0.584 4 0.472 2 0.793 6 Cesium 0.258 40 0.335 43 0.372 25 0.358 43 Barium 0.202 51 0.228 47 0.207 48 0.261 51 Lanthanum 0.253 47 – – – – – – Lutetium 0.050 60 – – – – – – Hafnium 0.071 42 – – – – – – Tantalum 0.074 59 – – – – – – Tungsten 0.141 57 – – – – – – Rhenium 0.068 108 – – – – – – Osmium 0.072 50 – – – – – – Iridium 0.169 71 – – – – – – Platinum 0.057 140 – – – – – – Gold 0.158 84 – – – – – – Mercury 0.143 64 0.110 64 0.135 63 0.139 64 Thallium 0.202 59 0.248 55 0.208 45 0.268 43 Lead 0.140 53 0.167 53 0.121 53 0.125 51 Bismuth 0.142 81 0.616 75 0.225 82 0.682 75 Table 7 Average unsigned errors in bond angles (Degrees) Element PM6 No. in set PM5 No. in set PM3 No. in set AM1 No. 
in set Lithium 7.79 28 6.82 28 3.53 28 9.48 28 Beryllium 6.61 14 6.44 14 6.94 14 5.98 14 Boron 3.27 31 4.41 31 4.61 31 – – Carbon 2.50 134 2.79 134 2.75 131 2.25 131 Nitrogen 7.32 37 8.01 37 6.75 35 7.94 31 Oxygen 12.14 59 11.12 58 10.17 53 9.57 42 Fluorine 8.32 3 16.18 3 26.34 3 24.67 2 Sodium 21.00 4 2.87 4 3.43 4 5.32 4 Magnesium 8.44 24 7.28 24 14.23 24 7.10 24 Aluminum 4.05 20 5.26 20 7.21 19 4.33 19 Silicon 5.25 35 3.37 35 2.81 34 2.88 34 Phosphorus 3.24 35 4.40 35 6.01 35 5.07 35 Sulfur 5.23 46 5.64 45 5.42 41 5.05 41 Chlorine 3.65 5 19.47 5 10.31 5 14.80 5 Potassium 17.90 11 10.27 11 12.93 11 12.75 11 Calcium 14.99 16 11.35 16 16.81 16 18.06 15 Scandium 7.98 32 – – – – – – Titanium 7.86 39 – – – – – – Vanadium 7.46 44 – – – – – – Chromium 3.77 19 – – – – – – Manganese 6.02 26 – – – – – – Iron 11.21 30 – – – – – – Cobalt 10.68 29 – – – – – – Nickel 10.44 48 – – – – – – Copper 10.77 44 – – – – – – Zinc 10.92 27 14.41 27 8.16 27 13.34 27 Gallium 4.43 18 10.86 18 14.43 18 13.84 18 Germanium 4.58 52 5.37 52 8.95 52 5.71 52 Arsenic 6.29 36 6.52 36 6.48 36 5.03 36 Selenium 7.27 24 16.16 24 12.37 23 5.46 23 Bromine 12.64 4 20.03 4 19.21 3 3.27 3 Rubidium 9.69 11 10.20 11 21.03 11 6.68 11 Strontium 18.16 25 32.91 25 32.92 25 31.00 25 Yttrium 12.29 34 – – – – – – Zirconium 10.36 12 – – – – – – Niobium 6.54 23 – – – – – – Molybdenum 8.15 27 – – – – 8.73 27 Technetium 4.96 22 – – – – – – Ruthenium 6.93 34 – – – – – – Rhodium 10.66 22 – – – – – – Palladium 9.19 46 – – – – – – Silver 23.36 9 – – – – – – Cadmium 15.23 10 13.52 10 20.09 10 – – Indium 4.47 17 7.21 17 5.30 17 4.94 17 Tin 3.06 34 4.09 34 3.74 34 11.81 34 Antimony 6.49 41 12.24 41 6.84 41 7.40 41 Tellurium 4.85 25 7.00 25 5.33 25 7.87 25 Iodine 8.33 1 12.55 1 20.66 1 4.53 1 Cesium 15.50 12 8.52 12 19.38 12 11.75 12 Barium 28.65 10 28.43 10 37.04 10 36.17 10 Lanthanum 9.25 14 – – – – – – Lutetium 7.08 26 – – – – – – Hafnium 5.64 10 – – – – – – Tantalum 9.88 15 – – – – – – Tungsten 10.90 9 – – – – – – Rhenium 
7.39 32 – – – – – – Osmium 12.67 10 – – – – – – Iridium 7.86 18 – – – – – – Platinum 5.92 72 – – – – – – Gold 13.59 16 – – – – – – Mercury 20.20 15 20.99 15 18.47 15 21.49 15 Thallium 5.73 10 10.28 10 19.95 10 25.38 10 Lead 4.33 20 5.24 20 4.61 20 3.57 19 Bismuth 8.01 25 21.74 25 8.28 25 33.99 25 Table 8 Average unsigned errors in dipole moments (D) Element PM6 No. PM5 No. PM3 No. AM1 No. Hydrogen 0.62 266 0.80 265 0.64 222 0.50 204 Lithium 0.78 16 0.95 16 0.79 16 0.52 16 Beryllium 1.63 1 1.49 1 0.27 1 0.53 1 Boron 0.66 17 0.66 17 0.73 17 – – Carbon 0.51 219 0.62 218 0.41 176 0.42 165 Nitrogen 0.61 48 0.66 48 0.46 40 0.55 39 Oxygen 0.99 198 1.27 196 1.05 74 0.74 75 Fluorine 0.80 124 1.11 121 0.59 63 0.69 59 Sodium 1.34 6 0.80 6 1.97 6 1.26 6 Aluminium 0.33 1 1.50 1 1.76 1 0.53 1 Silicon 0.21 11 1.09 11 0.72 11 0.29 11 Phosphorus 0.83 14 0.79 14 0.37 10 0.87 10 Sulfur 0.62 28 1.01 28 0.74 21 0.70 21 Chlorine 0.99 103 1.27 100 0.77 47 0.84 43 Potassium 0.44 4 0.34 4 1.30 4 0.58 4 Calcium 0.73 4 1.12 4 1.23 4 0.33 4 Scandium 1.11 9 – – – – – – Titanium 1.02 8 – – – – – – Vanadium 0.82 8 – – – – – – Chromium 1.98 9 – – – – – – Manganese 1.06 11 – – – – – – Iron 1.61 14 – – – – – – Cobalt 1.04 6 – – – – – – Nickel 1.40 15 – – – – – – Copper 1.11 10 – – – – – – Zinc 0.21 4 0.18 4 0.16 4 0.16 4 Gallium 0.20 1 1.81 1 1.35 1 0.64 1 Germanium 0.63 23 0.63 23 0.55 23 0.59 23 Arsenic 0.37 6 0.99 6 0.35 6 0.37 6 Selenium 0.66 10 0.94 10 0.61 10 0.80 10 Bromine 0.90 88 1.34 87 1.01 37 0.50 39 Rubidium 1.84 6 2.43 6 1.65 6 0.44 6 Strontium 1.64 6 1.31 6 2.55 6 1.51 6 Yttrium 1.70 8 – – – – – – Zirconium 0.94 8 – – – – – – Niobium 0.91 10 – – – – – – Molybdenum 1.09 8 – – – – 1.48 8 Technetium 1.74 13 – – – – – – Ruthenium 1.13 12 – – – – – – Rhodium 1.09 6 – – – – – – Palladium 0.97 8 – – – – – – Silver 1.98 9 – – – – – – Cadmium 0.42 2 2.22 2 0.67 2 – – Indium 0.47 3 0.78 3 0.75 3 1.36 3 Tin 0.28 13 0.41 13 0.88 13 0.81 13 Antimony 0.55 5 0.77 5 0.48 5 0.61 5 Tellurium 0.47 2 
0.75 2 0.31 2 1.35 2
Iodine       1.03   77   1.54   77   1.48   28   1.22   30
Cesium       1.25    9   3.47    9   1.89    9   0.87    9
Barium       1.77   11   1.29   11   1.93   11   1.11   11
Lanthanum    1.23    8   –       –   –       –   –       –
Hafnium      0.63    6   –       –   –       –   –       –
Tantalum     0.97    5   –       –   –       –   –       –
Tungsten     0.92   14   –       –   –       –   –       –
Rhenium      0.76   13   –       –   –       –   –       –
Osmium       0.63    8   –       –   –       –   –       –
Iridium      0.96    8   –       –   –       –   –       –
Platinum     1.07    8   –       –   –       –   –       –
Gold         0.78   14   –       –   –       –   –       –
Mercury      0.63    9   0.77    9   0.63    9   0.67    9
Thallium     0.89    3   1.35    3   0.45    3   2.43    3
Lead         0.73    6   0.76    6   0.41    6   0.82    6
Bismuth      0.42    8   3.21    8   1.14    8   3.40    8

Table 9 Average unsigned errors in ionization potential (eV)

Element      PM6   No.   PM5   No.   PM3   No.   AM1   No.
Hydrogen     0.43  226   0.40  226   0.60  226   0.52  217
Lithium      0.89   12   0.88   12   1.29   12   0.59   12
Beryllium    0.52    7   0.29    7   0.93    7   0.45    7
Boron        0.31   11   0.34   11   1.01   11   –       –
Carbon       0.41  230   0.39  230   0.54  230   0.54  227
Nitrogen     0.55   43   0.45   43   0.53   43   0.48   42
Oxygen       0.62   72   0.56   72   0.63   72   0.69   69
Fluorine     0.64   67   0.65   67   0.74   67   0.85   65
Sodium       0.34    5   0.34    5   1.43    5   0.51    4
Magnesium    0.97    4   1.05    4   1.10    4   1.41    4
Aluminum     0.62    3   0.29    3   0.40    3   0.69    3
Silicon      0.43   11   0.81   11   0.70   11   0.68   11
Phosphorus   0.49   13   0.47   13   0.64   13   0.56   13
Sulfur       0.52   46   0.51   46   0.48   46   0.62   46
Chlorine     0.48   62   0.58   62   0.57   60   0.61   57
Potassium    0.23    4   0.50    4   0.54    4   0.34    3
Calcium      0.74    1   1.24    1   0.52    1   0.41    1
Scandium     3.73    1   –       –   –       –   –       –
Titanium     0.09    1   –       –   –       –   –       –
Zinc         0.32    5   0.35    5   0.99    5   0.49    5
Gallium      0.52    3   0.73    3   1.28    3   1.16    3
Germanium    0.70   13   0.49   13   0.93   13   1.05   13
Arsenic      0.69    5   0.31    5   0.62    5   0.79    5
Selenium     0.38   10   0.29   10   0.47   10   1.22   10
Bromine      0.28   33   0.39   33   1.20   33   0.49   32
Rubidium     0.18    3   0.39    3   0.93    3   0.22    3
Strontium    0.63    1   0.38    1   0.14    1   0.26    1
Cadmium      0.33    5   0.46    5   0.39    5   –       –
Indium       0.63    2   0.86    2   2.06    2   0.83    2
Tin          0.70   14   0.48   14   1.22   14   0.44   14
Antimony     0.44    5   0.90    5   1.16    5   0.54    5
Tellurium    0.43    3   0.20    3   0.25    3   0.70    3
Iodine       0.47   29   0.46   29   0.48   29   0.89   29
Cesium       0.58    4   0.71    4   1.37    4   1.11    4
Barium       0.08    1   0.97    1   0.08    1   0.75    1
Mercury      0.51   12   0.43   12   0.74   12   0.49   12
Thallium     0.30    3   0.46    3   0.80    3   0.53    3
Lead         0.56   13   0.47   13   0.93   13   0.65   13
Bismuth      0.98    5   1.28    5   0.72    5   1.66    5

Table 10 Average unsigned errors in ΔHf for various sets of elements (kcal mol−1)

Set of elements                   No.    PM6    RM1    PM5    PM3    AM1
H, C, N, O                       1157   4.64   4.89   5.60   5.65   9.41
H, C, N, O, F, P, S, Cl, Br, I   1774   5.05   6.57   6.75   8.05  12.57
Whole of main group              3188   6.16   –     15.27  17.76  22.34
70 elements                      4492   8.01   –      –      –      –

Table 11 Average unsigned errors in bond lengths for various sets of elements (Å)

Set of elements                   No.    PM6     RM1     PM5     PM3     AM1
H, C, N, O                        413   0.025   0.022   0.033   0.021   0.031
H, C, N, O, F, P, S, Cl, Br, I    712   0.031   0.036   0.044   0.037   0.046
Whole of main group              2636   0.085   –       0.121   0.104   0.131
70 elements                      5154   0.091   –       –       –       –

Table 12 Average unsigned errors in angles for various sets of elements (Degrees)

Set of elements                   No.    PM6    RM1    PM5    PM3    AM1
H, C, N, O                        100    3.1    3.1    3.3    2.5    2.7
H, C, N, O, F, P, S, Cl, Br, I    244    3.2    4.0    4.3    3.8    3.4
Whole of main group               900    8.0    –      8.6    8.5    8.8
70 elements                      1681    7.9    –      –      –      –

Table 13 Average unsigned errors in dipole moments for various sets of elements (D)

Set of elements                   No.    PM6    RM1    PM5    PM3    AM1
H, C, N, O                         55   0.38   0.22   0.31   0.26   0.26
H, C, N, O, F, P, S, Cl, Br, I    131   0.37   0.33   0.50   0.36   0.38
Whole of main group               313   0.60   –      0.86   0.72   0.65
70 elements                       569   0.85   –      –      –      –

Table 14 Average unsigned errors in I.P.s for various sets of elements (eV)

Set of elements                   No.    PM6    RM1    PM5    PM3    AM1
H, C, N, O                         99   0.45   0.40   0.41   0.51   0.45
H, C, N, O, F, P, S, Cl, Br, I    229   0.47   0.41   0.44   0.51   0.56
Whole of main group               383   0.50   –      0.49   0.68   0.63
70 elements                       385   0.50   –      –      –      –

Comparison with AM1*

Table 15 Average unsigned errors in phosphorus, sulfur, and chlorine

             ΔHf (kcal mol−1)     Bond length (Å)       Dipole (D)          I.P. (eV)           Angles (Degrees)
Element      PM6   AM1*   No.     PM6     AM1*    No.   PM6    AM1*   No.   PM6    AM1*   No.   PM6   AM1*   No.
Phosphorus   8.3   19.1    90     0.022   0.051    56   0.57   0.49    10   0.51   0.81    12   2.5    3.3    19
Sulfur       6.5   10.6   199     0.029   0.060    71   0.36   0.64    14   0.52   0.50    45   3.1    4.1    34
Chlorine     6.1   18.2   156     0.025   0.106    69   0.55   0.60    10   0.52   0.62    25   3.4   14.6     4

Comparison with RM1

Comparison with high-level methods

Table 16 Errors in ΔHf (kcal mol−1)

Statistic   PM6    B3LYP*   HF*
Median      3.26   3.75     5.10
AUE         4.44   5.19     7.37
RMS         6.23   7.42    10.68

No. of molecules in set: 1373
* Basis set: 6–31G*

Fig. 1

Hydrogen bonding

Table 17 Relative energies of conformers of water dimer (ΔHf, kcal mol−1)

Structure                     Ref.    PM6     PM5     PM3     AM1
s                            −5.00   −3.96   −0.24   −2.79   −2.81
s                             0.00    0.00    0.00    0.00    0.00
i +                           0.52    0.83    0.50    0.91    0.64
s                             0.57    0.66    0.25    0.93    0.46
i                             0.70    0.29    0.11    2.10   −0.94
2                             0.95    0.77    0.39    2.63   −0.51
2h                            0.99    0.59    0.21    2.71   −0.67
7 (Triply Hydrogen Bonded     1.81    0.93   −1.85    1.16   −0.95
8 (Non-planar Bifurcated      3.57    2.67   −0.83    1.71    1.26
9 (Non-planar Bifurcated      1.79    0.73   −1.95    1.15   −0.87
2v                            2.71    1.42   −1.77    1.28   −0.05

*: Relative to two isolated water molecules
+: Structures 2 – 10 are relative to Structure 1

Table 18 (kcal mol−1)

Hydrogen-bonded system                   Ref      PM6      PM5      PM3      AM1
Ammonia - ammonia                       −2.94    −2.34    −0.77    −0.67    −1.41
Water - methanol                        −4.90    −5.12    −2.59    −0.20    −4.52
Water - acetone                         −5.51    −5.25    −2.43    −2.22    −4.09
Water, dimer, linear (O–H–O = 180°)     −5.00    −3.69    −1.57    −3.49    −3.16
Water, dimer                            −5.00    −4.88    −2.43    −1.95    −5.01
Benzene dimer, T-shaped                 −2.34    −0.83    −0.22    −0.56    −0.07
Water - acetate anion                  −19.22   −18.72   −12.28   −15.77   −15.91
Water - formaldehyde                    −5.17    −4.22    −2.17    −2.73    −3.40
Water - ammonia                         −6.36    −4.32    −2.75    −1.53    −2.90
Water - formamide                       −8.88    −7.60    −4.14    −4.33    −7.54
Formic acid, dimer                     −13.90   −10.03    −4.75    −8.65    −6.44
Water - methylammonium cation          −18.76   −14.90    −8.94   −10.48   −14.36
Formamide - formamide                  −13.55   −10.83    −4.46    −6.08    −8.14
Acetic acid, dimer                     −14.89   −10.33    −4.50    −8.70    −6.44

Nitrogen pyramidalization

Table 19 Average errors in pyramidalization of nitrogen (Torsion angle about nitrogen, in degrees)
Statistic                  PM6     PM3    AM1    RM1
Average signed error      −1.7   −13.6    0.2    9.7
Average unsigned error     5.0    15.0    3.5   19.1

Transition metals

Optimizing parameters for transition metals was not as straightforward as for the main group elements. As with the main group compounds, there is a wealth of structural reference data on transition metal complexes. However, unlike main group compounds, there is a distinct shortage of reliable thermochemical data. To alleviate this shortage, the thermochemical data that were available were augmented by the results of DFT calculations. It was recognized, however, that these derived reference data were likely to be of lower accuracy than the experimental data.

Many transition metal complexes are also highly labile; a consequence of this was that some moieties that are known to exist in the solid phase were predicted to be unstable in the gas phase, at least at the PM6 level of calculation. In most cases, such moieties had a high formal charge, so their instability in isolation, without any countercharge, is understandable. When an intrinsically unstable ion was identified, it was removed from further consideration.

Sets of transition metals

For the purpose of discussion, the set of 30 transition metals can be partitioned into eight of the groups of the Periodic Table, with each group containing one or more triads of elements. A detailed discussion of each element is impractical because of the wide range of compounds in transition metal chemistry. The following section, therefore, will be limited to systems where PM6 does not work well, and to systems illustrative of the structural chemistry of specific elements.

Group IIIA: Scandium, Yttrium, Lanthanum, and Lutetium

Fig. 2

Group IVA: Titanium, Zirconium, and Hafnium

The behavior of zirconium and hafnium is similar to that of titanium.

Group VA: Vanadium, Niobium, and Tantalum

Fig. 3

Group VIA: Chromium, Molybdenum, and Tungsten

Fig. 4

Fig. 5

PM6 predicts the structures of all three hexacarbonyls with good accuracy, but gives qualitatively the wrong structures for the dinuclear decacarbonyls. This failure to predict even the qualitative structure of the polynuclear carbonyls occurred frequently during the survey of the transition metals.

Group VIIA: Manganese, Technetium, and Rhenium

Fig. 6

Group VIIIA: Iron, Cobalt, Nickel, Ruthenium, Rhodium, Palladium, Osmium, Iridium, and Platinum

Fig. 7 trans-7,8-Dihydro-2,3,7,8,12,13,17,18-octaethylporphyrinato-iron (II). Reference values (CSD entry BUYKUB) in parentheses.

Fig. 8 Nickel Dimethylglyoxime. Reference values (CSD entry NIMGLO10) in parentheses.

Fig. 9

Group IB: Copper, Silver, and Gold

Fig. 10 Copper phthalocyanine. Reference values (CSD entry CUPOCY16) in parentheses.

Group IIB: Zinc, Cadmium, and Mercury

Discussion

Methodological changes

During the development of PM6, only very minor changes were made to the set of approximations. The main change was in the construction of the training set used for parameter optimization. One of the most important changes was the use of rules in the training set to define chemical information that was not a function of any single molecule. In earlier methods the training set had included only standard reference data. By their nature, such data could not allow for chemical facts that were independent of any one moiety. For example, the strength of a hydrogen bond is of great importance in biochemistry, but it could not be expressed in terms of a single species. By use of rules, the value of some chemical quantity could be related to that of another.
In the case of hydrogen bonding, the heat of formation of the water dimer was made a function of the heat of formation of two separated water molecules.

Detecting faults in semiempirical methods is difficult, and rather than wait until all errors of this type were found and fixed, a process that could potentially take several more years, the decision was made to freeze the parameters at their current values. Obviously, PM6 still has many errors; some have already been described. Work has already started in an attempt to correct them.

Elimination of computational artifacts

Earlier NDDO methods, particularly PM3 and AM1, produced artifacts in potential energy surfaces as a result of unrealistic terms in the core-core approximation, specifically in the set of Gaussian functions used. In PM6, only one Gaussian-type correction to the core-core potential is allowed, and, consequently, the potential for these artifacts has been reduced. On the other hand, because PM6 uses diatomic parameters, the likelihood of readily characterized errors involving specific pairs of atoms, e.g. Sc and H, as mentioned earlier, is increased. Errors of this type can be easily eliminated by a re-parameterization of the faulty diatomic.

There are over 450 sets of diatomic interactions parameterized in PM6, covering most of the common types of chemical bonds. But the number of potential bonds is much larger: given 70 elements, there are almost 2500 diatomic sets. If a molecule contains two elements for which the diatomic interaction parameters are missing, then, provided the elements are well separated, say by more than 4 Ångstroms, the absence of the parameters will not be important. If the two elements were near to each other, then the diatomic core-core parameters would be needed.
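The pair count and the distance criterion described above can be sketched in a few lines. This is only an illustration and not MOPAC code: the function names, the toy set of parameterized pairs, and the coordinates are all hypothetical, and the 4 Å cutoff is simply the rough figure quoted in the text.

```python
from math import dist


def count_diatomic_sets(n_elements: int) -> int:
    """Number of unordered element pairs, including like-like pairs."""
    return n_elements * (n_elements + 1) // 2


def missing_close_pairs(atoms, parameterized, cutoff=4.0):
    """List element pairs closer than `cutoff` (in Å) that lack diatomic parameters.

    atoms: list of (symbol, (x, y, z)) with coordinates in Å
    parameterized: set of frozensets of element symbols (a made-up table here)
    """
    missing = set()
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (sym_i, xyz_i), (sym_j, xyz_j) = atoms[i], atoms[j]
            has_params = frozenset((sym_i, sym_j)) in parameterized
            if not has_params and dist(xyz_i, xyz_j) < cutoff:
                missing.add(tuple(sorted((sym_i, sym_j))))
    return sorted(missing)


# Toy input: the geometry and the parameter table are invented for illustration.
atoms = [("Sc", (0.0, 0.0, 0.0)), ("H", (1.8, 0.0, 0.0)), ("Xe", (9.0, 0.0, 0.0))]
parameterized = {frozenset(("H", "Xe"))}

print(count_diatomic_sets(70))                    # 2485
print(missing_close_pairs(atoms, parameterized))  # [('H', 'Sc')]
```

With 70 elements the count is 70 × 71 / 2 = 2485, which is the "almost 2500" figure. In the toy geometry only the close Sc and H contact is flagged; the Sc and Xe pair is also unparameterized in this invented table, but it lies beyond the cutoff and is ignored.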
Adding a missing diatomic set would involve generating a small training set of reference data that included a few examples of the type of interaction involved, and optimizing the two terms in the diatomic interaction. This ability to add diatomic parameter sets to PM6 without modifying the underlying parameterization has the advantage that more and more types of interaction can be added without changing the essential nature of the method.

Accuracy

Several low-energy phenomena are predicted more accurately by PM6, with the most important of these being the prediction of the energies and geometries involved in hydrogen bonding. One consequence of this increased accuracy is that the lowest energy conformer of acetylacetone is now correctly predicted to be the ene-ol structure, and not the twisted di-one configuration.

As a result of the current work, there is a clear strategy for further improving the accuracy of semiempirical methods. All three potential sources of error need to be addressed. Regarding reference data, considerably more data are needed than were used here. These would likely come from increased use of high-level theoretical methods: methods significantly more accurate than those used here would obviously be needed in any future work. Parameter optimization can be performed with confidence and reliability, particularly when well-behaved systems are used. In all cases examined where problems were encountered in parameter optimization, problems also occurred in the normal SCF calculation in MOPAC2007. This implies that as faults in the SCF procedure are corrected, faults in parameter optimization would also be removed.

Permanent errors

Notwithstanding the optimism just expressed, not all errors can be eliminated by better data and better optimizations. Despite strenuous efforts, some calculated quantities persistently failed to agree with the reference values. Many potential causes for these failures were investigated.
In each case the weight for the offending quantity was increased considerably and the parameter optimization re-run. When that was done, the specific error decreased, but errors elsewhere increased disproportionately. Since the final gradient of the error function was acceptably small, it followed that the parameter optimization was not in error. The reference data were checked to ensure that they were in fact trustworthy. Because two of the three possible origins of error had been eliminated, the inescapable conclusion was that there is a fault in the set of approximations. The most serious of these faults was the qualitatively incorrect prediction of the geometry of the exceedingly simple system, iron pentacarbonyl.

Conclusions

The potential exists for further large increases in accuracy. This would likely result from the increased use of accurate reference data derived from high-level methods, and from the development of better tools for detecting errors at an early stage of method development.

Electronic supplementary material

Tables of errors in predicted heats of formation, geometries, dipole moments, and ionization potentials obtained using PM6, AM1, PM3, PM5, RM1, AM1*, HF 6-31G*, and B3LYP 6-31G* for individual species are provided, together with references for all reference data used. These data were used in generating the statistics presented in the discussion on accuracy.
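The summary statistics reported in the tables (average signed error, average unsigned error, RMS, and median) can be recomputed from per-species data of this kind. The sketch below is one plausible way to do so and rests on two assumptions not stated in the text: that an error is the calculated value minus the reference value, and that the median is taken over unsigned errors. The function name `error_statistics` is hypothetical.

```python
from math import sqrt
from statistics import mean, median


def error_statistics(calculated, reference):
    """Summarize errors between calculated and reference values.

    Assumptions: error = calculated - reference, and the median is
    taken over the unsigned (absolute) errors.
    """
    if len(calculated) != len(reference) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [c - r for c, r in zip(calculated, reference)]
    unsigned = [abs(e) for e in errors]
    return {
        "signed": mean(errors),
        "aue": mean(unsigned),
        "rms": sqrt(mean([e * e for e in errors])),
        "median": median(unsigned),
    }


# First three rows of Table 18 (Ref and PM6 columns), used only as sample input.
reference = [-2.94, -4.90, -5.51]
pm6 = [-2.34, -5.12, -5.25]

stats = error_statistics(pm6, reference)
print({name: round(value, 2) for name, value in stats.items()})
```

For these three hydrogen-bonded systems the average unsigned error is 0.36 and the average signed error is about 0.21. Because an RMS weights large deviations more heavily than an average unsigned error does, it can never be the smaller of the two, and Table 16 shows this for all three methods.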