Introduction

Pose prediction

Table 1 Comparison of RMSD results (Å) from a set of docking engines

          MolDock   GLIDE   Surflex
Mean      1.38      1.38    1.86
SD        1.49      1.74    2.02
Median    0.92      0.69    1.10

However, the problem of choosing which metric to use to compare pose prediction studies is dwarfed by the difficulty in choosing a dataset of protein–ligand co-complexes upon which to perform the comparison. A widespread tendency in conformer reproduction and pose prediction studies is to ignore even the possibility of error in the crystal structures that are being reproduced. Crystal structures are often treated as perfect, infinitely precise and accurate representations of the atomic details of a protein–ligand complex. There are a number of reasons why this is not so; a few will be discussed in the following paragraphs.

Crystal structures are models

Fig. 1 Fitting atoms into electron density produces a crystallographic model

Incomplete or fragmentary density;
The electron density not defining the positions of all atoms unambiguously;
Poor structural parameters are used for the fitting process, which can give inappropriate conformations (particularly of ligands);
Errors by the users, arising from careless treatment of the data or lack of expertise with small molecules.

Crystal structures have unavoidable imprecision

Fig. 2 B-factors for the ligand in the 5ER1 crystal structure

\documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym} 
\usepackage{amsfonts} 
\usepackage{amssymb} 
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$ \sigma\left(r, B_{\text{avg}}\right) = 2.2\, N_{\text{atoms}}^{1/2}\, V_{\text{a}}^{1/3}\, n_{\text{obs}}^{-5/6}\, R_{\text{free}} $$\end{document}

Fig. 3

Table 2 Resolution and DPI for selected structures from the Kirchmair dataset

PDB code   Resolution (Å)   DPI (Å)
1FC7       1.38             0.69
1FDO       1.38             0.60
1JJT       1.8              1.37
1JJE       1.8              1.25
1CIB       2.5              5.54
1ILH       2.76             0.14
1C8M       2.8              0.18
1QJX       2.8              0.25

Fig. 4 The nominal resolution versus the coordinate error for a subset of the Gold (structures with resolution <2.5 Å) and the Glide data sets

\documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym} 
\usepackage{amsfonts} 
\usepackage{amssymb} 
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$ \sigma\left(r, B_{\text{avg}}\right) = 0.22\,(1 + s)^{1/2}\, V_{M}^{-1/2}\, C^{-5/6}\, R_{\text{free}}\, d_{\min}^{5/2} $$\end{document}

Crystal structures have avoidable errors

Fig. 5 Ligand conformation from the 1A8T structure. The conformation has two serious atom–atom clashes

Fig. 6 Ligand conformation from the 1A4K structure. The cis-amide group is an error of fitting

Virtual screening

\documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym} 
\usepackage{amsfonts} 
\usepackage{amssymb} 
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$$ \text{EF} = \frac{\text{Hits}_{\text{sampled}}^{x\%}}{N_{\text{sampled}}^{x\%}} \times \frac{N_{\text{total}}}{\text{Hits}_{\text{total}}} $$\end{document}

It is dependent on the structure of the dataset, in that datasets with larger proportions of actives will have a narrower range of possible enrichments.
It penalizes ranking one active compound above another.
It exhibits pernicious behaviour at the cut-off at which the enrichment is calculated.
It gives no weight to where in the ranked list a known active compound appears. Thus, to calculate enrichment at 1% in a virtual screen of 10,000 compounds, only the number of actives (N) in the top-ranked 100 compounds is needed. However, the enrichment at 1% is the same whether the N active compounds are ranked at the very top of the list or at the very bottom of the top-ranked 100.
It is difficult to calculate errors in enrichment analytically, and there is no available literature for such a calculation.

Fig. 7 Effect of decoy selection method on virtual screening by docking

Fig. 8 AUCs for various virtual screening methods on part of the Surflex-Dock validation set

It is unfortunate that the docking targets in DUD (39 crystal structures and 1 homology model) were not selected with as much care as the small molecule datasets. In 6 of the 38 co-crystal structures in DUD (there is one apo structure in the set), the DPIs are 1.5 Å or more, resulting in significant uncertainty in the positioning of any atom in these structures. These structures are ALR2 (1AH3), COX-2 (1CX2), EGFR (1M17), GR (1M2Z), InhA (1P44) and p38 (1KV2). Accordingly, docking results from these structures should be interpreted with great care.

Conclusions
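
Returning to the enrichment factor (EF) defined in the Virtual screening section, the following is a minimal Python sketch of that formula and of two of the weaknesses listed for it. It is illustrative only: the function names, the boolean ranked-list representation and the example screen (10,000 compounds, 50 actives, 10 of them in the top 100) are assumptions made for this sketch and are not taken from any of the cited studies.

```python
def enrichment_factor(is_active_ranked, fraction):
    """EF = (Hits_sampled / N_sampled) * (N_total / Hits_total).

    is_active_ranked: booleans ordered best score first (assumed layout).
    fraction: the top x% of the list expressed as a fraction, e.g. 0.01.
    """
    n_total = len(is_active_ranked)
    hits_total = sum(is_active_ranked)
    if n_total == 0 or hits_total == 0:
        raise ValueError("need at least one compound and one active")
    n_sampled = max(1, round(fraction * n_total))
    hits_sampled = sum(is_active_ranked[:n_sampled])
    # Algebraically the same as the formula above, written as one division.
    return (hits_sampled * n_total) / (n_sampled * hits_total)


def max_enrichment_factor(n_total, hits_total, fraction):
    """Largest EF attainable at this cut-off for a given dataset composition."""
    n_sampled = max(1, round(fraction * n_total))
    best_hits = min(hits_total, n_sampled)
    return (best_hits * n_total) / (n_sampled * hits_total)


def ranked_list(n_total, active_ranks):
    """Hypothetical helper: best-first list with actives at 0-based ranks."""
    active_ranks = set(active_ranks)
    return [i in active_ranks for i in range(n_total)]


# Hypothetical screen: 10,000 compounds, 50 actives, 10 of them in the top 100.
n_total = 10_000
tail = list(range(5_000, 5_040))  # the 40 actives ranked outside the top 100
top_of_list = ranked_list(n_total, list(range(0, 10)) + tail)
bottom_of_top_100 = ranked_list(n_total, list(range(90, 100)) + tail)

ef_top = enrichment_factor(top_of_list, 0.01)
ef_bottom = enrichment_factor(bottom_of_top_100, 0.01)

# Dataset dependence: the ceiling on EF shrinks as the active fraction grows.
ef_max_sparse = max_enrichment_factor(n_total, 50, 0.01)
ef_max_dense = max_enrichment_factor(n_total, 5_000, 0.01)

print(ef_top, ef_bottom, ef_max_sparse, ef_max_dense)
```

In this sketch both rankings give the same EF at 1%, because the formula only counts how many actives fall inside the cut-off and ignores where they sit within it. The two ceilings differ by a factor of 50 purely because of the proportion of actives in the dataset, which is why enrichments from differently composed datasets are hard to compare.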