One of the questions that we are very often asked is, ‘How do you handle charged molecules?’ This is very relevant for small molecules but becomes critical in the context of the another popular question: ‘Can you put fields onto a protein?’ Here we detail our approach and rationale for the handling of charged small molecules.
In order to calculate the electrostatic field value at a point around a molecule we place a charged ‘probe atom’ at that position and measure its interaction energy. This calculation is only approximate: for a proper energy, we would have to allow the molecule to repolarize in response to the probe atom. Cresset’s XED force field includes atomic polarizability, but the calculations would be too slow if we repolarized the ligand for every putative field point position. The parameterization of the force field to give good field patterns takes this into account.
This field calculation algorithm works very well for most neutral small molecules and one of the criteria we used in parameterising the XED force field was to ensure that the field potentials calculated in this way match experimentally-determined molecular interaction energies and geometries as well as possible. However, there are two potential sticking points that we have to deal with: dielectric and formal charge.
The dielectric issue is an interesting one. We are primarily interested in molecules that act as ligands i.e. they are interacting with a biological macromolecule of some sort, usually a protein. What sort of environment around a ligand molecule are we expecting? The simplest solution, to do our calculations in vacuo, is obviously inappropriate. However, the often-used alternative, to apply some sort of solvation electrostatic model to model a water environment (GBSA, Poisson-Boltzmann, etc.), is also inappropriate as we are not really interested in how our ligand looks when fully solvated by water, as that’s not how the protein sees it. The effective dielectric inside a protein active site is difficult to measure exactly: generally accepted values seem to be around 2-5 for dry protein powders, with values for the active sites of solvated proteins ranging from 1 to ~20 or more.1, 2 The dielectric in enzyme active sites seems to be around 10, but this is probably swayed by strong contributions from mobile charged residues on the protein surface.3
Given all this, the question of what dielectric environment to assume for a ligand is more complex than is usually assumed. We have spent a lot of time looking at this and have found that assuming a simple dielectric of 4 around the ligand seems to give good results. This is roughly appropriate for moderately buried active sites but is almost certainly too low for highly exposed active sites. In practice it seems to work well for neutral molecules but what to do with the charged ones? Using a simple dielectric of 4 leads to the charged group completely dominating the electrostatics of the molecule – rather than a nuanced field pattern describing its binding characteristics you just get a big ball of charge (right). It should be noted that the commonly-used method of calculating Poisson-Boltzmann electrostatics in water can suffer from the same problem in small molecules: the low dielectric inside the molecule (as opposed to the high dielectric surrounding it) allows the local formal charge to dominate the molecule’s electrostatics.
We make two assumptions: interactions in biological systems tend to occur at high ionic strength, and charged ligands usually (albeit not always) tend to interact with charged residues in the protein. They very rarely interact directly with very hydrophobic/low dielectric regions of a binding pocket. Since the exact protein environment around a ligand isn’t usually known (if it was, you would not need to use field-based alignments in the first place!), we approximate these effects by assuming that charged groups usually have a counterion or mobile solvation shell somewhere nearby, and hence that they on average experience a higher dielectric than the non-charged regions of a molecule. Extensive testing showed that an effective dielectric of 32 works very well, so all of our field calculations take place with a dielectric of 4 for the neutral parts of the molecule, and a dielectric of 32 for charged groups. The effect of this on the field is that charged groups have large field points, as you would expect, but without swamping the electrostatic potentials from the rest of the molecule.
So we have a workable solution for the dielectric problem, at least for small molecules. To return to the other question, ‘can we add field patterns to proteins?’ From a computational point of view, the answer is ‘yes’, although the calculations are slow. However, if you think about the dielectric issues, you soon see that applying the same field calculation methodology to ligands and proteins is fundamentally flawed. Our standard field calculation method assumes that the molecule exists in a protein-ish environment. This is not true of the protein! We’ve been looking at alternatives that are proving very promising which we will detail on our blog and within our technology section as the methods evolve.