# GIST

Perform Grid Inhomogeneous Solvation Theory (GIST) analysis.

gist [doorder] [doeij] [skipE] [refdens <rdval>] [temp <tval>] [noimage] [gridcntr <xval> <yval> <zval>]

[griddim <xval> <yval> <zval>] [gridspacn <spaceval>] [prefix <filename prefix>] [ext <grid extension>] [out <output>] [info <info>]

[doorder] Calculate the water order parameter [reference] for each voxel.

[doeij] Calculate the triangular matrix representing the water-water interactions between pairs of voxels (see below).

[skipE] Skip all energy calculations (cannot be specified with ’doeij’).

[refdens <rdval>] Reference density of bulk water, used in computing g_O, g_H, and the translational entropy. Default is 0.0334 molecules/Å³.

[temp <tval>] Temperature of the input trajectory.

[noimage] Disable distance imaging in energy calculation.

[gridcntr <xval> <yval> <zval>] Coordinates (Å) of the center of the grid (default 0.0, 0.0, 0.0).

[griddim <xval> <yval> <zval>] Grid dimensions along each coordinate axis (default 40, 40, 40).

[gridspacn <spaceval>] Grid spacing (linear dimension of each voxel) in Angstroms. Values greater than 0.75 Å are not recommended (default 0.5 Å).

[prefix <filename prefix>] Output file name prefix (default “gist”).

[ext <grid extension>] Output grid file name extension (default “.dx”).

[out <output>] Name of the main GIST output file. If not specified set to ’<prefix>-output.dat’.

[info <info>] Name of main GIST info file. If not specified info is written to standard output.

**DataSet Aspects:**

[gO] Number density of oxygen centers found in the voxel, in units of the bulk density.

[gH] Number density of hydrogen centers found in the voxel in units of the reference bulk density.

[Esw] Mean solute-water interaction energy density.

[Eww] Mean water-water interaction energy density.

[dTStrans] First order translational entropy density.

[dTSorient] First order orientational entropy density.

[neighbor] Mean number of waters neighboring the water molecules found in this voxel multiplied by the voxel number density.

[dipole] Magnitude of mean dipole moment (polarization).

[order] Average Tetrahedral Order Parameter.

[dipolex] x-component of the mean water dipole moment density

[dipoley] y-component of the mean water dipole moment density

[dipolez] z-component of the mean water dipole moment density

[Eij] Water-water interaction matrix.

Grid Inhomogeneous Solvation Theory [ref1, ref2] (GIST) is a method for analyzing the structure and thermodynamics of solvent in the vicinity of a solute molecule. The current implementation works for only water, but the method can be generalized to other solvents whose molecules are rigid like water, such as chloroform or dimethylsulfoxide (DMSO). GIST post-processes explicit solvent simulation data to create a three-dimensional mapping of water density and thermodynamic properties within a region of interest, which is defined by a user-specified 3D rectangular grid. The small grid boxes are referred to as voxels, and each voxel is associated with solvent properties. (See Fig. 28.1.)

The GIST implementation incorporated into AmberTools cpptraj also calculates a number of other local water properties, as listed below. GIST works for the nonpolarizable water models currently supported by AMBER. In order to carry out a GIST calculation, you must have a trajectory file generated with explicit water, as well as the corresponding topology file. To generate the most readily interpretable results, it is recommended that the solute (e.g., a protein) be restrained into essentially one conformation. GIST will then provide information about the structure and thermodynamics of the solvent for that conformation. For a room-temperature simulation of a solvent-exposed binding site, and a grid spacing of 0.5 Å, it is recommended that the simulation be at least 10-20 ns in duration, and it is also a good idea to check for convergence of the GIST properties you are interested in by loading and then processing successively more frames of your trajectory file. Because GIST assumes that the solute of interest comprises all molecules in the simulation that are not waters, it is a good idea to remove all counterions and cosolutes with cpptraj’s strip command before running GIST. A sample series of cpptraj commands for running GIST is provided below.

Although it is not mandatory to supply values of gridcntr, griddim and gridspacn, these parameters should be carefully chosen, because they determine the region to be analyzed (gridcntr and griddim) and the spatial resolution and convergence properties of the results (gridspacn). In particular, although smaller grid spacings will give finer spatial resolution, longer simulation times will be needed to converge the properties in the smaller voxels that result. A larger grid spacing will allow earlier convergence, but will smooth the spatial distributions and hence can reduce accuracy.
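To make the trade-off between grid parameters concrete, the following minimal Python sketch computes the region covered by a grid, the voxel volume, and the mean number of bulk waters expected per voxel per frame. It assumes that the grid spans griddim × gridspacn along each axis and is centered on gridcntr; the function name `grid_geometry` is hypothetical and is not part of cpptraj.

```python
def grid_geometry(center, dims, spacing):
    """Return lower corner, upper corner, voxel volume and voxel count.

    Assumption: the grid extends dims[k] * spacing along axis k and is
    centered on `center` (all lengths in Angstroms).
    """
    extent = [n * spacing for n in dims]
    lower = [c - e / 2.0 for c, e in zip(center, extent)]
    upper = [c + e / 2.0 for c, e in zip(center, extent)]
    voxel_volume = spacing ** 3
    n_voxels = dims[0] * dims[1] * dims[2]
    return lower, upper, voxel_volume, n_voxels


# Grid parameters taken from the sample input in this section.
lower, upper, vvox, nvox = grid_geometry((25.0, 31.0, 30.0), (41, 41, 45), 0.5)

# Mean number of waters per voxel per frame at the default bulk density.
waters_per_voxel_per_frame = 0.0334 * vvox

print("lower corner:", lower)
print("upper corner:", upper)
print("voxel volume (A^3):", vvox, "voxels:", nvox)
print("bulk waters per voxel per frame:", waters_per_voxel_per_frame)
```

Because the expected count per voxel per frame scales with the cube of the spacing, halving the spacing reduces the sampling in each voxel roughly eightfold, which is why finer grids need longer trajectories.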

The reference density of water (rdval) is taken by default to be the experimental number density of pure water at 300 K and 1 atm. However, different water models may yield slightly different bulk densities under these conditions, and the density also depends on T and P. If you know that the bulk density of the water model you are using, at the T and P of your simulation, deviates significantly from 0.0334 water molecules/Å³, it would be advisable to supply the actual value with the refdens keyword, instead of allowing GIST to supply the default value.

**GIST Output**

GIST generates a main output file and a collection of grid data files that by default are in Data Explorer format (.dx); this can be changed via the ext keyword. These grid files enable visualization of the various gridded quantities, such as with the program VMD [reference]. If the doeij keyword is provided, GIST also writes out a matrix of water-water interactions between pairs of voxels. In addition, run details are written to stdout, which can be redirected into a log file.

Note that a number of quantities are written out as both densities and normalized quantities. For example, the output file includes both the solute-water energy density and the normalized (per water) solute-water energy. In all cases, the normalized quantity at voxel i, $X_{i,norm}$, is related to the corresponding density, $X_{i,dens}$, by the relationship $X_{i,dens} = \rho_i X_{i,norm}$, where $\rho_i$ is the number density of water in the voxel. The normalized quantity provides information regarding the nature of the water found in the voxel. The density has the property that, if the grid extended over the entire simulation volume, the total system quantity would be given by $X_{tot} = V_{voxel} \sum_i X_{i,dens}$, where $V_{voxel}$ is the volume of one grid voxel.
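The relationship between densities, normalized quantities, and system totals can be illustrated with a short Python sketch. The helper names and the example numbers are hypothetical; the only relationships used are that a normalized (per water) value is the density divided by the voxel number density, and that a total is the voxel volume times the sum of densities. Empty voxels are assigned a normalized value of zero here, which is an assumption of this sketch.

```python
def normalized_from_density(x_dens, rho):
    """Per-water values from densities; empty voxels (rho == 0) give 0.0."""
    return [xd / r if r > 0.0 else 0.0 for xd, r in zip(x_dens, rho)]


def total_from_density(x_dens, voxel_volume):
    """Total over the listed voxels: V_voxel times the sum of densities."""
    return voxel_volume * sum(x_dens)


# Hypothetical three-voxel example: number densities in molecules/A^3 and
# per-water solute-water energies in kcal/mol.
rho = [0.0334, 0.0668, 0.0]
esw_norm_in = [-1.0, -2.0, 0.0]

# Density = number density times per-water value.
esw_dens = [r * e for r, e in zip(rho, esw_norm_in)]

esw_norm = normalized_from_density(esw_dens, rho)
esw_total = total_from_density(esw_dens, 0.125)  # 0.5 A spacing -> 0.125 A^3
print(esw_norm, esw_total)
```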

The main output file takes the form of a space-delimited-variable file, where each row corresponds to one voxel of the grid. This file can easily be opened with and manipulated with spreadsheet programs like Excel and LibreOffice Calc. The columns are as follows.

• index – A unique, sequential integer assigned to each voxel

• xcoord – x coordinate of the center of the voxel (Å)

• ycoord – y coordinate of the center of the voxel (Å)

• zcoord – z coordinate of the center of the voxel (Å)

• population – Number of water molecules, $n_i$, found in the voxel over the entire simulation. A water molecule is deemed to populate a voxel if its oxygen coordinates are inside the voxel. The expectation value of this quantity increases in proportion to the length of the simulation.

• g_O – Number density of oxygen centers found in the voxel, in units of the bulk density (rdval). Thus, the expectation value of g_O for a neat water system is unity.

• g_H – Number density of hydrogen centers found in the voxel in units of the reference bulk density (2rdval). Thus, the expectation value of g_H for a neat water system would be unity.

• dTStrans-dens – First order translational entropy density (kcal/mole/Å³), referenced to the translational entropy of bulk water, based on the value rdval.

• dTStrans-norm – First order translational entropy per water molecule (kcal/mole/molecule), referenced to the translational entropy of bulk water, based on the value rdval. The quantity dTStrans-norm equals dTStrans-dens divided by the number density of the voxel.

• dTSorient-dens – First order orientational entropy density (kcal/mole/Å³), referenced to bulk solvent (see below).

• dTSorient-norm – First order orientational entropy per water molecule (kcal/mole/water), referenced to bulk solvent (see below). This quantity equals dTSorient-dens divided by the number density of the voxel.

• Esw-dens – Mean solute-water interaction energy density (kcal/mole/Å³). This is the interaction of the solvent in a given voxel with the entire solute. Both Lennard-Jones and electrostatic interactions are computed without any cutoff, within the minimum image convention but without Ewald summation. This quantity is referenced to bulk, in the trivial sense that the solute-solvent interaction energy is zero in bulk.

• Esw-norm – Mean solute-water interaction energy per water molecule. This equals Esw-dens divided by the number density of the voxel (kcal/mole/molecule).

• Eww-dens – Mean water-water interaction energy density, scaled by ½ to prevent double-counting, and not referenced to the corresponding bulk value of this quantity (see below). This quantity is one half of the mean interaction energy of the water in a given voxel with all other waters in the system, both on and off the GIST grid, divided by the volume of the voxel (kcal/mole/Å³). Again, both Lennard-Jones and electrostatic interactions are computed without any cutoff, within the minimum image convention.

• Eww-norm – Mean water-water interaction energy, normalized to the mean number of water molecules in the voxel (kcal/mole/water). See prior column definition for details.

• Dipole_x-dens – x-component of the mean water dipole moment density (Debye/Å³).

• Dipole_y-dens – y-component of the mean water dipole moment density (Debye/Å³).

• Dipole_z-dens – z-component of the mean water dipole moment density (Debye/Å³).

• Dipole-dens – Magnitude of mean dipole moment (polarization) (Debye/Å³).

• Neighbor-dens – Mean number of waters neighboring the water molecules found in this voxel multiplied by the voxel number density. Two waters are considered neighbors if their oxygens are within 3.5 Å of each other.

For any given frame,

• Neighbor-norm – Mean number of neighboring water molecules, per water molecule found in the voxel (units of number per water).

• Order-norm – Average Tetrahedral Order Parameter [reference], $q_{tet}$, for water molecules found in the voxel, normalized by the number of waters in the voxel. The order parameter for water i in a given frame is given by $q_{tet}(i) = 1 - \frac{3}{8} \sum_{j=1}^{3} \sum_{k=j+1}^{4} \left( \cos\phi_{ijk} + \frac{1}{3} \right)^2$, where j and k index the 4 closest water neighbors to water i, and $\phi_{ijk}$ is the angle formed by water i, j, and k. If the doorder keyword is not provided or is set to FALSE, then this calculation will not be done, and the entries in this column will be set to zero.

Grid files are provided for all computed quantities listed above, except that the normalized quantities are not included.
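As an illustration of the tetrahedral order parameter defined for the Order-norm column, the following Python sketch evaluates $q_{tet}$ for a single water from the oxygen coordinates of its four nearest neighbors. It is a sketch under stated assumptions rather than the cpptraj implementation: the function name is hypothetical, the four neighbors are assumed to have been selected already, periodic imaging is ignored, and the angle is taken with the central water at the vertex.

```python
import math


def tetrahedral_order(center, neighbors):
    """q_tet for one water, given its oxygen position and 4 neighbor oxygens."""
    if len(neighbors) != 4:
        raise ValueError("exactly four neighbors are required")
    # Unit vectors from the central oxygen to each neighbor oxygen.
    unit_vectors = []
    for pos in neighbors:
        d = [pos[k] - center[k] for k in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        unit_vectors.append([c / norm for c in d])
    # Sum over the six distinct neighbor pairs (j < k).
    total = 0.0
    for j in range(3):
        for k in range(j + 1, 4):
            cos_phi = sum(a * b for a, b in zip(unit_vectors[j], unit_vectors[k]))
            total += (cos_phi + 1.0 / 3.0) ** 2
    return 1.0 - (3.0 / 8.0) * total


# Neighbors at the vertices of a regular tetrahedron give q_tet = 1.
ideal = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
q_ideal = tetrahedral_order((0.0, 0.0, 0.0), ideal)
print(q_ideal)
```

For a perfectly tetrahedral arrangement every pair angle has a cosine of -1/3, so each term vanishes and $q_{tet}$ equals 1; less ordered arrangements give smaller values.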

The filenames are as follows: gist-gO.dx, gist-gH.dx, gist-dTStrans-dens.dx, gist-dTSorient-dens.dx, gist-Esw-dens.dx, gist-Eww-dens.dx, gist-dipolex-dens.dx, gist-dipoley-dens.dx, gist-dipolez-dens.dx, gist-dipole-dens.dx, gist-neighbor-dens.dx, gist-neighbor-norm.dx, gist-order-norm.dx. If the doorder keyword is not provided, then the data in gist-order-norm.dx will all be zeroes. Note that the file of voxel water densities, gist-gO.dx, can be used as input to the program Placevent, in order to define spherical hydration sites based on the density distribution.

Similar grid files with other computed quantities can be generated by reading the gist.out file into a spreadsheet program, processing the numbers to generate a new column of voxel data of interest, and writing this column to an ascii text file. Then the Perl script write_dx_file.pl, which should be available on the GIST tutorial web-site, may be used to read in the column of data and create the corresponding dx file. The input format, and an example, are as follows:

./write_dx_file.pl [filename] [x-dimension y-dimension z-dimension] [x-origin y-origin z-origin] [grid spacing]

./write_dx_file.pl file.dat 40 40 40 13.0 13.0 13.0 0.75
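For readers who prefer not to depend on the Perl script, the conversion of a column of voxel values to a dx file can be sketched in Python. This is not a port of write_dx_file.pl; it is a minimal writer for the commonly used OpenDX scalar grid layout, and it rests on several assumptions: the function name `write_dx` is hypothetical, the values are assumed to be ordered with the z index varying fastest (which should be checked against the voxel ordering in your GIST output), and the origin is passed through exactly as supplied.

```python
import io


def write_dx(stream, values, dims, origin, spacing):
    """Write a scalar grid in OpenDX format (assumed ordering: z fastest)."""
    nx, ny, nz = dims
    if len(values) != nx * ny * nz:
        raise ValueError("number of values does not match grid dimensions")
    ox, oy, oz = origin
    stream.write(f"object 1 class gridpositions counts {nx} {ny} {nz}\n")
    stream.write(f"origin {ox} {oy} {oz}\n")
    stream.write(f"delta {spacing} 0 0\n")
    stream.write(f"delta 0 {spacing} 0\n")
    stream.write(f"delta 0 0 {spacing}\n")
    stream.write(f"object 2 class gridconnections counts {nx} {ny} {nz}\n")
    stream.write(
        f"object 3 class array type double rank 0 items {len(values)} data follows\n"
    )
    # Three values per line, as is conventional for this format.
    for start in range(0, len(values), 3):
        chunk = values[start:start + 3]
        stream.write(" ".join(f"{v:.6g}" for v in chunk) + "\n")
    stream.write('attribute "dep" string "positions"\n')
    stream.write('object "regular positions regular connections" class field\n')
    stream.write('component "positions" value 1\n')
    stream.write('component "connections" value 2\n')
    stream.write('component "data" value 3\n')


# Tiny 2 x 2 x 2 example written to an in-memory buffer.
buffer = io.StringIO()
write_dx(buffer, [0.0] * 8, (2, 2, 2), (13.0, 13.0, 13.0), 0.75)
dx_lines = buffer.getvalue().splitlines()
print(dx_lines[0])
```

To write a real file, the same function can be called with a handle obtained from `open("new-quantity.dx", "w")`, where the file name is only an example.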

If the doeij keyword is provided, GIST also writes a large file, Eww_ij.dat, containing the mean water-water interaction energies between pairs of voxels, scaled by ½. (See below.) This file has three columns. The first two columns are voxel indexes, i, j, where j > i, so that no pair appears more than once, and the third column is the mean interaction energy (kcal/mole) of water in voxels i and j, scaled by ½. If the occupancy of either voxel is 0, such as for voxels covered by solute atoms, then the interaction energy is zero. In order to save space, such interactions are omitted from the file.

**Sample cpptraj input file to run GIST**

The following input file, gist.in, causes cpptraj to read a parameter file named topology.top; read in the first 5000 frames of the trajectory file named trajectoryfile.mdcrd; strip out all Na and Cl ions; and carry out a GIST run which computes order parameters (doorder) and the water-water interaction matrix between pairs of voxels (doeij), uses a 41x41x45 grid centered at (25.0, 31.0, 30.0) with a spacing of 0.5 Å, uses the default bulk water density of 0.0334 molecules/Å³, and generates the main output file gist.out.

parm topology.top

trajin trajectoryfile.mdcrd 1 5000

strip @Na

strip @Cl

gist doorder doeij gridcntr 25.0 31.0 30.0 griddim 41 41 45 gridspacn 0.50 out gist.out

go

To execute this run in the background, use

cpptraj < gist.in > gist.log &

or

cpptraj -i gist.in > gist.log &

**Referencing GIST results to unperturbed (bulk) water**

| Water Model | Mean Energy (Eww-norm) (kcal/mol/water) | Number Density (Å⁻³) |
|---|---|---|
| TIP3P | -9.533 | 0.0329 |
| TIP4PEW | -11.036 | 0.0332 |
| TIP4P | -9.856 | 0.0332 |
| TIP5P | -9.596 | 0.0329 |
| Tip3PFW | -11.369 | 0.0334 |
| SPCE | -11.123 | 0.0333 |
| SPCFW | -11.873 | 0.0329 |

Table 28.3: Water model energy and density.

Inhomogeneous fluid solvation theory, which is the basis of GIST, is designed to provide information on how water structure and thermodynamics around a solute molecule, such as a protein, are changed relative to the structure and thermodynamics of unperturbed (bulk) water. Accordingly, the quantities reported by GIST are most informative when the results are referenced to the corresponding bulk water properties. For the orientational entropy, the reference value is the same regardless of water model or conditions, because the first order orientational distribution of water in the bulk is always uniform. Therefore, the GIST results for orientational entropies are already referenced to bulk. However, cpptraj reports unreferenced values for those GIST quantities whose reference values depend upon the water model and the simulation conditions; i.e., the energies. The translational entropy as well as the number densities will be referenced to bulk using the input reference density or the default density value of 0.0334 molecules/Å³. Table 28.3 provides useful reference values for these quantities, computed for various water models at P = 1 atm, T = 300 K, using GIST in order to ensure a consistent minimum image treatment of periodic boundary conditions.

Users running calculations under significantly different conditions, or with different water models, should consider generating their own reference quantities by applying GIST to a simulation of pure water under their conditions of interest. The quantities of interest can then be obtained in their most precise available form by averaging over voxels, for the pure water simulation. If the quantity of interest is Q, then its average reference value is

$$Q_{reference} = \frac{\sum_i n_i Q_i}{\sum_i n_i},$$

where $Q_i$ and $n_i$ are, respectively, GIST’s reported values of the quantity and the population in voxel i. The densities, $\rho_i$, are referenced to the corresponding bulk densities, $\rho^o$, as $g_i = \rho_i / \rho^o$, while the energy and entropy terms are referenced by subtracting their bulk values.
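The population-weighted average and the subsequent referencing step can be sketched in a few lines of Python. The function names and the example numbers below are hypothetical; in practice the populations and values would be read from the population column and the column of interest (for example Eww-norm) of the main output file from a pure-water run.

```python
def reference_value(populations, values):
    """Population-weighted average of a per-voxel quantity."""
    total_population = sum(populations)
    if total_population == 0:
        raise ValueError("all voxels are empty; no reference can be computed")
    weighted_sum = sum(n * q for n, q in zip(populations, values))
    return weighted_sum / total_population


def reference_to_bulk(values, bulk_value):
    """Subtract the bulk reference from each per-voxel value."""
    return [q - bulk_value for q in values]


# Hypothetical pure-water run with two voxels.
bulk_populations = [100, 300]
bulk_eww_norm = [-9.0, -10.0]
eww_reference = reference_value(bulk_populations, bulk_eww_norm)

# Hypothetical solute run: per-water energies relative to bulk.
referenced = reference_to_bulk([-9.0, -10.0], eww_reference)
print(eww_reference, referenced)
```

Weighting by population rather than taking a plain mean over voxels ensures that sparsely populated voxels, whose averages are noisier, contribute in proportion to the number of samples behind them.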

**Interpreting GIST results**

GIST provides access to the first order entropies and the first- and second-order energies of inhomogeneous fluid solvation theory. Non-zero higher-order entropies exist but are not yet computationally accessible. However, for a pairwise additive force-field, such as those listed in the Table above, the energy is fully described at the second order provided by GIST.

GIST is a research tool, and its applications (to, for example, protein-ligand binding and protein function) are still being explored. The following general comments may be helpful to users studying GIST results.

- The water in voxels near a solute (e.g., a protein) almost always has unfavorable water-water interaction energies, relative to bulk, simply because the solute displaces water, resulting in fewer proximal water-water interactions.
- The unfavorable water-water energies mentioned in the previous item may be balanced by favorable water-solute interactions. If they are not, as may occur especially for voxels in small, hydrophobic pockets, then the net energy of the water in the voxel may be unfavorable relative to bulk, in which case a ligand which displaces water from the voxel into bulk may get a boost in affinity.
- Because the first order orientational distribution of bulk water is uniform, and a nonuniform distribution always has lower entropy than a uniform one, the solute can only lower the orientational entropy of water, relative to bulk. Thus, this term always opposes solvation, and displacing oriented water into the bulk is always favorable from the standpoint of orientational entropy.
- Localized water, which corresponds to voxels with high water density, has a low first order translational entropy, and the translational entropy around a solute is lower than that in bulk, as a nonuniform translational distribution takes the place of the uniform translational distribution of bulk water.
- The displacement of highly oriented (low orientational entropy) and localized (low translational entropy) water into bulk leads to a favorable increase in these entropy terms.
- However, highly oriented and localized water is often the consequence of strongly favorable polar interactions, such as hydrogen-bonding, between water and the solute. As a consequence, the net favorability of displacing such water is frequently a balance between favorable entropic consequences and unfavorable energetic consequences.
- The water-water energy associated with a given voxel accounts for the interactions of the waters in this voxel with all other waters in the system, including waters in other voxels. This quantity is multiplied by ½, so that, in a pure-water system where the GIST grid covers the entire simulation box, the sum over all voxels equals the correct mean water-water interaction energy. Note that [reference] does not include this factor of ½.
- For a typical GIST application, in which the grid occupies only part of the simulation box, the energy bookkeeping can become complicated, as discussed in Section II.B.3 (page 044101-6) of [reference]. That section also explains how one can compute the water-water energy associated with a region R defined by a set of voxels, $E^{R}_{WW}$. The regional water-water energy, on a normalized (per water) basis, is given by $E^{R}_{WW} = 2 \left( \sum_{i \in R} E_{i,WW} - \sum_{i \in R} \sum_{j \in R,\, j>i} E_{i,j,WW} \right)$, where $i \in R$ means that voxel i is in region R, $E_{i,WW}$ is the value of Eww-norm for voxel i, and $E_{i,j,WW}$ is the value of the water-water interaction energy between voxels i and j, taken from the file Eww_ij.dat. The extra factor of 2 in the present formula, relative to that in the paper, results from application of an extra factor of ½ to the reported water-water interaction energies here.
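The regional water-water energy formula in the last item can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of cpptraj: the function names are hypothetical, Eww_ij.dat is assumed to contain three whitespace-separated columns (i, j, energy) with j > i, any line that does not parse that way is skipped, and pairs missing from the file are treated as zero, consistent with the omission of empty-voxel pairs described earlier.

```python
def parse_eij(lines):
    """Read (i, j) -> energy from three-column text lines; skip other lines."""
    eij = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            i, j, energy = int(fields[0]), int(fields[1]), float(fields[2])
        except ValueError:
            continue  # e.g. a header line, if one is present
        eij[(i, j)] = energy
    return eij


def regional_eww(region, eww, eij):
    """2 * (sum_i E_i,WW - sum_{i<j} E_i,j,WW) over the voxels in `region`."""
    region = set(region)
    single_sum = sum(eww[i] for i in region)
    pair_sum = sum(
        energy
        for (i, j), energy in eij.items()
        if i in region and j in region and j > i
    )
    return 2.0 * (single_sum - pair_sum)


# Hypothetical three-voxel system; the region consists of voxels 0 and 1.
eww = {0: -5.0, 1: -4.0, 2: -3.0}
eij = parse_eij(["0 1 -1.0", "0 2 -0.5", "1 2 -0.25"])
region_energy = regional_eww({0, 1}, eww, eij)
print(region_energy)
```

Only pairs with both voxels inside the region enter the second sum, so the interaction between voxels 0 and 2 and between voxels 1 and 2 is left untouched in this example.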