nastruct

Perform nucleic acid structure analysis.

nastruct [<dataset name>] [resrange <range>] [sscalc] [naout <suffix>]
         [noheader] [resmap <ResName>:{A,C,G,T,U} ...] [calcnohb]
         [noframespaces] [baseref <file>] ...
         [bpmode {3dna | babcock}] [allhb]
         [hbcut <hbcut>] [origincut <origincut>] [altona | cremer]
         [zcut <zcut>] [zanglecut <zanglecut>] [groovecalc {simple | 3dna}]
         [axesout <file> [axesoutarg <arg> ...] [axesparmout <file>]]
         [bpaxesout <file> [bpaxesoutarg <arg> ...] [bpaxesparmout <file>]]
         [stepaxesout <file> [stepaxesoutarg <arg> ...] [stepaxesparmout <file>]]
         [{first | reference | ref <name> | refindex <#> | allframes | specifiedbp pairs <b1>-<b2>, ...}]

[<dataset name>] Output data set name.
[resrange <range>] Residue range to search for nucleic acids in (default all).
[sscalc] Calculate parameters between consecutive bases in strands.
[naout <suffix>] File name suffix for output files:
BP.<suffix> for base pair parameters,
BPstep.<suffix> for base pair step parameters,
Helix.<suffix> for base pair step helical parameters.
SS.<suffix> for parameters of consecutive bases in strands if sscalc is specified.
[noheader] Do not print header to naout file.
[resmap <ResName>:{A,C,G,T,U}] Attempt to treat residues named <ResName> as if they were A, C, G, T, or U; useful for residues with modifications or non-standard residue names. This will only work if enough reference atoms are present in <ResName>.
[calcnohb] Calculate parameters between bases in base pairs even if no hydrogen bonds present between them.
[noframespaces] If specified, do not add a blank line between frames in the output files.
[baseref <file>] Specify a custom nucleic acid base reference. One file per custom residue; multiple ’baseref’ keywords may be present. See below for details.
[bpmode {3dna|babcock}] Specify axis conventions for calculating base pair parameters. If ‘3dna’ (default), use conventions of 3DNA[20]; flip Y and Z of complementary base for antiparallel. If ‘babcock’, use conventions of Babcock et al.; flip Y and Z of complementary base for antiparallel, flip X and Y for parallel.
[allhb] Report the total number of hydrogen bonds detected instead of just the number of Watson-Crick-Franklin hydrogen bonds.
[hbcut <hbcut>] Distance cutoff (in Angstroms) for determining hydrogen bonds between bases (default 3.5).
[origincut <origincut>] Distance cutoff (in Angstroms) between base pair axis origins for determining which bases are eligible for base-pairing (default 2.5).
[altona] Use method of Altona & Sundaralingam to calculate sugar pucker (default, see pucker command).
[cremer] Use method of Cremer and Pople to calculate sugar pucker (see pucker command).
[zcut <zcut>] Distance cutoff (in Angstroms) between base reference axes along the Z axis (i.e. stagger) for determining base pairing (default 2).
[zanglecut <zanglecut>] Angle cutoff (in degrees) between base reference Z axes for determining base pairing (default 65).
[groovecalc {simple | 3dna}] Groove width calculation method:
simple Use P-P distance for major groove, O4-O4 distance for minor groove. Output to ‘BP.<suffix>’.
3dna Use groove width calculation of El Hassan and Calladine. Output to ‘BPstep.<suffix>’.
[axesout <file>] Trajectory file to write base axes to.
[axesoutarg <arg>] Trajectory argument to pass to base axes trajectory file (can specify more than once).
[axesparmout <file>] Topology file to write base axes pseudo topology to.
[bpaxesout <file>] Trajectory file to write base pair axes to.
[bpaxesoutarg <arg>] Trajectory argument to pass to base pair axes trajectory file (can specify more than once).
[bpaxesparmout <file>] Topology file to write base pair axes pseudo topology to.
[stepaxesout <file>] Trajectory file to write base pair step axes to.
[stepaxesoutarg <arg>] Trajectory argument to pass to base pair step axes trajectory file (can specify more than once).
[stepaxesparmout <file>] Topology file to write base pair step axes pseudo topology to.
[axisnameo <name>] Change name of axis origin pseudo atom (default ‘Orig’).
[axisnamex <name>] Change name of axis X pseudo atom (default ‘X’).
[axisnamey <name>] Change name of axis Y pseudo atom (default ‘Y’).
[axisnamez <name>] Change name of axis Z pseudo atom (default ‘Z’).

How to determine base pairing:
[first] Use first frame to determine base pairing (default).
[reference | refindex <#> | ref <name>] Reference structure to use to determine base pairing; if not specified use first frame.
[allframes] If specified determine base pairing each frame.
[specifiedbp pairs <b1>-<b2>, ... ] User specified base pairing. Base pairs are specified in a comma-separated list after the ‘pairs’ keyword as <b1>-<b2>, where <b1> and <b2> are the residue numbers of bases in the base pair, e.g. ‘pairs 1-16,2-15,3-14,4-13’. Can specify ‘pairs’ multiple times.
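
For a regular antiparallel duplex the ‘pairs’ list follows a simple pattern, so it can be generated rather than typed by hand. The following Python sketch is not part of cpptraj; the function names are hypothetical, and it assumes strand 1 occupies residues 1 to N and strand 2 occupies residues N+1 to 2N with no gaps or mismatches.

```python
def duplex_pairs(n_bp):
    """Return (b1, b2) residue numbers for an antiparallel duplex.

    Assumption: strand 1 is residues 1..n_bp and strand 2 is
    residues n_bp+1..2*n_bp, so residue i pairs with 2*n_bp + 1 - i.
    """
    if n_bp < 1:
        raise ValueError("n_bp must be at least 1")
    return [(i, 2 * n_bp + 1 - i) for i in range(1, n_bp + 1)]


def specifiedbp_keyword(pairs):
    """Format pairs as the 'specifiedbp pairs <b1>-<b2>,...' keyword string."""
    return "specifiedbp pairs " + ",".join(f"{b1}-{b2}" for b1, b2 in pairs)


if __name__ == "__main__":
    # First four pairs of an 8 base pair duplex.
    print(specifiedbp_keyword(duplex_pairs(8)[:4]))
```

The resulting string can be appended to a nastruct command line; structures with overhangs or non-canonical pairing would need the list adjusted by hand.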

DataSets Created:
<name>[pucker]:X Base X (residue number) sugar pucker.

Base pairs:
<name>[shear]:X Base pair X (starting from 1) shear.
<name>[stretch]:X Base pair stretch.
<name>[stagger]:X Base pair stagger.
<name>[buckle]:X Base pair buckle.
<name>[prop]:X Base pair propeller.
<name>[open]:X Base pair opening.
<name>[hb]:X Number of WC hydrogen bonds between bases in base pair.
<name>[bp]:X Contains 1 if bases are base paired, 0 otherwise.
<name>[major]:X (If groovecalc simple) Major groove width calculated between P atoms of each base.
<name>[minor]:X (If groovecalc simple) Minor groove width calculated between O4 atoms of each base.

Base pair steps:
<name>[shift]:X Base pair step X (starting from 1) shift.
<name>[slide]:X Base pair step slide.
<name>[rise]:X Base pair step rise.
<name>[tilt]:X Base pair step tilt.
<name>[roll]:X Base pair step roll.
<name>[twist]:X Base pair step twist.
<name>[zp]:X Base pair step Zp value.
<name>[major]:X (If groovecalc 3dna) Major groove width, El Hassan and Calladine.
<name>[minor]:X (If groovecalc 3dna) Minor groove width, El Hassan and Calladine.

Helical steps:
<name>[xdisp]:X Helical step X (starting from 1) X displacement.
<name>[ydisp]:X Helical Y displacement.
<name>[hrise]:X Helical rise.
<name>[incl]:X Helical inclination.
<name>[tip]:X Helical tip.
<name>[htwist]:X Helical twist.

Strands (sscalc only):
<name>[dx]:X Strand pair X (starting from 1) X displacement.
<name>[dy]:X Y displacement.
<name>[dz]:X Z displacement.
<name>[rx]:X Relative rotation around X axis.
<name>[ry]:X Relative rotation around Y axis.
<name>[rz]:X Relative rotation around Z axis.

Note that data sets are not created until base pairing is determined.

Calculate basic nucleic acid (NA) structure parameters for all residues in the range specified by resrange (or all NA residues if no range specified). Residue names are recognized with the following priority: standard Amber residue names DA, DG, DC, DT, RA, RG, RC, and RU; 3 letter residue names ADE, GUA, CYT, THY, and URA; and finally 1 letter residue names A, G, C, T, and U. Non-standard/modified NA bases can be recognized by using the resmap keyword. For example, to make cpptraj recognize all 8-oxoguanine residues named ’8OG’ as a guanine-based residue:

nastruct naout nastruct.dat resrange 274-305 resmap 8OG:G

The resmap keyword can be specified multiple times, but only one mapping per unique residue name is allowed. Note that resmap may fail if the residue is missing heavy atoms normally present in the specified base type.

Base pairs are determined once, either from the first frame or from a reference structure, or every frame if allframes is specified; they can also be given explicitly with specifiedbp. Base pairing is determined first by base reference axis origin distance (origincut), then by stagger (zcut), then by the angle between base Z axes (zanglecut), and finally by hydrogen bonding (at least one hydrogen bond within hbcut must be present). Base pair parameters will only be written for determined base pairs. Both Watson-Crick and other types of base pairing can be detected. Note that although all possible hydrogen bonds are searched for, only WC hydrogen bonds are reported in the BP.<suffix> file unless allhb is specified.
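
The order of the pairing criteria can be illustrated with a short Python sketch. This is not the cpptraj implementation: the class and function names are hypothetical, the inputs are assumed to be precomputed (with the Z-axis angle taken after any axis flipping for antiparallel strands), and whether the actual comparisons are strict or inclusive is an assumption. The default values are the documented keyword defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairingCutoffs:
    origincut: float = 2.5   # Angstroms, distance between axis origins
    zcut: float = 2.0        # Angstroms, offset along Z (stagger)
    zanglecut: float = 65.0  # degrees, angle between base Z axes
    min_hbonds: int = 1      # at least one hydrogen bond is required


def is_base_pair(origin_dist, stagger, z_angle, n_hbonds, cut=None):
    """Apply the four criteria in the documented order.

    n_hbonds is assumed to have been counted already using the
    hbcut distance cutoff (default 3.5 Angstroms).
    """
    cut = cut or PairingCutoffs()
    if origin_dist > cut.origincut:
        return False
    if abs(stagger) > cut.zcut:
        return False
    if z_angle > cut.zanglecut:
        return False
    return n_hbonds >= cut.min_hbonds


if __name__ == "__main__":
    print(is_base_pair(1.0, 0.3, 10.0, 3))   # passes every check
    print(is_base_pair(1.0, 0.3, 70.0, 3))   # rejected by zanglecut
```

Ordering the checks from the cheapest geometric test to the hydrogen bond count means most candidate pairs are rejected before hydrogen bonds need to be considered.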

The procedure used to calculate NA structural parameters is the same as 3DNA, with algorithms adapted from Babcock et al. and reference frame coordinates from Olson et al. Provided the same base pairs are determined, CPPTRAJ nastruct gives the same values as 3DNA. One notable exception is parameters for G-quadruplex structures.

Calculated NA structure parameters are written to three separate files (plus a fourth, SS.<suffix>, if sscalc is specified), the suffix of which is specified by naout. Base pair parameters (shear, stretch, stagger, buckle, propeller twist, opening, number of WC hydrogen bonds, base pairing, and simple groove widths) are written to BP.<suffix>. Base pair step parameters (shift, slide, rise, tilt, roll, twist, Zp, and El Hassan and Calladine groove widths) are written to BPstep.<suffix>, and helical parameters (X-displacement, Y-displacement, rise, inclination, tip, and twist) are written to Helix.<suffix>. If noheader is specified, a header will not be written to the output files. Note that although sugar pucker is calculated, it is not written to an output file by default. You can output pucker to a file via the create or write/writedata commands after the data has been generated, e.g.:

nastruct NA naout nastruct.dat resrange 1-3,28-30
run
writedata NApucker.dat NA[pucker]
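
When post-processing results in a script, it can help to derive the expected output file names from the naout suffix. The following Python sketch uses a hypothetical helper name and assumes a plain suffix with no directory component.

```python
def nastruct_output_files(suffix, sscalc=False):
    """List the files nastruct is documented to write for a naout suffix.

    Assumption: suffix is a bare name such as 'nastruct.dat'.
    """
    names = [f"BP.{suffix}", f"BPstep.{suffix}", f"Helix.{suffix}"]
    if sscalc:
        # SS.<suffix> is only written when sscalc is specified.
        names.append(f"SS.{suffix}")
    return names


if __name__ == "__main__":
    for name in nastruct_output_files("nastruct.dat", sscalc=True):
        print(name)
```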

Note that while the underlying procedure is geared towards calculating parameters for base pairs, the code can be made to calculate parameters between consecutive bases in single strands by specifying sscalc. Base axes, base pair axes, and base pair step axes can be written to trajectory files using the axesout, bpaxesout, and stepaxesout and related keywords. The axes are written using 4 points: an origin, and X Y and Z which are bonded to the origin. The names of these pseudo atoms can be changed using the axisnameo, axisnamex, axisnamey, and axisnamez keywords.

Custom Nucleic Acid Base References.
Users can now specify baseref <file> to load a custom nucleic acid base reference. The base reference files are white-space delimited, begin with the line NASTRUCT REFERENCE, and have the following format:

NASTRUCT REFERENCE
<base character> <res name 0> [<res name 1> ...]
<atom name> <X> <Y> <Z> <HB type> <RMS fit>
...

There is a line for each reference atom. Lines beginning with ’#’ are ignored as comments.
<base character> Used to identify the underlying base type: A G C T or U. If none of these, it will be considered an unknown residue (which just means WC hydrogen bonding will not be identified).
<res name X> Specifies what residue names this reference corresponds to. There must be at least one residue name. There can be any number of these specified.
<atom name> A reference atom name.
<X> <Y> <Z> The X Y and Z coordinates of the reference atom.
<HB type> Denotes if and how the atom participates in hydrogen bonding. Can be ’d’onor, ’a’cceptor, or ’n’one (or the numbers 1, 2, 0 respectively). Only the first character of the word actually matters.
<RMS fit> Denotes whether the atom is involved in RMS-fitting (1 in the example below) or not (0).

Here is an example for GUA:

NASTRUCT REFERENCE
G G G5 G3
# Modified into format readable by cpptraj nastruct
C1' -2.477 5.399 0.000 0 0
N9 -1.289 4.551 0.000 0 1
C8 0.023 4.962 0.000 0 1
N7 0.870 3.969 0.000 accept 1
C5 0.071 2.833 0.000 0 1
C6 0.424 1.460 0.000 0 1
O6 1.554 0.955 0.000 accept 0
N1 -0.700 0.641 0.000 donor 1
C2 -1.999 1.087 0.000 0 1
N2 -2.949 0.139 -0.001 donor 0
N3 -2.342 2.364 0.001 accept 1
C4 -1.265 3.177 0.000 0 1
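
Before passing a custom reference to baseref, it can be useful to check that the file follows the format described above. The following Python sketch is an independent checker, not the reader used by cpptraj; the function and dictionary names are hypothetical, and treating any <RMS fit> value other than 0 as "fit" is an assumption based on the example.

```python
HB_TYPES = {"d": "donor", "1": "donor",
            "a": "acceptor", "2": "acceptor",
            "n": "none", "0": "none"}

GUA_REFERENCE = """\
NASTRUCT REFERENCE
G G G5 G3
# Modified into format readable by cpptraj nastruct
C1' -2.477 5.399 0.000 0 0
N9 -1.289 4.551 0.000 0 1
C8 0.023 4.962 0.000 0 1
N7 0.870 3.969 0.000 accept 1
C5 0.071 2.833 0.000 0 1
C6 0.424 1.460 0.000 0 1
O6 1.554 0.955 0.000 accept 0
N1 -0.700 0.641 0.000 donor 1
C2 -1.999 1.087 0.000 0 1
N2 -2.949 0.139 -0.001 donor 0
N3 -2.342 2.364 0.001 accept 1
C4 -1.265 3.177 0.000 0 1
"""


def parse_baseref(text):
    """Parse base reference text into a dictionary, raising on bad input."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "NASTRUCT REFERENCE":
        raise ValueError("file must begin with 'NASTRUCT REFERENCE'")
    # Drop blank lines and '#' comments, then split on white space.
    body = [ln.split() for ln in lines[1:]
            if ln.strip() and not ln.lstrip().startswith("#")]
    if not body or len(body[0]) < 2:
        raise ValueError("need a base character and at least one residue name")
    atoms = []
    for fields in body[1:]:
        if len(fields) != 6:
            raise ValueError(f"expected 6 fields per atom, got {fields}")
        name, x, y, z, hb, fit = fields
        # Only the first character of the HB type matters.
        hb_type = HB_TYPES.get(hb[0].lower())
        if hb_type is None:
            raise ValueError(f"unknown HB type {hb!r} for atom {name}")
        atoms.append({"name": name, "xyz": (float(x), float(y), float(z)),
                      "hb": hb_type, "rms_fit": fit != "0"})
    return {"base": body[0][0], "res_names": body[0][1:], "atoms": atoms}


if __name__ == "__main__":
    ref = parse_baseref(GUA_REFERENCE)
    print(ref["base"], ref["res_names"], len(ref["atoms"]))
```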

Example of a nastruct command with resmap for C2′-Fluoro modifications of a dodecamer:

##########################################
noexitonerror
parm nowater.parm7
trajin 1/nowater.nc 1 last 1
trajin 2/nowater.nc 1 last 1
trajin 3/nowater.nc 1 last 1
check @N1,N2,P,OP1,OP2,C1',C2',C3',C4',O3',O4' skipbadframes silent
##########################################
rms fit :1-24&!@H=
nastruct MyData resrange 2-11,14-23 naout data/nastruct.dat \
resmap 2F5:C \
resmap 2FG:G \
resmap 2FC:C \
resmap 2FA:A \
resmap 2FT:T \
resmap 2F3:G
pwd
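
When many modified residues are involved, the nastruct line can be assembled programmatically. The following Python sketch (hypothetical helper names, not part of cpptraj) builds the command string and enforces the two resmap rules stated earlier: the target must be one of A, C, G, T, or U, and each residue name may be mapped only once.

```python
VALID_BASES = {"A", "C", "G", "T", "U"}


def resmap_args(mappings):
    """Turn (residue name, base) tuples into 'resmap <ResName>:<base>' strings."""
    seen = set()
    args = []
    for res_name, base in mappings:
        base = base.upper()
        if base not in VALID_BASES:
            raise ValueError(f"{res_name}: base must be one of A, C, G, T, U")
        if res_name in seen:
            raise ValueError(f"{res_name}: only one mapping per residue name")
        seen.add(res_name)
        args.append(f"resmap {res_name}:{base}")
    return args


def nastruct_command(name, resrange, naout, mappings):
    """Assemble a single-line nastruct command from its parts."""
    parts = [f"nastruct {name}", f"resrange {resrange}", f"naout {naout}"]
    return " ".join(parts + resmap_args(mappings))


if __name__ == "__main__":
    print(nastruct_command("MyData", "2-11,14-23", "data/nastruct.dat",
                           [("2F5", "C"), ("2FG", "G")]))
```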