Atom Mask Selection Syntax

The mask syntax is similar to PTRAJ.

Note that the characters ’:’, ’@’, and ’*’ are reserved for masks and should not be used in output file or data set names. All masks are case-sensitive. Either names or numbers can be used.

Masks can contain ranges (denoted with ’-’) and comma separated lists.
The logical operators ’&’ (and), ’|’ (or), and ’!’ (not) are also supported.

The syntax for elementary selections is the following:

Syntax                   Examples
:{residue numlist}       ’:1-10’, ’:1,3,5’, ’:1-3,5,7-9’
@{atom numlist}          ’@12,17’, ’@54-85’, ’@12,54-85,90’
:{residue namelist}      ’:LYS’, ’:ARG,ALA,GLY’
@{atom namelist}         ’@CA’, ’@CA,C,O,N,H’
@%{atom type name}       ’@%CT’
@/{atom_element_name}    ’@/N’
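The way a numlist combines ranges and comma-separated items can be illustrated with a short Python sketch. This is not CPPTRAJ's parser; the function name expand_numlist is hypothetical, and the sketch only shows which numbers a list such as ’1-3,5,7-9’ denotes.

```python
def expand_numlist(numlist):
    """Expand a string such as '1-3,5,7-9' into a sorted list of integers.

    Illustrative only: this mimics the meaning of a mask numlist,
    not CPPTRAJ's actual implementation.
    """
    numbers = set()
    for item in numlist.split(","):
        if "-" in item:
            # A range such as '7-9' includes both end points.
            start, end = item.split("-")
            numbers.update(range(int(start), int(end) + 1))
        else:
            numbers.add(int(item))
    return sorted(numbers)


print(expand_numlist("1-3,5,7-9"))
```

Under these assumptions, ’1-3,5,7-9’ denotes the numbers 1, 2, 3, 5, 7, 8, and 9.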

Selection by distance (see Distance-based Masks below) has the form:
<mask><distance op><distance>

Several wildcard characters are supported:
’*’ Zero or more characters.
’=’ Same as ’*’
’?’ One character.

The wildcards can also be used with numbers or other mask characters, e.g. ’:?0’ means “:10,20,30,40,50,60,70,80,90”, ’:*’ means all residues and ’@*’ means all atoms.
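The wildcard rules can be mimicked in Python by translating a pattern into a regular expression. This is a minimal sketch for illustration, not how CPPTRAJ matches names internally; the helper names mask_pattern and matches are hypothetical.

```python
import re


def mask_pattern(text):
    """Translate mask wildcards into a compiled regular expression.

    '*' and '=' match zero or more characters, '?' matches exactly one.
    Every other character is matched literally.
    """
    parts = []
    for ch in text:
        if ch in "*=":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern, name):
    """Return True if the whole name matches the wildcard pattern."""
    return mask_pattern(pattern).fullmatch(name) is not None


# Residue numbers 1-99 treated as text, as in the ':?0' example above.
residue_numbers = [str(n) for n in range(1, 100)]
print([n for n in residue_numbers if matches("?0", n)])
```

For residues 1 to 99, the pattern ’?0’ keeps only the two-character numbers ending in 0, which agrees with the ’:?0’ example.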

Compound expressions of the following type are allowed:
:{residue numlist | namelist}@{atom namelist | numlist}
and are processed as:
:{residue numlist | namelist} & @{atom namelist | numlist}

For example, ’:1-10@CA’ is equivalent to “:1-10 & @CA”.
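The equivalence between a compound expression and the ’&’ of its two parts can be pictured as a set intersection. The Python sketch below uses a made-up four-atom system (the atoms list is hypothetical data, not taken from any topology) and is meant only to illustrate the logic.

```python
# Hypothetical toy system: each atom is (residue number, atom name).
atoms = [(1, "N"), (1, "CA"), (2, "CA"), (11, "CA")]

# Atom indices selected by the residue part ':1-10'.
in_residues = {i for i, (resnum, _) in enumerate(atoms) if 1 <= resnum <= 10}

# Atom indices selected by the atom part '@CA'.
named_ca = {i for i, (_, name) in enumerate(atoms) if name == "CA"}

# ':1-10@CA' behaves like ':1-10 & @CA': the intersection of both sets.
compound = sorted(in_residues & named_ca)
print(compound)
```

In the same picture, ’|’ corresponds to a set union and ’!’ to the complement with respect to all atoms.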

More examples (some examples have been collected from the AMBER mailing list):

 

:ALA,TRP                    All alanine and tryptophan residues.
:5,10@CA                    CA carbon in residues 5 and 10.
:*&!@H=                     All non-hydrogen atoms (equivalent to “!@H=”).
@CA,C,O,N,H                 All (protein) backbone atoms.
!@CA,C,O,N,H                All non-backbone atoms (i.e. side chains, for proteins only).
:1-500@O&!(:WAT|:LYS,ARG)   All backbone oxygens in residues 1-500 but not in water, lysine or
                            arginine residues.
(:11@CD<:5.5)&:Na+          All residues within 5.5 Å of atom CD of residue 11 that are also
                            named Na+.

Use of :; to keep the same residue ID in stripped files

CPPTRAJ uses (and, when writing, adds) the RESIDUE_NUMBER flag in an Amber topology file, which tracks the “original” residue number. Residues can then be selected by their original number with ’:;’. You can see how this works by running a brief test in the $CPPTRAJHOME/test directory. First run:

parm tz2.parm7
trajin tz2.rst7
distance d1 :5 :6 out temp.d1.dat
strip :1-4 parmout temp.tz2.strip.parm7
trajout temp.tz2.strip.rst7
run

Then run:

parm temp.tz2.strip.parm7
trajin temp.tz2.strip.rst7
distance d1 :;5 :;6 out temp.d2.dat
run

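The values in temp.d1.dat and temp.d2.dat are identical: although residues 1-4 were stripped, ’:;5’ and ’:;6’ select by original residue number and therefore refer to the same residues as ’:5’ and ’:6’ did in the unstripped topology.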

Distance-based Masks

There are two important things to keep in mind when using distance-based masks:

  1. Distance-based masks that update each frame are currently only supported by the mask action.
  2. Selection by distance for everything but the mask action requires defining a reference frame with the reference command; distances are then calculated using that reference frame only. The reference frame can be changed using the activeref command.

The syntax for selection by distance is a <mask> expression followed by a <distance op> followed by a <distance> (which is in Angstroms). The <distance op> consists of 2 characters: ’<’ (within) or ’>’ (without) followed by either ’^’ (molecules), ’:’ (residues), or ’@’ (atoms).

For example, ‘<:3.0’ means “residues within 3.0 Angstroms” etc.

For residue- and molecule-based distance selection, if any atom in that residue/molecule matches the given distance criterion, the entire residue/molecule is selected. In plain language, the entire distance mask can be read as “Select < distance operator > < distance > of < mask >”.

So for example, the mask expression:

:11-17<@2.4

means “select atoms within 2.4 Å of the atoms selected by ’:11-17’ (residues numbered 11 through 17)”.

As another example, to strip everything outside 3.0 Å (i.e. without 3.0 Å) of residue 4, keeping whole residues and using specified reference coordinates:

reference mol.rst7
trajin mol.rst7
strip !(:4<:3.0)
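The difference between atom-based (’<@’) and residue-based (’<:’) distance selection can be sketched in Python on a toy system. This is an illustration under stated assumptions, not CPPTRAJ's algorithm: the function name within, the coordinates, and the residue assignment are all hypothetical, and treating “within” as strictly less than the cutoff is an assumption.

```python
import math


def within(coords, selected, cutoff, residue_of=None):
    """Return sorted indices of atoms within `cutoff` of any selected atom.

    coords     : list of (x, y, z) tuples taken from a single reference frame
    selected   : indices of the atoms chosen by the <mask> part
    residue_of : optional list giving the residue number of each atom; when
                 given, any residue with at least one atom inside the cutoff
                 is selected in full (the residue-based behaviour).
    """
    reference = [coords[i] for i in selected]
    hits = {
        i
        for i, xyz in enumerate(coords)
        # Assumption: "within" is treated as strictly less than the cutoff.
        if any(math.dist(xyz, ref) < cutoff for ref in reference)
    }
    if residue_of is not None:
        hit_residues = {residue_of[i] for i in hits}
        hits = {i for i in range(len(coords)) if residue_of[i] in hit_residues}
    return sorted(hits)


# Hypothetical toy system: four atoms on a line, in three residues.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
residue_of = [1, 2, 2, 3]

print(within(coords, [0], 2.0))              # atom-based selection
print(within(coords, [0], 2.0, residue_of))  # residue-based selection
```

In this toy system the atom at 2.5 Å lies outside the 2.0 Å cutoff, so the atom-based selection omits it, while the residue-based selection includes it because another atom of the same residue is inside the cutoff.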