prepareforleap

Prepare a structure (usually loaded from a PDB) for processing with LEaP from Amber.

prepareforleap crdset <coords set> [frame <#>] name <out coords set> 
     [pdbout <pdbfile> [terbymol]] 
     [leapunitname <unit>] [out <leap input file> [runleap <ff file>]] 
     [skiperrors] 
     [nowat [watermask <watermask>]] [noh] 
          [keepaltloc {<alt loc ID>|highestocc}] 
     [stripmask <stripmask>] [solventresname <solventresname>] 
     [molmask <molmask> ...] [determinemolmask <mask>] 
     [{nohisdetect | 
          [nd1 <nd1>] [ne2 <ne2>] 
          [hisname <his>] [hiename <hie>] [hidname <hid>] [hipname <hip>]}] 
     [{nodisulfides | 
          existingdisulfides | 
          [cysmask <cysmask>] [disulfidecut <cut>] [newcysname <name>]}] 
     [{nosugars | 
          sugarmask <sugarmask> [noc1search] [nosplitres] 
               [resmapfile <file>] 
          [hasglycam] [determinesugarsby {geometry|name}] 
     }]

crdset <coords set> COORDS data set containing coordinates and topology to prepare.
[frame <#>] Frame to use from COORDS set (default first).
name <out coords set> Output COORDS set containing prepared topology/coordinates.
[pdbout <pdbfile>] Output PDB name.
[terbymol] If specified, base TER cards on molecules instead of PDB chains.
[leapunitname <unit>] LEaP unit name to use when writing to <leap input file> (i.e. the LEaP input file will contain ‘<unit> = loadpdb <pdbfile>’).
[out <leap input file>] File containing LEaP input needed to read in the prepared system (loadpdb, bond commands for disulfides, etc).
[runleap <ff file>] If specified, CPPTRAJ will attempt to run LEaP directly to generate a topology and coordinates; <ff file> should contain the appropriate ‘source’ commands for loading the desired force field parameters. Will attempt to produce topology <unit>.parm7 and coordinates <unit>.rst7.
[skiperrors] If specified, the command will try to ignore any errors encountered. Can be useful for debugging.
[nowat] If specified, remove waters from the system.
[watermask <watermask>] Mask selecting waters to remove (default ‘:<solventresname>’).
[noh] If specified, strip all hydrogen atoms from the system (recommended).
[keepaltloc {<alt loc ID>|highestocc}] LEaP cannot handle alternate atom locations, so the command will choose location ‘A’ by default. This can be changed to either <alt loc ID> or, if ‘highestocc’ is specified, the location with the highest occupancy.
[stripmask <stripmask>] Mask of atoms to remove from the system.
[solventresname <solventresname>] Solvent residue name (default ‘HOH’).
[molmask <molmask>] If specified, all atoms in <molmask> will be considered part of one molecule. May be specified multiple times.
[determinemolmask <mask>] If specified, determine if atoms selected in <mask> are in the same molecule via bonds.

Histidine Detection:
[nohisdetect] Disable renaming of histidine residues based on existing hydrogens.
[nd1 <nd1>] Delta nitrogen atom name (default ‘ND1’).
[ne2 <ne2>] Epsilon nitrogen atom name (default ‘NE2’).
[hisname <his>] Histidine residue name (default ‘HIS’).
[hiename <hie>] Epsilon-protonated histidine name (default ‘HIE’).
[hidname <hid>] Delta-protonated histidine name (default ‘HID’).
[hipname <hip>] Doubly-protonated histidine name (default ‘HIP’).
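
The renaming rule behind these keywords can be sketched in a few lines of Python. This is an illustration only, not CPPTRAJ's implementation: the function name `histidine_name` and its boolean inputs are hypothetical, and keeping the original name when neither nitrogen carries a hydrogen is an assumption.

```python
def histidine_name(h_on_nd1, h_on_ne2,
                   his="HIS", hie="HIE", hid="HID", hip="HIP"):
    """Pick a histidine residue name from which ring nitrogens carry a hydrogen.

    h_on_nd1 / h_on_ne2: True if a hydrogen is bonded to the delta (ND1)
    or epsilon (NE2) nitrogen. The name defaults mirror the hisname,
    hiename, hidname, and hipname keywords.
    """
    if h_on_nd1 and h_on_ne2:
        return hip   # doubly protonated
    if h_on_nd1:
        return hid   # delta protonated
    if h_on_ne2:
        return hie   # epsilon protonated
    return his       # assumption: no ring hydrogens found, keep the name


print(histidine_name(True, False))
```

Passing different names through the keyword arguments corresponds to overriding the defaults with ‘hiename’, ‘hidname’, and ‘hipname’.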

Disulfide Handling:
[nodisulfides] Disable handling of disulfides.
[existingdisulfides] Only handle disulfides already present; do not search for additional disulfides.
[cysmask <cysmask>] Mask for selecting cysteine residues (default ‘CYS’).
[disulfidecut <cut>] Sulfur to sulfur atom distance cutoff for forming a disulfide (default 2.5 Ang).
[newcysname <name>] Name to change cysteine residues that participate in a disulfide bond to (default ‘CYX’).
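
The distance test that the cutoff implies can be sketched as follows. This is a minimal Python illustration under stated assumptions, not CPPTRAJ's search: the function name `find_disulfide_pairs`, the input layout, and the sample coordinates are hypothetical, the comparison is assumed to be inclusive, and no attempt is made to limit each cysteine to a single partner.

```python
import math
from itertools import combinations


def find_disulfide_pairs(sulfurs, cutoff=2.5):
    """Return pairs of residue numbers whose sulfur atoms lie within cutoff.

    sulfurs: dict mapping residue number -> (x, y, z) of the sulfur atom,
    in Angstroms. cutoff mirrors the disulfidecut default of 2.5 Ang.
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sulfurs.items()), 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.append((res_a, res_b))
    return pairs


# Hypothetical coordinates: only residues 26 and 84 are close enough to bond.
example = {
    26: (0.0, 0.0, 0.0),
    84: (2.05, 0.0, 0.0),
    130: (10.0, 0.0, 0.0),
}
print(find_disulfide_pairs(example))
```

Each pair found this way would correspond to one LEaP ‘bond’ command and to two residues renamed to the ‘newcysname’ value.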

Sugar Handling:
[nosugars] Disable handling of sugars.
[sugarmask <sugarmask>] Mask selecting sugars to be handled. If not specified the default is all residues defined in resmapfile.
[noc1search] If specified, disable the search for missing sugar C1 atom bonds.
[nosplitres] If specified, do not attempt to split off functional groups from sugars into separate residues.
[resmapfile <file>] File containing sugar residue/atom name mapping. Default is ‘$CPPTRAJHOME/dat/Carbohydrate_PDB_Glycam_Names.txt’.
[hasglycam] If specified, assume sugars already have GLYCAM residue names; just check sugar anomer type/configuration/linkage.
[determinesugarsby {geometry|name}] Determine whether sugar anomer type/configuration should be chosen based on sugar geometry (default) or the residue name. CPPTRAJ will report when a mismatch is detected between the sugar anomer type/configuration based on geometry and anomer type/configuration based on the residue name.

More information about this command can be found in the article by Roe and Bergonzo.

This command will prepare a structure (usually from a PDB) for processing with the Amber program LEaP to generate topology and coordinate files for MD simulations. It will handle things like choosing alternate atom locations, removing waters/hydrogen atoms from the structure, renaming residues and generating ‘bond’ commands for disulfide bonds, changing histidine names based on any existing protonation, and renaming residues/atoms and generating ‘bond’ commands for carbohydrates. The command can also call LEaP directly to generate the parameters once the structure is prepared.

If hydrogen atoms are present in the structure, the command will attempt a simple determination of the protonation state of any histidine residues based on where hydrogens are bonded, and assign the appropriate residue name. The command will also identify any existing disulfide bonds as well as potential disulfide bonds and generate the corresponding LEaP ‘bond’ commands, which can be applied after the structure is loaded in LEaP. Potential disulfide bonding atoms can be identified via a user-specifiable mask expression.

By default, sugars will have their residue names changed to those compatible with the GLYCAM force field based on their anomer type (alpha/beta), configuration (D/L), and linkages (glycosidic and covalent sugar to non-sugar). Any recognized functional groups that are part of sugar residues (hydroxyl, acetyl, sulfate, etc) will be split into separate residues as required by GLYCAM. If this happens and ‘runleap’ has not been specified, CPPTRAJ will warn about any residues/atoms whose charges need to be adjusted after LEaP is run.

The command will try to report any potential problems that LEaP might encounter. These include residue names that may be unrecognized (and therefore may not have parameters), mismatches between detected sugar anomer type/configuration and anomer type/configuration based on the sugar residue name, unrecognized sugar linkages, and so on.

For example, the following input prepares PDB 4zzw for processing with LEaP, putting the proper LEaP commands in leap.4zzw.in, writing the prepared PDB to 4zzw.cpptraj.pdb, removing waters and hydrogen atoms, and keeping the alternate atom locations with the highest occupancy:

parm 4zzw.pdb 
loadcrd 4zzw.pdb name MyCrd 
prepareforleap crdset MyCrd name Final out leap.4zzw.in leapunitname m pdbout 4zzw.cpptraj.pdb nowat noh keepaltloc highestocc
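
To make the roles of ‘leapunitname’, ‘pdbout’, and ‘runleap’ concrete, the Python sketch below assembles LEaP input of the general kind described above: a ‘loadpdb’ line for the unit, one ‘bond’ command per disulfide, and optionally a ‘saveamberparm’ line producing <unit>.parm7 and <unit>.rst7. The function name `leap_input_lines`, the residue numbers, and the exact layout are illustrative assumptions; the file CPPTRAJ actually writes may differ.

```python
def leap_input_lines(unit, pdbfile, disulfides, save=False):
    """Build LEaP commands for loading a prepared PDB.

    unit:       LEaP unit name (the leapunitname keyword).
    pdbfile:    prepared PDB file (the pdbout keyword).
    disulfides: list of (residue, residue) number pairs to bond.
    save:       if True, add a saveamberparm line for <unit>.parm7/.rst7.
    """
    lines = [f"{unit} = loadpdb {pdbfile}"]
    for res_a, res_b in disulfides:
        # LEaP addresses atoms as unit.residue.atom; SG is the cysteine sulfur.
        lines.append(f"bond {unit}.{res_a}.SG {unit}.{res_b}.SG")
    if save:
        lines.append(f"saveamberparm {unit} {unit}.parm7 {unit}.rst7")
    return lines


# Hypothetical disulfide between residues 26 and 84 (not taken from 4zzw).
for line in leap_input_lines("m", "4zzw.cpptraj.pdb", [(26, 84)], save=True):
    print(line)
```

Force field ‘source’ commands are deliberately left out of the sketch, since with ‘runleap’ they come from the separate <ff file>.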

Sugar Residue/Atom Name Mapping File.
This file controls how CPPTRAJ will name sugars based on sugar form, chirality, and linkage. It consists of three sections separated by blank lines. The first section defines sugar PDB residue names and how they are mapped to GLYCAM residue characters:

Format: <ResName> <GlycamCode> <Anomer> <Config> <RingType> "<Name>"
Anomer: A=alpha, B=beta 
Config: D/L 
RingType: P=pyranose, F=furanose

Example: 64K A A D P "alpha-D-arabinopyranose"
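
A line in this format can be read with a short Python routine such as the one below. It is a sketch for users who want to inspect or validate their own mapping file, not the parser CPPTRAJ uses; the `SugarResidue` type and `parse_sugar_residue` function are hypothetical names, and the checks accept only the anomer, configuration, and ring type codes listed above.

```python
import shlex
from typing import NamedTuple


class SugarResidue(NamedTuple):
    res_name: str      # PDB residue name
    glycam_code: str   # GLYCAM residue character
    anomer: str        # "A" (alpha) or "B" (beta)
    config: str        # "D" or "L"
    ring_type: str     # "P" (pyranose) or "F" (furanose)
    name: str          # descriptive name


def parse_sugar_residue(line):
    """Parse one first-section line into a SugarResidue."""
    fields = shlex.split(line)  # keeps the quoted name as one field
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    entry = SugarResidue(*fields)
    if entry.anomer not in ("A", "B"):
        raise ValueError(f"unknown anomer {entry.anomer!r}")
    if entry.config not in ("D", "L"):
        raise ValueError(f"unknown configuration {entry.config!r}")
    if entry.ring_type not in ("P", "F"):
        raise ValueError(f"unknown ring type {entry.ring_type!r}")
    return entry


print(parse_sugar_residue('64K A A D P "alpha-D-arabinopyranose"'))
```

`shlex.split` is used so that a quoted name containing spaces stays in one field.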

The second section contains PDB to GLYCAM atom name maps for residues:

Format: <GLYCAM residue codes> <PDB atom name>,<GLYCAM atom name>[,<anomer>] ... 
If <anomer> (A=alpha, B=beta) is specified, the atom name map is only valid for that specific form.

Example: V,W,Y C7,C2N O7,O2N C8,CME
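
The atom name map lines can be unpacked in the same spirit. The Python sketch below assumes, based on the example, that multiple GLYCAM residue codes are comma-separated in the first field; `parse_atom_map` is a hypothetical helper, not part of CPPTRAJ.

```python
def parse_atom_map(line):
    """Parse one second-section line.

    Returns (codes, maps): codes is the list of GLYCAM residue codes and
    maps is a list of (pdb_atom, glycam_atom, anomer) tuples, where anomer
    is "A", "B", or None when the map applies to both forms.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected codes and at least one map: {line!r}")
    codes = fields[0].split(",")
    maps = []
    for item in fields[1:]:
        parts = item.split(",")
        if len(parts) == 2:
            pdb_atom, glycam_atom = parts
            anomer = None
        elif len(parts) == 3:
            pdb_atom, glycam_atom, anomer = parts
        else:
            raise ValueError(f"malformed atom map {item!r}")
        maps.append((pdb_atom, glycam_atom, anomer))
    return codes, maps


codes, maps = parse_atom_map("V,W,Y C7,C2N O7,O2N C8,CME")
print(codes)
print(maps)
```

For the example line, the three maps apply to residue codes V, W, and Y regardless of anomer, since no third field is given.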

The third section contains PDB to GLYCAM linkage residue (i.e. non-sugar residues bonded to sugars) name maps:

Format: <PDB residue name> <GLYCAM residue name>

Example: SER OLS
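
Putting the three sections together, the Python sketch below splits a mapping file on blank lines and reads the linkage section into a dictionary. The helper names are hypothetical, the sample text is built only from the examples in this section, and comment lines or other syntax the real file may contain are not handled.

```python
def split_sections(text):
    """Split file text into sections of non-blank lines separated by blank lines."""
    sections, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sections.append(current)
            current = []
    if current:
        sections.append(current)
    return sections


def parse_linkages(lines):
    """Map PDB residue names to GLYCAM linkage residue names."""
    linkages = {}
    for line in lines:
        pdb_name, glycam_name = line.split()
        linkages[pdb_name] = glycam_name
    return linkages


# A tiny stand-in for the mapping file, one line per section.
sample = (
    '64K A A D P "alpha-D-arabinopyranose"\n'
    "\n"
    "V,W,Y C7,C2N O7,O2N C8,CME\n"
    "\n"
    "SER OLS\n"
)
residues, atom_maps, linkage_lines = split_sections(sample)
print(parse_linkages(linkage_lines))
```

A file edited by hand can be checked this way before it is passed to ‘resmapfile’, for instance by confirming that exactly three sections are found.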