prepareforleap

Prepare a structure (usually loaded from a PDB) for processing with LEaP from Amber.

prepareforleap crdset <coords set> [frame <#>] name <out coords set> 
     [pdbout <pdbfile> [terbymol]] 
     [leapunitname <unit>] [out <leap input file> [runleap <ff file>]] 
     [skiperrors] 
     [nowat [watermask <watermask>]] [noh] 
          [keepaltloc {<alt loc ID>|highestocc}] 
     [stripmask <stripmask>] [solventresname <solventresname>] 
     [molmask <molmask> ...] [determinemolmask <mask>]
     [rescut <residue cutoff>]
     [bondoffset <offset>]
     [{nohisdetect |
          [nd1 <nd1>] [ne2 <ne2>]
          [hisname <his>] [hiename <hie>] [hidname <hid>] [hipname <hip>]}]
     [{nodisulfides |           
          existingdisulfides |          
          [cysmask <cysmask>] [disulfidecut <cut>] [newcysname <name>]}]      
     [{nosugars |           
          sugarmask <sugarmask> [noc1search] [nosplitres]               
          [resmapfile <file>]           
          [hasglycam]
          [determinesugarsby {geometry|name}]      }]

crdset <coords set> COORDS data set containing coordinates and topology to prepare.
[frame <#>] Frame to use from COORDS set (default first).
name <out coords set> Output COORDS set containing prepared topology/coordinates.
[pdbout <pdbfile>] Output PDB name.
[terbymol] If specified, base TER cards on molecules instead of PDB chains.
[leapunitname <unit>] LEaP unit name to use when writing to <leap input file> (i.e. the LEaP input file will contain ‘<unit> = loadpdb <pdbfile>’).
[out <leap input file>] File containing LEaP input needed to read in the prepared system (loadpdb, bond commands for disulfides, etc).
[runleap <ff file>] If specified, CPPTRAJ will attempt to run LEaP directly to generate a topology and coordinates; <ff file> should contain the appropriate ‘source’ commands for loading the desired force field parameters. Will attempt to produce topology <unit>.parm7 and coordinates <unit>.rst7.
[skiperrors] If specified, the command will try to ignore any errors encountered. Can be useful for debugging.
[nowat] If specified, remove waters from the system.
[watermask <watermask>] Mask selecting waters to remove (default ‘:<solventresname>’).
[noh] If specified, strip all hydrogen atoms from the system (recommended).
[keepaltloc {<alt loc ID>|highestocc}] LEaP cannot handle alternate atom locations, so the command will choose location ‘A’ by default. This can be changed to either <alt loc id> or the location with the highest occupancy if ‘highestocc’ is specified.
[stripmask <stripmask>] Mask of atoms to remove from the system.
[solventresname <solventresname>] Solvent residue name (default ‘HOH’).
[molmask <mask>] If specified, atoms in <mask> will be considered all part of one molecule. May be specified multiple times.
[determinemolmask <mask>] If specified, determine if atoms selected in <mask> are in the same molecule via bonds.
[bondoffset <offset>] Offset (default 0.2 Ang.) to add to “ideal” bond distances when looking for missing sugar linkages. Can be increased to accommodate distorted structures.
[rescut <residue cutoff>] Initial distance cutoff (default 8 Ang.) for residue center to residue center distance when looking for missing sugar linkages.
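
The way rescut and bondoffset work together when looking for missing sugar linkages can be pictured as a two-stage search: a cheap residue-level filter followed by an atom-level distance check. The Python below is only a minimal sketch of that idea, not CPPTRAJ's implementation. The function names, the input format, and the single "ideal" bond length of 1.43 Ang. are assumptions made for illustration.

```python
import math

def center(atoms):
    """Geometric center of a list of (name, (x, y, z)) atoms."""
    n = len(atoms)
    return tuple(sum(xyz[i] for _, xyz in atoms) / n for i in range(3))

def candidate_bonds(res_a, res_b, ideal=1.43, rescut=8.0, bondoffset=0.2):
    """Return atom-name pairs between two residues close enough to be bonded.

    ideal is a hypothetical "ideal" bond length in Angstroms; a real search
    would look this up per pair of elements.
    """
    # Stage 1: skip residue pairs whose centers are farther apart than rescut.
    if math.dist(center(res_a), center(res_b)) > rescut:
        return []
    # Stage 2: compare each atom pair against the ideal distance plus offset.
    found = []
    for name_a, xyz_a in res_a:
        for name_b, xyz_b in res_b:
            if math.dist(xyz_a, xyz_b) <= ideal + bondoffset:
                found.append((name_a, name_b))
    return found

# Hypothetical coordinates: a sugar C1 atom sitting 1.5 Ang. from an ASN ND2.
sugar = [("C1", (0.0, 0.0, 0.0)), ("O5", (-1.4, 0.0, 0.0))]
asn = [("ND2", (1.5, 0.0, 0.0)), ("CG", (2.8, 0.0, 0.0))]
linkages = candidate_bonds(sugar, asn)
```

In this sketch, raising bondoffset widens the atom-level test, which mirrors the advice above about accommodating distorted structures.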

Histidine Detection:
[nohisdetect] Disable renaming of histidine residues based on existing hydrogens.
[nd1 <nd1>] Delta nitrogen atom name (default ‘ND1’).
[ne2 <ne2>] Epsilon nitrogen atom name (default ‘NE2’).
[hisname <his>] Histidine residue name (default ‘HIS’).
[hiename <hie>] Epsilon-protonated histidine name (default ‘HIE’).
[hidname <hid>] Delta-protonated histidine name (default ‘HID’).
[hipname <hip>] Doubly-protonated histidine name (default ‘HIP’).
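
The renaming controlled by these options can be illustrated with a short Python sketch. It is not CPPTRAJ's code: the function name and the input format (a set of heavy-atom names that carry a hydrogen) are hypothetical, and leaving the name unchanged when neither ring nitrogen is protonated is an assumption, since that case is not described here.

```python
def histidine_name(hydrogen_bonded_to, nd1="ND1", ne2="NE2",
                   his="HIS", hie="HIE", hid="HID", hip="HIP"):
    """Pick a histidine residue name from the atoms that carry a hydrogen.

    hydrogen_bonded_to: set of heavy-atom names in the residue that have at
    least one hydrogen bonded to them (hypothetical input format).
    """
    has_delta = nd1 in hydrogen_bonded_to
    has_epsilon = ne2 in hydrogen_bonded_to
    if has_delta and has_epsilon:
        return hip  # both ring nitrogens protonated
    if has_delta:
        return hid  # only the delta nitrogen protonated
    if has_epsilon:
        return hie  # only the epsilon nitrogen protonated
    # Assumption: with no hydrogens on either nitrogen, keep the input name.
    return his

name = histidine_name({"N", "ND1"})
```

The keyword defaults mirror the defaults listed above, so overriding nd1, ne2, or any of the residue names corresponds to passing the matching option to prepareforleap.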

Disulfide Handling:
[nodisulfides] Disable handling of disulfides.
[existingdisulfides] Only handle disulfides already present; do not search for additional disulfides.
[cysmask <cysmask>] Mask for selecting cysteine residues (default ‘CYS’).
[disulfidecut <cut>] Sulfur to sulfur atom distance cutoff for forming a disulfide (default 2.5 Ang).
[newcysname <name>] Name to change cysteine residues that participate in a disulfide bond to (default ‘CYX’).

Sugar Handling:
[nosugars] Disable handling of sugars.
[sugarmask <sugarmask>] Mask selecting sugars to be handled. If not specified the default is all residues defined in resmapfile.
[noc1search] If specified disable search for missing sugar C1 atom bonds.
[nosplitres] If specified do not attempt to split off functional groups from sugars into separate residues.
[resmapfile <file>] File containing sugar residue/atom name mapping. Default is ‘$CPPTRAJHOME/dat/Carbohydrate_PDB_Glycam_Names.txt’.
[hasglycam] If specified, assume sugars already have GLYCAM residue names; just check sugar anomer type/configuration/linkage.
[determinesugarsby {geometry|name}] Determine whether sugar anomer type/configuration should be chosen based on sugar geometry (default) or the residue name. CPPTRAJ will report when a mismatch is detected between the sugar anomer type/configuration based on geometry and anomer type/configuration based on the residue name.

More information about this command can be found in the article by Roe and Bergonzo.

This command will prepare a structure (usually from a PDB) for processing with the Amber program LEaP to generate topology and coordinate files for MD simulations. It handles tasks such as choosing alternate atom locations, removing waters/hydrogen atoms from the structure, renaming residues and generating bond commands for disulfide bonds, changing histidine names based on any existing protonation, and renaming residues/atoms and generating bond commands for carbohydrates. The command can also call LEaP directly to generate the parameters once the structure is prepared.

If hydrogen atoms are present in the structure, the command will attempt a simple determination of the protonation state of each histidine residue based on where hydrogens are bonded, and assign the appropriate residue name. The command will also identify any existing disulfide bonds as well as potential disulfide bonds and generate the corresponding LEaP bond commands, which can be applied after the structure is loaded in LEaP. Potential disulfide bonding atoms are identified via a user-specifiable mask expression (cysmask <cysmask>) and a sulfur-to-sulfur distance cutoff (disulfidecut <cut>).
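
As a rough illustration of the disulfide step, the Python sketch below pairs sulfur atoms that fall within a distance cutoff and formats a LEaP bond command for each pair. It is a simplified sketch under stated assumptions, not the CPPTRAJ algorithm: the function names and input format are hypothetical, the atom name SG and the inclusive comparison are assumptions, and no attempt is made to restrict each cysteine to a single partner.

```python
import math
from itertools import combinations

def find_disulfides(sulfurs, cut=2.5):
    """Return pairs of residue numbers whose sulfurs lie within cut Angstroms.

    sulfurs: dict mapping residue number -> (x, y, z) of the sulfur atom
    (hypothetical input format).
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sulfurs.items()), 2):
        if math.dist(xyz_a, xyz_b) <= cut:
            pairs.append((res_a, res_b))
    return pairs

def leap_bond_commands(pairs, unit="m", atom="SG"):
    """Format one LEaP 'bond' line per disulfide pair."""
    return [f"bond {unit}.{a}.{atom} {unit}.{b}.{atom}" for a, b in pairs]

# Hypothetical sulfur coordinates for three cysteine residues.
sulfurs = {
    5: (0.0, 0.0, 0.0),
    20: (2.0, 0.0, 0.0),   # 2.0 Ang. from residue 5
    48: (10.0, 0.0, 0.0),  # far from both
}
pairs = find_disulfides(sulfurs)
commands = leap_bond_commands(pairs)
```

Raising the cutoff, as with disulfidecut 4.0, accepts more distant sulfur pairs; this can help with distorted structures but may also pick up pairs that are not truly bonded.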

By default, sugars will have their residue names changed to those compatible with the GLYCAM force field based on their anomer type (alpha/beta), configuration (D/L), and linkages (glycosidic and covalent sugar to non-sugar). Any recognized functional groups that are part of sugar residues (hydroxyl, acetyl, sulfate, etc.) will be split into separate residues as required by GLYCAM. If this happens and runleap has not been specified, CPPTRAJ will warn about any residues/atoms whose charges need to be adjusted after LEaP is run.

The command will try to report any potential problems that LEaP might encounter. These include residue names that may be unrecognized (and therefore may not have parameters), mismatches between detected sugar anomer type/configuration and anomer type/configuration based on the sugar residue name, unrecognized sugar linkages, and so on.

For example, the following input prepares PDB 4zzw for processing with LEaP, putting the proper LEaP commands in leap.4zzw.in, writing the prepared PDB to 4zzw.cpptraj.pdb, removing waters and hydrogen atoms, and keeping the alternate atom locations with the highest occupancy:

parm 4zzw.pdb 
loadcrd 4zzw.pdb name MyCrd 
prepareforleap crdset MyCrd name Final out leap.4zzw.in leapunitname m pdbout 4zzw.cpptraj.pdb nowat noh keepaltloc highestocc

Sugar Residue/Atom Name Mapping File.
This file controls how CPPTRAJ will name sugars based on sugar form, chirality, and linkage. It consists of three sections separated by a blank line. The first section defines sugar PDB residue names and how they are mapped to GLYCAM residue characters:

Format: <ResName> <GlycamCode> <Anomer> <Config> <RingType> "<Name>"
Anomer: A=alpha, B=beta 
Config: D/L 
RingType: P=pyranose, F=furanose

Example: 64K A A D P "alpha-D-arabinopyranose"

The second section contains PDB to GLYCAM atom name maps for residues:

Format: <GLYCAM residue codes> <PDB atom name>,<GLYCAM atom name>[,<anomer>] ... 
If <anomer> (A=alpha, B=beta) is specified, the atom name map is only valid for that specific form.

Example: V,W,Y C7,C2N O7,O2N C8,CME

The third section contains PDB to GLYCAM linkage residue (i.e. non-sugar residues bonded to sugars) name maps:

Format: <PDB residue name> <GLYCAM residue name>

Example: SER OLS
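
To make the three-section layout concrete, here is a minimal Python sketch that parses a mapping file built from the example lines above. It is not the parser CPPTRAJ uses: the function name and the returned data structures are hypothetical, and the sketch assumes exactly one blank line between sections and no comment lines.

```python
import shlex

def parse_resmap(text):
    """Parse the three blank-line-separated sections of a sugar name map."""
    sections = [s.splitlines() for s in text.strip().split("\n\n")]
    sugar_lines, atom_lines, link_lines = sections

    # Section 1: <ResName> <GlycamCode> <Anomer> <Config> <RingType> "<Name>"
    sugars = {}
    for line in sugar_lines:
        res, code, anomer, config, ring, name = shlex.split(line)
        sugars[res] = {"code": code, "anomer": anomer,
                       "config": config, "ring": ring, "name": name}

    # Section 2: <GLYCAM codes> <PDB atom>,<GLYCAM atom>[,<anomer>] ...
    atom_maps = []
    for line in atom_lines:
        codes, *pairs = line.split()
        for pair in pairs:
            pdb_name, glycam_name, *anomer = pair.split(",")
            atom_maps.append({"codes": codes.split(","), "pdb": pdb_name,
                              "glycam": glycam_name,
                              "anomer": anomer[0] if anomer else None})

    # Section 3: <PDB residue name> <GLYCAM residue name>
    linkages = dict(line.split() for line in link_lines)
    return sugars, atom_maps, linkages

example = '''64K A A D P "alpha-D-arabinopyranose"

V,W,Y C7,C2N O7,O2N C8,CME

SER OLS
'''
sugars, atom_maps, linkages = parse_resmap(example)
```

Here shlex.split is used so that the quoted sugar name in the first section is kept as a single field.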

Another example of using the prepareforleap command with a PDB file (the structure must first be loaded into a COORDS set with the loadcrd command):

parm file.pdb
loadcrd file.pdb name CRD
prepareforleap crdset CRD name Final out build.tleap leapunitname x pdbout file-cpptraj.pdb nowat noh
go

More examples commonly used for PDB files:

parm ../6nit.pdb
loadcrd ../6nit.pdb name edited
prepareforleap crdset edited name from-prepareforleap \
      out from-prepareforleap-tleap.in leapunitname x \
      pdbout from-prepareforleap.pdb nowat noh
go

parm Final.pdb
loadcrd Final.pdb name edited
prepareforleap crdset edited name from-prepareforleap \
      out from-prepareforleap-tleap.in leapunitname x \
      disulfidecut 4.0 \
      pdbout from-prepareforleap.pdb nowat noh
go