Introduction to Data Sets

Working With Data Sets in CPPTRAJ.

In CPPTRAJ, Actions and Analyses can generate one or more data sets which are available for further processing. For example, the distance command creates a data set containing distances vs time. The data set can be named by the user simply by specifying a non-keyword string as an additional argument. If no name is given, a default one will be generated based on the action name and data set number.

Using the same example as the introductory ‘start-here’ files, we can run the same analysis from a text file that contains all the commands:

parm trpzip2.ff10.mbondi.parm7
trajin trpzip2.gb.nc
distance end-to-end :1 :13 out dist-end-to-end.agr
run

These commands can be saved in a text file (for example, analysis.cpptraj) and then executed by CPPTRAJ from the command line:

cpptraj -i analysis.cpptraj
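The same two steps (save the commands, then call cpptraj with -i) can also be scripted. The following is a minimal Python sketch, not part of CPPTRAJ: the helper names write_input and run_cpptraj are hypothetical, and it assumes the cpptraj executable is on your PATH and that the topology and trajectory files are in the current directory.

```python
import shutil
import subprocess
from pathlib import Path

# The same commands shown above, one per line of the input file.
COMMANDS = [
    "parm trpzip2.ff10.mbondi.parm7",
    "trajin trpzip2.gb.nc",
    "distance end-to-end :1 :13 out dist-end-to-end.agr",
    "run",
]


def write_input(path, commands):
    """Write one command per line to `path` and return it as a Path.

    Hypothetical helper for illustration; not part of CPPTRAJ.
    """
    input_path = Path(path)
    input_path.write_text("\n".join(commands) + "\n")
    return input_path


def run_cpptraj(input_path):
    """Run `cpptraj -i <input_path>` and return its exit code.

    Returns None if no cpptraj executable is found on PATH
    (an assumption about the local installation).
    """
    exe = shutil.which("cpptraj")
    if exe is None:
        return None
    return subprocess.run([exe, "-i", str(input_path)], check=False).returncode
```

Calling run_cpptraj(write_input("analysis.cpptraj", COMMANDS)) would then be equivalent to typing the command shown above.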

Running cpptraj with the -i flag makes CPPTRAJ read your input file, process each command in order, and execute the analysis when the run command is reached. As you can see from the output, the distance command creates one data set named ‘end-to-end’, with the legend “end-to-end”: a double-precision distance data set with 1201 elements. We can now continue to manipulate this data set if desired. For example, to also write the data in the standard (column) data format, use the writedata command like so:

parm trpzip2.ff10.mbondi.parm7 
trajin trpzip2.gb.nc 
distance end-to-end :1 :13 out dist-end-to-end.agr 
run
writedata end-to-end.dat end-to-end

A data set is the container in which CPPTRAJ stores generated data so that it remains available for further analysis. After running these commands, you will have a text file named end-to-end.dat. The Linux ‘head’ command can be used directly from the CPPTRAJ command line to view the first few lines of ‘end-to-end.dat’:

> head end-to-end.dat 
#Frame     end-to-end
       1       6.4251
       2       5.9250
       3       6.7926
       4       6.3125
       5       5.7580
       6       5.4389
       7       6.1086
       8       6.5588
       9       5.6949
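Because the standard data format is plain whitespace-separated columns under a ‘#’-prefixed header, the file is easy to post-process outside CPPTRAJ. The following is a minimal Python sketch, not part of CPPTRAJ: the function names read_column_data and column_mean are hypothetical, and the parser assumes exactly the layout shown in the listing above (one header line starting with ‘#’, followed by numeric columns).

```python
from pathlib import Path


def read_column_data(path):
    """Parse a whitespace-separated column file with a '#'-prefixed header.

    Returns (column_names, rows), where each row is a list of floats.
    Hypothetical helper for illustration; not part of CPPTRAJ.
    """
    names = []
    rows = []
    for line in Path(path).read_text().splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("#"):
            # Header line, e.g. "#Frame     end-to-end"
            names = stripped.lstrip("#").split()
            continue
        rows.append([float(field) for field in stripped.split()])
    return names, rows


def column_mean(rows, index):
    """Arithmetic mean of one column; assumes `rows` is non-empty."""
    values = [row[index] for row in rows]
    return sum(values) / len(values)
```

For example, names, rows = read_column_data("end-to-end.dat") loads the file written above, and column_mean(rows, 1) then gives the average end-to-end distance over all frames.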

These basic examples demonstrate how CPPTRAJ can be used in both interactive and batch mode to analyze MD simulation data.

Now we can move on to the last ‘start-here’ example, where we will perform a root-mean-square deviation (RMSD) analysis on our protein system. Follow this link.