# wavelet

Perform wavelet analysis on coordinates from given COORDS set.

wavelet [crdset <set name>] nb <n scaling vals> [s0 <s0>] [ds <ds>][correction <correction>] [chival <chival>] [type <wavelet>][out <filename>] [name <setname>] [cluster [minpoints <#>] [epsilon <value>] [clusterout <file>] [clustermapout <file>] [cmapdetail] [kdist] [cprefix <PDB prefix>] [overlay <trajfile>] [overlayparm <parmfile>]]

[crdset <set name>] COORDS data set to use
nb <n scaling vals> Number of scales. The smaller the number the better resolution, but slower to plot.
[s0 <s0>] The smallest scale of the wavelet function (default 2dt where dt is time between snapshots in ps )
[ds <ds>] Spacing between discrete scales. (Default is 0.25. Smaller value of ds gives finer resolution. The largest values that give adequate sampling in scale for Morlet and Paul are 0.5 and 1.5, respectively)
[correction <correction>] The scale-to-wavelength parameter (1.01 for Morlet, 1.389 for Paul). Automatically set based on wavelet if not otherwise specified.
[chival <chival>] The value of $\chi_2^2$ at a particular confidence level
[type <wavelet>] Type of wavelet function to use <morlet> or <paul>
[out <filename>] Write results to file named <filename>
[name <setname>] Store results in data set with name <setname>
[cluster] Perform wavelet clustering i.e. wavelet feature extraction analysis.
[minpoints <#>] Minimum number of points necessary to form a region of interest.
[epsilon <value>] Minimum region of interest size.
[clusterout <file>] Output for clustering (see below).
[clustermapout <file>] Output cluster map (recommended gnuplot format, see below).
[cmapdetail] Instead of the map being smoothed to cluster regions, show full detail.
[kdist] Can be used to determine minpoints and epsilon – see below.
[cprefix <PDB prefix>] Output cluster region PDBs (only containing from minimum to maximum atom and minimum to maximum frame) with given prefix.
[overlay <trajfile>] Create a trajectory that can be overlaid with the original trajectory to highlight atoms of interest. Atoms in cluster regions will get their normal coordinates – all others are set to the common center of mass.
[overlayparm <parmfile>] Topology that can be used with the overlay trajectory.
<wavelet> morlet, paul

Perform the wavelet analysis using fast Fourier transform (FFT) algorithm on specified trajectory and write out to a gnuplot-formatted file named <name.gnu>. The created Wavelet map provides a clear picture of the significant motions which are characterized both in time and space. Note that typically the trajectory in question should have rotational and translational movement removed (via e.g. the rms command); otherwise these will be reflected in the wavelet analysis results.

Wavelet analysis contains two main steps which performs continues wavelet transform (CWT) and statistical significance testing as proposed by Torrence and Compo. Analysis is executed on one dimensional (1-D) coordinate which is defined as the displacement from the starting position. For each atom, CWT is calculated over a specified range of scales from $S_0$ up to $S_{0}2^{(nb-1)ds}$. To obtain the CWT of the trajectory the Fourier transform of atom’s displacement and wavelets which scaled by S (S is calculated from: $S = S_{0}2^{jds}; j = 0,1,2,\cdots , nb-1$ ) is computed and then the inverse Fourier transform of the product of Fourier transforms will be calculated as the CWT. After calculating the wavelet coordinates for all atoms, a significance testing is performed to determine the significance of each wavelet coordinate. For doing this test we need to have an appropriate background spectrum to consider as a mean or expected spectrum and compare our wavelet coordinates against this background. In order to calculate the background spectrum since wavelet spectrum (according to the convolution theorem) follows the Fourier spectrum, the Fourier coefficients over every atom’s displacement is calculated using the following formula and a model ($\mu _{k}$) is constructed on average which Fourier coefficients fit ($X_n$) is the time series which is the atom’s displacement and k is the frequency index[reference].

$f_{k=\frac{1}{N}} \sum^{N-1}_{n=0} \exp \left( \frac{ -2 \pi ikn}{N} \right) X_n$

This test is implemented based on the null hypothesis that the assumption is that Fourier coordinates normally distributed around the expected value, then the wavelet coordinates should also be normally distributed. Assuming the expected background spectrum and since the square of a normally distributed variable is chi-square distributed, then the distribution for the square of the absolute values of wavelet coordinates (|Wi;k|^2 is as follows ($\sigma ^2$ is the variance of the atom’s displacement).

$\sigma^2 \mu_k \chi_2^2 / 2$

Then choosing a confidence level we can determine the minimum acceptable value for jWi;kj2to be considered as a significant coordinates at that certain confidence level. In the final map the scales of only those wavelet coordinates which are significantly above the expected distribution are stored.

For example, to perform wavelet analysis on residues 1 to 17 with 40 scaling values starting from scaling of 0.2 with a spacing of 0.25 using the Morlet wavelet:

parm nowat.withions.parm7
trajin nowat.image.nc
rms :1-17@C*,N*,O*,P* first mass
wavelet nb 40 s0 0.2 ds 0.25 correction 1.01 chival 1.6094 type morlet \
:1-17 out wavelet.gnu usemap

Wavelet Analysis Feature Extraction
Wavelet analysis feature extraction (WAFEX) uses a density-based clustering algorithm (a modified version of the DBSCAN algorithm) to highlight physical and temporal regions that have significant motions from wavelet mapsand can extract the specific atoms and frames involved in these motions for further analysis. Cluster regions shown in the map will be smoother by default for easier visualization (unless cmapdetail is specified).

Details of the clustering are provided via the clusterout keyword with format:

#Cluster    [points]    [minatm]    [maxatm]    [minfrm]    [maxfrm]    [avgval]

#Cluster Cluster region number.
points Number of points in the cluster.
minatm Starting atom of the region.
maxatm End atom of the region.
minfrm Starting frame of the region.
maxfrm End frame of the region.
avgval Average value of points in the region.

For example, to create a 2D gnuplot map highlight regions of interest called ’cluster.gnu’ one could use the following input:

parm ../DPDP.parm7
trajin ../DPDP.nc
rms @C,CA,N first
wavelet nb 10 s0 2 ds 0.25 type morlet correction 1.01 chival 0.25 :1-22 name DPDP \
cluster clustermapout cluster.gnu clusterout cluster.dat minpoints 66 epsilon 10.0
datafile cluster.gnu usemap palette kbvyw

Some experimentation with kdist may be required to obtain reasonable values for minpoints and epsilon. Refer to the Heidari et. al. article for more information.