From: Andrew Dalke (dalke_at_ks.uiuc.edu)
Date: Fri Mar 28 1997 - 12:14:17 CST

Hello,

  This is the first VMDTech article after the release of version
1.2b1. For the new people who just joined the list, VMDTech is a
weekly article we write for the list concerning some detail or topic
of VMD that we believe will be helpful to most users. We will also
include much of this material in the new version of the documentation,
so please tell us if there are any problems. Back issues of VMDTech
are available from the vmd-l mailing list archives, which is available
from http://www.ks.uiuc.edu/Research/vmd/mailing_list/ .

  Before we start on this week's topic, we would like to describe a
few of the numbers about VMD. This release has been the first public
announcement in nearly a year. In the last few days we have had around
250 downloads, which is almost 10% of the total number of downloads in
its two years of public availability. Factoring in the estimated 60%
or so of people who don't have the time to learn more about VMD, or
find that their Linux box is too slow, or are looking for a viewer for
Vocaltech Media Descriptor files, this is still an impressive number,
backed by the nearly 80 external members of this list.

  Now that we've patted ourselves on the back enough, this week's
VMDTech will describe how to build a trajectory that morphs between
two coordinate frames. Next week's VMDTech will describe how to use
VMD to control a web browser so that clicking on an atom brings up a
page describing that part of the molecule.

  As always, if you have comments or suggestions or want to contribute
a VMDTech article, bring it up on the list or send it to us directly
at vmd_at_ks.uiuc.edu . However, none of the VMD developers are in town
(it is spring break here :) so it will take a few days for a response
from one of us.

                                                Andrew Dalke
                                                dalke_at_ks.uiuc.edu

====================================================================
                        Morphing

  Proteins move. Sometimes a structure has been determined for
multiple conformation, as shown in Protein Motions Database at
http://hyper.stanford.edu/~mbg/ProtMotDB/ and in other times the
structures are made from MD simulations. In general, if you have an
initial and a final structure it may be helpful to see how one can
transform into another. The easiest way to do this is to apply a
linear transform to convert the first into the second.

  Given a factor f (0.0 <= f <= 1.0), an initial coordinate x1 and a
final coordinate x2, I can define an intermediate coordinate x(f) as:

        x(f) = (1.0 - f) * x1 + f * x2

so x(f=0.0) = x1 and x(f=1.0) = x2. For an animation, f needs to be
some function of time, which for now we'll let be equispaced values
from 0 to 1.

This is easy to define, but a bit harder to implement. For that VMD
needs to know the initial and final coordinates and the number of
animation steps to create.

---------------------------------------------------------------------
                animate dup (and a bit of 'molinfo')

Yes, create animation frames. This is a new feature for version 1.2.
It is possible to duplicate a given frame and add it to the end of the
trajectory using one of the subcommands of 'animate'. The usage is:

        animate dup [|frame <number>] <molecule id>

which you can get by entering 'animate' by itself into the text
console window (another new feature in 1.2). For example, to copy the
first frame of the "top" molecule to the end of the animation, the
command is:

        animate dup frame 0 top

while

        animate dup 8

will duplicate the currently display frame of molecule 8 to the end of
the animation.

  This command can be used to create as many frames as needed, which
will then be modified. For the morphing example we will need N frames
when 2 are present. This can be done by copying the 2nd frame N-2
times, as this code snippet does to the top molecule (to check this
for yourself, just load a molecule, copy and paste this code into the
text window, and look at the number of frames in the animate form) :

        set N 10
        for {set i 2} {$i < $N} {incr i} {
                animate dup frame 1 top
        }

We'll also need to know how many frames are present in the molecule.
The "molinfo" command is used to get global information about a
molecule (as compared to information about each atom). To get the
total number of frames available, use

        molinfo <molid> get numframes

as in

        molinfo top get numframes

-------------------------------------------------------------------------
                                Bugs

As I just found out, we need to make a more helpful error message for
molinfo as well as some way to get a list of the available keywords.
The animate discussion also brings up an interesting question, what if
there are no coordinate frames to duplicate but you want to add a
frame? We'll have to work on this. Oh, and as a buglet, it isn't
possible to say "animate dup frame last top" which you might expect
from the equivalent atomselection commands.) For a real bug, the
commands:

        animate dup top
        animate dup frame 0 6

do work while

        animate dup frame 0 top

does not. The workaround we'll use is to call get the molecule id
number using the molinfo command to do the translation, as:

        animate dup frame 0 [molinfo top get id]

The advantage (not a problem!) of these VMDTech articles is
they really test all the variations of a command, and expose any
latent bugs that got passed over by our original testing.

---------------------------------------------------------------------
                        The atomselect command

The atomselect commands is well discussed in the User's Guide, so the
discussion here will be short. To get information about an atom you
need to create an atom selection, which you can use to query for
information about the selected atoms. We'll need to use this command
to get coordinate information, as with the following examples:

  # get the first frame
  set sel1 [atomselect top "all"] frame 0
  # get the last frame
  set sel2 [atomselect top "all"] frame last

  # get all the coordinates in the first step
  set x1 [$sel1 get {x y z}]
  # get all the coordinates in the last step
  set x2 [$sel2 get {x y z}]

  # print the difference
  foreach atomid [$sel1 get index] pos1 $x1 pos2 $x2 {
    puts "$atomid { $pos2 } - { $pos1 } = { [vecsub $coor2 $coor1] }"
  }

To see this example at work, make sure the first and last frames of an
animation are different (if you don't have two frames, load a
molecule, enter "animate dup top", and use the "Move Residue" option
from the popup menu to alter one of the frames) and copy and paste the
above code snippet.

You can also modify the coordinates using "set". For instance,
given the information above, we can set all the atoms of the second
selection to the positions of the first seletion by:

  $sel1 set {x y z} $x2

Again, for more information on how this works, please refer to the
User's Guide.

---------------------------------------------------------------------
                        Linear Interpolation

With this knowledge, the morphing script is easier to write. As with
any software project, this needs a specification.

Input:
  A molecule (with molecule id of "molid") with only two coordinate
frames loaded (as frame 0 and frame 1)
  N, the number of final frames, N > 2

Output:
  The same molecule but with N animation frames such that frame 0 of
the final animation is the same as frame 0 of the initial animation,
and frame (N-1) of the final animation is the same as frame 1 of the
initial animation.

proc linear_morph {molid N} {
    # make sure there are only two animation frames
    if {[molinfo $molid get numframes] != 2} {
        error "Molecule $molid must have 2 animation frames"
    }
    # workaround for the 'animate dup' bug; this will translate
    # 'top' to a number, if needed
    set molid [molinfo $molid get id]

    # Do some error checking on N
    if {$N != int($N)} {
        error "Need an integer number for the number of frames"
    }
    if {$N <= 2} {
        error "The number of frames must be greater than 2"
    }

    # Get the coordinates of the first and last frames (there are only 2)
    set sel1 [atomselect $molid "all" frame 0]
    set sel2 [atomselect $molid "all" frame 1]
    set x1 [$sel1 get {x y z}]
    set x2 [$sel2 get {x y z}]

    # Make N-2 new frames (copied from the last frame)
    for {set i 2} {$i < $N} {incr i} {
        animate dup frame 1 $molid
    }
    # there are now N frames

    # Do the linear interpolation in steps of 1/N so
    # f(0) = 0.0 and f(N-1) = 1.0
    set stepsize [expr 1.0 / double($N) ]
    for {set t 0} {$t < $N} {incr t} {
        set f [expr $stepsize * $t]
        # calculate the linear interpolation for each coordinate
        set coords {}
        foreach coor1 $x1 coor2 $x2 {
            lappend coords [vecadd [vecscale [expr 1 - $f] $coor1] \
                                [vecscale $f $coor2]]
        }
        # go to the given frame and set the coordinates
        $sel1 frame $t
        $sel1 set {x y z} $coords
    }
}

To enter this script into VMD, copy and paste it into the vmd console,
or save it to a file and source it.

-------------------------------------------------------------------------
                        Using this script

Coming up with an example for this script was surprisingly hard.
We've used this method to produce some animations, but the coordinates
for those systems are not publically available. There are many
examples from the aforementioned Protein Motions Database, but in most
cases the example structures didn't have the same number of atoms in
them. One example that does work is a linear transformation from
atoms 1 to 1340 of PDB entry 6p21 and the same atoms in entry 4q21.
This is for a ras protein and the database says there are "2 loop
movements [that] move CA atoms up to 10 Angstrom."

Even though these two structures have the same topology (atom names,
indicies, and residue names), they aren't aligned, so we'll need to
apply an alignment matrix. For more information on this, see the
previous VMDTech on "RMSD and best-fit alignments".

The following script will do everything for you. It uses the
"pdbload" command to download entries from the PDB ftp site
automatically, so you'll need a network connection for this to work.
For each of the two files it extracts the correct substructure and
saves it to a file. After that, it loads the first substructure then
appends the second substructure as an animation frame.

All you'll have to do is source this text.

        # load the first molecule and extract the right substructure
        mol pdbload 6q21
        set sel1 [atomselect top "index 0 to 1339"]
        $sel1 writepdb from_1ldm.pdb
        mol delete top

        # load the second molecule and extract the right substructure
        mol pdbload 4q21
        set sel2 [atomselect top "index 0 to 1339"]
        $sel2 writepdb from_4q21.pdb
        mol delete top

        # load the first substructure
        mol load pdb from_1ldm.pdb
        # load the second substructure as an animation frame
        animate read pdb from_4q21.pdb top

        # perform the alignment between the first and second frames
        set sel1 [atomselect top all frame 0]
        set sel2 [atomselect top all frame 1]
        set transmat [measure fit $sel2 $sel1 weight mass]
        # and apply it to the second frame
        $sel2 move $transmat

        # remove the temporary files (commented out for now so
        # can reload from the temporary files)
        #file delete from_1ldm.pdb
        #file delete from_4q21.pdb

Once this is loaded you can use the morphing script, like this

        linear_morph top 10

(of course, you'll have to wait a bit because Tcl is rather slow. Tcl
8 should be faster.)

Use the animate menu (remember, you can use the Alt-a key combination
to open it) and play forward or back (try changing the animation style
to "Rock"). You might also try something like use the graphics menu
(Alt-g) to show it as an animated secondary structure.

To remove the new frames (which are 1 to 8) you can use the Edit menu
(Alt-e) with the begin set to 1 and the end set to 8.

-------------------------------------------------------------------------
                                Problems

  The biggest problem with linear morphing is that proteins don't do
linear transformations. Imagine the case where you have the one
structure and its mirror image (which I know is not biologically
realistic, but bear with me here). Half way through the
transformation, all the atoms will be in the center of the system
at the same position, which is pretty worthless.

  We were thinking about how to make a better morph. We could image
finding the 4x4 transformation matrix T that brought the first structure
to the last one, then for step i of N total steps let

                         f -f
         x(i) = (1-f) * T * x1 + f * T * x2

This might work, but how do you compute the Nth root of a matrix? I
know from quantum class one trick is to to a Taylor series expansion,
but the convergence rate for roots is pretty slow. I suggested using
quaternions, but being able to pronounce it was about the extent of my
knowledge. Anyone here know of a useful way to transform biological
systems which is a bit more realistic?

-------------------------------------------------------------------------
                        More Advanced Morphing

In this case the morph went linearly from the initial conformation to
the final one. There are other ways to do this, for instance, you
could use a function which slows down near the ends but is faster in
the middle, making for a smoother looking animation. Another
possibility would be to cycle the coordinates in a loop, like so the
last frame does a nice, smooth segue back to the first frame. In the
linear_morph script above, the mathematical statement would replace
the following line:

        set f [expr $stepsize * $t]

We can actually generalize the command somewhat by calling another
function, like:

        set f [morph_cycle $t $N]
or
        set f [morph_sin2 $t $N]

and define the procedures you want,

        proc morph_linear {t N} {
          return [expr double($t) / double($N)]
        }
        proc morph_cycle {t N} {
          global M_PI
          return [expr (1.0 - cos( $M_PI * double($t) / ($N + 1.0)))/2.0]
        }
        proc morph_sin2 {t N} {
          global M_PI
          return [expr pow(sin( $M_PI * double($t) / double($N) / 2.0), 2)]
        }

To make things more general, you can make a new procedure which is
almost the same as linear_morph. The only difference is that you can
pass the type of morph transformation as a parameter, which is
called in the line where f is set.

proc morph {molid N {morph_type morph_linear}} {
    # make sure there are only two animation frames
    if {[molinfo $molid get numframes] != 2} {
        error "Molecule $molid must have 2 animation frames"
    }
    # workaround for the 'animate dup' bug; this will translate
    # 'top' to a number, if needed
    set molid [molinfo $molid get id]

    # Do some error checking on N
    if {$N != int($N)} {
        error "Need an integer number for the number of frames"
    }
    if {$N <= 2} {
        error "The number of frames must be greater than 2"
    }

    # Get the coordinates of the first and last frames (there are only 2)
    set sel1 [atomselect $molid "all" frame 0]
    set sel2 [atomselect $molid "all" frame 1]
    set x1 [$sel1 get {x y z}]
    set x2 [$sel2 get {x y z}]

    # Make N-2 new frames (copied from the last frame)
    for {set i 2} {$i < $N} {incr i} {
        animate dup frame 1 $molid
    }
    # there are now N frames

    # Do the linear interpolation in steps of 1/N so
    # f(0) = 0.0 and f(N-1) = 1.0
    for {set t 0} {$t < $N} {incr t} {
        # Here's the call to the user-defined morph function
        set f [$morph_type $t $N]
        # calculate the linear interpolation for each coordinate
        set coords {}
        foreach coor1 $x1 coor2 $x2 {
            lappend coords [vecadd [vecscale [expr 1 - $f] $coor1] \
                                [vecscale $f $coor2]]
        }
        # go to the given frame and set the coordinates
        $sel1 frame $t
        $sel1 set {x y z} $coords
    }
}

The default method, morph_linear, is exactly the same as the
linear_morph implemnetation.

        morph top 10

However, you can instead do a cycle as

        morph top 15 morph_cycle

Because of its generality, this is the script we suggest you use, and
we will be adding it to our on-line script library.

-----------------------------------------------------------------------
                        Personal Commentary

Finally, as a personal endnote to this discussion on linear morphing,
I can remember the first time I did a linear 2D morph like this - it
was around 1985 on a TI-99 4/A using text mode graphics after seeing a
"3-2-1 Contact" TV show where a vector based computer system morphed
an egg shape into a chicken shape. It is nice to see how much better
computers have gotten since then -- Andrew

========================================================================