## VMD-L Mailing List

**From:** Sergei Izrailev (*sergei_at_ks.uiuc.edu*)

**Date:** Wed Jun 11 1997 - 16:45:17 CDT

**Next message:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Previous message:**Andrew Dalke: "residue-residue distances and atom selection bug"**Next in thread:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Reply:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hello,

In this issue of VMDTech we present Hingefind -

a novel algorithm to investigate domain motions in proteins

(by Willy Wriggers).

---------------------------------------------------------------------

1. General remarks

==================

Hingefind is an algorithm for the identification of domain

movements and their characterization and visualization by effective

rotation axes (hinges). The program compares two known structures

(e.g. two different crystal structures of a protein or the results

of molecular dynamics simulations) and partitions the protein with a

prespecified tolerance in preserved subdomains. It then determines

effective rotation axes which characterize the domain movements with

respect to the reference domain ("rigid core"). With little

modification, the parts of the algorithm can also be used alone,

i.e. one can assign domains manually and let the algorithm determine

the movements between the domains, or one can use the algorithm to

partition a protein into preserved subdomains. The method does not

require any previous knowledge about functionally relevant domains

or hinged domain motions, however a critical analysis of the results

using the program output is recommended. The output files provide

information about the accuracy of the found effective rotation. The

variety of options and tolerances allow to find an optimal

partitioning. The user can assess the validity of the depicted

movement and change the script if necessary.

2. Examples

===========

Examples of the application of the method can be found at

http://www.ks.uiuc.edu/Research/METHOD/domain_mov/ - or simply

follow the instructions below in 6.) and check it out yourself.

3. About the method

===================

The algorithm is separated in two parts: the "partitioning" and the

"rotational fit" section. The "partitioning" part determines

domains with preserved structure in the two compared coordinate

sets, depending on a prespecified tolerance. The method uses the

least-squares fit method (W. Kabsch, 1976) to superimpose

structures. A domain is found in a iterative "adaptive selection"

procedure, in which poor matching residues are excluded from the

domain and good matches are included (in "slo" mode partitiononing,

available in the X-PLOR version, only the heaviest connected set is

considered, maintaining the connectivity of the changing domain).

After a partitioning of the protein the rigid domains are sorted by

their size and domains with a size larger than a cutoff considered

for further evaluation.

The "rotational fit" method attempts to locate a rotation axis

which characterize the movement of the domain as a hinged rigid body

motion. The problem is to find an approximation of the rigid body

motion with the constraint that transformations are not allowed,

only rotations about the unknown hinge axis. Note that the the

rotation about an axis without translation, in general, will not

yield the closest fit of the Kabsch least-squares method. The

accuracy of the approximation can be assessed by comparing the

least-squares fit with the proposed rotational fit. The relative

error (in percent) is computed as

[ RMSD (rot. fit) - RMSD (least sq.) ] / v,

where v is the COM separation of the domain in both positions.

The constructed rotation upholds (similar to a least-squares fit)

the removal of the COM difference between the two coordinate sets,

but approximates the rotation relative to a least-squares fit. Thus,

the validity of the approximation can also be assessed by checking

the "projection angle" beta between the least-squares rotation axis

and the hinge axis, which should be small. One finds that the method

works best for larger domains comprising several secondary

structure elements. The projection angle beta and the relative error

can be found in the output of the Hingefind runs.

4. Implementations

==================

Currently, users have the choice of two implementations of Hingefind.

* X-PLOR scripting language implementation.

* Tcl plug-in for our free molecular graphics program vmd.

The two versions of Hingefind are complementary in their

application. The X-PLOR version is designed for longer batch jobs

which allow the user to run a large number of trial runs in various

modes. The Tcl version of the program runs with our free molecular

graphics package vmd and offers the user greater flexibility in the

selection of suitable atoms for comparison. It is designed for

interactive use. For efficiency in interactive use, the Tcl version

does not provide the option to maintain the spatial connectivity of

found domains.

In the following, I will only discuss the Tcl version for vmd. The

X-PLOR version of hingefind is described at URL

ftp://lisboa.ks.uiuc.edu/pub/hingefind/hingefind_xplor.html

5. About the Tcl version

========================

The Tcl (John Ousterhout, 1988) version of Hingefind is designed

for interactive use with our molecular graphics package vmd. The

version for vmd has the advantage that it only requires PDB files as

input. Thereby, selections of atoms are easier to adapt to the

user's needs than with X-PLOR. E.g., conformations of nonidentical

proteins can be compared, as long as pairs of atoms in both

structures can be matched sequentially.

6. The Tcl source file for vmd

==============================

There is a single file at the ftp site

(ftp://ftp.ks.uiuc.edu/pub/hingefind/hingefind.tcl) which contains

the algorithm.

Start up vmd and source the Tcl script by typing

source hingefind.tcl

in the vmd command shell and return. The default setup of the

script loads two structures of lactoferrin from the PDB databank. To

load these structures, type

load

and return. The procedure load now initializes the program, selects

the pairs of corresponding atoms (default: alpha carbons) and

initializes some global variables. To partition the protein

structures with a given tolerance epsilon, type

partition epsilon

and return. The tolerance epsilon must be entered in Angstrom

units. The tolerance should be larger than the imprecision of the

coordinates in the two structures (local noise level), but smaller

than the total rms deviation of the two conformations. The protein

now iteratively finds the rigid domains, sorts them by size, and

draws tubes in the color of the domain-id, starting at domain-id 0,

the largest found domain. To determine the effective rotation axis

(hinge) and other information about the relative movement of two

domains domid1 and domid2, enter

hinge domid1 domid2

and return. Domain domid1 is the reference domain and domain domid2

the moving domain. Both structures are superimposed by the

reference domain. The rotation axis is shown as an arrow with it's

orientation representing a left-handed rotation about the axis. A

"pivot" point can be found in the middle of the arrow which is

connected to the COM of the moving domain in both structures to

illustrate the rotation angle. The rotation angle and other useful

information about the run, the domains, and the accuracy of the

rotational fitting can be found in the output (vmd command shell).

An example vmd session would look like this:

source hingefind.tcl

load

partition 2.0

hinge 0 1

hinge 0 2

partition 1.8

hinge 0 1

The user has to edit the procedure load to select different PDB

files for comparison, to change the selection of atoms for

comparison, or to customize the hingefind computation by resetting

some of the global variables. Here is a partial list of variables

which may be changed:

* $maxccounter: The number of maximum cycles of the "converge"

loop. In case the algorithm does not converge within the specified

number of cycles (this was very rarely observed at extreme

tolerances), a warning message is written.

* $ndomains: The max. number of domains to be found. Recommended:

Set very large to yield full partitioning. Small values (2-5) should

be used for test runs.

* $cutdom: Minimum domain size of the found domains.

The initial rendering of the loaded molecules may also be altered

in procedure load. By default molecule 1 is shown in purple and

codes with colored tubes for the found domains, while molecule 2

remains a solid white line which follows the backbone.

7. Output

=========

The rotation angle and other useful information about the hinge

rotation, the domains, and the accuracy of the rotational fitting

are written to the vmd command shell. The information includes:

* the pivot point of a rotation

* the effective rotation axis (hinge)

* the effective rotation angle in degrees

* the rmsd (least squ.) of the moving domain

* the rmsd of the moving domain after hinge rotation

* the relative error (see Hingefind homepage)

* the projection (deviation) angle beta in degrees (see Hingefind

homepage)

8. About the Tcl code

=====================

A programmer can find more information about available vmd commands

and routines at the vmd user's guide. All procedures and variables

are documented in the program. The datastructures used to store

structural information and domain membership are briefly described

here:

* list of lists of indices - Used by procedures criterion,

superimpose, seed and convergence to store information about

selected atomindices in molecules 1 and 2. A list containing a pair

of lists of equal length. The first stores the atom indices of

molecule 1. The second list stores the corresponding atom indices of

molecule 2.

* list $dindex - List of domain membership indices for each atom of

$full1 (atom selection in procedure load). This list gets updated

by procedure update_boundaries once a domain has been found.

* list $sup_rmsd - Contains a history of conformance of each atom

of $full1 with previously found domains. Procedure update_boundaries

updates the domain membership indices only for those atoms which

show an improved conformance with the new domain. The conformance of

an atom is measured by the separation of the corresponding atoms in

molecule 1 and 2 depending on the domain chosen for superposition.

* 'domindex' arrays - after the protein is fully partitioned into

domains, the procedure partition transfers the membership

information from the list $dindex to the arrays $domindex1($i) and

$domindex2($i), which contain the atom indices of molecules 1 and 2

corresponding to each domain $i. The domindex arrays are then sorted

in the procedure sort_n_render to be used for the determination of

the effective rotation axis in procedure hinge.

9. Reference

============

W. Wriggers and K. Schulten, Protein Domain Movements: Detection of

Rigid Domains and Visualization of Hinges in Comparisons of Atomic

Coordinates, Proteins (1997) In press.

10. Correspondence

==================

The author would appreciate any comments regarding the usefulness

of the algorithm. Please send any comments to wriggers_at_ks.uiuc.edu.

--------------------------------------------------

Sergei Izrailev

--

**Next message:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Previous message:**Andrew Dalke: "residue-residue distances and atom selection bug"**Next in thread:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Reply:**Willy Wriggers: "Re: [VMDTECH] Hingefind"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]