Up: Topology Tutorial Previous: Glutathione

Solution to Norleucine Problem

Norleucine has the same number of carbon atoms as isoleucine. Both molecules are shown below.

Isoleucine:

     |    HG21 HG22
  HN-N      | / 
     |     CG2--HG23
     |    /
  HA-CA--CB-HB    HD1
     |    \       /
     |     CG1--CD--HD2
   O=C    / \     \	 
     | HG11 HG12  HD3

Norleucine:

     |                   
  HN-N                   
     |   HB1 HG1 HD1   HE1 
     |   |   |   |    /    
  HA-CA--CB--CG--CD--CE--HE2
     |   |   |   |    \   
     |   HB2 HG2 HD2   HE3  
   O=C                  
     |

Since norleucine and isoleucine have the same number of carbon atoms, isoleucine is a natural source of topology information for norleucine. Moreover, norleucine's unbranched ("single chain") side chain resembles that of lysine.

We've seen the structure and topology entry for lysine above, so let's now look at the topology entry for isoleucine, shown below.

Figure 7: Diagram showing an outline of the topology information for isoleucine. Atom names are displayed in black, atom types in blue, and atomic partial charges in red.

Isoleucine Topology Entry:
RESI ILE          0.00
GROUP   
ATOM N    NH1    -0.47  !     |    HG21 HG22
ATOM HN   H       0.31  !  HN-N      | / 
ATOM CA   CT1     0.07  !     |     CG2--HG23
ATOM HA   HB      0.09  !     |    /
GROUP                   !  HA-CA--CB-HB    HD1
ATOM CB   CT1    -0.09  !     |    \       /
ATOM HB   HA      0.09  !     |     CG1--CD--HD2
GROUP                   !   O=C    / \     \	 
ATOM CG2  CT3    -0.27  !     | HG11 HG12  HD3
ATOM HG21 HA      0.09
ATOM HG22 HA      0.09
ATOM HG23 HA      0.09
GROUP   
ATOM CG1  CT2    -0.18
ATOM HG11 HA      0.09
ATOM HG12 HA      0.09
GROUP   
ATOM CD   CT3    -0.27
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
ATOM HD3  HA      0.09
GROUP   
ATOM C    C       0.51
ATOM O    O      -0.51
BOND CB  CA   CG1 CB   CG2 CB   CD  CG1   
BOND N   HN   N   CA    C   CA   C   +N   
BOND CA  HA   CB  HB   CG1 HG11 CG1 HG12 CG2 HG21   
BOND CG2 HG22 CG2 HG23 CD  HD1  CD  HD2  CD  HD3 
DOUBLE  O   C
IMPR N -C CA HN  C CA +N O   
DONOR HN N   
ACCEPTOR O C   
IC -C   CA   *N   HN    1.3470 124.1600  180.0000 114.1900  0.9978
IC -C   N    CA   C     1.3470 124.1600  180.0000 106.3500  1.5190
IC N    CA   C    +N    1.4542 106.3500  180.0000 117.9700  1.3465
IC +N   CA   *C   O     1.3465 117.9700  180.0000 120.5900  1.2300
IC CA   C    +N   +CA   1.5190 117.9700  180.0000 124.2100  1.4467
IC N    C    *CA  CB    1.4542 106.3500  124.2200 112.9300  1.5681
IC N    C    *CA  HA    1.4542 106.3500 -115.6300 106.8100  1.0826
IC N    CA   CB   CG1   1.4542 112.7900  180.0000 113.6300  1.5498
IC CG1  CA   *CB  HB    1.5498 113.6300  114.5500 104.4800  1.1195
IC CG1  CA   *CB  CG2   1.5498 113.6300 -130.0400 113.9300  1.5452
IC CA   CB   CG2  HG21  1.5681 113.9300 -171.3000 110.6100  1.1100
IC HG21 CB   *CG2 HG22  1.1100 110.6100  119.3500 110.9000  1.1102
IC HG21 CB   *CG2 HG23  1.1100 110.6100 -120.0900 110.9700  1.1105
IC CA   CB   CG1  CD    1.5681 113.6300  180.0000 114.0900  1.5381
IC CD   CB   *CG1 HG11  1.5381 114.0900  122.3600 109.7800  1.1130
IC CD   CB   *CG1 HG12  1.5381 114.0900 -120.5900 108.8900  1.1141
IC CB   CG1  CD   HD1   1.5498 114.0900 -176.7800 110.3100  1.1115
IC HD1  CG1  *CD  HD2   1.1115 110.3100  119.7500 110.6500  1.1113
IC HD1  CG1  *CD  HD3   1.1115 110.3100 -119.7000 111.0200  1.1103

First, we split norleucine into several atom groups, each with an integer charge and each resembling an atom group in either lysine or isoleucine. As shown in Figure 8, the backbone group and the -CH2- groups are very similar to those of lysine, so we will use the corresponding atom groups in lysine to assign atom types and charges for norleucine.

The terminal methyl group, -CH3, in norleucine is very similar to that of isoleucine. We will use those isoleucine group properties for the norleucine -CH3.

The figure below shows how we have grouped the norleucine molecule and how we will piece together its topology from existing topologies.

Figure 8: Diagram showing the structure of norleucine. Groups circled in green have corresponding groups in the lysine topology entry, while the terminal methyl group circled in pink is similar to that of isoleucine.

After piecing the molecule together from existing groups, we can create a topology entry for norleucine. It should look like the one below.

Norleucine Topology Entry:
RESI NLE          0.00
!!! Lysine groups:
GROUP   
ATOM N    NH1    -0.47  !     |                   
ATOM HN   H       0.31  !  HN-N                   
ATOM CA   CT1     0.07  !     |   HB1 HG1 HD1    HE1
ATOM HA   HB      0.09  !     |   |   |   |     /   
GROUP                   !  HA-CA--CB--CG--CD--CE--HE2
ATOM CB   CT2    -0.18  !     |   |   |   |     \
ATOM HB1  HA      0.09  !     |   HB2 HG2 HD2    HE3
ATOM HB2  HA      0.09  !   O=C                  
GROUP                   !     |                  
ATOM CG   CT2    -0.18
ATOM HG1  HA      0.09
ATOM HG2  HA      0.09    
GROUP   
ATOM CD   CT2    -0.18
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
GROUP   
ATOM C    C       0.51
ATOM O    O      -0.51
!!! Isoleucine groups:
GROUP
ATOM CE   CT3    -0.27
ATOM HE1  HA      0.09
ATOM HE2  HA      0.09
ATOM HE3  HA      0.09
BOND CB CA   CG CB   CD CG   CE CD
BOND N  HN   N  CA    C  CA   
BOND C  +N   CA HA   CB HB1  CB HB2  CG HG1   
BOND CG HG2  CD HD1  CD HD2   
DOUBLE   O  C   
BOND CE HE1  CE HE2  CE HE3   
IMPR N -C CA HN  C CA +N O    
DONOR HN N   
ACCEPTOR O C
!!! Internal coordinate entries
!!! Lysine ICs:
IC -C   CA   *N   HN    1.3482 123.5700  180.0000 115.1100  0.9988
IC -C   N    CA   C     1.3482 123.5700  180.0000 107.2900  1.5187
IC N    CA   C    +N    1.4504 107.2900  180.0000 117.2700  1.3478
IC +N   CA   *C   O     1.3478 117.2700  180.0000 120.7900  1.2277
IC CA   C    +N   +CA   1.5187 117.2700  180.0000 124.9100  1.4487
IC N    C    *CA  CB    1.4504 107.2900  122.2300 111.3600  1.5568
IC N    C    *CA  HA    1.4504 107.2900 -116.8800 107.3600  1.0833
IC N    CA   CB   CG    1.4504 111.4700  180.0000 115.7600  1.5435
IC CG   CA   *CB  HB1   1.5435 115.7600  120.9000 107.1100  1.1146
IC CG   CA   *CB  HB2   1.5435 115.7600 -124.4800 108.9900  1.1131
IC CA   CB   CG   CD    1.5568 115.7600  180.0000 113.2800  1.5397
IC CD   CB   *CG  HG1   1.5397 113.2800  120.7400 109.1000  1.1138
IC CD   CB   *CG  HG2   1.5397 113.2800 -122.3400 108.9900  1.1143
IC CB   CG   CD   CE    1.5435 113.2800  180.0000 112.3300  1.5350
!!! Isoleucine ICs:
IC CE   CG   *CD  HD1   1.5381 114.0900  122.3600 109.7800  1.1130
IC CE   CG   *CD  HD2   1.5381 114.0900 -120.5900 108.8900  1.1141
IC CG   CD   CE   HE1   1.5498 114.0900 -176.7800 110.3100  1.1115
IC HE1  CD   *CE  HE2   1.1115 110.3100  119.7500 110.6500  1.1113
IC HE1  CD   *CE  HE3   1.1115 110.3100 -119.7000 111.0200  1.1103

Recall that the ! sign begins a comment: everything after it on a line is ignored. Comments make the topology file more readable and user-friendly.
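The requirement that each group carry an integer charge can be checked mechanically. The following is a minimal Python sketch, not part of the tutorial files: the function name `group_charges` is hypothetical, and the parser assumes only the simple `GROUP` / `ATOM name type charge` layout used in the entry above, with `!` starting a comment.

```python
from decimal import Decimal

# GROUP/ATOM records copied from the norleucine entry above.
NLE_ATOM_RECORDS = """\
!!! Lysine groups:
GROUP
ATOM N    NH1    -0.47  ! backbone
ATOM HN   H       0.31
ATOM CA   CT1     0.07
ATOM HA   HB      0.09
GROUP
ATOM CB   CT2    -0.18
ATOM HB1  HA      0.09
ATOM HB2  HA      0.09
GROUP
ATOM CG   CT2    -0.18
ATOM HG1  HA      0.09
ATOM HG2  HA      0.09
GROUP
ATOM CD   CT2    -0.18
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
GROUP
ATOM C    C       0.51
ATOM O    O      -0.51
!!! Isoleucine groups:
GROUP
ATOM CE   CT3    -0.27
ATOM HE1  HA      0.09
ATOM HE2  HA      0.09
ATOM HE3  HA      0.09
"""


def group_charges(entry_text):
    """Return the summed charge of each GROUP, in file order (hypothetical helper)."""
    totals = []
    for raw in entry_text.splitlines():
        fields = raw.split("!", 1)[0].split()  # drop comments, then tokenize
        if not fields:
            continue
        if fields[0] == "GROUP":
            totals.append(Decimal("0"))
        elif fields[0] == "ATOM":
            if not totals:  # tolerate ATOM records before the first GROUP
                totals.append(Decimal("0"))
            totals[-1] += Decimal(fields[3])  # fields: ATOM, name, type, charge
    return totals


group_totals = group_charges(NLE_ATOM_RECORDS)
for index, total in enumerate(group_totals, start=1):
    print(f"group {index}: charge {total}")
print("residue total:", sum(group_totals))
```

Decimal arithmetic is used so that sums such as -0.27 + 3 x 0.09 compare exactly with zero, which floating-point addition does not guarantee.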

Note that since we have pieced together the norleucine molecule from lysine and isoleucine, the IC records of norleucine also come in two parts: lysine-like and isoleucine-like. For the part of norleucine nearest the backbone, from carbon atom CA to CG, we use the ICs from lysine's topology entry, as shown. For the far end of the side chain, from carbon atom CG to CE, we refer to the corresponding atoms of isoleucine (norleucine's CD and CE correspond to isoleucine's CG1 and CD) to get the IC records.
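The isoleucine-derived IC records are the isoleucine records with atom names substituted. As an illustration only, here is a small Python sketch of that renaming. The mapping `ILE_TO_NLE` is our reading of the correspondence between the two side chains, and `rename_ic` is a hypothetical helper, not a psfgen or VMD command.

```python
# Assumed isoleucine -> norleucine atom-name correspondence for the chain end.
ILE_TO_NLE = {
    "CB": "CG", "CG1": "CD", "CD": "CE",
    "HG11": "HD1", "HG12": "HD2",
    "HD1": "HE1", "HD2": "HE2", "HD3": "HE3",
}

# The five isoleucine IC records that describe CG1, CD and their hydrogens.
ILE_ICS = """\
IC CD   CB   *CG1 HG11  1.5381 114.0900  122.3600 109.7800  1.1130
IC CD   CB   *CG1 HG12  1.5381 114.0900 -120.5900 108.8900  1.1141
IC CB   CG1  CD   HD1   1.5498 114.0900 -176.7800 110.3100  1.1115
IC HD1  CG1  *CD  HD2   1.1115 110.3100  119.7500 110.6500  1.1113
IC HD1  CG1  *CD  HD3   1.1115 110.3100 -119.7000 111.0200  1.1103
"""


def rename_ic(line, mapping):
    """Rename the four atoms of one IC record; numeric fields are left untouched."""
    fields = line.split()
    if not fields or fields[0] != "IC":
        return line
    names = []
    for name in fields[1:5]:
        star = "*" if name.startswith("*") else ""  # '*' marks an improper-type IC
        bare = name.lstrip("*")
        # Each name is looked up exactly once, so CG1 -> CD is never
        # re-mapped to CE, as chained text replacement would do.
        names.append(star + mapping.get(bare, bare))
    numbers = fields[5:]
    return "IC " + " ".join(n.ljust(4) for n in names) + " " + " ".join(v.rjust(9) for v in numbers)


renamed = [rename_ic(line, ILE_TO_NLE) for line in ILE_ICS.splitlines()]
for line in renamed:
    print(line)
```

The per-token lookup is the important design choice: because CD is both a source name and a target name in the mapping, renaming by successive search-and-replace would corrupt the records.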

Before simulating a protein with norleucine, you should create the above topology entry in your topology file, and rename the file appropriately, as done with D-alanine. The file top_all27_prot_lipid_nle.inp has been provided for you in the directory 3-solution/.
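Before adding a hand-built entry to a topology file, it can help to confirm that its BOND records are consistent with its ATOM records. The sketch below is one possible check, written in plain Python under the assumption that the entry uses only the `ATOM`, `BOND` and `DOUBLE` record layouts seen above. The helper names are hypothetical. Names prefixed with `+` or `-` refer to neighbouring residues and are skipped.

```python
# ATOM, BOND and DOUBLE records copied from the norleucine entry above.
NLE_ENTRY = """\
ATOM N    NH1    -0.47
ATOM HN   H       0.31
ATOM CA   CT1     0.07
ATOM HA   HB      0.09
ATOM CB   CT2    -0.18
ATOM HB1  HA      0.09
ATOM HB2  HA      0.09
ATOM CG   CT2    -0.18
ATOM HG1  HA      0.09
ATOM HG2  HA      0.09
ATOM CD   CT2    -0.18
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
ATOM C    C       0.51
ATOM O    O      -0.51
ATOM CE   CT3    -0.27
ATOM HE1  HA      0.09
ATOM HE2  HA      0.09
ATOM HE3  HA      0.09
BOND CB CA   CG CB   CD CG   CE CD
BOND N  HN   N  CA    C  CA
BOND C  +N   CA HA   CB HB1  CB HB2  CG HG1
BOND CG HG2  CD HD1  CD HD2
DOUBLE   O  C
BOND CE HE1  CE HE2  CE HE3
"""


def atoms_and_bonds(entry_text):
    """Collect declared atom names and bonded name pairs from an entry."""
    atoms, bonds = [], []
    for raw in entry_text.splitlines():
        fields = raw.split("!", 1)[0].split()  # drop comments, then tokenize
        if not fields:
            continue
        if fields[0] == "ATOM":
            atoms.append(fields[1])
        elif fields[0] in ("BOND", "DOUBLE"):
            names = fields[1:]
            bonds.extend(zip(names[0::2], names[1::2]))  # names come in pairs
    return atoms, bonds


def check_bonds(atoms, bonds):
    """Return (names bonded but never declared, atoms declared but never bonded)."""
    declared = set(atoms)
    bonded = {name for pair in bonds for name in pair}
    unknown = sorted(n for n in bonded if n[0] not in "+-" and n not in declared)
    unbonded = sorted(declared - bonded)
    return unknown, unbonded


atoms, bonds = atoms_and_bonds(NLE_ENTRY)
unknown, unbonded = check_bonds(atoms, bonds)
print(len(atoms), "atoms,", len(bonds), "bonds")
print("undeclared names in bonds:", unknown)
print("atoms without bonds:", unbonded)
```

A typo such as `HB` instead of `HB1` in a BOND record would show up in the first list, and a forgotten hydrogen bond in the second.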

Figure 9: Diagram showing a complete outline of the topology information for norleucine. The molecule is pieced together using groups from lysine and isoleucine as shown. The atom types and charges, taken from the corresponding parts of lysine and isoleucine, are shown. Atom names are displayed in black, atom types in blue, and atomic partial charges in red.

We have created a topology entry for norleucine by splitting it into two components, each of which is similar to an amino acid with a known topology entry. Now let's look at a specific protein that contains norleucine: the HIV-1 GAG (Group-specific Antigen) protein, a structural protein in HIV. The PDB file 1FGL.pdb contains a structure of HIV-1 GAG. Chain B is a part of HIV-1 GAG that contains norleucine.

Load the PDB file into VMD and set the Coloring Method to "ResType". The residues are now colored white, green, red, and blue, corresponding to nonpolar, polar, acidic, and basic residues, respectively. The only exception is norleucine, which is colored cyan because VMD has no information about its residue type; it is "unassigned".
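Residues that need a custom topology entry can also be spotted outside VMD by scanning the residue-name column of the PDB file. The following Python sketch is one way to do this. It assumes the fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26). The function name, the water-name set, and the sample records are all made up for illustration and are not taken from 1FGL.pdb.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
WATER_NAMES = {"HOH", "WAT", "TIP3"}  # assumed spellings for water


def nonstandard_residues(pdb_lines):
    """Return unique (chain, resid, resname) tuples for non-standard residues."""
    found = {}  # dict used as an insertion-ordered set
    for line in pdb_lines:
        if line[:6].strip() not in ("ATOM", "HETATM") or len(line) < 26:
            continue
        resname = line[17:20].strip()
        if resname in STANDARD_RESIDUES or resname in WATER_NAMES:
            continue
        found[(line[21], int(line[22:26]), resname)] = None
    return list(found)


def fake_pdb_line(record, serial, atom, resname, chain, resid):
    """Build the first 26 columns of a PDB atom record (illustration only)."""
    return f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}{resid:>4}"


# Made-up records; residue numbers here are placeholders.
sample = [
    fake_pdb_line("ATOM", 1, "N", "ALA", "A", 1),
    fake_pdb_line("ATOM", 2, "CA", "ALA", "A", 1),
    fake_pdb_line("HETATM", 3, "N", "NLE", "B", 5),
    fake_pdb_line("HETATM", 4, "CA", "NLE", "B", 5),
    fake_pdb_line("HETATM", 5, "O", "HOH", "A", 101),
]
print(nonstandard_residues(sample))

# With a real file one might write:
#   with open("1FGL.pdb") as handle:
#       print(nonstandard_residues(handle))
```

Any residue name reported this way needs either an entry in the topology file or an alias to an existing residue before psfgen can build the structure.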

With the topology file we just generated, we can build a psf file for the whole protein. First, load 1FGL.pdb into VMD and write a separate pdb file for each chain:

set chaina [atomselect top "chain A"]  
$chaina writepdb 1FGL-chainA.pdb  
set chainb [atomselect top "chain B"]  
$chainb writepdb 1FGL-chainB.pdb  
set water [atomselect top "water"]  
$water writepdb 1FGL-water.pdb  

The file nle.pgn is the psfgen script for building the psf and pdb file for the whole protein. Note that top_all22_prot_lipid_nle.inp is the topology file specified in nle.pgn. It is simply the file top_all22_prot_lipid_gsh.inp created earlier, but with our norleucine topology entry added. You can open it and browse the file by typing nedit top_all22_prot_lipid_nle.inp in a terminal window.

Now, generate the psf file for HIV-1 GAG by going to the VMD TkCon window and typing:

source nle.pgn  

You will generate files nle.psf and nle.pdb, which you can use to run a NAMD simulation!


tutorial-l@ks.uiuc.edu