Isoleucine:
       |    HG21 HG22
    HN-N      | /
       |     CG2--HG23
       |    /
    HA-CA--CB-HB    HD1
       |     \       /
       |      CG1--CD--HD2
     O=C     /   \     \
       |  HG11 HG12  HD3
Norleucine:
       |
    HN-N
       |   HB1 HG1 HD1 HE1
       |   |   |   |   /
    HA-CA--CB--CG--CD--CE--HE2
       |   |   |   |   \
       |   HB2 HG2 HD2 HE3
     O=C
       |
Since norleucine and isoleucine have the same number of carbon atoms, isoleucine is a natural source for parts of norleucine's topology entry. Moreover, norleucine's unbranched ("single chain") side chain resembles that of lysine.
We've seen the structure and topology file for lysine above, so let's now look at the topology entry for isoleucine, shown below.
Isoleucine Topology Entry:

RESI ILE          0.00
GROUP
ATOM N    NH1    -0.47  !     |    HG21 HG22
ATOM HN   H       0.31  !  HN-N      | /
ATOM CA   CT1     0.07  !     |     CG2--HG23
ATOM HA   HB      0.09  !     |    /
GROUP                   !  HA-CA--CB-HB    HD1
ATOM CB   CT1    -0.09  !     |     \       /
ATOM HB   HA      0.09  !     |      CG1--CD--HD2
GROUP                   !   O=C     /   \     \
ATOM CG2  CT3    -0.27  !     |  HG11 HG12  HD3
ATOM HG21 HA      0.09
ATOM HG22 HA      0.09
ATOM HG23 HA      0.09
GROUP
ATOM CG1  CT2    -0.18
ATOM HG11 HA      0.09
ATOM HG12 HA      0.09
GROUP
ATOM CD   CT3    -0.27
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
ATOM HD3  HA      0.09
GROUP
ATOM C    C       0.51
ATOM O    O      -0.51
BOND CB  CA   CG1 CB   CG2 CB   CD  CG1
BOND N   HN   N   CA   C   CA   C   +N
BOND CA  HA   CB  HB   CG1 HG11 CG1 HG12 CG2 HG21
BOND CG2 HG22 CG2 HG23 CD  HD1  CD  HD2  CD  HD3
DOUBLE O C
IMPR N -C CA HN  C CA +N O
DONOR HN N
ACCEPTOR O C
IC -C   CA   *N   HN    1.3470  124.1600   180.0000  114.1900  0.9978
IC -C   N    CA   C     1.3470  124.1600   180.0000  106.3500  1.5190
IC N    CA   C    +N    1.4542  106.3500   180.0000  117.9700  1.3465
IC +N   CA   *C   O     1.3465  117.9700   180.0000  120.5900  1.2300
IC CA   C    +N   +CA   1.5190  117.9700   180.0000  124.2100  1.4467
IC N    C    *CA  CB    1.4542  106.3500   124.2200  112.9300  1.5681
IC N    C    *CA  HA    1.4542  106.3500  -115.6300  106.8100  1.0826
IC N    CA   CB   CG1   1.4542  112.7900   180.0000  113.6300  1.5498
IC CG1  CA   *CB  HB    1.5498  113.6300   114.5500  104.4800  1.1195
IC CG1  CA   *CB  CG2   1.5498  113.6300  -130.0400  113.9300  1.5452
IC CA   CB   CG2  HG21  1.5681  113.9300  -171.3000  110.6100  1.1100
IC HG21 CB   *CG2 HG22  1.1100  110.6100   119.3500  110.9000  1.1102
IC HG21 CB   *CG2 HG23  1.1100  110.6100  -120.0900  110.9700  1.1105
IC CA   CB   CG1  CD    1.5681  113.6300   180.0000  114.0900  1.5381
IC CD   CB   *CG1 HG11  1.5381  114.0900   122.3600  109.7800  1.1130
IC CD   CB   *CG1 HG12  1.5381  114.0900  -120.5900  108.8900  1.1141
IC CB   CG1  CD   HD1   1.5498  114.0900  -176.7800  110.3100  1.1115
IC HD1  CG1  *CD  HD2   1.1115  110.3100   119.7500  110.6500  1.1113
IC HD1  CG1  *CD  HD3   1.1115  110.3100  -119.7000  111.0200  1.1103
First, we will split norleucine into several atom groups, each with an integer net charge and each resembling an atom group in either lysine or isoleucine. As shown in figure 8, the backbone group and the -CH2- groups are very similar to those of lysine, so we will use the corresponding lysine atom groups to assign atom types and charges for norleucine.
The terminal methyl group, -CH3, in norleucine is very similar to that of isoleucine. We will use those isoleucine group properties for the norleucine -CH3.
The figure below shows how we have grouped the norleucine molecule and how we will piece together its topology from existing topologies.
After piecing the molecule together from existing groups, we can create a topology entry for norleucine. It should look like the one below.
Norleucine Topology Entry:

RESI NLE          0.00
!!! Lysine groups:
GROUP
ATOM N    NH1    -0.47  !     |
ATOM HN   H       0.31  !  HN-N
ATOM CA   CT1     0.07  !     |   HB1 HG1 HD1 HE1
ATOM HA   HB      0.09  !     |   |   |   |   /
GROUP                   !  HA-CA--CB--CG--CD--CE--HE2
ATOM CB   CT2    -0.18  !     |   |   |   |   \
ATOM HB1  HA      0.09  !     |   HB2 HG2 HD2 HE3
ATOM HB2  HA      0.09  !   O=C
GROUP                   !     |
ATOM CG   CT2    -0.18
ATOM HG1  HA      0.09
ATOM HG2  HA      0.09
GROUP
ATOM CD   CT2    -0.18
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
GROUP
ATOM C    C       0.51
ATOM O    O      -0.51
!!! Isoleucine groups:
GROUP
ATOM CE   CT3    -0.27
ATOM HE1  HA      0.09
ATOM HE2  HA      0.09
ATOM HE3  HA      0.09
BOND CB  CA   CG  CB   CD  CG   CE  CD
BOND N   HN   N   CA   C   CA
BOND C   +N   CA  HA   CB  HB1  CB  HB2  CG  HG1
BOND CG  HG2  CD  HD1  CD  HD2
DOUBLE O C
BOND CE  HE1  CE  HE2  CE  HE3
IMPR N -C CA HN  C CA +N O
DONOR HN N
ACCEPTOR O C
!!! Internal coordinate entries
!!! Lysine ICs:
IC -C   CA   *N   HN    1.3482  123.5700   180.0000  115.1100  0.9988
IC -C   N    CA   C     1.3482  123.5700   180.0000  107.2900  1.5187
IC N    CA   C    +N    1.4504  107.2900   180.0000  117.2700  1.3478
IC +N   CA   *C   O     1.3478  117.2700   180.0000  120.7900  1.2277
IC CA   C    +N   +CA   1.5187  117.2700   180.0000  124.9100  1.4487
IC N    C    *CA  CB    1.4504  107.2900   122.2300  111.3600  1.5568
IC N    C    *CA  HA    1.4504  107.2900  -116.8800  107.3600  1.0833
IC N    CA   CB   CG    1.4504  111.4700   180.0000  115.7600  1.5435
IC CG   CA   *CB  HB1   1.5435  115.7600   120.9000  107.1100  1.1146
IC CG   CA   *CB  HB2   1.5435  115.7600  -124.4800  108.9900  1.1131
IC CA   CB   CG   CD    1.5568  115.7600   180.0000  113.2800  1.5397
IC CD   CB   *CG  HG1   1.5397  113.2800   120.7400  109.1000  1.1138
IC CD   CB   *CG  HG2   1.5397  113.2800  -122.3400  108.9900  1.1143
IC CB   CG   CD   CE    1.5435  113.2800   180.0000  112.3300  1.5350
!!! Isoleucine ICs:
IC CE   CG   *CD  HD1   1.5381  114.0900   122.3600  109.7800  1.1130
IC CE   CG   *CD  HD2   1.5381  114.0900  -120.5900  108.8900  1.1141
IC CG   CD   CE   HE1   1.5498  114.0900  -176.7800  110.3100  1.1115
IC HE1  CD   *CE  HE2   1.1115  110.3100   119.7500  110.6500  1.1113
IC HE1  CD   *CE  HE3   1.1115  110.3100  -119.7000  111.0200  1.1103
Recall that the ! sign is simply for comments, and helps to make the topology file more readable and user-friendly.
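Because every atom group is supposed to carry an integer charge, it can be worth checking the sums before using a hand-built entry. The following is a minimal sketch in plain Tcl (it can be pasted into the VMD TkCon window) that adds up the partial charges in each GROUP of the norleucine entry. The procedure name group_charges and the variable entry are hypothetical names chosen for this illustration; only the GROUP and ATOM lines of the entry are needed.

```tcl
# GROUP/ATOM lines copied from the norleucine entry (comments omitted).
set entry {
GROUP
ATOM N    NH1    -0.47
ATOM HN   H       0.31
ATOM CA   CT1     0.07
ATOM HA   HB      0.09
GROUP
ATOM CB   CT2    -0.18
ATOM HB1  HA      0.09
ATOM HB2  HA      0.09
GROUP
ATOM CG   CT2    -0.18
ATOM HG1  HA      0.09
ATOM HG2  HA      0.09
GROUP
ATOM CD   CT2    -0.18
ATOM HD1  HA      0.09
ATOM HD2  HA      0.09
GROUP
ATOM C    C       0.51
ATOM O    O      -0.51
GROUP
ATOM CE   CT3    -0.27
ATOM HE1  HA      0.09
ATOM HE2  HA      0.09
ATOM HE3  HA      0.09
}

# Hypothetical helper: return a list with the net charge of each GROUP.
proc group_charges {text} {
    set totals {}
    set current ""
    foreach line [split $text "\n"] {
        # Everything after "!" is a comment in a topology file; drop it.
        set bang [string first "!" $line]
        if {$bang >= 0} {
            set line [string range $line 0 [expr {$bang - 1}]]
        }
        set fields [regexp -inline -all {\S+} $line]
        if {[llength $fields] == 0} continue
        switch -- [lindex $fields 0] {
            GROUP {
                # A new group starts; store the previous total, if any.
                if {$current ne ""} { lappend totals $current }
                set current 0.0
            }
            ATOM {
                # Fields: ATOM <name> <type> <charge>
                if {$current eq ""} { set current 0.0 }
                set current [expr {$current + [lindex $fields 3]}]
            }
        }
    }
    if {$current ne ""} { lappend totals $current }
    return $totals
}

# Report each group; a small tolerance absorbs floating-point round-off.
set i 0
foreach q [group_charges $entry] {
    set ok [expr {abs($q - round($q)) < 1e-6}]
    puts [format "group %d: net charge %+.2f, integer: %d" [incr i] $q $ok]
}
```

The same procedure should work on a full entry pasted with its comments, since text after ! is stripped and lines other than GROUP and ATOM are ignored.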
Note that since we have pieced together the norleucine molecule from lysine and isoleucine, the IC records of norleucine also come in two parts: lysine-like and isoleucine-like. For the part of norleucine nearest the backbone, from carbon atom CA to CG, we use the ICs from lysine's topology entry, as shown. For the far end of the side chain, from CG to CE, we take the IC records of the corresponding isoleucine atoms: norleucine's CD and CE correspond to isoleucine's CG1 and CD, respectively.
Before simulating a protein with norleucine, you should add the above topology entry to your topology file and rename the file appropriately, as was done for D-alanine. The file top_all27_prot_lipid_nle.inp has been provided for you in the directory 3-solution/.
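The isoleucine-derived IC records are simply the isoleucine records with the atom names replaced by their norleucine counterparts. As an illustration, here is a minimal Tcl sketch of that renaming. The procedure rename_ic and the variable names are hypothetical, and the name mapping is read off by comparing the two topology entries. Names are replaced token by token rather than by global text substitution, so that CG1 -> CD followed by CD -> CE does not rename the same atom twice.

```tcl
# Hypothetical helper: rename the atoms in one IC line using a dictionary.
proc rename_ic {line mapping} {
    set out {}
    foreach tok [regexp -inline -all {\S+} $line] {
        # The third atom of an IC may carry a leading "*"; keep it aside.
        set star ""
        set name $tok
        if {[string index $tok 0] eq "*"} {
            set star "*"
            set name [string range $tok 1 end]
        }
        # Replace the name only if it appears in the mapping;
        # "IC" and the numeric fields pass through unchanged.
        if {[dict exists $mapping $name]} {
            set name [dict get $mapping $name]
        }
        lappend out $star$name
    }
    return [join $out " "]
}

# Isoleucine name -> norleucine name (assumed from the two entries).
set ile_to_nle {
    CB   CG
    CG1  CD
    CD   CE
    HG11 HD1
    HG12 HD2
    HD1  HE1
    HD2  HE2
    HD3  HE3
}

# The five isoleucine IC records reused for norleucine.
set ile_ics {
    "IC CD  CB  *CG1 HG11 1.5381 114.0900  122.3600 109.7800 1.1130"
    "IC CD  CB  *CG1 HG12 1.5381 114.0900 -120.5900 108.8900 1.1141"
    "IC CB  CG1 CD   HD1  1.5498 114.0900 -176.7800 110.3100 1.1115"
    "IC HD1 CG1 *CD  HD2  1.1115 110.3100  119.7500 110.6500 1.1113"
    "IC HD1 CG1 *CD  HD3  1.1115 110.3100 -119.7000 111.0200 1.1103"
}

foreach line $ile_ics {
    puts [rename_ic $line $ile_to_nle]
}
```

The output collapses column spacing to single spaces; the atom names and numeric values are what matter when comparing against the isoleucine-derived IC records in the norleucine entry.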
We have created a topology entry for norleucine by splitting it into two components, each of which is similar to an amino acid with a known topology entry. Now let's have a look at a specific protein that contains norleucine: the HIV-1 GAG (Group-specific Antigen) protein, which is a structural protein in HIV. The pdb file 1FGL.pdb contains a structure of HIV-1 GAG. Chain B is the part of HIV-1 GAG that contains norleucine.
Load the pdb file into VMD and set the Coloring Method to "ResType". The residues are now colored white, green, red, and blue, corresponding to nonpolar, polar, acidic, and basic residues, respectively. The only exception is norleucine, which is colored cyan because VMD has no information about its residue type; it is "unassigned".
With the topology file we just generated, we can build a psf file for the whole protein. First, load 1FGL.pdb into VMD and write a separate pdb file for each chain:
set chaina [atomselect top "chain A"]
$chaina writepdb 1FGL-chainA.pdb
set chainb [atomselect top "chain B"]
$chainb writepdb 1FGL-chainB.pdb
set water [atomselect top "water"]
$water writepdb 1FGL-water.pdb
The file nle.pgn is the psfgen script for building the psf and pdb file for the whole protein. Note that top_all22_prot_lipid_nle.inp is the topology file specified in nle.pgn. It is simply the file top_all22_prot_lipid_gsh.inp created earlier, but with our norleucine topology entry added. You can open it and browse the file by typing nedit top_all22_prot_lipid_nle.inp in a terminal window.
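For orientation, a psfgen script of this kind typically reads the topology file, defines one segment per pdb file, reads the coordinates, and writes the result. The sketch below is not the contents of the provided nle.pgn; it is a hypothetical outline assuming the three pdb files written above, with segment names (A, B, WAT) and alias lines chosen for illustration. It also assumes that the topology file defines the TIP3 water residue.

```tcl
# Hypothetical outline of a psfgen script; the provided nle.pgn may differ.
package require psfgen

# Topology file containing the added norleucine (NLE) entry.
topology top_all22_prot_lipid_nle.inp

# Common translations from PDB names to CHARMM names (assumed to be needed).
pdbalias residue HIS HSE
pdbalias atom ILE CD1 CD
pdbalias residue HOH TIP3
pdbalias atom HOH O OH2

# One segment per chain; the segment names are arbitrary labels.
segment A { pdb 1FGL-chainA.pdb }
coordpdb 1FGL-chainA.pdb A

segment B { pdb 1FGL-chainB.pdb }
coordpdb 1FGL-chainB.pdb B

# Water molecules are not bonded to each other, so disable autogeneration
# of angles and dihedrals for this segment.
segment WAT {
    auto none
    pdb 1FGL-water.pdb
}
coordpdb 1FGL-water.pdb WAT

# Build missing atoms (e.g. hydrogens) from the IC records, then write out.
guesscoord
writepsf nle.psf
writepdb nle.pdb
```

The guesscoord step is where the IC records of the norleucine entry are used: atoms absent from the pdb file, such as hydrogens, are placed from those internal coordinates.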
Now, generate the psf file for HIV-1 GAG by going to the VMD TkCon window and typing:
source nle.pgn
This generates the files nle.psf and nle.pdb, which you can use to run a NAMD simulation!