From: Leandro Martínez (leandromartinez98_at_gmail.com)
Date: Wed Oct 22 2008 - 13:53:20 CDT

In fortran77, if you write a program to read a DCD file and you
want to use it in a 64 bit machine, you need to compile it
with the "-m32" flag.

Just to add some information on this, when using Fortran90 (or latter
versions), the problem of the size of the record is not present anymore,
because the size of the variables is defined independently on the
platform (that is, a double precision variable is the same size in
a 32 or 64 machine in Fortran90+). I think you can also deal with
the endianess problem in these newer fortran versions, but that
will be more tricky as it is in the other languages (and I have never
faced that problem, probably because the scope of architectures
I use is limited).

Leandro.

On Wed, Oct 22, 2008 at 3:49 PM, Axel Kohlmeyer
<akohlmey_at_cmm.chem.upenn.edu> wrote:
> On Wed, 22 Oct 2008, Tibbitt, Jeffrey A. wrote:
>
>
> hi,
>
> please let me point out that there is a significant difference
> between reading binary files from fortran and from c/c++/perl/etc.
> the fortran code only works, if the platform reading the .dcd file
> is equivalent to the way it is written.
>
> .dcd is a binary format, and thus "endianness" of a machine matters.
> to explain: there are some machines that store a (32-bit) integer
> like 66100 as the four bytes 0, 1, 2, 52 and others that would do
> 52, 2, 1, 0. so you have to know which machine a dcd file was written
> on or what endiannes it was encoded into (some fortran compilers can
> swap on the fly based on compiler flags and environment variables).
>
> the really nasty part is, that fortran binary files do not only contain
> the data that is written, but a record length indicator that tells
> the fortran binary reader how large a block (a single write) exactly is.
>
> to explain: in fortran you can do on writing (in unformatted binary).
>
> write(8) a, b, c
> write(8) d, e, f
> write(8) (g(i),i=1,100)
>
> if you are only interested in a part of the data you can do:
>
> read(8) a, b
> read(8) d
> read(8) (g(i),i=1,50)
>
> and it will _still_ read what was in "a" into "a" and what was
> in "d" into "d" and so on.
>
> now the size of this record length indicator can be set manually
> for some compilers or is dependent on the platform (most have 32-bit
> even on 64-bit machines, some have 64-bit if their integer type is
> 64-bit by default). if you try to read a .dcd file with fortran, it
> will either work or fail miserably (frequenly with a "failed to seek
> to the end of block" or a "file too short" type error message or
> even a segmentation fault), depending on your platform. in c you
> have a bit more flexibility and the same applied to perl, but the
> correct handling of all these issues is tricky.
>
> please have a look at the dcdplugin source code of the molfile_plugin
> package that reads .dcd files for VMD (and after many, many tests is
> almost perfect at handling endiannes and record size length
> automatically and correctly...
>
> cheers,
> axel.
>
>
> JT> Jonathan,
> JT> My FORTRAN code for reading an entire trajectory is different than what you have and may be easier to follow. It is very short and well commented. I'll explain it. The lines beginning with a c are comments. The first line is just a my name for the program. The second group of lines declare the variables. Here you notice the * in different places. It specifies the sizes of the data types read in (here character*4 means a word 4 characters in length). It works similarly for integers, but it signifies bytes (I think) when applied to real numbers. The third group of lines open up the trajectory file and start reading it from the beginning. Other information is stored at the beginning before the coordinates. The fourth group of lines loop through the frames and read the coordinates.
> JT>
> JT> The first thing in a trajectory to read is a word (here I call it header), 4 characters in length, containing the characters: CORD or VELO. It should be one of these two only and defines the trajectory as containing coordinates or velocities.
> JT>
> JT> The second thing in the trajectory coming right after that is a 20 dimensional array of integers (called control). This contains information (control(1)=number of frames, control(2)=number of previous dynamic steps, control(3)=skip frequency for writing frames, control(4)=number of dynamics steps).
> JT>
> JT> The third thing read is an integer (called ntitle). This is the dimension of the 4th thing read. This dimension is the same as the number of title lines written out by the MD software.
> JT>
> JT> The fourth thing read is an array of characters of dimension ntitle (the dimension previously read). This array is called title here.
> JT>
> JT> The fifth thing read is an integer stating the number of atoms (called natom here).
> JT>
> JT> The sixth thing read are the coordinates. First frame coordinates come first. But all the x-coordinates come first, then all the ys then all the zs. First, all the x coordinates for the first frame are read. It is read in at once as an array natom in length (here I call it xx(natom)). Next, all the y coordinates of the first frame are read (yy(natom)). Then all the z coordinates are read (zz(natom). Repeat reading the coordinates with the second frame.
> JT>
> JT> I've gone on at length here, because I know it is invaluable to have your own code to get at the information in the DCD trajectory. And I've seen this question before. Note that this is used only for a simple DCD trajectory with no fixed atoms. They are a little different if they contain that sort of stuff. I am not an expert at DCD format, but I know that this format works for my trajectories. For fixed atoms and other stuff included someone else can help. But hopefully this helps you in constructing one in Perl for yourself.
> JT>
> JT> Jeff Tibbitt
> JT>
> JT>
> JT> ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
> JT> program readdcd
> JT>
> JT> c TRAJECTORY VARIABLES
> JT> character*4 header
> JT> integer control(20)
> JT> integer i,j,ntitle,natom,nf
> JT> character*80 title(4)
> JT> real*4 xx(natom)
> JT> real*4 yy(natom)
> JT> real*4 zz(natom)
> JT>
> JT> c OPEN TRAJECTORY FILE
> JT> open(unit=56,file="tmp.dcd",status='old',form='unformatted')
> JT> c READ IN HEADER AND CONTROL INFO
> JT> read(56) header,control
> JT> c READ IN THE TITLE LINES
> JT> read(56) ntitle,(title(j),j=1,ntitle)
> JT> c READ IN NUMBER OF ATOMS
> JT> read(56) natom
> JT>
> JT> c LOOP OVER FRAMES
> JT> do i=1,nf
> JT> c READ IN COORDINATES
> JT> read(56) xx
> JT> read(56) yy
> JT> read(56) zz
> JT> enddo
> JT>
> JT> end program
> JT> cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
> JT>
> JT>
> JT>
> JT>
> JT>
> JT>
> JT>
> JT>
> JT>
> JT> ________________________________________
> JT> From: owner-vmd-l_at_ks.uiuc.edu [owner-vmd-l_at_ks.uiuc.edu] On Behalf Of JONATHAN BLACK [nihilique_at_hotmail.com]
> JT> Sent: Wednesday, October 22, 2008 5:30 AM
> JT> To: vmd-l_at_ks.uiuc.edu
> JT> Subject: vmd-l: dcd format
> JT>
> JT> actually i have to write a perl script that can do that as part of a larger program. i've found a fortran script that can do that but it will take me a while to "decode" how it does what it does since i don't know fortran. however i would like to ask for the format with which a dcd is packed. i've found this :
> JT>
> JT> #===DCD=FORMAT==============================================
> JT> #HDR NSET ISTRT NSAVC 5-ZEROS NATOM-NFREAT DELTA 9-ZEROS
> JT> #CORD files step1 step zeroes (zero) timestep zeroes
> JT> #C*4 INT INT INT 5INT INT DOUBLE 9INT
> JT> # [CHARACTER*20]
> JT> #===========================================================
> JT> #NTITLE TITLE
> JT> #INT C*MAXTITL
> JT> #C*2 C*80
> JT> #===========================================================
> JT> #NATOM
> JT> #INT
> JT> #===========================================================
> JT> #X(I), I=1,NATOM (DOUBLE)
> JT> #Y(I), I=1,NATOM
> JT> #Z(I), I=1,NATOM
> JT> #===========================================================
> JT>
> JT> as part of another script but i don't know if it's true and accurate, and even if it is, i can't understand some parts of it (eg : character*20 at 5th line). Any guidelines would be appreciated. Thank you...
> JT>
> JT>
> JT>
> JT> ----------------------------------------
> JT> > From: gillescche_at_gmail.com
> JT> > Subject: Re: vmd-l:
> JT> > Date: Tue, 21 Oct 2008 08:22:01 -0400
> JT> > To: nihilique_at_hotmail.com
> JT> >
> JT> > Since you want the information contained in the dcd file I would use
> JT> > catdcd to write out the coordinates of a certain frame in some format
> JT> > you can use such as pdb or xyz.
> JT> >
> JT> > You can find catdcd here:
> JT> >
> JT> > http://www.ks.uiuc.edu/Development/MDTools/catdcd/
> JT> >
> JT> > If you want to include the files in a script I know there are some
> JT> > fortran reads that are given on the mailing list.
> JT> >
> JT> > Best
> JT> > Chris
> JT> >
> JT> > On Oct 21, 2008, at 5:16 AM, JONATHAN BLACK wrote:
> JT> >
> JT> >>
> JT> >> hi guys. i am a biologist and i haven't been programming in perl
> JT> >> for a long time so my question may seem a little naive but i'm at a
> JT> >> total loss and any help would be appreciated. i need a perl script
> JT> >> to convert some binary files (.dcd if anyone knows what they are)
> JT> >> to ascii or just extract the ascii that they contain. i tried plain
> JT> >> read with no results. i searched the net and i only found this code :
> JT> >>
> JT> >> use warnings;
> JT> >> use strict;
> JT> >> @ARGV == 2 or die "usage: $0 in_filename out_filename\n";
> JT> >> # get first argument, i.e filename my $in_filename = shift;
> JT> >> print "You chose input \n";
> JT> >> my $out_filename = shift;
> JT> >> print "You chose output \n";
> JT> >> #set infile to binary mode open INFILE, '', $out_filename or die
> JT> >> "can't open $out_filename: $!";
> JT> >> # read 8 bytes at a time
> JT> >> $/ = \8;
> JT> >> while ( )
> JT> >> {
> JT> >> print OUTFILE join( ', ', map sprintf( '0x%04x', $_ ), unpack 'S*',
> JT> >> $_ + ), "\n";
> JT> >> }
> JT> >>
> JT> >> but when i run it it comes out with four columns of something
> JT> >> seeming like hexadecimal (0xe15d ...). The dcd files are about 700
> JT> >> mb so i can't upload it for you to see it, but i managed to cut the
> JT> >> first 80kb of it with hexedit and i upload it here: http://
> JT> >> www.gigasize.com/get.php?d=7qfs5f331bf Does anyone have any idea or
> JT> >> some plain guidelines because i really need get it done for my
> JT> >> work. Thanks anyway!
> JT> >> _________________________________________________________________
> JT> >> You live life beyond your PC. So now Windows goes beyond your PC.
> JT> >> http://clk.atdmt.com/MRT/go/115298556/direct/01/
> JT> >
> JT>
> JT> _________________________________________________________________
> JT> You live life beyond your PC. So now Windows goes beyond your PC.
> JT> http://clk.atdmt.com/MRT/go/115298556/direct/01/
> JT> _________________________________________________________________
> JT> Stay organized with simple drag and drop from Windows Live Hotmail.
> JT> http://windowslive.com/Explore/hotmail?ocid=TXT_TAGLM_WL_hotmail_102008
> JT>
>
> --
> =======================================================================
> Axel Kohlmeyer akohlmey_at_cmm.chem.upenn.edu http://www.cmm.upenn.edu
> Center for Molecular Modeling -- University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.
>