Re: GTX-660 Ti benchmark

From: Aron Broom (broomsday_at_gmail.com)
Date: Mon Sep 17 2012 - 20:56:41 CDT

guanglei,

Just a quick point about cards: keep in mind that the very
expensive workstation cards aren't actually any faster than their consumer
counterparts. For instance, comparing a GTX580 with an M2090, the 580 has
the same number of cores and actually faster clock and memory speeds. The
M2090 has more memory, and that memory has error-correcting code (ECC),
hence the extra bucks. For the Kepler series (I'm not sure the workstation
cards are out yet?) the consumer cards will also be faster than the
workstation ones, at least in terms of single precision, but I think it's
supposed to be the reverse for double precision.
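
To put a rough number on the GTX580 vs. M2090 comparison, here is a minimal sketch of the usual peak single-precision estimate (cores x shader clock x 2). The core counts and shader clocks are reference specs as I remember them, not figures from this thread, so treat them as assumptions and check them against the vendor data sheets.

```python
# Rough peak single-precision throughput: cores x shader clock x 2
# (one fused multiply-add counted as two floating-point operations).
# ASSUMPTION: the core counts and shader clocks below are reference specs
# recalled from vendor material, not numbers quoted in this thread.
CARDS = {
    "GTX 580": {"cores": 512, "shader_mhz": 1544},
    "Tesla M2090": {"cores": 512, "shader_mhz": 1300},
}


def peak_sp_gflops(cores, shader_mhz):
    """Theoretical peak in GFLOPS, counting an FMA as two operations."""
    return cores * shader_mhz * 2 / 1000.0


for name, spec in CARDS.items():
    print(f"{name:12s} {peak_sp_gflops(**spec):7.0f} GFLOPS")
```

On those assumed specs the 580 comes out roughly 19% ahead on paper. Real NAMD performance depends on much more than peak FLOPS, so this is only a sanity check on the "not any faster" claim.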

~Aron

On Mon, Sep 17, 2012 at 4:35 PM, Guanglei Cui
<amber.mail.archive_at_gmail.com> wrote:

> Hi Jason and Thomas,
>
> Thanks very much for your input. This is very useful, as I was
> struggling to gauge my expectations on the GPU workstation we have
> since I have no comparison. It seems Jason may have a similar hardware
> setup. The OS installed here is Centos5.8. I'm not sure if this
> matters.
>
> Thomas, if your timing was from 1GPU/1CPU, I'd be thoroughly upset
> 'cause that is almost twice as fast as I could get on a much more
> expensive card. Would you be able to share additional information on
> your OS and any configurations that matter?
>
> Regards,
> Guanglei
>
> On Sun, Sep 16, 2012 at 6:08 PM, Roberts, Jason <Jason.Roberts_at_mh.org.au>
> wrote:
> > Hi Guanglei,
> >
> > We are running a 2U rack (2x Xeon E5645, 4xM2090) and although I don't
> > have the same setup I ran the Apoa1 benchmark allocating 6 cores and 1
> > M2090 (./namd2 +idlepoll +p6 +devices 0 apoa1.namd > apoa1_6.out). The
> > default benchmark gave 0.049 s/step. I changed the outputEnergies and
> > outputTiming values to 1000 and extended the run to 10000 steps and got
> > 0.038 s/step.
> >
> > If I run the last simulation with 1 core and 1 GPU (./namd2 +idlepoll
> > +p1 +devices 0 apoa1.namd > apoa1_1.out) I get 0.122 s/step.
> >
> > Hope this helps.
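
For comparing the single-M2090 numbers quoted in this thread, a small sketch of the ratios may help. The s/step values are copied from the messages; the labels and the `speedup` helper are only illustrative names.

```python
# s/step values for Apoa1 on one M2090, as quoted in this thread.
TIMINGS = {
    "Jason, 6 cores + 1 GPU (default benchmark)": 0.049,
    "Jason, 6 cores + 1 GPU (outputs every 1000 steps)": 0.038,
    "Jason, 1 core + 1 GPU": 0.122,
    "Guanglei, 1 core + 1 GPU": 0.11,
}


def speedup(slow_sec_per_step, fast_sec_per_step):
    """How many times faster the second timing is than the first."""
    return slow_sec_per_step / fast_sec_per_step


baseline = TIMINGS["Jason, 1 core + 1 GPU"]
for label, sec_per_step in TIMINGS.items():
    print(f"{label:50s} {sec_per_step:.3f} s/step  "
          f"{speedup(baseline, sec_per_step):.2f}x vs 1 core + 1 GPU")
```

Going from 1 core to 6 cores on the same GPU is roughly a 3.2x gain in these numbers, which suggests the 1 core + 1 GPU timings of 0.11 to 0.12 s/step are limited by the CPU side rather than by the card.
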
> >
> > PS, if anyone is interested, I ran multiple simultaneous runs with
> > different combinations of CPU and GPU allocations and obtained the
> > following results:
> >
> > Apoa1 (10,000 steps, timestep = 1, outputs every 1000 steps)
> > 1 run (12xThreads 4xM2090) = 0.015 s/step
> > 1 run (24xThreads 4xM2090) = 0.016 s/step
> > 2 runs (6xThreads, 2xM2090) each = 0.027 s/step
> > 2 runs (12xThreads, 4xM2090 shared) = 0.026 s/step
> > 4 runs (3xThreads, 1xM2090) each = 0.051 s/step
> > 4 runs (6xThreads, 4xM2090 shared) = 0.046 s/step
> > 8 runs (3xThreads, 4xM2090 shared) = 0.088 s/step
> >
> > (Hyperthreading is ON)
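
The table above can also be read as total throughput for the whole box. A minimal sketch, assuming each listed s/step applies to every simultaneous run in that row and that "timestep = 1" means 1 fs (neither is stated explicitly):

```python
# (label, simultaneous runs, s/step per run), copied from the table above.
RESULTS = [
    ("1 run, 12 threads, 4x M2090", 1, 0.015),
    ("1 run, 24 threads, 4x M2090", 1, 0.016),
    ("2 runs, 6 threads + 2x M2090 each", 2, 0.027),
    ("2 runs, 12 threads, 4x M2090 shared", 2, 0.026),
    ("4 runs, 3 threads + 1x M2090 each", 4, 0.051),
    ("4 runs, 6 threads, 4x M2090 shared", 4, 0.046),
    ("8 runs, 3 threads, 4x M2090 shared", 8, 0.088),
]

SECONDS_PER_DAY = 86400.0


def aggregate_steps_per_second(n_runs, sec_per_step):
    """Total steps per second summed over all simultaneous runs."""
    return n_runs / sec_per_step


def ns_per_day(sec_per_step, timestep_fs=1.0):
    """Simulated ns per day for one run. ASSUMPTION: timestep is in fs."""
    return SECONDS_PER_DAY / sec_per_step * timestep_fs * 1e-6


for label, n_runs, sec_per_step in RESULTS:
    total = aggregate_steps_per_second(n_runs, sec_per_step)
    print(f"{label:38s} {total:5.1f} steps/s total, "
          f"{ns_per_day(sec_per_step):.2f} ns/day per run")
```

Under those assumptions the eight-run setup gives about 91 steps/s in total against about 67 steps/s for the single 12-thread run, so packing more jobs onto the node raises total throughput even though each individual job gets slower.
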
> >
> > Cheers,
> >
> > Jason A. Roberts
> > Senior Medical Scientist
> > National Enterovirus Reference Laboratory
> > WHO Poliomyelitis Regional Reference Laboratory
> > VIDRL, 10 Wreckyn Street,
> > North Melbourne, Australia, 3051
> > Phone: +613 9342 2607
> > Fax: +613 9342 2665
> > email: polio_at_mh.org.au (lab enquiries)
> > web site: www.vidrl.org.au
> >
> > Date: Fri, 14 Sep 2012 09:50:41 -0400
> > From: Guanglei Cui <amber.mail.archive_at_gmail.com>
> > Subject: Re: namd-l: GTX-660 Ti benchmark
> >
> > Hi,
> >
> > I'm curious what kind of performance I should expect from a M2090 card
> > (Intel Xeon X5670, CentOS 5.8). With 1 CPU and 1GPU, I get 0.11 s/step on
> > Apoa1 (2000 steps, timestep 1) using the namd2.9 multicore CUDA binary from
> > the NAMD website. I suspect this is a reasonable speed. I wonder if someone
> > would kindly point out what a reasonable expectation is for this type of
> > setup, and how to achieve that. Thanks very much.
> >
> > Guanglei
> >
> > On Thu, Sep 13, 2012 at 11:10 PM, Wenyu Zhong <wenyuzhong_at_gmail.com> wrote:
> >> Sorry, a correction.
> >>
> >> The power consumption with i5@3.7G+660ti running apoa1 is about 200w,
> >> and with i5@3.7G+2*460 is about 260w.
> >>
> >> Wenyu
> >
> >
> >
> > - --
> > Guanglei Cui
> >
> >
>
>
>
> --
> Guanglei Cui
>
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
