Re: AW: 2CPU+1GPU vs 1CPU+2GPU

From: Marcel UJI (IMAP) (arzo_at_uji.es)
Date: Tue Feb 14 2012 - 05:07:26 CST

Dear Nicholas,

Please be assured that I am not at all convinced that memory
errors can have a major effect on molecular dynamics results. In any
case, to protect our investment against the possible
existence of those errors, and considering that I'm not the only
user, I decided on the option I feel safer with. Certainly, if the system
were for my personal use only, I would go for the cards without ECC.

In any case, I think this discussion has been very rewarding, and I will
follow any news about this problem.

Thank you for your comments

Marcel

On 14/02/12 11:04, Nicholas M Glykos wrote:
> Dear Marcel, Norman, List,
>
> I'll play devil's advocate, bear with me. Measuring (and demonstrating)
> memory errors with memtest does nothing to answer the important question:
> Do these errors change the average long-term behaviour (and derived
> quantities) of the simulations, or do they just add (as white noise) another
> source of chaotropic behaviour in an already chaotic system? I would
> argue that if the memory errors are truly random, then they cannot be
> correlated with the aim of any given simulation and, thus, cannot be
> held responsible for things working out "incredibly great" or otherwise.
> If I were to offer an example in support of this thesis, I would probably
> quote the results obtained on folding simulations by the Shaw group (the
> Science 2010 paper) using the Anton machine which to my knowledge (please
> do correct me if I'm wrong) does not use ECC memory. Although I'm not
> advocating the incorporation of avoidable errors in calculations, I do
> feel that solid evidence for the effect of these errors on the MD-derived
> quantities is missing.
>
> My two cents,
> Nicholas
>
>
>
> On Tue, 14 Feb 2012, Marcel UJI (IMAP) wrote:
>
>
>> Yes, I have found other sources with similar results (see
>> http://www.cs.stanford.edu/people/ihaque/talks/resilience-2010.pdf), so
>> I think I will finally go for those Tesla cards.
>>
>> Thank you all for your help!
>>
>> Marcel
>>
>> On 14/02/12 08:18, Norman Geist wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> I just wanted to add that I was pretty surprised when I first saw the
>>> ECC error counters on my Tesla C2050. Admittedly, what I saw is the
>>> total of double-bit errors, and I never investigated when they occur.
>>> Still, I would go without ECC only with some misgivings: anything that
>>> fails or behaves strangely in your simulations, or even anything that
>>> works incredibly great, could be an artifact of memory errors. That
>>> might sound a little overdone, but it is possible. What else was ECC
>>> developed for, if not reliability? I'm really not sure how much
>>> influence those errors can have, but with ECC you have one
>>> thing less to check when problems occur.
>>>
>>>
>>>
>>> Best wishes
>>>
>>>
>>>
>>> Norman Geist.
>>>
>>>
>>>
>>> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
>>> behalf of* Ajasja Ljubetic
>>> *Sent:* Monday, 13 February 2012 16:09
>>> *Cc:* Marcel UJI (IMAP); namd-l_at_ks.uiuc.edu
>>> *Subject:* Re: namd-l: 2CPU+1GPU vs 1CPU+2GPU
>>>
>>>
>>>
>>> One final thing. I've done some benchmarking with an AMD 6-core
>>> desktop and a GTX-570, and it ends up being about equal to (slightly
>>> faster than) a 6-core Xeon with an M2070. You can buy a 3GB
>>> GTX580 for a fraction of the price of an M series card, and an AMD
>>> CPU (particularly the 3 GHz 6-core Thubans) will be close to half
>>> the price of the Intel. While I'm sure the Intel chip is
>>> generally superior to the AMD one, it doesn't seem to be a factor
>>> when running NAMD. So I would say buy two desktops and save
>>> yourself money and also gain performance. I know there is the
>>> lack of ECC memory with the GTX series, but I'm really not
>>> convinced that is a big issue for MD (maybe someone on the list
>>> has a different opinion).
>>>
>>>
>>>
>>> I have been running my simulations on several GTX 560 Ti cards for half
>>> a year now, and it works great! So I would back up this advice.
>>>
>>>
>>>
>>> Best regards,
>>>
>>> Ajasja
>>>
>>>
>>>
>>> ~Aron
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Feb 13, 2012 at 6:44 AM, Nicholas M Glykos
>>> <glykos_at_mbg.duth.gr> wrote:
>>>
>>>
>>>
>>> You will (hopefully) hear from Axel on this, but :
>>>
>>>
>>> > as it would give more speed for our NAMD based simulations
>>>
>>> Is this an assumption or the result of benchmarking the two hardware
>>> configurations with your intended system sizes? For small (atom-wise)
>>> systems, you shouldn't expect much improvement from increasing the
>>> number of GPUs (and for tiny systems the 1CPU+2GPU may not scale at all).
>>>
>>> My two cents,
>>> Nicholas
>>>
>>>
>>> --
>>>
>>>
>>> Nicholas M. Glykos, Department of Molecular Biology
>>> and Genetics, Democritus University of Thrace, University Campus,
>>> Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office)
>>> +302551030620,
>>> Ext.77620, Tel (lab) +302551030615,
>>> http://utopia.duth.gr/~glykos/
>>>
>>>
>>>
>>> --
>>> Aron Broom M.Sc
>>> PhD Student
>>> Department of Chemistry
>>> University of Waterloo
>>>
>>>
>>>
>>>
>>
>>
>

-- 
Dr. Marcel Aguilella-Arzo
Professor Titular d'Universitat, Física Aplicada
Departament de Física
Escola Superior de Tecnologia i Ciències Experimentals
Universitat Jaume I
Av. Sos Baynat, s/n
12071 Castelló de la Plana (Spain)
+34 964 728 046
arzo_at_fca.uji.es
