From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Mon Apr 11 2011 - 16:34:22 CDT
On Mon, Apr 11, 2011 at 5:21 PM, Gianluca Interlandi
<gianluca_at_u.washington.edu> wrote:
>> blades generally tend to be a bit slower than "normal" nodes.
>>
>> i am attaching some graphs with performance numbers from some local
>> machine(s).
>> note that the $$$ estimate includes a share of the (rather expensive) QDR
>> infiniband infrastructure switch and the IB HCAs. that - of course - works
>> in favor of the 48-core and GPU nodes in the performance per $$ category,
>> but also reduces their scaling capability.
>
> Thanks Axel for the nice plots. It seems that Intel does a better job with
> scalability than AMD. Do they have a lower latency? Why do you say that
no. the main difference is contention for the communication links.
in the intel nodes, you have one infiniband port per 12 cores;
in the amd nodes, you have one port per 48 cores. also, if my
math is right, the intel cpus have a bit more memory bandwidth
per core than the amds.
that is why i am saying that the "manycore" nodes are a good
replacement for a small cluster (they _are_ a small cluster in a box).
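[editorial note: to put rough numbers on the bandwidth argument above, here is a back-of-the-envelope sketch. the figures are illustrative assumptions for hardware of that era (triple-channel DDR3-1333 per 6-core westmere socket, quad-channel DDR3-1333 per 12-core magny-cours socket, roughly 4 GB/s usable per QDR infiniband port), not measured values; check your actual hardware.]

```python
# back-of-the-envelope: per-core shares of memory and interconnect
# bandwidth. all figures below are illustrative assumptions, not
# measurements from the clusters discussed in this thread.

DDR3_1333_GBS = 10.667  # assumed GB/s per DDR3-1333 memory channel

# memory bandwidth per core:
# intel: 3 channels shared by 6 cores per socket
# amd:   4 channels shared by 12 cores per socket
intel_mem_per_core = 3 * DDR3_1333_GBS / 6
amd_mem_per_core = 4 * DDR3_1333_GBS / 12

QDR_IB_GBS = 4.0  # assumed usable GB/s per QDR infiniband port

# interconnect share per core: one port per 12-core intel node,
# one port per 48-core amd node -- the contention argument.
intel_ib_per_core = QDR_IB_GBS / 12
amd_ib_per_core = QDR_IB_GBS / 48

print(f"memory  GB/s per core: intel {intel_mem_per_core:.1f}, amd {amd_mem_per_core:.1f}")
print(f"IB port GB/s per core: intel {intel_ib_per_core:.2f}, amd {amd_ib_per_core:.2f}")
```

under these assumptions, the intel nodes get roughly 5.3 vs 3.6 GB/s of memory bandwidth per core and four times the interconnect share per core, which is the point being made above.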
> blades are -in general- slower than nodes? You can get a quad core Intel
> Xeon also in a blade.
it is not the CPU, it is the whole infrastructure. blades are optimized
for a large CPU density per unit of space, and that results in restrictions
on how networks can be built, what type of CPU can be used, what memory,
and how everything is wired together. the differences are not always large,
but unless you are constrained in space or want the ease of management
of a blade enclosure, you'll get a better deal with 1U nodes. it is
more old-fashioned, but here that is a good thing.
[...]
> http://www.dell.com/us/en/enterprise/servers/blade/cp.aspx?refid=blade&s=biz&cs=555
>
> However, they tend to be loud and you need a dedicated room with air
> conditioning. Normal computer nodes are usually quiet and cool, so it's
> easier to find a location for them.
i would want to have neither where i can hear them.
and recently i almost always use earplugs when
i work on our clusters. after a couple of hours in the
cluster noise my IQ starts dropping (reversibly, i hope).
axel.
>
> Gianluca
>
>> i would expect you get the best bang for the buck using
>> 4-way 2.5GHz 8-core with 32GB (even 16GB would do
>> well for running only NAMD)
>>
>> axel.
>>
>>>
>>> Gianluca
>>>
>>>
>>>>
>>>> this way, you can save a lot of money by not needing a fast
>>>> interconnect.
>>>>
>>>> the second best option in my personal opinion would be a dual intel
>>>> westmere processor (4-core) based workstation with 4 GPUs. depending
>>>> on the choice of GPU, CPU, and amount of memory, that may be a bit
>>>> faster or slower than the 32-core AMD. the westmere has more memory
>>>> bandwidth and you can have 4x 16-x gen2 PCI-e to get the maximum
>>>> bandwidth to and from the GPUs. when using GeForce GPUs you are
>>>> taking a bit of a risk in terms of reliability, since there is no
>>>> easy way to tell if you have memory errors, but the Tesla "compute" GPUs will
>>>> bump up the price significantly.
>>>>
>>>>> If the funding is approved, we will have to build a small cluster
>>>>> on our own. There are not many people with expertise around who can
>>>>> help us. Any suggestion, help or information would be greatly appreciated.
>>>>
>>>> there is no replacement for knowing what you are doing and
>>>> some advice from a random person on the internet, like me,
>>>> is not exactly what i would call a trustworthy source that i
>>>> would unconditionally bet my money on. you have to run
>>>> tests yourself.
>>>>
>>>> i've been designing, setting up and running all kinds of linux
>>>> clusters for almost 15 years now and despite that experience,
>>>> _every_ time i have to start almost from scratch and evaluate
>>>> the needs and match them with available hardware options. one
>>>> thing that people often forget in the process is that they only
>>>> look at the purchasing price, but not the cost of maintenance
>>>> (in terms of time that the machine is not available, when it takes
>>>> long to fix problems, or when there are frequent hardware failures)
>>>> and the resulting overall "available" performance.
>>>>
>>>> i've also learned that you cannot trust any sales person.
>>>> they don't run scientific applications and have no clue
>>>> what you really need or not. similarly for the associated
>>>> "system engineers": they know very well how to rig and
>>>> install a machine, but they never have to _operate_
>>>> one (and often without expert staff, as is usual in many
>>>> academic settings).
>>>>
>>>> so the best you can do is to be paranoid, test for yourself
>>>> and make sure that the risk you are taking does not outweigh
>>>> the technical expertise that you have available to operate
>>>> the hardware.
>>>>
>>>> HTH,
>>>> axel.
>>>>
>>>>> Thanks a lot,
>>>>>
>>>>> HVS
>>>>>
>>>>
>>>> --
>>>> Dr. Axel Kohlmeyer
>>>> akohlmey_at_gmail.com http://goo.gl/1wk0
>>>>
>>>> Institute for Computational Molecular Science
>>>> Temple University, Philadelphia PA, USA.
>>>>
>>>>
>>>
>>> -----------------------------------------------------
>>> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
>>> +1 (206) 685 4435
>>> http://artemide.bioeng.washington.edu/
>>>
>>> Postdoc at the Department of Bioengineering
>>> at the University of Washington, Seattle WA U.S.A.
>>> -----------------------------------------------------
>>
>
--
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com http://goo.gl/1wk0

Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.
This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:20:07 CST