From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Mar 23 2012 - 02:09:28 CDT
Tru,
Unfortunately you still haven't executed a CUDA binary that does real work,
just a tool that talks to the driver to get some data. Please run a program
that does real computation on the GPU, one that has to allocate the GPU and
use it, so we can clearly figure out whether it is a NAMD problem or a
general one.
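As a rough illustration of what "allocating the GPU" means, here is a minimal sketch in Python that loads the CUDA driver library through ctypes, creates a context on one device, and allocates and frees a small buffer. It does not launch a kernel, so it is a weaker check than a real compute program. The function name `try_gpu_alloc`, its parameters, and the library name `libcuda.so.1` are assumptions made for this sketch, not something taken from NAMD or the CUDA SDK samples.

```python
import ctypes


def try_gpu_alloc(device_index=0, nbytes=1 << 20, libname="libcuda.so.1"):
    """Try to create a CUDA context and allocate device memory.

    Returns (ok, message). The library name is an assumption; adjust it
    to whatever the installed driver provides.
    """
    try:
        cuda = ctypes.CDLL(libname)
    except OSError as exc:
        return False, f"cannot load {libname}: {exc}"

    def check(code, what):
        # CUDA_SUCCESS is 0 in the driver API; anything else is an error.
        if code != 0:
            raise RuntimeError(f"{what} failed with CUresult {code}")

    try:
        check(cuda.cuInit(0), "cuInit")

        count = ctypes.c_int(0)
        check(cuda.cuDeviceGetCount(ctypes.byref(count)), "cuDeviceGetCount")
        if device_index >= count.value:
            raise RuntimeError(f"only {count.value} device(s) visible")

        device = ctypes.c_int(0)
        check(cuda.cuDeviceGet(ctypes.byref(device), device_index), "cuDeviceGet")

        context = ctypes.c_void_p()
        check(cuda.cuCtxCreate_v2(ctypes.byref(context), 0, device), "cuCtxCreate")
        try:
            # CUdeviceptr is a 64-bit integer on x86_64 with the _v2 entry points.
            dptr = ctypes.c_uint64(0)
            check(cuda.cuMemAlloc_v2(ctypes.byref(dptr), ctypes.c_size_t(nbytes)),
                  "cuMemAlloc")
            check(cuda.cuMemFree_v2(dptr), "cuMemFree")
        finally:
            cuda.cuCtxDestroy_v2(context)
    except (RuntimeError, AttributeError) as exc:
        return False, str(exc)

    return True, f"allocated and freed {nbytes} bytes on device {device_index}"
```

Running `print(try_gpu_alloc())` once as the LDAP-only user and once as a local user would show whether context creation already fails outside of NAMD.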
Norman Geist.
> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Tru Huynh
> Sent: Thursday, 22 March 2012 20:29
> To: Norman Geist
> Cc: Namd Mailing List
> Subject: Re: namd-l: Linux-x86_64-CUDA version 2.8 on CentOS-5 x86_64
> non local user issue?
> 
> Hi,
> 
> thanks for looking at that issue,
> 
> On Wed, Mar 21, 2012 at 08:02:41AM +0100, Norman Geist wrote:
> > Tru,
> >
> > nvidia-smi is not a cuda program, it's just a driver utility. Please check
> > if you can run other cuda programs, maybe one example from the cuda sdk,
> from an LDAP-only user account:
> /c5/shared/cuda/4.1.28/C/bin/linux/release/deviceQuery
> [deviceQuery] starting...
> 
> /c5/shared/cuda/4.1.28/C/bin/linux/release/deviceQuery Starting...
> 
>  CUDA Device Query (Runtime API) version (CUDART static linking)
> 
> Found 2 CUDA Capable device(s)
> 
> Device 0: "Tesla M2090"
>   CUDA Driver Version / Runtime Version          4.1 / 4.1
>   CUDA Capability Major/Minor version number:    2.0
>   Total amount of global memory:                 5375 MBytes (5636554752 bytes)
>   (16) Multiprocessors x (32) CUDA Cores/MP:     512 CUDA Cores
>   GPU Clock Speed:                               1.30 GHz
>   Memory Clock rate:                             1848.00 Mhz
>   Memory Bus Width:                              384-bit
>   L2 Cache Size:                                 786432 bytes
>   Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       49152 bytes
>   Total number of registers available per block: 32768
>   Warp size:                                     32
>   Maximum number of threads per block:           1024
>   Maximum sizes of each dimension of a block:    1024 x 1024 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             512 bytes
>   Concurrent copy and execution:                 Yes with 2 copy engine(s)
>   Run time limit on kernels:                     No
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   Yes
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                Yes
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      Yes
>   Device PCI Bus ID / PCI location ID:           2 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> Device 1: "Tesla M2090"
>   CUDA Driver Version / Runtime Version          4.1 / 4.1
>   CUDA Capability Major/Minor version number:    2.0
>   Total amount of global memory:                 5375 MBytes (5636554752 bytes)
>   (16) Multiprocessors x (32) CUDA Cores/MP:     512 CUDA Cores
>   GPU Clock Speed:                               1.30 GHz
>   Memory Clock rate:                             1848.00 Mhz
>   Memory Bus Width:                              384-bit
>   L2 Cache Size:                                 786432 bytes
>   Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       49152 bytes
>   Total number of registers available per block: 32768
>   Warp size:                                     32
>   Maximum number of threads per block:           1024
>   Maximum sizes of each dimension of a block:    1024 x 1024 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             512 bytes
>   Concurrent copy and execution:                 Yes with 2 copy engine(s)
>   Run time limit on kernels:                     No
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   Yes
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                Yes
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      Yes
>   Device PCI Bus ID / PCI location ID:           3 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.1, CUDA Runtime Version = 4.1, NumDevs = 2, Device = Tesla M2090, Device = Tesla M2090
> [deviceQuery] test results...
> PASSED
> 
> > exiting in 3 seconds: 3...2...1...done!
> 
> 
> > It looks like your user has no permission to list the available devices. So
> > check what is the difference between local users and non-local
> > (bashrc,LD_LIBRARY_PATH...,cuda-toolkit).
> It's the same $HOME and the same user; the only difference is adding that
> user to /etc/passwd.
> 
> I can reproducibly:
> 1) su - that user
> 2) fail to run namd but run deviceQuery (/dev/nvidia* are 666)
> 3) on another shell as root, just add that user to /etc/passwd and /etc/shadow
> 4) successfully run namd (on the same shell that failed in 2) by hitting <up><return>
> 5) remove the user from /etc/passwd and /etc/shadow
> 6) fail again to run namd as that user (same shell/window as 4) by hitting <up><return>
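The reproduction steps above toggle one thing only: whether the account has an entry in the local /etc/passwd or is known solely through the name service (LDAP here). A small sketch of how that state could be recorded at each step follows. The helper names `in_local_passwd`, `resolvable_via_nss`, and `describe_user` are hypothetical, and the sketch only reports where the account is visible; it makes no claim about why NAMD behaves differently.

```python
import pwd


def in_local_passwd(username, passwd_path="/etc/passwd"):
    """Return True if passwd_path has an entry whose first field is username."""
    with open(passwd_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            if line.split(":", 1)[0] == username:
                return True
    return False


def resolvable_via_nss(username):
    """Return True if the C library can resolve username (files, LDAP, ...)."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        return False
    return True


def describe_user(username, passwd_path="/etc/passwd"):
    """Classify the account as 'local', 'nss-only', or 'unknown'."""
    if in_local_passwd(username, passwd_path):
        return "local"
    if resolvable_via_nss(username):
        return "nss-only"
    return "unknown"
```

Calling `describe_user("someuser")` after steps 1, 3, and 5 should report `nss-only`, `local`, and `nss-only` again if the account lookup is the only thing that changes between the failing and the working runs.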
> 
> > Check with "ldd namd2" if local and non-local users use the same shared
> > libraries.
> yes, nothing changed.
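To make the "ldd namd2" comparison less dependent on reading two listings by eye, the output captured under each account could be compared mechanically. The following is a sketch only: `parse_ldd` and `diff_ldd` are hypothetical helper names, and the parser assumes the usual `name => path (address)` layout printed by glibc's ldd. Load addresses are ignored because they can differ between runs.

```python
def parse_ldd(output):
    """Map each library name in ldd output to the path it resolved to."""
    libs = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing load address, e.g. "(0x00002b...)".
        if line.endswith(")") and "(" in line:
            line = line[: line.rindex("(")].strip()
        if "=>" in line:
            name, _, path = line.partition("=>")
            # path is "" for the vdso and "not found" for a missing library.
            libs[name.strip()] = path.strip()
        else:
            # Lines such as the dynamic loader have no "=>" part.
            libs[line] = line
    return libs


def diff_ldd(first, second):
    """Return {name: (path_in_first, path_in_second)} for entries that differ."""
    a, b = parse_ldd(first), parse_ldd(second)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

With the output saved to one text file per account, an empty result from `diff_ldd(text_local, text_ldap)` would confirm that both accounts resolve the same libraries.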
> 
> Tru
> --
> Dr Tru Huynh          | http://www.pasteur.fr/recherche/unites/Binfs/
> mailto:tru_at_pasteur.fr | tel/fax +33 1 45 68 87 37/19
> Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15
> France