From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Tue Jun 14 2011 - 00:45:37 CDT
Hello:
On a gaming machine with the following hardware and software
Gigabyte GA 890FXAUD5
Six-core AMD PhenomII 1075T
2x GTX 470
NAMD_CVS-2011-06-04_Linux-x86_64-CUDA.tar.gz
Debian GNU-Linux amd64 wheezy
I was able to run plain MD without problems:
Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.00650811 s
Pe 5 sharing CUDA device 1 first 1 next 1
Pe 2 sharing CUDA device 0 first 0 next 4
Did not find +devices i,j,k,... argument, using all
Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Pe 0 sharing CUDA device 0 first 0 next 2
Pe 3 sharing CUDA device 1 first 1 next 5
Pe 1 sharing CUDA device 1 first 1 next 3
Pe 1 physical rank 1 binding to CUDA device 1 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Pe 3 physical rank 3 binding to CUDA device 1 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Pe 4 sharing CUDA device 0 first 0 next 0
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'GeForce GTX
470' Mem: 1279MB Rev: 2.0
Info: 1.64104 MB of memory in use based on CmiMemoryUsage
Info: Configuration file is min-02.conf
Yesterday the run failed with "CUDA error cudaStreamCreate". That was
resolved by visiting, step by step,
----/var/lib/dkms/nvidia/270.41.19/2.6.38-2-amd64/x86_64/module/nvidia.ko
and
----/lib/module/2.6.38-2-amd64/update/dkms/nvidia.ko
and (perhaps; I am unsure whether this last step was really carried out)
---reboot
after which the machine worked nicely on several different tasks all day
and all night.
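Since the fix above revolved around the nvidia.ko kernel module, one quick check is whether that module is actually loaded. A minimal sketch in Python, assuming the module is registered under the name "nvidia" in /proc/modules (the function names are mine, for illustration only):

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def nvidia_module_loaded(proc_modules_path="/proc/modules"):
    """True if a module named 'nvidia' is listed; False if absent or unreadable."""
    path = Path(proc_modules_path)
    if not path.exists():
        return False
    return "nvidia" in loaded_modules(path.read_text())


if __name__ == "__main__":
    print("nvidia module loaded:", nvidia_module_loaded())
```

If this reports False after a reboot, the module was probably not built or not loaded for the running kernel, which would be consistent with CUDA finding no usable device.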
Today the same error, "CUDA error cudaStreamCreate", appeared again, and
the procedure above, including a reboot, does not get NAMD running. The
log file says:
Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.0124412 s
Pe 5 sharing CUDA device 0 first 0 next 0
Pe 5 physical rank 5 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 5 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device
0): no CUDA-capable device is available
Did not find +devices i,j,k,... argument, using all
Pe 0 sharing CUDA device 0 first 0 next 1
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
Pe 3 sharing CUDA device 0 first 0 next 4
Pe 3 physical rank 3 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
Pe 1 sharing CUDA device 0 first 0 next 2
Pe 1 physical rank 1 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device
0): no CUDA-capable device is available
FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 3 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device
0): no CUDA-capable device is available
FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 1 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device
0): no CUDA-capable device is available
Pe 2 sharing CUDA device 0 first 0 next 3
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device
0): no CUDA-capable device is available
Pe 4 sharing CUDA device 0 first 0 next 5
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 4 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device
0): no CUDA-capable device is available
[0] Stack Traceback:
--------------------------------
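The difference between the two logs is the device each Pe binds to: 'GeForce GTX 470' in the working run, 'Device Emulation (CPU)' in the failing one. A rough Python sketch for pulling that out of a saved log follows. The regular expression is only my reading of the "binding to CUDA device" lines quoted above, and the function names are made up for illustration:

```python
import re

# Matches lines such as
#   Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX
#   470' Mem: 1279MB Rev: 2.0
# The quoted device name may be wrapped across lines, so [^']+ is allowed
# to span a newline and whitespace is normalized afterwards.
BINDING = re.compile(
    r"Pe\s+(\d+)\s+physical\s+rank\s+\d+\s+binding\s+to\s+CUDA\s+device\s+"
    r"(\d+)\s+on\s+\S+:\s+'([^']+)'"
)


def parse_bindings(log_text):
    """Return {pe: (device_index, device_name)} from NAMD startup output."""
    bindings = {}
    for pe, dev, name in BINDING.findall(log_text):
        bindings[int(pe)] = (int(dev), " ".join(name.split()))
    return bindings


def emulated_pes(bindings):
    """Return the sorted Pe numbers bound to the CPU emulation device."""
    return sorted(pe for pe, (_, name) in bindings.items() if "Emulation" in name)
```

Run on the failing log, every Pe should come back as emulated, meaning the CUDA runtime saw no real GPU at all, as opposed to losing only one of the two cards.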
Running nvidia-smi -r (or nvidia-smi -a) gives:
NVIDIA: could not open the device file /dev/nvidia1 (no such file)
Failed to initialize NVML: unknown error.
If "nvidia-smi" supports Tesla cards only, how can I check the GTX 470s?
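Independently of nvidia-smi, the message about /dev/nvidia1 suggests simply checking which device files exist. A small Python sketch, assuming the usual layout of one /dev/nvidiaN node per card plus /dev/nvidiactl (the function name and the dev_dir parameter are mine, the latter only to make the check easy to try out):

```python
from pathlib import Path


def missing_device_nodes(gpu_count, dev_dir="/dev"):
    """Return the expected NVIDIA device files that are absent from dev_dir."""
    expected = ["nvidiactl"] + [f"nvidia{i}" for i in range(gpu_count)]
    return [name for name in expected if not (Path(dev_dir) / name).exists()]


if __name__ == "__main__":
    # Two GTX 470 cards are installed, so nvidia0 and nvidia1 are expected.
    print("missing device files:", missing_device_nodes(2))
```

On this machine the call with gpu_count=2 would presumably list at least nvidia1, matching the nvidia-smi message.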
Thanks for any advice
francesco pietra