Re: Unable to run NAMD3.0 on 2 GPUs simultaneously

From: Sruthi Sundaresan (bo20resch11002_at_iith.ac.in)
Date: Mon May 02 2022 - 06:56:26 CDT

Hey Hrishikesh,
The command I'm using to run NAMD3.0 is:
./charmrun ./namd3 +p2 +devices 0,1 +devicesperreplica 1 input_file.inp > output_file.out

It's still running on only one GPU. Is there anything I'm missing?
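
For comparison, a minimal sketch of a single-node launch line, under some
assumptions. The binary path in the nvidia-smi output quoted below ends in
multicore-CUDA/namd3, and multicore builds run as one process on one node,
so they are started directly rather than through charmrun. The
+devicesperreplica option is documented for multi-copy (replica) runs, so it
is left out here. In SLURM, --gres=gpu:2 is a per-node request, so -N 2
reserves two nodes with two GPUs each, and a multicore build can only use
the node it starts on. Whether a single simulation actually spreads over
both devices depends on the NAMD 3.0 build and on the settings in the input
file, which the thread does not show. The function name build_namd_cmd is
hypothetical; it only prints the command and does not run NAMD.

```shell
#!/bin/bash
# Sketch only: print a direct (no charmrun) launch line for a
# multicore-CUDA namd3 build on a single node.
# NAMD_BIN and CONFIG reuse the file names from the command above.
NAMD_BIN="./namd3"
CONFIG="input_file.inp"

# build_namd_cmd THREADS DEVICES
#   THREADS: number of worker threads passed to +p
#   DEVICES: comma-separated CUDA device indices passed to +devices
build_namd_cmd() {
    local threads="$1"
    local devices="$2"
    printf '%s +p%s +devices %s %s\n' "$NAMD_BIN" "$threads" "$devices" "$CONFIG"
}

# Two threads and both GPUs of one node, as in the original command.
build_namd_cmd 2 0,1
```

The printed line can then be pasted into the job script with the output
redirected to output_file.out, after changing the node count to 1.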

On Mon, May 2, 2022 at 3:25 PM Hrishikesh Dhondge <hbdhondge_at_gmail.com>
wrote:

> Hello Sruthi,
>
> What is the command you are using to run NAMD3.0?
>
> You need to add +devicesperreplica 1 to your command. For more
> information, look here:
> https://www.ks.uiuc.edu/Research/namd/alpha/3.0alpha/
> Also, make sure you have the correct executables for multi-GPU usage.
>
>
> On Mon, May 2, 2022 at 11:35 AM Sruthi Sundaresan <
> bo20resch11002_at_iith.ac.in> wrote:
>
>> Dear Users,
>> I would like to run my job on 2 GPUs. The queue shows that I am
>> submitting a job to run on 2 GPU nodes:
>>
>> #!/bin/bash
>> #SBATCH --partition=gpu
>> #SBATCH --gres=gpu:2
>> #SBATCH -N 2                   # Number of nodes
>> #SBATCH --ntasks-per-node=2    # Number of cores per node
>>
>> I happened to notice that only one node is being utilized:
>>
>> [user_at_gpu002 ~]$ nvidia-smi
>> Mon May  2 14:53:11 2022
>> +-----------------------------------------------------------------------------+
>> | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
>> |-------------------------------+----------------------+----------------------+
>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> |                               |                      |               MIG M. |
>> |===============================+======================+======================|
>> |   0  Tesla V100-SXM2...  Off  | 00000000:61:00.0 Off |                    0 |
>> | N/A   58C    P0   269W / 300W |   2261MiB / 16160MiB |     99%      Default |
>> |                               |                      |                  N/A |
>> +-------------------------------+----------------------+----------------------+
>> |   1  Tesla V100-SXM2...  Off  | 00000000:89:00.0 Off |                    0 |
>> | N/A   42C    P0    26W / 300W |      2MiB / 16160MiB |      0%      Default |
>> |                               |                      |                  N/A |
>> +-------------------------------+----------------------+----------------------+
>>
>> +-----------------------------------------------------------------------------+
>> | Processes:                                                                  |
>> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
>> |        ID   ID                                                   Usage      |
>> |=============================================================================|
>> |    0   N/A  N/A     25714      C   ...6_64-multicore-CUDA/namd3     2259MiB |
>> +-----------------------------------------------------------------------------+
>>
>>
>> And there are no jobs submitted to the second node:
>>
>> [user_at_gpu003 ~]$ nvidia-smi
>> Mon May  2 14:54:38 2022
>> +-----------------------------------------------------------------------------+
>> | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
>> |-------------------------------+----------------------+----------------------+
>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> |                               |                      |               MIG M. |
>> |===============================+======================+======================|
>> |   0  Tesla V100-SXM2...  Off  | 00000000:61:00.0 Off |                    0 |
>> | N/A   42C    P0    41W / 300W |      0MiB / 16160MiB |      0%      Default |
>> |                               |                      |                  N/A |
>> +-------------------------------+----------------------+----------------------+
>> |   1  Tesla V100-SXM2...  Off  | 00000000:89:00.0 Off |                    0 |
>> | N/A   41C    P0    40W / 300W |      0MiB / 16160MiB |      0%      Default |
>> |                               |                      |                  N/A |
>> +-------------------------------+----------------------+----------------------+
>>
>> +-----------------------------------------------------------------------------+
>> | Processes:                                                                  |
>> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
>> |        ID   ID                                                   Usage      |
>> |=============================================================================|
>> |  No running processes found                                                 |
>> +-----------------------------------------------------------------------------+
>>
>> Is there any way I can make my NAMD_3.0 job run by utilizing both GPUs?
>> The queue shows that the job is submitted to 2 GPUs but is running entirely
>> on only one.
>>
>> I've tried using *mpirun -np* in my script, but it's still running on
>> only one GPU.
>>
>>
>> Thanks and Regards,
>>
>> Sruthi Sundaresan
>> Ph.D. Research Scholar
>> C/o Dr. Thenmalarchelvi Rathinavelan
>> Molecular Biophysics Lab, Department of Biotechnology
>>
>> Disclaimer:- This footer text is to convey that this email is sent by
>> one of the users of IITH. So, do not mark it as SPAM.
>>
>
>
> --
> With regards
> Hrishikesh Dhondge
> PhD student,
> LORIA - INRIA Nancy
>
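
A scriptable way to repeat the nvidia-smi check quoted above is to ask for
per-GPU utilization in CSV form and count the devices that are doing work.
This is a minimal sketch, assuming nvidia-smi is on the PATH of the compute
node; count_busy_gpus is a hypothetical helper name, not part of NAMD or the
NVIDIA tools.

```shell
#!/bin/bash
# Count GPUs whose utilization is above zero.
# Reads lines of "index, utilization, memory_used" from stdin
# (CSV without header and without units).
count_busy_gpus() {
    awk -F', *' '$2 + 0 > 0 { n++ } END { print n + 0 }'
}

# Query the driver only when nvidia-smi is present on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used \
               --format=csv,noheader,nounits | count_busy_gpus
fi
```

Fed the gpu002 values from the output above
(printf '0, 99, 2261\n1, 0, 2\n' | count_busy_gpus), the helper prints 1,
which matches the single busy GPU that was observed.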

This archive was generated by hypermail 2.1.6 : Tue Dec 13 2022 - 14:32:44 CST