计算化学公社


[VASP] Help: how to reduce GPU memory usage in VASP

Last edited by y597690 on 2025-5-31 16:38

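The server has an AMD EPYC 7b12 CPU and two NVIDIA Tesla V100 16G GPUs. We previously optimized an 8x8x2 fcc copper substrate successfully on the CPU, so we wanted to test the GPU performance.
However, when we tried to optimize the same substrate on the GPUs, the run ran out of GPU memory. How can we reduce GPU memory usage without sacrificing accuracy?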

The VASP version is 6.5.1, compiled with omp_acc.

The run command and the error output on stdout are as follows:

$ mpirun -np 2 --bind-to core \
>        -x OMP_NUM_THREADS=32 \
>        -x OMP_PLACES=cores \
>        -x OMP_PROC_BIND=close \
>        --report-bindings \
>        ~/vasp651gpu/vasp_gam
MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././.]
MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././.]
running    2 mpi-ranks, with   32 threads/rank, on    1 nodes
distrk:  each k-point on    2 cores,    1 groups
distr:  one band on    1 cores,    2 groups
Offloading initialized ...    2 GPUs detected
vasp.6.5.1 10Mar25 (build May 27 2025 20:02:56) gamma-only                     
POSCAR found type information on POSCAR Cu
POSCAR found :  1 types and     512 ions
scaLAPACK will be used selectively (only on CPU)
LDA part: xc-table for (Slater+PW92), standard interpolation
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ... GRIDC
FFT: planning ... GRID_SOFT
FFT: planning ... GRID
WAVECAR not read
entering main loop
       N       E                     dE             d eps       ncg     rms          rms(c)
Out of memory allocating 311500800 bytes of device memory
Failing in Thread:1
total/free CUDA memory: 16928342016/109051904
Present table dump for device[1]: NVIDIA Tesla GPU 0, compute capability 7.0, threadid=1
Hint: specify 0x800 bit in NV_ACC_DEBUG for verbose info.
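
The two numbers in the error message can be compared directly. The following is a minimal sketch, assuming only the two log lines quoted above; the helper name `parse_oom` and its regular expressions are hypothetical and written for this message format, not part of VASP or the NVIDIA runtime.

```python
import re

# The two relevant lines, copied from the stdout above.
LOG = """\
Out of memory allocating 311500800 bytes of device memory
total/free CUDA memory: 16928342016/109051904
"""

def parse_oom(text):
    """Return (requested, total, free) in bytes from the quoted messages."""
    requested = int(re.search(r"allocating (\d+) bytes", text).group(1))
    match = re.search(r"total/free CUDA memory: (\d+)/(\d+)", text)
    total, free = int(match.group(1)), int(match.group(2))
    return requested, total, free

requested, total, free = parse_oom(LOG)
in_use = total - free          # memory already occupied on the device
MIB = 1024 ** 2

print(f"requested: {requested / MIB:9.1f} MiB")
print(f"free:      {free / MIB:9.1f} MiB")
print(f"in use:    {in_use / MIB:9.1f} MiB of {total / MIB:.1f} MiB")
print(f"shortfall: {(requested - free) / MIB:9.1f} MiB")
```

For the values above, the request is about 297 MiB while only 104 MiB is free, so the device was already almost full before this allocation was attempted.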


INCAR:
# === Global Parameters ===
ISTART   = 0            # Start from scratch (no WAVECAR)
ICHARG   = 2            # Charge density from atomic superposition
ISPIN    = 1            # Non-spin-polarized
LREAL    = Auto         # Use real-space projection for speed on large systems
PREC     = Accurate     # Full precision for reliable forces
LWAVE    = .TRUE.       # Write WAVECAR (used in later single-point/molecule run)
LCHARG   = .TRUE.       # Write CHGCAR (for charge inspection or reuse)
ADDGRID  = .TRUE.       # Extra support grid for evaluating augmentation charges
LASPH    = .TRUE.       # Needed for non-spherical corrections with PAW
#NSIM = 1
LREAL = AUTO

# === Electronic Relaxation ===
ISMEAR   = 1            # Methfessel-Paxton smearing, order 1 (suitable for metals)
SIGMA    = 0.2          # Smearing width in eV
NELM     = 150          # Max SCF steps (increased to avoid premature stop)
NELMIN   = 6            # Min SCF steps
EDIFF    = 1E-6         # Electronic convergence threshold in eV (tighter than the 1E-4 default)
#ALGO     = Fast         # Good balance of speed and robustness
#AMIX     = 0.2          # Mixing amplitude (better for metallic systems)
#BMIX     = 0.0001       # Mixing damping (prevents charge oscillation)

# === Ionic Relaxation ===
NSW      = 100          # Max ionic steps
IBRION   = 2            # Conjugate gradient (stable geometry optimization)
ISIF     = 2            # Relax ions only
EDIFFG   = -0.02        # Stop if all forces < 0.02 eV/Å
ISYM     = 0            # Turn off symmetry (important for steps, surfaces)

# === (Optional tweaks) ===
ENCUT   = 500          # Only if POTCAR recommends a higher cutoff
# NGXF/YF/ZF             # Set only if you want to enforce FFT grid manually

#NCORE = 64
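
The INCAR above sets LREAL twice (with the same value, so it is harmless here). A small check like the one below can catch repeated tags before a run. It is only a sketch: the function name `find_duplicate_tags` is hypothetical, the input is a shortened excerpt of the file above, and the parser assumes one `TAG = value` per line with `#` comments, which is simpler than the full INCAR syntax.

```python
from collections import defaultdict

# Shortened excerpt of the INCAR above; line numbers refer to this excerpt.
INCAR = """\
LREAL    = Auto         # Use real-space projection for speed on large systems
PREC     = Accurate     # Full precision for reliable forces
#NSIM = 1
LREAL = AUTO
ENCUT   = 500
#NCORE = 64
"""

def find_duplicate_tags(text):
    """Map each tag that is set more than once to its 1-based line numbers."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        active = line.split("#", 1)[0].strip()   # drop comments and blanks
        if "=" not in active:
            continue
        tag = active.split("=", 1)[0].strip().upper()
        seen[tag].append(lineno)
    return {tag: lines for tag, lines in seen.items() if len(lines) > 1}

duplicates = find_duplicate_tags(INCAR)
for tag, lines in duplicates.items():
    print(f"{tag} is set on lines {lines}")
```

Commented-out tags such as `#NSIM = 1` and `#NCORE = 64` are ignored, so only active settings are reported.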


OUTCAR:
total amount of memory used by VASP MPI-rank0 16517326. kBytes
=======================================================================

   base                                    :      30000. kBytes
   nonlr-proj                              :     532224. kBytes
   fftplans                                :    5690184. kBytes
   grid                                    :    2253312. kBytes
   one-center                              :       3981. kBytes
   wavefun                                 :    8007625. kBytes
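
To see which entries dominate, the breakdown can be summed and compared with the card's capacity. This is a rough sketch under stated assumptions: 1 kByte is taken as 1024 bytes, and the OUTCAR figure is the estimate for MPI rank 0, so how much of it actually resides on the GPU is not stated anywhere in the output. The variable names are illustrative.

```python
# Per-category estimates in kBytes, copied from the OUTCAR excerpt above.
breakdown_kb = {
    "base": 30000,
    "nonlr-proj": 532224,
    "fftplans": 5690184,
    "grid": 2253312,
    "one-center": 3981,
    "wavefun": 8007625,
}
reported_total_kb = 16517326          # "total amount of memory used" line
device_total_bytes = 16928342016      # "total" value from the stdout error

total_kb = sum(breakdown_kb.values())
GIB_IN_KB = 1024 ** 2                 # ASSUMPTION: 1 kByte = 1024 bytes

# Largest contributors first.
for name, kb in sorted(breakdown_kb.items(), key=lambda item: -item[1]):
    share = 100.0 * kb / total_kb
    print(f"{name:12s} {kb / GIB_IN_KB:6.2f} GiB  {share:5.1f} %")

print(f"sum of entries : {total_kb} kBytes")
print(f"reported total : {reported_total_kb} kBytes")
print(f"estimate / device capacity: {total_kb * 1024 / device_total_bytes:.3f}")
```

The entries add up exactly to the reported total, and wavefun plus fftplans account for roughly 83 % of it, so those two are the terms worth targeting first.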

     INWAV:  cpu time      0.0000: real time      0.0000
Broyden mixing: mesh for mixing (old mesh)
   NGX =115   NGY =115   NGZ = 83
  (NGX  =480   NGY  =480   NGZ  =336)
  gives a total of ****** points

initial charge density was supplied:
charge density of overlapping atoms calculated
number of electron    5632.0000000 magnetization
keeping initial charge density in first step


--------------------------------------------------------------------------------------------------------


Maximum index for non-local projection operator          6566
Maximum index for augmentation-charges         47379 (set IRDMAX)


--------------------------------------------------------------------------------------------------------


First call to EWALD:  gamma=   0.062
Maximum number of real-space cells   3x   3x   3
Maximum number of reciprocal cells   3x   3x   2

    FEWALD:  cpu time      6.3257: real time      6.3293


--------------------------------------- Ionic step        1  -------------------------------------------




--------------------------------------- Iteration      1(   1)  ---------------------------------------

After this point the run aborted with the out-of-memory error.
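
As a rough size check, the mesh dimensions printed in the OUTCAR excerpt can be multiplied out and compared with the failed allocation. This is a sketch under stated assumptions: 8 bytes per element and a half-grid layout along z (NGZ/2 + 1 planes) are guesses chosen for illustration, not something the output states, and the variable names are hypothetical.

```python
from math import prod

mixing_mesh = (115, 115, 83)       # NGX, NGY, NGZ from the OUTCAR excerpt
bracketed_mesh = (480, 480, 336)   # the values shown in parentheses

mixing_points = prod(mixing_mesh)
bracketed_points = prod(bracketed_mesh)
print(f"115 x 115 x 83  -> {mixing_points:,} points")
print(f"480 x 480 x 336 -> {bracketed_points:,} points")

# ASSUMPTION: one array on the larger mesh, stored as a half grid along z
# with 8 bytes per element. Neither detail is confirmed by the output.
nx, ny, nz = bracketed_mesh
half_grid_bytes = nx * ny * (nz // 2 + 1) * 8

failed_request = 311500800         # bytes, from the stdout error message
print(f"half-grid array : {half_grid_bytes:,} bytes")
print(f"failed request  : {failed_request:,} bytes")
print("match" if half_grid_bytes == failed_request else "no match")
```

Under these assumptions the product equals the failed request, which hints that the allocation was a single array on the 480 x 480 x 336 mesh, although this reading is unconfirmed. Both point counts have more than six digits, which may be why the total is printed as ******; that is likewise an assumption about the output field width.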
