Last edited by 五十八 on 2017-6-17 16:41
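Build environment:
CUDA 7.5 (CUDA 8.0 does not work at present, but the driver can be the latest version)
Intel Parallel Studio XE 2017 (including MKL, etc.)
Hardware:
Xeon E5-2643 v3 / E5-2667 v4
GTX 1070 x2 / K80 x2
OS:
Red Hat 6.7 / CentOS 6.8 (CentOS 7.2 needs a motherboard firmware update first; otherwise it hangs on microcode)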
The contents of $cp2kroot/arch/cuda.popt are as follows:
PERL = perl
CC = mpiicc
CPP = cpp
NVCC = nvcc -arch=compute_30
FC = mpiifort
LD = $(FC)
AR = ar -r
CFLAGS = -O2 -nofor-main
CPPFLAGS = -traditional -C $(DFLAGS) -P -I$(MKLROOT)/include/fftw
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS -D__CUDAPW -D__DBCSR_CUDA -D__FFTW3 -D__FFTCU -D__CUBLASDP -D__ACC -D__DBCSR_ACC -D__PW_CUDA
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -O3 -msse2 -heap-arrays 64 -funroll-loops -fpp -free -nofor-main
NVFLAGS = $(DFLAGS)
LDFLAGS = $(FCFLAGS)
INTEL_INC = $(MKLROOT)/include
MKLPATH = $(MKLROOT)/lib/intel64
CUDAPATH = /usr/local/cuda
LIBS = -L${MKLROOT}/lib/intel64 -lmkl_scalapack_lp64 -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -lm -ldl
LIBS += -L$(CUDAPATH)/lib64 -lcudart -lcufft -lcublas -L/usr/lib64/libfftw3f.so.3
OBJECTS_ARCHITECTURE = machine_intel.o
Then run: make -j4 ARCH='cuda' VERSION='popt'
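Before starting the build, it can save time to confirm that the toolchain matches the environment listed above. The following is only a sketch: the helper names `cuda_release` and `check_toolchain` are made up for illustration, and the parsing assumes that `nvcc --version` prints a line containing `release X.Y`.

```shell
# Hypothetical helper: read `nvcc --version` text on stdin and print
# the CUDA release number (for example 7.5).
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: print the names of required tools or variables
# that are missing. Prints an empty line when everything is found.
check_toolchain() {
    missing=""
    for tool in nvcc mpiicc mpiifort; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # The arch file above relies on MKLROOT being set by the Intel environment.
    [ -n "${MKLROOT:-}" ] || missing="$missing MKLROOT"
    echo "${missing# }"
}
```

A possible way to use it: run `check_toolchain` and stop if it prints anything, then compare `nvcc --version | cuda_release` with `7.5` before calling `make -j4 ARCH='cuda' VERSION='popt'`.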
The official workaround for the '"CUFFT_COMPATIBILITY_NATIVE" is undefined' error that appears when building with CUDA 8.0:
"if you are mostly interested in cuda-acceleration for DBCSR and don't
need it for FFT, then you could compile without -D__PW_CUDA as a
workaround."
The cause is that CUDA 8.0 deprecated cufftSetCompatibilityMode:
Function cufftSetCompatibilityMode is deprecated.
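The workaround quoted above amounts to dropping one flag from the DFLAGS line of the arch file. A minimal sketch of doing that with sed follows; the function name `strip_pw_cuda` is hypothetical, and only -D__PW_CUDA is removed because that is the only flag the quote names. Whether the other FFT-related flags in the DFLAGS line (such as -D__CUDAPW or -D__FFTCU) also have to go is not stated in the quote, so they are left untouched here.

```shell
# Hypothetical helper: print an arch file (file argument or stdin) with
# -D__PW_CUDA removed from the DFLAGS line. All other lines and flags
# are passed through unchanged.
strip_pw_cuda() {
    sed -E '/^DFLAGS/ s/ -D__PW_CUDA( |$)/\1/' "$@"
}
```

For example, `strip_pw_cuda arch/cuda.popt > arch/cuda8.popt` would write a modified copy (the name cuda8.popt is just an example), which could then be built with `make -j4 ARCH='cuda8' VERSION='popt'`.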