计算化学公社
Title:
PGI Fortran on GPU: the program cannot use the GPU at run time. Help!
Author:
didi_dudu
Time:
2017-5-16 14:55
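Last edited by didi_dudu on 2017-5-16 15:35
I recently wanted to try GPU acceleration, so after some effort I got CUDA and PGI installed and then tried to compile the PGI Fortran example. It compiles without problems, but running it always reports failed. My own testing suggests that the increment subroutine is never actually called. Does anyone know what might be causing this?
-------------------------------------------------------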
Program
http://blog.csdn.net/slow_jiulong/article/details/53105223
---------------------------------------------------------------------
Information reported by pgaccelinfo
CUDA Driver Version: 8000
NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.26 Thu Dec 8 18:36:43 PST 2016
Device Number: 0
Device Name: GeForce GTX 1080
Device Revision Number: 6.1
Global Memory Size: 8507555840
Number of Multiprocessors: 20
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 1733 MHz
Execution Timeout: No
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 5005 MHz
Memory Bus Width: 256 bits
L2 Cache Size: 2097152 bytes
Max Threads Per SMP: 2048
Async Engines: 2
Unified Addressing: Yes
Managed Memory: Yes
PGI Compiler Option: -ta=tesla:cc60
Device Number: 1
Device Name: GeForce GT 730
Device Revision Number: 3.5
Global Memory Size: 1028128768
Number of Multiprocessors: 2
Number of SP Cores: 384
Number of DP Cores: 128
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 901 MHz
Execution Timeout: No
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 900 MHz
Memory Bus Width: 64 bits
L2 Cache Size: 524288 bytes
Max Threads Per SMP: 2048
Async Engines: 1
Unified Addressing: Yes
Managed Memory: Yes
PGI Compiler Option: -ta=tesla:cc35
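The listing above ends each device's section with a PGI Compiler Option line. As a minimal sketch, assuming the output keeps the "Label: value" layout shown here, those lines can be pulled out next to the device they belong to. The function name list_pgi_options is made up for illustration and is not part of PGI.

```shell
# Hypothetical helper (not a PGI tool): read pgaccelinfo-style text on
# standard input and print one line per GPU:
#   device number <TAB> device name <TAB> suggested compiler option
list_pgi_options() {
  awk '
    # Strip everything up to the first colon plus following blanks.
    # The value itself may contain colons (e.g. -ta=tesla:cc60).
    function value(line) { sub(/^[^:]*:[ \t]*/, "", line); return line }
    /^[ \t]*Device Number:/       { num  = value($0) }
    /^[ \t]*Device Name:/         { name = value($0) }
    /^[ \t]*PGI Compiler Option:/ { printf "%s\t%s\t%s\n", num, name, value($0) }
  '
}
```

It would be used as pgaccelinfo | list_pgi_options. For the two cards listed here, that should pair device 0 (GeForce GTX 1080) with -ta=tesla:cc60 and device 1 (GeForce GT 730) with -ta=tesla:cc35.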
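Author:
didi_dudu
Time:
2017-5-16 15:34
Er, it turns out that compiling with pgf90 -Mcuda -ta=tesla:cc60 test.f90 makes it run. The compile option is shown right at the end of each device's section in the pgaccelinfo output (the PGI Compiler Option line)... I really should have read more carefully.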
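Since this machine has two GPUs with different compute capabilities, the option has to match the card the program will run on. The following is only a sketch of assembling the command from this post for a chosen device. The helper name pgf90_command_for_device is hypothetical, and it only prints the command rather than running the compiler.

```shell
# Hypothetical helper: print (do not run) the pgf90 command line for one GPU.
#   $1  device number as printed by pgaccelinfo
#   $2  Fortran source file
# pgaccelinfo-style text is read from standard input.
pgf90_command_for_device() {
  awk -v want="$1" -v src="$2" '
    # Remember which device section we are in.
    /^[ \t]*Device Number:/ { sub(/^[^:]*:[ \t]*/, ""); num = $0 }
    # At the option line of the wanted device, emit the command and stop.
    /^[ \t]*PGI Compiler Option:/ && num == want {
      sub(/^[^:]*:[ \t]*/, "")
      print "pgf90 -Mcuda " $0 " " src
      exit
    }
  '
}
```

For example, pgaccelinfo | pgf90_command_for_device 0 test.f90 should print the same command that worked above for the GTX 1080. Nothing is printed if the device number is not found.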