The three main nvprof metrics for performance evaluation:
occupancy
nvprof --metrics achieved_occupancy ./helloCuda.out
gld_throughput
nvprof --metrics gld_throughput ./helloCuda.out
gld_efficiency
nvprof --metrics gld_efficiency ./helloCuda.out
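The three separate runs above can likely be folded into one, since nvprof accepts a comma-separated list after --metrics. A minimal sketch, assuming the same ./helloCuda.out binary; the variable names METRICS and CMD are illustrative only, and the script just prints the command (a dry run) so it can be checked before running it on a machine with a GPU.

```shell
#!/bin/sh
# The three metric names listed above, joined with commas.
# METRICS and CMD are illustrative names, not part of nvprof.
METRICS="achieved_occupancy,gld_throughput,gld_efficiency"
CMD="nvprof --metrics $METRICS ./helloCuda.out"
# Dry run: print the command instead of executing it.
echo "$CMD"
```

Collecting all three metrics in one invocation avoids profiling the program three times; copy the printed line into the shell to actually run it.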
_________________________________________________
Important metrics:
Occupancy (ratio of average active warps per cycle to the maximum warps an SM supports):
achieved_occupancy
Global memory access (load metrics):
gld_throughput
gld_efficiency
gld_transactions
gld_transactions_per_request
Shared memory reads and writes:
shared_efficiency
shared_load_throughput
shared_load_transactions
shared_load_transactions_per_request
shared_store_throughput
shared_store_transactions
shared_store_transactions_per_request
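
The shared memory metrics above could be gathered in a single run the same way. A rough sketch, again assuming ./helloCuda.out as the target; SHARED_METRICS and LIST are hypothetical variable names, and the script only prints the command. Which of these metrics exist depends on the GPU architecture, so checking the output of nvprof --query-metrics first is probably wise.

```shell
#!/bin/sh
# Shared-memory metric names copied from the list above (space-separated).
SHARED_METRICS="shared_efficiency shared_load_throughput shared_load_transactions shared_load_transactions_per_request shared_store_throughput shared_store_transactions shared_store_transactions_per_request"
# Replace each space with a comma, the form --metrics expects.
LIST=$(echo "$SHARED_METRICS" | tr ' ' ',')
# Dry run: print the command instead of executing it.
echo "nvprof --metrics $LIST ./helloCuda.out"
```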