- Blog (9)
[Original] Study Note: Roofline Model
Some background knowledge: the connection between latency, throughput, and concurrency [1]. The factors that influence runtime and performance are latency and throughput.
2016-02-29 16:00:47 2032
[Original] Study Note: Schedule Optimisation and math_intrinsic in CUDA Programming
Let us first introduce a new term [1]: the ratio of active warps to the maximum number of warps (32). It depends on three parameters: 1) threads/block (set in <<<...>>>) 2) registers/th
2016-02-29 09:41:27 650
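The ratio described in the excerpt above (usually called occupancy in CUDA material) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the maximum of 32 warps is taken from the excerpt, and the warp size of 32 threads is an assumption the excerpt does not state.

```python
WARP_SIZE = 32  # assumption: threads per warp, not stated in the excerpt


def warps_per_block(threads_per_block: int) -> int:
    """Number of warps needed for one block (rounded up)."""
    return -(-threads_per_block // WARP_SIZE)


def occupancy(active_warps: int, max_warps: int = 32) -> float:
    """Ratio of active warps to the maximum number of warps.

    max_warps defaults to 32, the figure quoted in the excerpt.
    """
    if not 0 <= active_warps <= max_warps:
        raise ValueError("active_warps must be between 0 and max_warps")
    return active_warps / max_warps
```

For instance, two resident blocks of 256 threads give `2 * warps_per_block(256)` active warps, which can be passed to `occupancy` to obtain the ratio.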
[Original] Study Note: Instruction Optimisation of CUDA Programming
Consideration 1: Branch divergence. Before discussing it, let us look at what actually happens inside the GPU. Here is an abstract model of an SM [1]: Every SM has one con
2016-02-28 22:55:24 938
[Original] Study Note: Shared Memory Optimisation -- Avoiding Bank Conflicts
This article is based on compute capability 2.x devices. Typically, shared memory totals 16 KB, and it has 32 banks on 2.x devices. A bank is a unit of parallel read
2016-02-28 13:11:37 1328
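As a rough illustration of the 32-bank layout mentioned in the excerpt above, the sketch below maps addresses to banks and counts how many distinct words land in the busiest bank for a given access stride. It assumes that successive 4-byte words map to successive banks, which the excerpt does not state; the function names are hypothetical.

```python
from collections import defaultdict

NUM_BANKS = 32   # from the excerpt (2.x devices)
WORD_BYTES = 4   # assumption: one bank serves one 4-byte word


def bank_of(byte_address: int) -> int:
    """Bank index for a byte address in shared memory."""
    return (byte_address // WORD_BYTES) % NUM_BANKS


def conflict_degree(stride_words: int, num_threads: int = 32) -> int:
    """Largest number of distinct words that threads request from one bank.

    Thread t reads the word at index t * stride_words. A result of 1 means
    no bank conflict; a result of n means an n-way conflict.
    """
    words_per_bank = defaultdict(set)
    for tid in range(num_threads):
        addr = tid * stride_words * WORD_BYTES
        words_per_bank[bank_of(addr)].add(addr // WORD_BYTES)
    return max(len(words) for words in words_per_bank.values())
```

Under these assumptions, a stride of 1 word is conflict-free, a stride of 2 gives a 2-way conflict, and a stride of 32 sends every thread to the same bank, while padding the stride to 33 removes the conflict again.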
[Original] Study Note: Global Memory Optimisation of CUDA Programming
Global memory coalescing: Global memory on the GPU is stored in row-major order because the GPU has no two-dimensional arrays. Take a matrix as an example [1]: Knowledge of
2016-02-27 23:49:19 745
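The row-major layout mentioned in the excerpt above can be sketched as a flat-index calculation. The helper name and the 4-column matrix are hypothetical, chosen only for illustration.

```python
def flat_index(row: int, col: int, width: int) -> int:
    """Offset of element (row, col) in a row-major array with `width` columns."""
    return row * width + col


WIDTH = 4  # hypothetical matrix width

# Walking along a row touches consecutive offsets.
row_offsets = [flat_index(0, c, WIDTH) for c in range(WIDTH)]

# Walking down a column jumps by WIDTH elements each step.
col_offsets = [flat_index(r, 0, WIDTH) for r in range(3)]
```

Neighbouring threads that read along a row therefore touch adjacent addresses, whereas reading down a column produces accesses separated by the matrix width.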
[Repost] Self Summary: Ruby (RVM, gem, bundle)
Setting up the development environment: https://ruby-china.org/wiki/install_ruby_guide Evolution of bundler: Nowadays, installing Ruby includes the gem command for getting the oth
2016-02-16 20:09:30 892
[Original] Self Summary: Basic Concepts of GPU
Some basic concepts of GPU programming: Here is an overview of a GPU (Fermi architecture) [1]: It is a 16-way many-core GPU (16 SMs). Each of these cores has an architecture like t
2016-02-09 22:10:22 651
[Repost] Locale in Linux
The following content is from http://www.blog.chinaunix.net/uid-641896-id-338729.html: Linux uses locales to set the language environment for running programs. Locale is supported by ANSI
2016-02-06 22:50:39 382
[Repost] C/C++: Inline Function, calloc vs malloc
An inline function is like a macro definition. When it is called from another function, control is not transferred to it. The compiler will just replace the line of inline functio
2016-02-02 18:52:20 513