February 2016_Firehotest

[Original] Study Note: Roofline Model

Some background knowledge: here is the connection between latency, throughput and concurrency [1]. Here are the factors that influence runtime and performance: latency and throughput.

2016-02-29 16:00:47 2032

[Original] Study Note: Schedule Optimisation and math_intrinsic in CUDA Programming

Let us introduce a new term first [1]: occupancy. It is the ratio of active warps to the maximum number (32) of warps. It depends on three parameters: 1) threads/block (set in <<<...>>>) 2) registers/th

2016-02-29 09:41:27 650
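
The ratio in the excerpt above (commonly called occupancy in CUDA) can be sketched as plain arithmetic. This is only an illustration: the function name `occupancy` and the input `blocks_per_sm` are hypothetical, and the number of blocks resident on an SM is simply assumed here rather than derived from register or shared-memory limits. The maximum of 32 warps is the figure quoted in the excerpt.

```python
def occupancy(threads_per_block, blocks_per_sm, max_warps_per_sm=32, warp_size=32):
    """Ratio of active warps to the maximum number of warps an SM can hold.

    blocks_per_sm is an assumed input; a real device derives it from
    per-thread register use, shared memory use and hardware block limits.
    """
    # A partially filled warp still occupies a whole warp slot (ceiling division).
    warps_per_block = -(-threads_per_block // warp_size)
    active_warps = min(warps_per_block * blocks_per_sm, max_warps_per_sm)
    return active_warps / max_warps_per_sm


print(occupancy(256, 2))  # 8 warps per block * 2 blocks = 16 of 32 warps -> 0.5
```

Under these assumptions, raising the block size or the number of resident blocks raises the ratio until the 32-warp cap is reached.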

[Original] Study Note: Instruction Optimisation of CUDA Programming

Consideration 1: Branch Divergence. Before we talk about this, let us go through what is actually going on in the GPU. Here is an abstract model of an SM [1]: Every SM has one con

2016-02-28 22:55:24 938

[Original] Study Note: Shared Memory Optimisation -- avoiding bank conflicts

This article is based on compute capability 2.x devices. Typically, shared memory has 16 KB in total, and it has 32 banks on 2.x devices. A bank is a unit of parallel read

2016-02-28 13:11:37 1328
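
The bank layout in the excerpt above can be modelled with a few lines of arithmetic. This is a sketch, not the post's own code: the helper names `bank_of` and `conflict_degree` are hypothetical, and it assumes that successive 32-bit words map to successive banks and that threads reading the same word are served together (broadcast) rather than conflicting.

```python
from collections import defaultdict

NUM_BANKS = 32   # figure quoted in the excerpt for 2.x devices
WORD_BYTES = 4   # assumption: successive 32-bit words map to successive banks


def bank_of(byte_address):
    """Bank that holds the 32-bit word at this shared-memory byte address."""
    return (byte_address // WORD_BYTES) % NUM_BANKS


def conflict_degree(word_indices):
    """Largest number of distinct words that a warp requests from one bank."""
    words_per_bank = defaultdict(set)
    for w in word_indices:
        words_per_bank[w % NUM_BANKS].add(w)
    return max(len(words) for words in words_per_bank.values())


stride1 = [t * 1 for t in range(32)]  # thread t reads word t
stride2 = [t * 2 for t in range(32)]  # thread t reads word 2*t

print(conflict_degree(stride1))  # 1: every thread hits a different bank
print(conflict_degree(stride2))  # 2: two distinct words share each even bank
```

A degree of 1 means the warp's request can be served in one pass under this model; a degree of n means the accesses to the busiest bank are serialised n ways.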

[Original] Study Note: Global Memory Optimisation of CUDA Programming

Global memory coalescing: Global memory on the GPU is laid out in row-major order, because there are no two-dimensional arrays on the GPU. Take a matrix as an example [1]: Knowledge of

2016-02-27 23:49:19 745
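
The row-major layout in the excerpt above means a matrix element (row, col) lives at offset `row * width + col` in a flat array. A minimal sketch, with the helper name `flat_index` and the 4-column matrix chosen purely for illustration:

```python
def flat_index(row, col, width):
    """Offset of element (row, col) in a row-major flat array."""
    return row * width + col


WIDTH = 4  # hypothetical matrix width, for illustration only

# Neighbouring threads walking along one row touch consecutive offsets.
row_walk = [flat_index(1, c, WIDTH) for c in range(WIDTH)]
# Neighbouring threads walking down one column touch offsets WIDTH apart.
col_walk = [flat_index(r, 1, WIDTH) for r in range(WIDTH)]

print(row_walk)  # [4, 5, 6, 7]
print(col_walk)  # [1, 5, 9, 13]
```

Consecutive offsets are the pattern that allows accesses from neighbouring threads to be coalesced; the strided column walk is the pattern to avoid.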

[Repost] Self Summary: Ruby (RVM, gem, bundle)

Setting up the development environment: https://ruby-china.org/wiki/install_ruby_guide Evolution of bundler: Nowadays, installing Ruby includes the gem command for getting the oth

2016-02-16 20:09:30 892

[Original] Self Summary: Basic Concepts of GPU

Some basic concepts of GPU programming: Here is an overview of a GPU (Fermi architecture) [1]: It is a 16-way many-core (16 SM) GPU. Each of these cores has an architecture like t

2016-02-09 22:10:22 651

[Repost] Locale in Linux

The following content is from http://www.blog.chinaunix.net/uid-641896-id-338729.html: Linux uses locale to set different language environments for running programs. Locale is supported by ANSI

2016-02-06 22:50:39 382

[Repost] C/C++: Inline function, calloc vs malloc

An inline function is like a macro definition. When it is called in another function, control is not transferred to this function. The compiler will just replace the line of inline functio

2016-02-02 18:52:20 513
