Cilk

Design

The guiding principle behind the design of the Cilk language is that the programmer should be responsible for exposing the parallelism, identifying elements that can safely be executed in parallel; it should then be left to the run-time environment, particularly the scheduler, to decide during execution how to actually divide the work between processors. Because these responsibilities are separated, a Cilk program can run without rewriting on any number of processors, including one.

The Cilk language has been developed since 1994 at the MIT Laboratory for Computer Science. It is based on GNU C, with the addition of just a handful of Cilk-specific keywords. When the Cilk keywords are removed from Cilk source code, the result is a valid C program, called the serial elision (or C elision) of the full Cilk program. Because Cilk is a clean extension of C, the serial elision of any Cilk program is always a valid serial C implementation of the semantics of the parallel Cilk program. Despite several similarities, Cilk is not directly related to AT&T Bell Labs' Concurrent C.

A commercial version of Cilk, called Cilk++, which supports both C and C++ and is compatible with both the GCC and Microsoft C++ compilers, has been announced by Cilk Arts, Inc. Academic and open source versions also exist; the open source version is under an in-house license that falls somewhere between the updated BSD license and the LGPL. The original Cilk code is still available from MIT, and Cilk Arts' version is a licensed fork of that code base.
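The idea of a serial elision can be made concrete with a small sketch. It is an illustration only, not how the Cilk toolchain is implemented: three empty preprocessor macros (an assumption made for this example) erase the cilk, spawn and sync keywords, so that the recursive Fibonacci function used as this article's sample code compiles as ordinary serial code.

```cpp
// Illustrative only: empty macros stand in for the Cilk keywords, so the
// Cilk source text below compiles as plain serial code (its serial elision).
#define cilk
#define spawn
#define sync

cilk int fib(int n)
{
    if (n < 2) return n;
    else
    {
        int x, y;

        x = spawn fib(n - 1);   // after elision: x = fib(n - 1);
        y = spawn fib(n - 2);   // after elision: y = fib(n - 2);

        sync;                   // after elision: an empty statement

        return (x + y);
    }
}

// Remove the macros again so they cannot affect any later code.
#undef cilk
#undef spawn
#undef sync
```

Calling fib(10) on this elided version runs entirely on one processor and computes the same value the parallel program is meant to compute.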

The first Cilk keyword is cilk itself, which identifies a function written in Cilk. Cilk procedures can call C procedures directly, but C procedures cannot directly call or spawn Cilk procedures, so this keyword is needed to distinguish Cilk code from C code.

The remaining keywords are:

  • spawn
  • sync
  • inlet
  • abort

They are described in further detail below.

Basic parallelism with Cilk

Two keywords are all that are needed to start using the parallel features of Cilk:

spawn -- this keyword indicates that the procedure call it modifies can safely operate in parallel with other executing code. Note that the scheduler is not obligated to run this procedure in parallel; the keyword merely alerts the scheduler that it can do so.

sync -- this keyword indicates that execution of the current procedure cannot proceed until all previously spawned procedures have completed and returned their results to the parent frame. This is an example of a barrier method.

Sample code

Below is a recursive implementation of the Fibonacci function in Cilk, with parallel recursive calls, which demonstrates the cilk, spawn, and sync keywords. (Cilk program code is not numbered; the numbers have been added only to make the discussion easier to follow.)

01 cilk int fib (int n)
02 {
03     if (n < 2) return n;
04     else
05     {
06        int x, y;
07  
08        x = spawn fib (n-1);
09        y = spawn fib (n-2);
10  
11        sync;
12  
13        return (x+y);
14     }
15 }

If this code were executed by a single processor to determine the value of fib(2), that processor would create a frame for fib(2), and execute lines 01 through 05. On line 06, it would create spaces in the frame to hold the values of x and y. On line 08, the processor would have to suspend the current frame, create a new frame to execute the procedure fib(1), execute the code of that frame until reaching a return statement, and then resume the fib(2) frame with the value of fib(1) placed into fib(2)'s x variable. On line 09, it would need to suspend again to execute fib(0) and place the result in fib(2)'s y variable.

When the code is executed on a multiprocessor machine, execution proceeds differently. One processor starts the execution of fib(2); when it reaches line 08, the spawn keyword modifying the call to fib(n-1) tells the processor that it can safely give the job to a second processor. This second processor can create a frame for fib(1), execute its code, and store its result in fib(2)'s frame when it finishes, while the first processor continues executing the code of fib(2). A processor is not obligated to assign a spawned procedure elsewhere: if the machine has only two processors and the second is still busy with fib(1) when the processor executing fib(2) reaches the call on line 09, the first processor will suspend fib(2) and execute fib(0) itself, as it would if it were the only processor. If a third processor is available, it will be called into service, and all three processors will execute separate frames simultaneously.

(The preceding description is not entirely accurate. Even though the common terminology for discussing Cilk refers to processors making the decision to spawn off work to other processors, it is actually the scheduler which assigns procedures to processors for execution, using a policy called work-stealing, described later.)

If the processor executing fib(2) were to execute line 13 before both of the other processors had completed their frames, it would generate an incorrect result or an error; fib(2) would be trying to add the values stored in x and y, but one or both of those values would be missing. This is the purpose of the sync keyword, which we see in line 11: it tells the processor executing a frame that it must suspend its own execution, until all the procedure calls it has spawned off have returned. When fib(2) is allowed to proceed past the sync statement in line 11, it can only be because fib(1) and fib(0) have completed and placed their results in x and y, making it safe to perform calculations on those results.
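The spawn and sync behaviour described above can be approximated in standard C++ for readers without a Cilk compiler. This is a rough analogue under stated assumptions, not Cilk itself: std::async starts a thread instead of handing work to the Cilk scheduler, future::get plays the role of sync, and the names fib_serial and fib_async as well as the depth cutoff are hypothetical choices made for this sketch.

```cpp
#include <future>

// Plain serial Fibonacci, used once the cutoff depth is reached.
int fib_serial(int n)
{
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

// `depth` limits how many recursion levels may start a new task.
// It is a hypothetical cutoff to bound the number of threads; Cilk's
// scheduler makes such a cutoff unnecessary.
int fib_async(int n, int depth = 3)
{
    if (n < 2) return n;
    if (depth == 0) return fib_serial(n);

    // Rough analogue of "x = spawn fib(n-1)": the call may run on another thread.
    std::future<int> x =
        std::async(std::launch::async, fib_async, n - 1, depth - 1);

    // The parent keeps working on the second call in the meantime.
    int y = fib_async(n - 2, depth - 1);

    // Rough analogue of "sync": block until the child's result is available,
    // so x is never read before it has been produced.
    return x.get() + y;
}
```

Reading the future only through get() is what prevents the error described above, where the parent would add x and y before both values exist.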

Advanced parallelism with Cilk: Inlets

The two remaining Cilk keywords are slightly more advanced, and concern the use of inlets. Ordinarily, when a Cilk procedure is spawned, it can return its results to the parent procedure only by putting those results in a variable in the parent's frame, as we assigned the results of our spawned procedure calls in the example to x and y.

The alternative is to use an inlet. An inlet is a function internal to a Cilk procedure which handles the results of a spawned procedure call as they return. One major reason to use inlets is that all the inlets of a procedure are guaranteed to operate atomically with regard to each other and to the parent procedure, thus avoiding the bugs that could occur if multiple returning procedures tried to update the same variables in the parent frame at the same time.

inlet -- This keyword identifies a function defined within the procedure as an inlet.

abort -- This keyword can only be used inside an inlet; it tells the scheduler that any other procedures that have been spawned off by the parent procedure can safely be aborted.
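The combination of an atomic inlet and abort can be imitated in standard C++ to show the idea. The sketch below is an emulation under loud assumptions, not Cilk syntax: a mutex makes the hypothetical inlet lambda atomic, an atomic flag stands in for abort, and the child tasks call the inlet themselves, whereas in Cilk the inlet runs in the parent's frame when a child returns. The function name parallel_find is invented for this example.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <future>
#include <mutex>
#include <vector>

// Searches `data` for `target` using up to `parts` child tasks.
// Returns the index of a match, or -1 if there is none.
int parallel_find(const std::vector<int>& data, int target, std::size_t parts)
{
    if (parts == 0) parts = 1;

    std::mutex inlet_lock;             // makes the "inlet" atomic
    std::atomic<bool> aborted(false);  // stands in for Cilk's abort
    int found = -1;

    // Emulated inlet: receives a child's result; only one runs at a time.
    auto inlet = [&](int index) {
        std::lock_guard<std::mutex> guard(inlet_lock);
        if (found < 0) {
            found = index;
            aborted.store(true);       // tell the other children to stop early
        }
    };

    const std::size_t n = data.size();
    const std::size_t chunk = (n + parts - 1) / parts;
    std::vector<std::future<void>> children;

    for (std::size_t lo = 0; lo < n; lo += chunk) {
        const std::size_t hi = std::min(n, lo + chunk);
        children.push_back(std::async(std::launch::async, [&, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i) {
                if (aborted.load()) return;   // sibling already succeeded
                if (data[i] == target) {
                    inlet(static_cast<int>(i));
                    return;
                }
            }
        }));
    }

    for (auto& child : children) child.get();  // analogue of sync
    return found;
}
```

Because every update to found goes through the locked inlet, two children that finish at the same moment cannot corrupt the parent's state, which is the guarantee the inlet keyword provides in Cilk.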

Work-stealing

The Cilk scheduler uses a policy called "work-stealing" to divide procedure execution efficiently among multiple processors. Again, it is easiest to understand if we look first at how Cilk code is executed on a single-processor machine.

The processor maintains a stack on which it places each frame that it has to suspend in order to handle a procedure call. If it is executing fib(2), and encounters a recursive call to fib(1), it will save fib(2)'s state, including its variables and where the code suspended execution, and put that state on the stack. It will not take a suspended state off the stack and resume execution until the procedure call that caused the suspension, and any procedures called in turn by that procedure, have all been fully executed.

With multiple processors, things change. Each processor still has a stack for storing frames whose execution has been suspended; however, these stacks are more like deques, in that suspended states can be removed from either end. A processor can still only remove states from its own stack at the same end that it puts them on. Any processor that is not currently working (having finished its own work, or not yet having been assigned any) will pick another processor at random, through the scheduler, and try to "steal" work from the opposite end of that processor's stack: suspended states, which the stealing processor can then begin to execute. The states that get stolen are the ones the victim processor would have got around to executing last.
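The two ends of such a deque can be shown with a minimal, single-threaded sketch. It only illustrates the ordering described above: the type WorkerDeque and its member names are hypothetical, frames are represented by strings, and the synchronization a real scheduler needs between the owner and a thief is deliberately left out.

```cpp
#include <deque>
#include <string>

// One processor's deque of suspended frames (frames shown as labels).
struct WorkerDeque {
    std::deque<std::string> frames;

    // The owner suspends a frame: it goes on the owner's end of the deque.
    void suspend(const std::string& frame) { frames.push_back(frame); }

    // The owner resumes work from the same end it pushes to (stack order).
    bool resume_own(std::string& out)
    {
        if (frames.empty()) return false;
        out = frames.back();
        frames.pop_back();
        return true;
    }

    // An idle processor steals from the opposite end: the oldest frame,
    // which the owner would have reached last.
    bool steal(std::string& out)
    {
        if (frames.empty()) return false;
        out = frames.front();
        frames.pop_front();
        return true;
    }
};
```

If the owner suspends fib(4), then fib(3), then fib(2), a thief receives fib(4) while the owner continues with fib(2), so the two processors work at opposite ends and rarely contend for the same frame.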

Commercialization

Before about 2006, the market for Cilk was restricted to high-performance computing. The emergence of multicore processors in mainstream computing means that hundreds of millions of new parallel computers are now being shipped every year. Cilk Arts was formed to capitalize on that opportunity: in 2006, Professor Leiserson launched Cilk Arts to create and bring to market a modern version of Cilk that supports the commercial needs of an upcoming generation of programmers. The company closed a Series A venture financing round in October 2007, and Cilk++ 1.0 shipped in December 2008. On July 31, 2009, Cilk Arts announced on its web site that its products and engineering team were now part of Intel Corp. The product downloads continue to be available, and Intel and Cilk Arts have said they plan to integrate and advance the technology further. Cilk++ differs from Cilk in several ways: support for C++, operation with both Microsoft and GCC compilers, support for loops, and "Cilk hyperobjects", a new construct designed to solve data race problems created by parallel accesses to global variables.

References

Cilk: An Efficient Multithreaded Runtime System by Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pp. 207–216, 1995.

Source: http://en.wikipedia.org/wiki/Cilk

Download Intel® Cilk++ SDK

Download Files
Select your operating system and accept the end user license agreement.

Download Intel® Cilk++ SDK package for 32-bit Linux* (cilk_8503-i686.release.tar.gz) 51.7MB
Download Intel® Cilk++ SDK package for 64-bit Linux* (cilk_8503-x86_64.release.tar.gz) 53.4MB

Intel® Cilk++ SDK is an extension to C++ that offers a quick, easy and reliable way to improve the performance of C++ programs on multi-core processors. The Intel Cilk++ SDK, based on technology acquired from Cilk Arts in August 2009, offers support for programmers using the GCC compiler for Linux* or the Microsoft C++ compiler for Windows*. The Intel Cilk++ SDK includes compiler support, runtime libraries, and tools for race detection, scalability and performance analysis. The three Cilk++ keywords provide a simple yet surprisingly powerful model for parallel programming, while runtime and template libraries offer a well-tuned environment for building parallel applications. The Intel Cilk++ SDK allows you to:
  • Write parallel programs using a simple model:  With only three keywords to learn, C++ developers move quickly into the parallel programming domain.
  • Optimize for parallel performance: Hyperobject libraries resolve race conditions without the performance overhead of traditional locking solutions, and the scalability analyzer predicts how performance will scale to systems with many more processors.
  • Leverage existing serial tools: The serial semantics of Cilk++ allows you to debug in a familiar serial debugger.
  • Verify the correctness of parallel programs: The race detector’s strong guarantee of race-free operation eliminates the worry that parallel bugs will compromise applications.
  • Scale for the future: The runtime system operates smoothly on systems with hundreds of cores.
As multi-core systems become prevalent on desktops, servers and even laptop systems, new performance leaps will come as the industry adopts parallel programming techniques. However, many parallel environments consist of confusing, complex and error-prone rules and constructs. The Cilk++ language, built on the Cilk technology developed at M.I.T. over the past two decades, is designed to provide a simple, well-structured model that makes development, verification and analysis easy. Because Cilk++ is an extension to C++, programmers typically do not need to restructure programs significantly in order to add parallelism.
Product Overview

 

Intel® Cilk++ SDK offers early exposure to the Cilk++ style of parallel development.

Convert a serial program to Cilk++. Here, we indicate a parallel region of a serial quicksort algorithm using the cilk_spawn and cilk_sync keywords:

template <typename T>
void qsort(T begin, T end) {
    if (begin != end) {
        T middle = partition(
            begin,
            end,
            bind2nd(less<typename iterator_traits<T>::value_type>(), *begin));
        cilk_spawn qsort(begin, middle);
        qsort(max(begin + 1, middle), end);
        cilk_sync;
    }
}

 

Compile and link the program with the compiler and runtime library from the command line or within Microsoft Visual Studio*.

 

Test the program on a single processor to ensure serial correctness.
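One way to approximate this serial check without the Cilk++ compiler is to compile the quicksort above with its keywords erased and compare the result with std::sort. The sketch makes several assumptions that are not part of the SDK: the empty macros for cilk_spawn and cilk_sync, the names elided_qsort and matches_std_sort, and a lambda in place of bind2nd, which current C++ standards no longer provide.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Illustrative only: erase the Cilk++ keywords to obtain serial code.
#define cilk_spawn
#define cilk_sync

template <typename T>
void elided_qsort(T begin, T end)
{
    if (begin != end) {
        typedef typename std::iterator_traits<T>::value_type value_type;
        const value_type pivot = *begin;  // copy, since partition moves elements
        T middle = std::partition(
            begin, end,
            [pivot](const value_type& v) { return v < pivot; });
        cilk_spawn elided_qsort(begin, middle);
        elided_qsort(std::max(begin + 1, middle), end);
        cilk_sync;
    }
}

#undef cilk_spawn
#undef cilk_sync

// Serial correctness check: the result must equal std::sort on the same input.
bool matches_std_sort(std::vector<int> input)
{
    std::vector<int> expected = input;
    std::sort(expected.begin(), expected.end());
    elided_qsort(input.begin(), input.end());
    return input == expected;
}
```

Inputs with duplicates, an already sorted range and an empty range are worth including, since they exercise the max(begin + 1, middle) step that keeps the recursion from looping on the pivot.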

 

Verify that the program is race free using the race detector, and correct any errors found. This sample output shows how the race detector displays the file name, source line, stack trace and variable name for a race in a sample program:


cilkscreen sum
Race condition on location 004367C8
  write access at 004268D8: (c:/sum.cilk:8, sum.exe!f+0x1a)
  read access at 004268CF: (c:/sum.cilk:8, sum.exe!f+0x11)
    called by 004269B4: (c:/sum.cilk:14, sum.exe!cilk_main+0xd0)
    called by 0042ABED: (c:/[...]/ostream:786, sum.exe!__cilk_main0+0x3d)
    called by 100081D5: (cilk_1_1-x86.dll!__cilkrts_ltq_overflow+0x137)
  Variable: 004367C8 - int sum



Correct race conditions using Hyperobjects:
  • Create private views of global variables in different strands running in parallel.
  • Use any data structure or control structure (not just parallel loops).
  • Minimize overhead with lazy view creation.
  • Simplify debugging and testing with deterministic results that are the same as the serial program results.
  • Use built-in support for parallel operations such as creating linked lists, constructing strings, summing numbers and generating file output.
  • Create custom Hyperobjects modeled after the templates provided.
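The idea of private views that are merged afterwards can be sketched in standard C++. This is not the Cilk++ hyperobject API; it is a hand-written emulation in which each task (standing in for a strand) sums into its own local view and the views are combined in order at the end. The function name parallel_sum and the strands parameter are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Sums `values` using up to `strands` tasks, each with a private view.
long parallel_sum(const std::vector<int>& values, std::size_t strands)
{
    if (strands == 0) strands = 1;

    const std::size_t n = values.size();
    const std::size_t chunk = (n + strands - 1) / strands;
    std::vector<std::future<long>> views;

    for (std::size_t lo = 0; lo < n; lo += chunk) {
        const std::size_t hi = std::min(n, lo + chunk);
        views.push_back(std::async(std::launch::async, [&values, lo, hi] {
            long view = 0;                 // private view: no sharing, no race
            for (std::size_t i = lo; i < hi; ++i) view += values[i];
            return view;
        }));
    }

    // Merge the views in their original order, so the result is the same
    // as the one a serial loop would produce.
    long sum = 0;
    for (auto& view : views) sum += view.get();
    return sum;
}
```

No task ever writes to a shared variable such as the global int sum flagged in the race detector output above, so no lock is needed during the summation itself.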
Check that the program will scale well using the scalability analyzer.
Source: http://software.intel.com/en-us/articles/intel-cilk/