HLS Lesson 19 (pragma, loop, pipeline, unroll, trip_count, assert)

Huskar_Liu

Article link: https://blog.csdn.net/weixin_42418557/article/details/120874114

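Any compile-time control over how a function's loops are expanded is expressed through pragmas.
The most commonly used pragmas are described in detail below.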
+++++++++++++++++++++++++++++++++++++++
pragma HLS loop_flatten

Flattening nested loops allows them to be optimized as a single loop.

#pragma HLS loop_flatten 
#pragma HLS loop_flatten off

Place the pragma in the C source within the boundaries of the nested loop.

void foo (num_samples, ...) {
	int i;
	...
	loop_1: for(i=0;i< num_samples;i++) {
	#pragma HLS loop_flatten
		...
		result = a + b;
	}
}

The example above flattens loop_1 in function foo and all (perfect or semi-perfect) loops above it in the loop hierarchy into a single loop.
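
As a hedged illustration of what flattening means, the sketch below puts the pragma inside the innermost loop of a perfect two-level nest and sets a hand-written single loop beside it that visits the same elements. The names sum_nested, sum_flat, ROWS and COLS are invented for this example, and the hand-flattened version is only a software model of the idea, not the RTL that Vivado HLS generates.

```c
#define ROWS 4
#define COLS 8

/* Perfect loop nest: all of the work sits in the innermost loop body. */
int sum_nested(int in[ROWS][COLS]) {
    int acc = 0;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
#pragma HLS loop_flatten
            acc += in[r][c];
        }
    }
    return acc;
}

/* Software model of the flattened form: one loop of ROWS*COLS iterations. */
int sum_flat(int in[ROWS][COLS]) {
    int acc = 0;
    for (int i = 0; i < ROWS * COLS; i++) {
        acc += in[i / COLS][i % COLS];
    }
    return acc;
}
```

Both functions return the same value for any input; the flattened form simply avoids the cycles spent entering and leaving the inner loop.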

++++++++++++++++++++++++++++++++++++++++++++
pragma HLS loop_merge

The LOOP_MERGE pragma will seek to merge all loops within the scope it is placed. For example,
if you apply a LOOP_MERGE pragma in the body of a loop, Vivado HLS applies the pragma to any
sub-loops within the loop but not to the loop itself.

void foo (num_samples, ...) {
#pragma HLS loop_merge
	
	int i;
	...
	loop_1: for(i=0;i< num_samples;i++) {
	...
	}
	...
}

The example above merges all consecutive loops in function foo into a single loop.

	loop_2: for(i=0;i< num_samples;i++) {
		#pragma HLS loop_merge 
			...
		}

With the pragma placed in the body of loop_2, all loops inside loop_2 (but not loop_2 itself) are merged.
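
To make the effect of merging concrete, here is a minimal sketch with two consecutive loops that share the same bounds, followed by a hand-merged equivalent. The function names and the array size N are assumptions made for illustration; the merged function is a software model of the result, and whether the tool actually merges a given pair of loops depends on its own rules.

```c
#define N 8

/* Two consecutive loops with identical bounds: candidates for merging. */
void scale_and_offset(const int in[N], int doubled[N], int shifted[N]) {
#pragma HLS loop_merge
    for (int i = 0; i < N; i++) {
        doubled[i] = in[i] * 2;
    }
    for (int i = 0; i < N; i++) {
        shifted[i] = in[i] + 1;
    }
}

/* Software model of the merged result: one loop executing both bodies. */
void scale_and_offset_merged(const int in[N], int doubled[N], int shifted[N]) {
    for (int i = 0; i < N; i++) {
        doubled[i] = in[i] * 2;
        shifted[i] = in[i] + 1;
    }
}
```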

+++++++++++++++++++++++++++++++++++++++
pragma HLS pipeline

A pipelined function or loop can process new inputs every N clock cycles, where N is the initiation interval (II) of the loop or function. The default initiation interval for the PIPELINE pragma is 1, which processes a new input every clock cycle.

#pragma HLS pipeline II=<int> enable_flush rewind

II=<int>: Specifies the desired initiation interval for the pipeline.
enable_flush: An optional keyword which implements a pipeline that will flush and empty if the data valid at the input of the pipeline goes inactive.
rewind: An optional keyword that enables rewinding, or continuous loop pipelining with no pause between one loop iteration ending and the next iteration starting.

void foo(a, b, c, d) {
#pragma HLS pipeline II=1
	...
}

In the example above, function foo is pipelined with an initiation interval of 1.
Because the default value of II is 1, writing II=1 explicitly is optional.
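
The same pragma can be placed inside a loop body. The sketch below pipelines a simple element-wise add and adds a small helper that evaluates the usual ideal-pipeline estimate, depth + II * (trip_count - 1). The names vec_add and pipelined_cycles, and the depth value in the usage note, are assumptions for illustration; the estimate is the textbook formula, not a number taken from a Vivado HLS report.

```c
#define N 16

/* Element-wise add; the pragma requests a new iteration every clock cycle. */
void vec_add(const int a[N], const int b[N], int out[N]) {
    for (int i = 0; i < N; i++) {
#pragma HLS pipeline II=1
        out[i] = a[i] + b[i];
    }
}

/* Ideal-pipeline estimate: the first result appears after `depth` cycles,
   then one more result every `ii` cycles. */
int pipelined_cycles(int trip_count, int ii, int depth) {
    if (trip_count <= 0) {
        return 0;
    }
    return depth + ii * (trip_count - 1);
}
```

With a hypothetical pipeline depth of 3, 16 iterations take about 18 cycles at II=1 and about 33 cycles at II=2.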

++++++++++++++++++++++++++++++++++++++++++++++++++++++
pragma HLS unroll

Unroll loops to create multiple independent operations rather than a single collection of
operations.
The UNROLL pragma transforms loops by creating multiple copies of the loop body
in the RTL design, which allows some or all loop iterations to occur in parallel.
Using the UNROLL pragma you can unroll loops to increase data access and throughput.

Partially unrolling a loop lets you specify a factor N, to create N copies of the loop body and reduce the loop iterations accordingly.

for(int i = 0; i < X; i++) {
#pragma HLS unroll factor=2
	a[i] = b[i] + c[i];
}

Loop unrolling by a factor of 2 effectively transforms the code to look like the following code,
where the break construct ensures that the functionality remains the same and that the loop exits at the right point when X is odd.

for(int i = 0; i < X; i += 2) {
	a[i] = b[i] + c[i];
	
	if (i+1 >= X) break;
	
	a[i+1] = b[i+1] + c[i+1];
}
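
The equivalence of the two forms can be checked in plain C. The sketch below wraps each version in a function (add_rolled and add_unrolled2 are names chosen here, not part of the original text) so that both can be run on an odd trip count, which is the case the break statement exists for.

```c
/* Rolled reference loop, carrying the pragma as it would appear in HLS source. */
void add_rolled(int *a, const int *b, const int *c, int x) {
    for (int i = 0; i < x; i++) {
#pragma HLS unroll factor=2
        a[i] = b[i] + c[i];
    }
}

/* Hand-unrolled by 2; the break acts as the exit check for odd x. */
void add_unrolled2(int *a, const int *b, const int *c, int x) {
    for (int i = 0; i < x; i += 2) {
        a[i] = b[i] + c[i];
        if (i + 1 >= x) break;
        a[i + 1] = b[i + 1] + c[i + 1];
    }
}
```

For x = 5 both functions write exactly elements 0 to 4 and leave element 5 untouched.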

#pragma HLS unroll factor=<N> region skip_exit_check

factor=<N>: Specifies a non-zero integer indicating that partial unrolling is requested. If factor= is not specified, the loop is fully unrolled.
region: An optional keyword that unrolls all loops within the body (region) of the specified loop, without unrolling the enclosing loop itself.
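skip_exit_check: An optional keyword that applies only if partial unrolling is specified
with factor=. It removes the exit check from the unrolled loop body, so for a variable loop bound you must ensure that the iteration count is an integer multiple of the factor.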

void foo (...) {
	int8 array1[M];
	int12 array2[N];
	...
	loop_2: for(i=0;i<M;i++) {
	#pragma HLS unroll skip_exit_check factor=4
		array1[i] = ...;
		array2[i] = ...;
		...
	}
	...
}

The example above specifies an unroll factor of 4 to partially unroll loop_2 of function foo, and
removes the exit check.
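
What removing the exit check implies can be modeled in software. In the sketch below, which is an illustration only, the loop body (dst[i] = i * 3) and the function name are invented because the original example elides its body. Each pass writes four elements with no check in between, so the result is only valid when the trip count is a multiple of 4.

```c
#include <assert.h>

/* Software model of factor=4 unrolling with the exit check removed.
   Only correct when n is a multiple of 4 (an assumption of this sketch). */
void fill_unrolled4_no_check(int *dst, int n) {
    assert(n % 4 == 0);  /* the guarantee the designer must provide */
    for (int i = 0; i < n; i += 4) {
        dst[i]     = i * 3;
        dst[i + 1] = (i + 1) * 3;
        dst[i + 2] = (i + 2) * 3;
        dst[i + 3] = (i + 3) * 3;
    }
}
```

If n were 6, the second pass would write elements 6 and 7 past the intended range, which is why the assertion guards the call.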

void foo(int data_in[N], int scale, int data_out1[N], int data_out2[N]) {
	int temp1[N];
	loop_1: for(int i = 0; i < N; i++) {
	#pragma HLS unroll region
	
		temp1[i] = data_in[i] * scale;
		
		loop_2: for(int j = 0; j < N; j++) {
			data_out1[j] = temp1[j] * 123;
		}
		
		loop_3: for(int k = 0; k < N; k++) {
			data_out2[k] = temp1[k] * 456;
		}
	}
}

The example above fully unrolls all loops inside loop_1 in function foo, but not loop_1
itself, because of the region keyword.

+++++++++++++++++++++++++++++++++++++++
pragma HLS LOOP_TRIPCOUNT

Vivado HLS performs analysis to determine the number of iterations of each loop. If the loop iteration limit is a variable, Vivado HLS cannot determine the upper limit.

The LOOP_TRIPCOUNT pragma can be applied to the loop to manually specify the number of loop iterations and ensure the report contains useful numbers. The max option tells Vivado HLS the maximum number of iterations the loop performs, and the min option specifies the minimum number of iterations.

#pragma HLS LOOP_TRIPCOUNT min=0 max=1920
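
A minimal sketch of where the pragma goes, assuming a hypothetical row_sum function whose loop bound is the runtime argument width. The pragma sits inside the loop body. It only feeds the latency report and does not change the synthesized hardware, so the function behaves identically with or without it.

```c
/* Sum one image row; `width` is only known at run time. */
int row_sum(const int *pixels, int width) {
    int sum = 0;
    for (int x = 0; x < width; x++) {
#pragma HLS LOOP_TRIPCOUNT min=0 max=1920
        sum += pixels[x];
    }
    return sum;
}
```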

++++++++++++++++++++++++++++++++++++++
assert

If the C assert macro is used in the code, Vivado HLS can use it to both determine the loop limits automatically and create hardware that is exactly sized to these limits.

In addition, assert statements can be used to specify the maximum values of loop bounds.

// These assertions let HLS know the upper bounds of loops
assert(height <= MAX_IMG_ROWS);
assert(width <= MAX_IMG_COLS);

This is a good coding style which allows HLS to automatically report on the latencies of variable bounded loops and optimize the loop bounds.

The assert macro in C is supported for synthesis when used to assert range information.
For example, the upper limit of variables and loop-bounds.
When variable loop bounds are present, Vivado HLS cannot determine the latency for all iterations of the loop and reports the latency with a question mark.

assert statements are placed before each of the loops.
These assertions:
• Guarantee that if the assertion is false and the value is greater than that stated, the C simulation will fail. This also highlights why it is important to simulate the C code before synthesis: confirm the design is valid before synthesis.
• Inform Vivado HLS that the variable will never exceed this value, a fact the tool can use to optimize the variable's size in the RTL and, in this case, the loop iteration count.
The following code example shows these assertions.

	assert(xlimit<=32);
	SUM_X:for (i=0;i<=xlimit; i++) {
		X_accum += A[i];
		X[i] = X_accum;
	}
	
	assert(ylimit<=16);
	SUM_Y:for (i=0;i<=ylimit; i++) {
		Y_accum += B[i];
		Y[i] = Y_accum;
	}

Because the assertions assert that the values will never be greater than 32 and 16, Vivado HLS can use this in the reporting.
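
For completeness, here is a self-contained sketch of the first loop wrapped in a function so that it can be compiled and simulated. The function name sum_x, the macro MAX_XLIMIT and the array sizes are assumptions made here. Note that the loop condition is i <= xlimit, so with xlimit <= 32 the loop runs at most 33 times and the arrays need 33 elements.

```c
#include <assert.h>

#define MAX_XLIMIT 32

/* Running sum of A[0..xlimit]; the bound is inclusive, hence the +1 sizes. */
void sum_x(const int A[MAX_XLIMIT + 1], int X[MAX_XLIMIT + 1], int xlimit) {
    int X_accum = 0;
    int i;

    /* Fails in C simulation if the caller exceeds the stated range. */
    assert(xlimit <= MAX_XLIMIT);

    for (i = 0; i <= xlimit; i++) {
        X_accum += A[i];
        X[i] = X_accum;
    }
}
```

Calling sum_x with xlimit = 3 fills X[0] to X[3] with the running totals and leaves the remaining elements untouched.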