Notes on Optimizing Generated HDL Code

Goals:

1. Meet timing requirements through pipelining.

2. Choose a concrete hardware implementation target, e.g. a specific FPGA device.

3. Verify correctness with bit-level (bit-true) cross-validation.

Core idea: it still comes down to shortening the critical path, i.e. keeping the combinational logic processed per clock cycle as small as possible.

Example One

Square Root Implementation

Square root algorithm: first convert the radicand to binary, padding the word a little wider, to around 16 bits. Then, testing bits from the most significant position downward, add the shifted bit to the running output value and square the sum. Compare that square with the original number: if it is smaller, keep the sum; if larger, keep the value from before the addition. Iterate bit by bit and finally output the square root.

Evaluation: many multipliers, many iterations, and the precision still needs consideration.
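The iterative scheme above can be sketched as a small software model. This is only an illustration of the bit-serial idea, not the actual generated hardware; the function name and the 16-bit width are assumptions.

```python
def bit_sqrt(x: int, width: int = 16) -> int:
    """Bit-serial square root: test result bits from MSB down to LSB."""
    result = 0
    for i in reversed(range(width // 2)):   # the root needs width/2 bits
        trial = result | (1 << i)           # tentatively set the next bit
        if trial * trial <= x:              # square and compare with the radicand
            result = trial                  # square not too big: keep the bit
    return result

print(bit_sqrt(81))   # 9
```

Each iteration costs one multiplication (the squaring) plus one comparison, which matches the evaluation above: many multipliers and many iterations. In hardware the full squaring multiply can be avoided with an incremental (restoring/non-restoring) formulation.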

Example Two

HDL Workflow Advisor

1. First run the command hdlsetuptoolpath('ToolName', 'Xilinx Vivado', ...

'ToolPath', 'D:\Vivado\Vivado\2018.3\bin'); % this registers the local Vivado installation path with MATLAB.

2. Then open Simulink, launch the HDL Workflow Advisor, and confirm that the tool is detected.

3. Set the Target Frequency; this generates a clock-constraint file (.xdc). The combinational logic in each cycle must complete within 5 ns.

The data sample rate and the clock frequency are analyzed in detail later; there is often a difference!
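The 5 ns per-cycle budget is just the clock period of the target frequency. A quick sanity check (the 200 MHz figure is inferred from the 5 ns budget, not stated in the original):

```python
target_freq_mhz = 200.0                  # assumed target frequency
clock_period_ns = 1e3 / target_freq_mhz  # period = 1/f, MHz -> ns conversion
print(clock_period_ns)                   # 5.0 ns of combinational budget per cycle
```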

4. Generate the HDL code; here we mainly use Verilog, so the generated files have the .v suffix.

5. Keep an eye on timing: open the critical path estimation report and click Highlight Critical Path.

The longest critical path is then visualized directly in the model.

Dark blue marks the critical path, but that is not necessarily where it ends: the path may continue at a higher hierarchy level, so trace it outward through testout as well.

Clearly, the timing requirement is not met at this point.

6. Synthesis and implementation, with Vivado running embedded in Simulink.

By default the implementation step is skipped, but the most accurate timing analysis is only available after implementation.

You can trigger it manually: synthesis first, then implementation. The implementation run can take a very long time...

Example Three

Pipelining

1. The way to fix timing problems is to add registers, which is usually called adding pipeline stages.

2. You can insert a Delay manually, or let the tool add delays adaptively (based on the configured device and frequency).

3. Block properties.

4. Distributed Pipelining.
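As a toy illustration of why a pipeline register helps (a software model of the idea, not the tool's actual transformation; all names here are made up): registering the multiplier output means each clock cycle only pays for one operation's combinational delay, at the cost of one cycle of latency.

```python
def pipelined_mac(stream):
    """a*b + c with one pipeline register after the multiplier.

    Each cycle now contains either a multiply or an add, not both,
    so the critical path is roughly halved; results appear one cycle late.
    """
    prod_reg = None   # register on the multiplier output
    c_reg = None      # matching delay on the c operand, to keep data aligned
    out = []
    for a, b, c in stream:
        if prod_reg is not None:
            out.append(prod_reg + c_reg)  # stage 2: adder
        prod_reg, c_reg = a * b, c        # stage 1: multiplier + register
    return out

# feed a zero at the end to flush the pipeline
print(pipelined_mac([(1, 2, 3), (4, 5, 6), (0, 0, 0)]))  # [5, 26]
```

Note the matching delay on the c operand: when you insert a register on one path by hand, every parallel path needs the same delay, which is exactly the bookkeeping the adaptive option automates.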

Example Four

Demonstration of Adaptive Pipelining

Before: we already have a small module, ideally with fixed-point conversion done, but timing fails and no pipeline has been added yet.

Now, in the settings under Optimization > Pipelining, check the adaptive pipelining option, then open the Workflow Advisor.

Run through all the steps until everything is green, then open the code generation report.

Just above the bottom entry is the comparison model; click the generated model at the very bottom (it holds the latest optimized result). Very impressive!

The most striking change is that Delays were added both before and after the multipliers, which is quite smart.

If a HwModelRegister block shows up as z-d, that is normal too; just refresh, it happens because hdlsetup has not been run.

Now look at the critical path again:

Very clear!

The multiplier delay is large here, but when mapped onto dedicated DSP hardware the actual delay is much smaller.

Indeed, inside Vivado the delay after synthesis and implementation turns out to be much smaller.

Synthesis         Slack    0.508 ns

Implementation    Slack    0.233 ns
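Slack is simply the clock period minus the longest register-to-register data-path delay (ignoring clock uncertainty here); positive slack means timing is met. The data-path delays below are back-computed from the reported slacks, i.e. illustrative values only:

```python
clock_period_ns = 5.0

# data-path delays back-computed from the reported slacks (illustrative)
post_synth_delay_ns = 5.0 - 0.508
post_impl_delay_ns = 5.0 - 0.233

synth_slack = clock_period_ns - post_synth_delay_ns   # ~0.508 ns
impl_slack = clock_period_ns - post_impl_delay_ns     # ~0.233 ns
print(synth_slack >= 0 and impl_slack >= 0)           # True -> timing met
```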

Be sure to locate the Timing Summary Report file.

It is saved as a .rpt file under the vivado_prj project directory; in MATLAB, remember to open it by dragging it into the editor.

The overall resource-utilization table and the timing analysis report both live under this root directory:

hdl_prj\vivado_prj\slprj

At this point the timing optimization is done: the timing requirement is met, and the critical path completes within the 5 ns cycle.

Summary

Manual delays work.

Adaptive delays work too.

As long as the timing requirement is met in the end, either is fine!

