RapidWright Series, Part 3: Building a Basic Router

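Part 0: Preface

This post reproduces the RapidWright tutorial "Build a Basic Router" from the FCCM 2019 Workshop. The official tutorial is already fairly detailed, but it assumes some prior knowledge of RapidWright and leaves a few basic concepts unexplained. This post quotes the tutorial's own explanations and fills in the background knowledge the tutorial leaves out.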

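Part 1: Background

What is routing?

Routing maps a logical net onto the FPGA's interconnect resources. It has two parts: intra-site routing and inter-site routing. Inter-site routing is mainly performed by Vivado's route_design. Intra-site routing happens inside a site, i.e. it connects site wires to the net, and is mostly done during placement (place_design). This tutorial focuses on inter-site routing.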

Aside: what is a site?

Figure 1

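A Xilinx FPGA can be described at six levels of abstraction. The most basic elements are LUTs and FFs, called BELs (Basic Elements of Logic). Several BELs make up a site (a slice is one kind of site), several sites make up a tile, several tiles make up an FSR (fabric sub-region, also known as a clock region), several FSRs make up an SLR (super logic region), and several SLRs make up a device.

During placement, each LUT or FF is first assigned to a BEL inside some site, and its inputs and outputs are then connected to the pins of that site. The figure above is the schematic of a 7-series SLICEL, which contains four LUT6s and eight FFs. Suppose we want to implement an OR gate. Among the many slices we pick X0Y0, and among its LUTs and FFs we pick ALUT and AFF to implement the function: that is placement. Two things remain after placement. One is to connect the output of ALUT to the input of AFF. The other is to connect the inputs of ALUT and the output of AFF to the site's pins: here the ALUT inputs connect to the slice pins A6:A1 and the AFF output connects to AQ. This is the intra-site routing mentioned above, i.e. routing inside a site.

Now suppose we want to connect X0Y0/BFF to X0Y1/ALUT. We cannot wire the FF and the LUT to each other directly. Instead, we first connect site X0Y0 to site X0Y1 through their site pins, and the wires inside each site then complete the connection. This is the inter-site routing mentioned above, i.e. routing between sites.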
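To make the intra-site / inter-site distinction concrete, here is a minimal plain-Python sketch. It does not use the RapidWright API. The helper names `split_placement` and `connection_kind` are made up for illustration, and the only assumption is that a placement string has the form `SITE/BEL`, as in `SLICE_X0Y0/BFF`. Whether a same-site connection can really stay inside the site depends on the site's internal wiring, which this sketch does not model.

```python
def split_placement(loc):
    """Split a placement string such as 'SLICE_X0Y0/BFF' into (site, bel)."""
    site, bel = loc.split("/")
    return site, bel


def connection_kind(src_loc, snk_loc):
    """Classify a connection between two placed BELs (illustrative only).

    Same site      -> candidate for intra-site routing on site wires.
    Different site -> must leave through a site pin and use the
                      interconnect between sites (inter-site routing).
    """
    src_site, _ = split_placement(src_loc)
    snk_site, _ = split_placement(snk_loc)
    return "intra-site" if src_site == snk_site else "inter-site"


print(connection_kind("SLICE_X0Y0/ALUT", "SLICE_X0Y0/AFF"))  # intra-site
print(connection_kind("SLICE_X0Y0/BFF", "SLICE_X0Y1/ALUT"))  # inter-site
```

The second call corresponds to the X0Y0/BFF to X0Y1/ALUT example in the text: the two BELs sit in different sites, so the connection is a job for the inter-site router.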

FPGA routing resources

Figure 2

What is a net?

In RapidWright, we have two kinds of nets, logical and physical. A logical net is a network of connected cell pins (both hierarchical and leaf) of a netlist. A physical net is a list of PIPs that describe connections between physical site pins and crosses hierarchical boundaries of the netlist.

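In other words, RapidWright has two kinds of nets: logical nets and physical nets. A logical net records which cell pins are on the net (pins on the same net are thereby connected). A physical net records not only which site pins it contains, but also which PIPs the connections between those site pins pass through. (I do not yet know what "crosses hierarchy" means. I will leave that open for now and come back to it later.)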

Net Type     | RapidWright Class | Crosses Hierarchy? | Pins Live on a…
Logical Net  | EDIFNet           | No                 | Cell (EDIFCellInst)
Physical Net | Net               | Yes                | Site (SiteInst)

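A physical net takes its name from its source logical net, i.e. the logical net attached to the source leaf cell, which is consistent with Vivado. A typical physical net mainly holds the following information: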

Class Member | Description                                  | API Getter
name         | Full hierarchical name of the net            | public String Net.getName()
pins         | Source/sink site pins                        | public List<SitePinInst> Net.getPins()
pips         | Programmable interconnect points (PIPs) used | public List<PIP> Net.getPIPs()

The goal of routing is to choose a suitable set of PIPs for each net.
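As a mental model of the three members listed above, the following plain-Python sketch mimics a physical net as a small record. `PhysicalNetSketch` is a hypothetical stand-in, not the RapidWright `Net` class, and the pin and PIP strings are illustrative values (the PIP name is assembled from two wire names that appear in the search log later in this post, not taken from a routed design).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalNetSketch:
    """Toy model of a physical net: a name, its site pins, and its PIPs."""
    name: str
    pins: List[str] = field(default_factory=list)
    pips: List[str] = field(default_factory=list)

    def is_routed(self):
        # In this toy model a net counts as routed once it has any PIPs.
        return len(self.pips) > 0


# Illustrative site pins for a source FF output and a sink FF input.
net = PhysicalNetSketch("ff_net", pins=["SLICE_X0Y4/AQ", "SLICE_X2Y4/BX"])
before = net.is_routed()

# "Routing" the net means filling in its list of PIPs.
net.pips.append("INT_L_X2Y4/INT_L.LOGIC_OUTS_L4->>EE2BEG0")  # illustrative PIP
after = net.is_routed()

print(before, after)  # False True
```

A freshly created net has pins but no PIPs. The router's whole job is to produce the contents of `pips`.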

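Here is a concrete example. The left side of the figure shows a logical net with three ports. Its source is named ffWrapperInst/out0, so the corresponding physical net on the right is also named ffWrapperInst/out0.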

Figure 3

Part 2: Down to Business, Let's Route!

1. Import the RapidWright libraries

from com.xilinx.rapidwright.design import Cell
from com.xilinx.rapidwright.design import Design
from com.xilinx.rapidwright.design import DesignTools
from com.xilinx.rapidwright.design import Net
from com.xilinx.rapidwright.design import NetType
from com.xilinx.rapidwright.design import PinType
from com.xilinx.rapidwright.design import Unisim
from com.xilinx.rapidwright.device import Device
from com.xilinx.rapidwright.device import Node
from com.xilinx.rapidwright.edif   import EDIFDirection
from com.xilinx.rapidwright.edif   import EDIFTools
from com.xilinx.rapidwright.router import RouteNode
from com.xilinx.rapidwright.router import Router
from com.xilinx.rapidwright.util   import MessageGenerator
from java.util import HashSet
from java.util import List
from pprint import pprint

2. Generate a logical netlist

def createTopPortNet(name, direction):
    net = top.createNet(name)
    port = top.createPort(name,direction, 1)
    net.createPortInst(port)
    return net

def makeFlipFlop(name, loc):
    ff = design.createAndPlaceCell(name, Unisim.FDRE, loc)
    gnd.createPortInst("R" , ff.getEDIFCellInst())
    vcc.createPortInst("CE", ff.getEDIFCellInst())
    clk.createPortInst("C" , ff.getEDIFCellInst())
    return ff

# Create a new, empty design
design = Design("MyFirstRoute",Device.PYNQ_Z1)
# Get some useful handles
device = design.getDevice()
netlist = design.getNetlist()
top = netlist.getTopCell()
gnd = EDIFTools.getStaticNet(NetType.GND, top, netlist);
vcc = EDIFTools.getStaticNet(NetType.VCC, top, netlist);
ports = {"clk":EDIFDirection.INPUT, "in0":EDIFDirection.INPUT,"out0":EDIFDirection.OUTPUT}
(clk, in0, out0) = tuple([createTopPortNet(k,v) for k,v in ports.iteritems()])

# Create our source and sink flip flops, connect plumbing pins (clk, ce, rst)
src_ff = makeFlipFlop("src_ff", "SLICE_X0Y4/AFF")
snk_ff = makeFlipFlop("snk_ff", "SLICE_X2Y4/BFF")

# Create the net to route and connect it to the FFs
net = design.createNet("ff_net")
src = DesignTools.createPinAndAddToNet(src_ff,"Q",net)
snk = DesignTools.createPinAndAddToNet(snk_ff,"D",net)
in0.createPortInst("D",src_ff.getEDIFCellInst())
out0.createPortInst("Q",snk_ff.getEDIFCellInst())

# These make the design route-able by Vivado (sanity check)
design.routeSites()
design.setAutoIOBuffers(False)
design.setDesignOutOfContext(True)
design.writeCheckpoint(design.getName() + ".dcp")
print "Wrote: " + design.getName() + ".dcp"

Let's go through it block by block.

device = design.getDevice()
netlist = design.getNetlist()
top = netlist.getTopCell()
gnd = EDIFTools.getStaticNet(NetType.GND, top, netlist);
vcc = EDIFTools.getStaticNet(NetType.VCC, top, netlist);
ports = {"clk":EDIFDirection.INPUT, "in0":EDIFDirection.INPUT,"out0":EDIFDirection.OUTPUT}
(clk, in0, out0) = tuple([createTopPortNet(k,v) for k,v in ports.iteritems()])

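First we get the top cell of the netlist. The top cell is the root of the netlist hierarchy: what connects to it from outside are input and output pins rather than FPGA logic resources. In Figure 3, for example, the top cell is the cell that contains ffWrapperInst and ffWrapperInst2, and the pins of the top cell are usually the FPGA's I/Os. The following line defines three nets on the top cell: clk, in0, and out0.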

(clk, in0, out0) = tuple([createTopPortNet(k,v) for k,v in ports.iteritems()])

With the external ports defined, we move on to the internals. First, two FFs are placed:

src_ff = makeFlipFlop("src_ff", "SLICE_X0Y4/AFF")
snk_ff = makeFlipFlop("snk_ff", "SLICE_X2Y4/BFF")

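Next we create a net that connects the Q output of src_ff to the D input of snk_ff, and connect the top cell's nets in0 and out0 to the D input of src_ff and the Q output of snk_ff respectively. In RapidWright, adding a pin to a net is all it takes to connect it.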

net = design.createNet("ff_net")
src = DesignTools.createPinAndAddToNet(src_ff,"Q",net)
snk = DesignTools.createPinAndAddToNet(snk_ff,"D",net)
in0.createPortInst("D",src_ff.getEDIFCellInst())
out0.createPortInst("Q",snk_ff.getEDIFCellInst())

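Then the design is saved as a DCP file. I am not sure exactly why the settings used in the tutorial make the design routable by Vivado, so I will skip that for now. Opening the DCP file shows that the two FFs are connected, but only by a logical net, not a physical net. The next step turns this logical net into a physical net.
Figure 4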

3. The simplest router

# We'll use a simple Manhattan distance between route nodes 
# to estimate cost/delay
def costFunction(curr, snk):
    return curr.getManhattanDistance(snk)

# Let's put our route routine into a function
def findRoute(src, snk):
    # We will use a priority queue to sort through the nodes we encounter, 
    # those with the least cost will fall to the bottom
    q = RouteNode.getPriorityQueue()
    q.add(src)
    
    # We'll keep track of where we have visited and a watchdog timer
    visited = HashSet()
    watchdog = 50000
    
    # While we still have nodes to look at, keep expanding
    while(not q.isEmpty()):
        curr = q.poll()
        if(curr.equals(snk)):
            print "Visited Wire Count: " + str(visited.size())
            # We've found the sink, recover our trail of used PIPs 
            return curr.getPIPsBackToSource()
        
        visited.add(curr)
        watchdog = watchdog - 1
        if(watchdog < 0): break
        # Print our search path to make debugging easier
        print MessageGenerator.makeWhiteSpace(curr.getLevel()) + str(curr)
        # Expand the current node to look for more nodes/paths
        for wire in curr.getConnections():
            nextNode = RouteNode(wire,curr)
            if wire.isRouteThru(): continue # We'll skip routethrus for this tutorial
            if visited.contains(nextNode): continue
            # Here we call the costFunction (above) to store a cost on the node
            nextNode.setCost(costFunction(nextNode,snk))
            q.add(nextNode)
    # Unroutable situation
    print "Route failed!"
    return

net.setPIPs(findRoute(src.getRouteNode(), snk.getRouteNode()))
print "PIP Count: "+str(net.getPIPs().size())
#for p in net.getPIPs():
#    print p

design.writeCheckpoint(design.getName() +".dcp")
print "Wrote: " + design.getName() + ".dcp"

A few background points are needed to understand this code.

  1. Manhattan distance

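The Manhattan distance between two points (x1, y1) and (x2, y2) is |x1 - x2| + |y1 - y2|, i.e. the distance travelled when only horizontal and vertical moves are allowed.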

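  2. Where does the cost come from?

Both arguments of costFunction are RouteNodes. The Manhattan distance between two RouteNodes is really the Manhattan distance between the tiles they sit in.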

CLBLM_L_X2Y4/CLBLM_M_AQ 0 0 OUTPUT
 CLBLM_L_X2Y4/CLBLM_LOGIC_OUTS4 1 1 OUTBOUND
  INT_L_X2Y4/LOGIC_OUTS_L4 1 2 OUTBOUND
   INT_L_X2Y4/WW4BEG0 1 3 HQUAD
   INT_L_X2Y4/EE2BEG0 1 3 DOUBLE
    INT_R_X3Y4/EE2A0 0 4 DOUBLE
    CLBLM_R_X3Y4/CLBLM_EE2A0 0 4 DOUBLE
   INT_L_X2Y4/SE6BEG0 1 3 BENTQUAD
    INT_R_X3Y4/SE6A0 0 4 BENTQUAD
   INT_L_X2Y4/NE2BEG0 1 3 DOUBLE
    INT_R_X3Y4/NE2END_S3_0 0 4 DOUBLE
     INT_R_X3Y4/BYP_ALT7 0 5 PINBOUNCE
     INT_R_X3Y4/IMUX31 0 5 PINFEED
     INT_R_X3Y4/IMUX39 0 5 PINFEED
     INT_R_X3Y4/IMUX47 0 5 PINFEED
      INT_R_X3Y4/BYP_BOUNCE7 0 6 BOUNCEACROSS
      INT_R_X3Y4/BYP7 0 6 PINFEED
      CLBLM_R_X3Y4/CLBLM_IMUX31 0 6 PINFEED
      CLBLM_R_X3Y4/CLBLM_IMUX39 0 6 PINFEED
      CLBLM_R_X3Y4/CLBLM_IMUX47 0 6 PINFEED
       CLBLM_R_X3Y4/CLBLM_BYP7 0 7 PINFEED
       CLBLM_R_X3Y4/CLBLM_M_C5 0 7 LUTINPUT
       CLBLM_R_X3Y4/CLBLM_L_D3 0 7 LUTINPUT
       CLBLM_R_X3Y4/CLBLM_M_D5 0 7 LUTINPUT
        CLBLM_R_X3Y4/CLBLM_L_DX 0 8 INPUT

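The listing above is part of the output of the program. A quick explanation: the line INT_L_X2Y4/LOGIC_OUTS_L4 1 2 OUTBOUND means that the RouteNode INT_L_X2Y4/LOGIC_OUTS_L4 has Cost=1 and Level=2. The cost is set by the line nextNode.setCost(costFunction(nextNode,snk)) and is the Manhattan distance from this node to snk. As noted above, the Manhattan distance between two RouteNodes is really the Manhattan distance between their tiles. In this example snk is CLBLM_R_X3Y4/CLBLM_M_BX 0 0 INPUT and src is CLBLM_L_X2Y4/CLBLM_M_AQ 0 0 OUTPUT. The Manhattan distance between X2Y4 and X3Y4 is 1, so Cost=1.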
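The cost values in the log can be reproduced with a few lines of plain Python. This is only a sketch: `tile_xy` and `manhattan` are hypothetical helpers, and they assume (as the log suggests) that the distance is computed from the X/Y indices embedded in the tile name. RapidWright's own getManhattanDistance is not called here.

```python
import re


def tile_xy(name):
    """Extract (x, y) from a name such as 'CLBLM_L_X2Y4/CLBLM_M_AQ'."""
    tile = name.split("/")[0]
    match = re.search(r"_X(\d+)Y(\d+)$", tile)
    if match is None:
        raise ValueError("no X/Y coordinates in tile name: " + tile)
    return int(match.group(1)), int(match.group(2))


def manhattan(a, b):
    """Manhattan distance between the tiles of two named wires."""
    ax, ay = tile_xy(a)
    bx, by = tile_xy(b)
    return abs(ax - bx) + abs(ay - by)


snk = "CLBLM_R_X3Y4/CLBLM_M_BX"
print(manhattan("INT_L_X2Y4/LOGIC_OUTS_L4", snk))  # 1
print(manhattan("INT_R_X3Y4/IMUX31", snk))         # 0
```

Both results match the cost column of the log: every node in an X2Y4 tile has cost 1, and every node in an X3Y4 tile has cost 0.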

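  3. Where does the level come from?

The code above never sets the level directly (unlike the cost, which is set explicitly). The level is defined when a RouteNode is constructed. The RouteNode source code shows that, starting from src, the level increases by 1 with each expansion step away from it.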

	/**
	 * Constructor common for routing expansion
	 * @param wire Wire object to construct route node from
	 * @param parent The parent of the wire in the expanion search
	 */
	public RouteNode(Wire wire, RouteNode parent){
		setTile(wire.getTile());
		setWire(wire.getWireIndex());
		setParent(parent);
		setLevel(parent.getLevel() + 1);
	}
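  4. PriorityQueue

As the name suggests, a PriorityQueue is a queue to which elements can be added in any order but from which they are removed in a defined order. RapidWright documents getPriorityQueue as follows: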

Creates a new priority queue that sorts route nodes based on their lowest cost (see RouteNode.getCost()).

In other words, the PriorityQueue of RouteNodes in RapidWright is ordered by RouteNode cost.
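The same behavior can be seen with Python's standard `heapq` module. This is a stand-in for illustration only, not the queue RapidWright returns. The three entries reuse wire names and costs from the log above. Ties are broken here by comparing the names, which need not match RapidWright's tie-breaking.

```python
import heapq

queue = []
# Push (cost, name) tuples in arbitrary order.
for cost, name in [(1, "INT_L_X2Y4/EE2BEG0"),
                   (0, "INT_R_X3Y4/EE2A0"),
                   (1, "INT_L_X2Y4/WW4BEG0")]:
    heapq.heappush(queue, (cost, name))

# Pop everything: the lowest-cost entry always comes out first.
order = []
while queue:
    order.append(heapq.heappop(queue))

for cost, name in order:
    print(cost, name)
# 0 INT_R_X3Y4/EE2A0
# 1 INT_L_X2Y4/EE2BEG0
# 1 INT_L_X2Y4/WW4BEG0
```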

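  5. How the code above works

From the printed output we can infer how the code works. CLBLM_L_X2Y4/CLBLM_M_AQ is the starting node. The code looks up which RouteNodes connect to the start and adds them to the queue. It then takes the RouteNode with the lowest cost out of the queue (the one closest to snk in Manhattan distance), looks up which RouteNodes connect to that one, and adds those to the queue as well. In this way the RouteNodes reachable from the start are added step by step, until the snk RouteNode itself comes out of the queue, at which point routing is finished.

  6. Results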
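The flow described above can be sketched without RapidWright on a toy grid, where each grid cell plays the role of a RouteNode and the four neighboring cells play the role of its connections. Everything here is an assumption made for illustration: the `Node` class, `find_route`, the grid, and the blocked cells are all hypothetical. Like the tutorial code, it uses the Manhattan distance to the sink as the cost and recovers the path by following parent links.

```python
import heapq
from itertools import count


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def neighbors(xy, width, height, blocked):
    """Yield the free grid cells directly adjacent to xy."""
    x, y = xy
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in blocked:
            yield (nx, ny)


class Node:
    """Toy route node: a position, a parent link, and a level (hops from src)."""
    def __init__(self, xy, parent=None):
        self.xy = xy
        self.parent = parent
        self.level = 0 if parent is None else parent.level + 1

    def path_back_to_source(self):
        node, path = self, []
        while node is not None:
            path.append(node.xy)
            node = node.parent
        return path[::-1]


def find_route(src, snk, width, height, blocked=frozenset()):
    """Best-first search ordered by Manhattan distance to snk; None if unroutable."""
    tie = count()  # tie-breaker so Node objects are never compared
    queue = [(manhattan(src, snk), next(tie), Node(src))]
    visited = set()
    while queue:
        _, _, curr = heapq.heappop(queue)
        if curr.xy == snk:
            return curr.path_back_to_source()
        if curr.xy in visited:
            continue
        visited.add(curr.xy)
        for xy in neighbors(curr.xy, width, height, blocked):
            if xy not in visited:
                heapq.heappush(queue, (manhattan(xy, snk), next(tie), Node(xy, curr)))
    return None


route = find_route((0, 0), (3, 0), width=5, height=3, blocked={(1, 0), (1, 1)})
print(route)
```

The two blocked cells force the search to leave the straight line between source and sink, which is the toy equivalent of a wire that does not lead directly toward the sink.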

Visited Wire Count: 250
PIP Count: 13
Wrote: MyFirstRoute.dcp

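After routing, a path containing 13 PIPs was found (the figure is not reproduced here). Intuitively this path is not good enough: it circles around the switch box several times.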

4. Improving the cost function

def costFunction(curr, snk):
    return curr.getManhattanDistance(snk) + curr.getLevel()

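The previous cost function only required the next node to be as close to snk as possible. It did not consider how far that node already is from src. Since the level counts the expansion steps from src, adding it makes the search prefer nodes that are both close to snk and close to src (RouteNodes on a roundabout path obviously have a large level), so the candidates tend to stay on the straight line between src and snk. This is the same idea as A* search: the cost so far plus an estimate of the remaining cost. With the improved cost function, the result is as follows: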
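A tiny plain-Python example shows how the extra term changes which node is expanded first. The two candidates and their numbers are made up for illustration. `cost_v1` and `cost_v2` mirror the two versions of costFunction, taking the Manhattan distance and the level as plain integers.

```python
# (name, Manhattan distance to the sink in tiles, level = hops from the source)
candidates = [
    ("detour", 0, 9),  # already in the sink's tile, but reached after 9 hops
    ("direct", 1, 3),  # one tile away, reached after only 3 hops
]


def cost_v1(dist, level):
    """First version: distance to the sink only."""
    return dist


def cost_v2(dist, level):
    """Improved version: distance to the sink plus hops from the source."""
    return dist + level


def first_choice(cost):
    """Name of the candidate a lowest-cost-first queue would hand out first."""
    return min(candidates, key=lambda c: cost(c[1], c[2]))[0]


print(first_choice(cost_v1))  # detour
print(first_choice(cost_v2))  # direct
```

With the first cost function the long detour wins because it happens to sit in the sink's tile. With the level added, its nine hops count against it and the shorter partial path is expanded first.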

Visited Wire Count: 1065
PIP Count: 5
Wrote: MyFirstRoute.dcp
