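Sometimes a design needs a relatively complex computation, such as multiplication, division, or modulo. When the input width is not too large, a LUT (look-up table) can be used instead: the complex operation becomes a simple table read, which can improve both resource usage and speed considerably. In Xilinx FPGAs the basic LUT has six inputs, and it can also be configured as two five-input LUTs.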
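As a rough software model of this idea, the sketch below precomputes a table for a modulo operation and then answers every query with a single read. The 6-bit input width, the modulus 5, and all names (`MOD_TABLE`, `mod_lut`, `lut_init_masks`, `mod_from_masks`) are assumptions chosen for illustration only. The second half splits the table into one 64-bit truth table per output bit, which is roughly how the contents would map onto 6-input LUTs (one LUT per output bit).

```python
# Software model of replacing a "complex" operation with a table read.
# Assumptions (illustrative only): 6-bit input, operation is x % 5.

WIDTH = 6      # input bits; 2**6 = 64 table entries
MODULUS = 5    # hypothetical divisor

# Precompute the full table once. In hardware this would be the
# initialization contents of the LUTs (or of a small ROM).
MOD_TABLE = [x % MODULUS for x in range(1 << WIDTH)]


def mod_lut(x: int) -> int:
    """Return x % MODULUS using only a table read."""
    if not 0 <= x < (1 << WIDTH):
        raise ValueError("input out of range for the table")
    return MOD_TABLE[x]


def lut_init_masks(table, out_bits):
    """Split a table into one truth-table mask per output bit.

    Bit i of mask k holds output bit k for input value i, so each mask
    describes one single-output 6-input function.
    """
    masks = []
    for k in range(out_bits):
        mask = 0
        for i, value in enumerate(table):
            mask |= ((value >> k) & 1) << i
        masks.append(mask)
    return masks


OUT_BITS = (MODULUS - 1).bit_length()          # results 0..4 need 3 bits
INIT_MASKS = lut_init_masks(MOD_TABLE, OUT_BITS)


def mod_from_masks(x: int) -> int:
    """Rebuild the result bit by bit, as separate single-bit LUTs would."""
    return sum(((INIT_MASKS[k] >> x) & 1) << k for k in range(OUT_BITS))


if __name__ == "__main__":
    for k, mask in enumerate(INIT_MASKS):
        print(f"output bit {k}: INIT = 0x{mask:016X}")
```

With these assumed parameters the result needs three output bits, so the model corresponds to three 6-input functions evaluated in parallel. Wider inputs grow the table exponentially, which is why the approach only pays off for modest input widths.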
Below is a quick method for counting the number of bits set to 1 in an input word:
Quick FPGA Hacks: Population count
Presented at FPL2014, the paper “Pipelined Compressor Tree Optimization using Integer Linear Programming” by Martin Kumm and Peter Zipf (University of Kassel) uses clever Xilinx slice-wide compressors (four 6-LUTs + CARRY4) to reduce as many as 12 input bits to 5 sum bits, for an efficiency of up to (12-5) / 4 = 1.75 bits/LUT. This reminded me of a nice application of compressors in FPGAs: population count, a.k.a. bit count, i.e. determining the number of one bits set in a word, which uses simpler LUT-based (no carry chain) compressors to good effect. I first learned about this approach in some Altera documentation. Using modern FPGAs’ 6-LUTs it