The art of counting in fpga

wenchenggan

Category: fpga. Tags: art of counter

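Reposted from fpga4fun. I found it very helpful for rethinking how FPGAs work, I had never seen this kind of material on a Chinese website, and it is explained in a plain, accessible way. I should read more English sites in the future. Reposting it here as a note to self.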

（1）Binary counters

The simplest of counters

A fast and efficient binary counter can be built using a couple of Verilog lines. For example, here's a 32bit counter.

reg [31:0] cnt;

always @(posedge clk) cnt <= cnt+1;

Such a counter counts from 0 to 4294967295, and then rolls back to 0 to continue its course. It takes few resources and runs fast in an FPGA thanks to a hidden carry chain (more on that later). For now, let's see a few variations.

First it's a good idea to explicitly give a starting value, even if it's 0.

reg [31:0] cnt = 0;

always @(posedge clk) cnt <= cnt+1;

Note that if we don't specify a starting value, the counter starts as unknown (X) in simulation and stays that way, since X+1 is still X, and some synthesis tools might also change the starting value on their own... so it's really a good idea to always specify a starting value. We could also have used an asynchronous reset to specify a starting value, but the simplest way is shown above.
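For comparison, here is a minimal sketch of the asynchronous-reset alternative mentioned above. The module name counter_arst and the active-high rst input are assumptions made for illustration; they are not part of the original code.

```verilog
// Sketch only: module and port names are hypothetical.
module counter_arst(
  input clk,
  input rst,                 // assumed active-high asynchronous reset
  output reg [31:0] cnt
);

always @(posedge clk or posedge rst)
  if(rst) cnt <= 32'd0;      // the starting value is loaded whenever rst is high
  else    cnt <= cnt + 32'd1;

endmodule
```

Unlike the initializer, the reset can be applied again at any time, at the cost of one more signal to route.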

Now if you need more features, here's an example of a 10bit counter (counts up to 1023) that starts counting from 300, and has enable and direction controls.

reg [9:0] cnt = 10'd300; // 10bit counter, starts at 300

wire cnt_enable; // 0 to disable the counter, 1 to enable it

wire cnt_direction; // 0 to count backward, 1 to count forward

always @(posedge clk) if(cnt_enable) cnt <= cnt_direction ? cnt+1 : cnt-1;

Note that on FPGA families where the flip-flops always power up at 0, the FPGA synthesis tool has to play some tricks to make non-zero values work, but it's transparent (the synthesis tool puts some well-placed inverters in the logic).

Counter tick

Let's say we need a "tick" signal that is asserted once every 1024 clocks. Most likely we would create a 10bit counter and some logic to generate the "tick". Let's see how to do that.

First we make our 10bit counter. It counts from 0 to 1023 and then rolls back.

reg [9:0] cnt = 0;

always @(posedge clk) cnt <= cnt+1;

Now we could decide that our "tick" is asserted when the counter reaches its maximum value.

wire tick = (cnt==1023);

An alternate way to write that is

wire tick = &cnt; // assert "tick" when all the cnt bits are 1

The drawback of these tick signals is that they create a big chunk of logic (a 10bit AND gate here). Not that big a deal for only 10 bits, but if our counter is 32bit or bigger, that would be a waste. The alternative is to rely on the (usually hidden) carry chain that the FPGA is using behind the scenes. We just need a bit of arm twisting to convince the FPGA to provide the info it is hiding...

reg [31:0] cnt = 0; // 32bit counter

wire [32:0] cnt_next = cnt+1; // next value with 33bit (one bit more than the counter)

always @(posedge clk) cnt <= cnt_next[31:0];

wire tick = cnt_next[32]; // gets the last bit of the carry chain (asserted when the counter reaches its maximum value)

Try it, you'll see it works the same but takes less space in the FPGA (note: at the time of this writing, we tried both ISE and Quartus-II and both do a good job with 0 as the start value. But Quartus-II gets confused for non-zero starting values and makes a big blob of logic).
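To see that the two kinds of tick agree, here is a small simulation-only sketch. It uses a 4bit counter instead of 32bit so that every value can be checked; the module name tick_check and the signal names tick_carry and tick_and are made up for this example.

```verilog
// Simulation-only check; all names here are hypothetical.
module tick_check;

reg  [3:0] cnt;
wire [4:0] cnt_next   = cnt + 1;     // one bit more than the counter
wire       tick_carry = cnt_next[4]; // carry-out version of the tick
wire       tick_and   = &cnt;        // wide-AND version of the tick

integer i, errors;

initial begin
  errors = 0;
  for(i = 0; i < 16; i = i + 1) begin
    cnt = i[3:0];
    #1;                              // let the wires settle
    if(tick_carry !== tick_and) errors = errors + 1;
  end
  $display("mismatches: %0d", errors);
  $finish;
end

endmodule
```

Both signals are high only when cnt is 15, so the simulation should report zero mismatches. The check says nothing about area; that part still has to be read from the synthesis report.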

（2）Special counters

Modulus counters

A modulus counter is a binary counter that rolls back before its natural end value. For example, let's say you want a modulus 10 counter (counts up to 9), you can write this.

reg [3:0] cnt = 0; // we need 4 bits to be able to reach 9

always @(posedge clk)

if(cnt==9) cnt <= 0;

else cnt <= cnt+1;

or this (a little more compact)

reg [3:0] cnt = 0;

always @(posedge clk) cnt <= (cnt==9) ? 0 : cnt+1;

Now a little (free) optimization is available if you realize that we don't actually need to compare all 4 bits of the counter to 9. The counter never goes past 9, and 9 is the only value from 0 to 9 with both bit 0 and bit 3 set, so the code below uses only bit 0 and bit 3 in the comparison.

always @(posedge clk) cnt <= ((cnt & 9)==9) ? 0 : cnt+1;
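The reduced comparison can be checked with a few lines of simulation code. This is only a sketch, and the module name mod10_mask_check is hypothetical. It lists which 4bit values satisfy the masked comparison and confirms that, within 0 to 9, it behaves exactly like the full comparison.

```verilog
// Simulation-only check; the module name is hypothetical.
module mod10_mask_check;

reg [3:0] v;
integer i, errors;

initial begin
  errors = 0;
  for(i = 0; i < 16; i = i + 1) begin
    v = i[3:0];
    // values other than 9 that also have bit 0 and bit 3 set
    if(((v & 4'd9) == 4'd9) && (v != 4'd9))
      $display("mask also matches %0d", v);
    // inside the counting range, both comparisons must agree
    if((i <= 9) && (((v & 4'd9) == 4'd9) !== (v == 4'd9)))
      errors = errors + 1;
  end
  $display("mismatches for 0..9: %0d", errors);
end

endmodule
```

The mask also matches 11, 13 and 15, but a counter that starts at 0 and rolls back at 9 never reaches those values, which is why the shortcut is safe here. It would not be safe if the counter could start above 9.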

Gray counters

A Gray counter is a binary counter where only one bit changes at a time. Yes, that's possible, see below the output of a 4bit Gray counter.

0000

0001

0011

0010

0110

0111

0101

0100

1100

1101

1111

1110

1010

1011

1001

1000

Gray code is mostly used to send multi-bit values across clock domains. Because only one bit changes per step, a value sampled in the middle of a transition is off by at most 1.
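As a rough sketch of that use, the receiving clock domain could pass the Gray value through two flip-flop stages and then convert it back to binary. The module name gray_sync, its ports and the parameter W are all assumptions for this example, and it also assumes that gray_in comes straight from flip-flops in the sending domain.

```verilog
// Sketch only: names are hypothetical, gray_in must be driven directly by flip-flops.
module gray_sync #(parameter W = 4) (
  input          dst_clk,   // clock of the receiving domain
  input  [W-1:0] gray_in,   // Gray-coded value from the other clock domain
  output [W-1:0] bin_out    // binary value, valid in the dst_clk domain
);

reg [W-1:0] sync1 = 0;
reg [W-1:0] sync2 = 0;

// two-stage synchronizer
always @(posedge dst_clk) begin
  sync1 <= gray_in;
  sync2 <= sync1;
end

// Gray to binary: binary bit i is the XOR of Gray bits i and above
genvar i;
generate
  for(i = 0; i < W; i = i + 1) begin : g2b
    assign bin_out[i] = ^(sync2 >> i);
  end
endgenerate

endmodule
```

If a plain binary value were sent instead, several bits could change at once and the receiver might capture a mix of old and new bits.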

The easiest way to create a Gray counter is to first make a binary counter, and then convert the value to Gray.

module GrayCounter(
  input clk,
  output [3:0] cnt_gray
);

reg [3:0] cnt = 0;

always @(posedge clk) cnt <= cnt+1; // 4bit binary counter

assign cnt_gray = cnt ^ cnt[3:1]; // then convert to gray

endmodule
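A quick way to convince yourself that the conversion cnt ^ cnt[3:1] really changes one bit at a time is to step through all 16 values in simulation. The sketch below does that; the module name gray_step_check and its internal names are hypothetical.

```verilog
// Simulation-only check; all names here are hypothetical.
module gray_step_check;

reg  [3:0] cnt = 0;
wire [3:0] gray = cnt ^ cnt[3:1];   // same binary-to-Gray conversion as above
reg  [3:0] prev, diff;
integer i, errors;

initial begin
  errors = 0;
  #1 prev = gray;
  for(i = 0; i < 16; i = i + 1) begin
    cnt = cnt + 1;                   // includes the wrap from 15 back to 0
    #1;
    diff = prev ^ gray;              // bits that changed in this step
    // exactly one bit set means: non-zero, and clearing the lowest set bit gives 0
    if((diff == 0) || ((diff & (diff - 1)) != 0)) errors = errors + 1;
    prev = gray;
  end
  $display("steps that did not change exactly one bit: %0d", errors);
  $finish;
end

endmodule
```

The count should be 0, including for the step where the binary counter wraps from 15 to 0 and the Gray value goes from 1000 to 0000.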

It is also possible to create a native Gray counter.

module GrayCounter(
  input clk,
  output reg [3:0] cnt_gray = 0
);

wire [3:0] cnt_cc = {cnt_cc[2:1] & ~cnt_gray[1:0], ^cnt_gray, 1'b1}; // carry-chain type logic

always @(posedge clk) cnt_gray <= cnt_gray ^ cnt_cc ^ cnt_cc[3:1];

endmodule

（3）LFSR counters

Let's say you want a counter that counts more or less "randomly", you can use an LFSR. Here's an example.

As you can see, an LFSR is a shift-register with some XOR gates. The one shown above is an 8-tap LFSR (it uses 8 flip-flops).

The output sequence is as follows (assuming all the flip-flops start at 1):

00000001

00110010

01100111

01010100

11001101

11111110

10101011

10011000

Here's the LFSR source code.

module LFSR8_11D(
  input clk,
  output reg [7:0] LFSR = 255 // put here the initial value
);

wire feedback = LFSR[7];

always @(posedge clk)begin

LFSR[0] <= feedback;

LFSR[1] <= LFSR[0];

LFSR[2] <= LFSR[1] ^ feedback;

LFSR[3] <= LFSR[2] ^ feedback;

LFSR[4] <= LFSR[3] ^ feedback;

LFSR[5] <= LFSR[4];

LFSR[6] <= LFSR[5];

LFSR[7] <= LFSR[6];

end

endmodule

Notice that we made it start at 255. That's because 0 is a dead-end state. So we can choose any start value but 0. After that, the LFSR changes value at each clock cycle but never reaches 0, so it goes through only 255 values (out of 256 possible combinations of the 8bit output bus) before starting again.

Now, instead of taking the 8bit output, we could have used only one of the flip-flops to feed a 1bit output. This way we get a string of 255 '0' and '1' that looks random (although it repeats itself after 255 clocks). Useful for a noise generator...
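A 1bit version might look like the sketch below. The module name noise1bit and the output name noise are made up for this example, and the choice of bit 7 as the output is arbitrary. The shift-and-XOR line is just a compact way of writing the same eight assignments as in LFSR8_11D, where 8'h1D marks bits 0, 2, 3 and 4 as the ones that receive the feedback.

```verilog
// Sketch only: module and port names are hypothetical.
module noise1bit(
  input clk,
  output noise
);

reg [7:0] LFSR = 255;          // any start value but 0
wire feedback = LFSR[7];

// shift left by one, then XOR the feedback into bits 0, 2, 3 and 4
always @(posedge clk)
  LFSR <= {LFSR[6:0], 1'b0} ^ (feedback ? 8'h1D : 8'h00);

assign noise = LFSR[7];        // one flip-flop feeds the 1bit output

endmodule
```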

Customization

You can tweak the LFSR:

· Select a different number of taps (we chose 8 above).

· Change the way the feedback network is wired - like change the number of XOR gates, where they are placed, or replace XOR gates with XNOR.

Certain feedback configurations will create islands of possible values. For example, this LFSR looks similar to the first one but loops through only 30 values.

module LFSR8_105( input clk, output reg [7:0] LFSR = 255);

wire feedback = LFSR[7];

always @(posedge clk)begin

LFSR[0] <= feedback;

LFSR[1] <= LFSR[0];

LFSR[2] <= LFSR[1] ^ feedback;

LFSR[3] <= LFSR[2];

LFSR[4] <= LFSR[3];

LFSR[5] <= LFSR[4];

LFSR[6] <= LFSR[5];

LFSR[7] <= LFSR[6];

end

endmodule

It is also possible to add a bit of logic in the feedback so that the LFSR reaches all possible states.

module LFSR8_11D( input clk, output reg [7:0] LFSR = 255);

wire feedback = LFSR[7] ^ (LFSR[6:0]==7'b0000000); // modified feedback lets the LFSR reach 256 states instead of 255

always @(posedge clk)begin

LFSR[0] <= feedback;

LFSR[1] <= LFSR[0];

LFSR[2] <= LFSR[1] ^ feedback;

LFSR[3] <= LFSR[2] ^ feedback;

LFSR[4] <= LFSR[3] ^ feedback;

LFSR[5] <= LFSR[4];

LFSR[6] <= LFSR[5];

LFSR[7] <= LFSR[6];

end

endmodule
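The loop lengths of the three LFSR variants above can be measured in simulation. The sketch below models one LFSR step as a function and counts the steps needed to get back to the start value of 255. The names lfsr_period_check, lfsr_next, measure, taps and all_states are invented for this example; taps is simply a bit mask of the flip-flops that receive the feedback XOR.

```verilog
// Simulation-only sketch; all names here are hypothetical.
module lfsr_period_check;

// One LFSR step. taps marks the bits that receive the feedback XOR;
// all_states = 1 selects the modified feedback that also visits 0.
function [7:0] lfsr_next(input [7:0] s, input [7:0] taps, input all_states);
  reg feedback;
  begin
    feedback  = s[7] ^ (all_states & (s[6:0] == 7'b0000000));
    lfsr_next = {s[6:0], 1'b0} ^ (feedback ? taps : 8'h00);
  end
endfunction

// Count the steps until the LFSR is back at its start value of 255.
task measure(input [7:0] taps, input all_states);
  reg [7:0] s;
  integer n;
  begin
    s = lfsr_next(8'd255, taps, all_states);
    n = 1;
    while(s != 8'd255) begin
      s = lfsr_next(s, taps, all_states);
      n = n + 1;
    end
    $display("taps=%h all_states=%b period=%0d", taps, all_states, n);
  end
endtask

initial begin
  measure(8'h1D, 1'b0);  // wiring of LFSR8_11D
  measure(8'h05, 1'b0);  // wiring of LFSR8_105
  measure(8'h1D, 1'b1);  // LFSR8_11D with the modified feedback
end

endmodule
```

The three runs should report periods of 255, 30 and 256, matching the numbers given in the text.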

LFSR testbench

We made a small Windows utility that allows experimenting with LFSR designs. It can be downloaded from the original fpga4fun page.

（4）The carry chain

The carry chain is the feature allowing FPGAs to be efficient at arithmetic operations (counters, adders...). Let's learn more through counters. Counters are easily built using...

T flip-flops

A T flip-flop is very simple. At the rising edge of the clock, it toggles its Q output only if the T input is high, otherwise Q doesn't change.

FPGAs use D flip-flops internally, but D and T flip-flops are easily interchangeable with a bit of logic around them. So we are using T flip-flops on this page, knowing that FPGA software can easily map them in the FPGA.
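The "bit of logic" is a single XOR gate in front of the D input. A minimal sketch, with a hypothetical module name t_ff:

```verilog
// T flip-flop built from a D flip-flop; the module name is hypothetical.
module t_ff(
  input clk,
  input t,
  output reg q = 0
);

// D = Q xor T: Q toggles when T is 1 and keeps its value when T is 0
always @(posedge clk) q <= q ^ t;

endmodule
```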

The ripple counter

The smallest binary counter is a ripple counter. Here's a 4bit ripple counter.

Basically each T flip-flop has its input set at 1 and its output drives the clock of the next flip-flop. It's very efficient in terms of hardware, but it's not great for FPGAs as we now have as many clock domains as there are bits in the counter. FPGAs are designed for synchronous circuits, so we need something where all the counter bits toggle at the same time.
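For reference, a ripple counter can be described in Verilog roughly as follows. This is a sketch for simulation and illustration only, with a hypothetical module name ripple4. Clocking each stage on the falling edge of the previous bit is an assumption chosen so that the counter counts up; the original schematic may wire it differently.

```verilog
// Sketch only, not recommended for FPGA designs; names are hypothetical.
module ripple4(
  input clk,
  output [3:0] q
);

reg [3:0] r = 0;

// each stage always toggles (T = 1) and is clocked by the previous stage
always @(posedge clk)  r[0] <= ~r[0];
always @(negedge r[0]) r[1] <= ~r[1];
always @(negedge r[1]) r[2] <= ~r[2];
always @(negedge r[2]) r[3] <= ~r[3];

assign q = r;

endmodule
```

Every bit is clocked by a different signal here, which is exactly the property that makes this structure a poor fit for an FPGA.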

The synchronous counter

In a synchronous counter, the clock feeds all the flip-flops simultaneously, so there is only one clock domain.

Now, if we look at the way a binary counter counts, we see that the LSB always toggles and that for any higher bit to toggle, all the bits of lower order need to be 1.

0000

0001

0010

0011

0100

0101

0110

0111

1000

1001...

So our synchronous counter takes shape by using a few AND gates.

It's good as long as the counter is small. Our example 4bit counter only needs two AND gates (plus the flip-flops obviously) so it's pretty efficient. But that doesn't scale well. For a 32bit counter, we would need 30 AND gates, the last one having 31 inputs...
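Written out for 4 bits, the structure described above could look like this sketch (the module name sync_counter4 and the wire name t are hypothetical). Each bit is a T flip-flop whose T input is the AND of all lower bits.

```verilog
// Sketch only: names are hypothetical.
module sync_counter4(
  input clk,
  output reg [3:0] q = 0
);

wire [3:0] t;
assign t[0] = 1'b1;               // the LSB always toggles
assign t[1] = q[0];
assign t[2] = q[0] & q[1];        // 2-input AND
assign t[3] = q[0] & q[1] & q[2]; // 3-input AND, the gates keep growing with the bit position

always @(posedge clk) q <= q ^ t; // four T flip-flops on the same clock

endmodule
```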

However we can easily redraw our counter this way (we made a 6bit counter this time).

Basically instead of having AND gates grow in size, we keep them small and chain them.
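In Verilog, the chained version can be sketched with a generate loop. The module name chain_counter, the parameter N and the wire name t are assumptions for this example. Each stage only ANDs the previous stage's result with one more counter bit.

```verilog
// Sketch only: names are hypothetical.
module chain_counter #(parameter N = 6) (
  input clk,
  output reg [N-1:0] q = 0
);

wire [N-1:0] t;
assign t[0] = 1'b1;                     // the LSB always toggles

genvar i;
generate
  for(i = 1; i < N; i = i + 1) begin : chain
    assign t[i] = t[i-1] & q[i-1];      // small 2-input AND, chained from bit to bit
  end
endgenerate

always @(posedge clk) q <= q ^ t;       // N T flip-flops on the same clock

endmodule
```

The first AND has a constant 1 on one input and disappears, which leaves N-2 real gates in the chain. In practice there is no need to write a counter this way, since a plain cnt <= cnt+1 lets the synthesis tool build the same kind of chain by itself.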

That's the way FPGAs implement counters! It is efficient in terms of hardware but the problem is speed... For example, a 32bit counter would need 30 chained AND gates. And this chain is the main part of the counter "critical path" (which sets the maximum counter clock speed). So it is important to keep this path fast... and FPGAs have one nice trick to keep it fast. It is called...

The carry chain

FPGAs are made of "logic elements", each containing one LUT and one D flip-flop. Each logic element can implement one counter bit (a 32bit counter needs 32 logic elements).

Logic elements can communicate with their surroundings through general-purpose routing structures, but that's slow. So FPGA designers made sure that logic elements placed side by side have an extra local routing signal (in red below).

This local routing is ideal for a carry chain. So every time you ask the FPGA software to implement a binary counter, it places the bits next to each other so that it can use the local routing as a carry chain. That adds a bit of constraint on the mapping, but the software takes care of it.

FPGA manufacturers also make sure that logic elements are heavily optimized for speed along the carry chain path. The result is counters that run easily at hundreds of MHz... the speed of counters is usually not an issue (the critical path of an FPGA design is much more likely to go through regular logic than carry chains). Of course, it depends on how fast you want to run your design. Big counters feature long carry chains, and so cannot be clocked as fast as small counters. If that's an issue, you can either break down the carry chains (i.e. use a series of small counters) or choose a counter architecture that doesn't use carry chains.
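One possible way to break a carry chain is sketched below: a 32bit counter made of two 16bit halves, with a registered carry between them. The module name counter32_split and the signal names lo, hi and carry are invented for this example, and whether the split actually improves timing depends on the device and the tools.

```verilog
// Sketch only: names are hypothetical, timing benefit is not guaranteed.
module counter32_split(
  input clk,
  output [31:0] cnt
);

reg [15:0] lo = 0;
reg [15:0] hi = 0;
reg carry = 0;                   // registered carry between the two halves

always @(posedge clk) begin
  lo    <= lo + 16'd1;
  carry <= (lo == 16'hFFFE);     // so carry is high during the cycle where lo is 16'hFFFF
  if(carry) hi <= hi + 16'd1;    // hi increments on the same edge where lo wraps to 0
end

assign cnt = {hi, lo};

endmodule
```

The carry is decoded one clock early so that the upper half still increments on exactly the right edge, and cnt behaves like a normal 32bit counter that starts at 0.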

For the adventurous, the original page links to an ISE FPGA Editor screenshot of a slice (two logic elements) from a Spartan-3A FPGA design implementing a counter. The view is for bits 6 and 7 of the counter. We can immediately recognize the carry chain crossing the slice in the middle from bottom to top. What is less apparent is where the AND gates and the T flip-flops are. They are actually all there... the AND gates are made using the big muxes on the carry chain line and the T flip-flops are made using XOR gates and the D flip-flop outputs that loop back to the LUT inputs (through routing outside the logic elements). The LUTs are just pass-through.

Carry chains are also used for adders and comparators. But thanks to the hundreds of engineers working to build smart HDL tools, we can use the power of carry chains without having to worry about them. Life is good.
