The study will cover:
- Comparison of one-process, two-process, and three-process FSM structures, including usage scenarios, trade-offs, and code examples.
- Analysis of binary, Gray, and one-hot state encoding schemes with pros and cons.
- Common FSM-related interview questions and guidance for answering them effectively.
- Practical FSM design tips, synthesis optimization strategies, and both formal and functional verification techniques.
- Methods for drawing a state transition diagram and translating it into synthesizable RTL code.
- A comprehensive case study of the PCI Express LTSSM (Link Training and Status State Machine), including its state breakdown, transition logic, timing considerations, and implementation practices.
Finite State Machine (FSM) Design for ASICs in Verilog/SystemVerilog
Finite State Machines (FSMs) are fundamental for modeling control logic in digital circuits. This report provides a comprehensive study of FSM design targeting ASIC implementation using Verilog/SystemVerilog. We cover different FSM coding styles, state encoding schemes, common interview questions, practical design tips, verification strategies, and a case study of the PCI Express Link Training and Status State Machine (LTSSM). The content is structured for junior to senior IC design engineers, with clear explanations and references to reputable sources.
FSM Coding Styles: One-Process vs Two-Process vs Three-Process
In hardware description languages, an FSM can be coded using one, two, or three procedural blocks (processes/always blocks). All styles achieve the same logical behavior: each contains state memory (flip-flops that hold the current state), next-state combinational logic, and output logic (a function of state only in a Moore machine). The difference lies in how these pieces are organized in code, which affects readability and certain design aspects.
- One-Process FSM: All state transitions and outputs are coded in a single clocked process (one always block). The current state register is updated and the outputs are assigned within one always @(posedge clk) block. One-process FSMs typically use nonblocking assignments for state and output registers. An advantage is simplicity: the code is less verbose, and there is no risk of forgetting a default in a separate combinational block. It also simulates slightly faster (one process instead of several). However, because outputs are registered in the same process, an output decoded from the current state updates one clock cycle after the state transition. For example, in a one-process Moore FSM where the output is assigned from the value of state, the output still reflects the previous state during the first cycle after a state change. This is functionally correct but can be confusing when debugging waveforms (the output lags the state by one cycle). The one-process style is common, especially among engineers who prefer a sequential coding style. It can implement both Moore and Mealy machines (by updating outputs either unconditionally by state or conditionally on inputs).
- Two-Process FSM: The FSM is divided into two processes (always blocks): one purely sequential for the state register update, and one purely combinational for the next-state logic (and possibly the outputs). A typical SystemVerilog two-process FSM uses an always_ff @(posedge clk) block for updating state <= next_state, and an always_comb block (or always @* in Verilog-2001) to compute next_state from the current state and inputs. In Moore machines, the output can be produced in the sequential block (registered outputs) or in the combinational block as a function of state only (not itself registered, but it changes only after a clock edge because the state register is its only input). In Mealy machines, combinational outputs that depend on inputs can be handled in the combinational block. The two-process style is often recommended by coding guidelines (e.g., the Reuse Methodology Manual) for clarity. It closely mirrors the conceptual FSM diagram: one block for the state register, one for the logic. Engineers find it readable because the next-state logic is clearly separate. One important use case for the two-process style is when you need immediate combinational signals from the FSM (such as handshake signals, or next-state signals distributed to other logic). Since next_state is a combinational output of the combinational process, other logic can react to a state transition without an extra clock cycle of latency. By contrast, a one-process FSM would require waiting a cycle (until the state register updates) or using additional variables to emulate combinational outputs. The trade-off is that the designer must code the combinational block carefully: include a default assignment or cover all branches to avoid inferred latches, and ensure the sensitivity list is complete (or use always_comb in SystemVerilog). Overall, two-process FSMs are very popular in ASIC design for their balance of clarity and reliability.
- Three-Process FSM: This style uses three separate processes: (1) a sequential process for state register updates, (2) a combinational process for next-state logic, and (3) a combinational process for output logic. Essentially, it splits the Moore output logic out of the next-state combinational block. For example, one might have an always_ff for state, an always_comb that computes next_state (based on the current state and inputs), and another always_comb that computes the outputs (based solely on the current state for a Moore FSM). This approach keeps each process narrowly focused, which some consider a "divide and conquer" method. Isolating the output logic prevents accidental interference with the next-state logic, and it avoids the one-clock-cycle lag between state and output seen in the one-process style, since the outputs are recomputed combinationally whenever the state changes. Three-process FSMs are logically equivalent to certain two-process implementations; in fact, a three-process Moore FSM is essentially the same as a two-process FSM in which the output is handled combinationally in a separate block. This style is less common in modern code but is seen in some legacy code or when designers want to emphasize separation of concerns. It can be slightly more verbose. Notably, Xilinx's documentation historically referred to a "two-process" style that is actually this separation (state and next-state in one process, outputs in a second) and a "three-process" style as described above. In practice, whether to use two or three processes often comes down to personal or team preference. There is no single "best" style: all can be made to infer the same hardware. The key is to write clear, synthesizable code and to understand the implications (registered versus combinational outputs, the one-clock output delay of registered Moore outputs, and so on).
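The two-process structure described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the module name fsm_two_process, the states IDLE/RUN/DONE, the ports start, finish, and busy, and the active-low asynchronous reset are all assumptions made for the example.

```systemverilog
// Hypothetical 3-state controller in the two-process style.
// All names and the reset convention are illustrative assumptions.
module fsm_two_process (
  input  logic clk,
  input  logic rst_n,   // asynchronous, active-low reset (assumed)
  input  logic start,   // request to begin work
  input  logic finish,  // work has completed
  output logic busy     // Moore output: function of state only
);

  typedef enum logic [1:0] {IDLE, RUN, DONE} state_t;
  state_t state, next_state;

  // Process 1: state register (purely sequential).
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) state <= IDLE;
    else        state <= next_state;
  end

  // Process 2: next-state and output logic (purely combinational).
  always_comb begin
    next_state = state;  // default: hold state, so no latch is inferred
    busy       = 1'b0;   // default output value
    case (state)
      IDLE: if (start) next_state = RUN;
      RUN: begin
        busy = 1'b1;
        if (finish) next_state = DONE;
      end
      DONE:    next_state = IDLE;
      default: next_state = IDLE;  // recover from the unused code 2'b11
    endcase
  end

endmodule
```

Moving the busy assignment into its own always_comb block, keyed only on state, would turn this into the three-process form without changing the inferred hardware.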
Typical Use Cases and Trade-offs: One-process FSMs are concise and less error-prone in small designs, and they guarantee that all outputs are registered (glitch-free). Two-process FSMs are a good default for most designs: they make the FSM behavior explicit and allow combinational signaling of the next state (useful in ASIC control logic that feeds other combinational blocks). Three-process FSMs may be used in designs where the outputs are complex and one prefers to isolate that logic for readability, or when directly adapting VHDL code that used the pattern. For ASIC implementation, all styles are acceptable as long as they are coded correctly. It is important to ensure that your chosen style cleanly resets the FSM and does not infer unintended latches. Synthesis tools generally have no issue with any style; there is little to no difference in the final hardware between one-, two-, and three-process FSMs given the same behavior. Thus, the decision often hinges on coding guidelines or designer experience. As one reference notes, older and more experienced engineers sometimes favor multi-process FSMs, whereas those with a software background often use the one-process style because it feels more sequential. Whichever style is used, it is critical to apply it consistently and to document the FSM for reviewers.
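To make the registered-output property of the one-process style concrete, here is a hedged sketch of a small controller written as a single clocked block. The module name fsm_one_process, the states, the ports, and the active-low asynchronous reset are hypothetical choices for illustration only.

```systemverilog
// Hypothetical 3-state controller in the one-process style.
// All names and the reset convention are illustrative assumptions.
module fsm_one_process (
  input  logic clk,
  input  logic rst_n,   // asynchronous, active-low reset (assumed)
  input  logic start,
  input  logic finish,
  output logic busy     // registered output
);

  typedef enum logic [1:0] {IDLE, RUN, DONE} state_t;
  state_t state;

  // Single clocked process: state and output registers update together.
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state <= IDLE;
      busy  <= 1'b0;
    end else begin
      case (state)
        IDLE: if (start) begin
          state <= RUN;
          busy  <= 1'b1;  // assigned with the transition, so busy rises with RUN
        end
        RUN: if (finish) begin
          state <= DONE;
          busy  <= 1'b0;  // cleared on the same edge that leaves RUN
        end
        DONE: state <= IDLE;
        default: begin    // recover from the unused code 2'b11
          state <= IDLE;
          busy  <= 1'b0;
        end
      endcase
    end
  end

endmodule
```

Because busy is assigned on the same clock edge as the transition, it is high exactly while state is RUN. Writing busy <= (state == RUN) instead would register the old state's decode and produce the one-cycle lag described for one-process Moore outputs.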
FSM State Encoding Schemes: Binary, Gray, and One-Hot
State encoding refers to how the abstract states are represented in binary bits in the actual flip-flops. Common encoding schemes are binary (dense), one-hot, and Gray code. The choice of encoding can impact speed, area, power, and even reliability of the FSM in certain contexts.
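As a concrete point of reference, the three schemes can be written as explicit enumerations for a hypothetical four-state FSM. The package name, type names, and state names below are assumptions for illustration; in practice many synthesis flows choose or override the encoding automatically.

```systemverilog
// Hypothetical 4-state FSM encoded three ways (names are illustrative).
package fsm_encoding_examples;

  // Binary: 2 flip-flops, states numbered sequentially.
  typedef enum logic [1:0] {
    B_IDLE = 2'b00,
    B_LOAD = 2'b01,
    B_EXEC = 2'b10,
    B_DONE = 2'b11
  } state_binary_t;

  // Gray: 2 flip-flops; each step of the cycle
  // IDLE -> LOAD -> EXEC -> DONE -> IDLE changes exactly one bit.
  typedef enum logic [1:0] {
    G_IDLE = 2'b00,
    G_LOAD = 2'b01,
    G_EXEC = 2'b11,
    G_DONE = 2'b10
  } state_gray_t;

  // One-hot: 4 flip-flops, exactly one bit set per state.
  typedef enum logic [3:0] {
    H_IDLE = 4'b0001,
    H_LOAD = 4'b0010,
    H_EXEC = 4'b0100,
    H_DONE = 4'b1000
  } state_onehot_t;

endpackage
```

Note that the Gray property holds only along the listed sequence; a transition that skips a state (for example IDLE directly to EXEC) would still flip two bits.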
- Binary Encoding: In binary (or compact) encoding, each state is assigned a unique binary number. The number of flip-flops is the minimum needed to represent all states (e.g., 4 states require 2 FFs, 5-8 states require 3 FFs, etc.). This scheme minimizes the state register width. Fewer flip-flops means less area and potentially lower power in the registers. Combinational logic is needed to decode the state (since multiple bits together define the state), but for a modest number of states this logic is small. Binary encoding is the default used by many synthesis tools if not directed otherwise. It is often suitable for ASIC designs where the flip-flop count should be kept low and where the logic can be optimized by the synthesis tool. A downside is that a transition from one arbitrary binary value to another may switch multiple bits, which can create transient glitches in the decoding logic. However, since state changes occur on clock edges and the decoded signals settle before the next edge samples them, these glitches are usually not a functional problem in synchronous FSMs. Binary encoding is a good general choice for ASIC FSMs, especially if the state count is large, because it keeps the design small. (For FPGAs, binary is also fine for small FSMs or when FF resources are limited, as in CPLDs.)
- One-Hot Encoding: In one-hot encoding, each state is represented by a flip-flop that is 1 when the FSM is in that state and 0 otherwise. An FSM with N states therefore uses N flip-flops, and only one FF is high at a time (hence one-hot). The advantage is that next-state decoding becomes very simple: often just combinational logic gating the individual state bits to produce the next-state signals. There is no complex binary decoding; you "pay" in flip-flops instead of logic gates. This can make the FSM faster, since each flip-flop can feed directly into the conditions for transitioning to the next state without going through a multi-bit decode. One-hot encoding typically results in more registers and less combinational logic, which is why it is favored in FPGA designs, where registers are abundant and LUTs might be the limiting factor. In ASICs, flip-flops are also plentiful, but they do consume area and power, so one-hot may not be area-optimal for very large state machines. One-hot FSMs also have the property that exactly two bits change on each transition (one state FF is cleared and another is set), regardless of how many states there are. This bounded, localized switching can sometimes help timing and may reduce dynamic power relative to a binary transition that flips many bits, though there are more total bits; the power outcome depends on how often the FSM is active and switching. Another benefit is design clarity when using enumerated state encodings in code: each state maps to a bit position, which is convenient for understanding and debug (e.g., in simulation you see a one-hot state vector with a single '1'). Because decoding is simple, one-hot FSMs can achieve very high clock speeds and are sometimes explicitly chosen for speed-critical control logic. The main drawback is the increased number of flip-flops: an N-state one-hot FSM uses N FFs instead of ceil(log2(N)). For example, a 16-state FSM would use 16 FFs in one-hot versus 4 FFs in binary. In an ASIC with many FSMs or very large state machines, this could impact area. Additionally, the clock tree must drive more flip-flops, which increases clock load and clock power. That said, the trade-off often still favors one-hot for medium-sized FSMs where speed is important. Many FPGA-oriented design guides suggest one-hot for FPGAs, but for ASICs designers often let the synthesis tool decide, or use binary unless timing analysis shows a need for one-hot in a particular FSM.
- Gray Encoding: In Gray code encoding, state codes are assigned such that any transition between consecutive states (in some defined sequence) differs by only one bit. Gray coding is not as commonly used for general FSM design as binary or one-hot, but it has niche advantages. The primary benefit is that by ensuring only one bit changes at a time, the FSM avoids hazards where multiple bits changing together could be observed as a transient incorrect state by another clock domain or by asynchronous logic. Gray encoding is particularly useful if the FSM's state outputs feed an asynchronous interface or combinational logic that is sensitive to glitches. With only one bit changing, you avoid the situation where, say, going from 011 to 100 momentarily passes through intermediate values (010, 000, etc.), which could confuse external logic if it were not synchronized. Gray codes are thus often used when an FSM's state register is sampled by another clock domain or an external device. The receiving side still needs a synchronizer; Gray coding only ensures that a sample taken mid-transition resolves to either the old code or the new one. Another use is to minimize Hamming distance for power or EMI reasons: only one bit switching at a time can reduce simultaneous switching noise. However, Gray encoding has similar cost to binary in terms of flip-flops (it uses ≈ log2(N) bits). It does not intrinsically speed up decoding logic (the decoding might actually be trickier than binary, since the codes are not sequential binary numbers). It mainly shines for reliability in asynchronous interactions. For example, some FIFO pointer implementations use Gray code for the read/write pointers that cross clock domains, so that a pointer sampled mid-transition is off by at most one count rather than an arbitrary value. In summary,