The plan here is to simulate your design with synchronous counters which show an easy to read incrementing integer, then swap out the synchronous counter for a faster equivalent Linear Feedback Shift Register (LFSR). This is perhaps a moot point with modern FPGAs being so capable of arithmetic with DSP slices that carry chain propagation delays do not get noticed, but remains relevant in the ASIC world.
When I first started working on VLSI, I discovered that my office had previously employed a summer student to write a C program to spit out VHDL that created LFSR counters to specification, that's how important they were to the local ASIC designs. No thought had been given to the realisation that the solution could be entirely realised in VHDL alone. Absolutely no need for software using the ncurses library. No need to return to the software to request a new LFSR just because you mis-specified the counter's maximal value by one. Instead you can just change a generic value.
Drawn in https://www.circuit-diagram.org/.
This is the basic architecture, very similar to polynomial division modulo 2 as it uses the Galois LFSR "one-to-many" format. The logic in the bottom half of the diagram is a verbose way of performing an equality test (I only had 2-input AND gates). There are authoritative articles across the Internet explaining the mathematical theory behind LFSRs, for example how they don't immediately provide a full range 0..(2n-1) counter for n-bits of state unless additional action is taken to enter and leave the missing state (all '0's). So the theory will be omitted here, the Wikipedia link is a good start for references or skip to Tutorial: Linear Feedback Shift Registers (LFSRs) by Max Maxfield. Here I'll just provide the code that allows a synchronous counter to be safely swapped for an LFSR.
The first aim is to provide two identical (or similar enough) entities that the swap between the two architectures is easy. I find using multiple architectures for a single entity cumbersome, but you are right to point out this is an excellent opportunity here and "just use a configuration" to swap them over. However I think it would be better in this case to use a generic and then generate the code required inside the architecture. The first step though is to work out how to make the LFSR count given it requires a different set of taps (polynomial) for each different length register.
Common Interface
library ieee;
use ieee.std_logic_1164.all;
entity counter is
generic(
max_g : positive range 2 to positive'high
);
port(
clk : in std_ulogic;
reset : in std_ulogic;
enable : in std_ulogic;
finished : out std_ulogic
);
end entity;
Synchronous Counter
architecture sync of counter is
signal cnt : natural range 0 to max_g-1;
begin
process(clk)
begin
if rising_edge(clk) then
if reset = '1' then
cnt <= 0;
finished <= '0';
elsif enable = '1' then
if cnt = max_g-1 then
cnt <= 0;
else
cnt <= cnt + 1;
end if;
-- Needs to go high during the 50th clock cycle for which enable is high.
-- This allows a follow-on clocked process to have notice.
if cnt = max_g-2 then
finished <= '1';
else
finished <= '0';
end if;
end if;
end if;
end process;
end architecture;
LFSR Counter
library local;
architecture lfsr of counter is
constant taps : std_ulogic_vector := local.lfsr_pkg.get_taps(max_g);
constant max_reg : std_ulogic_vector := local.lfsr_pkg.lfsr_cnt(taps, max_g-1);
constant fin_reg : std_ulogic_vector := local.lfsr_pkg.lfsr_cnt(taps, max_g-2);
signal reg : std_ulogic_vector(taps'range);
begin
process(clk)
begin
if rising_edge(clk) then
if reset = '1' then
reg <= (others => '1');
finished <= '0';
elsif enable = '1' then
if reg = max_reg then
reg <= (others => '1');
else
reg <= local.lfsr_pkg.lsfr_feedback(reg, taps);
end if;
if reg = fin_reg then
finished <= '1';
else
finished <= '0';
end if;
end if;
end if;
end process;
end architecture;
I've separated the complication of the supporting functions out into a package to hide complexity and allow re-use of the functions over multiple LFSR counters.
library ieee;
use ieee.std_logic_1164.all;
package lfsr_pkg is
-- Look up the polynomial required for the count length and slice the returned
-- array down to the required size.
--
-- Usage:
-- constant taps : std_ulogic_vector := get_taps(5);
--
function get_taps(num : positive range 2 to positive'high) return std_ulogic_vector;
-- Provide the next state for the LFSR given the current state 'reg' and the
-- polynomial from a previous call to 'get_taps'.
--
-- Usage:
-- reg <= lsfr_feedback(reg, taps);
--
function lsfr_feedback(
reg : std_ulogic_vector;
taps : std_ulogic_vector
) return std_ulogic_vector;
-- Calculate the nth state of the LFSR such that the value can be used for a
-- comparison, e.g. when the LFSR has reached its maximal count value.
--
-- Usage:
-- constant max_reg : std_ulogic_vector := lfsr_cnt(taps, max);
--
function lfsr_cnt(
taps : std_ulogic_vector;
num : natural
) return std_ulogic_vector;
end package;
library local;
package body lfsr_pkg is
type taps_array_t is array(positive range <>) of std_ulogic_vector(31 downto 0);
-- LFSR polynomial array taps has been taken from the table listed in the article
-- "Tutorial: Linear Feedback Shift Registers (LFSRs)" by Max Maxfield, who with
-- others has also lifted them from a book
-- "Bebop to the Boolean Boogie (An Unconventional Guide to Electronics)".
--
-- Only the second half of each vector after the '&' is required, but in VHDL all
-- array elements must be of the same type, i.e. same length vector. So the index
-- array element must be sliced afterwards.
constant taps_array : taps_array_t := (
2 => "000000000000000000000000000000" & "11",
3 => "00000000000000000000000000000" & "101",
4 => "0000000000000000000000000000" & "1001",
5 => "000000000000000000000000000" & "10010",
6 => "00000000000000000000000000" & "100001",
7 => "0000000000000000000000000" & "1000001",
8 => "000000000000000000000000" & "10001110",
9 => "00000000000000000000000" & "100001000",
10 => "0000000000000000000000" & "1000000100",
11 => "000000000000000000000" & "10000000010",
12 => "00000000000000000000" & "100000101001",
13 => "0000000000000000000" & "1000000001101",
14 => "000000000000000000" & "10000000010101",
15 => "00000000000000000" & "100000000000001",
16 => "0000000000000000" & "1000000000010110",
17 => "000000000000000" & "10000000000000100",
18 => "00000000000000" & "100000000001000000",
19 => "0000000000000" & "1000000000000010011",
20 => "000000000000" & "10000000000000000100",
21 => "00000000000" & "100000000000000000010",
22 => "0000000000" & "1000000000000000000001",
23 => "000000000" & "10000000000000000010000",
24 => "00000000" & "100000000000000000001101",
25 => "0000000" & "1000000000000000000000100",
26 => "000000" & "10000000000000000000100011",
27 => "00000" & "100000000000000000000010011",
28 => "0000" & "1000000000000000000000000100",
29 => "000" & "10000000000000000000000000010",
30 => "00" & "100000000000000000000000101001",
31 => "0" & "1000000000000000000000000000100",
32 => "10000000000000000000000001100010" -- Never gets used with positive'high = 2**31-1
);
function get_taps(num : positive range 2 to positive'high) return std_ulogic_vector is
-- Need to select the taps correctly when a num-bit LFSR only counts through 2**num-1 states.
-- So for 'num' is a power of 2, select the next size up.
constant e : positive range 2 to positive'high := local.math_pkg.ceil_log(num+1);
begin
assert e >= taps_array'low and e <= taps_array'high report "no LFSR taps big or small enough for " & positive'image(num) & "." severity failure;
return taps_array(e)(e-1 downto 0);
end function;
function lsfr_feedback(
reg : std_ulogic_vector;
taps : std_ulogic_vector
) return std_ulogic_vector is
subtype fb_t is std_ulogic_vector(taps'length-2 downto 0);
begin
-- NB. As taps(tap_t'high) = 1 always, then reg(tap_t'high) gets XOR'ed
-- with itself, hence zero'ed and then discarded.
return (reg(fb_t'range) xor (taps(fb_t'range) and fb_t'(others => reg(taps'high)))) & reg(taps'high);
end function;
-- Loop limits might also limit the maximum value passed to this function.
-- Xilinx Vivado v2019.1.1 (64-bit)
-- ERROR: [Synth 8-403] loop limit (65538) exceeded
function lfsr_cnt(
taps : std_ulogic_vector;
num : natural
) return std_ulogic_vector is
subtype fb_t is std_ulogic_vector(taps'length-2 downto 0);
variable max_reg : std_ulogic_vector(taps'length-1 downto 0) := (others => '1');
begin
for i in 1 to num loop
max_reg := lsfr_feedback(max_reg, taps);
end loop;
return max_reg;
end function;
end package body;
Under the hood, the taps_array has been taken from the table listed in Tutorial: Linear Feedback Shift Registers (LFSRs) by Max Maxfield, who with others has also lifted them from a book "Bebop to the Boolean Boogie (An Unconventional Guide to Electronics)". VHDL constrains all elements of an array to be the same size, so they are all the size of the maximum polynomial, and then bus ripped down to size on use. I have chosen to avoid trying to reach the normally unreachable state on the grounds that for maximum count values of 2n its little cost to add one more bit to the internal counter register. Note that the LFSR can count to values beyond that which you can specify with a VHDL integer generic of type positive, so the 'swapability' of this solution breaks down for count values greater than or equal to 231-1.
Performance
'max' Generic | Synchronous | LFSR | ||
---|---|---|---|---|
Clock Frequency (MHz) post Place |
Clock Frequency (MHz) post Route |
Clock Frequency (MHz) post Place |
Clock Frequency (MHz) post Route |
|
50 | 671.1 | 677.0 | 678.0 | 685.9 |
300 | 509.2 | 509.2 | 612.7 | 610.9 |
4000 | 430.1 | 430.7 | 661.4 | 657.0 |
50000 | 430.3 | 410.2 | 534.5 | 561.5 |
As mentioned earlier FPGAs do well for synchronous counters, so they typically get clock speeds in the range 410-670 MHz. Swapping them to LFSRs feels like a pointless exercise to increase the speed of the larger counters from 410 to 560 MHz. As previously said, the situation is different for ASICs.
A word of warning when using Vivado with this code, for large enough maximum count values you will get an error message:
ERROR: [Synth 8-403] loop limit (65538) exceeded
This is just one of several irritating behaviours Vivado exhibits when assigning constants using a function to calculate the value. In order for the LFSR counter to know what the terminal count value is (assigned to a constant), it must iterate a function a "few" times as it can't be predicted. Intel's Quartus Prime does not error out in this way.
I leave it as an exercise for the reader to combine the code for each architecture into a single one with a generic to flip between the two types of counter, e.g. based on whether you are simulating to synthesising.
It is also worth noting an online tool for generating LFSR code, provided by Guy Eschemann at New Generator: Linear-Feedback Shift Registers. Some may consider this more convenient, and easier for teaching purposes.