The previous blog derived an 'AXI Edit' component that performs the operations 'pass', 'pause', 'swap', 'drop' and 'insert' on an AXI stream. The key problem solved was inserting data into the stream: an inserted word must either occupy an existing space or create one by pausing the data stream, all without breaking the AXI specification. Getting the interface correct for insertions also means realising that the alternative data input needs to manage back pressure. This blog now uses that component to derive methods for mutating a data stream according to predefined operations managed by a finite state machine (FSM).
- Protocol Manipulations
- RTL Architecture
- Basic Actions
- 'Drop' Timing Diagram
- Insert Action
- 'Insert' Timing Diagram
- AXI Processing Conditions
- Testing
- Conclusions
- References
Protocol Manipulations
For this example, imagine a data stream of ASCII characters. As this data stream arrives it is passed through an AXI shift register and manipulated by an FSM. The table below lists the manipulations that the FSM can apply. The notation is that capital letters are triggers for some protocol manipulation, and both upper and lower case letters are data. The assumption is that whenever we want to insert data, the actual data required will be available, so its content is not important for this exercise and we just insert a constant character. In practice the triggers may instead be bits of 32-bit words or positions from the start of a packet.
Test Reference | Purpose | Example Input | Example Output |
---|---|---|---|
1a | Swap a sequence from character 'A' until character 'B' with 'y' characters, retain 'A' & 'B' characters. | aaaAxxxBbbb | aaaAyyyBbbb |
1b | Swap a sequence from character 'C' until character 'D' with 'y' characters, use 'D' as the stop character. | aaaCxxxDbbb | aaayyyDbbb |
1c | Swap a sequence from character 'E' until character 'F' with 'y' characters, use 'E' as the start character. | aaaExxxFbbb | aaaEyyybbb |
1d | Swap a sequence from character 'G' until character 'H' with 'y' characters, drop 'G' & 'H' characters. | aaaGxxxHbbb | aaayyybbb |
2a | Delete a single character 'Z'. | aZb | ab |
3a | Delete sequence from character 'I' until character 'J', retain 'I' & 'J' characters. | aaaIxxxJbbb | aaaIJbbb |
3b | Delete sequence from character 'K' until character 'L', use 'L' as the stop character. | aaaKxxxLbbb | aaaLbbb |
3c | Delete sequence from character 'M' until character 'N', use 'M' as the start character. | aaaMxxxNbbb | aaaMbbb |
3d | Delete sequence from character 'O' until character 'P', drop 'O' & 'P' characters. | aaaOxxxPbbb | aaabbb |
4a | Insert a character, before 'Q'. | aaaQbbb | aaayQbbb |
4b | Insert a character, after 'R'. | aaaRbbb | aaaRybbb |
5a | Insert sequence, before character 'S'. | aaaSbbb | aaayyySbbb |
5b | Insert sequence, after character 'T'. | aaaTbbb | aaaTyyybbb |
5c | Insert sequence, instead of character 'U'. | aaaUbbb | aaayyybbb |
5d | Insert sequence, between 'V' and 'W'. | aaVWbb | aaVyyyWbb |
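The notation in the table can be captured in a small VHDL package. This is a minimal sketch, not code from the "Protocol Edit" design: the package name `trigger_sketch_pkg`, the constant `fill_char_c` and the functions `to_char`, `to_slv` and `is_trigger` are all hypothetical names introduced for illustration, and the data words are assumed to be 8 bits wide.

```vhdl
library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

-- Hypothetical helper package, assuming 8-bit ASCII data words.
package trigger_sketch_pkg is

  -- The constant character inserted or swapped in by the examples in the table.
  constant fill_char_c : character := 'y';

  -- Convert between an 8-bit data word and a VHDL character.
  function to_char(d : std_logic_vector(7 downto 0)) return character;
  function to_slv(c : character) return std_logic_vector;

  -- True for the capital letters used as triggers in the table ('A' to 'W', and 'Z').
  function is_trigger(d : std_logic_vector(7 downto 0)) return boolean;

end package;

package body trigger_sketch_pkg is

  function to_char(d : std_logic_vector(7 downto 0)) return character is
  begin
    return character'val(to_integer(unsigned(d)));
  end function;

  function to_slv(c : character) return std_logic_vector is
  begin
    return std_logic_vector(to_unsigned(character'pos(c), 8));
  end function;

  function is_trigger(d : std_logic_vector(7 downto 0)) return boolean is
    variable c : character;
  begin
    c := to_char(d);
    return (c >= 'A' and c <= 'W') or c = 'Z';
  end function;

end package body;
```

Keeping the trigger decode in one place like this would make it easier to swap the ASCII triggers for bit fields or packet positions later.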
RTL Architecture

Basic Actions
The plan is to parse the data with an FSM, which takes a data cycle to derive the action to perform on the data. We definitely do not want the FSM in-line with the data because the tready logic is typically not registered, and FSMs add a clock cycle delay. Hence we need a parallel path that provides an AXI shift register controlled from an FSM that reads the same data in parallel. If the only actions are 'pass', 'pause', 'drop' and 'swap' then life is easier. The 'insert' action adds complication, so skip that for a moment. Under the non-'insert' conditions we need a single data delay to match the delay through the FSM, and then the FSM will control an "AXI Edit" stage quite happily.
'Drop' Timing Diagram
NB. Below the "subordinate" is attached to the "manager interface" with the m_* signals.
The timing diagram has been created for the case where no back pressure is exerted from the AXI manager side and input data is always valid. This makes it simpler to show how words flow in space and time. It means m_axi_ready = '1' and s_axi_valid = '1' throughout, so these signals are not shown. We do need to show the effect of dropping data words on m_axi_valid.
When dropping an unknown sequence length, i.e. until a trigger stop word, the same state is re-used to observe the input. This means we avoid the complications observed for the 'insert' action.
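The re-use of a single state while dropping can be sketched for test 3d (delete from 'O' until 'P', dropping both). This is a minimal sketch under stated assumptions rather than the actual "Protocol Edit" FSM: the entity name `drop_fsm_sketch`, the package `edit_sketch_pkg` and the type `axi_op_t` are hypothetical, the data is assumed to be 8-bit ASCII, and the registered `axi_op` is assumed to line up with the word leaving the single delay stage one cycle after the FSM observed it. It only models the timing diagram conditions with no back pressure.

```vhdl
-- Assumed encoding of the five "AXI Edit" operations; the real component may differ.
package edit_sketch_pkg is
  type axi_op_t is (pass, pause, swap, drop, insert);
end package;

library ieee;
  use ieee.std_logic_1164.all;
  use work.edit_sketch_pkg.all;

entity drop_fsm_sketch is
  port (
    clk         : in  std_logic;
    reset       : in  std_logic;                     -- synchronous, active high
    s_axi_data  : in  std_logic_vector(7 downto 0);  -- observed in parallel with the delay stage
    s_axi_valid : in  std_logic;
    s_axi_ready : in  std_logic;
    axi_op      : out axi_op_t                       -- applied to the delayed copy of the word
  );
end entity;

architecture rtl of drop_fsm_sketch is

  constant char_o_c : std_logic_vector(7 downto 0) := x"4F"; -- ASCII 'O', start trigger
  constant char_p_c : std_logic_vector(7 downto 0) := x"50"; -- ASCII 'P', stop trigger

  type state_t is (passing, dropping);
  signal state : state_t;

begin

  process(clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        state  <= passing;
        axi_op <= pass;
      elsif s_axi_ready = '1' and s_axi_valid = '1' then
        -- Only move on when a word is actually consumed from s_axi_*.
        case state is
          when passing =>
            if s_axi_data = char_o_c then
              axi_op <= drop;      -- drop the 'O' itself
              state  <= dropping;
            else
              axi_op <= pass;
            end if;
          when dropping =>
            -- The same state keeps observing the input until the stop word arrives.
            axi_op <= drop;        -- drops the sequence and the final 'P'
            if s_axi_data = char_p_c then
              state <= passing;
            end if;
        end case;
      end if;
    end if;
  end process;

end architecture;
```

For variants 3a to 3c, the same two states would apply with `pass` instead of `drop` on the start or stop word.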
Insert Action
When the insert action is included we have complications. The 'insert' action will pause the data arriving at the edit block, but without further intervention each new data word shifts into the first AXI delay stage, beyond where the FSM is looking. If the FSM's decision is to insert a known amount of data that does not depend on a subsequent trigger in the data stream, there is no issue. When a trigger must be read from the data stream, e.g. a stop word or the next action trigger, then that word must be left where the FSM can parse it. Here we need to intervene and provide a 'pause' in the first AXI delay stage in order to keep the word in sight. This means stalling the data flow with an additional control signal.
'Insert' Timing Diagram
NB. Below the "manager" is attached to the "subordinate interface" with the s_* signals, and the "subordinate" is attached to the "manager interface" with the m_* signals.
The timing diagram has been created for the case where no back pressure is exerted from the AXI manager side and input data is always valid. This makes it simpler to show how words flow in space and time. It means m_axi_ready = '1' and s_axi_valid = '1' throughout, so these signals are not shown. We do need to show the effect of inserting data words on s_axi_ready.
The key takeaway from this exercise is that inserting a sequence of unknown length will always require the data stream to stall by one data cycle. The input stream must be held up at the first stage where the FSM can see the next data word, and that data cycle is then lost while the FSM makes the next action decision. The other solution would be for the FSM to execute two different states concurrently!? Hence this data cycle slip is unavoidable.
AXI Processing Conditions
Stalling the data flow with a 'pause' in the first stage of delay or with either a 'pause' or an 'insert' in the second means that the input AXI data stops with potential for deadlock (activity stops) if the standard conditions are used for reading in the state machine. Some familiarity with the "Protocol Edit" VHDL code is required for the precise understanding of these conditions. The diagram becomes too cluttered when including all the details, hence they are not shown.
Situation | Condition | Comment |
---|---|---|
Data reading | if s_axi_ready = '1' and s_axi_valid = '1' then | Consuming data arriving on s_axi_*. |
Data insertion | if alt_ready = '1' and alt_valid = '1' then | Completing data sent from the FSM to the AXI Edit stage on alt_*. |
Data input stalled | if delay_ready = '1' and delay_valid = '1' then | Completing data sent from the AXI Delay to the AXI Edit stage on delay_*. |
The first row of the table is the standard condition on which data is clocked through the delay lines. But this condition will cause deadlock if used when the second stage is paused or inserting data, which has then exerted back pressure on the s_axi_* inputs. Instead the second row must be used following an 'insert'. Coming out of 'pause' must not be dependent on the incoming data! When the first stage is paused in order to retain the next input word where the FSM can see it, then the third row of the table gives the condition for moving to the next state. Essentially the last two rows are checking for data being taken away and hence freeing up the registers in the second stage of delay (AXI Edit).
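The choice between the three rows can be written down as a small helper function. This is a minimal sketch assuming the FSM remembers what it did in the previous state: the package `advance_sketch_pkg`, the type `prev_action_t` and the function `can_advance` are hypothetical names for illustration and are not taken from the "Protocol Edit" code.

```vhdl
library ieee;
  use ieee.std_logic_1164.all;

package advance_sketch_pkg is

  -- What the FSM did in the previous state (hypothetical names).
  type prev_action_t is (normal, inserted, input_stalled);

  -- True when the FSM may move on, using the table row that matches the previous action.
  function can_advance(
    prev                     : prev_action_t;
    s_axi_ready, s_axi_valid : std_logic;
    alt_ready, alt_valid     : std_logic;
    delay_ready, delay_valid : std_logic
  ) return boolean;

end package;

package body advance_sketch_pkg is

  function can_advance(
    prev                     : prev_action_t;
    s_axi_ready, s_axi_valid : std_logic;
    alt_ready, alt_valid     : std_logic;
    delay_ready, delay_valid : std_logic
  ) return boolean is
  begin
    case prev is
      when normal =>
        -- Row 1: a word was consumed from s_axi_*.
        return s_axi_ready = '1' and s_axi_valid = '1';
      when inserted =>
        -- Row 2: the inserted word was taken by the AXI Edit stage.
        return alt_ready = '1' and alt_valid = '1';
      when input_stalled =>
        -- Row 3: the stalled word left the AXI Delay stage.
        return delay_ready = '1' and delay_valid = '1';
    end case;
  end function;

end package body;
```

A single selector like this only covers states with one way in. A state that can be entered after different actions would still need to decode more than one condition.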
When writing the FSM, care must be taken to use the correct conditions based on the actions in the previous state. It gets more complicated when there are two ways of entering a state and the previous two possible states have different actions requiring different conditions. Here it is worth noting that the 'insert' in the second stage AXI Edit block will complete before the next word from the data stream is consumed.
```vhdl
-- An insert completes before the stalled word in delay_data, and before or at the same time as the next input word.
if alt_ready = '1' and alt_valid = '1' then
  axi_op     <= pass;
  pass_input <= '1';
end if;
if s_axi_ready = '1' and s_axi_valid = '1' then
  ....
```
The code above shows that you can avoid deadlock by decoding both conditions in the state that does the general consumption between trigger words. When the insertion has completed, the FSM returns to passing data. It will then be able to move through the input stream again, even when both conditions occur on the same clock cycle.
Testing
The device under test (DUT) implements each of the cases in the first table. The test bench uses OSVVM to automatically and randomly generate input sequences conforming to the third column, "Example Input", and a scoreboard conforming to the fourth column, "Example Output". There are two modes of operation for the test bench. When the VHDL constant fast_forward_c = true, there is no back pressure and the input data is fully enabled. The test bench will not pass in this mode, but it is useful for debugging the data progression. When fast_forward_c = false, the AXI Verification Components (VCs) in OSVVM do their randomised best to expose AXI handshake and compliance issues. The development of this DUT used an AXI component, "AXI Edit" (twice, in two different configurations), that had already been tested thoroughly with OSVVM. Using a component instantiation for each stage forced AXI compliance on each interface, avoiding any temptation to work the control logic across the boundary.
AXI Option | Value |
---|---|
TRANSMIT_VALID_DELAY_CYCLES | 0 |
RECEIVE_READY_BEFORE_VALID | true |
RECEIVE_READY_DELAY_CYCLES | 0 |
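A scoreboard needs an expected output for every generated input. One way to obtain it is a behavioural reference function per test case. The sketch below covers test 3d only and is an assumption about how such a model could look, not the author's test bench: the package `ref_model_sketch_pkg` and the function `delete_o_to_p` are hypothetical names, and no OSVVM calls are shown.

```vhdl
-- Hypothetical reference model for test 3d: delete from 'O' until 'P', dropping both.
package ref_model_sketch_pkg is
  function delete_o_to_p(s : string) return string;
end package;

package body ref_model_sketch_pkg is

  function delete_o_to_p(s : string) return string is
    variable result   : string(1 to s'length);
    variable n        : natural := 0;      -- number of characters kept so far
    variable deleting : boolean := false;  -- true between 'O' and 'P'
  begin
    for i in s'range loop
      if deleting then
        -- Discard everything up to and including the stop character.
        if s(i) = 'P' then
          deleting := false;
        end if;
      elsif s(i) = 'O' then
        -- Discard the start character and begin deleting.
        deleting := true;
      else
        n := n + 1;
        result(n) := s(i);
      end if;
    end loop;
    return result(1 to n);
  end function;

end package body;
```

With the example from the first table, `delete_o_to_p("aaaOxxxPbbb")` returns `"aaabbb"`, which is the value the scoreboard would expect the DUT to produce for that input.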
Conclusions
This example illustrates how two copies of the AXI Edit shift register can be used in series and an FSM in parallel with the first stage in order to make significant changes to the input data stream successfully. The FSM currently implements a simplistic set of changes as the example was designed particularly to improve the understanding of how to perform the insertions correctly, and continue to parse the data stream afterwards. It is key to realise the data insertions need additional intervention in the first delay stage for many realistic word insertion scenarios and this comes with an unavoidable data cycle penalty as explained above.