The aim of this design is to extend the partial reconfiguration design in a previous blog to demonstrate partial reconfiguration from within the FPGA using the ICAP interface.
This is part 2 of 3 in the Partial Reconfiguration series:
- Dynamic Function eXchange
- Dynamic Function eXchange with ICAP
- Dynamic Function eXchange with ICAP Driven by Software
- Glossary
- Design
- Block Diagram
- VHDL
- DFX Controller IP and Memory Layout
- Full Build Process
- Constraints
- PS Code
- Simulation
- Demonstration
- Conclusions
- References
Glossary
Term | Function |
---|---|
ICAP | Internal Configuration Access Port, driven by dedicated logic in PL. See "7 Series FPGAs Configuration User Guide", UG470 |
MCAP | Media Configuration Access Port, a dedicated connection from one specific PCIe block on a device to the configuration engine. See "Dynamic Function eXchange User Guide", UG909 |
PCAP | Processor Configuration Access Port, driven by the Zynq PS. See "Zynq 7000 SoC Technical Reference Manual", UG585 |
Design
The design comprises a Vivado block diagram, VHDL and software for the PS.
Block Diagram

The entire design can be recreated from a TCL script exported from the block diagram editor and augmented to include the VHDL source files required. The script is known to work in Xilinx's Vivado version 2023.2.2.
As you can see, the entire purpose of the block diagram is to include the PS7 processor. The custom VHDL block includes the reconfigurable modules, both hosting them in the reconfigurable partition and swapping them with the ICAP. There is no interaction between the PS and PL. The PS is required for a single function, described under PS Code.
VHDL
The top level of VHDL includes a natural generic for changing the value used by the reconfigurable module. A problem with using the block diagram is the VHDL sub-section of the design gets compiled into a .DCP file, ps_pl_wrapper.dcp. That file does not get updated when the generic value is changed, and the result is confusion in the build script. For this reason the .DCP needs to be deleted manually when the generic value is updated, and the associated Vivado run must also be reset to force re-synthesis.
# Need to manage hang-over DCP files.
remove_files -quiet -fileset utils_1 ${proj_stem}.srcs/utils_1/imports/synth_1/ps_pl_wrapper.dcp
if {[file exists ${proj_stem}.srcs/utils_1/imports/synth_1/ps_pl_wrapper.dcp]} {
    file delete -force ${proj_stem}.srcs/utils_1/imports/synth_1/ps_pl_wrapper.dcp
}

The VHDL is the same as the previous DFX design with one addition. The block labelled "Reconfig" (reconfig_action) provides VHDL for four simple functions:
- External ICAP, which depends on the series of device used, for reprogramming the Reconfigurable Partitions (RPs) in the FPGA.
- A Xilinx "DFX Controller" IP Core with a single "Virtual Stream" which can manage 4 Reconfigurable Modules (RMs).
- An AXI BRAM Controller to make it easy to interface the DFX Controller with a standard memory over an AXI-MM bus.
- An XPM Single Port ROM (Series 7) which allows the specification of a memory file for the ROM contents in simulation.
To stitch the VHDL-2008 files into the Vivado block diagram, a wrapper is required to provide a VHDL '93 file that can be added to the diagram as a module. Yep, their software was released in 2023.
DFX Controller IP and Memory Layout

The DFX Controller has been configured for 4 RMs to be delivered to a single "Virtual Stream". A virtual stream manages a single RP, allowing the controller to manage multiple RPs. Each RM is given an active high local reset and needs a RAM start address and bitstream size to be configured in the IP configuration. This means at least one RM needs to be created before the simulation can be run. From a practical point of view, the DFX Controller IP core provides a TCL library to allow configuration post synthesis and implementation by modifying the netlist of the DFX Controller, but only from a design that can still access the IP's configuration. This means you cannot use a saved .DCP file for this purpose; you must use the full and original Vivado project. The RM sizes in the DFX Controller IP must be correct, at least when using compression, so the IP core's configuration must be amended later in the build process.

Here the choice of compression can be selected and I chose the smallest FIFO and hence the use of distributed RAM. Increasing the size did not appear to improve the speed of decompression for my simple design during simulation. The table below gives the results for using the DFX Controller's own compression dfx_controller_v1_0::format_bin_for_icap, as opposed to the standard Vivado bitstream compression option.
RM | Uncompressed | DFX Compressed | DFX Reduction | Vivado Compressed | Vivado Reduction |
---|---|---|---|---|---|
0 | 117,448 | 2,932 | 97.5% | 57,530 | 51.0% |
1 | 117,448 | 3,176 | 97.3% | N/A | - |
2 | 117,448 | 3,152 | 97.3% | N/A | - |
3 | 117,448 | 3,208 | 97.3% | N/A | - |
4 | 117,448 | 3,128 | 97.3% | N/A | - |
My removable modules are such simple designs that I was expecting massive compression. I was able to compare this to a bitstream written out for RM0, but not the others. The standard Vivado bitstream compression is terrible by comparison. Vivado could not write out compressed bitstreams for RM1-4 as that property needs to be set on an implemented design, which is not available within the in-memory project used for synthesis of the RMs, probably because I used non-project mode.
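The reduction columns in the table follow from a simple ratio of file sizes. A minimal sketch of that arithmetic is below; the function name `reduction_percent` is made up for illustration and is not part of any Xilinx tooling.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage size reduction achieved by compression:
 * 100 * (1 - compressed / uncompressed).
 * 'reduction_percent' is a hypothetical helper name. */
double reduction_percent(uint32_t uncompressed_bytes, uint32_t compressed_bytes)
{
    assert(uncompressed_bytes > 0);
    return 100.0 * (1.0 - (double)compressed_bytes / (double)uncompressed_bytes);
}
```

Feeding it the RM0 row (117,448 bytes down to 2,932 with the DFX Controller's compression, or down to 57,530 with Vivado's) gives values that round to the 97.5% and 51.0% shown in the table.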

The above screenshot shows how each removable module is configured on the right and each virtual stream on the left. RM_1 has a start address of 0x00001000 (byte offset) and a size of 3152 bytes. The triggers can be set up for each RM, although the ability to reallocate bits to RMs feels like unnecessary overkill. The setting of "RM 0" for "Has PoR RM" is a bit of a fib, as the initial image uses an RM not available to the DFX Controller. Confusingly, my scripts do not match the RM_n defined in the DFX Controller with the RMn.BIN files; they are out by one.
source -notrace [get_property REPOSITORY \
    [get_ipdefs *dfx_controller:1.0]]/xilinx/dfx_controller_v1_0/tcl/api.tcl
puts "INFO: \[$script\] Amending DFX Controller IP memory parameters in the netlist."
set dfx_ip [get_ips dfx_controller]
if {$dfx_ip eq ""} {
    error "ERROR: \[$script\] DFX IP not found."
}
set config [get_property CONFIG.ALL_PARAMS $dfx_ip]
if {$config eq ""} {
    error "ERROR: \[$script\] DFX configuration not found."
}
set dscr [dfx_controller_v1_0::netlist::get_descriptor $config {ps_pl_i/vhdl_conv_i/U0/wrapper_i/reconfig_action_i/dfx_controller_i/U0}]
if {$dscr eq ""} {
    error "ERROR: \[$script\] DFX descriptor not found."
}
# Address offsets in bytes, we have 2 ** 'rom_addr_bits+2' bytes available
dfx_controller_v1_0::netlist::set_rm_bs_address dscr VS_0 RM_0 [format "0x%08X" 0]
dfx_controller_v1_0::netlist::set_rm_bs_address dscr VS_0 RM_1 [format "0x%08X" 4096]
dfx_controller_v1_0::netlist::set_rm_bs_address dscr VS_0 RM_2 [format "0x%08X" 8192]
dfx_controller_v1_0::netlist::set_rm_bs_address dscr VS_0 RM_3 [format "0x%08X" 12288]
set mem_file_list {}
set ip_rm 0
# rm*_comp.bin should all have been created by now
foreach rm $subsequent_rms {
    set rm_file "$prod_dir/rm${rm}_comp.bin"
    if {[file exists $rm_file]} {
        dfx_controller_v1_0::netlist::set_rm_bs_size dscr VS_0 RM_${ip_rm} [file size $rm_file]
        puts "INFO: \[$script\] RM${ip_rm}, file '${rm_file}' size set to [file size $rm_file]"
        lappend mem_file_list $rm_file
        incr ip_rm
        puts "ip_rm = ${ip_rm}"
    } else {
        puts "ERROR: \[$script\] File '${rm_file}' not found."
    }
}
dfx_controller_v1_0::netlist::apply_descriptor dscr
Memory organisation requires some iterative guesswork. You have to guess the resulting size of the compressed .BIN files in order to determine the size of ROM you need to instantiate. Hence the XPM ROM is handy, as it's trivial to alter the generic parameters without needing to regenerate a Xilinx memory IP core.
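The guess can be turned into a quick calculation once the byte address just past the last RM image is known. The sketch below assumes, as the script comments state, that the ROM holds 2 ** rom_addr_bits 32-bit words (2 ** (rom_addr_bits + 2) bytes); the function name `min_rom_addr_bits` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest word-address width such that a ROM of 2^bits 32-bit words
 * (2^(bits+2) bytes) covers every byte below 'end_byte'.
 * 'min_rom_addr_bits' is a made-up name for illustration. */
unsigned min_rom_addr_bits(uint64_t end_byte)
{
    assert(end_byte > 0);
    uint64_t words = (end_byte + 3u) / 4u;  /* round up to whole words */
    unsigned bits = 0;
    while (bits < 63u && ((uint64_t)1 << bits) < words) {
        bits++;
    }
    return bits;
}
```

For example, if the last image starts at byte 12288 and is 3,208 bytes long, the layout ends at byte 15496, which needs 12 address bits (4096 words, 16 KiB). An image that pushed the end one byte past 16384 would need 13.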
# Memory layout is specified by 'dscr', so extract it from the provided netlist for the IP core
#
proc multiple_bin2mem {dscr outfile args} {
    upvar rom_addr_bits rom_addr_bits
    set proc_name [lindex [info level 0] 0]
    set rm 0
    set addr 0
    set fo [open $outfile w]
    puts "dscr = $dscr"
    puts $fo "// Generated by TCL '$proc_name' in script '[file normalize [info script]]'."
    puts $fo "// MEM file created from:"
    foreach infile $args {
        puts "dfx_controller_v1_0::netlist::get_rm_bs_address dscr VS_0 RM_${rm}"
        set start [dfx_controller_v1_0::netlist::get_rm_bs_address dscr VS_0 RM_${rm}]
        puts $fo "// * '$infile', start address [format "0x%08X" $start] ($start), size [file size $infile] bytes."
        incr rm
    }
    set rm 0
    foreach infile $args {
        # DFX Controller address should be in bytes
        set start [dfx_controller_v1_0::netlist::get_rm_bs_address dscr VS_0 RM_${rm}]
        # DFX Controller size is definitely in bytes
        set size [dfx_controller_v1_0::netlist::get_rm_bs_size dscr VS_0 RM_${rm}]
        if {[file size $infile] > $size} {
            puts "WARNING: \[$proc_name\] Allocated space not large enough for the MEM file contents ([file size $infile] > $size). Change the IP settings for RM${rm}."
        } elseif {$size == 0} {
            puts "WARNING: \[$proc_name\] IP configuration error for RM${rm}, change the IP settings."
        } else {
            if {[file size $infile] != $size} {
                puts "INFO: \[$proc_name\] DFX RM${rm} file $infile, size is $size bytes, should be [file size $infile] bytes"
            }
            # DFX Controller address should be in bytes, we want addresses of 32-bit words.
            set start [expr {$start / 4}]
            # DFX Controller size is definitely in bytes, we want addresses of 32-bit words.
            set size [expr {$size / 4}]
            # Not including this last address, i.e. up to ($end-1)
            set end [expr {$start + $size}]
            puts "INFO: \[$proc_name\] - DFX RM${rm} file $infile, start $start, end $end, size $size (32-bit words)"
            # Pad up to the start address
            for {set i $addr} {$i < $start} {incr i} {
                puts $fo [format "@%08X FFFFFFFF" $i]
            }
            set fi [open $infile r]
            fconfigure $fi -translation binary
            while {($addr < [expr {2 ** $rom_addr_bits}]) && ($addr < $end)} {
                if {$addr >= $start && $addr < $end && ![eof $fi]} {
                    # Inside ROM space with data to read
                    set word [read $fi 4]
                    binary scan $word Iu1 w
                    puts $fo [format "@%08X %08X" $addr $w]
                } else {
                    # Outside ROM space, e.g. after '$end'
                    puts $fo [format "@%08X FFFFFFFF" $addr]
                }
                incr addr
            }
            if {($addr >= 2 ** $rom_addr_bits) && ![eof $fi]} {
                puts "ERROR: \[$proc_name\] - ROM address space insufficient."
            }
            close $fi
        }
        incr rm
        for {set i $addr} {$i < [expr {2 ** $rom_addr_bits}]} {incr i} {
            puts $fo [format "@%08X FFFFFFFF" $i]
        }
    }
    close $fo
}
multiple_bin2mem extracts the DFX Controller IP core parameters for the start points and size, then writes each RM into a single address space, offset by its respective start point and verifies the size. The binary .BIN files are converted to .MEM format in order to initialise the XPM ROM for simulation and inserting into the final bitstream file. Note that the build process requires that a .MEM file be present for synthesis, so an empty ROM is provided in the source code for the purposes of the initial build.
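The core of the conversion is small: four bytes are read in file order, assembled into one big-endian 32-bit word (which is what `binary scan $word Iu1 w` does in the Tcl), and written as an `@address data` line using a word address. The C sketch below restates just that step; `be_word` and `format_mem_line` are hypothetical names, not part of the build scripts.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assemble four bytes, in file order, into one big-endian 32-bit word. */
uint32_t be_word(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
}

/* Format one .MEM line, "@AAAAAAAA DDDDDDDD", for a 32-bit word address.
 * Returns the number of characters written (18 when the buffer is big enough). */
int format_mem_line(char *out, size_t out_len, uint32_t word_addr, uint32_t data)
{
    assert(out != NULL && out_len > 0);
    return snprintf(out, out_len, "@%08" PRIX32 " %08" PRIX32, word_addr, data);
}
```

With this, a byte offset of 0x1000 (the RM_1 start address) becomes word address 0x400, so the first word of that image lands on a line beginning `@00000400`.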
write_mem_info \
    -force_detect_xpm \
    -force \
    "$prod_dir/roms.mmi"
exec "[file normalize $vivado_install]/bin/updatemem.bat" \
    -force \
    -meminfo "$prod_dir/roms.mmi" \
    -data "$prod_dir/rm_all_comp.mem" \
    -bit "$prod_dir/initial.bit" \
    -proc "ps_pl_i/vhdl_conv_i/U0/wrapper_i/reconfig_action_i/rom_i/xpm_memory_base_inst" \
    -out "$prod_dir/initial_rom.bit"
Vivado comes with a standard updatemem.bat script that allows the replacement of the ROM initialisation contents by re-writing the bitstream file. write_mem_info must be used to provide the layout information of the memory primitives for updatemem.bat.
Full Build Process
Other than the extra work to change the generic value used for each RM, the linking of the static with each different RM is the same process as previously performed.

Putting the above together gives the full TCL build script available via GitHub. Note that the DFX Controller parameters must be amended in the netlist before the first bitstream file is written so that the ROM contents can be amended and the final bitstream written.
Constraints
WARNING: [Vivado 12-7117] write_checkpoint: the -cell option is not recommended on a hierarchical cell that is not marked as HD.RECONFIGURABLE = TRUE. Resolution: To avoid physical conflicts during design assembly, set DONT_TOUCH = TRUE on the hierarchical cell, and exclusively assign the cell to a PBlock with EXCLUDE_PLACEMENT = TRUE and CONTAIN_ROUTING = TRUE
Since the move to using a block diagram, I have not found a way to get constraints picked up early enough to apply important constraints like HD.RECONFIGURABLE, that could really matter. I had to place the DONT_TOUCH and KEEP_HIERARCHY attributes directly in VHDL to solve the most pressing problems, but several of these constraints do not have VHDL attribute equivalents. I'm still looking for a solution.
PS Code
The connection for the ICAP controller is explained in the product guide and data sheet for the AXI_HWICAP pcore. To enable the ICAP path from the ICAP controller to the PL configuration module, make sure the other controllers are finished using the PL configuration module and then set the [PCAP_MODE] bit = 1 and the [PCAP_PR] bit = 0. The ICAP path is used when a MicroBlaze processor is controlling the PL reconfiguration or as an alternative to the PCAP path.
ICAP Controller, Zynq 7000 SoC Technical Reference Manual, UG585
This means software needs to switch the programming mux from PCAP to ICAP, hence the following piece of code is required for the PS, and hence the need for the block diagram. See Zynq 7000 SoC Technical Reference Manual, UG585 for details.
#include "sleep.h"
#include "xil_printf.h"
#include <xil_io.h>

#define XDCFG_CTRL_OFFSET 0xF8007000

int main() {
    xil_printf("Running");
    // https://docs.amd.com/r/en-US/ug585-zynq-7000-SoC-TRM/ICAP-Controller
    // https://docs.amd.com/r/en-US/ug585-zynq-7000-SoC-TRM/Register-XDCFG_CTRL_OFFSET-Details
    // XDCFG_CTRL_OFFSET @ 0xF8007000
    // xsct% mwr 0xF8007000 [expr [mrd -value 0xF8007000] & 0xF7FFFFFF]
    // Turn off bit XDCFG_CTRL_PCAP_PR_MASK (PCAP_PR) to enable ICAPE2 to re-program logic
    Xil_Out32(XDCFG_CTRL_OFFSET, Xil_In32(XDCFG_CTRL_OFFSET) & 0xF7FFFFFF);
    while (1) {
        sleep(1);
        xil_printf(".");
    }
}
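The magic number 0xF7FFFFFF is simply "every bit except bit 27", so the read-modify-write clears one bit of the control register and leaves the rest alone. The sketch below separates that bit arithmetic from the register access so it can be checked on a host machine; the macro and function names are hypothetical, and the bit position is inferred from the mask used above rather than taken from a driver header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of PCAP_PR, inferred from the 0xF7FFFFFF mask above. */
#define CTRL_PCAP_PR_BIT   27u
#define CTRL_PCAP_PR_MASK  ((uint32_t)1u << CTRL_PCAP_PR_BIT)

/* Compile-time check that the named mask matches the literal in the PS code. */
static_assert((uint32_t)~CTRL_PCAP_PR_MASK == 0xF7FFFFFFu,
              "PCAP_PR mask should clear only bit 27");

/* Return the control register value with PCAP_PR cleared and every
 * other bit unchanged. 'ctrl_select_icap' is a made-up helper name. */
uint32_t ctrl_select_icap(uint32_t ctrl)
{
    return ctrl & (uint32_t)~CTRL_PCAP_PR_MASK;
}
```

On the target, the equivalent of the original line would be `Xil_Out32(XDCFG_CTRL_OFFSET, ctrl_select_icap(Xil_In32(XDCFG_CTRL_OFFSET)));`.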

The build process for the PS is less easy to script and I just stuck to using the GUI to run the code. The Vivado build.tcl script outputs an .XSA file, dfx_ps_legacy.xsa, that can be imported to the platform creation process to create the "First Stage Boot Loader" fsbl.elf file. Then the bare-metal application based on that platform is compiled, creating the application dfx_app.elf file. These, along with the Vivado bitstream file, initial_rom.bit, can be used to programme the Quad SPI Flash. For step by step instructions to create the platform and application you may like to follow the guide by Joseph Abbey. Take care with the run configuration of the application: make sure it uses Vivado's latest bitstream and not a cached copy local to the application or platform that does not get updated with each build.
Simulation
The simulation requires an initialised memory file with the DFX Controller IP correctly set up. This means amending the RMs in the IP configuration manually and regenerating the core. If the RM sizes are configured incorrectly, the file decompression fails and the error output is raised.
I was able to reverse engineer how to compile the VHDL for QuestaSim and apportion code to the correct libraries, but in simulation the DFX Controller IP remained empty. It did however work in Vivado's Xsim. I used the simulator to log the decompressed ICAP programming commands for analysis using 7 Series FPGAs Configuration User Guide, UG470 as follows:
Word | Meaning |
---|---|
0xFFFFFFFF | Dummy Word |
: | : |
0xFFFFFFFF | Dummy Word |
0x000000BB | Bus Width Sync Word |
0x11220044 | Bus Width Detect |
0xFFFFFFFF | Dummy Word |
0xFFFFFFFF | Dummy Word |
0xAA995566 | Sync Word |
0x20000000 | NOOP |
0x30008001 | Type 1 Write 1 Word to CMD, Command Register |
0x00000007 | Type 2 RCRC Resets CRC: Resets the CRC register. |
0x20000000 | NOOP |
0x20000000 | NOOP |
0x30018001 | Type 1 Write Device ID Register IDCODE |
0x03722093 | Type 2 From ICAPE2 DEVICE_ID generic value |
0x30008001 | Type 1 Write 1 Word to CMD, Command Register |
0x00000000 | Type 2 NULL, Null command, does nothing |
0x3000C001 | Type 1 Write 1 Word to MASK, Masking Register for CTL0 and CTL1 |
0x00000100 | Mask GLUTMASK_B |
0x3000A001 | Type 1 Write 1 Word to CTL0, Control Register 0 |
0x00000100 | Set GLUTMASK_B |
0x3000C001 | Type 1 Write 1 Word to MASK, Masking Register for CTL0 and CTL1 |
0x00000400 | Mask ConfigFallback |
0x3000A001 | Type 1 Write 1 Word to CTL0, Control Register 0 |
0x00000400 | Set ConfigFallback |
0x30008001 | Type 1 Write 1 Word to CMD, Command Register |
0x00000001 | Type 2 WCFG, Writes Configuration Data: used prior to writing configuration data to the FDRI. |
0x20000000 | NOOP |
0x30002001 | Type 1 Write 1 Word to FAR, Frame Address Register |
0x00001900 | Type 2 CLB, I/O, CLK @ Row 0 Column 32 |
0x20000000 | NOOP |
0x30004000 | Type 1 Write 1 Word to FDRI, Frame Data Register, Input Register (write configuration data) |
0x50003935 | Start Programming - Writes to this register configure frame data at the frame address specified in the FAR register. This is now the BIN file created by Vivado, all the way to the end of the BIN file, with no further additions. |
: | : |
0x3000C001 | Type 1 Write 1 Word to MASK, Masking Register for CTL0 and CTL1 |
0x00000100 | Mask GLUTMASK_B |
0x3000A001 | Type 1 Write 1 Word to CTL0, Control Register 0 |
0x00000000 | Unset GLUTMASK_B |
0x30002001 | Type 1 Write 1 Word to FAR, Frame Address Register |
0x03BE0000 | Type 2 Row 31 Column 0 |
0x30000001 | Type 1 Write 1 Word to CRC, CRC Register |
0x36A55DDA | Writes to this register are used to perform a CRC check against the bitstream data. If the value written matches the current calculated CRC, the CRC_ERROR flag is cleared and startup is allowed. |
0x30008001 | Type 1 Write 1 Word to CMD, Command Register |
0x0000000D | Type 2 DESYNC, Resets the DALIGN signal: Used at the end of configuration to desynchronize the device. After desynchronization, all values on the configuration data pins are ignored. |
0x20000000 | NOOP |
: | : |
0x20000000 | NOOP |
This appears to be the uncompressed .BIN file unaltered and verbatim to achieve reconfiguration.
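The header words in the log can be picked apart mechanically. The sketch below assumes the 7 Series packet layout as I read it from UG470: header type in bits [31:29], opcode in [28:27], then for Type 1 a register address in [26:13] and a word count in [10:0], or for Type 2 a word count in [26:0]. The macro, struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction, assuming the UG470 packet header layout described above. */
#define HDR_TYPE(w)    (((w) >> 29) & 0x7u)       /* 1 = Type 1, 2 = Type 2 */
#define HDR_OPCODE(w)  (((w) >> 27) & 0x3u)       /* 0 = NOP, 1 = read, 2 = write */
#define T1_REG(w)      (((w) >> 13) & 0x3FFFu)    /* Type 1 register address */
#define T1_COUNT(w)    ((w) & 0x7FFu)             /* Type 1 word count */
#define T2_COUNT(w)    ((w) & 0x07FFFFFFu)        /* Type 2 word count */

/* Sanity check against the log: 0x30008001 addresses CMD (register 4). */
static_assert(T1_REG(0x30008001u) == 4u, "0x30008001 should address CMD");

typedef struct {
    unsigned type;
    unsigned opcode;
    unsigned reg;         /* meaningful for Type 1 only, 0 otherwise */
    uint32_t word_count;
} cfg_header;

/* Decode one packet header word. Data words and dummy words are not
 * headers, so the caller decides which words to pass in. */
cfg_header decode_header(uint32_t word)
{
    cfg_header h;
    h.type = HDR_TYPE(word);
    h.opcode = HDR_OPCODE(word);
    if (h.type == 2u) {
        h.reg = 0u;
        h.word_count = T2_COUNT(word);
    } else {
        h.reg = T1_REG(word);
        h.word_count = T1_COUNT(word);
    }
    return h;
}
```

Under this reading, 0x30004000 is a Type 1 write to FDRI (register 2) with a word count of zero, and the following 0x50003935 is a Type 2 header carrying the real length, 0x3935 = 14,645 words of frame data.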
Demonstration
The demonstration now uses two buttons, one to save the state from RP to the static part, and one to advance to the next RM in the sequence shown below.

Conclusions
The DFX Controller IP is superb for the task here, and simple to use. The compression is excellent, but then my design is excessively simple, holding just a constant value. The simulation works and includes an error signal and status to allow debugging of the design using it. The use of the block diagram gets in the way of applying constraints correctly. It might be better to move the completed design to pure VHDL based on the generated code to remove the problems caused by the way the block diagram tries to be "helpful".