AMD Vitis™ Software Platform 2024.2 Release Highlights:
Enhancements for AMD Versal AI Engine DSP Designs
- Latency and throughput estimates using Vitis Analyzer
- Mark the unavailable PLIOs using Vitis Analyzer
- Rapid Prototyping of AMD Versal™ AI Engine Designs
- Heap stack and program memory reporting
New and Enhanced Vitis Library functions for Versal AI Engines
- Enhanced DSP Library Functions for AIE (Available on Versal AI Core, Versal Premium Series)
- Performance enhanced Time Division Multiplexed (TDM) FIR filter functions
- Higher performance versions of
- General Matrix Vector (GEMV)
- General Matrix Multiply (GEMM)
- 2D IFFT – partitioned across AIE + PL for high performance
- New DSP Library Functions for AIE-ML (Available on Versal AI Edge)
- Performance enhanced TDM FIR filter functions
- Support for Radix-3/Radix-5 FFTs
- GEMV
- GEMM
New Ease-of-use Features in the Vitis IDE (new GUI)
- New Serial Terminal: Monitor serial messages from the hardware
- Install and explore third-party extensions
- PS Trace feature for debugging & optimizing the performance of embedded systems
Enhancements to Vitis Model Composer for AIE DSP Designs
- AI Engine DSP Library Updates
- AIE (Available on Versal AI Core, Versal Premium Series)
- Mixed Radix FFT
- Stockham FFT performance enhancements
- TDM FIR
- AIE-ML (Available on Versal AI Edge Series)
- TDM FIR
- Direct Digital Synthesis (DDS – used for waveform generation)
- Mixer (used for frequency shifting)
- AIE-MLv2 (Available on Versal AI Edge Gen 2 Series)
- FIR
- DFT
- DDS
- Mixer
- Additional Data Types for Vitis Model Composer
- Support for cbfloat16
- Additional data type support for cascaded signals
- int8/uint8
- int16/uint16/cint16
- int32/uint32/cint32
- float/cfloat
- Export AIE/HLS Kernel designs from Vitis Model Composer to Vitis as a Vitis Subsystem (VSS)
- Debug AIE/HLS Kernels Built in Vitis Model Composer Using Vitis Debugger
- Updates to HDL Blockset in Vitis Model Composer
- Other Enhancements in Vitis Model Composer
- Improved response time for code generation
- Simulation runs only once for any design
- Save Hub block configurations as a JSON file (useful for rapid prototyping or batch processing)
- Added support for MATLAB R2024a
- Added support for Red Hat Enterprise Linux (RHEL) 8.10, 9.4
- Design Rule Checks (DRCs) to Replace Design Considerations
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2024.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2024.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
Enhanced DSP Library Functions for AIE (Available on Versal AI Core, Versal Premium Series)
- Performance enhanced TDM (Time Division Multiplexed) FIR Filter Functions
- Higher performance versions of
- GEMV (General Matrix Vector)
- GEMM (General Matrix Multiply)
- 2D IFFT – partitioned across AIE + PL for high performance
New DSP Library Functions for AIE-ML (Available on Versal AI Edge)
- Performance enhanced TDM (Time Division Multiplexed) FIR Filter Functions
- Support for Radix-3/Radix-5 FFTs
- GEMV (General Matrix Vector)
- GEMM (General Matrix Multiply)
- Latency and Throughput Estimate with Vitis Analyzer
- Mark which PLIOs are unavailable using Vitis Analyzer
- AI Engine DSP Library Updates
- AIE (Available on Versal AI Core, Versal Premium Series)
- Mixed Radix FFT
- Stockham FFT Performance Enhancements
- TDM FIR
- AIE-ML (Available on Versal AI Edge Series)
- TDM FIR
- DDS (Direct Digital Synthesis – used for waveform generation)
- Mixer (used for frequency shifting)
- AIE-MLv2 (Available on Versal AI Edge Gen 2 Series)
- FIR
- DFT
- DDS
- Mixer
- Additional Data Types for Vitis Model Composer
- Support for cbfloat16
- Additional data type support for cascaded signals
- int8/uint8
- int16/uint16/cint16
- int32/uint32/cint32
- float/cfloat
- Export AIE/HLS Kernel designs from Vitis Model Composer to Vitis as a VSS (Vitis Subsystem)
- Debug AIE/HLS Kernels Built in Vitis Model Composer using Vitis Debugger
- Updates to HDL Blockset in Vitis Model Composer
- Other Enhancements in Vitis Model Composer
- Improved response time for code generation
- Simulation runs only once for any design
- Save Hub block configurations as a JSON file (useful for rapid prototyping or batch processing)
- Added support for MATLAB R2024a
- Added support for Red Hat Enterprise Linux (RHEL) 8.10, 9.4
- Design Rule Checks (DRCs) to Replace Design Considerations
- Modeling scalar/wire inputs that change during execution (Direct I/O)
- Support for arbitrary precision floating-point types
- Mapping HLS code to DSP blocks
- User-determined sequence of code execution
- HLS debugger that shows data types in a user-friendly manner (using the pretty print technology of GNU debugger)
AMD Vitis™ Software Platform 2024.1 Release Highlights:
Enhancements for AMD Versal™ AI Engine DSP Designs
- Enhanced DSP Library Functions for AMD Versal AI Core Series
- Time division multiplexed (TDM) FIR filter functions for SSR > 1
- FFT with 32-bit twiddle
- Mixed-Radix 3 & Mixed-Radix 5 FFTs
- Kronecker Matrix Product
- Householder-based QRD solver for improved stability
- DFT for SSR > 1
- New DSP library functions for AMD Versal AI Edge Series with AIE-ML
- General Matrix Vector (GEMV) with SSR support
- General Matrix Multiply (GEMM) with SSR support
- AIE API Enhancements
- Support Radix-3/Radix-5 FFTs
- AIE Simulator Enhancements
- Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for control, interfaces, and processing system (CIPS) IP core
- AMD Vitis analyzer support for hardware emulation with 3rd party simulators such as VCS, Questa, Xcelium, and Riviera
Key improvements to Vitis Unified Software Platform
- New device support: AMD Versal™ Premium VP1902 Adaptive SoC, AMD MicroBlaze™ V Processor
- Enhanced embedded application development and BSP generation for Windows® environment
- User-managed flow to debug embedded applications compiled externally
- New Bootgen GUI
- Enable incremental builds for platform project
Key improvements to AMD Vitis IDE (New GUI)
- Added support for processing subsystem hierarchical debug
- Added support for export and import of projects/workspace
- Added support for Python interpreter and API
- New feature preview page
- New file change notification for embedded, AIE, platform projects
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2024.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2024.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
Enhanced DSP Library Functions for AMD Versal AI Core Series
- Time division multiplexed (TDM) FIR filter functions for SSR > 1
- FFT with 32-bit twiddle
- Mixed-Radix 3 & Mixed-Radix 5 FFTs
- Kronecker Matrix Product
- Householder-based QRD solver for improved stability
- DFT for SSR > 1
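For readers unfamiliar with the Kronecker matrix product named in the list above, the following pure-Python sketch shows the mathematical operation only. It is a reference definition, not the DSP Library interface; the function name `kron` and the list-of-rows layout are assumptions made for illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Hypothetical helper for illustration; not part of any Vitis library.
    """
    rows_b, cols_b = len(b), len(b[0])
    result = []
    for row_a in a:
        for i in range(rows_b):
            # Each element of `a` scales a full copy of row i of `b`.
            result.append([x * b[i][j] for x in row_a for j in range(cols_b)])
    return result


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
product = kron(a, b)  # a 4x4 matrix built from 2x2 blocks a[p][q] * b
```

An m×n matrix combined with a p×q matrix yields an (m·p)×(n·q) result, so output size grows quickly with input size.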
New DSP library functions for AMD Versal AI Edge Series with AIE-ML
- General Matrix Vector (GEMV) with SSR support
- General Matrix Multiply (GEMM) with SSR support
AIE API Enhancements
Support Radix-3/Radix-5 FFTs
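As background on why radix-3 and radix-5 stages matter, the sketch below shows a generic recursive mixed-radix decimation-in-time FFT in plain Python. It handles lengths such as 15, 30, or 45 that a radix-2-only FFT cannot. This is a textbook formulation under stated assumptions, not the AIE API; the names `dft` and `mixed_radix_fft` are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used here for the short leaf transforms."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def mixed_radix_fft(x):
    """Recursive decimation-in-time FFT that splits by radix 2, 3, or 5."""
    n = len(x)
    radix = next((r for r in (2, 3, 5) if n % r == 0 and n > r), None)
    if radix is None:
        # Length 1, a bare radix (2, 3, 5), or a length with other factors.
        return dft(x)
    m = n // radix
    # Transform the `radix` interleaved sub-sequences, each of length m.
    subs = [mixed_radix_fft(x[r::radix]) for r in range(radix)]
    out = [0j] * n
    for k in range(n):
        # Combine with twiddle factors W_n^(r*k); sub-transforms repeat every m bins.
        out[k] = sum(subs[r][k % m] * cmath.exp(-2j * cmath.pi * r * k / n)
                     for r in range(radix))
    return out


samples = [complex(i, -i) for i in range(30)]  # 30 = 2 * 3 * 5
spectrum = mixed_radix_fft(samples)
```

The result can be compared against `dft(samples)` to confirm that both produce the same spectrum to within floating-point rounding.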
AI Engine Simulator Enhancements
- Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for CIPS (Control, Interfaces, and Processing System IP Core).
- Vitis analyzer support for hardware emulation with 3rd party simulators such as VCS, Questa, Xcelium, and Riviera
- Export tables from Vitis analyzer to CSV format
- New DSP functions supported for AIE and AIE-ML within AMD Vitis Model Composer
- Time Division Multiplexed (TDM) FIR Filter functions
- For building polyphase channelizers @ 1 GSPS and higher throughput
- DFT/IDFT – with SSR support
- Optimized transforms for throughput/latency on small sizes
- FFT/IFFT – with extended support for CINT32-bit twiddle
- Mixed-Radix FFT/IFFT – with AIE-ML support
- Ease-of-use improvements to Model Composer Hub block
- Enhancements to Hardware Validation flow
- OS and MATLAB® version support added with version 2024.1:
- RHEL 9
- MATLAB R2023a and R2023b
New example designs available on GitHub.
A new stencil pragma simplifies HLS C++ code for image and video filters
New library function wizards tap into the AMD Vitis libraries GitHub repo
- Create “Solver” and “Vision” (OpenCV compatible) IPs for AMD Vivado design tool
- Run the available library examples
Pragma for memory interface (ap_memory) can now bundle ports for AMD Vivado IP Integrator
New HLS component comparison displays side-by-side metrics for 2 or more components
Support for user-provided RTL code to replace a C++ function (black-box flow)
Code Analyzer can now disaggregate C++ struct members to fine-tune performance analysis
New user control for HLS global FSM encoding and selection of safe state
Access to Clang sanitizers during C-Simulation to perform address and initialization checks
Vitis™ Software Platform 2023.2 Release Highlights:
Enhancements for Versal™ AI Engine DSP Designs
- New DSP library functions
- New API support for DSP functions
- New features in AI Engine compiler and simulators
New Standalone Vitis Embedded Software
- A smaller standalone installer for designers writing C code for the Arm® embedded subsystem
- All embedded features are provided, including utilities such as Bootgen and XSCT
New Vitis Unified Integrated Design Environment
- Consistent GUI and CLI across all Vitis workflows
- Next-generation, Eclipse Theia-based GUI provides better flexibility and user-friendly features for enhanced work efficiency
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in Vitis software platform 2023.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
New DSP library functions for AI Engines
- Mixed Radix FFT
- Discrete Fourier Transform (DFT)
- General Matrix-Vector Multiply (GEMV)
New API support for DSP functions
- FFT IP with cint32 twiddle data types
- Support for cint16 for Radix-4 FFT APIs
- Vectorized "fix2flt" and "flt2fix" implemented in API
New API support for AIE-ML
- APIs now support int32/cint32 data types in sliding_mul() function
- APIs now support <float> data types in sliding_mul() function
- All AIE API routines required to support sparse matrix multiplication are provided
Major Component Updates:
- U-Boot 2024.1
- Arm Trusted Firmware 2.10
- Linux Kernel 6.6_LTS
- QEMU 8.1
- Xen 4.18
- OpenAMP 2023.10
Sunset BSPs:
- AMD MicroBlaze™: VCU118, KCU105, KC705, AC701
- Zynq: zc706
- AMD Versal™: VMK180-EMMC, VMK180-OSPI
- Zynq MP: ZCU111
New BSPs (XSCT):
- VEK280 Production BSP with New ETH Phy
New System Device Tree Flow (SDT) BSP:
- ZCU102, ZCU104, ZCU105, ZCU216
- ZCU208, ZCU208-sdfec, ZCU670
- VCK190
- VMK180
- VPK120
- VPK180
- VEK280
- AIE compiler can now support 2D and 3D arrays as inputs or outputs
- Vitis Analyzer now generates guidance report to adjust FIFO size
- New support for multi-threaded simulator kernel and value change dump (VCD) analyzer speedup
- External interfacing with MATLAB® environment & Python traffic generators
- Enhanced AXI Stream model with support for empty/wait cycles in PLIO alignment
- Enhanced Design Rule Checking
- AI Engine trace offload via high-speed debug
- NoC and hard DDRMC profiling support in the Vitis environment
- Vitis tool now supports AIE-ML trace for VEK280 and Alveo™ V70 AI inference accelerator card
- AI Engine block updates
- Support for importing AIE-ML graphs as blocks into Vitis Model Composer
- New DSPlib functions for AIE and AIE-ML implementation in Vitis Model Composer
- Plotting of AIE simulator output for internal signals in the Simulink® tool
- HLS Kernel block updates
- Automatic test bench generation
- Expanded data type support for HLS Kernel blocks
- Integration of Vitis Model Composer and Vitis tool
- Generation of .xo and libadf.a files directly from Vitis Model Composer
- Other enhancements
- MATLAB® tool version support: R2021a, R2021b Update 6, R2022a Update 6, R2022b
- Additional topologies supported for the hardware validation flow
- New example collaterals available from GitHub
- New Vitis Unified IDE for HLS components
- New Vitis HLS license requirements
- New code analyzer feature for obtaining performance estimations before running C synthesis
- Enhancements to AXI interface:
- Support for HLS AXI Stream side-channels
- Support for user-configurable AXI master caching
- Other enhancements:
- New code complexity report to enable identifying design size issues during C synthesis
- Compile time improvements: Average compile time improvement of 20% in 2023.2 compared to 2023.1
Vitis Software Platform 2023.1 Release Highlights:
New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays
- DSP library functions – more FIR filter configurations
- Solver library functions – enhancements for higher performance
Design Flow Enhancements for Versal AI Core and AI Edge Series
- AIE compiler support for 2D and 3D arrays as inputs/outputs
- AIE simulator guidance support for FIFO sizing to avoid deadlock conditions
- AIE status reporting enhancements
- New default GUI for the Vitis analyzer
Support for Vitis environment export to the Vivado™ environment
- Enables Vitis and Vivado tool development teams to work in parallel based on a common interface checkpoint
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2023.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
DSP Library - FIR Filters
- Enhanced Fractional Resampler FIR, Single Rate FIR, Half Band FIR, and Rate Change FIR to support coefficient bit widths to be larger than data bit widths
- Fractional Resampler FIR also supports SSR operation using multiple AIE tiles and incorporates coefficient reload feature
Solver Library
- Enhanced API performance with high-performance streaming designs (~300 tiles)
- QR and Cholesky Decomposition support for 4D data mover functions to help read or write data from AIE arrays
- AIE compiler can now support 2D and 3D arrays as inputs or outputs in addition to 1D.
- AIE compiler supports graph-within-graph constructs (subgraphs) and conditional port constructs.
- New AIE CINT-to-CFLOAT data conversion APIs.
- AIE status reporting enhancement to generate a file that includes information about tiles, events, and additional registers on AIE-ML and AIE tiles in the design.
- Offloading of AIE event trace over high-speed differential pairs (HSDPs) instead of storing it in memory on Versal devices.
- NoC and hard DDR MC profiling support in the Vitis environment.
- AIE windowed event trace for inspecting a specific part of an application.
- Guidance with FIFO sizing to avoid deadlocks.
- Ability to select nodes that are reported by the AIE simulator to reduce the size of the simulator VCD file and speed up simulation.
- AIE simulator now generates a report (that can be viewed in the Vitis analyzer) that shows which AIE has memory access violations and how these correspond to lines in the graph C code.
- Trace view data visualization now supports the AIE-ML array as well.
- New data type support for FIR filter configurations that target Versal AI Engines
- Two new floating-point functions optimized for DSP58s in Versal adaptive SoCs
- Faster response time for all Vitis Model Composer library functions targeting Versal AI Engines
- Other enhancements:
- Enhancements to HLS kernel blocks
- Enhancements to the Vitis Model Composer Hub
- Support for MATLAB tool versions R2021a, R2021b, R2022a
- Performance improvements: Average latency improvements of 5.2% in 2023.1 compared to 2022.2
- Easy way to download, view, and instantiate L1 libraries functions in the Vitis HLS tool
- Enhanced support for AXI transactions and burst reporting within the Vitis HLS tool
Vitis Software Platform 2022.2 Release Highlights:
New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays
- DSP library functions – enhanced features
- Solver library functions
- Vision library functions
- Ultrasound library functions
Design Flow Enhancements for Versal AI Core and AI Edge Series
- Control relative placement of kernels in the AI Engine array – higher performance and better utilization
- AIE x86 simulator enhancements - improved modeling of deadlock conditions in x86 simulator
- AIE API enhancements - Radix 3/5 FFT and Matrix ‘x’ Vector APIs added
- Enhanced profiling and debugging capabilities for Versal designs – deadlock detection, larger trace data collection, RTL/Python testbench support
- New simulation options for heterogeneous designs in Vitis
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2022.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2022.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
DSP library functions
- Super sample rate (SSR) FIR filter implementation on AI Engine now supports coefficient reload feature and dynamic point size
- Added FFT windowing element to the FFT function that targets the AI Engine array
Solver library functions
- QR decomposition
- Cholesky decomposition
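To make the Cholesky entry above concrete, here is a minimal pure-Python sketch of the factorization itself: a symmetric positive-definite matrix A is written as L·Lᵀ with L lower triangular. It illustrates the mathematics only and says nothing about how the Solver library implements or exposes it; the function name `cholesky` is hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Assumes `a` is symmetric positive-definite (not checked here).
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


a = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
low = cholesky(a)  # rows: [2, 0, 0], [6, 1, 0], [-8, 5, 3]
```

Each column depends on the columns before it, which is one reason hardware implementations tend to need careful data movement between stages.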
Vision library functions
- Four new video functions targeting the AI Engine array
Ultrasound library functions
- Various functions to help build medical ultrasound designs
- Ability to add constraints to control relative placement of kernels in the AI Engine array - this allows users to get higher performance and better utilization
- Improved modeling of AIE deadlock conditions in x86 simulator
- New AIE APIs added: Radix 3/5 FFT and Matrix ‘x’ Vector
- Generation of AI Engine profiling reports in HW Emulation
- Deadlock detection using XSDB (AMD System Debugger) for both AI Engine and PL-based designs
- Xilinx Runtime (XRT) controlled continuous offloading of AI Engine event trace over PLIO
- Supports PS application on x86 host machine for SW emulation
- Allows SystemC functional models for HW emulation instead of RTL
- Allows users to simulate the AI Engine kernel with a simple RTL test bench or Python script-based traffic generator
- AI Engine status can be analyzed during HW emulation with the Vitis™ analyzer
Vitis environment 2022.2 new simulation options: Processor system x86 simulation and AI Engine x86 simulation: Programmable logic simulation can be performed using the x86 simulator.
- Features for Versal AI Engine Design
- Ability to add graph constraints to AI Engine DSP library blocks designs – better utilization and performance
- New capability for cycle approximate simulation for AI Engine designs
- AI Engine Graph Import block automatically detects Run Time Parameter (RTP) ports
- Enhancements and additions to the DSP Library blocks
- General Features
- Hardware validation flow supported for heterogeneous system designs that use PL and AIE array
- Vitis Model Composer Hub block updated to support heterogeneous design
- Automatic detection of valid AI Engine, HDL, and HLS subsystems
- Hardware validation flow enhanced for HDL only designs and HDL → AI Engine → HDL designs for Versal platforms
- Improved 'task level parallelism' coding style support
- Enables faster C simulation and better QoR
- Additional performance and timing enhancements
- Improved burst inference
- Automatic inference of Unroll, Pipeline, Array_Partition, and inline pragmas for better performance
- Improved timing accuracy resulting in better timing closure at higher frequencies
- Other features
- Analysis and debug: printf inserted in C-code now supported even after synthesis in the RTL
- Ease of use: new performance pragma to automatically achieve a given transaction interval
- HLS::stream interfaces now supported by FFT and FIR IP
Vitis Software Platform 2022.1 Release Highlights:
Vitis™ Flow Enhancement for Versal™ ACAP and AI Engine
- Supports AMD base DFX platform with one static region and one DFX region
- AIE profiling supports stall/deadlock detection, generates AI Engine status (including error events) view reports in Vitis Analyzer
- External Traffic Generators in x86sim, AIEsim, and SW emulation are much more flexible and can be inserted very easily in Simulation and Emulation flows
- Vitis Model Composer supports Hardware Validation, Linux and HW emulation
Vitis for DC and Vitis HLS
- Vitis provides additional reporting support for the dynamic region generation process; flow reporting enhancements include 3 new or updated reports
- Vitis improves PL profiling with the choice of offloading trace to memory resources (preferred) or to a FIFO in the PL for better performance
- A new Timeline Trace Viewer, available after simulation, shows the runtime profile and lets users remain in the Vitis HLS GUI
- Vitis HLS now supports a higher-level type of "smart" construct via the new performance pragma or the set_directive_performance command
- Vitis Graph Library with L3 API enhancements (1 ms time saved for kernel call) for performance
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2022.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2022.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
- New Genomics Accelerator Library added (L1 & L2 and L3)
- Graph Library, L3 enhancements for performance
- Vitis Database Library, GQE Multi-Functional Kernel
- New functions added in Vision Library
- Additions and enhancements to functions in the Vitis AIE Vision Library
- Vitis AIE DSP library, FIR resampler supersedes FIR fractional interpolator
- Vitis Codec Library new APIs: ‘jxlEnc’, ‘leptonEnc’, ‘resize’, and ‘WebpEnc’
Vitis Data Compression Library
- ZLIB Compress Improvement, Customized Octa-Core compression for 8KB solution
- ZLIB Decompression Improvement, Customized IP for 8KB file size
- Platform Capability Query Improvement
- HBM Ease-of-Use Improvement, Ability to choose a specific S_AXI entry point to the HMSS for a kernel M_AXI, RAMA insertion supported from the configuration files
Vitis AI Engine Compiler
- AI Engine Automated Stall/Deadlock Detection & Analysis in Hardware
- Analyzing the Automated Status Output
- Analyzing the Automated Status Output – Buffers
- Analyzing the Manual Status Output in Hardware
- Analyzing the Manual Status Output
- AI Engine Event Trace Enhancements
- External traffic generators AIEsim
- AI Engine Profiling Improvements on HW
- AI Engine support for Broadcast windows
- Vitis AI Engine Compiler Enhanced Graph Programming Model
- Vitis AI Engine Compiler - PLIO/GMIO in ADF Graphs
Vitis HLS
- Analysis Enhancements, New Timeline Trace Viewer
- Coding Style Enhancements, Array Partition support for Stream of Blocks type
- Pragma Abstraction, New Performance Pragma (and directive)
- Vitis Core “one liner”, Vitis HLS - New Timeline Trace Viewer, new PERFORMANCE pragma, Stream of Blocks support windows
- New Viewer introduced
- Shows the runtime profile of all surviving functions in your design - i.e., those that get converted into modules
- Especially useful to see the behavior of dataflow regions after Co-simulation
- Native to Vitis HLS - No need to launch the xsim waveform viewer anymore (external tool)
Vitis Analyzer
- Vitis Analyzer Improvement, Save/Restore Timeline Customization
- Reporting Enhancement, report_qor_assessment, xclbin Clocking Information, Vivado Automation Summary
- Profiling Enhancement, New PL profiling infrastructure enabled, Multiple trace_memory options can be added to insert multiple memory monitors (HW Only), Sample config file for v++ linker to offload trace data for all CUs in SLR0 to DDR0 and same for all CUs in SLR1 to DDR1
Vitis IDE
- Updated Bootgen GUI for Versal
- Toolchain Update
- XSCT, Support STAPL, Add Linker script generation command
- System Compile Flow, Refer to system compile doc
Vitis Emulation
- Add Software Emulation support for Auto-restart and mailbox support for always running kernels
- Free running kernel doesn’t need while(1) for sw-emu
- Add Software Emulation support for external traffic generator
- Hardware Emulation can use HLS C source code function model for Streaming IP.
- Add API xrt::system for Probing number of devices
- Add API xrt::message for Logging messages
- XRT Native API host code now requires -std=c++17 or above
- Add experimental xrt::queue APIs for asynchronous execution of synchronous operations
- xbutil can show AIE FIFO counters that help to debug AIE deadlock scenarios
- xbutil --legacy option is removed.
- xclbinutil --info provides clock information for embedded platforms
- xbutil on ARM can load SOM images
- xbtop standalone utility to show Linux top-like output (replacing legacy xbutil -top)
- XRT Utilities supports auto-completion in Bash with tab key.
- Alveo Platform Updates, Platform Updates for improved stability, Card Management Updates, SC Firmware Update Tool
- Embedded Platform, New VCK190 DFX Platform: xilinx_vck190_base_dfx_202210_1, Embedded Platforms are now installed with Vitis, Vivado adds a new Customizable Example Design: Vitis Platform for MPSoC
- Major overhaul of the Vitis Model Composer hub block for scalability and ease of use
- Hardware validation flow now supports Linux in addition to bare-metal
- "AIE to HDL" and "HDL to AIE" blocks no longer include the HDL gateway blocks
- 2022.1 now ships with a snapshot of the examples for customers who do not have access to the internet. The tool will prompt the user to download a new revision of the examples from GitHub if available
- For ease of use, utility blocks that are not part of code generation are now presented with a white background color
- Enhanced and reorganized the library browser for ease of use
- RHEL 8.x support
- MATLAB Support - R2021a and R2021b
Vitis Software Platform 2021.2 Release Highlights:
- New domain specific development environments
- Vitis™ Video Analytics SDK on Kria™ SOM, Alveo™ U30/U50, and VCK5000 Versal™ development card: Learn More >
- Vitis Blockchain solution on Varium™ C1100 card with Vitis libs: Learn More >
- Full end to end flow support for VCK5000 and Varium C1100 cards
- Enhanced core tool features
- Vitis AI Engine Compiler C/C++ high level abstraction API, Auto Pragma Inference, Area Group Constraints
- Vitis AI Engine x86simulator enhancements: Trace Report, Memory Access Violation and Deadlock Detection
- Vitis HLS EoU, Timing and QoR enhancement, HLS APIs for user-controlled burst inferencing
- Enhanced Vitis Analyzer for better timeline trace report, data visualization, stall analysis
- Vitis XRT for AI Engine Multiple Process and Multi Thread Support for AI Engine graph control
- Vitis IDE & Emulation support AI Engine Trace, SW Emulation for AI Engine applications
- 39 new C/C++ library functions in diverse domains, including DSP, Data Analytics, Vision, Compression, Database, Graph, and Security, for a total of over 1000 library functions
- Vitis Model Composer
- 3x compile/simulation time, 7x compilation time reduction with Parallel Compilation
- New Hardware Validation Flow and Enhanced Functional Co-simulation
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2021.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2021.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.
Library | 2021.1 | 2021.2 | New functions in 21.2 |
---|---|---|---|
xf_blas | 167 | 167 | 0 |
xf_codec | 3 | 3 | 0 |
xf_DataAnalytics | 33 | 36 | 3 |
xf_database | 62 | 65 | 3 |
xf_compression | 78 | 93 | 15 |
xf_dsp | 94 | 96 | 2 |
xf_graph | 53 | 59 | 6 |
xf_hpc | 37 | 37 | 0 |
xf_fintech | 116 | 116 | 0 |
xf_security | 135 | 140 | 5 |
xf_solver | 11 | 11 | 0 |
xf_sparse | 11 | 11 | 0 |
xf_utils_hw | 55 | 57 | 2 |
xf_opencv | 147 | 150 | 3 |
total | 1002 | 1041 | 39 |
Note: For vision, just count the number of sub folders in L*/tests, because each API has multiple tests for different types
Vitis Vision Library
- Programmable Logic (PL)
- End-to-end Mono Image Processing (ISP) with CLAHE TMO
- RGB-IR along with RGB-IR Image Processing (ISP) pipeline
- Global Tone Mapping (GTM) along with an ISP pipeline using GTM
New Features | Cat | Customer/Strategic | Segments | Description |
---|---|---|---|---|
RGB-IR | ISP | Seeing Machines | Automotive, ISM | •Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera |
Mono (CCCC) | ISP | Strategic | Automotive, ISM, A&D | •Machine vision •Low light applications |
Global Tone Mapping (GTM) | ISP | Strategic | Automotive, ISM, A&D | •Improved dynamic range and contrast •Lower cost version compared to local tone mapping (LTM) |
Dense Optical Flow TV-L1 | CV | NTT | ISM | •Improved robustness (against illumination, noise, occlusions) for optical flow |
AI Engine (AIE)
- BlobFromImage
- Back to back filter2D with batch size three support
New Features | Cat | Customer/Strategic | Segments | Description |
---|---|---|---|---|
RGB-IR | ISP | Seeing Machines | Automotive, ISM | •Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera |
ML+X | ISP | Strategic | Automotive, ISM, A&D | •ML inference pre-processing |
Gaussian Pyramid | CV | Strategic | Automotive, ISM, A&D | •Fundamental for multi-scale image processing |
Box Filter | CV | Strategic | Automotive, ISM, A&D | •Fundamental for smoothing, low pass filter |
Vitis Data Analytics Library
- Vitis Blockchain Solution based on Vitis libraries
- Out-of-Box Mining solutions for Ethereum
- Open-Source & easy to use and deploy with Vitis Libs using C++
- Flexible & Scalable with Vitis Libs
- Flexibility to mine multiple coins
- Customize and compile into hardware
- Highly optimized design
- Adding CSV parser API into library
- The CSV parser parses comma-separated value files and generates an object stream that can easily be connected with DataFrame APIs
Vitis Graph Library
- New L2 libraries added
- Louvain with renumber
- Renumbering
- The ‘weight’ feature is supported for Cosine Similarity
Vitis Database Library
- GQE now supports asynchronous input/output, along with multi-card support.
- Asynchronous support allows the FPGA to start processing as soon as part of the input data is ready.
- Multi-card support makes it possible to identify multiple Alveo cards that are suitable for the work.
Vitis Data Compression Library
- ZSTD Multi-Core Compression
- Created new ZSTD multi-core architecture and provided >1GB/s throughput using quad-core.
- ZSTD Decompress optimization
- ZSTD decompress optimized for performance (increased by 20%) and resource (reduced < 30%)
- GZIP/ZLIB Stream Core Improvement for IBM
- Customized Static & Dynamic compress streaming IP (4KB & 8KB)
- Added functionality to provide compressed size in TUSER port
- GZIP/ZLIB Decompress Improvement for IBM
- Optimized Huffman decoder to reduce latency to < 1.5K cycles
- Reduced resources significantly to 6.9K (previously > 9K)
- Added ADLER32 Checksum Functionality
- GZIP System Compiler PoC
- Created a System Compiler PoC for GZIP Compress solution and benchmarked against OpenCL Host.
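The ADLER32 checksum added to the GZIP/ZLIB decompress core is the checksum used by the zlib format (RFC 1950). As a software reference for what the hardware computes, a minimal byte-at-a-time sketch follows; the function name is chosen for this example, and the code says nothing about how the accelerator itself is structured.

```cpp
#include <cstdint>
#include <string>

// Adler-32 as defined for the zlib format: two running sums modulo 65521,
// packed as (b << 16) | a.
std::uint32_t adler32(const std::string& data) {
    const std::uint32_t kMod = 65521;  // largest prime below 2^16
    std::uint32_t a = 1;
    std::uint32_t b = 0;
    for (unsigned char byte : data) {
        a = (a + byte) % kMod;
        b = (b + a) % kMod;
    }
    return (b << 16) | a;
}
```

For the commonly quoted test string, `adler32("Wikipedia")` evaluates to `0x11E60398`.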
Vitis DSP Library
- DSPLib on Github since 2021
- Fast Fourier Transform (FFT/iFFT)
- Point size increase to 32k (data type dependent)
- Support for stream API as well as window API.
- Parallel Power (0-4)
- Allows higher throughput and extends range of supported point sizes
- FIR Filters
- Initial Stream support for Single Rate asymmetric / symmetric FIR
- DDS/Mixer
- New library unit in 2021.2
Vitis Security Library
- KECCAK-256 (hash function) and CRC32C (checksum function) are released
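For readers who want a software reference when validating the CRC32C function, the sketch below is a bit-by-bit CRC32C (Castagnoli polynomial, reflected form 0x82F63B78). It is a textbook formulation, not the Vitis Security Library API, and the function name is chosen for this example.

```cpp
#include <cstdint>
#include <string>

// Bitwise CRC32C: initial value and final XOR are both 0xFFFFFFFF,
// input and output are bit-reflected.
std::uint32_t crc32c(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            const std::uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

The standard check value applies: `crc32c("123456789")` evaluates to `0xE3069283`.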
Vitis Utilities Library
- Two Data-Mover implementations are added for debugging hardware issues.
- LoadDdrToStreamWithCounter: loads data from the PL’s DDR to the AI Engine through an AXI stream and records the count of data sent to the AI Engine.
- StoreStreamToMasterWithCounter: receives data from the AI Engine through an AXI stream, saves it to the PL’s DDR, and records the count of data sent to DDR.
AI Engine API
- Implemented as a C++ header-only library that provides types and operations that get translated into efficient AI Engine intrinsics.
- Provides parametrizable data types that enable generic programming
- Implements most common operations in a uniform way for different data types
- Transparently translates higher-level primitives into optimized AI Engine intrinsics
- Improves portability across AI Engine architectures
AI Engine API will be the lead method for AI Engine kernel programming
High Level Optimizations
AI Engine compiler optimization options
- --xlopt=0, no optimization applied.
- --xlopt=1, automatic computation of heap size, guidance generation from LLVM IR analysis.
- --xlopt=2, automatic inlining, loop peeling for unrolled loops, pragma insertion.
Introducing --xlopt=2 to improve performance, default remains --xlopt=1
- Automatic inline
- Automatically inlines functions if it is practical and possible to do so, even if the functions are not declared as __inline or inline
- Automatic pragma insertion
- Inserts pragmas into kernel code automatically.
Pragma Inference
Necessary for optimizing the kernels
- Alleviate user’s responsibility of adding effective & correct chess pragmas
Support to auto-infer five pragmas in 2021.2
- for performance:
- chess_prepare_for_pipelining for innermost loop, and outer loops with known trip count
- chess_loop_range for loops with known trip count
- chess_unroll_loop/chess_flatten_loop for innermost loops with known trip count
- for correctness:
- chess_unroll_loop_preamble when trip count is not a multiple of unroll factor
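The correctness case above (trip count not a multiple of the unroll factor) can be illustrated in plain C++. The sketch below is unrolled by hand with an assumed factor of 4 and, assuming "preamble" means the leftover iterations run before the unrolled body, handles those first. It does not use chess pragmas and is not compiler output; the function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sum a vector with the loop body unrolled four times.
long sum_unrolled_by_4(const std::vector<int>& v) {
    const std::size_t n = v.size();
    const std::size_t leftover = n % 4;  // iterations that do not fill a group of 4
    long total = 0;
    std::size_t i = 0;

    // Preamble: run the leftover iterations one at a time.
    for (; i < leftover; ++i) {
        total += v[i];
    }

    // Unrolled body: the remaining trip count is now a multiple of 4,
    // so reading v[i + 3] never goes out of bounds.
    for (; i < n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    return total;
}
```

Without the preamble, a 7-element input would make the unrolled body read one element past the end, which is why this case is listed under correctness rather than performance.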
Updated Graph Programming Model PLIO and GMIO
Model Changes Include:
- Changes to usage of “simulation::platform”
- Interaction with PLIO/GMIO objects in the graph, position determines input/output.
- Changes of global PLIO/GMIO objects in the graph.
- Changes around graph connect<> statements.
PLIO/GMIO in ADF Graphs
Current
- Write PLIO, GMIO, simulation::platform, and connections at global scope
GMIO gm0("GMIO_In0", 64, 1);
GMIO gm1("GMIO_In1", 64, 1);
…
GMIO gm7("GMIO_In7", 64, 1);
PLIO pl0("PLIO_Out0", plio_32_bits, "data/output0.txt", 250.0);
PLIO pl1("PLIO_Out1", plio_32_bits, "data/output1.txt", 250.0);
…
PLIO pl7("PLIO_Out7", plio_32_bits, "data/output7.txt", 250.0);
simulation::platform<8,8> plat(&gm0, &gm1,…, &gm7, &pl0, &pl1,…, &pl7);
subgraph g;
connect<> net0(plat.src[0], g.in[0]);
connect<> net1(plat.src[1], g.in[1]);
…
connect<> net7(plat.src[7], g.in[7]);
connect<> net8(g.out[0], plat.sink[0]);
connect<> net9(g.out[1], plat.sink[1]);
…
connect<> net15(g.out[7], plat.sink[7]);
Alternative method
- Create a top-level graph and move PLIO, GMIO, and connections inside
- Allow managing connections within for loop
class topgraph : public graph
{
public:
    input_gmio gm[8];
    output_plio pl[8];
    subgraph sg;
    topgraph()
    {
        // Create and connect all eight GMIO inputs and PLIO outputs in one loop
        for (int i = 0; i < 8; i++)
        {
            gm[i] = input_gmio::create("GMIO_In" + std::to_string(i), 64, 1);
            pl[i] = output_plio::create("PLIO_Out" + std::to_string(i), plio_32_bits, "data/output" + std::to_string(i) + ".txt", 250.0);
            connect<>(gm[i].out[0], sg.in[i]);
            connect<>(sg.out[i], pl[i].in[0]);
        }
    }
};
topgraph g;
Area Group Constraints Improvements
Ability to use flags in the ADF graph or constraints file to control the mapper and router
- -contain_routing – when specified true ensures all routing, including nets between nodes contained in the nodeGroup, is contained within the area group.
- -exclusive_routing - when specified true ensures all routing, excluding nets between nodes from the nodeGroup, is excluded from the area group.
- -exclusive_placement - when specified true prevents all nodes not included in the nodeGroup from being placed within the area group bounding box.
Snapshots
Snapshots are text files containing comments and data for all kernel ports
- streams, packet streams, cascade streams
- windows, buffer
- RTP
Also includes all platform ports
- PLIO, GMIO, RTP
Allows users to inspect data traffic at kernel ports without using the debugger and without requiring instrumentation of kernel code
Deadlock Detection
- Detects deadlocks in x86 simulations whether this situation arises from insufficient input data, or an imbalanced FIFO depth on a re-convergent path
- The stop-on-deadlock feature must be enabled during x86 simulation by specifying option --stop-on-deadlock
- If the simulation is stopped because of a deadlock, the error message indicates that you should rerun with option -trace --timeout
Memory Access Violation Detection
Integration with Valgrind for Memory Access Violation Detection
- Detect
- out-of-bounds read and write
- read of uninitialized memory
- No specific flag required for compilation
- Simulation flags can be either
- --valgrind : simulation runs as usual and valgrind displays a report
- --valgrind-gdb : same thing but with gdb debug at the same time
Trace report
A deadlock results in poor simulation output and makes the origin of the bug difficult to analyze
The x86 simulation trace option allows the simulator to log various timestamped information:
- Start/End of Kernel iterations
- Start/End of Stream stalls
- Start/End of lock stall
Timestamps differ between x86 simulation and AI Engine simulation
User Controlled Burst Inference
- For use cases that do not satisfy the automatic burst inference of the Vitis HLS tool, users can adopt the newly introduced manual burst optimization
- A new class 'hls::burst_maxi' supports manual control of burst behavior. New HLS APIs are provided for use with the new class
- Users need to understand the AMBA AXI protocol and hardware transaction-level modeling in the HLS design
Timing and QoR Enhancements
- Provide support for user to input high level throughput constraints
- Improve HLS timing estimation accuracy. When HLS reports timing closure, the RTL synthesis in Vivado should also expect to meet timing
EoU Enhancements
Add interface adaptors report in the C synthesis reports
- Users need to know the resource impact that interface adaptors have on their design
- Interface adaptors have variable properties that impact design QoR
- Some of these properties have associated user controls which should be reported to users
- Text version of bind_op and bind_storage reports are provided
Add new section in synthesis report to show list of pragmas and warnings on pragmas
- Users can easily see which of the pragmas they added have issues.
Analysis and Reporting Enhancements
The Function Call Graph Viewer has some new features
- New mouse drag based zoom in and out capability
- New Overview feature that shows the full graph and allows the user to zoom in on parts of the overall graph
- All functions and loops are shown along with their simulation data
A new Timeline Trace Viewer is now available after simulation. This viewer shows the runtime profile of your design and allows the user to remain in the Vitis HLS GUI.
Link Summary Enhancement
- Provide clock frequency information for the AI Engine, platform and compute units
- Provide a new table called Clocks in system diagram and platform diagram
Platform Export Enhancement
- XSA export from Vivado: no source files are required to be local to the project
- XSA export from Vivado: no change to the project structure
- Package the IPs that are used in the hardware platform project instead of packaging the whole IP repo
AI Engine application emulation enhancements
- Provide support for external testbench integration with aiesimulation
- Provide support for external testbench integration with x86simulation
- Support for GDB debugging with x86simulation
- Provide support for snapshots of the data between kernels in a graph for x86simulation
- Provide support for access violation checking to x86sim
- Provide support for stop on deadlock to x86sim
Support AI Engine Trace
Support SW Emulation for AI Engine applications
Support external traffic generator in Verilog / System Verilog
Extend Profiling Monitor insertion to Monitor Memory
- Currently the profiling monitor logic can be inserted on kernel/CU port basis. This feature provides user the option to insert monitor logic on memory interface directly
- The visualization of memory bandwidth achieved directly on the memory interfaces can be reflected in profile summary report
- DDR memory and PLRAM are supported
- Hardware flow is supported
- To enable this feature, both linking phase and xrt need to be set up
- memory=all
- data_transfer_trace= coarse|fine or
- opencl_device_counter=true
- A vadd example that enables memory interface monitoring
- A new table ‘Memory Bank Data Transfer’ is included
Vitis Analyzer Enhancements
Generic profile summary report generated for non-OpenCL applications
- Provide the same level of support for XRT API and HAL API applications.
- Users select which types of reports they want to create, the tool automatically generate and visualize them in Vitis Analyzer
Add OpenCL commands to PL event timeline
- Profiling will add overhead, XRT provides capability to dump the OpenCL events on the timeline trace without overhead.
- Vitis Analyzer can process the XRT output and show it in timeline trace view.
- xocl_debug=true needs to be set in xrt.ini.
Flatten signal hierarchy in timeline trace report
- By default, the timeline trace report displays the signal trace in hierarchical way
- Vitis Analyzer provides the capability of flattening the hierarchy by toggling the “Flatten Signal” symbol
- Comparing the waveform is supported for flattened timeline trace
Vitis Analyzer – Data Visualization
- Display input/output data to AI Engine kernels in an AI Engine design
- Helps debug AI Engine designs to show input/output data along with timeline
- Works with aiesimulator
- Supports
- Window/stream/cascade data types
- Packet streams
- Templated kernels
- data-dump utility
Vitis Analyzer – AI Engine Stall Analysis
- Vitis Analyzer provides visualization capabilities that enable users to identify the root cause of stalls
- Support
- Performance Metrics
- Lock Stall Analysis
- Stream Stall Analysis
- Cascade Stall Analysis
- Memory Stall Analysis
- Support Flow
- aiesimulator
- HW emulation
Xilinx Runtime Library (XRT):
- XRT API
- The XRT native API supports user managed kernel control with xrt::ip
- XRT Utilities
- The xbutil and xbmgmt tools are now the default
- To use the legacy utilities, please use xbutil --legacy or xbmgmt --legacy with legacy sub-commands
- New utility, xball
- Apply xbutil or xbmgmt commands to all or a filtered part of the installed data center cards. Check xball --help for details
- A new command, xbutil configure
- Allows you to enable, disable, or configure the PCIe Host Memory and PCIe Peer to Peer features. See the XRT documentation for more details
- All XRT utilities now globally support the --force option to skip user interactive confirmation
- Profiling
- A profile summary report is generated when any profiling option is enabled.
- All applicable summary tables and guidance are generated based on the profiling options enabled in the xrt.ini file
- New data transfer summary table for aggregate information on a memory resource when monitors are added to memory resources in the design
- New AIE profiling metric sets to count different AIE events including (1) floating point exceptions in AIE, (2) tile execution counts, and (3) stream puts and gets
- Embedded
- zocl memory manager improvements to support any sptag
Vitis XRT for AI Engine Multiple Process Support
- C and C++ APIs to define access modes for multiple processes to share access to the same AI Engine array and graphs.
- Protect the AI Engine array and graphs from unwanted access.
- Three modes are supported for opening the AI Engine array and graphs
- Exclusive Mode (prevents any other process from accessing)
- Primary Mode (only allows other processes nondestructive access)
- Shared Mode (performs only nondestructive access)
- Consider these modes when multiple-process support is needed. For example:
- Prevent others from accessing the AI Engine array (exclusive access)
- Multiple users control different graphs separately (multiple application support)
- One primary user controls the graph and allows others to probe the running status (primary and shared access)
Vitis XRT for AI Engine Support Status
C and C++ APIs
- C version API
- For AI Engine array:
- xrtAIEDeviceOpenExclusive (Exclusive mode)
- xrtAIEDeviceOpen (Primary mode)
- xrtAIEDeviceOpenShared (Shared mode)
- For AI Engine graph:
- xrtGraphOpenExclusive (Exclusive mode)
- xrtGraphOpen (Primary mode)
- xrtGraphOpenShared (Shared mode)
- C++ version API
- xrt::aie::device class supports an access mode in its constructor
- enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2 };
- xrt::graph class supports an access mode in its constructor
- enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2, none = 3 };
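The interaction of the three access modes can be modeled as a small compatibility check. The sketch below is a reasoning aid only, not XRT code: `may_open` is a hypothetical helper, the enum is redeclared locally so the example stands alone, and the rule that only one primary holder may exist at a time is an assumption inferred from the "one primary user" use case.

```cpp
#include <cstdint>
#include <vector>

// Local copy of the three modes listed above, so the sketch is self-contained.
enum class access_mode : std::uint8_t { exclusive = 0, primary = 1, shared = 2 };

// Hypothetical model: may a new process open the resource in `requested`
// mode, given the modes already held by other processes?
bool may_open(const std::vector<access_mode>& holders, access_mode requested) {
    for (access_mode held : holders) {
        if (held == access_mode::exclusive) {
            return false;  // an exclusive holder blocks everyone else
        }
        if (requested == access_mode::exclusive) {
            return false;  // exclusive access requires no other holder
        }
        if (held == access_mode::primary && requested == access_mode::primary) {
            return false;  // assumption: at most one primary controller
        }
    }
    return true;
}
```

Under this model a monitoring process can open in shared mode while a primary process controls the graph, which matches the "primary and shared access" use case described earlier.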
- Access latest Vitis Target Platforms for Alveo Cards and refer to the Getting Started section of the Accelerator Card.
- Download Vitis and refer to the Alveo Packages section
AI Engine DSP Library – New Blocks
- AIE DDS
- AIE Mixer
Parallel Compilation
- Reduced times vs. 2021.1 (As an example, the following numbers are for the 200 MHz TX Chain):
- Time to compile and simulate reduced by factor of 3
- Compilation times reduced by a factor of 7
- Dead time after simulation reduced from 25s to ~0s
Constraint Editor Enhancement
- 2021.2 Improved Navigation
To Fixed Size Improvements
To Variable Size Block Improvements
Enhanced Functional Co-simulation Capabilities
- Export MATLAB data for AI Engine input – xmcVitisWrite
- Import AI Engine data into MATLAB – xmcVitisRead
Others
- Import an AI Engine or HLS Kernel block with no input (Source block)
- New Data Type Support
- the Simulink native int64 and uint64 for AI Engine development instead of AMD data types, x_sfix64 and x_ufix64.
- accfloat and caccfloat for AI Engine Development
- Support for Ubuntu 20.04
- Support for MATLAB 20a, 20b, 21a (No support for MATLAB 21b)
- Addition of new examples
- Dual stream SSR filter example with 64 kernels
- Pseudo inverse(64x32) – commslib example.
- Use xmcLibraryPath command to point to a custom DSPLib location.
- Many more enhancements and bug fixes
Vitis Software Platform 2021.1 Release Highlights:
- AMD Kria System-on-Modules (SOMs) KV260 vision AI starter kit support. The full Vitis flow for ML (DPU inference engine) + X (RTL kernel and Vitis HLS based computer vision kernels). Learn More >
- Support for new C/C++ Vision, DSP, Graph (Louvain Modularity), Codec in image processing, compression (GZIP, Facebook ZSTD, ZLIB whole application acceleration) performance-optimized libraries on FPGA and/or Versal ACAP over CPU/GPUs
- Enhanced Vitis™ core development kit design flow on Versal ACAP devices: visualization improvements for AI engine design trace report, AI engine event tracing via GMIO, incremental recompile, new boot image wizard, and encrypted AI engine source file support
- The new Vitis Model Composer tool enables rapid design exploration and verification within the MathWorks MATLAB and Simulink® environment, enabling co-simulation of blocks targeting AI Engines and Programmable Logic, code generation, and test bench creation.
- New Vitis HLS Flow Navigator GUI for quick access to flow phases and reports. Merge synthesis, analysis, and debug views into a general default context
Vitis What's New by Category
Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2021.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2021.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.
Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.
AIE DSP
- DSPLib published as part of the Vitis Acceleration Library set on Github
- DSPLib contains common parameterizable DSP functions used in many advanced signal processing applications. All functions currently support window interfaces with streaming interface support.
FIR Filters
Function | Namespace |
---|---|
Single rate, asymmetrical | dsplib::fir::sr_asym::fir_sr_asym_graph |
Single rate, symmetrical | dsplib::fir::sr_sym::fir_sr_sym_graph |
Interpolation asymmetrical | dsplib::fir::interpolate_asym::fir_interpolate_asym_graph |
Decimation, halfband | dsplib::fir::decimate_hb::fir_decimate_hb_graph |
Interpolation, halfband | dsplib::fir::interpolate_hb::fir_interpolate_hb_graph |
Decimation, asymmetric | dsplib::fir::decimate_asym::fir_decimate_asym_graph |
Interpolation, fractional, asymmetric | dsplib::fir::interpolate_fract_asym::fir_interpolate_fract_asym_graph |
Decimation, symmetric | dsplib::fir::decimate_sym::fir_decimate_sym_graph |
FFT/iFFT - The DSPLib contains one FFT/iFFT solution. This is a single channel, single kernel decimation in time, (DIT), implementation with configurable point size, complex data types, cascade length and FFT/iFFT function.
Function | Namespace |
---|---|
Single Channel FFT/iFFT | dsplib::fft::fft_ifft_dit_1ch_graph |
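To show what "decimation in time" means for the FFT/iFFT entry above, the following is a textbook recursive radix-2 DIT transform in plain C++. It is not the DSPLib kernel and ignores cascade length and AI Engine data types; the function name, the use of `std::complex<double>`, and the choice to scale the inverse by 1/N are assumptions of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// In-place radix-2 decimation-in-time FFT; set inverse=true for the iFFT.
// The point size must be a power of two.
void fft_dit(std::vector<std::complex<double>>& x, bool inverse = false) {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) {
        return;
    }
    // Decimate in time: split into even-indexed and odd-indexed samples.
    std::vector<std::complex<double>> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    fft_dit(even, inverse);
    fft_dit(odd, inverse);

    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    for (std::size_t k = 0; k < n / 2; ++k) {
        // Butterfly: combine the two half-size transforms with a twiddle factor.
        const double angle = sign * 2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        const std::complex<double> t = std::polar(1.0, angle) * odd[k];
        x[k] = even[k] + t;
        x[k + n / 2] = even[k] - t;
        if (inverse) {
            // Halving at each of the log2(N) levels gives the overall 1/N scale.
            x[k] /= 2.0;
            x[k + n / 2] /= 2.0;
        }
    }
}
```

Applying `fft_dit(x)` followed by `fft_dit(x, true)` returns the original samples up to floating-point rounding.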
Matrix Multiply (GeMM) - The DSPLib contains one Matrix Multiply/GEMM (GEneral Matrix Multiply) solution. This supports the Matrix Multiplication of 2 Matrices A and B with configurable input data types resulting in a derived output data type.
Function | Namespace |
---|---|
Matrix Mult / GeMM | dsplib::blas::matrix_mult::matrix_mult_graph |
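The idea of a "derived output data type" can be sketched with a naive matrix multiply whose result type follows from the two input types. This is plain C++ and not the DSPLib graph: the function and alias names are hypothetical, the matrices are row-major, and the derivation rule used here (the type of `TA * TB` under ordinary C++ promotion) is an assumption that need not match DSPLib's rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output element type derived from the input element types.
template <typename TA, typename TB>
using product_t = decltype(TA{} * TB{});

// C (m x n) = A (m x k) * B (k x n), all row-major.
template <typename TA, typename TB>
std::vector<product_t<TA, TB>> matmul(const std::vector<TA>& a,
                                      const std::vector<TB>& b,
                                      std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<product_t<TA, TB>> c(m * n, product_t<TA, TB>{});
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t p = 0; p < k; ++p) {
                c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return c;
}
```

With 16-bit integer inputs the accumulations are carried out in `int`, so products that overflow 16 bits are still represented correctly in the output.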
Widget Utilities - These widgets support converting between window and streams on the input to the DSPLib function and between streams to windows on the output of the DSPLib function where desired and additional widget for converting between real and complex data-types.
Function | Namespace |
---|---|
Stream to Window / Window to Stream | dsplib::widget::api_cast::widget_api_cast_graph |
Real to Complex / Complex to Real | dsplib::widget::real2complex::widget_real2complex_graph |
DSP Library functions are supported in Vitis Model Composer, enabling users to easily plug these functions into the Matlab/Simulink environment to ease AI Engine DSP Library evaluation and overall AI Engine ADF graph development.
The Vitis HPC Library release introduces HLS primitives, prebuilt kernels, and software APIs for HPC applications on FPGAs. These applications are:
2D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel and backward kernel
3D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel
MLP (Multi-Layer Perceptron) components: activation functions and fully connected network kernels
PCG (Preconditioned Conjugate Gradient) Solvers for both dense matrix and sparse matrix
- First release of selected vision functions for Versal AI Engines:
Functions available
Filter2D
absdiff
accumulate
accumulate_weighted
addweighted
blobFromImage
colorconversion
convertscaleabs
erode
gaincontrol
gaussian
laplacian
pixelwise_mul
threshold
zero
xfcvDataMovers : Utility datamovers to facilitate easy tiling of high resolution images and transfer to local memory of AI Engines cores. Two flavors
- Using PL kernel : higher throughput at the expense of additional PL resources.
- Using GMIO : lower throughput than PL kernel version but uses Versal NOC (Network on chip) and no PL resources.
- New Programmable Logic (PL) functions and features
- ISP pipeline and functions:
- Updated 2020.2 Non-HDR Pipeline
- Support to change few of the ISP parameters at runtime: gain parameters for red and blue channels, AWB enable/disable option, gamma tables for R,G,B, %pixels to compute min&max for awb normalization.
- Gamma Correction and Color Space conversion (RGB2YUYV) made part of the pipeline.
- New 2021.1 HDR Pipeline : 2020.2 Pipeline + HDR support
- HDR merge for 2 exposures which supports sensors with digital overlap between short exposure frame and long exposure frame.
- Four Bayer patterns supported : RGGB,BGGR,GRBG,GBRG
- HDR merge + isp pipeline with runtime configurations, which returns RGB output.
- Extraction function : HDR extraction function is preprocessing function, which takes single digital overlapped stream as input and returns the 2 output exposure frames(SEF,LEF).
- 3DLUT : provides input-output mapping to control complex color operators, such as hue, saturation, and luminance.
- CLAHE: Contrast Limited Adaptive Histogram Equalization is a method which limits the contrast while performing adaptive histogram equalization so that it does not over-amplify the contrast in near-constant regions. It also reduces the problem of noise amplification.
- Flip : Flips the image along horizontal and vertical line.
- Custom CCA : Custom version of Connected Component Analysis Algorithm for defect detection in fruits. Apart from computing defected portion of fruits , it computes defected-pixels as well as total-fruit-pixels
- Canny updates : Canny function now supports any image resolution.
Library Related Changes
- All tests have been upgraded from using OpenCV 3.4.2 to OpenCV 4.4
- Added support for Versal Edge series (VCK190)
- A new benchmarking section with benchmarking collateral for selected pipeline/functions published.
The 2021.1 release provides Two-Gram text analytics:
Two Gram Predicate (TGP) is a search of the inverted index with a term of 2 characters. For a dataset with an established inverted index, it finds the matching IDs for each record in the inverted index.
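The two-character lookup described above can be pictured with a small in-memory inverted index. This is a conceptual sketch in plain C++, not the library's kernel or API: the type alias, the function names, and the choice to index every overlapping character pair are assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Inverted index: two-character term -> IDs of the records containing it.
using TwoGramIndex = std::map<std::string, std::set<std::size_t>>;

// Index every overlapping pair of adjacent characters in each record.
TwoGramIndex build_index(const std::vector<std::string>& records) {
    TwoGramIndex index;
    for (std::size_t id = 0; id < records.size(); ++id) {
        const std::string& text = records[id];
        for (std::size_t pos = 0; pos + 2 <= text.size(); ++pos) {
            index[text.substr(pos, 2)].insert(id);
        }
    }
    return index;
}

// Return the IDs of all records that contain the two-character term.
std::set<std::size_t> lookup(const TwoGramIndex& index, const std::string& term) {
    const auto it = index.find(term);
    return it == index.end() ? std::set<std::size_t>{} : it->second;
}
```

For the records `"alveo"`, `"versal"`, and `"vitis"`, looking up `"ve"` returns record IDs 0 and 1, because only the first two records contain that character pair.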
- Community Detection: Louvain Modularity
- 2-Hop Search
N/A
- Adds double-precision SpMV (Sparse Matrix dense Vector multiplication) implementation with L2 kernels
In the 2021.1 release, GQE receives early-access support for the following features:
64-bit join support: the gqeJoin kernel and its companion gqePart kernel have been extended to 64-bit keys and payloads, so that a larger scale of data can be supported.
Initial Bloom-filter support: the gqeJoin kernel now ships with a mode in which it executes Bloom-filter probing. This improves efficiency on certain multi-node flows where minimizing data size in the early stage is important.
Both features are currently offered as L3 pure software APIs; please check the corresponding L3 test cases.
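Bloom-filter probing reduces data early because a key that the filter rejects is guaranteed absent from the build side, so that row can be dropped before any further transfer. The sketch below shows the principle in plain C++ with 64-bit keys. It is not the gqeJoin kernel or the L3 API: the class name, the 1024-bit size, and the two multiplicative hash functions are arbitrary choices for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Minimal Bloom filter over 64-bit keys with two hash functions.
// may_contain() can report false positives but never false negatives.
class BloomFilter {
public:
    void insert(std::uint64_t key) {
        bits_.set(hash1(key));
        bits_.set(hash2(key));
    }

    bool may_contain(std::uint64_t key) const {
        return bits_.test(hash1(key)) && bits_.test(hash2(key));
    }

private:
    static constexpr std::size_t kBits = 1024;  // 2^10 bits

    // Both hashes keep the top 10 bits of a 64-bit product: range 0..1023.
    static std::size_t hash1(std::uint64_t key) {
        return static_cast<std::size_t>((key * 0x9E3779B97F4A7C15ull) >> 54);
    }
    static std::size_t hash2(std::uint64_t key) {
        return static_cast<std::size_t>(((key ^ (key >> 29)) * 0xBF58476D1CE4E5B9ull) >> 54);
    }

    std::bitset<kBits> bits_;
};
```

In a join, the build-side keys are inserted once and each probe-side key is tested with `may_contain`; only keys that pass need to be forwarded to the actual join.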
- GZIP Multi Core Compression:
- New GZIP Multi-Core Compress Streaming Accelerator, a purely stream-only solution (free-running kernel). It comes in variants that support block sizes of 4KB, 8KB, 16KB, and 32KB.
- Facebook ZSTD Compression Core:
- New Facebook ZSTD Single Core Compression accelerator with block size 32KB. Multi-cores ZSTD compression is in progress (for higher throughput).
- GZIP low latency Decompression:
- A new version of GZIP decompress with improved latency for each block, fewer resources (35% lower LUT, 83% lower BRAM), and improved FMax.
- ZLIB Whole Application Acceleration using U50:
- L3 GZIP solution for the U50 platform, containing 6 compression cores to saturate the full PCIe bandwidth. It is provided with an efficient GZIP software solution to accelerate the CPU libz.so library, which provides seamless inflate and deflate API-level integration into end-customer software without recompiling.
- Versal Platform Supports.
- Add AIE Support - See above
- The 2021.1 release provides support for RIPEMD160 and initial (not complete) support for BLS
- In the 2021.1 release, Data-Mover is added to this library. Unlike other C++ based APIs, this addition is targeting people less experienced in HLS based kernel design and just want to test their stream-based designs. The Data-Mover is actually a kernel source code generator, creating a list of common helper kernels to drive or validate designs, like those on AIE devices.
- Produce QoR metrics (Vitis QoR Generation API)
- Cycles taken by the application kernel
- Stall cycles (computed from VCD file)
- Measure overhead cycles in the wrapper (time spent in other functions than the kernel itself)
- Throughput
- 3 levels of optimization XLOPT=0, 1 (default), 2
- New functionalities for xlopt=2:
- loop fusion, flatten single iteration outer loops, enhance loop peeling heuristics
- Analyze "__restrict" usage and give guidance
- Incremental recompile: when the graph does not change, recompile only kernels that've been modified
- Packet Switched data → up to 32-split (was limited to 4)
- New DMA FIFO location constraint (mapper/router changes between releases do not impact performance)
- Use mapping solution as a constraint in the new compilation: prevent future mapping variations that impact performance
- Bring x86sim feature support to aiesim level
- Start of deprecation of PL kernels in ADF graphs (complete deprecation in 2021.2)
- New “Flow Navigator” in GUI for quick access to flow phases and reports. The contextual "synthesis, analysis, debug" views are merged into a general default context
- New synthesis report section for the BIND_OP and BIND_STORAGE directives
- A new post-synthesis text report reflects the information provided in the GUI synthesis report
- The IP export and Vivado implementation run widgets have been redesigned with options to pass settings and constraint files to Vivado
- New function call graph viewer to visualize functions and loops which can be highlighted with an optional heatmap to detect II, latency, or DSP/BRAM utilization hot spots
- Versal timing calibration and new controls for DSP block native floating-point operations (the -precision option for config_op)
- The Vitis HLS Migration guide (former UG1391) is now a chapter in UG1399
- New methodology sections in user guide (UG1399 and web)
- Alternate flushable pipeline option has been improved (free-running pipeline aka "frp")
- In Vitis, a top port pointer can now simply be mapped onto the axi-lite adapter rather than a global memory
- The aggregate directive now provides a "-compact bit" option for maximum packing
- Adds back a "Leave Feedback" entry in Help menu with optional survey
- Fixed bug for "Man Pages" tab not displaying information on some Linux systems
- In Vitis, reshaping m_axi interfaces should be done via the hls::vector types
- New customization options for s_axilite and m_axi data storage which can be "auto, "uram", "bram" or "lutram" allowing you to tweak RAM utilization in your design
- In Vitis, introducing a new continuously (aka "never-ending") running mode for kernel
- The axi_lite secondary clock option has been re-instated
- Enhance support for RTL kernel packaging in Vivado IP packager
- public and productized feature with proper methodology and documentation.
XRT managed kernel is the default flow.
Support encrypted AIE source files as input
AIE compiler can accept encrypted AIE source file and v++ supports the rest of the flow.
- Add Create Boot Image Wizard support for Versal devices
- Multiple improvements for AI Engine programming and debugging
- Being able to turn on and off micro code labels
- Static Cross-probing between the source code and the microcode
- Full view of the microcode
- Bringing the last PC in the visible area whenever Pipeline view updates the data
- Aligning the Instruction data in Pipe line view
- Adding "Single Instruction Mode" action to disassembly view.
- Be able to generate a default BIF file for a platform project
- Program Flash for SD and eMMC adds raw mode support
- In-context help messages are added to AI Engine development flow
- Upgraded GCC toolchain version to 10.2
- Users can emulate AXI-MM master/slave through an external process such as Python / C++. This may help users to emulate design with quick design time of AXI Master / Slave, without investing resources in developing AXI Master or VIP. AXI-MM Inter-process communication can also help to emulate the Chip-to-Chip connection between two FPGAs.
- Enabling compilation of Versal models for VCS.
- Platform developers can run hardware emulation on the platform with standalone applications to test the platform in the early stage.
- User range profiling information and user event information are aggregated into profile summary report
Vitis Analyzer shows a critical timing path.
Vitis Analyzer will display a simplified version of the Vivado GUI timing report, without the need to open a Vivado project or netlist. This allows users to quickly navigate to the failing timing path.
Vitis Analyzer multiple strategies support
Results from multiple strategies run can be visualized in Vitis Analyzer.
- New xrt.ini switches for profiling and debug
Reduce memory and loading time for large applications
The new profile tool uses fewer resources when processing large CSV files, which reduces loading time and the occurrence of crashes.
PL continuous trace offloading improvement
Use DDR or HBM as memory resource to store trace data
Circular buffer support for large data offloading
Trace buffer size and offloading interval can be set in xrt.ini
Improvements to the visualization of AIE design’s trace report
All AIE inputs will be displayed(window, stream, cascaded stream, etc.)
Support all IO data types
- Stable native XRT API, with C++ APIs for AIE graph control and execution, Software Emulation and tracing support.
- XRT provides new helper APIs to help users to move from OpenCL API to XRT native API in $XILINX_XRT/include/CL/cl2xrt.hpp.
- XRT New API xrt::device.get_info() can extract device properties
- Greatly improved next generation xbutil and xbmgmt utilities are now the default.
- xbutil can report power status
- xbmgmt can support runtime clk scale and setup user power threshold to protect board and server.
- sysfs, xbmgmt and xbutil can report MAC address of Alveo board
- KDS scheduler in xocl has been refactored to significantly improve the throughput across hundreds of processes exercising multiple compute units across multiple devices concurrently. For legacy shells you may notice small percentage of throughput degradation. Please see the AR for proper solution.
- XRT driver debug trace support through debugfs /sys/kernel/debug/xclmgmt/ and /sys/kernel/debug/xocl/
Access the latest Vitis Target Platforms for Alveo Accelerator cards at www.xilinx.com/alveo. Please refer to the Getting Started section of the accelerator card you want to deploy your applications on.
Please refer to UG1120 - Alveo Data Center Accelerator Card Platforms User Guide for more details and to keep up-to-date on the latest Vitis Target Platform releases, as they become available.
New Platforms
- Alveo U200 Gen3x16 XDMA 1RP
- Name: xilinx_u200_gen3x16_xdma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, DDR Self-Refresh
- Alveo U50 Gen3x16 noDMA 1RP
- Name: xilinx_u50_gen3x16_nodma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, Clock Throttling
Vitis Embedded Platforms
- The VCK190 Base Platform enables ECC on DDR and LPDDR; its constraints are now more concise.
- MPSoC base platforms have increased the CMA size to 1536M. All Vitis-AI models can run with this CMA size.
- The embedded platform creation flow is simplified: the Device Tree Generator can automatically generate a ZOCL node, and XSCT can generate BIF files. Base platform source files have been reduced.
- Support for Kubernetes (K8s) clusters: Xilinx FPGA Resource Manager (XRM) can now be used together with Kubernetes to run and manage compute units (CUs) across a pool of multiple Alveo accelerator cards attached to a server, and to scale applications to multiple servers with Alveo cards.
- A comprehensive constraint editor enables users to specify any constraint for AI Engine kernels in Vitis Model Composer. The generated ADF graph will contain these constraints.
- Addition of AI Engine FFT and IFFT blocks to the library browser.
- Users now have access to many variations of AI Engine FIR blocks in the library browser.
- Ability to specify filter coefficients using input ports for FIR filters.
- Addition of two new utility blocks "RTP Source" and "To Variable Size".
- Enhanced AIE Kernel import block now also supports importing templatized AI Engine functions.
- Ability to specify AMD platforms for AI Engine designs in the Hub block.
- Through the Hub block, users can relaunch Vitis Analyzer at any time after running AIE Simulation.
- Users can now plot cycle approximate outputs and see estimated throughput for each output using Simulink Data Inspector.
- Improved usability when importing a graph as a block using only the graph header file.
- Revamped progress bar with a cancel button.
- Improved usability when importing an AI Engine kernel or simulating a design while the MATLAB working directory and the model directory are not the same.
- New TX Chain 200MHz example.
- New 2D FFT examples showcasing designs with HLS, HDL, and AI Engine blocks.
HDL
- Simulation speed enhancement for SSR FIR (more than 10x improvement) and SSR FFT.
- Simulation speed enhancement for memory blocks such as RAMs and FIFOs.
- Questa Simulator updated with VHDL 2008 in the Black-box import flow
General
- Vitis Model Composer now contains the functionality of AMD System Generator for DSP. Users who have been using AMD System Generator for DSP can continue development using Vitis Model Composer.
- MATLAB Support - R2020a, R2020b & R2021a
Vitis Software Platform 2020.2 Release Highlights:
- Vitis 2020.2 supports application acceleration and embedded software development for Versal ACAP Platforms
- Vitis Core Development Kit now includes the AI Engine Compiler to compile C/C++ applications for Versal AI Engines. AI Engine, part of Versal AI Core Series, is a vector processor for compute-intensive applications
- Vitis HLS is default for both accelerated-kernel compilation (Vitis) and C/C++ to RTL IP creation flow (Vivado)
- 600+ FPGA-accelerated functions across 13 performance-optimized libraries. 2020.2 introduces the new Vitis HPC library for accelerating high-performance computing applications and several enhancements & additions to the Data Analytics, Graph, BLAS, Sparse, Security & Database libraries
- Support for evaluating multiple implementation strategies for final FPGA binary creation & enhancements for easier RTL-kernel integration within Vitis applications
- Other enhancements in this release include support for AI Engine application profiling, Git version control for Vitis projects, Vitis AI profiler data integration within Vitis Analyzer, and enhancements for emulation modes.
- Add-on for MATLAB® and Simulink®: Unification of AMD Model Composer and System Generator for DSP. AI Engine is a new domain in the Add-on for MATLAB and Simulink.
Footnotes
- Based on testing on August 10, 2023, across 1000 Vitis L2/L3 code library designs, with Vitis HLS release 2023.2 vs. Vitis HLS 2023.1. System configuration during testing: Intel Xeon E5-2690 v4 @ 2.6GHz CPU, 256GB RAM, RedHat Enterprise Linux 8.6. Actual performance will vary. System manufacturers may vary configuration, yielding different results. -VGL-04
- The benchmark tests were performed on all 1208 Vitis L1 library C-code designs as of February 12th, 2023. All designs were run using a system with 2P Intel Xeon E5-2690 CPUs with CentOS Linux, SMT enabled, Turbo Boost disabled. Hardware configuration is not expected to affect software test results. Results may vary based on software and firmware settings and configurations. -VGL-03