Vitis What's New

AMD Vitis™ Software Platform 2024.2 Release Highlights:

Enhancements for AMD Versal AI Engine DSP Designs

Latency and throughput estimates using Vitis Analyzer
Mark the unavailable PLIOs using Vitis Analyzer
Rapid Prototyping of AMD Versal™ AI Engine Designs
Heap stack and program memory reporting

New and Enhanced Vitis Library functions for Versal AI Engines

Enhanced DSP Library Functions for AIE (Available on Versal AI Core, Versal Premium Series)
- Performance enhanced Time Division Multiplexed (TDM) FIR filter functions
- Higher performance versions of
  - General Matrix Vector (GEMV)
  - General Matrix Multiply (GEMM)
- 2D IFFT – partitioned across AIE + PL for high performance
New DSP Library Functions for AIE-ML (Available on Versal AI Edge)
- Performance enhanced TDM FIR filter functions
- Support for Radix-3/Radix-5 FFTs
- GEMV
- GEMM

New Ease-of-use Features in the Vitis IDE (new GUI)

New Serial Terminal: Monitor serial messages from the hardware
Install and explore third-party extensions
PS Trace feature for debugging & optimizing the performance of embedded systems

Enhancements to Vitis Model Composer for AIE DSP Designs

AI Engine DSP Library Updates
- AIE (Available on Versal AI Core, Versal Premium Series)
  - Mixed Radix FFT
  - Stockham FFT performance enhancements
  - TDM FIR
- AIE-ML (Available on Versal AI Edge Series)
  - TDM FIR
  - Direct Digital Synthesis (DDS – used for waveform generation)
  - Mixer (used for frequency shifting)
- AIE-MLv2 (Available on Versal AI Edge Gen 2 Series)
  - FIR
  - DFT
  - DDS
  - Mixer
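The DDS and Mixer entries above are described only by their purpose (waveform generation and frequency shifting). As a rough illustration of those two ideas, here is a minimal sketch in plain C++. It is not the DSP Library or Vitis Model Composer implementation; the `Dds` class, the `mix` function, and the 32-bit phase accumulator width are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstdint>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical phase-accumulator DDS: every call advances a 32-bit phase
// word by a fixed increment and returns the matching point on the unit circle.
class Dds {
public:
    // freq_norm is the output frequency as a fraction of the sample rate.
    explicit Dds(double freq_norm)
        : phase_inc_(static_cast<std::uint32_t>(
              std::llround(freq_norm * 4294967296.0))) {}

    cfloat next() {
        const double kTwoPi = 6.283185307179586;
        const double angle =
            kTwoPi * (static_cast<double>(phase_) / 4294967296.0);
        phase_ += phase_inc_;  // unsigned arithmetic wraps modulo 2^32
        return cfloat(static_cast<float>(std::cos(angle)),
                      static_cast<float>(std::sin(angle)));
    }

private:
    std::uint32_t phase_inc_;
    std::uint32_t phase_ = 0;
};

// Hypothetical mixer: multiplying each input sample by the DDS output
// shifts the input spectrum by the DDS frequency.
std::vector<cfloat> mix(const std::vector<cfloat>& in, Dds& dds) {
    std::vector<cfloat> out;
    out.reserve(in.size());
    for (const cfloat& x : in) {
        out.push_back(x * dds.next());
    }
    return out;
}
```

In this sketch, mixing a tone at 0.125 of the sample rate with a DDS running at 0.0625 yields a tone at 0.1875, which is the frequency shift the Mixer entry refers to.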
Additional Data Types for Vitis Model Composer
- Support for cbfloat16
- Additional data type support for cascaded signals
  - int8/uint8
  - int16/uint16/cint16
  - int32/uint32/cint32
  - float/cfloat
Export AIE/HLS Kernel designs from Vitis Model Composer to Vitis as a Vitis Subsystem (VSS)
Debug AIE/HLS Kernels Built in Vitis Model Composer Using Vitis Debugger
Updates to HDL Blockset in Vitis Model Composer
- Simple Dual-Port RAM
  - New block
  - Examples
- DDS Compiler
  - Added native floating-point support
  - Examples
- FFT
  - Added native floating-point support with SSR=2, 4
  - Maps to DSPFP32 primitive on Versal devices
Other Enhancements in Vitis Model Composer
- Improved response time for code generation
  - Simulation runs only once for any design
- Save Hub block configurations as a JSON file (useful for rapid prototyping or batch processing)
- Added support for MATLAB R2024a
- Added support for Red Hat Enterprise Linux (RHEL) 8.10, 9.4
Design Rule Checks (DRCs) to Replace Design Considerations

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2024.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2024.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

Enhanced DSP Library Functions for AIE (Available on Versal AI Core, Versal Premium Series)

Performance enhanced TDM (Time Division Multiplexed) FIR Filter Functions
Higher performance versions of
GEMV (General Matrix Vector)
GEMM (General Matrix Multiply)
2D IFFT – partitioned across AIE + PL for high performance

New DSP Library Functions for AIE-ML (Available on Versal AI Edge)

Performance enhanced TDM (Time Division Multiplexed) FIR Filter Functions
Support for Radix-3/Radix-5 FFTs
GEMV (General Matrix Vector)
GEMM (General Matrix Multiply)

Latency and Throughput Estimate with Vitis Analyzer
Mark which PLIOs are unavailable using Vitis Analyzer

AI Engine DSP Library Updates
- AIE (Available on Versal AI Core, Versal Premium Series)
  - Mixed Radix FFT
  - Stockham FFT Performance Enhancements
  - TDM FIR
- AIE-ML (Available on Versal AI Edge Series)
  - TDM FIR
  - DDS (Direct Digital Synthesis – used for waveform generation)
  - Mixer (used for frequency shifting)
- AIE-MLv2 (Available on Versal AI Edge Gen 2 Series)
  - FIR
  - DFT
  - DDS
  - Mixer
Additional Data Types for Vitis Model Composer
- Support for cbfloat16
- Additional data type support for cascaded signals
  - int8/uint8
  - int16/uint16/cint16
  - int32/uint32/cint32
  - float/cfloat
Export AIE/HLS Kernel designs from Vitis Model Composer to Vitis as a VSS (Vitis Subsystem)
Debug AIE/HLS Kernels Built in Vitis Model Composer using Vitis Debugger
Updates to HDL Blockset in Vitis Model Composer
- Simple dual-port RAM
  - New block
  - Examples
- DDS Compiler
  - Added native floating-point support
  - Examples
- FFT
  - Added native floating-point support with SSR=2, 4
  - Maps to DSPFP32 primitive on Versal
Other Enhancements in Vitis Model Composer
- Improved response time for code generation
  - Simulation runs only once for any design
- Save Hub block configurations as a JSON file (useful for rapid prototyping or batch processing)
- Added support for MATLAB R2024a
- Added support for Red Hat Enterprise Linux (RHEL) 8.10, 9.4
Design Rule Checks (DRCs) to Replace Design Considerations

Modeling scalar/wire inputs that change during execution (Direct I/O)
Support for arbitrary precision floating-point types
Mapping HLS code to DSP blocks
User-determined sequence of code execution
HLS debugger that shows data types in a user-friendly manner (using the pretty print technology of GNU debugger)

AMD Vitis™ Software Platform 2024.1 Release Highlights:

Enhancements for AMD Versal™ AI Engine DSP Designs

Enhanced DSP Library Functions for AMD Versal AI Core Series
- Time division multiplexed (TDM) FIR filter functions for SSR > 1
- FFT with 32-bit twiddle
- Mixed-Radix 3 & Mixed-Radix 5 FFTs
- Kronecker Matrix Product
- Householder-based QRD solver for improved stability
- DFT for SSR > 1
New DSP library functions for AMD Versal AI Edge Series with AIE-ML
- General Matrix Vector (GEMV) with SSR support
- General Matrix Multiply (GEMM) with SSR support
AIE API Enhancements
- Support Radix-3/Radix-5 FFTs
AIE Simulator Enhancements
- Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for control, interfaces, and processing system (CIPS) IP core
- AMD Vitis analyzer support for hardware emulation with third-party simulators such as VCS, Questa, Xcelium, and Riviera

Key improvements to Vitis Unified Software Platform

New device support: AMD Versal™ Premium VP1902 Adaptive SoC, AMD MicroBlaze™ V Processor
Enhanced embedded application development and BSP generation for Windows® environment
User-managed flow to debug embedded applications compiled externally
New Bootgen GUI
Enable incremental builds for platform project

Key improvements to AMD Vitis IDE (New GUI)

Added support for processing subsystem hierarchical debug
Added support for export and import of projects/workspace
Added support for Python interpreter and API
New feature preview page
New file change notification for embedded, AIE, platform projects

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2024.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2024.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

Enhanced DSP Library Functions for AMD Versal AI Core Series

Time division multiplexed (TDM) FIR filter functions for SSR > 1
FFT with 32-bit twiddle
Mixed-Radix 3 & Mixed-Radix 5 FFTs
Kronecker Matrix Product
Householder-based QRD solver for improved stability
DFT for SSR > 1

New DSP library functions for AMD Versal AI Edge Series with AIE-ML

General Matrix Vector (GEMV) with SSR support
General Matrix Multiply (GEMM) with SSR support

AIE API Enhancements

Support Radix-3/Radix-5 FFTs

AI Engine Simulator Enhancements

Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for CIPS (Control, Interfaces, and Processing System IP Core).
Vitis analyzer support for hardware emulation with third-party simulators such as VCS, Questa, Xcelium, and Riviera

Export tables from Vitis analyzer to CSV format

New DSP functions supported for AIE and AIE-ML within AMD Vitis Model Composer
- Time Division Multiplexed (TDM) FIR Filter functions
  - For building polyphase channelizers @ 1 GSPS and higher throughput
- DFT/IDFT – with SSR support
  - Optimized transforms for throughput/latency on small sizes
- FFT/IFFT – with extended support for CINT32-bit twiddle
- Mixed-Radix FFT/IFFT – with AIE-ML support
Ease-of-use improvements to Model Composer Hub block
Enhancements to Hardware Validation flow
OS and MATLAB® version support added with v 2024.1:
- RHEL 9
- MATLAB R2023a and R2023b

New example designs available on GitHub.

A new stencil pragma simplifies HLS C++ code for image and video filters
New library function wizards tap into the AMD Vitis libraries GitHub repo
- Create “Solver” and “Vision” (OpenCV compatible) IPs for AMD Vivado design tool
- Run the available library examples
Pragma for memory interface (ap_memory) can now bundle ports for AMD Vivado IP Integrator
New HLS component comparison displays side-by-side metrics for 2 or more components
Support for user-provided RTL code to replace a C++ function (black-box flow)
Code Analyzer can now disaggregate C++ struct members to fine-tune performance analysis
New user control for HLS global FSM encoding and selection of safe state
Access to Clang sanitizers during C-Simulation to perform address and initialization checks

Vitis™ Software Platform 2023.2 Release Highlights:

Enhancements for Versal™ AI Engine DSP Designs

New DSP library functions
New API support for DSP functions
New features in AI Engine compiler and simulators

New Standalone Vitis Embedded Software

A smaller standalone installer for designers writing C code for the Arm® embedded subsystem
All embedded features are provided, including utilities such as Bootgen and XSCT

New Vitis Unified Integrated Design Environment

Consistent GUI and CLI across all Vitis workflows
Next-generation, Eclipse Theia-based GUI provides better flexibility and user-friendly features for enhanced work efficiency

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis software platform 2023.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

New DSP library functions for AI Engines

Mixed Radix FFT
Discrete Fourier Transform (DFT)
General Matrix-Vector Multiply (GEMV)

New API support for DSP functions

FFT IP with cint32 twiddle data types
Support for cint16 for Radix-4 FFT APIs
Vectorized "fix2flt" and "flt2fix" implemented in API

New API support for AIE-ML

APIs now support int32/cint32 data types in sliding_mul() function
APIs now support <float> data types in sliding_mul() function
All AIE API routines required to support sparse matrix multiplication are provided

Major Component Updates:

U-Boot 2024.1
Arm Trusted Firmware 2.10
Linux Kernel 6.6_LTS
QEMU 8.1
Xen 4.18
OpenAMP 2023.10

Sunset BSPs:

AMD MicroBlaze™: VCU118, KCU105, KC705, AC701
Zynq: zc706
AMD Versal™: VMK180-EMMC, VMK180-OSPI
Zynq MP: ZCU111

New BSPs (XSCT):

VEK280 Production BSP with New ETH Phy

New System Device Tree Flow (SDT) BSP:

ZCU102, ZCU104, ZCU105, ZCU216
ZCU208, ZCU208-sdfec, ZCU670
VCK190
VMK180
VPK120
VPK180
VEK280

AIE compiler can now support 2D and 3D arrays as inputs or outputs
Vitis Analyzer now generates guidance report to adjust FIFO size
New support for multi-threaded simulator kernel and value change dump (VCD) analyzer speedup
External interfacing with MATLAB® environment & Python traffic generators
Enhanced AXI Stream model with support for empty/wait cycles in PLIO alignment
Enhanced Design Rule Checking

AI Engine trace offload via high-speed debug
NoC and hard DDRMC profiling support in the Vitis environment
Vitis tool now supports AIE-ML trace for VEK280 and Alveo™ V70 AI inference accelerator card

AI Engine block updates
Support for importing AIE-ML graphs as blocks into Vitis Model Composer
New DSPlib functions for AIE and AIE-ML implementation in Vitis Model Composer
Plotting of AIE simulator output for internal signals in the Simulink® tool
HLS Kernel block updates
Automatic test bench generation
Expanded data type support for HLS Kernel blocks
Integration of Vitis Model Composer and Vitis tool
Generation of .xo and libadf.a files directly from Vitis Model Composer
Other enhancements
MATLAB® tool version support: R2021a, R2021b Update 6, R2022a Update 6, R2022b
Additional topologies supported for the hardware validation flow
New example collaterals available from GitHub

New Vitis Unified IDE for HLS components
New Vitis HLS license requirements
New code analyzer feature for obtaining performance estimations before running C synthesis
Enhancements to AXI interface:
- Support for HLS AXI Stream side-channels
- Support for user-configurable AXI master caching
Other enhancements:
- New code complexity report to enable identifying design size issues during C synthesis
Compile time improvements: Average compile time improvement of 20% in 2023.2 compared to 2023.1¹

Vitis Software Platform 2023.1 Release Highlights:

New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays

DSP library functions – more FIR filter configurations
Solver library functions – enhancements for higher performance

Design Flow Enhancements for Versal AI Core and AI Edge Series

AIE compiler support for 2D and 3D arrays as inputs/outputs
AIE simulator guidance support for FIFO sizing to avoid deadlock conditions
AIE status reporting enhancements
New default GUI for the Vitis analyzer

Support for Vitis environment export to the Vivado™ environment

Enables Vitis and Vivado tool development teams to work in parallel based on a common interface checkpoint

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2023.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

DSP Library - FIR Filters

Enhanced Fractional Resampler FIR, Single Rate FIR, Half Band FIR, and Rate Change FIR to support coefficient bit widths larger than data bit widths
Fractional Resampler FIR also supports SSR operation using multiple AIE tiles and incorporates coefficient reload feature

Solver Library

Enhanced API performance with high-performance streaming designs (~300 tiles)
QR and Cholesky Decomposition support for 4D data mover functions to help read or write data from AIE arrays

AIE compiler can now support 2D and 3D arrays as inputs or outputs in addition to 1D.
AIE compiler supports graph-within-graph constructs (subgraphs) and conditional port constructs.
New AIE CINT-to-CFLOAT data conversion APIs.

AIE status reporting enhancement to generate a file that includes information about tiles, events, and additional registers on AIE-ML and AIE tiles in the design.
Offloading of AIE event trace over high-speed differential pairs (HSDPs) instead of storing it in memory on Versal devices.
NoC and hard DDR MC profiling support in the Vitis environment.
AIE windowed event trace for inspecting a specific part of an application.

Guidance with FIFO sizing to avoid deadlocks.
Ability to select nodes that are reported by the AIE simulator to reduce the size of the simulator VCD file and speed up simulation.
AIE simulator now generates a report (that can be viewed in the Vitis analyzer) that shows which AIE has memory access violations and how these correspond to lines in the graph C code.
Trace view data visualization now supports the AIE-ML array as well.

New data type support for FIR filter configurations that target Versal AI Engines
Two new floating-point functions optimized for DSP58s in Versal adaptive SoCs
Faster response time for all Vitis Model Composer library functions targeting Versal AI Engines
Other enhancements:
- Enhancements to HLS kernel blocks
- Enhancements to the Vitis Model Composer Hub
- Support for MATLAB tool versions R2021a, R2021b, R2022a

Performance improvements²: Average latency improvements of 5.2% in 2023.1 compared to 2022.2
Easy way to download, view, and instantiate L1 library functions in the Vitis HLS tool
Enhanced support for AXI transactions and burst reporting within the Vitis HLS tool

Vitis Software Platform 2022.2 Release Highlights:

New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays

DSP library functions – enhanced features
Solver library functions
Vision library functions
Ultrasound library functions

Design Flow Enhancements for Versal AI Core and AI Edge Series

Control relative placement of kernels in the AI Engine array – higher performance and better utilization
AIE x86 simulator enhancements - improved modeling of deadlock conditions in x86 simulator
AIE API enhancements - Radix 3/5 FFT and Matrix ‘x’ Vector APIs added
Enhanced profiling and debugging capabilities for Versal designs – deadlock detection, larger trace data collection, RTL/Python testbench support
New simulation options for heterogeneous designs in Vitis

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2022.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2022.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

DSP library functions

Super sample rate (SSR) FIR filter implementation on AI Engine now supports coefficient reload feature and dynamic point size
Added FFT windowing element to the FFT function that targets the AI Engine array

Solver library functions

QR decomposition
Cholesky decomposition

Vision library functions

Four new video functions targeting the AI Engine array

Ultrasound library functions

Various functions to help build medical ultrasound designs

Ability to add constraints to control relative placement of kernels in the AI Engine array - this allows users to get higher performance and better utilization
Improved modeling of AIE deadlock conditions in x86 simulator
New AIE APIs added - Radix 3/5 FFT and Matrix ‘x’ Vector

Generation of AI Engine profiling reports in HW Emulation
Deadlock detection using XSDB (AMD System Debugger) for both AI Engine and PL-based designs
Xilinx Runtime (XRT) controlled continuous offloading of AI Engine event trace over PLIO

Supports PS application on x86 host machine for SW emulation
Allows SystemC functional models for HW emulation instead of RTL
Allows users to simulate the AI Engine kernel with a simple RTL test bench or Python script-based traffic generator
AI Engine status can be analyzed during HW emulation with the Vitis™ analyzer

Vitis environment 2022.2 new simulation options: Processor system x86 simulation and AI Engine x86 simulation: Programmable logic simulation can be performed using the x86 simulator.

Features for Versal AI Engine Design
Ability to add graph constraints to AI Engine DSP library blocks designs – better utilization and performance
New capability for cycle approximate simulation for AI Engine designs
AI Engine Graph Import block automatically detects Run Time Parameter (RTP) ports
Enhancements and additions to the DSP Library blocks
General Features
Hardware validation flow supported for heterogeneous system designs that use PL and AIE array
Vitis Model Composer Hub block updated to support heterogeneous design
- Automatic detection of valid AI Engine, HDL, and HLS subsystems
Hardware validation flow enhanced for HDL only designs and HDL → AI Engine → HDL designs for Versal platforms

Improved 'task level parallelism' coding style support
- Enables faster C simulation and better QoR
Additional performance and timing enhancements
- Improved burst inference
- Automatic inference of Unroll, Pipeline, Array_Partition, and inline pragmas for better performance
- Improved timing accuracy resulting in better timing closure at higher frequencies
Other features
- Analysis and debug: printf inserted in C-code now supported even after synthesis in the RTL
- Ease of use: new performance pragma to automatically achieve a given transaction interval
- HLS::stream interfaces now supported by FFT and FIR IP
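The performance pragma mentioned above is applied to a loop in the HLS C++ source. The sketch below shows the general shape, assuming the `target_ti` option; the function name, the array size, and the interval value are placeholders chosen for illustration. The exact option semantics should be checked against the Vitis HLS documentation for the release in use.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kN = 1024;  // illustrative array size

// Hypothetical kernel: scales each input sample and adds an offset.
void scale_offset(const int in[kN], int out[kN], int gain, int offset) {
    for (std::size_t i = 0; i < kN; ++i) {
        // Request a target transaction interval, in clock cycles, for the
        // enclosing loop. The value 2048 is an arbitrary placeholder.
        // A regular C++ compiler ignores this HLS-specific pragma.
#pragma HLS performance target_ti=2048
        out[i] = in[i] * gain + offset;
    }
}
```

The intent is that the tool derives lower-level optimizations to meet the stated interval, instead of the designer specifying each one by hand. When compiled for C simulation, the function behaves as ordinary C++.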

Vitis Software Platform 2022.1 Release Highlights:

Vitis™ Flow Enhancement for Versal™ ACAP and AI Engine

Supports AMD base DFX platform with one static region and one DFX region
AIE profiling supports stall/deadlock detection, generates AI Engine status (including error events) view reports in Vitis Analyzer
External Traffic Generators in x86sim, AIEsim, and SW emulation are much more flexible and can be inserted very easily in Simulation and Emulation flows
Vitis Model Composer supports Hardware Validation, Linux and HW emulation

Vitis for DC and Vitis HLS

Vitis provides additional reporting support for the dynamic region generation process; flow reporting enhancements include three new or updated reports
Vitis improves PL profiling with the choice of offloading trace to memory resources (preferred) or to a FIFO in the PL for better performance
A new Timeline Trace Viewer, available after simulation, shows the runtime profile and lets users remain in the Vitis HLS GUI
Vitis HLS now supports a higher-level type of "smart" construct via the new performance pragma or the set_directive_performance command
Vitis Graph Library with L3 API enhancements (1 ms time saved per kernel call) for performance

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2022.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2022.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

New Genomics Accelerator Library added (L1, L2, and L3)
Graph Library, L3 enhancements for performance
Vitis Database Library, GQE Multi-Functional Kernel
New functions added in Vision Library
New functions and enhancements in the Vitis AIE Vision Library
Vitis AIE DSP library, FIR resampler supersedes FIR fractional interpolator
Vitis Codec Library new APIs, API jxlEnc, API ‘leptonEnc’, API ‘resize’, API ‘WebpEnc’

Vitis Data Compression Library

ZLIB Compress Improvement, Customized Octa-Core compression for 8KB solution
ZLIB Decompression Improvement, Customized IP for 8KB file size

Platform Capability Query Improvement
HBM Ease-of-Use Improvement, Ability to choose a specific S_AXI entry point to the HMSS for a kernel M_AXI, RAMA insertion supported from the configuration files

Vitis AI Engine Compiler

AI Engine Automated Stall/Deadlock Detection & Analysis in Hardware
Analyzing the Automated Status Output
Analyzing the Automated Status Output – Buffers
Analyzing the Manual Status Output in Hardware
Analyzing the Manual Status Output
AI Engine Event Trace Enhancements
External traffic generators AIEsim
AI Engine Profiling Improvements on HW
AI Engine support for Broadcast windows
Vitis AI Engine Compiler Enhanced Graph Programming Model
Vitis AI Engine Compiler - PLIO/GMIO in ADF Graphs

Vitis HLS

Analysis Enhancements, New Timeline Trace Viewer
Coding Style Enhancements, Array Partition support for Stream of Blocks type
Pragma Abstraction, New Performance Pragma (and directive)
Vitis Core “one liner”, Vitis HLS - New Timeline Trace Viewer, new PERFORMANCE pragma, Stream of Blocks support windows
New Viewer introduced
- Shows the runtime profile of all surviving functions in your design - i.e., those that get converted into modules
- Especially useful to see the behavior of dataflow regions after Co-simulation
- Native to Vitis HLS - No need to launch the xsim waveform viewer anymore (external tool)

Vitis Analyzer

Vitis Analyzer Improvement, Save/Restore Timeline Customization
Reporting Enhancement, report_qor_assessment, xclbin Clocking Information, Vivado Automation Summary
Profiling Enhancement, New PL profiling infrastructure enabled, Multiple trace_memory options can be added to insert multiple memory monitors (HW Only), Sample config file for v++ linker to offload trace data for all CUs in SLR0 to DDR0 and same for all CUs in SLR1 to DDR1

Vitis IDE

Updated Bootgen GUI for Versal
Toolchain Update
XSCT, Support STAPL, Add Linker script generation command
System Compile Flow, Refer to system compile doc

Vitis Emulation

Add Software Emulation support for Auto-restart and mailbox support for always running kernels
Free running kernel doesn’t need while(1) for sw-emu
Add Software Emulation support for external traffic generator
Hardware Emulation can use HLS C source code function model for Streaming IP.

Add API xrt::system for Probing number of devices
Add API xrt::message for Logging messages
XRT Native API host code now requires -std=c++17 or above
Add experimental xrt::queue APIs for asynchronous execution of synchronous operations
xbutil can show AIE FIFO counters that help to debug AIE deadlock scenarios
xbutil --legacy option is removed.
xclbinutil --info provides clock information for embedded platforms
xbutil on ARM can load SOM images
xbtop standalone utility to show Linux top-like output (replacing legacy xbutil -top)
XRT Utilities supports auto-completion in Bash with tab key.
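The xrt::queue entry above describes asynchronous execution of synchronous operations. The general pattern can be sketched in standard C++17, with no XRT dependency, as an in-order task queue that runs blocking calls on a worker thread and hands back futures. The `InOrderQueue` class below is a hypothetical stand-in written for this illustration; it is not the xrt::queue API and its names and signatures are assumptions.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>

// Hypothetical stand-in (NOT the XRT API): runs enqueued callables one at a
// time, in submission order, on a dedicated worker thread.
class InOrderQueue {
public:
    InOrderQueue() : worker_([this] { run(); }) {}

    ~InOrderQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();  // remaining tasks are drained before exit
    }

    InOrderQueue(const InOrderQueue&) = delete;
    InOrderQueue& operator=(const InOrderQueue&) = delete;

    // Wraps a synchronous callable and returns a future for its result.
    template <typename F>
    auto enqueue(F&& f) -> std::future<std::invoke_result_t<F>> {
        using R = std::invoke_result_t<F>;
        auto task =
            std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutting down, nothing left
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock so callers can keep enqueuing
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

A caller enqueues blocking operations (for example, a buffer sync followed by a kernel run), continues with other work, and calls `get()` on the returned futures only when the results are needed.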

Alveo Platform Updates, Platform Updates for improved stability, Card Management Updates, SC Firmware Update Tool
Embedded Platform, New VCK190 DFX Platform: xilinx_vck190_base_dfx_202210_1, Embedded Platforms are now installed with Vitis, Vivado adds a new Customizable Example Design: Vitis Platform for MPSoC

Major overhaul of the Vitis Model Composer hub block for scalability and ease of use
Hardware validation flow now supports Linux in addition to bare-metal
"AIE to HDL" and "HDL to AIE" blocks no longer include the HDL gateway blocks
2022.1 now ships with a snapshot of the examples for customers who do not have access to the internet. The tool will prompt the user to download a new revision of the examples from GitHub if available
For ease of use, utility blocks that are not part of code generation are now presented with a white background color
Enhanced and reorganized the library browser for ease of use
RHEL 8.x support
MATLAB Support - R2021a and R2021b

Vitis Software Platform 2021.2 Release Highlights:

New domain specific development environments
- Vitis™ Video Analytics SDK on Kria™ SOM, Alveo™ U30/U50, and VCK5000 Versal™ development card
- Vitis Blockchain solution on Varium™ C1100 card with Vitis libs
Full end to end flow support for VCK5000 and Varium C1100 cards
Enhanced core tool features
- Vitis AI Engine Compiler C/C++ high level abstraction API, Auto Pragma Inference, Area Group Constraints
- Vitis AI Engine x86simulator enhancements: Trace Report, Memory Access Violation and Deadlock Detection
- Vitis HLS EoU, Timing and QoR enhancement, HLS APIs for user-controlled burst inferencing
- Enhanced Vitis Analyzer for better timeline trace report, data visualization, stall analysis
- Vitis XRT for AI Engine Multiple Process and Multi Thread Support for AI Engine graph control
- Vitis IDE & Emulation support AI Engine Trace, SW Emulation for AI Engine applications
39 new C/C++ library functions in diverse domains covering DSP, Data Analytics, Vision, Compression, Database, Graph, Security, …, for a total of over 1000 library functions
Vitis Model Composer
- 3x compile/simulation time, 7x compilation time reduction with Parallel Compilation
- New Hardware Validation Flow and Enhanced Functional Co-simulation

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2021.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2021.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.

Library	2021.1	2021.2	New functions in 21.2
xf_blas	167	167	0
xf_codec	3	3	0
xf_DataAnalytics	33	36	3
xf_database	62	65	3
xf_compression	78	93	15
xf_dsp	94	96	2
xf_graph	53	59	6
xf_hpc	37	37	0
xf_fintech	116	116	0
xf_security	135	140	5
xf_solver	11	11	0
xf_sparse	11	11	0
xf_utils_hw	55	57	2
xf_opencv	147	150	3
total	1002	1041	39

Note: For vision, just count the number of sub folders in L*/tests, because each API has multiple tests for different types

Vitis Vision Library

Programmable Logic (PL)
- End-to-end Mono Image Processing (ISP) with CLAHE TMO
- RGB-IR along with RGB-IR Image Processing (ISP) pipeline
- Global Tone Mapping (GTM) along with an ISP pipeline using GTM

New Features	Cat	Customer/Strategic	Segments	Description
RGB-IR	ISP	Seeing Machines	Automotive, ISM	•Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera
Mono (CCCC)	ISP	Strategic	Automotive, ISM, A&D	•Machine vision •Low light applications
Global Tone Mapping (GTM)	ISP	Strategic	Automotive, ISM, A&D	•Improved dynamic range and contrast •Lower cost version compared to local tone mapping (LTM)
Dense Optical Flow TV-L1	CV	NTT	ISM	•Improved robustness (against illumination, noise, occlusions) for optical flow

AI Engine (AIE)

BlobFromImage
Back to back filter2D with batch size three support

New Features	Cat	Customer/Strategic	Segments	Description
RGB-IR	ISP	Seeing Machines	Automotive, ISM	•Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera
ML+X	ISP	Strategic	Automotive, ISM, A&D	•ML inference pre-processing
Gaussian Pyramid	CV	Strategic	Automotive, ISM, A&D	•Fundamental for multi-scale image processing
Box Filter	CV	Strategic	Automotive, ISM, A&D	•Fundamental for smoothing, low pass filter

Vitis Data Analytics Library

Vitis Blockchain Solution based on Vitis libraries
- Out-of-Box Mining solutions for Ethereum
- Open-Source & easy to use and deploy with Vitis Libs using C++
- Flexible & Scalable with Vitis Libs
- Be flexible to mine multiple coins
- Customize and compile into hardware
- Highly optimized design
Adding CSV parser API into library
- The CSV parser can parse comma-separated value files and generate an object stream that can easily be connected with DataFrame APIs

Vitis Graph Library

New L2 libraries added
Louvain with renumber
Renumbering
The ‘weight’ feature is supported for Cosine Similarity

Vitis Database Library

GQE now supports asynchronous input/output, along with multi-card support.
- Asynchronous support allows the FPGA to start processing as soon as part of the input data is ready.
- Multi-card support allows identifying multiple Alveo cards that are suitable for the work.

Vitis Data Compression Library

ZSTD Multi-Core Compression
- Created new ZSTD multi-core architecture and provided >1GB/s throughput using quad-core.
ZSTD Decompress optimization
- ZSTD decompress optimized for performance (increased by 20%) and resource (reduced < 30%)
GZIP/ZLIB Stream Core Improvement for IBM
- Customized Static & Dynamic compress streaming IP (4KB & 8KB)
- Added functionality to provide compressed size in TUSER port
GZIP/ZLIB Decompress Improvement for IBM
- Optimized Huffman decoder to reduce latency to < 1.5K cycles
- Reduced resources significantly to 6.9K (previously > 9K)
- Added ADLER32 checksum functionality
GZIP System Compiler PoC
- Created a System Compiler PoC for GZIP Compress solution and benchmarked against OpenCL Host.
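
For reference, the ADLER32 checksum mentioned above for the ZLIB decompress path can be sketched in software as follows. This is the plain Adler-32 definition used by zlib streams, written as an illustration only; it says nothing about how the hardware core computes it, and the function name `adler32` is chosen here for the example.

```cpp
#include <cstdint>
#include <string>

// Reference Adler-32: two running sums modulo 65521, combined as (b << 16) | a.
std::uint32_t adler32(const std::string& data) {
    const std::uint32_t kMod = 65521;  // largest prime below 2^16
    std::uint32_t a = 1;
    std::uint32_t b = 0;
    for (unsigned char byte : data) {
        a = (a + byte) % kMod;  // sum of all bytes, plus one
        b = (b + a) % kMod;     // sum of the intermediate values of a
    }
    return (b << 16) | a;
}
```

A software model like this is handy for checking the checksum a decompress core reports against a host-side value.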

Vitis DSP Library

DSPLib on GitHub since 2021
Fast Fourier Transform (FFT/iFFT)
- Point size increase to 32k (data type dependent)
- Support for stream API as well as window API.
- Parallel Power (0-4)
  - Allows higher throughput and extends range of supported point sizes

FIR Filters
- Initial Stream support for Single Rate asymmetric / symmetric FIR

DDS/Mixer
- New library unit in 2021.2

Vitis Security Library

KECCAK-256 (hash function) and CRC32C (checksum function) are released
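
As a point of comparison for the CRC32C checksum, here is a minimal bitwise software reference. It assumes the usual CRC-32C (Castagnoli) parameters: reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF. The function name is illustrative and is not the Vitis Security Library API.

```cpp
#include <cstdint>
#include <string>

// Bit-at-a-time CRC-32C (Castagnoli), reflected form.
std::uint32_t crc32c(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // If the low bit is set, shift and fold in the polynomial.
            const std::uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return ~crc;
}
```

The bitwise form is slow but easy to audit, which makes it a reasonable golden model when validating an accelerated implementation.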

Vitis Utilities Library

Two Data-Mover implementations are added for debugging hardware issues.
- LoadDdrToStreamWithCounter: For loading data from PL’s DDR to the AI Engine through an AXI stream and recording the count of data sent to the AI Engine.
- StoreStreamToMasterWithCounter: For receiving data from the AI Engine through an AXI stream and saving it to PL’s DDR, as well as recording the count of data sent to DDR.

AI Engine API

Implemented as a C++ header-only library that provides types and operations that get translated into efficient AI Engine intrinsics.
Provides parametrizable data types that enable generic programming
Implements most common operations in a uniform way for different data types
Transparently translates higher-level primitives into optimized AI Engine intrinsics
Improves portability across AI Engine architectures

AI Engine API will be the lead method for AI Engine kernel programming

High Level Optimizations

AI Engine compiler optimization options

--xlopt=0, no optimization applied.
--xlopt=1, automatic computation of heap size, guidance generation from LLVM IR analysis.
--xlopt=2, automatic inlining, loop peeling for unrolled loops, pragma insertion.

Introducing --xlopt=2 to improve performance, default remains --xlopt=1

Automatic inline
- Automatically inlines functions if it is practical and possible to do so, even if the functions are not declared as __inline or inline
Automatic pragma insertion
- Inserts pragmas into kernel code automatically (see Pragma Inference)

Pragma Inference

Necessary for optimizing the kernels

Relieves users of the responsibility of adding effective and correct chess pragmas

Supports auto-inference of five pragmas in 2021.2

for performance:
- chess_prepare_for_pipelining for innermost loop, and outer loops with known trip count
- chess_loop_range for loops with known trip count
- chess_unroll_loop/chess_flatten_loop for innermost loops with known trip count
for correctness:
- chess_unroll_loop_preamble when trip count is not a multiple of unroll factor
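
To see why a preamble matters when the trip count is not a multiple of the unroll factor, consider the hand-unrolled loop below. It is plain C++ with no chess pragmas, and it is only an analogy for the situation the inferred pragma addresses; it does not show what the AI Engine compiler actually emits. The function name and the unroll factor of 4 are chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Sums a vector with a manual unroll factor of 4. The preamble loop consumes
// the n % 4 leftover iterations first, so the unrolled body always runs on a
// multiple of 4 elements and never reads past the end.
long sum_unrolled_by_4(const std::vector<int>& v) {
    const std::size_t n = v.size();
    const std::size_t leftover = n % 4;
    long total = 0;
    std::size_t i = 0;
    for (; i < leftover; ++i) {   // preamble: remainder iterations
        total += v[i];
    }
    for (; i < n; i += 4) {       // unrolled body: 4 elements per pass
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    return total;
}
```

Without the preamble, a 10-element input would make the unrolled body touch elements 10 and 11, which is the kind of correctness problem the source attributes to chess_unroll_loop_preamble.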

Updated Graph Programming Model PLIO and GMIO

Model Changes Include:

Changes to usage of “simulation::platform”
Interaction with PLIO/GMIO objects in the graph, position determines input/output.
Changes of global PLIO/GMIO objects in the graph.
Changes around graph connect<> statements.

PLIO/GMIO in ADF Graphs

Current

Write PLIO, GMIO, simulation::platform, and connections at global scope

GMIO gm0("GMIO_In0", 64, 1);
GMIO gm1("GMIO_In1", 64, 1); … GMIO gm7("GMIO_In7", 64, 1);
PLIO pl0("PLIO_Out0", plio_32_bits, "data/output0.txt", 250.0);
PLIO pl1("PLIO_Out1", plio_32_bits, "data/output1.txt", 250.0); … PLIO pl7("PLIO_Out7", plio_32_bits, "data/output7.txt", 250.0);

simulation::platform<8,8> plat(&gm0, &gm1, …, &gm7, &pl0, &pl1, …, &pl7);

subgraph g;

connect<> net0(plat.src[0], g.in[0]);
connect<> net1(plat.src[1], g.in[1]); …
connect<> net7(plat.src[7], g.in[7]);
connect<> net8(g.out[0], plat.sink[0]);
connect<> net9(g.out[1], plat.sink[1]);
…
connect<> net15(g.out[7], plat.sink[7]);

Alternative method

Create a top-level graph and move PLIO, GMIO, and connections inside
Allow managing connections within for loop

// Assumes "using namespace adf;" and that subgraph is defined elsewhere.
class topgraph : public graph
{
public:
    input_gmio gm[8];
    output_plio pl[8];
    subgraph sg;

    topgraph()
    {
        for (int i = 0; i < 8; i++)
        {
            gm[i] = input_gmio::create("GMIO_In" + std::to_string(i), 64, 1);
            pl[i] = output_plio::create("PLIO_Out" + std::to_string(i), plio_32_bits, "data/output" + std::to_string(i) + ".txt", 250.0);
            connect<>(gm[i].out[0], sg.in[i]);
            connect<>(sg.out[i], pl[i].in[0]);
        }
    }
};

topgraph g;

Area Group Constraints Improvements

Ability to use flags in the ADF graph or constraints file to control the mapper and router

-contain_routing – when specified true ensures all routing, including nets between nodes contained in the nodeGroup, is contained within the area group.
-exclusive_routing - when specified true ensures all routing, excluding nets between nodes from the nodeGroup, is excluded from the area group.
-exclusive_placement - when specified true prevents all nodes not included in the nodeGroup from being placed within the area group bounding box.

Snapshots

Snapshots are text files containing comments and data for all kernel ports

streams, packet streams, cascade streams
windows, buffer
RTP

Also includes all platform ports

PLIO, GMIO, RTP

Allows users to inspect data traffic at kernel ports without using the debugger and without requiring instrumentation of kernel code

Deadlock Detection

Detects deadlocks in x86 simulations whether this situation arises from insufficient input data, or an imbalanced FIFO depth on a re-convergent path
The stop-on-deadlock feature must be enabled during x86 simulation by specifying option --stop-on-deadlock
If the simulation is stopped because of a deadlock, the error message indicates that you should rerun with option -trace --timeout

Memory Access Violation Detection

Integration with Valgrind for Memory Access Violation Detection

Detect
- out-of-bounds read and write
- read of uninitialized memory
No specific flag required for compilation
Simulation flags can be either
- --valgrind : simulation runs as usual and valgrind displays a report
- --valgrind-gdb : same thing but with gdb debug at the same time

Trace report

A deadlock results in poor simulation output and makes the origin of the bug difficult to analyze

X86 simulation trace option allows the simulator to log various timestamped information:

Start/End of Kernel iterations
Start/End of Stream stalls
Start/End of lock stall

Timestamps differ between x86 simulation and AI Engine simulation

User Controlled Burst Inference

For use cases that do not satisfy the automatic burst inference of the Vitis HLS tool, users can adopt the newly introduced manual burst optimization
A new class, 'hls::burst_maxi', supports manual control of burst behavior. New HLS APIs are provided for use with the new class
Users need to understand the AMBA AXI protocol and hardware transaction-level modeling in HLS design

Timing and QoR Enhancements

Provide support for user to input high level throughput constraints
Improve HLS timing estimation accuracy. When HLS reports timing closure, the RTL synthesis in Vivado should also expect to meet timing

EoU Enhancements

Add interface adaptors report in the C synthesis reports

Users need to know the resource impact that interface adaptors have on their design
Interface adaptors have variable properties that impact design QoR
Some of these properties have associated user controls which should be reported to users
Text version of bind_op and bind_storage reports are provided

Add new section in synthesis report to show list of pragmas and warnings on pragmas

Users can easily see which of the pragmas they added have issues.

Analysis and Reporting Enhancements

The Function Call Graph Viewer has some new features

New mouse drag based zoom in and out capability
New Overview feature that shows the full graph and allows the user to zoom in on parts of the overall graph
All functions and loops are shown along with their simulation data

A new Timeline Trace Viewer is now available after simulation. This viewer shows the runtime profile of your design and allows the user to remain in the Vitis HLS GUI.

Link Summary Enhancement

Provide clock frequency information for the AI Engine, platform and compute units
Provide a new table called Clocks in system diagram and platform diagram

Platform Export Enhancement

XSA export from Vivado does not require source files to be local to the project
XSA export from Vivado does not change the project structure
Package the IPs that are used in the hardware platform project instead of packaging the whole IP repo

AI Engine application emulation enhancements

Provide support for external testbench integration with aiesimulation
Provide support for external testbench integration with x86simulation
Support for GDB debugging with x86simulation
Provide support for snapshots of the data between kernels in a graph for x86simulation
Provide support for access violation checking to x86sim
Provide support for stop on deadlock to x86sim

Support AI Engine Trace

Support SW Emulation for AI Engine applications

Support external traffic generator in Verilog / System Verilog

Extend Profiling Monitor insertion to Monitor Memory

Currently the profiling monitor logic can be inserted on a kernel/CU port basis. This feature gives users the option to insert monitor logic directly on the memory interface
The visualization of memory bandwidth achieved directly on the memory interfaces can be reflected in profile summary report
DDR memory and PLRAM are supported
Hardware flow is supported
To enable this feature, both the linking phase and XRT need to be set up
- memory=all
- data_transfer_trace= coarse|fine or
- opencl_device_counter=true

A vadd example that enables memory interface monitoring
- A new table ‘Memory Bank Data Transfer’ is included

Vitis Analyzer Enhancements

Generic profile summary report generated for non-OpenCL applications

Provide the same level of support for XRT API and HAL API applications.
Users select which types of reports they want to create, and the tool automatically generates and visualizes them in Vitis Analyzer

Add OpenCL commands to PL event timeline

Profiling adds overhead; XRT provides the capability to dump the OpenCL events on the timeline trace without overhead.
Vitis Analyzer can process the XRT output and show it in timeline trace view.
xocl_debug=true needs to be set in the xrt.ini.

Flatten signal hierarchy in timeline trace report

By default, the timeline trace report displays the signal trace hierarchically
Vitis Analyzer provides the capability of flattening the hierarchy by toggling the “Flatten Signal” symbol
Comparing the waveform is supported for flattened timeline trace

Vitis Analyzer – Data Visualization

Display input/output data to AI Engine kernels in an AI Engine design
- Helps debug AI Engine designs to show input/output data along with timeline
Works with aiesimulator
Supports
- Window/stream/cascade data types
- Packet streams
- Templated kernels
- data-dump utility

Vitis Analyzer – AI Engine Stall Analysis

Vitis Analyzer provides visualization capabilities that enable users to identify the root cause of stalls
Support
- Performance Metrics
- Lock Stall Analysis
- Stream Stall Analysis
- Cascade Stall Analysis
- Memory Stall Analysis
Support Flow
- aiesimulator
- HW emulation

Xilinx Runtime Library (XRT):

XRT API
- The XRT native API supports user managed kernel control with xrt::ip
XRT Utilities
- The new xbutil and xbmgmt tools are now the default
  - To use the legacy utilities, please use xbutil --legacy or xbmgmt --legacy with legacy sub-commands
- New utility, xball
  - Apply xbutil or xbmgmt commands to all or a filtered part of the installed data center cards. Check xball --help for details
- A new command, xbutil configure
  - Allows you to enable, disable, or configure the PCIe Host Memory and PCIe Peer to Peer features. See the XRT documentation for more details
- All XRT utilities now globally support the --force option to skip user interactive confirmation
Profiling
- A profile summary report is generated when any profiling option is enabled.
- All applicable summary tables and guidance are generated based on the profiling options enabled in the xrt.ini file
- New data transfer summary table for aggregate information on a memory resource when monitors are added to memory resources in the design
- New AIE profiling metric sets to count different AIE events including (1) floating point exceptions in AIE, (2) tile execution counts, and (3) stream puts and gets
Embedded
- zocl memory manager improvements to support any sptag

Vitis XRT for AI Engine Multiple Process Support

C and C++ APIs to define access modes for multiple processes to share access to the same AI Engine array and graphs.
- Protect the AI Engine array and graphs from unwanted access.
Three modes are supported for opening AI Engine array & graphs
- Exclusive Mode (prevents any other process from accessing)
- Primary Mode (allows other processes only nondestructive access)
- Shared Mode (performs only nondestructive access)
Consider these modes when multiple-process support is needed. For example:
- Prevent others from accessing the AI Engine array (exclusive access)
- Allow multiple users to control different graphs separately (multiple application support)
- Allow one primary user to control a graph while others probe the running status (primary & shared access)

Vitis XRT for AI Engine Support Status

C and C++ APIs

C version API
- For AI Engine array:
  - xrtAIEDeviceOpenExclusive (Exclusive mode)
  - xrtAIEDeviceOpen (Primary mode)
  - xrtAIEDeviceOpenShared (Shared mode)
- For AI Engine graph:
  - xrtGraphOpenExclusive (Exclusive mode)
  - xrtGraphOpen (Primary mode)
  - xrtGraphOpenShared (Shared mode)
C++ version API
- xrt::aie::device class supports access mode in constructor
  - enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2 };
- xrt::graph class supports access mode in constructor
  - enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2, none = 3 };
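
The interaction between the three modes can be sketched as a small arbitration rule. The code below is a standalone model written from the mode descriptions above only; `can_open` is hypothetical and is not an XRT function, and the checks XRT actually performs may differ. The enum is redeclared locally so the example compiles without XRT headers.

```cpp
#include <cstdint>
#include <vector>

// Local copy of the enumerators listed above for xrt::aie::device.
enum class access_mode : std::uint8_t { exclusive = 0, primary = 1, shared = 2 };

// Hypothetical rule: given the modes already held by other processes, decide
// whether a new open request could be granted.
bool can_open(const std::vector<access_mode>& holders, access_mode requested) {
    for (access_mode held : holders) {
        if (held == access_mode::exclusive) {
            return false;  // an exclusive holder admits nobody else
        }
        if (requested == access_mode::exclusive) {
            return false;  // exclusive needs a resource nobody has opened
        }
        if (held == access_mode::primary && requested == access_mode::primary) {
            return false;  // a primary holder allows only shared access by others
        }
    }
    return true;
}
```

Under this reading, one primary process can control a graph while any number of shared processes probe its status, which matches the last use case listed above.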

Access latest Vitis Target Platforms for Alveo Cards and refer to the Getting Started section of the Accelerator Card.
Download Vitis and refer to the Alveo Packages section

AI Engine DSP Library – New Blocks

AIE DDS
AIE Mixer

Parallel Compilation

Reduced times vs. 2021.1 (As an example, the following numbers are for the 200 MHz TX Chain):
Time to compile and simulate reduced by factor of 3
Compilation times reduced by a factor of 7
Dead time after simulation reduced from 25s to ~0s

Constraint Editor Enhancement

2021.2 Improved Navigation

To Fixed Size Improvements

To Variable Size Block Improvements

Enhanced Functional Co-simulation Capabilities

Export Matlab data for AI Engine input – xmcVitisWrite
Import AI Engine Data into Matlab – xmcVitisRead

Others

Import an AI Engine or HLS Kernel block with no input (Source block)
New Data Type Support
- the Simulink native int64 and uint64 for AI Engine development instead of AMD data types, x_sfix64 and x_ufix64.
- accfloat and caccfloat for AI Engine Development
Support for Ubuntu 20.04
Support for MATLAB 20a, 20b, 21a (No support for MATLAB 21b)
Addition of new examples
- Dual stream SSR filter example with 64 kernels
- Pseudo inverse(64x32) – commslib example.
Use xmcLibraryPath command to point to a custom DSPLib location.
Many more enhancements and bug fixes

Vitis Software Platform 2021.1 Release Highlights:

AMD Kria System-on-Modules (SOMs) KV260 vision AI starter kit support. The full Vitis flow for ML (DPU inference engine) + X (RTL kernel and Vitis HLS based computer vision kernels).
Support for new C/C++ Vision, DSP, Graph (Louvain Modularity), Codec in image processing, compression (GZIP, Facebook ZSTD, ZLIB whole application acceleration) performance-optimized libraries on FPGA and/or Versal ACAP over CPU/GPUs
Enhanced Vitis™ core development kit design flow on Versal ACAP devices: visualization improvements for AI engine design trace report, AI engine event tracing via GMIO, incremental recompile, new boot image wizard, and encrypted AI engine source file support
The new Vitis Model Composer tool enables rapid design exploration and verification within the MathWorks MATLAB and Simulink® environment, enabling co-simulation of blocks targeting AI Engines and Programmable Logic, code generation, and test bench creation.
New Vitis HLS Flow Navigator GUI for quick access to flow phases and reports. Merge synthesis, analysis, and debug views into a general default context

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2021.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2021.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.

AIE DSP

DSPLib published as part of the Vitis Acceleration Library set on Github
DSPLib contains common parameterizable DSP functions used in many advanced signal processing applications. All functions currently support window interfaces with streaming interface support.

FIR Filters

Function	Namespace
Single rate, asymmetrical	dsplib::fir::sr_asym::fir_sr_asym_graph
Single rate, symmetrical	dsplib::fir::sr_sym::fir_sr_sym_graph
Interpolation asymmetrical	dsplib::fir::interpolate_asym::fir_interpolate_asym_graph
Decimation, halfband	dsplib::fir::decimate_hb::fir_decimate_hb_graph
Interpolation, halfband	dsplib::fir::interpolate_hb::fir_interpolate_hb_graph
Decimation, asymmetric	dsplib::fir::decimate_asym::fir_decimate_asym_graph
Interpolation, fractional, asymmetric	dsplib::fir::interpolate_fract_asym::fir_interpolate_fract_asym_graph
Decimation, symmetric	dsplib::fir::decimate_sym::fir_decimate_sym_graph

FFT/iFFT - The DSPLib contains one FFT/iFFT solution. This is a single channel, single kernel decimation in time, (DIT), implementation with configurable point size, complex data types, cascade length and FFT/iFFT function.

Function	Namespace
Single Channel FFT/iFFT	dsplib::fft::fft_ifft_dit_1ch_graph
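
For readers unfamiliar with the decimation-in-time structure, the following is a minimal software sketch of a recursive radix-2 DIT FFT using `std::complex<double>`. It illustrates the algorithm family only and is not the DSPLib kernel: the function name `fft_dit`, the double-precision type, and the power-of-two size restriction are choices made for this example.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Recursive radix-2 decimation-in-time FFT. x.size() must be a power of two.
// With inverse = true the result is the unscaled iFFT (divide by N afterwards).
std::vector<cplx> fft_dit(const std::vector<cplx>& x, bool inverse = false) {
    const std::size_t n = x.size();
    if (n <= 1) {
        return x;
    }
    std::vector<cplx> even(n / 2);
    std::vector<cplx> odd(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k) {  // split by sample index parity
        even[k] = x[2 * k];
        odd[k] = x[2 * k + 1];
    }
    const std::vector<cplx> e = fft_dit(even, inverse);
    const std::vector<cplx> o = fft_dit(odd, inverse);

    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> y(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        // Twiddle factor exp(sign * j * 2*pi*k / n)
        const double angle = sign * 2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        const cplx w = std::polar(1.0, angle);
        y[k] = e[k] + w * o[k];          // butterfly, upper half
        y[k + n / 2] = e[k] - w * o[k];  // butterfly, lower half
    }
    return y;
}
```

The same function covers both directions, mirroring the configurable FFT/iFFT selection described above.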

Matrix Multiply (GeMM) - The DSPLib contains one Matrix Multiply/GEMM (GEneral Matrix Multiply) solution. This supports the Matrix Multiplication of 2 Matrices A and B with configurable input data types resulting in a derived output data type.

Function	Namespace
Matrix Mult / GeMM	dsplib::blas::matrix_mult::matrix_mult_graph

Widget Utilities - These widgets convert between windows and streams on the input to a DSPLib function, and between streams and windows on its output, where desired. An additional widget converts between real and complex data types.

Function	Namespace
Stream to Window / Window to Stream	dsplib::widget::api_cast::widget_api_cast_graph
Real to Complex / Complex to Real	dsplib::widget::real2complex::widget_real2complex_graph

DSP Library functions are supported in Vitis Model Composer, enabling users to easily plug these functions into the Matlab/Simulink environment to ease AI Engine DSP Library evaluation and overall AI Engine ADF graph development.

Vitis HPC Library release introduces HLS primitives, prebuilt kernels and software APIs for HPC applications on FPGAs. These applications are:
- 2D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel and backward kernel
- 3D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel
- MLP (Multi-Layer Perceptron) components: activation functions and fully connected network kernels
- PCG (Preconditioned Conjugate Gradient) Solvers for both dense matrix and sparse matrix

First release of selected vision functions for Versal AI Engines:
Functions available
- Filter2D
- absdiff
- accumulate
- accumulate_weighted
- addweighted
- blobFromImage
- colorconversion
- convertscaleabs
- erode
- gaincontrol
- gaussian
- laplacian
- pixelwise_mul
- threshold
- zero
xfcvDataMovers : Utility datamovers to facilitate easy tiling of high resolution images and transfer to local memory of AI Engines cores. Two flavors
- Using PL kernel : higher throughput at the expense of additional PL resources.
- Using GMIO : lower throughput than PL kernel version but uses Versal NOC (Network on chip) and no PL resources.
New Programmable Logic (PL) functions and features
ISP pipeline and functions:
- Updated 2020.2 Non-HDR Pipeline
  - Support to change few of the ISP parameters at runtime: gain parameters for red and blue channels, AWB enable/disable option, gamma tables for R,G,B, %pixels to compute min&max for awb normalization.
  - Gamma Correction and Color Space conversion (RGB2YUYV) made part of the pipeline.
- New 2021.1 HDR Pipeline : 2020.2 Pipeline + HDR support
  - HDR merge for 2 exposures which supports sensors with digital overlap between short exposure frame and long exposure frame.
    - Four Bayer patterns supported : RGGB,BGGR,GRBG,GBRG
  - HDR merge + isp pipeline with runtime configurations, which returns RGB output.
  - Extraction function : HDR extraction function is preprocessing function, which takes single digital overlapped stream as input and returns the 2 output exposure frames(SEF,LEF).
- 3DLUT : provides input-output mapping to control complex color operators, such as hue, saturation, and luminance.
- CLAHE: Contrast Limited Adaptive Histogram Equalization is a method which limits the contrast while performing adaptive histogram equalization so that it does not over-amplify the contrast in near-constant regions. This also reduces the problem of noise amplification.
Flip : Flips the image along horizontal and vertical line.
Custom CCA : Custom version of the Connected Component Analysis algorithm for defect detection in fruits. Apart from computing the defective portion of a fruit, it computes the defective pixels as well as the total fruit pixels
Canny updates : Canny function now supports any image resolution.

Library Related Changes

All tests have been upgraded from using OpenCV 3.4.2 to OpenCV 4.4
Added support for Versal Edge series (VCK190)
A new benchmarking section with benchmarking collateral for selected pipeline/functions published.

The 2021.1 release provides Two-Gram text analytics:
- Two Gram Predicate (TGP) is a search of the inverted index with a term of 2 characters. For a dataset that established an inverted index, it can find the matching id in each record in the inverted index.
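
The idea of searching an inverted index with a two-character term can be sketched in a few lines. The code below is a simplified software model, not the library's kernel or API: the names `InvertedIndex`, `build_two_gram_index` and `two_gram_search` are hypothetical, and record ids are simply positions in the input vector.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps every 2-character term to the ids of the records that contain it.
using InvertedIndex = std::map<std::string, std::set<std::size_t>>;

InvertedIndex build_two_gram_index(const std::vector<std::string>& records) {
    InvertedIndex index;
    for (std::size_t id = 0; id < records.size(); ++id) {
        const std::string& text = records[id];
        // Slide a 2-character window across the record.
        for (std::size_t pos = 0; pos + 2 <= text.size(); ++pos) {
            index[text.substr(pos, 2)].insert(id);
        }
    }
    return index;
}

// Returns the ids of records containing the term, in ascending order.
std::vector<std::size_t> two_gram_search(const InvertedIndex& index,
                                         const std::string& term) {
    const auto it = index.find(term);
    if (it == index.end()) {
        return {};
    }
    return std::vector<std::size_t>(it->second.begin(), it->second.end());
}
```

For example, indexing the records "alveo", "versal" and "vitis" and searching for "ve" yields ids 0 and 1.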

Community Detection: Louvain Modularity
2-Hop Search

N/A

Adds double-precision SpMV (Sparse Matrix dense Vector multiplication) implementation with L2 kernels
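
As a reminder of the operation itself, a double-precision SpMV can be written as the short reference loop below. The CSR (compressed sparse row) layout is an assumption made for this sketch, since the release notes do not state the storage format the L2 kernels use, and the names `CsrMatrix` and `spmv` are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Sparse matrix in CSR form (assumed layout for this example).
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // rows + 1 entries; row r spans [row_ptr[r], row_ptr[r+1])
    std::vector<std::size_t> col_idx;  // column of each stored value
    std::vector<double> values;        // nonzero values, row by row
};

// Computes y = A * x for a dense vector x.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            y[r] += a.values[k] * x[a.col_idx[k]];
        }
    }
    return y;
}
```

A host-side reference of this kind is typically what accelerated results are compared against, within a floating-point tolerance.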

In the 2021.1 release, GQE receives early-access support for the following features
- 64-bit join support: the gqeJoin kernel and its companion gqePart kernel have been extended to 64-bit key and payload, so that a larger scale of data can be supported.
- Initial Bloom-filter support: the gqeJoin kernel now ships with a mode in which it executes Bloom-filter probing. This improves efficiency on certain multi-node flows where minimizing data size in the early stage is important.
- Both features are offered now as L3 pure software APIs, please check corresponding L3 test cases.
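
The benefit of Bloom-filter probing before a join can be illustrated with a small software model: keys from the build side populate a bit array, and probe-side keys that certainly have no match are dropped early. Everything in this sketch is assumed for illustration (the class, the two hash functions, the filter size); it does not describe the gqeJoin implementation or the L3 API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Tiny Bloom filter with two hash functions. False positives are possible,
// false negatives are not.
class BloomFilter {
public:
    explicit BloomFilter(std::size_t bits) : bits_(bits, false) {}  // bits must be > 0

    void insert(std::uint64_t key) {
        bits_[hash1(key) % bits_.size()] = true;
        bits_[hash2(key) % bits_.size()] = true;
    }

    bool may_contain(std::uint64_t key) const {
        return bits_[hash1(key) % bits_.size()] && bits_[hash2(key) % bits_.size()];
    }

private:
    static std::uint64_t hash1(std::uint64_t k) {
        k *= 0x9E3779B97F4A7C15ull;
        return k ^ (k >> 32);
    }
    static std::uint64_t hash2(std::uint64_t k) {
        k ^= k >> 33;
        k *= 0xFF51AFD7ED558CCDull;
        return k ^ (k >> 33);
    }
    std::vector<bool> bits_;
};

// Keeps only the probe keys that might match a build-side key, preserving order.
std::vector<std::uint64_t> bloom_prefilter(const std::vector<std::uint64_t>& build_keys,
                                           const std::vector<std::uint64_t>& probe_keys,
                                           std::size_t bits) {
    BloomFilter filter(bits);
    for (std::uint64_t k : build_keys) {
        filter.insert(k);
    }
    std::vector<std::uint64_t> survivors;
    for (std::uint64_t k : probe_keys) {
        if (filter.may_contain(k)) {
            survivors.push_back(k);
        }
    }
    return survivors;
}
```

Because the filter never drops a true match, the surviving rows can be sent on to the full join; the smaller intermediate data set is where the multi-node saving described above would come from.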

GZIP Multi Core Compression:
- New GZIP Multi-Core Compress Streaming Accelerator, a purely stream-only solution (free-running kernel). It comes in variants supporting block sizes of 4KB, 8KB, 16KB and 32KB.
Facebook ZSTD Compression Core:
- New Facebook ZSTD Single Core Compression accelerator with block size 32KB. Multi-cores ZSTD compression is in progress (for higher throughput).
GZIP low latency Decompression:
- A new version of GZIP decompress with improved latency for each block, fewer resources (35% lower LUT, 83% lower BRAM) and improved FMax.
ZLIB Whole Application Acceleration using U50:
- L3 GZIP solution for the U50 platform, containing 6 compression cores to saturate the full PCIe bandwidth. It is provided with an efficient GZIP software solution to accelerate the CPU libz.so library, which provides seamless inflate and deflate API-level integration with end-customer software without recompiling.
Versal platform support.

Add AIE Support - See above

The 2021.1 release provides support for:
- RIPEMD160
- Initial support for BLS (not complete)

In the 2021.1 release, Data-Mover is added to this library. Unlike other C++ based APIs, this addition is targeting people less experienced in HLS based kernel design and just want to test their stream-based designs. The Data-Mover is actually a kernel source code generator, creating a list of common helper kernels to drive or validate designs, like those on AIE devices.

Produce QoR metrics (Vitis QoR Generation API)
- Cycles taken by the application kernel
- Stall cycles (computed from VCD file)
- Measure overhead cycles in the wrapper (time spent in other functions than the kernel itself)
- Throughput
3 levels of optimization XLOPT=0, 1 (default), 2
New functionalities for xlopt=2:
- loop fusion, flatten single iteration outer loops, enhance loop peeling heuristics
Analyze "__restrict" usage and give guidance
Incremental recompile: when the graph does not change, recompile only kernels that have been modified
Packet Switched data → up to 32-split (was limited to 4)
New DMA FIFO location constraint (mapper/router changes between releases do not impact performance)
Use mapping solution as a constraint in the new compilation: prevent future mapping variations that impact performance
Bring x86sim feature support to aiesim level
Start of deprecation of PL kernels in ADF graphs (complete deprecation in 2021.2)

New “Flow Navigator” in GUI for quick access to flow phases and reports. The contextual "synthesis, analysis, debug" views are merged into a general default context
New synthesis report section for the BIND_OP and BIND_STORAGE directives
A new post-synthesis text report reflects the information provided in the GUI synthesis report
The IP export and Vivado implementation run widgets have been redesigned with options to pass settings and constraint files to Vivado
New function call graph viewer to visualize functions and loops which can be highlighted with an optional heatmap to detect II, latency, or DSP/BRAM utilization hot spots
Versal timing calibration and new controls for DSP block native floating-point operations (the -precision option for config_op)
The Vitis HLS Migration guide (former UG1391) is now a chapter in UG1399
New methodology sections in user guide (UG1399 and web)
Alternate flushable pipeline option has been improved (free-running pipeline aka "frp")
In Vitis, a top port pointer can now simply be mapped onto the axi-lite adapter rather than a global memory
The aggregate directive now provides a "-compact bit" option for maximum packing
Adds back a "Leave Feedback" entry in Help menu with optional survey
Fixed bug for "Man Pages" tab not displaying information on some Linux systems
In Vitis, reshaping m_axi interfaces should be done via the hls::vector types
New customization options for s_axilite and m_axi data storage which can be "auto", "uram", "bram" or "lutram", allowing you to tweak RAM utilization in your design
In Vitis, introducing a new continuously (aka "never-ending") running mode for kernel
The axi_lite secondary clock option has been re-instated

Enhance support for RTL kernel packaging in Vivado IP packager
- public and productized feature with proper methodology and documentation.
- XRT managed kernel is the default flow.
Support encrypted AIE source files as input
- AIE compiler can accept encrypted AIE source file and v++ supports the rest of the flow.

Add Create Boot Image Wizard support for Versal devices
Multiple improvements for AI Engine programming and debugging
- Being able to turn on and off micro code labels
- Static Cross-probing between the source code and the microcode
- Full view of the microcode
- Bringing the last PC in the visible area whenever Pipeline view updates the data
- Aligning the Instruction data in Pipe line view
- Adding "Single Instruction Mode" action to disassembly view.
Be able to generate a default BIF file for a platform project
Program Flash for SD and eMMC adds raw mode support
In-context help messages are added to AI Engine development flow
Upgraded GCC toolchain version to 10.2

Users can emulate AXI-MM master/slave through an external process such as Python / C++. This may help users to emulate design with quick design time of AXI Master / Slave, without investing resources in developing AXI Master or VIP. AXI-MM Inter-process communication can also help to emulate the Chip-to-Chip connection between two FPGAs.
Enabling compilation of Versal models for VCS.
Platform developers can run hardware emulation on the platform with standalone applications to test the platform in the early stage.

User range profiling information and user event information are aggregated into profile summary report
Vitis Analyzer shows a critical timing path.
- Vitis Analyzer will display a simplified version of the Vivado GUI timing report, without the need to open a Vivado project or netlist. This allows users to quickly navigate to the failing timing path.
Vitis Analyzer multiple strategies support
- Results from multiple strategies run can be visualized in Vitis Analyzer.

New xrt.ini switches for profiling and debug
Reduce memory and loading time for large applications
- The new profile tool uses fewer resources when processing large CSV files, which reduces loading time and the occurrence of crashes.
PL continuous trace offloading improvement
- Use DDR or HBM as memory resource to store trace data
- Circular buffer support for large data offloading
- Trace buffer size and offloading interval can be set in xrt.ini
Improvements to the visualization of AIE design’s trace report
- All AIE inputs will be displayed(window, stream, cascaded stream, etc.)
- Support all IO data types

Stable native XRT API, with C++ APIs for AIE graph control and execution, Software Emulation and tracing support.
XRT provides new helper APIs to help users to move from OpenCL API to XRT native API in $XILINX_XRT/include/CL/cl2xrt.hpp.
XRT New API xrt::device.get_info() can extract device properties
Greatly improved next generation xbutil and xbmgmt utilities are now the default.
xbutil can report power status
xbmgmt supports runtime clock scaling and setting a user power threshold to protect the board and server.
sysfs, xbmgmt and xbutil can report MAC address of Alveo board
KDS scheduler in xocl has been refactored to significantly improve the throughput across hundreds of processes exercising multiple compute units across multiple devices concurrently. For legacy shells you may notice small percentage of throughput degradation. Please see the AR for proper solution.
XRT driver debug trace support through debugfs /sys/kernel/debug/xclmgmt/ and /sys/kernel/debug/xocl/

Vitis Target Platforms

Access the latest Vitis Target Platforms for Alveo Accelerator cards at www.xilinx.com/alveo. Please refer to the Getting Started section of the accelerator card you want to deploy your applications on.

Please refer to UG1120 - Alveo Data Center Accelerator Card Platforms User Guide for more details and to keep up-to-date on the latest Vitis Target Platform releases, as they become available.

New Platforms

Alveo U200 Gen3x16 XDMA 1RP
- Name: xilinx_u200_gen3x16_xdma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, DDR Self-Refresh
Alveo U50 Gen3x16 noDMA 1RP
- Name: xilinx_u50_gen3x16_nodma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, Clock Throttling

Vitis Embedded Platforms

VCK190 Base Platform enables ECC on DDR and LPDDR; constraints become concise.
MPSoC base platforms increased CMA size to 1536M. All Vitis-AI models can run with this CMA size.
Embedded platform creation flow gets simplified: Device Tree Generator can automatically generate a ZOCL node; XSCT can generate BIF files. Base platform source files are reduced.

Support for Kubernetes (K8s) clusters: Xilinx FPGA Resource Manager (XRM) can now be used together with Kubernetes to run and manage compute units (CUs) across a pool of Alveo accelerator cards attached to a server, and to scale applications to multiple servers with Alveo cards.

AI Engines

A comprehensive constraint editor enables users to specify any constraint for AI Engine kernels in Vitis Model Composer. The generated ADF graph will contain these constraints.
Addition of AI Engine FFT and IFFT blocks to the library browser.
Users now have access to many variations of AI Engine FIR blocks in the library browser.
Ability to specify filter coefficients using input ports for FIR filters.
Addition of two new utility blocks "RTP Source" and "To Variable Size".
Enhanced AIE Kernel import block now also supports importing templatized AI Engine functions.
Ability to specify AMD platforms for AI Engine designs in the Hub block.
Through the Hub block, users can relaunch Vitis Analyzer at any time after running AIE Simulation.
Users can now plot cycle approximate outputs and see estimated throughput for each output using Simulink Data Inspector.
Enhanced usability to import a graph as a block using only the graph header file.
Revamped progress bar with a cancel button.
Improved usability when importing an AI Engine kernel or simulating a design while the MATLAB working directory and the model directory are not the same.
New TX Chain 200MHz example.
New 2D FFT examples showcasing designs with HLS, HDL, and AI Engine blocks.
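
To make the "estimated throughput for each output" item above concrete, here is a minimal Python sketch of one generic way to estimate throughput from timestamped output samples. It is not the method Vitis Model Composer uses, which is not described here. The function name, the nanosecond timestamp unit, and the samples_per_entry parameter are assumptions made for illustration.

```python
def estimate_throughput_msps(timestamps_ns, samples_per_entry=1):
    """Estimate throughput in mega-samples per second (MSPS).

    timestamps_ns: arrival times of successive output entries, in
    nanoseconds, in increasing order (assumed unit, for illustration).
    samples_per_entry: samples carried by each entry (assumed parameter).
    """
    if len(timestamps_ns) < 2:
        raise ValueError("need at least two timestamps")
    span_ns = timestamps_ns[-1] - timestamps_ns[0]
    if span_ns <= 0:
        raise ValueError("timestamps must span a positive interval")
    # Count the samples that arrived after the first timestamp, so the
    # sample count and the measured interval cover the same window.
    samples = (len(timestamps_ns) - 1) * samples_per_entry
    # samples per nanosecond times 1e3 gives samples per microsecond (MSPS).
    return samples / span_ns * 1e3


if __name__ == "__main__":
    # Four entries spaced 4 ns apart: 3 samples over 12 ns.
    print(estimate_throughput_msps([0, 4, 8, 12]))
```

The first entry is excluded from the sample count because it only marks the start of the measurement window; counting it would overstate the rate for short traces.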

HDL

Simulation speed enhancements for SSR FIR (more than 10x improvement) and SSR FFT.
Simulation speed enhancement for memory blocks such as RAMs and FIFOs.
Questa Simulator updated with VHDL 2008 in the Black-box import flow

General

Vitis Model Composer now contains the functionality of AMD System Generator for DSP. Users who have been using AMD System Generator for DSP can continue development using Vitis Model Composer.
MATLAB Support - R2020a, R2020b & R2021a

Vitis Software Platform 2020.2 Release Highlights:

Vitis 2020.2 supports application acceleration and embedded software development for Versal ACAP Platforms
Vitis Core Development Kit now includes the AI Engine Compiler to compile C/C++ applications for Versal AI Engines. The AI Engine, part of the Versal AI Core Series, is a vector processor for compute-intensive applications.
Vitis HLS is the default for both accelerated-kernel compilation (Vitis) and the C/C++ to RTL IP creation flow (Vivado).
600+ FPGA-accelerated functions across 13 performance-optimized libraries. 2020.2 introduces the new Vitis HPC library for accelerating high-performance computing applications and several enhancements & additions to the Data Analytics, Graph, BLAS, Sparse, Security & Database libraries
Support for evaluating multiple implementation strategies for final FPGA binary creation & enhancements for easier RTL-kernel integration within Vitis applications
Other enhancements in this release include support for AI Engine application profiling, Git version control for Vitis projects, Vitis AI profiler data integration within Vitis Analyzer, and enhancements for emulation modes.
Add-on for MATLAB® and Simulink®: Unification of AMD Model Composer and System Generator for DSP. AI Engine is a new domain in the Add-on for MATLAB and Simulink.

What's New in Vitis 2020.2 Documents

Footnotes

Based on testing on August 10, 2023, across 1000 Vitis L2/L3 code library designs, with Vitis HLS release 2023.2 vs. Vitis HLS 2023.1. System configuration during testing: Intel Xeon E5-2690 v4 @ 2.6GHz CPU, 256GB RAM, RedHat Enterprise Linux 8.6. Actual performance will vary. System manufacturers may vary configuration, yielding different results. -VGL-04
The benchmark tests were performed on all 1208 Vitis L1 library C-code designs as of February 12th, 2023. All designs were run using a system with 2P Intel Xeon E5-2690 CPUs with CentOS Linux, SMT enabled, Turbo Boost disabled. Hardware configuration not expected to affect software test results. Results may vary based on software and firmware settings and configurations. -VGL-03
