The framework API makes it easy for an application to invoke an accelerated function. The simple API has three parameters: the source data (address/location); the accelerated action to be performed; and the destination (address/location) for the resulting data.
Meanwhile, the innovative FPGA framework logic implements all of the computer-engineering interface logic, data movement, caching, and pre-fetching work, leaving the programmer to focus on their accelerator functionality, or “action code”, on the FPGA. The framework takes care of retrieving the source data (whether it is in system memory, storage, networking, etc.) as well as sending the results to the specified destination. The programmer, writing in a high-level language such as C/C++ or Go, needs only to write their data transform, or “action code”. Framework-compatible compilers translate the high-level language to Verilog, which in turn is synthesized using Xilinx’s Vivado toolset.
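To make the division of labor concrete, the function below is the kind of plain C transform a programmer might write as action code: a simple byte-match count over a buffer. This is an illustrative sketch only; real SNAP actions follow the framework's specific interfaces, which are not reproduced here. Everything outside the function (getting the buffer to the FPGA, returning the count) is the framework's job.

```c
#include <stddef.h>

/* Illustrative "action code": the only logic the programmer writes is
 * the data transform itself -- here, counting occurrences of a byte in
 * a buffer. Data movement to and from the FPGA is handled by the
 * framework and does not appear in this code. */
static size_t count_matches(const unsigned char *data, size_t len,
                            unsigned char needle)
{
    size_t hits = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] == needle)
            hits++;
    return hits;
}
```

A framework-compatible compiler would translate a function like this to Verilog for synthesis, rather than compiling it to a CPU instruction stream.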
The CAPI infrastructure provides the technology and ecosystem foundation that enables datacenter applications to integrate with FPGA acceleration. The technology base has everything needed to support datacenter virtualization (for multiple simultaneous context calls), a threaded model (for programming ease), removal of device-driver overhead (a performance enabler), and an open ecosystem (for the masses to build upon).
But there is still a skills gap between the FPGA experts (computer engineers) and the programming experts working for most Independent Software Vendors (ISVs). CAPI SNAP bridges that gap by providing a simple API for calling an accelerated action, and programming methods for coding customized accelerated actions on the FPGA.
The simplicity of the API parameters is elegant and powerful. Not only can source and destination addresses be coherent system memory locations, but they can also be attached storage, network, or memory addresses. For example, if a framework card has attached storage, the application could source a large block (or many blocks) of data from storage, perform an action such as a search, intersection, or merge function on the data in the FPGA, and send the search results to a specified destination address (such as main system memory). This approach has two large performance advantages over the standard software method:
- By moving the compute (in this case, search, intersection, or merge) closer to the data, the FPGA has higher-bandwidth access to storage and the data does not need to travel as far. The data reaches the compute faster (up to 4x as fast) and with a lower power profile.
- The accelerated action on the FPGA is faster than the software search.