IBM FlashSystem® is focused on low-latency hardware data paths. IBM FlashCore®, the heart of IBM FlashSystem technology, implements a controller design that uses techniques such as health binning, heat segregation, read voltage shifting, and hard-decision error correction codes to lower both read and write amplification while providing consistently low-latency responses to a broad range of workloads. This technology has been delivered as part of the currently available IBM FlashSystem 900, the first all-flash array with inline hardware compression and 3D TLC. The latest release, in November 2017, used these techniques to deliver the new 3D TLC-based IBM FlashCore Modules.
The flash modules for FlashSystem 900 use a custom form factor that provides the flexibility to implement a unique controller design and imposes fewer restrictions on component placement, power, and cooling than standard form factors. These modules also allow customized control and communication interfaces, which make it possible to provide hardware data path protocols focused on a single solution. This reduces overhead in the communication protocol and takes advantage of owning both ends of the interface between the module and the system. As seen below, the card is laid out to allow for multiple NAND controllers and a front-end complex called the gateway that connects these controllers to the rest of the system. The gateway can be seen on the left side of the card, while the controllers are located on the right side of the card.
The controllers in this design are IBM’s custom controller intellectual property. Derived from the work that was acquired from Texas Memory Systems, IBM FlashCore customizes the controller designs for each class of NAND. IBM research teams focus on understanding the technology and characterizing how the technology reacts to different workloads and data patterns. Using that characterization, development teams drive changes to the controller design while mitigating the effects of retention, endurance, and the complexities of erase, programming, and read effects in the media. As the media characteristics evolve, the controller design adapts to these changes.
To take the IBM FlashCore architecture further and expand into other products, IBM FlashCore needed to move from a custom form factor to a common, standard form factor and a common interface. The form factor chosen was the 15mm 2.5” U.2 form factor. The interface chosen was the NVMe protocol over a PCIe interface.
Designing for a common form factor created new challenges. The current design relies on batteries in the system to provide the de-stage power for the custom flash modules. With the new IBM FlashCore Module, the design required that the de-stage function be wholly self-contained, which created a few immediate challenges to overcome. The first was that, until now, the module could not power itself after power was removed. The second was that, with an FPGA-based design, the power requirements of the controller were typically higher than with ASIC-based designs. Finally, the power draw of the solution made a DRAM-to-NAND de-stage impossible within the holdup period provided by the physical capacitance on the board.
To solve these challenges, IBM implemented an emerging persistent memory technology, STT-MRAM from Everspin. This allowed us to avoid de-staging the DRAM to NAND, allowed the FPGA to shut down within milliseconds of power loss, and allowed the design to rely on the available bulk capacitance.
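The holdup constraint described above can be illustrated with a short back-of-the-envelope calculation. The following is a minimal sketch, not a description of the actual board: the capacitance, voltages, power draw, DRAM size, and NAND write bandwidth are all hypothetical values chosen only to show why a full DRAM-to-NAND de-stage may not fit in a capacitor-backed holdup window while a shutdown lasting a few milliseconds can.

```python
def holdup_seconds(capacitance_f, v_start, v_min, load_watts):
    """Time a capacitor bank can feed a constant-power load.

    Usable energy is the difference in stored energy between the
    starting voltage and the minimum voltage the regulators tolerate:
    E = 1/2 * C * (v_start^2 - v_min^2).
    """
    usable_joules = 0.5 * capacitance_f * (v_start ** 2 - v_min ** 2)
    return usable_joules / load_watts


def destage_seconds(dirty_bytes, write_bytes_per_s):
    """Time needed to copy dirty DRAM contents to NAND."""
    return dirty_bytes / write_bytes_per_s


def fits_in_holdup(task_seconds, holdup):
    """True if the task completes before the capacitors run out."""
    return task_seconds <= holdup


# All values below are hypothetical, for illustration only.
CAP_F = 0.002              # 2 mF of bulk capacitance
V_START = 12.0             # supply rail voltage at power loss
V_MIN = 6.0                # lowest voltage at which the module still runs
LOAD_W = 15.0              # module power draw during shutdown
DRAM_DIRTY = 1 * 2 ** 30   # 1 GiB of volatile state to preserve
NAND_BW = 1 * 10 ** 9      # 1 GB/s sustained NAND write bandwidth
FAST_SHUTDOWN_S = 0.005    # a shutdown that completes in 5 ms

holdup = holdup_seconds(CAP_F, V_START, V_MIN, LOAD_W)
destage = destage_seconds(DRAM_DIRTY, NAND_BW)

print(f"holdup window:   {holdup * 1000:.1f} ms")
print(f"DRAM de-stage:   {destage * 1000:.1f} ms")
print(f"de-stage fits:   {fits_in_holdup(destage, holdup)}")
print(f"fast stop fits:  {fits_in_holdup(FAST_SHUTDOWN_S, holdup)}")
```

Under these assumed numbers, the holdup window is measured in milliseconds while the de-stage takes on the order of a second, which is the gap that keeping the critical state in persistent memory is meant to close.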
The MRAM area not only provides the design with memory-like access speeds and requires no de-stage on power loss, but also allows us to simplify some areas of the controller design that need persistent data sets. The previous controller used volatile memory and relied on a de-stage function to preserve the information. In the new implementation, the STT-MRAM is split into a number of functional areas, including a journal checkpoint area and heat segregation bins for the write cache streams. On power loss, the controller sends a final set of commands to the MRAM controller and closes all outstanding pages, preserving them through the power loss.
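The idea of a persistent region split into functional areas can be sketched as a toy model. This is an illustrative sketch only, not IBM's controller logic: the class names, the two-bin hot/cold split, the overwrite-count heuristic, the page size, and the use of a simple counter as the journal checkpoint are all assumptions made for the example.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4  # entries per page; hypothetical, kept small for clarity


@dataclass
class OpenPage:
    """A partially filled write-cache page held in persistent memory."""
    data: list = field(default_factory=list)
    closed: bool = False

    def is_full(self):
        return len(self.data) >= PAGE_SIZE


@dataclass
class MramModel:
    """Toy stand-in for the persistent region and its functional areas."""
    journal_checkpoint: int = 0  # sequence number of the last logged write
    bins: dict = field(
        default_factory=lambda: {"hot": OpenPage(), "cold": OpenPage()}
    )
    flushed: list = field(default_factory=list)  # pages handed off to NAND


class WriteCache:
    def __init__(self, hot_threshold=2):
        self.mram = MramModel()
        self.write_counts = {}          # per-address overwrite counter
        self.hot_threshold = hot_threshold

    def write(self, lba, payload):
        """Route a write to a heat bin based on how often lba is rewritten."""
        count = self.write_counts.get(lba, 0) + 1
        self.write_counts[lba] = count
        heat = "hot" if count >= self.hot_threshold else "cold"

        page = self.mram.bins[heat]
        page.data.append((lba, payload))
        self.mram.journal_checkpoint += 1

        # A full page is handed off and a fresh open page takes its place.
        if page.is_full():
            self.mram.flushed.append((heat, page.data))
            self.mram.bins[heat] = OpenPage()
        return heat

    def on_power_loss(self):
        """Close every open page in place; nothing is copied elsewhere."""
        for page in self.mram.bins.values():
            page.closed = True
        return self.mram  # in this model, the state that survives
```

In this model, a power-loss event only marks the open pages as closed. Because the bins and the checkpoint already live in the persistent region, no copy to another medium is needed, which mirrors the simplification the text describes.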
Physical constraints imposed by the standard form factor drove the need to innovate on the controller and FPGA designs, including integration of components that previously were separate. An MPSoC FPGA was key to delivering the new form factor. After working with Xilinx to understand what technology would be available in the time frame we needed, we selected the Zynq UltraScale+ MPSoC FPGA as the integration point for the architecture.
In previous designs, an external processor and multiple FPGAs were used to develop the solution. For the standard form factor design, a single FPGA now integrates the control processor, the gateway FPGA, and the controller FPGA, allowing the solution to fit in the physical footprint.