The Future of SSDs – Part II

NVMe, SCM Product, SSD that Fits in a 3.5-inch Bay and Computational Storage

Jun 28, 2021 | All, Technology

I hope you have had a chance to read my Part I blog post on storage class memory, MRAM and large-capacity solid state drives. Part II continues with other SSD technologies that are changing the storage landscape. As always, Phison is here to help!

When do you think NVMe will supersede SATA and when do you think Gen4 will become dominant?

In many respects, both of those changes have already come about on the client side. The SATA interface persisted for many years because value configurations focused on HDDs, and HDDs were only available with a SATA interface, though there is no technical reason a 300-600 MB/s HDD could not adopt a Gen3x2 PCIe interface. SSD prices have come down to the point where over 75% of laptops shipped with an SSD in 2019. The advantages in weight, battery life and fewer mechanical warranty issues outweigh any savings attributed to a lower-priced HDD. The value tier will likely stay on PCIe Gen3x4 for a few more years, but the mainstream and premium tiers are broadly adopting Gen4x4.
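
To put concrete numbers on those tiers, here is a quick back-of-the-envelope sketch using the standard line rates and encodings (theoretical ceilings, not measured drive speeds):

```python
# Theoretical interface bandwidth: line rate (GT/s) x lanes x encoding efficiency.
# SATA III uses 8b/10b encoding; PCIe Gen3/Gen4 use 128b/130b.

def usable_gb_per_s(line_rate_gt, lanes, efficiency):
    """Usable payload bandwidth in GB/s (each transfer carries 1 bit per lane)."""
    return line_rate_gt * lanes * efficiency / 8  # bits -> bytes

interfaces = {
    "SATA III (6 Gb/s)": usable_gb_per_s(6.0, 1, 8 / 10),
    "PCIe Gen3 x4":      usable_gb_per_s(8.0, 4, 128 / 130),
    "PCIe Gen4 x4":      usable_gb_per_s(16.0, 4, 128 / 130),
}
for name, bw in interfaces.items():
    print(f"{name}: ~{bw:.2f} GB/s")
# SATA III: ~0.60 GB/s, Gen3 x4: ~3.94 GB/s, Gen4 x4: ~7.88 GB/s
```

Note that even a fast 300-600 MB/s HDD would fit comfortably inside a Gen3x2 link (roughly 2 GB/s by the same arithmetic), which is why SATA's persistence on HDDs was economic rather than technical.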

The enterprise space now sees more PCIe SSD sales than SATA SSD sales, but at this point SATA is likely to stay around for another 4-8 years. Enterprises typically run a 4-year refresh cycle, and there is already a very large SATA install base. Organizations that needed faster speeds have already switched to Gen3 NVMe. Over time, SATA- and SCSI-based equipment will become less common. The enterprise Gen3 install base is expected to start a large migration to Gen4 this year, but the migration will be gradual; we expect another 4 years of solid Gen3 sales in this space. That is why we refreshed our popular E12 controller with the new FX controller, which has the highest IOPS per watt on the market.

How does Phison view storage class memory?

SSDs were easy to integrate into PCs and data center storage because they are 100% compatible with existing infrastructure. This applies to the server chassis, PC cases, laptops, BIOS, OS and applications. Initial deployments could not take full advantage of SSD characteristics, but users saw an immediate benefit when switching over: lower power, faster sequential speeds and higher robustness.

On the other hand, SCM is typically implemented on the DDR bus as an NVDIMM. Existing applications can't take advantage of the non-volatile aspect without significant changes because they are designed to treat DDR as volatile, and this knocks SCM off the easy adoption path. Placing SCM behind an NVMe interface addresses the backward-compatibility problem, but current SSDs can already saturate the PCIe bus, so the only benefit of using SCM as storage is its lower individual command latency. It turns out that very few applications can take advantage of a latency gain beyond what SSDs already offer. As such, you end up with an SSD that is significantly more expensive and provides no real benefit for most applications. We do believe SCM has a place in the SSD, but not as primary storage.
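
To illustrate why the latency advantage rarely matters, consider a rough model. The latencies below are order-of-magnitude assumptions for illustration, not measured figures:

```python
# At queue depth 1, throughput is bounded by per-command latency; at realistic
# queue depths, both device types saturate the same PCIe link instead.

def qd1_iops(latency_us):
    """IOPS achievable with a single outstanding command."""
    return 1_000_000 / latency_us

nand_4k_read_us = 80   # assumed 4K read latency for a NAND NVMe SSD
scm_4k_read_us = 10    # assumed 4K read latency for an SCM-based NVMe device

print(f"NAND SSD @ QD1:   ~{qd1_iops(nand_4k_read_us):,.0f} IOPS")
print(f"SCM device @ QD1: ~{qd1_iops(scm_4k_read_us):,.0f} IOPS")
# The gap only helps strictly serial, latency-bound workloads; parallel
# workloads hit the PCIe ceiling on either device.
```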

What is Phison doing with respect to computational storage?

We already have a type of computational hybrid device that is very successful: the Smart NIC. It combines a high-speed NIC (typically 10-100 Gb/s) with a powerful CPU or FPGA. Though this combination works for NICs, it does not work as well for storage, and the reason is fairly straightforward: the "smart" part of the NIC processes data that is already passing through the NIC to the host. The Smart NIC works well when it can process data as it streams through, or when it can service a request by directly accessing resources within the chassis while bypassing the host CPU.

The typical value proposition for computational storage is presented as follows: the SSD is closer to the data, it frees up bus bandwidth and it offloads the host CPU. At face value computational storage appears to be an easy sell, but it hasn't turned out that way, for several reasons:

1. The SSD today is already using 100% of its resources and power budget to service its primary function. In many cases, high-density enterprise SSDs must limit performance to avoid exceeding their power or cooling budget.

2. SSDs typically use small CPU cores that are nowhere near what a host CPU or a GPU can do.

3. This experiment was tried before computational storage was even a buzzword. One company attempted to combine a GPU and an SSD, but the solution ended up degrading both technologies: to meet the GPU's requirements, the SSD had to run very fast and added a significant heat load to the GPU, while the GPU, running much hotter than an SSD, created substantial retention stress on the NAND.

4. Lastly, an SSD is a consumable item with finite write bandwidth, whereas a GPU can run indefinitely until it becomes obsolete. This last point created warranty issues that were difficult to resolve.

Taking a different approach, we could add a more powerful CPU directly to the SSD, but then we run into a RAM limitation. Today, most enterprise SSDs maintain a 1000:1 NAND-to-DRAM ratio. The SSD only needs to pull a few bytes for every 4K LBA translation, so the DRAM bandwidth requirement is relatively low. This means the SSD can use a slower grade of DRAM, which lowers the cost of the entire module. Adding a larger guest CPU to the SSD, along with more DRAM for applications, decreases the power available for the SSD's primary role of providing IO to the host. It also increases the SSD's cost without providing a proportional gain in compute power. The SSD PCB is also quite small, so adding more components leaves less space for NAND.
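
A quick sketch of where that roughly 1000:1 ratio comes from, assuming a flat mapping table with about 4 bytes of DRAM per 4 KiB logical page (real FTLs add caches and overhead on top of this):

```python
# A flat logical-to-physical map needs ~4 bytes of DRAM per 4 KiB page,
# which yields the ~1000:1 NAND-to-DRAM ratio cited above.

def ftl_dram_bytes(capacity_bytes, page=4096, entry=4):
    """DRAM for a flat FTL map (sketch only; real FTLs add overhead)."""
    return (capacity_bytes // page) * entry

TIB = 1024**4
for cap_tb in (4, 8, 16):
    dram = ftl_dram_bytes(cap_tb * TIB)
    print(f"{cap_tb} TB SSD -> ~{dram / 1024**3:.0f} GiB of map DRAM")
# 4 TB -> ~4 GiB, 8 TB -> ~8 GiB, 16 TB -> ~16 GiB (a 1024:1 ratio)
```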

Then there is the general problem of data reliability. All hardware eventually fails, but most organizations cannot tolerate the loss of data (e.g., a bank database holding your account balance). To protect against this type of failure, data is usually striped across multi-unit RAID sets so that no one SSD ever sees the full data set. We could change the way storage is used, ensuring each SSD always sees complete data elements and using full replication for redundancy, but this approach is unlikely to take hold because it does a poor job of sharing storage bandwidth when only one SSD contains the data that is currently needed. RAID stripes address this problem by staggering accesses so that each subsequent client starts shortly after the current one.

We could extend the full-copy model by implementing replication across multiple units, but then we must add a lookup and load-sharing mechanism, and duplication has a much higher storage footprint than simple RAID 5 or RAID 6. Simply put, the way we use storage today is cost effective, easy to deploy and works well for most scenarios. Completely changing the storage infrastructure for what amounts to adding a few server CPUs is hard to justify.
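
For readers unfamiliar with striping, a minimal sketch of a rotating-parity RAID 5 layout shows why sequential data naturally spreads across every drive in the set (a simplified, illustrative layout, not any specific vendor's implementation):

```python
# Parity moves one drive to the left each row; data units fill the
# remaining slots in order.

def raid5_location(unit, num_drives):
    """Map a sequential data stripe unit to a (drive, row) position."""
    data_per_row = num_drives - 1                  # one slot per row holds parity
    row = unit // data_per_row
    parity_drive = (num_drives - 1) - (row % num_drives)
    drive = unit % data_per_row
    if drive >= parity_drive:                      # skip over the parity slot
        drive += 1
    return drive, row

for unit in range(8):
    drive, row = raid5_location(unit, num_drives=4)
    print(f"stripe unit {unit}: drive {drive}, row {row}")
# Consecutive units land on different drives, so a large read engages the
# whole set and no single SSD ever holds the complete data stream.
```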

Despite the downsides of general-purpose computational storage, there are specific cases where it does make sense: when the storage use-case mirrors the winning case for the Smart NIC, that is, when the SSD only needs to process the data once as it moves through the device. We can associate encryption and compression with computational storage, but that's a stretch; it is more accurate to describe these two use-cases as in-line or streaming data processing using a very simple algorithm.

Phison and one of our customers have found a computational storage application that is well suited to the SSD: it does not require a large amount of memory or CPU power, and it does not interfere with the SSD's primary purpose of storage IO. We are developing a security product that uses machine learning to look for signs that data is being attacked. It can identify ransomware and other unauthorized activity with no measurable impact on SSD performance.
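
As a purely hypothetical illustration of the kind of lightweight in-line check a controller could run (this is not Phison's actual detection algorithm): ransomware output is encrypted, so written blocks suddenly show near-maximal byte entropy, which can be estimated as data streams through:

```python
import math
from collections import Counter

def block_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte; ~8.0 means random-looking data."""
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())

plain = b"account_balance=1024;" * 200     # structured, repetitive data
encrypted_like = bytes(range(256)) * 16    # stand-in for ciphertext

print(f"plain block:     {block_entropy(plain):.2f} bits/byte")
print(f"encrypted-like:  {block_entropy(encrypted_like):.2f} bits/byte")
# A real detector would track entropy trends per LBA range over time and
# flag sustained jumps, rather than alarming on any single block.
```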

What about other types of computational storage workloads: on-the-fly encryption / compression / dedupe?

These three workloads can be associated with computational storage, though they pre-date the buzzword by several decades. As mentioned above, streaming workloads are easy for an SSD to handle, while search and post-processing are less effective.

Encryption and compression fall into the streaming category. Phison offers on-the-fly encryption in our Opal and FIPS 140-2 SSD products. Compression is easy to accommodate on the SSD and aligns with the streaming model, but it provides limited benefit given that most bulk data (photos, video or music) is already fully compressed. There are large data sets that can benefit from compression, but the use-case is relatively uncommon, so it tends to be relegated to dedicated server appliances.
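
A quick way to see why, sketched with Python's zlib as a stand-in for whatever compression engine a drive might embed:

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Original size divided by deflate-compressed size."""
    return len(data) / len(zlib.compress(data, 6))

log_like = b"timestamp,sensor_id,reading\n" * 10_000   # compressible records
media_like = zlib.compress(log_like)                   # stand-in for JPEG/MP4/MP3

print(f"log data:       {compression_ratio(log_like):.1f}x")
print(f"pre-compressed: {compression_ratio(media_like):.2f}x")
# Records and databases compress well; already-compressed media sits at ~1x,
# so the drive would spend cycles and power for no capacity gain.
```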

The case for dedupe breaks the streaming model for several reasons:

1.  It requires a huge amount of memory to track a hash for each sector, but the SSD PCB does not have room for more DRAM (see the sizing sketch after this list).

2.  SSDs are already fully tasked in data center environments, so any work spent searching is taken away from host IO.
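
To size the memory problem from point 1, here is a rough sketch assuming one hash entry per 4 KiB sector, with a 32-byte digest plus an 8-byte physical pointer (the entry sizes are illustrative assumptions):

```python
def dedupe_index_bytes(capacity_bytes, sector=4096, entry=32 + 8):
    """In-memory index the drive would need for per-sector dedupe hashes."""
    return (capacity_bytes // sector) * entry

TIB = 1024**4
for cap_tb in (4, 16):
    idx = dedupe_index_bytes(cap_tb * TIB)
    print(f"{cap_tb} TB drive -> ~{idx / 1024**3:.0f} GiB of index")
# 4 TB -> ~40 GiB, 16 TB -> ~160 GiB, versus the ~4-16 GiB of DRAM
# those same drives carry for the FTL map alone.
```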

The only real benefit of having the SSD perform the search is a slight reduction in PCIe bus transfer time and a reduced load on the host CPU. Conversely, the SSD's cost has to go up due to higher computational requirements and additional DRAM, and its active power must rise as well. For organizations that do require dedupe, the problem is better solved using spare system resources, particularly overnight when demand is low, instead of adding 10-20% to the SSD cost.

How does Phison help branded customers differentiate their products?

Phison acts as an on-demand engineering service for our partners. Each company has a different idea of which aspects to prioritize in their SSDs, and we configure our product to align with their requirements. Some customers focus on price, others want low power, and others still go after the upper end of performance. This is a win-win for both sides, because Phison can focus on engineering while our customers focus on selling the drives. This division of labor lowers Phison's business risk by spreading development cost across many sales organizations, and our partners lower their overall risk by paying only for the engineering services they use, without the ongoing operational expense of maintaining large engineering teams. If they offer a product to the market that does not sell, they can quickly adapt by ordering a different configuration.

Phison announced an enterprise controller with MRAM, but where is it?

Enterprise ASICs have a longer development cycle than client ASICs. Phison's next-generation high-end enterprise controller is now at the engineering-sample stage, and we expect product rollout in 2H 2021. Once the mainstream solution is in mass production, we will start enabling MRAM. We expect to announce the MRAM-based solution in Q2 or Q3 2022.

If you have any questions and/or want information on any other Solid State Technologies, please contact us.

The Foundation that Accelerates Innovation™
