A Closer Look at Storage Latency—And Why It Matters

Feb 27, 2023 | All, Technology

How fast is an instant, as in instant gratification? For most software users, research shows that a delay of 0.1 seconds or less feels instantaneous. At one second, the user’s mind starts to wander. If an application doesn’t respond within five or six seconds, the user is likely to get frustrated or abandon the app altogether. Much of the time, storage infrastructure is the culprit behind these delays, which lead to poor end-user experiences and other negative impacts on a business.

But excess wait times can be caused by several different things in an organization’s data storage infrastructure. Too often, the storage industry focuses on input/output operations per second (IOPS) or throughput (MB/s), instead of latency, which is the metric that should be front and center.


What is storage latency?

Storage latency refers to the time it takes for a storage device to receive and respond to a read or write request. Specifically, storage latency is the time required for a read or write command to traverse the entire storage ecosystem, from application request to final output. It’s a metric affected by all four components of that ecosystem:

      • NAND flash
      • Storage firmware
      • SSD controller that powers the firmware
      • Storage system infrastructure

Latency takes into account how efficient the storage firmware is and how quickly it can utilize CPU resources to process input/output requests.

Like most things in computing, faster is better. The less time it takes a request to traverse the storage system, the lower the latency and the faster the request is processed. Latency is one of the most important factors that should be considered when evaluating performance of computing workloads, especially those that are transaction-heavy.
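To make “time per request” concrete, the sketch below times synchronous 4 KiB writes from the host side and reports the median and 99th-percentile latency in microseconds. It’s a rough host-level approximation (the function name and parameters are our own, and results include filesystem and OS overhead, not just the device), not a vendor benchmarking tool:

```python
import os
import statistics
import tempfile
import time

def measure_write_latency(path, block_size=4096, samples=100):
    """Time synchronous 4 KiB writes; return per-request latencies in microseconds."""
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        data = os.urandom(block_size)
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, data)
            os.fsync(fd)  # force the write through to the device
            latencies.append((time.perf_counter() - start) * 1e6)
    finally:
        os.close(fd)
    return latencies

with tempfile.NamedTemporaryFile() as tmp:
    lat = sorted(measure_write_latency(tmp.name))
    print(f"median: {statistics.median(lat):.0f} µs, "
          f"p99: {lat[int(len(lat) * 0.99)]:.0f} µs")
```

Comparing the median against the 99th percentile is exactly the kind of view that raw IOPS or MB/s numbers hide.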


Why is storage latency so important?

An article in Forbes used an interesting analogy to explain why latency is so important in storage:

“You are commuting to work and drive mostly on a highway. The highway has certain characteristics like a number of lanes, which is akin to bandwidth, and a certain capacity in the total number of cars per hour, which is akin to IOPS. The problem is that even if you know both of these values, you can’t answer the most important question: How long will it take to get to work?”

It goes on to say that latency is that final measurement: the time it will take to get to work. That’s what really matters when you’re planning your commute, and while the number of lanes and the capacity in cars per hour affect that travel time, the actual time the trip takes is the true measure of efficiency.

When latency is low, there’s less overall idle time in the computing system. Utilization of resources is more efficient and organizations can actually get more value from their existing storage.


Flash storage improved latency in some ways

Back when hard disk drives (HDDs) were the storage medium of choice, latency fell in the range of milliseconds (thousandths of a second). With today’s NAND flash solid state drives (SSDs), however, we now measure latency in microseconds (millionths of a second). That’s a big improvement, but you might be surprised to find that latency can still spike by a factor of 10 or more at its peaks.

While flash media has increased throughput and reduced media-level latency as NAND flash technology has evolved, it hasn’t reduced the latency contributed by the other storage system components. SSD vendors can therefore differentiate themselves by improving latency at those other levels of the storage system.

For instance, the SSD controller plays a significant role in increasing or decreasing latency. It’s the component that orchestrates the entire process of sending a read or write request through the system. No matter how fast the flash media is, latency will still be high if the controller can’t move that data in and out quickly as well. In addition, an input/output (I/O) request might be assigned to a flash chip that is already busy servicing another operation, and end up queued behind it.
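That queuing effect can be sketched with a toy model. The function below (the name and the 50 µs service time are hypothetical, chosen only for illustration) assigns each request to a flash chip and makes a request wait whenever its target chip is still busy:

```python
def simulate_chip_queueing(requests, num_chips=4, service_us=50):
    """Toy model: each request targets one flash chip; a busy chip queues requests.

    requests: list of (arrival_us, chip_index) tuples.
    Returns the total latency of each request in microseconds.
    """
    chip_free_at = [0] * num_chips  # time at which each chip next becomes idle
    latencies = []
    for arrival, chip in sorted(requests):
        start = max(arrival, chip_free_at[chip])  # wait if the chip is busy
        chip_free_at[chip] = start + service_us
        latencies.append(start + service_us - arrival)
    return latencies

# Two requests landing on the same chip at the same instant:
print(simulate_chip_queueing([(0, 0), (0, 0), (0, 1)]))
# → [50, 100, 50]: the second request to chip 0 waits a full service time
```

Even with identical flash chips, a request’s latency doubles the moment it collides with another request on the same chip, which is why controller-level scheduling matters.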

Other reasons an SSD might have higher latency include:

      • Wear leveling – there are two types of wear-leveling mechanisms: dynamic and static. Static wear leveling periodically relocates static (cold) blocks that are rarely accessed so that the low-usage cells beneath them become available to other data. Although this technique can extend the device’s lifespan, the complex background process involves multiple operations to move static data around, which adds latency and impacts the SSD’s performance.
      • Garbage collection (GC) – GC is one of the main causes of the long-tail latency problem in storage systems; at the 99th percentile, latency caused by GC can be more than 100 times greater than the average. In this behind-the-scenes process, the SSD controller identifies blocks containing stale pages of data, moves the still-valid data on those blocks elsewhere, and then erases them. This can slow down an SSD and increase latency.
      • Media scan – an SSD controller sometimes runs a media scan in the background to proactively find and fix media errors before they cause a real problem during read/write operations. This can cause a rise in SSD latency.
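To see why a rare background event like GC dominates tail latency, the toy simulation below gives roughly 2% of reads a long GC-induced stall and compares the median to the 99th percentile. All numbers here are illustrative, not measurements of any real drive:

```python
import random
import statistics

random.seed(42)

def sample_read_latency_us():
    """Hypothetical read: ~100 µs normally, ~10 ms extra if it lands during GC."""
    base = random.gauss(100, 10)
    if random.random() < 0.02:   # ~2% of reads collide with a GC cycle
        base += 10_000           # the GC pause dominates the request
    return base

lats = sorted(sample_read_latency_us() for _ in range(100_000))
median = statistics.median(lats)
p99 = lats[int(0.99 * len(lats))]
print(f"median ≈ {median:.0f} µs, p99 ≈ {p99:.0f} µs, ratio ≈ {p99 / median:.0f}x")
```

Even though 98% of reads are fast, the 99th percentile sits squarely inside the GC tail, which is the long-tail behavior described above.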


How Phison addresses storage latency

As one of the industry’s leading providers of PCIe NVMe SSD controllers and SSD modules, Phison is heavily invested in R&D—not only to develop tomorrow’s technology solutions, but also to improve its existing solutions.

Some of the ways Phison helps reduce latency in its SSD solutions include:

      • Optimized firmware design – Phison can customize firmware based on different use cases and customer requirements. For instance, some firmware operations can be designed to execute during idle time to avoid increasing latency. QoS can be optimized with Phison’s in-house technology to meet unique enterprise requirements. In most cases, Phison can design custom SSD controllers and modules for specific needs and designated latency performance, including ultra-low latency.
      • Proprietary CoXProcessor 2.0 – this coprocessor includes a hardware accelerator that offloads some of the main CPU’s load, helping the storage device issue read/write commands more efficiently.
      • Dual-CPU setups – with each CPU operating independently within a single SSD, the drive can handle multiple read/write commands simultaneously. Different operations can be processed in parallel, which reduces latency.
      • Garbage collection firmware – Phison SSDs split garbage collection work into small pieces and process it incrementally, which helps reduce latency and improve consistency.
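The idea of splitting GC into small pieces can be sketched as a generator that processes a few blocks at a time and yields between chunks so host I/O can be served. The helper functions below are hypothetical stand-ins for illustration, not Phison’s actual firmware:

```python
def relocate_valid_pages(block):
    """Stand-in for copying a block's still-valid pages elsewhere."""
    pass

def erase(block):
    """Stand-in for erasing a now-empty block."""
    pass

def incremental_gc(dirty_blocks, chunk_size=2):
    """Reclaim blocks a few at a time, yielding between chunks.

    Each yield is a pause point where host reads/writes can be served,
    trading one long GC stall for many short, predictable ones.
    """
    for i in range(0, len(dirty_blocks), chunk_size):
        for block in dirty_blocks[i:i + chunk_size]:
            relocate_valid_pages(block)
            erase(block)
        yield min(chunk_size, len(dirty_blocks) - i)  # blocks reclaimed this chunk

print(list(incremental_gc(list(range(5)))))  # → [2, 2, 1]
```

The same total work gets done, but no single pause is long enough to push a host request into the latency long tail.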


The Foundation that Accelerates Innovation™
