Today, many of the services people use every day depend on big data applications. Take Amazon Scout, the well-known delivery robot, for example: we can now receive packages without human involvement.
It's not difficult to imagine that the growing demand for such services in our daily lives will generate even more unprocessed data. International Data Corporation (IDC) predicts that worldwide data will grow to 175 ZB by 2025. These ever-growing volumes of data are one of the factors driving the development of Computational Storage, a state-of-the-art storage architecture that can speed up data processing, make real-time analytics more efficient and shorten the time we wait for our packages and data.
The distinction between conventional and Computational Storage models
In traditional computing architecture, data frequently moves between storage systems and memory units of the application servers. The CPU first requests the data from the storage system before the data is transferred to the CPU to do the computing. After the CPU finishes processing, the results are sent back to the storage system to be saved. This process can be repeated thousands of times. The high cost of moving data during these repetitive steps can result in extra energy consumption and degrade the performance of big data applications.
Unlike the conventional storage model, the Computational Storage model moves the processing to the storage system, so the raw data does not have to be sent to the host CPU. Data workloads are processed directly on the storage controller. The goal is to analyze and process the data where it resides.
This is how it works. The CPU sends a request to the storage subsystem, but the data doesn’t need to leave the storage system. Instead, the operation is carried out by the drive itself. Computational Storage dramatically saves time and energy by eliminating the complex and repetitive steps that require large volumes of data to be sent back and forth across a network.
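The difference in data movement between the two models can be illustrated with a small simulation. This is only a sketch under stated assumptions: `SimulatedDrive`, `run_filter`, `is_match` and the 4 KB record size are hypothetical names and values invented for illustration, not a real device or vendor API.

```python
from dataclasses import dataclass
from typing import Callable, List

RECORD_SIZE = 4096  # assumed bytes per record, for illustration only


@dataclass
class SimulatedDrive:
    """Toy stand-in for a storage device; not a real device API."""
    records: List[int]
    bytes_sent_to_host: int = 0

    def read_all(self) -> List[int]:
        # Conventional path: every record crosses the interface to the host.
        self.bytes_sent_to_host += len(self.records) * RECORD_SIZE
        return list(self.records)

    def run_filter(self, predicate: Callable[[int], bool]) -> List[int]:
        # Computational path: the drive evaluates the predicate itself
        # and ships only the matching records to the host.
        matches = [r for r in self.records if predicate(r)]
        self.bytes_sent_to_host += len(matches) * RECORD_SIZE
        return matches


def is_match(value: int) -> bool:
    # Hypothetical query: keep every record whose value is a multiple of 100.
    return value % 100 == 0


# Conventional model: the host pulls everything, then filters.
conventional = SimulatedDrive(records=list(range(10_000)))
host_result = [r for r in conventional.read_all() if is_match(r)]

# Computational Storage model: the drive filters, the host gets the result.
computational = SimulatedDrive(records=list(range(10_000)))
drive_result = computational.run_filter(is_match)

print("conventional bytes moved: ", conventional.bytes_sent_to_host)
print("computational bytes moved:", computational.bytes_sent_to_host)
```

Both paths return the same records, but in this toy setup the computational path moves only the matching 1% of the data across the interface. Real savings depend on the workload and on how selective the operation is.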
The components of Computational Storage
- Computational Storage Drives (CSD): A CSD is a Computational Storage device with either an ASIC or a microprocessor embedded in the storage device. It can perform computation inside the storage system and also provides persistent data storage.
- Computational Storage Processors (CSP): A CSP is a processor positioned as the controller of an array of SSDs. It provides computational services and functions to an associated storage system, but it doesn't offer persistent data storage itself.
- Computational Storage Arrays (CSA): A CSA is a collection of CSDs and CSPs that contains both compute and storage, along with optional storage devices and the control software.
In a nutshell, Computational Storage is a storage subsystem that incorporates processing units located on the storage media, on their controllers or in Computational Storage Arrays.
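One way to keep the three component types apart is to model them as simple data structures. The sketch below is only an illustration of the definitions above: the class and field names are assumptions made for this example, and the control software of a CSA is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ComputationalStorageDrive:
    """CSD: performs computation and stores data persistently."""
    name: str
    provides_compute: bool = True
    persistent_storage: bool = True


@dataclass
class ComputationalStorageProcessor:
    """CSP: computes for associated storage, but holds no persistent data."""
    name: str
    provides_compute: bool = True
    persistent_storage: bool = False


Member = Union[ComputationalStorageDrive, ComputationalStorageProcessor]


@dataclass
class ComputationalStorageArray:
    """CSA: a collection of CSDs and CSPs (control software not modeled)."""
    name: str
    members: List[Member] = field(default_factory=list)

    def compute_units(self) -> List[str]:
        # Every member that can run computation close to the data.
        return [m.name for m in self.members if m.provides_compute]

    def storage_units(self) -> List[str]:
        # Only members that also keep data persistently.
        return [m.name for m in self.members if m.persistent_storage]


array = ComputationalStorageArray(
    name="csa-0",
    members=[
        ComputationalStorageDrive("csd-0"),
        ComputationalStorageDrive("csd-1"),
        ComputationalStorageProcessor("csp-0"),
    ],
)

print("compute units:", array.compute_units())
print("storage units:", array.storage_units())
```

In this model the CSP shows up as a compute unit but not as a storage unit, which mirrors the distinction drawn in the list above.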
Three key benefits of Computational Storage
1. Minimum bandwidth and power: With Computational Storage, data moves across the interface less frequently, so more of it can simply stay on the drive. Because data movement between storage and the main CPU is reduced, only the end result has to be delivered to the host. Computational Storage therefore offers significant power savings and frees up additional I/O bandwidth.
2. Strip out latency: Because storage capacity usually far exceeds memory, the data must be read in chunks. This slows down real-time data analytics and affects the efficiency of high-performance computing. Computational Storage can break these bottlenecks. By putting processing onto the storage subsystem itself, massive amounts of data can be processed where the data originates. This shortens the time taken to move, analyze and process the data, which significantly reduces latency and improves network bandwidth utilization.
3. Data-centric computing: Unlike compute-centric architectures, data-centric architectures are designed to analyze massive data volumes and follow a data-first concept. By placing processing capabilities directly in the storage system, Computational Storage can free up CPU cycles for high-level functions and tasks, and it enables parallelism for specific workloads, which improves throughput and performance.
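The parallelism mentioned in the third benefit can be sketched as follows. This is a simplified illustration, assuming each drive holds one shard of the data and can compute a partial result on its own. The functions `drive_local_sum` and `host_mean` are hypothetical, and threads stand in for the independent drives.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def drive_local_sum(shard: List[int]) -> Tuple[int, int]:
    # Runs "on the drive": reduce the local shard to a (sum, count) pair.
    return sum(shard), len(shard)


def host_mean(shards: List[List[int]]) -> float:
    # Each simulated drive works on its own shard in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(drive_local_sum, shards))
    # The host only combines the small partial results.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Four simulated drives, each holding 1,000 values.
shards = [list(range(i * 1000, (i + 1) * 1000)) for i in range(4)]
mean = host_mean(shards)
print("mean computed from per-drive partial results:", mean)
```

The host receives four small (sum, count) pairs instead of 4,000 values, and its own CPU work is limited to combining them.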
The Phison approach
With all of these benefits in mind, Phison, as one of the leading companies in the computing industry, will soon launch an all-in-one solution for Computational Storage. Phison designed this solution to allow for broad and immediate deployment of this new and innovative technology. We are partnering with many current server manufacturers to provide customers with turnkey solutions. The new technology not only reduces network traffic and enables parallel computing but also eases other constraints on computing, I/O, memory and storage. Phison aims to offer the best platform and the widest range of solutions so that data centers and cloud service providers can make the best use of more extensive data applications.