BlueDBM tackles storage network gremlins with FPGAs
By Jack Clark, 31st January 2014
Distributed file systems may be cheap to run, but their performance can be atrocious when the network becomes saturated, and some boffins are hoping to change this, the better to simulate our universe.
MIT researchers have tried to solve the network saturation problems brought about by SSD-loaded distributed storage systems with a new approach named BlueDBM, which they hope will give scientists a boost when running complex simulations.
One potential application of the BlueDBM system is speeding up a University of Washington simulation of the universe.
“Scientists need to query this rather enormous dataset to track which particles are interacting with which other particles, but running those kinds of queries is time-consuming,” MIT’s Sang-Woo Jun told MIT News. “We hope to provide a real-time interface that scientists can use to look at the information more easily.”
This tech sees the boffins sit field-programmable gate arrays (FPGAs) between the host computer and the storage, and lash them together via their own network. The result is a low-latency, high-bandwidth, scalable storage system that has an order of magnitude greater performance than Microsoft’s rival “CORFU” system [PDF].
The secret to this performance increase is the combination of PCIe-based flash storage with a storage controller implemented on an FPGA. Each controller is linked to every other controller by multi-gigabit, low-latency serial links, with the serializer/deserializer (SERDES) function implemented directly within each FPGA.
By doing this, “each node is able to access remote storage with negligible performance degradation,” they write. “Not only does the controller-to-controller network provide pooling of storage capacity, but it also allows combining the throughput of all nodes on the network, resulting in linear throughput scaling with more nodes.”
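For the arithmetic-minded, that pooling claim is easy to model. The Python sketch below is illustrative only: the per-node bandwidth figure is an assumption for demonstration, not a number from the paper.

    # Toy model of BlueDBM-style bandwidth pooling: because every
    # controller's flash is reachable over the controller-to-controller
    # network, aggregate throughput grows linearly with node count.
    # The per-node bandwidth is a hypothetical placeholder.

    PER_NODE_BANDWIDTH_MBPS = 1_000  # assumed bandwidth of one PCIe flash node

    def aggregate_bandwidth_mbps(nodes: int) -> int:
        """Linear scaling: the pool combines each node's local throughput."""
        return nodes * PER_NODE_BANDWIDTH_MBPS

    for n in (1, 2, 4, 8):
        print(f"{n} node(s): {aggregate_bandwidth_mbps(n):,} MB/s aggregate")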
When the MIT boffins evaluated a four-node implementation of the BlueDBM system they found it had a zippy network with an average packet latency of around 0.5 microseconds.
“Considering that the typical latency of a flash read is several tens of microseconds, requests in our network can, in theory, traverse dozens of nodes before the network latency becomes a significant portion of the storage read latency,” they write in the “Scalable Multi-Access Flash Store for Big Data Analytics” paper [PDF].
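That claim checks out on the back of an envelope. Here is a minimal sketch, assuming a 50-microsecond flash read as a stand-in for “several tens of microseconds”:

    # Hop-budget check using the quoted figures. The 50 us flash read is
    # an assumed midpoint of "several tens of microseconds", not a
    # measured value from the paper.

    HOP_LATENCY_US = 0.5   # average per-hop packet latency, as measured by MIT
    FLASH_READ_US = 50.0   # assumed representative flash read latency

    def network_share(hops: int) -> float:
        """Fraction of a remote read's total latency spent in the network."""
        network_us = hops * HOP_LATENCY_US
        return network_us / (FLASH_READ_US + network_us)

    for hops in (1, 12, 24, 48):
        print(f"{hops:>2} hops: network is {network_share(hops):.0%} of the read")

    # Two dozen hops add just 12 us -- under a fifth of the total read
    # latency, consistent with the "dozens of nodes" claim.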
From the perspective of an end-user, the prototype BlueDBM system “has an average latency to client applications of about 70 microseconds, which is an order of magnitude lower than existing distributed flash systems,” they write.