For years, data centers have notched incredible performance gains and cost reductions by virtualizing servers. As these virtualization projects have matured, however, the productivity gains have slowed. In addition, data center managers see performance challenges because compute, memory, storage, and access to special processors such as graphics processors cannot easily be re-allocated to the applications that need them.
These resources need to be locally attached and, in some cases, dedicated to a single application. This makes it more expensive to right-size resources for applications, or to scale resources for fast-growing applications.
The solution is resource disaggregation; that is, creating pools of interconnected compute power, storage, memory, and other resources that can be flexibly apportioned to applications as they need them.
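To make the idea concrete, here is a minimal sketch (not CALIENT code; all names are hypothetical) of what disaggregation means in software terms: resources live in cluster-wide pools rather than being locked to one server, and a logical compute node is composed from those pools on demand.

```python
# Illustrative model of disaggregated resource pools. A pool spans the
# whole cluster; a "node" is just an allocation drawn from several pools.

class ResourcePool:
    """A shared pool of one resource type (CPU cores, GB of RAM, GPUs...)."""
    def __init__(self, name, capacity):
        self.name = name
        self.free = capacity

    def allocate(self, amount):
        if amount > self.free:
            raise RuntimeError(f"pool '{self.name}' exhausted")
        self.free -= amount
        return amount

    def release(self, amount):
        self.free += amount

# Cluster-wide pools instead of per-server, locally attached resources.
pools = {
    "cpu_cores": ResourcePool("cpu_cores", 1024),
    "ram_gb":    ResourcePool("ram_gb", 8192),
    "gpus":      ResourcePool("gpus", 64),
}

def compose_node(requirements):
    """Compose a logical compute node from the shared pools."""
    return {r: pools[r].allocate(n) for r, n in requirements.items()}

# Right-size a node for one application; scale by allocating more later.
node = compose_node({"cpu_cores": 32, "ram_gb": 256, "gpus": 2})
```

The point of the sketch is the economics: because nothing is hard-wired to one server, a fast-growing application simply draws more from the pools instead of forcing a hardware refresh.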
For several years, the data center industry has wanted such a network, but has always lacked a low-latency, high performance interconnect to make it happen. That’s where CALIENT comes in.
I wrote about a similar, but more limited, application earlier this year after a trip to SC2015. I saw a demonstration of a graphics processing unit (GPU) board hardwired to a server, allowing 3D-visualization users to do their work on an ordinary laptop that was network-connected to that server. That GPU board, however, was unavailable to users on other servers who might want the resource for another application.
The CALIENT S-Series Optical Circuit Switch can help to make that GPU board a stand-alone resource accessible by all users and servers. And what the S-Series can do for the GPU application, it can also do for a broader disaggregated data center.
Here are some of the characteristics of a disaggregated data center:
- Separate the core components of compute nodes: compute, memory, accelerators, storage, etc.
- Compute clusters made up of specialized equipment racks will become the core building blocks of the data center, replacing the individual rack as the basic unit.
- The compute node interconnect physical layer will be optical.
- The interconnect protocol between compute resources within a compute node must be native. TCP/IP is not viable for compute node interconnects.
- Composition of compute nodes cannot be limited by hardware defined infrastructure (HDI) boundaries.
Once the computing resources are separated, a physical interconnect bus is needed that offers low latency across rack boundaries and can transport the 100 Gb/s serial data streams that today's optical interfaces are capable of. Some additional interconnect requirements include:
- The interconnect solves a compute node architecture problem. Node interconnects must replicate data buses, not networking links.
- Because the interconnect is, in fact, an external bus, low latency is of utmost importance. Latency must approach the speed of light; even then, physical distance inside the data center will limit the size of a compute cluster.
- The interconnect protocol must be as native as possible. The interconnect protocol will depend on the specific device, and no single protocol will meet the disparate requirements of all resource types within a compute node.
- Throughput: the interconnect must continuously and deterministically operate at the full native bus rate.
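The distance constraint above is worth quantifying. A back-of-the-envelope sketch (my own arithmetic, not from the source): light in glass fiber propagates at roughly two-thirds of c, about 2×10⁸ m/s, so every meter of fiber adds on the order of 5 ns of one-way delay no matter how good the switch is.

```python
# Why physical distance bounds a compute cluster: even with zero switching
# overhead, fiber propagation delay alone grows linearly with distance.

C_FIBER_M_PER_S = 2.0e8  # approximate speed of light in glass (~2/3 c)

def one_way_latency_ns(fiber_meters):
    """Propagation-only one-way delay over a fiber run, in nanoseconds."""
    return fiber_meters / C_FIBER_M_PER_S * 1e9

# Representative runs: intra-rack, cross-row, far corner of a large hall.
for run_m in (5, 50, 300):
    print(f"{run_m:>4} m of fiber -> {one_way_latency_ns(run_m):6.1f} ns one way")
```

For a bus-like interconnect that must behave like locally attached hardware, those hundreds of nanoseconds at hall scale are exactly why cluster size is bounded by geometry, not just by port count.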
These requirements eliminate any electrical network or interconnect fabric; only an all-optical switch like the S-Series switches can meet these requirements. What does the S-Series bring to the disaggregated data center?
- It is completely protocol agnostic. An S-Series fabric will support any native protocol and at any data rate.
- It has speed-of-light latency. There is no OEO conversion and no buffering.
- To maximize the economic benefit of resource pooling, the switch must be large enough to support all the devices in a compute cluster; the 320-port S320 is ideally sized for the data center.
- The OCS is very low cost. With no intermediate active optics devices and 45W total system power, the OCS adds very little total cost to the compute cluster.
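Why an optical circuit switch can be protocol agnostic is easy to see in a toy model (an illustration of the concept, not a device driver or CALIENT's API): the switch is essentially a reconfigurable one-to-one port mapping. Because light is steered rather than parsed, the switch never inspects the signal, so any protocol at any data rate passes through unchanged.

```python
# Toy model of an optical circuit switch: a set of cross-connects mapping
# input ports to output ports. No packets, no buffers, no protocol logic.

class OpticalCircuitSwitch:
    def __init__(self, ports):
        self.ports = ports
        self.cross_connects = {}  # input port -> output port

    def connect(self, in_port, out_port):
        """Establish a light path; each output can serve one input."""
        if out_port in self.cross_connects.values():
            raise ValueError(f"output port {out_port} already in use")
        self.cross_connects[in_port] = out_port

    def disconnect(self, in_port):
        """Tear down a light path, returning both ports to the free pool."""
        self.cross_connects.pop(in_port, None)

# e.g. patch a pooled GPU shelf through to whichever server needs it now,
# then re-patch it to another server later -- no rewiring, no OEO hops.
ocs = OpticalCircuitSwitch(ports=320)  # sized like the S320
ocs.connect(in_port=12, out_port=200)  # server on 12 <-> GPU shelf on 200
```

Reconfiguration is just updating the mapping; the data plane itself remains passive, which is also why the power draw stays so low.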
There’s much more to recommend the CALIENT S-Series for this application, and the application note covers more of the core technology and features that the product brings to this network.
The growth in data is showing no signs of abating, which means that data centers must keep increasing their performance and efficiency so that enterprises and service providers can continue to serve their customers. A disaggregated data center architecture with an S-Series interconnect is one way to take data center performance to the next level.