“In a spine and leaf architecture, compute resources – racks of servers equipped with CPUs, GPUs, FPGAs, storage, and/or ASICs – are connected to leaf or top-of-rack switches, which then connect through various aggregation layers to the spine,” Google wrote. “Traditionally, the spine of this network uses Electronic Packet Switches (EPS), which are standard network switches provided by companies like Broadcom, Cisco, Marvell, and Nvidia. However, these EPS consume a significant amount of power.”
“Apollo is believed to be the first large-scale deployment of optical circuit switching (OCS) for data center networking. The Apollo OCS platform includes a homegrown, internally developed OCS, circulators, and customized wavelength-division-multiplexed (WDM) optical transceiver technology that supports bidirectional links through the OCS and circulators. Apollo has served as the backbone of all Google data center networks, having been in production for nearly a decade, supporting all data center use cases.
“Incorporating the Apollo OCS layer replaces the spine blocks, resulting in significant cost and power savings by eliminating the electrical switches and optical interfaces used in the spine layer. Google uses these optical switches in a direct connect architecture to link the leaves through a patch panel. This method is not packet switching; it functions as an optical cross-connect,” Google stated.
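To make the distinction between packet switching and an optical cross-connect concrete, here is a minimal sketch in Python. It is a toy model, not Google's implementation: the class name `OpticalCrossConnect` and its methods are hypothetical, and it assumes only what the description above states, namely that the switch joins ports pairwise, carries traffic in both directions over each link, and never inspects what passes through.

```python
class OpticalCrossConnect:
    """Toy model of an optical circuit switch (hypothetical, for illustration).

    The switch holds a reconfigurable port-to-port map. It never reads
    packet headers; whatever enters one port exits its paired port.
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self._peer = {}  # port -> paired port, stored in both directions

    def connect(self, a, b):
        """Set up a bidirectional circuit between ports a and b."""
        if a == b:
            raise ValueError("a circuit needs two distinct ports")
        for port in (a, b):
            if not 0 <= port < self.num_ports:
                raise ValueError(f"port {port} out of range")
            if port in self._peer:
                raise ValueError(f"port {port} already carries a circuit")
        self._peer[a] = b
        self._peer[b] = a

    def disconnect(self, port):
        """Tear down the circuit that uses the given port."""
        other = self._peer.pop(port)
        del self._peer[other]

    def forward(self, in_port, signal):
        """Return (out_port, signal). The signal passes through untouched."""
        return self._peer[in_port], signal


# Directly link leaf 0 to leaf 2, and leaf 1 to leaf 3.
ocs = OpticalCrossConnect(num_ports=4)
ocs.connect(0, 2)
ocs.connect(1, 3)
out_port, payload = ocs.forward(0, b"any bits at any line rate")
```

Because `forward` depends only on the ingress port, the model is indifferent to the data rate or format of the signal, which is consistent with the point that such switches need not be upgraded when transceivers change. A packet switch, by contrast, would have to parse each packet to choose an output.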
“OCS switches offer high bandwidth and low network latency, along with a significant reduction in capital expenditures. This is due to their ability to reduce the number of required electrical switches, thereby eliminating costly optical-to-electrical-to-optical conversions,” said Sameh Boujelbene, vice president with the Dell’Oro Group. “Moreover, unlike electrical switches, OCS switches do not need frequent upgrades when servers adopt next-generation optical transceivers.”
However, OCS is still an emerging technology. “To date, only Google has managed to deploy them at scale in its data center networks after many years of development. Additionally, OCS switches may necessitate changes to the existing fiber infrastructure, depending on the cloud service provider,” Boujelbene said.
“OCS switches have been deployed at Google in spine layer but with the emergence of AI applications, we see them being deployed more inside the AI clusters because of the benefits that they bring,” Boujelbene said.
Standardizing optical transport technologies
Requirements for higher-speed Ethernet networking equipment are evolving as AI networks expand. For example, there’s rising demand for 800G Ethernet employing 800ZR high-speed optical transmission technology and OpenZR+, the industry initiative to develop interoperable standards for coherent optical transceivers.
At the 400G Ethernet level, 400ZR has been “a great success for the coherent pluggable industry with multiple suppliers and a tremendous volume of 400ZR QSFP-DD and OSFP modules deployed in metro DCI [data center interconnect] applications,” according to Cisco’s Acacia website. (Cisco acquired optical maker Acacia Communications in a $4.5 billion deal in 2021.)
“Network grade pluggable optics such as 400ZR and others will see significant uptick in deployments in 2024 in communication service provider networks,” IDC reported recently.