High-Radix Scalable Modular Crossbar Switches

2016-05-01T00:00:00Z (GMT) by Cagla Cakir
As process technologies have scaled, the increasing number of processor cores and memories
on a single die has also driven the need for more complex on-chip interconnection networks.
Crossbar switches are primary building blocks in such networks-on-chip, as they can be used
as fast single-stage networks or as the core of the router switch in multi-stage networks.
While crossbars offer non-blocking, single-hop, all-to-all communication, they tend to scale
poorly with the number of nodes due to the latency and energy of the long wires and high-radix
multiplexor structures needed. In this work, we investigate how to improve crossbar
performance, energy-efficiency, and scalability.
To better understand the design space and scaling limitations, we have developed an on-chip
switch modeling tool calibrated using circuit-level simulations. The tool enables a design
space exploration showing how area, power, and performance vary across radix, data width,
wire parameters, and circuit implementation. In addition to conventional design options,
we examined capacitively coupled low-swing signaling to reduce the energy consumption of
the I/O wires. This exploration shows that the main bottlenecks are the long I/O wires and
that the key to improving performance and efficiency is to minimize area. Using these
insights, we present modular crossbar switches that can perform better at high radices than
the monolithic designs. The modular sub-blocks are arranged in a controlled flow-through,
pipelined scheme to eliminate global connections and maintain linear performance scaling
and high throughput. Modularity also enables energy savings via deactivation of unused
I/O wires.
To evaluate our design, we implemented a prototype radix-64 modular crossbar switch
testchip in a 40nm bulk CMOS process. The testchip operates at 2.38GHz at a 1V nominal
supply voltage and consumes 1.2W of power. It offers 2.2X better throughput and 2.4X better
energy-efficiency than published state-of-the-art designs. We further evaluated modular
crossbar networks with the proposed crossbar switches using BookSim2, a network-on-chip
evaluation tool. The proposed design achieves more than 90% saturation throughput with
an internal speedup of 1.5, supports high data line rates, and offers lower average network
latency compared to conventional crossbars. Evaluation results show that modular crossbars
are scalable to high radices while still offering high performance, energy efficiency, and one-hop
simplicity.