vNUMA: Virtual Multiprocessors on Clusters of Workstations

A Presentation by Matthew Chapman

Shared memory multiprocessors, such as SMP and ccNUMA systems, are typically simpler to program and administer than clusters of workstations. Nevertheless, there has been a trend towards clusters of commodity workstations, driven by cost and scalability considerations.

We present a system called vNUMA which attempts to bridge this gap using virtualisation. vNUMA allows a virtual shared memory multiprocessor to be constructed from a cluster of workstations by running a thin hypervisor on each node. The guest operating system sees a NUMA machine with a collection of CPUs and a single large memory space spanning the nodes. The shared memory is implemented using techniques from the field of distributed shared memory, with a protocol that can adapt to different page access patterns. The current implementation of vNUMA runs on the Itanium architecture, and we also take advantage of Itanium's relaxed memory ordering model to permit certain optimisations.
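To make the distributed shared memory mechanism concrete, the C sketch below simulates a page-granularity, single-writer invalidation protocol of the general kind such a hypervisor might use. It is illustrative only: all names are hypothetical, the two-node setup and memcpy "network" are our own simplifications, and it does not reflect vNUMA's actual protocol or its adaptive modes beyond a comment noting where adaptation would apply.

/*
 * Illustrative sketch of page-based distributed shared memory.
 * All identifiers are hypothetical; the "network" is a memcpy
 * between two in-process node structures.
 */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 4

enum page_state { INVALID, READ_SHARED, READ_WRITE };

struct page_entry {
    enum page_state state;
    int read_faults;    /* fault counters like these could drive   */
    int write_faults;   /* adaptation between protocol modes       */
};

struct node {
    struct page_entry pt[NUM_PAGES];
    unsigned char mem[NUM_PAGES][PAGE_SIZE];
};

static struct node nodes[2];

/* Simulated fetch of a page's contents from the peer node. */
static void fetch_page(struct node *dst, struct node *src, int pfn)
{
    memcpy(dst->mem[pfn], src->mem[pfn], PAGE_SIZE);
}

/*
 * Handle a page fault on `self`. A write fault invalidates the
 * peer's copy (single-writer invalidation); a page that keeps
 * ping-ponging between writer and readers would be a candidate
 * for switching to an update-based mode, which is the kind of
 * access-pattern adaptation the abstract refers to.
 */
static void dsm_fault(struct node *self, struct node *peer,
                      int pfn, int is_write)
{
    struct page_entry *pe = &self->pt[pfn];

    if (pe->state == INVALID)
        fetch_page(self, peer, pfn);

    if (is_write) {
        pe->write_faults++;
        peer->pt[pfn].state = INVALID;          /* invalidate remote copy */
        pe->state = READ_WRITE;
    } else {
        pe->read_faults++;
        if (peer->pt[pfn].state == READ_WRITE)
            peer->pt[pfn].state = READ_SHARED;  /* downgrade the writer */
        pe->state = READ_SHARED;
    }
}

int main(void)
{
    /* Node 0 takes a write fault on page 2, then node 1 reads it. */
    dsm_fault(&nodes[0], &nodes[1], 2, 1);
    strcpy((char *)nodes[0].mem[2], "hello from node 0");

    dsm_fault(&nodes[1], &nodes[0], 2, 0);
    printf("node 1 sees: %s\n", (char *)nodes[1].mem[2]);
    return 0;
}

In a real system the fault handler would of course run in the hypervisor, triggered by hardware page protection, with pages transferred over the interconnect rather than copied in memory.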

We have been using Linux as a guest operating system, since it is open source and supports NUMA hardware. vNUMA is an extreme NUMA system, with high penalties for remote memory access, so it is no surprise that we encounter various performance bottlenecks. Some of these are unique to vNUMA, but others are more general NUMA scalability problems that are exacerbated in the vNUMA environment. In both cases we can improve performance through a combination of careful algorithms in vNUMA and small modifications to Linux.
