17 February 2009

thin vs thick hypervisors

I made a presentation at the 2008 Linux Plumbers Conference about "thin" vs "thick" hypervisors, a subject very important to KVM in embedded systems. It's true Linux doesn't fit everywhere, and many embedded systems people have been dismayed by Linux's increasing memory footprint (which also hurts performance through cache and TLB pressure). Kevin Lawton has also written an article about this issue: it would increase KVM's appeal in embedded systems if we could first slim it down.

However, there's an important issue here that is sometimes obscured by memory footprint: functionality. Many of the proprietary "embedded virtualization" solutions (offered by vendors like OK Labs, Trango/VMware, VirtualLogix, et al) are thin precisely because they don't do a whole lot. In many cases, they are strictly about isolating hardware. That sounds good, right?

Strict hardware isolation is a double-edged sword, because while it allows you to minimize the virtualization layer (good for memory/flash footprint, security, etc), it doesn't let developers take advantage of most of virtualization's benefits. I'm not even talking about "frills" like live migration (which could still be considered critical for high availability in network infrastructure); I'm talking about basic consolidation, which is virtualization's bread and butter (and yes, software consolidation still makes sense in embedded systems).

You have 2 cores? If you only have 1 network interface, with a thin hypervisor you can only have 1 partition on the network. But even if you have 2 network interfaces for your 2 cores, there are still pesky bits of hardware you must consider. How many interrupt controllers do you have? How many real-time clocks? How many UARTs, and can you isolate them with the MMU, or are they all in the same page? If your software stacks require nonvolatile storage, how many flash controllers do you have?
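
To make the UART point concrete, here is a minimal C sketch of why MMU-based isolation is all-or-nothing at page granularity. The register addresses are made up for illustration and don't correspond to any real SoC:

```c
/* Minimal sketch of the MMU-granularity problem: hypothetical addresses,
 * not taken from any real SoC. */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Two hypothetical UARTs whose register blocks sit 0x100 bytes apart. */
static const uint32_t uart0_base = 0xe0004000;
static const uint32_t uart1_base = 0xe0004100;

static int same_page(uint32_t a, uint32_t b)
{
    /* The MMU grants or denies access per page, so devices that share a
     * page cannot be split between partitions. */
    return (a / PAGE_SIZE) == (b / PAGE_SIZE);
}

int main(void)
{
    if (same_page(uart0_base, uart1_base))
        printf("UART0 and UART1 share a %u-byte page: "
               "the MMU cannot isolate them from each other\n", PAGE_SIZE);
    else
        printf("the UARTs are in different pages and can be isolated\n");
    return 0;
}
```

Any two devices whose registers land in the same page have to be owned by the same partition, no matter how you'd prefer to divide them up.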

For very simple services like PIC control, you can get away with embedding that directly in the hypervisor itself. Note however that the PICs on some modern systems require rather complicated configuration code, and re-implementing that can actually be pretty tricky. (I'm thinking about x86 ACPI and PowerPC device tree parsing here.)
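
To give a feel for the kind of firmware-table parsing involved, here is a rough sketch (not from KVM or any real hypervisor) that uses libfdt to walk a flattened device tree and locate an OpenPIC-compatible interrupt controller. The "open-pic" compatible string and the simple two-cell reg layout are assumptions for illustration; real code also has to honour #address-cells/#size-cells, interrupt maps, and so on:

```c
/* Rough sketch only: walk a device tree blob with libfdt and report the
 * first OpenPIC-compatible interrupt controller. Real hypervisor code is
 * considerably more involved. Build with: cc -o findpic findpic.c -lfdt */
#include <stdio.h>
#include <stdlib.h>
#include <libfdt.h>

static void find_pic(const void *fdt)
{
    int node, len;

    for (node = fdt_next_node(fdt, -1, NULL); node >= 0;
         node = fdt_next_node(fdt, node, NULL)) {
        if (fdt_node_check_compatible(fdt, node, "open-pic") != 0)
            continue;

        /* Assume a simple <address size> pair of 32-bit cells; real code
         * must check #address-cells/#size-cells of the parent node. */
        const fdt32_t *reg = fdt_getprop(fdt, node, "reg", &len);
        if (reg && len >= 2 * (int)sizeof(fdt32_t))
            printf("PIC at 0x%x, size 0x%x\n",
                   fdt32_to_cpu(reg[0]), fdt32_to_cpu(reg[1]));
        return;
    }
    printf("no OpenPIC node found\n");
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <board.dtb>\n", argv[0]);
        return 1;
    }

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);

    void *blob = malloc(size);
    if (!blob || fread(blob, 1, size, f) != (size_t)size) {
        fprintf(stderr, "failed to read %s\n", argv[1]);
        return 1;
    }
    fclose(f);

    if (fdt_check_header(blob) != 0) {
        fprintf(stderr, "%s is not a valid device tree blob\n", argv[1]);
        return 1;
    }

    find_pic(blob);
    free(blob);
    return 0;
}
```

And that only finds the controller; actually programming it, routing interrupts, and keeping partitions from stepping on each other's interrupt sources is where the real complexity lives.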

Regardless, the only alternative for these services with a "thin" hypervisor is a dom0-style service partition. ("dom0" is the Xen terminology for the all-powerful Linux partition that is allowed to muck with all hardware on the system; all other partitions depend on it for IO services.) Once you go that route, you still have a thin hypervisor, but you've completely lost the security and reliability benefits that may have brought you to virtualization in the first place.

So for very simple use cases on very low-resource systems, thin hypervisors can make sense. But once you start to need anything beyond strict hardware partitioning, you've already entered the world of "thick" hypervisors, and the larger footprint of KVM becomes much less of an obstacle.

2 comments:

  1. Well written, Hollis.

    I agree with most of the content, minus the last paragraph. I don't agree with the term 'very low-resource systems'.

    Thin hypervisors certainly provide fewer features than thick hypervisors (I like the thick vs thin distinction much better than 'type 1' vs 'type 2'). As you stated, thin hypervisors typically do not provide device virtualization (where a single device can be shared by multiple virtual machines).

    However, in many embedded scenarios this is not a problem. A good example is the networking industry. Many systems there are not exactly low-resource: think 8-core systems with more than 10 GB of memory. People often still prefer to build these systems with thin hypervisors (in my experience).

    The reasoning here is that thin hypervisors provide better throughput, latency, and determinism than thick hypervisors. With thin hypervisors the guests have direct access to the devices, whereas with thick hypervisors device access is delegated through either dom0 or the hypervisor itself.

    In general, whenever people are focused on real-time operating systems, determinism, latency, and throughput, I find them looking towards thin hypervisors.


    (disclaimer: I work for Wind River as a product manager for a thin hypervisor)

  2. Does anyone know of an open source (read: 'free') version of something like NxTop out there?

    I do some work with non-profits who are constantly getting an influx of slightly better donated machines, and we have to spend tons of time ripping the Microsoft volume-licensed XP (which we need to run their stuff) off the soon-to-be-recycled machines and reinstalling it on the 'new' ones.

    I'd love to have a thinnish kqemu hypervisor running under a Linux kernel that I can stick on these old machines, then drop hardware-agnostic images onto.

    We can't afford NxTop, and since it's Xen-based, it won't run on CPUs without virtualization extensions anyway (and machines with VT aren't coming in as donations yet). I'm not smart enough to custom-compile something (perhaps Damn Small Linux with kqemu auto-loading?). Is anyone else out there?
