Some notes taken while reading the paper ‘Xen and the Art of Virtualization’.

Xen is an x86 virtual machine monitor that allows multiple commodity OSs to share conventional hardware in a safe and resource-managed fashion.

Virtual machine

goals: isolation, hosting services

design principles

  • must be isolated from one another
  • should support a variety of different OSs
  • the performance overhead should be small

VM vs process

  • processes cannot achieve performance isolation: one process can starve the others of CPU, memory, or I/O

VM vs container

  • both provide performance isolation
  • a VM can run a different OS; containers share the host kernel

VM vs. exokernel

  • VMs give the illusion of having the entire physical machine -> no need to modify the OS
  • a libOS must explicitly request physical resources -> the OS must be modified

Full virtualization vs paravirtualization

full virtualization - (VMware)

  • the guest OS should not be modified
  • OS -> hardware: the OS writes to (virtualized) registers in software (ex: on page faults, disk access); this state forms the VM context
  • the VMM saves the VM context and swaps between VMs
  • VMM runs in kernel mode; the guest OS runs in user mode
  • CRUX: the guest OS contains privileged instructions, but it runs in user mode -> each one raises an exception -> handled (emulated) by the VMM

why difficult for x86

  • in x86 some privileged instructions do not trap when executed in user mode -> binary rewriting is needed
  • x86 memory is difficult to virtualize (hardware-walked page tables)

paravirtualization

  • we can modify the OS slightly: what are the benefits? the drawbacks?
  • Hypercalls: similar to exokernel system calls; the key difference from full virtualization. Ex: replace each problematic privileged operation with a hypercall

design principles of paravirtualization

  • support for unmodified application binaries
  • support full multi-application OSs
  • should obtain high performance and strong resource isolation
  • completely hiding the effects of resource virtualization risks both correctness and performance, so Xen does not attempt it

Xen vs Denali

  • Denali does not target existing ABIs -> does not fully support x86 segmentation
  • Denali does not address supporting application multiplexing within a guest / Xen hosts a real OS that may securely multiplex itself
  • the Denali VMM performs all paging to and from disk / in Xen each guest OS performs its own paging using its own guaranteed memory reservation
  • Denali virtualizes the ‘namespace’ of machine resources / Xen instead uses access control within the hypervisor

Virtualization in Xen

memory virtualization:

  • Unix: VPN -> PPN
  • full virtualization: two-stage mapping VPN -> PPN -> machine frame; the shadow page table holds the composed mapping to hardware addresses
  • Xen:
    • page table updates go through a hypercall and are validated by the VMM; the guest keeps direct read access to its tables
    • guests cannot install fully-privileged segment descriptors and cannot overlap with the top end of the linear address space (where Xen lives)
    • this avoids a TLB flush when entering the hypervisor

CPU virtualization

  • privilege: the guest OS must be modified to run at a lower privilege level; on CPUs with only two levels it would run at the same level as applications, in a separate address space, but x86 has four rings, so the guest OS can run in ring 1 (Xen in ring 0, applications in ring 3)
  • exception handlers:
    • a hypercall registers a descriptor table of exception handlers with the VMM; Xen validates the handlers and delivers faults to the guest from ring 0
    • the guest OS must modify its page fault handler, because it would otherwise read the faulting address from a privileged register (CR2)
  • system calls - the guest installs a ‘fast’ handler that the processor calls directly, without indirecting via ring 0
  • hypercall
  • has both real and virtual time

domain 0

  • a domain created at boot time and permitted to use the control interface
  • the privileged management domain: it can set VMM parameters and control other domains

device virtualization:

  • full virtualization: emulate the device -> two-stage copying
  • paravirtualization: DMA into guest memory plus shared-memory, asynchronous I/O rings

control

  • hypercall: domain -> Xen (synchronous)
  • event: Xen -> domain (asynchronous)
  • pending events are stored in a per-domain bitmask

multiple notions of time

  • real time
  • virtual time and BVT scheduling
  • wall time

network

  • packet filter (rules); domain 0 is responsible for inserting and removing rules
  • zero-copy reception: descriptor rings plus DMA directly into guest-provided page frames

disk

  • only domain 0 has direct unchecked access to physical disks; other domains see VBDs (virtual block devices)
  • requests are reordered both within the guest OS and within Xen
  • domains may pass down reorder barriers to prevent reordering (e.g., for write-ahead logging)

evaluation method

  • comparison (against native Linux, VMware, and UML)
  • metrics: relative performance
  • workload: macro- and micro-benchmarks

performance overheads

  • slower at fork, exec, and sh: these require large numbers of page table updates, which must all be verified by Xen
  • context switch time: requires a hypercall to change the page table base
  • page fault latency: Xen requires two transitions: 1. take the hardware fault and pass the details to the guest OS; 2. the guest OS installs the updated page table entry
  • overhead under high levels of synchronous disk activity

Paravirtualization today

  • requiring OS modification makes paravirtualization hard to adopt
  • its performance advantage is no longer a big deal (hardware virtualization support has closed the gap)