Xen and the art of virtualization - my notes
Some notes taken while reading the paper 'Xen and the Art of Virtualization'.
Xen is an x86 virtual machine monitor allowing multiple commodity OSs to share conventional hardware in a safe and resource managed fashion.
Virtual machine
goal: isolation, hosting services
design principles
- VMs must be isolated from one another
- should support a variety of different OSes
- the performance overhead should be small
VM vs process
- why not processes? they cannot provide performance isolation (one process can hog CPU, memory, or disk and starve the rest)
VM vs container
- both provide performance isolation
- a VM can run a different OS (its own kernel); containers share the host kernel
VM vs. exokernel
- a VM gives the illusion of having the entire physical machine -> no need to modify the OS
- a libOS must explicitly request physical resources -> the OS must be modified
Full virtualization vs paravirtualization
full virtualization (VMware)
- the guest OS is not modified
- OS -> "hardware": the hardware the guest sees (registers, page-fault state, disk controller, ...) is emulated in software and kept as part of the VM context
- the VMM saves and swaps VM contexts when switching between guests
- VMM runs in kernel mode, the guest OS in user mode
- CRUX: the guest OS contains privileged instructions but runs in user mode -> they raise an exception -> the VMM handles (emulates) them, i.e. trap-and-emulate
why this is difficult on x86
- on x86 some sensitive/privileged instructions do not trap when executed in user mode -> VMware falls back to binary rewriting (see the sketch after this list)
- x86 memory management (hardware-walked page tables, an untagged TLB) is also difficult to virtualize
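The canonical example of such a non-trapping instruction is popf. The snippet below is my own sketch, not from the paper: run in user mode, the attempt to clear the interrupt flag is silently ignored instead of faulting, so a pure trap-and-emulate VMM never gets control.

```c
/* Sketch (mine, not the paper's) of the classic x86 virtualization hole:
 * popf is sensitive but unprivileged. A guest kernel demoted to user mode
 * that tries to disable interrupts by clearing EFLAGS.IF is silently
 * ignored instead of trapping, so a trap-and-emulate VMM never sees it.
 * Build with gcc on x86/x86-64 and run as a normal user. */
#include <stdio.h>

int main(void)
{
    unsigned long flags;

    __asm__ volatile("pushf; pop %0" : "=r"(flags));   /* read EFLAGS   */
    flags &= ~(1UL << 9);                               /* clear IF bit  */
    __asm__ volatile("push %0; popf"
                     : : "r"(flags) : "cc", "memory");  /* try to write  */
    __asm__ volatile("pushf; pop %0" : "=r"(flags));    /* read it back  */

    /* In ring 3 IF is still 1: the write was dropped, not trapped. */
    printf("IF after popf attempt: %lu\n", (flags >> 9) & 1);
    return 0;
}
```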
paravirtualization
- the OS is modified slightly: the benefit is better performance and a simpler VMM, the drawback is porting effort (and unmodified OSes cannot run)
- Hypercalls: similar to exokernel system calls, and the key difference from full virtualization. Ex: replace a problematic privileged operation with a hypercall into Xen (guest-side stub sketched below)
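As a concrete guest-side illustration: early 32-bit Xen issued hypercalls via software interrupt 0x82 with the operation number and arguments in registers. The stub below is a minimal sketch assuming that convention; the operation number is invented, not Xen's real ABI.

```c
/* Minimal sketch of a guest-side hypercall stub, assuming the int $0x82 /
 * register-argument convention of early 32-bit Xen. The operation number
 * below is invented for illustration, not Xen's real ABI. */
static inline long hypercall2(unsigned int nr, unsigned long a1, unsigned long a2)
{
    long ret;
    __asm__ volatile("int $0x82"
                     : "=a"(ret)                 /* result returned in EAX  */
                     : "a"(nr), "b"(a1), "c"(a2) /* number + two arguments  */
                     : "memory");
    return ret;
}

#define HYPERCALL_CONSOLE_WRITE 42   /* hypothetical operation number */

static void example(void)
{
    static const char msg[] = "hello from ring 1\n";
    /* Instead of touching privileged hardware directly, trap into Xen. */
    hypercall2(HYPERCALL_CONSOLE_WRITE, (unsigned long)msg, sizeof msg - 1);
}
```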
design principles of paravirtualization
- support for unmodified application binaries
- support full multi-application OSes
- paravirtualization is necessary to obtain high performance and strong resource isolation on x86
- completely hiding the effects of resource virtualization from the guest risks both correctness and performance (e.g. guests should see real as well as virtual time)
Xen vs Denali
- Denali does not target existing ABIs -> e.g. it does not fully support x86 segmentation, which the Linux/NetBSD/Windows XP ABIs rely on
- the Denali implementation does not address application multiplexing within a guest / Xen hosts real OSes, each of which securely multiplexes many unmodified applications
- in Denali the VMM performs all paging to and from disk / in Xen each guest OS performs its own paging using its own guaranteed memory reservation
- Denali virtualizes the 'namespace' of machine resources / Xen exposes real resource names and relies on secure access control within the hypervisor
Virtualization in Xen
memory virtualization:
- unix: VPN -> PPN via the OS's page tables
- full virtualization: two-stage mapping, virtual -> guest "physical" -> machine; the VMM keeps shadow page tables whose entries point at the real machine addresses
- Xen:
  - guest page tables are registered with Xen; the guest has direct read access, but every update goes through a (batchable) hypercall so Xen can validate it (batching sketch after this list)
  - segmentation is handled similarly: guests cannot install fully-privileged segment descriptors, and segments may not overlap the Xen-reserved top of the linear address space
  - Xen lives in a 64 MB region at the top of every address space, so entering and leaving the hypervisor avoids a TLB flush
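To make the batched updates concrete, here is a minimal guest-side sketch in the spirit of Xen's mmu_update hypercall; the struct layout and the wrapper name are assumptions for illustration (built on a stub like the earlier hypercall sketch), not Xen's exact ABI.

```c
/* Sketch of batching page-table updates through a hypercall, in the spirit
 * of Xen's mmu_update interface (struct layout and wrapper are assumptions,
 * not copied from Xen). The guest reads its page tables directly but never
 * writes them; it queues (ptr, val) pairs and lets Xen validate and apply. */
#include <stdint.h>
#include <stddef.h>

struct mmu_update {
    uint64_t ptr;   /* machine address of the PTE to change */
    uint64_t val;   /* new PTE contents, validated by Xen   */
};

#define UPDATE_QUEUE_LEN 64
static struct mmu_update queue[UPDATE_QUEUE_LEN];
static size_t queued;

/* Assumed hypercall wrapper (see the hypercall stub sketched earlier). */
extern long hypercall_mmu_update(struct mmu_update *req, size_t count);

static void queue_pte_update(uint64_t pte_machine_addr, uint64_t new_val)
{
    queue[queued].ptr = pte_machine_addr;
    queue[queued].val = new_val;
    if (++queued == UPDATE_QUEUE_LEN) {   /* flush when the batch is full ... */
        hypercall_mmu_update(queue, queued);
        queued = 0;
    }
}

static void flush_pte_updates(void)       /* ... or before the mappings must be live */
{
    if (queued) {
        hypercall_mmu_update(queue, queued);
        queued = 0;
    }
}
```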
cpu virtualization
- privilege: the guest OS must be modified to run at a lower privilege level than the hypervisor; on architectures with only two levels it would have to run at the same level as applications and protect itself with a separate address space, but x86 has four rings, so Xen runs in ring 0 and the guest kernel is moved to ring 1 (applications stay in ring 3)
- exception handlers:
  - the guest registers a descriptor table of exception handlers with Xen via a hypercall; Xen validates them (handler code may not specify ring 0 execution), and faults are then delivered to the guest's own handlers (registration sketch after this list)
  - the guest's page fault handler must be modified because it would normally read the faulting address from a privileged register (CR2); Xen passes that value to the guest instead
  - system calls: the guest may install a 'fast' handler that the processor calls directly, without indirecting through ring 0 / Xen
- other privileged operations in the guest kernel are replaced by hypercalls into Xen
- time: guests are given both real and virtual time (see the time section below)
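A minimal sketch of the registration step, following my reading of Xen's set_trap_table-style interface; the field layout, selector constant, and wrapper name are assumptions, not quotes from the paper.

```c
/* Sketch of a guest registering its exception handlers with Xen at boot,
 * so that most exceptions are dispatched into the guest without Xen's
 * involvement. Treat the struct fields, the selector value, and the
 * hypercall wrapper name as illustrative assumptions. */
#include <stdint.h>

#define GUEST_KERNEL_CS 0x61   /* placeholder code-segment selector */

struct trap_info {
    uint8_t       vector;      /* exception / interrupt vector            */
    uint8_t       flags;       /* e.g. lowest privilege allowed to raise  */
    uint16_t      cs;          /* code segment of the handler             */
    unsigned long address;     /* handler entry point inside the guest    */
};

extern void divide_error(void);
extern void page_fault(void);   /* modified: faulting address comes from
                                   Xen, since reading CR2 needs ring 0    */
extern void system_call(void);  /* 'fast' handler, reachable from ring 3
                                   without bouncing through Xen           */

extern long hypercall_set_trap_table(struct trap_info *table); /* assumed wrapper */

static struct trap_info trap_table[] = {
    { 0x00, 0, GUEST_KERNEL_CS, (unsigned long)divide_error },
    { 0x0e, 0, GUEST_KERNEL_CS, (unsigned long)page_fault   },
    { 0x80, 3, GUEST_KERNEL_CS, (unsigned long)system_call  },
    { 0, 0, 0, 0 }              /* terminator */
};

void register_guest_traps(void)
{
    hypercall_set_trap_table(trap_table);
}
```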
domain 0
- a domain created at boot time and permitted to use the control interface
- the privileged management domain: it can set hypervisor parameters and control other domains (creation and termination, scheduling parameters, memory allocation, access to disks and network devices)
device virtualization:
- full virtualization: emulate existing hardware devices in software (guest driver -> emulated device -> real driver, a two-stage path)
- paravirtualization: expose simple idealized devices; data moves by DMA into shared memory and is handed over through asynchronous I/O descriptor rings (ring sketch below)
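A minimal sketch of the shared-memory asynchronous I/O ring idea; field names and sizes are illustrative, not Xen's actual ring layout. The guest produces request descriptors, Xen consumes them and later produces responses, and notification happens separately via events (next section).

```c
/* Sketch of an asynchronous I/O ring shared between guest and hypervisor:
 * a circular buffer of descriptors with producer/consumer indices advanced
 * by the guest and by Xen respectively. Names and sizes are illustrative. */
#include <stdint.h>

#define RING_SIZE 64   /* power of two so indices can wrap with a mask */

struct io_desc {
    uint64_t id;       /* guest-chosen tag, echoed back in the response */
    uint64_t addr;     /* machine address of the data buffer (for DMA)  */
    uint32_t len;
    uint32_t op;       /* read / write / ...                            */
};

struct io_ring {
    /* each index is written by one side and read by the other */
    volatile uint32_t req_prod, req_cons;   /* requests:  guest -> Xen  */
    volatile uint32_t rsp_prod, rsp_cons;   /* responses: Xen  -> guest */
    struct io_desc ring[RING_SIZE];
};

/* Guest side: queue a request; Xen consumes it asynchronously and later
 * places a response, notifying the guest through an event. */
static int ring_put_request(struct io_ring *r, const struct io_desc *d)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return -1;                                   /* ring is full */
    r->ring[r->req_prod & (RING_SIZE - 1)] = *d;
    __sync_synchronize();                            /* publish data before index bump */
    r->req_prod++;
    return 0;
}
```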
control
- hypercall: domain -> Xen (sync)
- event: Xen -> domain (async)
- pending events are stored in a per-domain bitmask and handled by a callback the guest registers (dispatch sketch below)
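A rough sketch of the guest-side dispatch; the shared-structure names and widths are illustrative, not Xen's exact layout.

```c
/* Sketch of guest-side event dispatch: Xen marks pending events in a
 * per-domain bitmask and invokes a callback the guest registered; the
 * callback scans, acknowledges, and handles the set bits. */
#include <stdint.h>

#define NR_EVENTS 64

struct shared_info {
    volatile uint64_t ev_pending;   /* bit i set => event i is pending   */
    volatile uint64_t ev_mask;      /* bit i set => delivery is deferred */
};

typedef void (*event_handler_t)(unsigned int ev);
static event_handler_t handlers[NR_EVENTS];   /* filled in by guest drivers */

/* The asynchronous Xen -> domain path, mirroring the synchronous
 * domain -> Xen hypercall path. */
void event_callback(struct shared_info *s)
{
    uint64_t pending = s->ev_pending & ~s->ev_mask;

    while (pending) {
        unsigned int ev = (unsigned int)__builtin_ctzll(pending);
        __sync_fetch_and_and(&s->ev_pending, ~(1ULL << ev));  /* acknowledge */
        if (handlers[ev])
            handlers[ev](ev);
        pending &= pending - 1;       /* clear lowest set bit and continue */
    }
}
```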
multiple notions of time
- real (system) time: nanoseconds since boot
- virtual time: advances only while the domain is running; used by the BVT scheduler
- wall-clock time
network
- packet filtering rules: Domain0 is responsible for inserting and removing them
- zero-copy reception: descriptor rings plus DMA; a delivered packet's page is exchanged for an unused frame supplied by the guest, so packet data is never copied (sketch below)
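A rough sketch of that page-exchange idea; the structure and names are invented for illustration.

```c
/* Illustrative sketch of zero-copy reception by page exchange: the guest
 * offers empty page frames on its receive ring and incoming packets are
 * DMA'd directly into them, so no copy into guest memory is needed. */
#include <stdint.h>

#define RX_RING_SIZE 32

struct rx_slot {
    uint64_t frame_machine_addr;  /* empty frame offered by the guest     */
    uint32_t len;                 /* filled in when a packet is delivered */
    uint32_t in_use;
};

/* Guest side: keep the ring stocked so there is always a frame to receive
 * into; alloc_free_frame() stands in for the guest's page allocator. */
static void refill_rx_ring(struct rx_slot ring[RX_RING_SIZE],
                           uint64_t (*alloc_free_frame)(void))
{
    for (int i = 0; i < RX_RING_SIZE; i++) {
        if (!ring[i].in_use) {
            ring[i].frame_machine_addr = alloc_free_frame();
            ring[i].in_use = 1;
        }
    }
}
```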
disk
- only Domain0 has direct unchecked access to physical disks; other domains go through virtual block devices (VBDs)
- requests may be reordered both within the guest OS and within Xen to improve disk scheduling
- domains may pass down reorder barriers to prevent reordering where ordering matters, e.g. for a write-ahead log (sketch below)
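The paper only says that barriers can be passed down; one plausible way to express that on the block request ring is as an extra operation code, as in this invented sketch.

```c
/* Invented sketch: a reorder barrier expressed as just another operation
 * code on the virtual block device (VBD) request ring. Xen may freely
 * reorder reads/writes for better disk throughput, but must not move
 * requests across a barrier (e.g. to keep a write-ahead log correct). */
#include <stdint.h>

enum vbd_op {
    VBD_READ,
    VBD_WRITE,
    VBD_BARRIER      /* everything queued before this completes before
                        anything queued after it is issued to the disk */
};

struct vbd_request {
    enum vbd_op op;
    uint64_t    sector;        /* sector within the virtual block device */
    uint64_t    buffer_maddr;  /* machine address of the data buffer     */
    uint32_t    nr_sectors;
};
```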
evaluation method
- comparison: XenoLinux vs. native Linux, VMware Workstation, and User-mode Linux
- metrics: performance relative to native Linux
- workload: macro- and micro-benchmarks
performance overheads
- slower on fork, exec, and sh: they require large numbers of page table updates, all of which must be validated by Xen
- context switch times increase because changing the page table base requires a hypercall
- page fault latency: Xen needs two transitions: 1. take the hardware fault and pass the details to the guest OS; 2. install the updated page table entry on the guest's behalf
- benchmarks dominated by a high level of synchronous disk activity show little difference between the systems
Paravirtualization today
- OS modification -> an adoption barrier, which made it hard to take off
- performance: no longer a big deal
Author xymeow
LastMod 2018-09-19