Virtualization

What is Virtualization?

Virtualization is a technology that mimics the behaviour of one system on top of an existing system by enhancing or extending the existing interface. The idea is nothing new: IBM first employed it in the 1960s to let legacy software run on newer versions of its mainframes.

The OS is the interface between the hardware and the programs running on the system; programs interact with the hardware through the interface the operating system provides. The interfaces provided by different operating systems (e.g. Windows 10, Unix, Linux) differ, so a program written to run on one operating system won't run on another because of the difference in interface: we say the program is not portable. A good and popular example of how to solve this is Java. Java addresses portability by running compiled Java code (bytecode) on a Java Virtual Machine (JVM). The JVM is a layer of software running on top of the OS that mimics the interface the Java bytecode expects, so portability is guaranteed between systems running a JVM.
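
The same idea shows up with any interpreter or language VM. As a hedged illustration (in Python rather than Java, purely as an analogy), the sketch below talks only to the interface the interpreter exposes, so the same file runs unchanged on Windows, Linux or macOS; the interpreter, like the JVM, absorbs the differences between operating systems:

```python
# portability_demo.py -- illustrative sketch, analogous to "write once, run anywhere".
# The script talks only to the interpreter's standard library, never to the OS
# directly, so the same file runs unmodified wherever an interpreter is installed.
import os
import platform
import tempfile

print(f"Host OS     : {platform.system()} {platform.release()}")
print(f"Interpreter : {platform.python_implementation()} {platform.python_version()}")

# Path handling, temp files and environment access all go through the same portable
# API, even though the underlying system calls differ per operating system.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
    f.write("written through a portable interface\n")
    path = f.name

print(f"Temp file at: {path}")
os.remove(path)
```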

This idea of virtualization can be applied to any interface between the layers of a computing system. The lowest layer is the hardware (bare metal), primarily the processor. The interface a processor exposes is its instruction set (machine instructions, or at a human level, assembly instructions), and different processor families have their own instruction sets. If we place software on this bare metal that separates out the hardware and provides an interface able to emulate (mimic) multiple instances of the underlying hardware, what we have achieved is hardware virtualization.
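
To make the idea of mimicking an instruction-set interface concrete, here is a hedged toy sketch (the LOAD/ADD/PRINT/HALT instruction set is invented for this example and is far simpler than any real architecture): a few lines of software that emulate a tiny processor, so that any "program" written against that instruction set runs wherever the emulator runs, regardless of the physical processor underneath:

```python
# Toy instruction-set emulator -- an illustrative sketch only; the instruction set
# (LOAD/ADD/PRINT/HALT) is made up for this example and is not a real architecture.

def run(program):
    """Fetch-decode-execute loop over a tiny, made-up instruction set."""
    registers = {"r0": 0, "r1": 0}
    pc = 0                                  # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":                    # LOAD reg, immediate
            registers[args[0]] = args[1]
        elif op == "ADD":                   # ADD dst, src  (dst += src)
            registers[args[0]] += registers[args[1]]
        elif op == "PRINT":                 # PRINT reg
            print(args[0], "=", registers[args[0]])
        elif op == "HALT":
            break
        pc += 1

# A "machine program" for the toy instruction set; it runs wherever the emulator runs.
run([
    ("LOAD", "r0", 2),
    ("LOAD", "r1", 40),
    ("ADD", "r0", "r1"),
    ("PRINT", "r0"),
    ("HALT",),
])
```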

The layer above the hardware is the OS layer; on top of that sit libraries/APIs, and above those the application programs. If this idea of mimicking is applied between the OS and the library layer, we get OS virtualization; similarly, we can achieve application virtualization. Virtualization can be applied to almost any hardware or software resource: storage virtualization, NIC (Network Interface Card) virtualization, LAN virtualization (VLANs), and so on.

Why do we need Virtualization?

Virtualization is a shift in mindset from physical to logical: creating many logical resources from one physical resource (for example, multiple VMs on one physical machine). It is used to improve resource utilization, portability and flexibility. Consider an enterprise running several physical servers: one runs a web server, a second a database, a third a load balancer. Such enterprise servers are typically over-provisioned by 70% to 80% (to absorb occasional load bursts), meaning only around 20% to 30% of the available resources are actually used. With hardware virtualization, these servers can be moved onto three Virtual Machines (VMs) created on a single physical server, saving cost in hardware, maintenance and power consumption. Each VM mimics a hardware interface, so it has its own operating system, virtual NIC, virtual storage and so on, and an application running on it behaves no differently than it would on a physical server. From a security perspective the VMs are isolated from each other, and they can communicate through their virtual NICs if required. This is commonly referred to as server consolidation using virtualization.
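
A rough back-of-the-envelope sketch of the consolidation arithmetic (the core counts and utilisation figures below are illustrative assumptions, not measurements):

```python
# Server-consolidation arithmetic -- illustrative numbers only.
servers = {
    "web server":    {"cpu_cores": 8, "avg_utilisation": 0.30},
    "database":      {"cpu_cores": 8, "avg_utilisation": 0.25},
    "load balancer": {"cpu_cores": 8, "avg_utilisation": 0.15},
}

total_cores = sum(s["cpu_cores"] for s in servers.values())
used_cores = sum(s["cpu_cores"] * s["avg_utilisation"] for s in servers.values())

print(f"Physical cores deployed : {total_cores}")
print(f"Cores actually busy     : {used_cores:.1f}")
print(f"Overall utilisation     : {used_cores / total_cores:.0%}")

# With some headroom for load bursts, the same work fits comfortably on one host.
headroom = 2.0                     # assume peak load is roughly 2x the average
host_cores_needed = used_cores * headroom
print(f"One consolidated host needs roughly {host_cores_needed:.0f} cores")
```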

These VMs are typically stored as files, so like any digital file they can easily be moved from one machine to another and opened there as VMs (this does require the appropriate virtualization software on the target machine). This helps with portability.

Because VMs are files, they can be backed up easily. We can also take snapshots (an image of the VM at a point in time), which comes in handy for destructive testing: any damage caused by the tests can be undone by discarding the current state and restoring the VM from the snapshot taken before the tests. This is how virtualization helps with flexibility.
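
As a hedged example of that snapshot workflow, the sketch below drives VirtualBox's VBoxManage command-line tool from Python; the VM name "test-vm" and the snapshot name are made up for illustration, and the exact flags may vary between VirtualBox versions:

```python
# Snapshot / restore workflow sketch using VirtualBox's VBoxManage CLI.
# "test-vm" is a hypothetical VM name; adjust it to an existing VM on your host.
import subprocess

VM = "test-vm"

def vbox(*args):
    """Run a VBoxManage subcommand and fail loudly if it errors."""
    subprocess.run(["VBoxManage", *args], check=True)

# 1. Take a snapshot before the destructive tests.
vbox("snapshot", VM, "take", "before-destructive-tests")

# 2. ... run the destructive tests against the VM here ...

# 3. Roll back: power the VM off and restore the snapshot.
vbox("controlvm", VM, "poweroff")
vbox("snapshot", VM, "restore", "before-destructive-tests")
vbox("startvm", VM, "--type", "headless")
```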

Different Types of Virtualization

Based on the implementation, the major types are:

Full/Native Virtualization:

In this type of virtualization, software is used to emulate the native hardware interface itself: if the hardware is x86, the emulation presents an exact copy of the x86 interface, so any software that can run on x86 will run on the emulation without modification. The obvious question is why we would want to emulate an x86 on an x86. Simple: with this approach we can create multiple copies of the x86 machine (the VMs), and each copy can run its own software stack, from the OS up to the applications.
Examples: Oracle VirtualBox, VMware Workstation

Para Virtualization:

In this form of virtualization, the OS running in the VM (the guest OS) is modified so that all privileged and sensitive instructions (instructions that should run only in kernel mode) are replaced with explicit calls into the virtualization layer, often called hypercalls, which the virtualization layer then services. Examples: Xen, ESX Server
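
A hedged toy model of the idea (purely illustrative, not how Xen or ESX actually implement their interfaces): instead of issuing a privileged instruction directly, the modified guest calls into the hypervisor explicitly:

```python
# Toy model of paravirtualization -- illustrative only, not a real hypercall ABI.

class Hypervisor:
    """Services hypercalls on behalf of its guests."""
    def hypercall(self, guest, operation, *args):
        print(f"[hypervisor] servicing '{operation}' for {guest}")
        # A real hypervisor would validate the request and touch the hardware here.
        return "ok"

class ParavirtualizedGuestOS:
    """A guest OS whose privileged operations were rewritten as hypercalls."""
    def __init__(self, name, hypervisor):
        self.name = name
        self.hv = hypervisor

    def set_page_table(self, address):
        # An unmodified OS would execute a privileged instruction directly here;
        # a paravirtualized OS asks the hypervisor to do it instead.
        return self.hv.hypercall(self.name, "update_page_table", address)

hv = Hypervisor()
guest = ParavirtualizedGuestOS("guest-1", hv)
guest.set_page_table(0x1000)
```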

Based on the resource being virtualized:

Server virtualization

This involves partitioning a physical server so that multiple virtual servers can share the same physical machine (refer to server consolidation above).

Operating system virtualization

In this form of virtualization, the OS is divided into multiple isolated user spaces (namespaces), allowing multiple applications to run in their own isolated environment. From the application's perspective, this is as good as running on a regular physical machine. The idea can be hard to picture at first, so consider a Linux server with many processes running. If you can isolate some of those processes so that they cannot see anything outside their own silo, what you have done is OS virtualization. The processes inside a silo behave as if they were the only processes on the system (no different from processes running in a VM or on a physical server). These silos are called containers, the most popular being Docker containers. Docker was originally built on LXC (Linux Containers) and now uses kernel features such as namespaces and cgroups directly through its own runtime. Simply stated, OS virtualization is a way of running isolated containers (lightweight Linux environments) on a single Linux kernel.
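
On Linux, the isolation mechanism underneath containers is kernel namespaces (plus cgroups for resource limits). The hedged sketch below, which must be run as root on Linux, calls the kernel's unshare() with CLONE_NEWUTS so the process gets its own hostname namespace; the hostname change is visible only inside that process, a miniature version of what container runtimes do across many namespace types:

```python
# Minimal namespace demo (Linux only, run as root) -- a tiny slice of what container
# runtimes do: unshare the UTS namespace and change the hostname without affecting
# the rest of the system.
import ctypes
import os
import socket

CLONE_NEWUTS = 0x04000000          # flag for a new UTS (hostname) namespace

libc = ctypes.CDLL("libc.so.6", use_errno=True)

print("hostname before:", socket.gethostname())

if libc.unshare(CLONE_NEWUTS) != 0:
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))

# From here on, this process (and its children) sees a private hostname.
socket.sethostname("container-demo")
print("hostname inside the new namespace:", socket.gethostname())
# Other processes on the host still see the original hostname.
```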

Containers are lightweight compared to VMs: every VM carries its own full-blown operating system, which makes it heavyweight, whereas containers share the host kernel and only mimic multiple instances of the OS interface. Being lightweight, containers can be provisioned on demand in seconds while still providing the needed isolation. Each container is isolated from the others (or can be allowed to communicate with them through the appropriate configuration).
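
As a hedged practical illustration (assuming the Docker CLI is installed and the small "alpine" image has already been pulled), the sketch below starts a throwaway container, runs one command inside it, and measures how quickly it comes and goes; the point is only that provisioning a container takes seconds rather than minutes:

```python
# Time how long it takes to provision, run and discard a throwaway container.
# Assumes the Docker CLI is installed and the "alpine" image is available locally.
import subprocess
import time

start = time.monotonic()
result = subprocess.run(
    ["docker", "run", "--rm", "alpine", "echo", "hello from a container"],
    capture_output=True,
    text=True,
    check=True,
)
elapsed = time.monotonic() - start

print(result.stdout.strip())
print(f"container started, ran and exited in {elapsed:.2f}s")
```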

Containers gained popularity with the rise of microservices architecture. Most legacy enterprise applications are monolithic: one consolidated application containing all the modules that support the enterprise's services and functions (finance, HR, purchasing and so on). In a monolithic application, the whole system has to be shut down even when only one module needs maintenance, causing a system-wide outage that is hardly business friendly. With a microservices architecture, these functions/services are separated and run in different containers (with the required access to one another), thereby allowing each service to be maintained without impacting unrelated services.

Data Virtualization

Data virtualization is a logical data layer that integrates all enterprise data siloed across disparate systems, manages the unified data for centralized security and governance, and delivers it to business users in real time.

Network Virtualization

Network virtualization is used by cloud providers to build SDNs (software-defined networks). Essentially, an abstraction is created on top of the traditional networking infrastructure, which makes networks easier to manage because the control plane is software based.
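
A hedged, minimal example of the building blocks (Linux only, run as root; this is virtual NICs plus a virtual switch rather than a full SDN stack): the sketch creates a Linux bridge and a veth pair with the standard `ip` tool, the same primitives that container and cloud network layers are assembled from. The interface names (demo-br0, demo-veth0/1) are arbitrary.

```python
# Create a virtual switch (Linux bridge) and a virtual NIC pair (veth) -- run as root.
# These are low-level primitives that virtual networks are built from; the interface
# names below are made up for this demo.
import subprocess

def ip(*args):
    subprocess.run(["ip", *args], check=True)

ip("link", "add", "demo-br0", "type", "bridge")                    # virtual switch
ip("link", "add", "demo-veth0", "type", "veth", "peer", "name", "demo-veth1")
ip("link", "set", "demo-veth0", "master", "demo-br0")              # plug one end into the bridge
ip("link", "set", "demo-br0", "up")
ip("link", "set", "demo-veth0", "up")
ip("link", "set", "demo-veth1", "up")

print("virtual bridge and veth pair created; clean up with:")
print("  ip link del demo-br0 && ip link del demo-veth0")
```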

How does Virtualization work?

The primary component of any virtualization is the software that separates the physical resource (the resource being virtualized) from the virtual environments; this software is called the hypervisor. The hypervisor controls and allocates the underlying physical resources to the virtual instances. A hypervisor can be thought of as a specialized operating system designed to run virtual machines. Depending on where they sit in the stack, hypervisors come in two types: Type 1 and Type 2.

Type 1 hypervisor

A Type 1 hypervisor runs directly on top of the hardware, which is why it is also called a bare-metal hypervisor. The hypervisor is installed directly on the server (much like an OS) and multiple VMs are created on top of it. With a Type 1 hypervisor it is the hypervisor, not an OS, that boots on system startup. This type of hypervisor is what enterprises use to virtualize their IT resources.

A Type 1 hypervisor acts like an OS and runs all privileged and sensitive operations on behalf of the VMs; the OS inside each VM is not modified in any way. When a guest OS tries to execute a sensitive or privileged instruction, the processor generates a trap (the processor needs additional circuitry, such as hardware virtualization extensions, to support this), and the hypervisor services it.
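
A hedged toy model of this trap-and-emulate behaviour (purely illustrative; real traps are raised by the CPU with hardware assistance such as Intel VT-x or AMD-V, not by application code): the guest is unmodified and simply tries to execute a privileged instruction; the "processor" refuses and traps into the hypervisor, which performs the operation on the guest's behalf:

```python
# Toy model of trap-and-emulate -- illustrative only.

class PrivilegedInstructionTrap(Exception):
    """Raised when a guest tries to run a privileged instruction in user mode."""

class ToyCPU:
    def execute(self, instruction):
        if instruction.startswith("priv:"):
            # The guest may not run this directly; trap to the hypervisor.
            raise PrivilegedInstructionTrap(instruction)
        print(f"[cpu] executed '{instruction}' directly")

class ToyHypervisor:
    def __init__(self):
        self.cpu = ToyCPU()

    def run_guest_instruction(self, guest, instruction):
        try:
            self.cpu.execute(instruction)
        except PrivilegedInstructionTrap as trap:
            # Emulate the privileged operation safely on the guest's behalf.
            print(f"[hypervisor] trapped {trap} from {guest}, emulating it")

hv = ToyHypervisor()
hv.run_guest_instruction("vm-1", "add r1, r2")        # normal instruction
hv.run_guest_instruction("vm-1", "priv:write_cr3")    # privileged -> trap
```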

Examples:

  1. VMware ESX and ESXi
  2. Microsoft Hyper-V
  3. XenServer
  4. Oracle VM

Type 2 hypervisor

A Type 2 hypervisor runs on a host operating system and uses the host OS's services to provide virtualization. The VMs sit on top of the hypervisor layer, so the stack is: hardware, host OS, hypervisor, VMs. Each VM can have its own OS, referred to as the guest OS.

Like a Type 1 hypervisor, a Type 2 hypervisor runs privileged and sensitive operations on behalf of its VMs without requiring the guest OS to be modified; the difference is that it relies on the host operating system for access to the hardware, which generally adds some overhead compared to a bare-metal hypervisor.

Examples:

  1. Oracle VirtualBox
  2. Microsoft Virtual PC
  3. VMware Workstation