A REVIEW PAPER ON ENERGY AWARE VIRTUAL MACHINE MIGRATION IN
CLOUD DATA CENTER
Aishwarya Kaushik 1, Amisha Gupta 2, Arushi Jindal3
Department of Computer Science
Abstract— Cloud computing provides resources
in near-unlimited quantity at a competitive rate and lets users obtain
resources on demand under a pay-as-you-go pricing model. A cloud can
be expanded to satisfy increased resource requests and shrunk to improve
the system’s resource utilization.
Such improvements are first tested in the CloudSim simulation software.
Challenges include efficient provisioning of cloud resources, because demand
keeps changing and different types of applications must be supported.
Virtualization technology helps to increase resource utilization, but the
operating cost of a cloud still rises steadily, mainly because of electrical
energy consumption. To reduce this cost, virtual machines (VMs) are dynamically
consolidated onto fewer physical machines (PMs) using live VM migration.
However, migration may cause SLA violations, for which the provider is
penalized. To maintain a good energy-performance trade-off, the number of VM
migrations should therefore be minimized [1]. To that end, this paper proposes
an algorithm that decides whether migrating a VM is really necessary, based on
the present load and on the future load predicted by a multi-layer
feed-forward neural network.
Keywords: Neural Networks, Service Level Agreement, Cloud Computing, Resource Management,
Virtual Machine Migration, CloudSim
I. Introduction
Cloud computing is a model for
enabling ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released
with minimal management effort or service provider interaction. The
main aim of cloud service providers is to maximize the utilization of their data centers
by executing user applications on as few physical machines as possible.
To fulfill SLAs, a suitable resource management scheme
is required, one that dynamically allocates to each service request the
minimal resources it needs and leaves the surplus resources free to host
more virtual machines. A cloud provides computing resources in the form
of virtual machines (abstract machines that run on physical machines). The mapping between VMs and PMs can be changed while
applications are running by means of VM live migration, which transfers the state of
a VM from one physical machine to another with minimal downtime [1].
A host may have insufficient resources to meet demand. Forecasting
is therefore performed to find hosts that may become overloaded in the near future. This reduces the number of VM migrations,
because the decision whether a migration needs to take place is made
accordingly. The load is also forecast to find an appropriate destination host
for the placement of a VM. A host that is predicted to become overloaded
is removed from the set of target hosts, and it is not
considered an under-utilized server when VM consolidation is performed. With
the load prediction model, this algorithm reduces the number of virtual machine
migrations and thereby saves energy, providing a green IT solution.
CloudSim is a library that
simulates cloud computing scenarios, with features including
modeling and simulation of cloud infrastructures [2]. We use this simulator
to implement the Artificial Neural Network forecasting technique that predicts
future load demand; based on this load, we decide whether VM migration should be performed.
II. Literature Survey
Cloud computing is a computing
paradigm that provides basic computing services to meet the everyday
needs of the general community. A cloud is a type of parallel and distributed
system consisting of a collection of inter-connected and virtualized computers.
These are dynamically provisioned and presented as one or more unified
computing resources based on Service Level Agreements established between service
providers and consumers. It provides a path by which applications can be
accessed over the Internet. It also allows the user to create, configure, and
customize applications online from anywhere in the world on demand.
Some cloud-based
applications are social networking, web hosting, content delivery, and real-time
instrumented data processing, for example on Amazon EC2, Microsoft Azure, Google App
The various service models in
Cloud Computing include-
A) INFRASTRUCTURE-AS-A-SERVICE (IAAS)
IaaS is a service which provides access to basic resources such as physical machines, virtual
machines, virtual storage, etc.
B) PLATFORM-AS-A-SERVICE (PAAS)
PaaS is a service which provides the runtime
environment for applications, along with development and deployment tools, etc.
C) SOFTWARE-AS-A-SERVICE (SAAS)
SaaS is a service model in which software applications are offered as a service.
The features of cloud computing are-
The technologies behind the
cloud computing platform that make it reliable, flexible, and usable are-
Virtualization allows a single physical instance of an application to be shared among
multiple customers. It does so by assigning a logical name to a physical
resource and providing a pointer to that physical resource on demand.
Service-Oriented Architecture makes applications available
as a service to other applications, regardless of the vendor or product
type. Data can thus be exchanged between applications of
different vendors without additional changes to the services.
Cloud infrastructure consists of –
Hypervisor or the Virtual Machine manager- a low-level
program which allows sharing of one physical instance between several
Management Software- maintaining and arrangement of
Deployment Software- place and combine software in
Network- allows cloud services to be connected
Server- helps compute resource sharing
and offers other services such as resource allocation,
de-allocation, resource monitoring, and security.
Storage- keeps multiple replicas of the stored data, so that if one replica fails
the data can be retrieved from another.
Simulation is the imitation of an actual process or system over time. It makes it possible
to carry out the necessary evaluation before actual software
development, in an environment where the results can be observed. Simulation allows
experiments to be repeated under the same conditions, and therefore allows
different strategies, including scheduling strategies, to be compared.
CloudSim [4] is a new, generalized, and extensible
simulation framework that allows modeling, simulation, and experimentation of
cloud computing infrastructures and services. It provides basic classes
that describe data centers, virtual machines, applications, users, computational
resources, and policies.
A virtual machine (Vm) runs inside a host, sharing the host list with other VMs.
It processes cloudlets according to the policy defined by the cloudlet scheduler.
The classes available in the CloudSim package are-
Datacenter- models the core infrastructure-level
services offered by cloud providers
DatacenterBroker- a broker acting on behalf of a user
Cloudlet- models cloud-based application services
CloudletScheduler- rules for sharing processing power among cloudlets in a VM:
i) space-shared ii) time-shared
Host- models a physical resource such as storage
NetworkTopology- information for inducing network behavior in the simulation
CloudSim- main class, which manages event queues
and controls the sequential execution of all simulation events
VmScheduler- models the policies required to
allocate processor cores to VMs
SimEntity- represents a simulation entity which
can send messages to other entities
VmAllocationPolicy- represents the provisioning
policy used for allocating VMs to hosts.
DatacenterCharacteristics- contains configuration
details of data center resources
RamProvisioner- gives the policy for allocating
primary memory to all the VMs involved.
Large-scale cloud computing data centers can be
modeled and simulated
Virtualized server hosts can be modeled and simulated, with customizable policies
for provisioning host resources to virtual machines
Application containers can be modeled and simulated
Energy-aware computational resources can be modeled and simulated, so that policies
can be experimented with before actual implementation
Data center network topologies and
message-passing applications can be modeled and simulated
Federated clouds can be modeled and simulated, including the deployment of
internal and external cloud services for business roles
Simulation elements can be inserted dynamically,
and a simulation can be stopped and resumed
User-defined policies can be supplied for the allocation
of hosts to virtual machines and for the allocation of host resources
to virtual machines.
Artificial Neural Networks
Artificial neural networks are computational models, inspired by the working of neurons in the
brain, that are capable of learning [6]. They compute values from
inputs by passing them through a complex network. Sets of numerical parameters (weights)
are tuned by a learning algorithm, which allows non-linear functions to be
approximated. The weights represent the strength of the connections between neurons
and are used during both training and prediction.
The key element of this paradigm is the novel structure of the information processing system. It is
composed of a large number of highly interconnected processing elements
(neurons) working together to solve problems. ANNs [7], like people, learn by example. An ANN is configured for a specific
application, such as pattern recognition or data classification, through
a learning process.
The most common type of artificial neural network consists of three groups, or
layers, of units: a layer of “input” units that is
connected to a layer of “hidden” units, which is
in turn connected to a layer of “output” units.
The information to be fed into the network is represented by the input units.
The activity of each hidden unit is determined by the activities of the input units
and the weights on the connections between the input and hidden units.
The behavior of the output units depends on the activity of the hidden units and
the weights between the hidden and output units.
The hidden units can
construct their own representations of the input. The weights between the input
and hidden units determine when each hidden unit is active, so by modifying
these weights a hidden unit can choose what it represents. There are two types
of architecture in ANNs: single-layer and multi-layer. In the single-layer
organization all units are connected to one another; it has
more potential computational power than hierarchically structured multi-layer
organizations. In multi-layer networks, units are often
numbered by layer instead of following a global numbering.
D. Backpropagation Algorithm [7]
Units are connected to one another. A real
number, called the weight of the connection, is associated with each connection.
The weight of the connection from unit ui to unit uj is denoted by Wij.
The pattern of connectivity is represented by the weight matrix W, whose elements are the
weights Wij. There are two types of connection: excitatory and inhibitory. A positive
weight represents an excitatory connection, while a negative weight represents an inhibitory
connection.
The activity of the output layer is determined by the
following two-step procedure.
First, the total
weighted input xj is calculated, using the formula:
$$x_j = \sum_i y_i W_{ij}$$
where yi represents the
activity level of the ith unit in the previous layer and Wij represents the weight of the connection between
the ith and the jth unit.
Next, the activity yj is calculated using some function of the
total weighted input. The sigmoid function is used:
$$y_j = \frac{1}{1 + e^{-x_j}}$$
As soon as the activities of all output
units have been found, the network computes the error E,
which is given by the following expression:
$$E = \frac{1}{2} \sum_j (y_j - d_j)^2$$
where yj represents
the activity level of the jth unit in the top layer and dj is the desired output of the jth unit.
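The two-step forward computation and the error can be illustrated with a minimal Python sketch. It assumes the standard forms of these quantities (a weighted sum, the logistic sigmoid, and half the sum of squared differences); the function names and the sample numbers are illustrative only and do not come from the experiments in this paper.

```python
import math


def weighted_input(activities, weights_to_j):
    """Total weighted input x_j = sum_i y_i * W_ij for one unit j."""
    return sum(y * w for y, w in zip(activities, weights_to_j))


def sigmoid(x):
    """Logistic sigmoid, mapping any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def squared_error(outputs, targets):
    """E = 1/2 * sum_j (y_j - d_j)^2 over all output units."""
    return 0.5 * sum((y - d) ** 2 for y, d in zip(outputs, targets))


# Hypothetical values: two units in the previous layer feeding one output unit.
previous_layer = [0.5, 0.8]
weights = [0.4, -0.2]          # one excitatory and one inhibitory connection
x_j = weighted_input(previous_layer, weights)
y_j = sigmoid(x_j)
E = squared_error([y_j], [0.6])
```

A positive weight pushes the total input up and a negative weight pushes it down, which is the excitatory/inhibitory distinction made above.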
The back-propagation algorithm has four steps:
1. Compute how fast the error changes as the
activity of an output unit is changed. This error derivative (EA) is the difference
between the actual and the desired activity.
2. Compute how fast the error changes as the
total input received by an output unit is changed. This quantity (EI)
is the answer from step 1 multiplied by the rate at which the output of a
unit changes as its total input is changed.
3. Compute how fast the error changes as a
weight on the connection into an output unit is changed. This quantity (EW)
is the answer from step 2 multiplied by the activity level of the unit from
which the connection emanates.
4. Compute how fast the error changes as the
activity of a unit in the previous layer is changed. This step allows back propagation to be
applied to multilayer networks. When the activity of a particular unit in the
previous layer is changed, the activities of all the output units to which it is
connected are affected. So in order to find the overall effect on the error, all
these separate effects on the output units are added together. Each effect is simple to
calculate: it is the answer from step 2 multiplied by the weight on the
connection to that output unit.
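Written out as formulas, the four quantities take the following form. This is a sketch that assumes the sigmoid activation and the half-sum-of-squares error, with y_i the activity of a unit in the previous layer, y_j and d_j the actual and desired activity of output unit j, and W_ij the weight from unit i to unit j:

$$
\begin{aligned}
EA_j &= \frac{\partial E}{\partial y_j} = y_j - d_j \\
EI_j &= \frac{\partial E}{\partial x_j} = EA_j \, y_j (1 - y_j) \\
EW_{ij} &= \frac{\partial E}{\partial W_{ij}} = EI_j \, y_i \\
EA_i &= \frac{\partial E}{\partial y_i} = \sum_j EI_j \, W_{ij}
\end{aligned}
$$

The factor y_j (1 - y_j) in the second line is the derivative of the sigmoid with respect to its input.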
By using steps 2 and 4, the EAs of
one layer of units can be converted into EAs for the previous layer. This
procedure can be repeated to get the EAs for as many previous layers as
desired. Once the EA of a unit is found, steps 2 and 3 can be used to compute
the EWs on its incoming connections.
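How the steps chain together across layers can be shown with a minimal Python sketch for a network with one hidden layer. It assumes sigmoid units and the half-sum-of-squares error; the function names and the nested-list weight layout are choices made for this illustration, not the implementation used in the experiments.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_ih, w_ho):
    """w_ih[i][h] is the weight input i -> hidden h; w_ho[h][o] is hidden h -> output o."""
    hidden = [sigmoid(sum(inputs[i] * w_ih[i][h] for i in range(len(inputs))))
              for h in range(len(w_ih[0]))]
    outputs = [sigmoid(sum(hidden[h] * w_ho[h][o] for h in range(len(hidden))))
               for o in range(len(w_ho[0]))]
    return hidden, outputs


def error(outputs, targets):
    return 0.5 * sum((y - d) ** 2 for y, d in zip(outputs, targets))


def backprop(inputs, targets, w_ih, w_ho):
    """Return the error derivatives (EWs) for both weight layers."""
    hidden, outputs = forward(inputs, w_ih, w_ho)
    # Step 1: EA of each output unit is actual minus desired activity.
    ea_out = [y - d for y, d in zip(outputs, targets)]
    # Step 2: EI = EA times the sigmoid slope y * (1 - y).
    ei_out = [ea * y * (1.0 - y) for ea, y in zip(ea_out, outputs)]
    # Step 3: EW = EI times the activity of the unit the connection comes from.
    ew_ho = [[ei * h for ei in ei_out] for h in hidden]
    # Step 4: EA of a hidden unit sums the effects on all output units it feeds.
    ea_hid = [sum(ei_out[o] * w_ho[h][o] for o in range(len(ei_out)))
              for h in range(len(hidden))]
    # Steps 2 and 3 repeated one layer down.
    ei_hid = [ea * h * (1.0 - h) for ea, h in zip(ea_hid, hidden)]
    ew_ih = [[ei * x for ei in ei_hid] for x in inputs]
    return ew_ih, ew_ho
```

Each weight would then be moved a small step against its EW; the step size (learning rate) is a separate parameter.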
III. Forecasting Using ANN and VM Migration
The Artificial Neural Network (ANN) has proved to be a notable
method for forecasting irregular series and for multiple-period-ahead forecasting. Live migration of virtual machines helps to distribute
load evenly across the physical machines. It also reduces
server sprawl through VM consolidation [1].
VM migration is an expensive
operation in terms of the resources it consumes on the source and destination
hosts. It also uses network bandwidth to transfer the memory image of the VM. VM
migration should therefore be used sparingly; otherwise the unnecessary movement
of virtual machines may degrade performance.
To reduce the number of VM migrations, we
propose an algorithm that decides whether migration of a virtual machine is
necessary, depending on the present load and on the future load predicted
using the Artificial Neural Network forecasting technique. Future values are
predicted from previously observed values. We apply the feed-forward
and back-propagation algorithms to past data to make forecasts.
Whether a migration takes place and, if it does, where the VM is migrated to
both depend on the
current as well as the predicted CPU utilization.
The input nodes take the present CPU utilization, which is
used to predict the future CPU utilization. The error between the
actual and predicted CPU utilization is then computed, and back propagation is
applied accordingly.
The migration decision is taken by checking whether the
predicted value is greater than the current upper threshold.
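The decision rule can be sketched in a few lines of Python. The text states that the predicted value is compared with the upper threshold and that the decision depends on both current and predicted utilization; combining the two checks with a logical AND is an assumption of this sketch, and the function name and threshold values are hypothetical.

```python
def should_migrate(current_util, predicted_util, upper_threshold):
    """Decide whether VMs should be migrated away from a host.

    Assumption of this sketch: migration is triggered only when the host is
    above the upper threshold now AND the forecast says it will stay above it,
    so that short-lived load spikes do not cause migrations.
    """
    return current_util > upper_threshold and predicted_util > upper_threshold


# Hypothetical utilizations, expressed as fractions of host capacity.
decisions = [
    should_migrate(0.90, 0.95, 0.80),  # overloaded now and in the forecast
    should_migrate(0.90, 0.60, 0.80),  # spike that is predicted to pass
    should_migrate(0.50, 0.90, 0.80),  # not overloaded at present
]
```

Under this rule only the first host triggers a migration, which is how forecasting suppresses migrations that a purely threshold-based policy would perform.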
VMs are selected from the overUtilizedHosts list using the Minimum
Utilization VM selection policy (class). The last step is VM placement,
which is essentially a bin packing problem.
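The Minimum Utilization selection step can be illustrated with a short Python sketch. It assumes that VMs are removed from an overloaded host one at a time, lowest CPU usage first, until the host falls back under the upper threshold; the function name, the dictionary layout, and the numbers are hypothetical and are not taken from CloudSim.

```python
def select_vms_min_utilization(vm_mips_used, host_capacity_mips, upper_threshold):
    """Pick VMs to migrate away from one overloaded host.

    vm_mips_used maps a VM id to the MIPS it currently uses. The VM with the
    lowest usage is selected repeatedly until the host utilization no longer
    exceeds the upper threshold.
    """
    remaining = dict(vm_mips_used)
    selected = []
    while remaining and sum(remaining.values()) / host_capacity_mips > upper_threshold:
        vm_id = min(remaining, key=remaining.get)
        selected.append(vm_id)
        del remaining[vm_id]
    return selected


# Hypothetical host with 2 cores x 1860 MIPS and three VMs.
to_migrate = select_vms_min_utilization(
    {"vm-a": 500, "vm-b": 1500, "vm-c": 1000}, 3720, 0.7)
```

Here the host starts at about 81% utilization, and moving only the smallest VM brings it to about 67%, below the 70% threshold.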
The list of available destination hosts is
modified according to their predicted future load. The rest of the code is unchanged. All the
VMs are sorted in decreasing order of their current CPU utilization.
Then each VM is allocated to the host that gives the least increase in power
consumption after the allocation.
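The placement step can be sketched in Python as a power-aware best-fit-decreasing heuristic with a forecast filter. Several things here are assumptions made for illustration: the linear power model, the dictionary layout of hosts, and every number used in the example (the wattages are placeholders, not the values from Table I).

```python
def linear_power(util, idle_watts, max_watts):
    """Assumed linear power model: idle power plus a share proportional to load."""
    return idle_watts + (max_watts - idle_watts) * util


def place_vms(vm_demand_mips, hosts, upper_threshold):
    """Map each VM to the host whose power draw increases least.

    Hosts whose predicted utilization exceeds the threshold are excluded from
    the destination list. The 'used' field of the chosen host is updated in
    place. VMs that fit on no host are left out of the returned mapping.
    """
    candidates = [h for h in hosts if h["predicted_util"] <= upper_threshold]
    placement = {}
    # Largest demand first, as in best-fit-decreasing bin packing.
    for vm_id, demand in sorted(vm_demand_mips.items(),
                                key=lambda kv: kv[1], reverse=True):
        best, best_increase = None, None
        for host in candidates:
            new_used = host["used"] + demand
            if new_used / host["capacity"] > upper_threshold:
                continue  # placing the VM here would overload the host
            before = linear_power(host["used"] / host["capacity"],
                                  host["idle_w"], host["max_w"])
            after = linear_power(new_used / host["capacity"],
                                 host["idle_w"], host["max_w"])
            if best is None or after - before < best_increase:
                best, best_increase = host, after - before
        if best is not None:
            best["used"] += demand
            placement[vm_id] = best["id"]
    return placement
```

Filtering on the forecast before the search means a host that looks attractive now, but is expected to become overloaded, never receives a VM that would soon have to be moved again.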
IV. Performance Analysis
A) Experimental Setup
Repeatable large-scale experiments are very difficult to conduct
on a real infrastructure, yet they are required to
evaluate and compare the proposed algorithms. To ensure
that the experiments can be repeated, simulation was chosen
to evaluate the performance of the proposed
method. The simulation platform is the CloudSim toolkit, which allows the modeling
of virtualized environments and supports on-demand resource provisioning and
management. It has been extended to enable energy-aware simulations,
which the core framework does not provide. Along with energy consumption
modeling and accounting, the ability to simulate service applications
with dynamic workloads has been incorporated. The implemented extensions have been
included in version 3.0.3 of the CloudSim toolkit [6].
Each network was trained for 500 epochs
with the learning rate set to 0.5.
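A training run with these two settings can be sketched in Python as follows. Only the 500 epochs and the learning rate of 0.5 come from the text. The window size, the number of hidden units, the constant bias input, the weight initialization, and the synthetic sine-shaped utilization series are all assumptions of this sketch; the series stands in for real workload traces, which are not reproduced here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(window, w_ih, w_ho):
    """Forward pass: past utilizations (plus a constant bias input) -> forecast."""
    x = list(window) + [1.0]  # bias input is an assumption of this sketch
    hidden = [sigmoid(sum(xi * w_ih[i][h] for i, xi in enumerate(x)))
              for h in range(len(w_ho))]
    out = sigmoid(sum(hv * w_ho[h] for h, hv in enumerate(hidden)))
    return x, hidden, out


def make_windows(series, size):
    """Turn a utilization series into (previous `size` values, next value) pairs."""
    return [(series[t - size:t], series[t]) for t in range(size, len(series))]


def mean_error(samples, w_ih, w_ho):
    return sum(0.5 * (predict(w, w_ih, w_ho)[2] - d) ** 2
               for w, d in samples) / len(samples)


def train(samples, n_hidden=4, epochs=500, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0]) + 1
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    for _ in range(epochs):
        for window, target in samples:
            x, hidden, out = predict(window, w_ih, w_ho)
            ei_out = (out - target) * out * (1.0 - out)
            for h, hv in enumerate(hidden):
                # Hidden-unit EI uses the output weight before it is updated.
                ei_hid = ei_out * w_ho[h] * hv * (1.0 - hv)
                w_ho[h] -= rate * ei_out * hv
                for i, xi in enumerate(x):
                    w_ih[i][h] -= rate * ei_hid * xi
    return w_ih, w_ho


# Synthetic utilization series in [0.2, 0.8]; not real workload data.
series = [0.5 + 0.3 * math.sin(2 * math.pi * t / 12) for t in range(36)]
samples = make_windows(series, 3)
w_ih, w_ho = train(samples)
forecast = predict(series[-3:], w_ih, w_ho)[2]
```

Because the output unit is a sigmoid, utilizations are kept as fractions between 0 and 1 so that targets and forecasts share the same range.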
a ) Host
Two server configurations with
dual-core CPUs, published in February 2011, are used: HP ProLiant ML110 G4 (Intel Xeon
3040, 2 cores × 1860 MHz, 4 GB) and HP ProLiant ML110 G5 (Intel Xeon 3075, 2
cores × 2660 MHz, 4 GB). The configuration and power consumption
characteristics of the selected servers are shown in Table I. Servers with more cores are not used because it is important to simulate a large
number of servers to evaluate the effect of VM consolidation; simulating
less powerful CPUs is therefore advantageous. Nevertheless, dual-core CPUs are sufficient
to evaluate resource management algorithms designed for multi-core CPUs.
b) Virtual Machines
The frequency of the servers’
CPUs is mapped onto MIPS ratings: 1860 MIPS for each core of the HP ProLiant ML110
G4 server and 2660 MIPS for each core of the HP ProLiant ML110 G5 server. Each
server is modeled to have 1 GB/s network bandwidth. The characteristics of the
VM types correspond to Amazon EC2 instance types, with the only exception that
all the VMs are single-core, because the workload
data used for the simulations come from single-core VMs. For the
same reason, the amount of RAM is divided by the number of cores for each VM
type: High-CPU Medium Instance (2500 MIPS, 0.85 GB); Extra Large Instance (2000
MIPS, 3.75 GB); Small Instance (1000 MIPS, 1.7 GB); and Micro Instance (500
MIPS, 613 MB). Initially the VMs are allocated according to the resource
requirements defined by the VM types. However, during their lifetime VMs use
fewer resources, according to the workload data, creating opportunities for
consolidation.
c) Workload [2]
For this experiment, the data are provided
as part of the CoMon project, a monitoring infrastructure for PlanetLab. The
data comprise the CPU utilization of more than a thousand VMs on servers located at
more than 500 places around the world. Utilization is measured
at 5-minute intervals. Ten days were chosen at random from
the workload traces collected during March and April 2011 [1].
B) SIMULATION RESULT AND ANALYSIS
[Table: number of VM migrations, energy consumption (kWh), and average SLA violation (%) for the compared algorithms]
Compared with the Median Absolute
Deviation Minimum Utilization algorithm, the ANN algorithm performs fewer
VM migrations and also reduces energy consumption and
the average SLA violation.
This experiment was carried out for
a single data center. If the approach is used for distributed data centers,
network overheads will affect the performance of the simulator.
V. Conclusion and Future Work
In this paper, the Artificial Neural Network
computational model is used as the basis for forecasting
future load, which in turn helps to decide whether a virtual machine migration
should take place. VM migration is known to interrupt
service and to consume energy for a certain
time; to keep this to a minimum, server utilization has to be about 60-70%.
This experiment was performed in CloudSim, and the approach can be applied in
an actual cloud environment in the future to achieve better application performance.