Real-time Virtual Machine Energy-Efficient Allocation in Cloud
Data Centers Using Interval-packing Methods
Sebagenzi Jason
Department of Information Technology
AUCA University, Kigali 2461, Rwanda
ABSTRACT
The reduction of power consumption, which can lower Cloud providers' operating costs, lengthen the useful life of machines, and lessen the environmental impact of energy use, is one of the critical concerns for large-scale Cloud applications. To satisfy the needs of various clients, Virtual Machines (VMs) can be dynamically allocated as resources (Infrastructure as a Service, IaaS) in cloud data centers. In this research, we study the energy-efficient scheduling of real-time VMs with fixed processing intervals, with the provider's goal of lowering power consumption. Finding optimal solutions is an NP-complete problem when virtual machines (VMs) share arbitrary amounts of a physical machine's (PM's) total capacity, as has been demonstrated in the literature. Our strategy treats the issue as a modified interval-partitioning problem and considers configurations with divisible capacities, which simplifies the problem formulation and helps save energy. Both exact and approximate solutions are presented. According to simulation results, the proposed algorithms consume 8–30% less power than existing algorithms.
Keywords: Cloud computing; Cloud data centers; resource scheduling; fixed processing intervals; modified interval partitioning.
INTRODUCTION
Based on a number of recent developments in virtualization, Grid computing, Web computing,
utility computing, and related technologies, cloud computing is developing. Through the
Internet or an internal network, cloud computing offers platforms and applications on demand
[1]. Google App Engine [2], IBM Blue Cloud [3], Amazon EC2 [4], and Microsoft Azure [5] are a
few examples of new cloud computing platforms. Software can be shared, allocated, and
aggregated via cloud computing, together with computational and storage network resources.
The concealment and abstraction of complexity, the efficient utilization of remote resources,
and virtualized resources are a few of the major advantages of cloud computing. Since there are
still many difficult problems to be solved, cloud computing is still regarded as being in its
infancy [1,6-8]. Youseff et al.'s detailed ontology of cloud computing, established in their paper [9], identifies five main layers: cloud applications (SaaS), the cloud software environment (PaaS), the cloud software infrastructure (IaaS), the software kernel, and hardware (HaaS). It is used to illustrate how these layers are related to one another and how they depend on earlier technologies. In this study we concentrate on Infrastructure as a Service (IaaS) in cloud data centers. Much research has been done on cloud data center-related topics. The main problems and solutions in cloud
computing are outlined by Armbrust et al. [1]. Grid computing and Cloud computing are
contrasted by Foster et al. [8]. By taking into account dynamic traffic models, Tian [10]
introduces multi-dimensional algorithms for cloud data centers. One of the essential services
in cloud computing is IaaS. Creating an on-demand resource management system for IaaS in
cloud environments is crucial.
Regarding cloud architecture, Liu et al.'s [11] GreenCloud architecture aims to cut down on data center power usage while maintaining user-perceived performance. Using the suggestions made in its open-source incubator for Cloud standards, DMTF [12] concentrates on standardizing interactions between Cloud environments. The Eucalyptus open-source Cloud-computing system is introduced by Nurmi et al. [13]. A dynamic and integrated load-balancing
approach for resource scheduling in Cloud data centers is proposed by Tian et al. [14].
Beloglazov et al. [6] provide a taxonomy and assessment of energy-efficient data centers and
Cloud computing, and Garg et al. [15] introduce a Green Cloud framework for enhancing the
carbon efficiency of Clouds. A state-of-the-art research study on Green Cloud computing is
provided by Jing et al. [16], who also identify three hot research areas. The interrelationships
between energy consumption, resource usage, and performance of aggregated workloads are
studied by Srikantaiah et al. [17]. By consolidating active tasks, Lee et al. [18] offer two online heuristic algorithms for resource-efficient utilization in Cloud computing systems. In addition to
reducing the overall number of migrations, Beloglazov et al. [19] investigate offline allocation
of virtual machines (VMs) using modified best-fit bin-packing techniques. In a Xen virtualized system, Liu et al. [20] investigate performance and energy modeling for live VM migration and evaluate their models using five sample workloads. In order to automatically distribute
resources and decrease the energy usage of web-service applications, Guazzone et al. [21]
develop a two-level control approach. In order to provision VMs in Cloud data centers, Kim et
al. [22] model a real-time service as a real-time VM request and employ dynamic voltage
frequency scaling techniques. Real-time VM scheduling taking fixed processing intervals into
account is still not well researched, unlike dynamic voltage frequency scaling approaches.
The scheduling of resources is crucial in cloud data centers. The allocation and migration of VMs under full-life-cycle constraints, which is often overlooked, is one of the challenging scheduling problems in cloud data centers [23].
In order to address the aforementioned concerns, this study provides a framework for real-time VM scheduling in IaaS that takes fixed processing intervals into account. The key goals of this paper are as follows:
• Giving a unified picture to make managing many heterogeneous physical machines (PMs) and
virtual machines (VMs) with diverse combinations easier. Through this single access point,
administrators and users will then be able to more easily manage and monitor their expanding collections of VMs.
• Taking into account the allocation of VMs with their fixed processing intervals (full life cycles), which most research studies ignore. The problem becomes more challenging when capacity and real-time restrictions are both taken into account.
• Using traditional interval scheduling and bin-packing approaches to create scheduling plans
for offline and online environments. Our models take into account various intervals sharing a
PM's total capacity at the same time, as long as the capacity restriction is satisfied, which sets them
apart from conventional interval scheduling problems (ISPs). Our models take into account the
VM life cycle, which sets them apart from conventional bin-packing problems (BPPs).
• Theoretical proofs are used to create optimal and approximate energy-saving methods for the offline environment, where the scheduler is aware of every request in advance.
• For the online context, where the scheduler only knows the current requests, approximate
techniques with delay and minimum migration are proposed in light of the observation that the
total number of PMs and their time at power-on influence total power usage.
• Some existing results that consider the energy impact under various configurations and energy models are based on a single, basic set of VM and PM configurations. Through simulations, we demonstrate how alternative configurations and models can result in different energy consumption. The proposed divisible capacity configuration of VMs and PMs is crucial for simplifying the scheduling problem and reducing energy use.
This paper's remaining material is organized as follows: The suggested Green Cloud
architecture and key elements for resource scheduling in Cloud data centers are presented in
Section 6.2. The problem is stated in Section 6.3, which also introduces the suggested
scheduling techniques. The performance evaluation of several algorithms in various setups is
shown in Section 6.4. Related work on energy-efficient scheduling is presented in Section 6.5, and Section 6.6 concludes the paper.
GREENCLOUD ARCHITECTURE
Figure 6.1 suggests the layered design for GreenCloud. At the top layer, there is a web portal
where users may choose resources and submit requests. In essence, it is a unified
representation of the few preconfigured VM types available to users. After user requests are
submitted, they move on to the CloudSched level, which is in charge of selecting the best data
centers and PMs depending on user requirements. This layer has the capacity to control a
sizable number of Cloud data centers made up of thousands of PMs. This layer allows for the
use of various scheduling algorithms based on client attributes in various data centers. The
lowest tier of the cloud contains resources like PMs and VMs, each of which has a specific
allocation of CPU, memory, storage, and bandwidth. At the Cloud resource layer, VM management is primarily in charge of keeping track of all VMs in the system, including their status, required capacities, hosts, arrival times, and departure times.
FIGURE 6.1 Proposed GreenCloud architecture.
Cloud data center resources
Only computational resources are taken into account in this work. The hosts (PMs) in a data
center are in charge of controlling virtual machines throughout their life cycles. In a cloud, a
host is a component that stands in for a real computing node. It receives preconfigured RAM,
storage, a scheduling policy for allocating VMs, and processor power (expressed, for example,
in GHz or Million Instructions Per Second). A cluster or data center can also be created by
connecting a number of hosts.
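As a minimal illustration (not part of the original system), a host of this kind could be represented as follows; the class and field names are our own, and the capacity values are taken from the PM types in Table 6.2.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Host:
    """A physical machine (PM) in a cloud data center.

    Capacities follow the description in the text: processing power in
    compute units (or MIPS), preconfigured RAM and storage, and a list of
    hosted VMs. All names are illustrative assumptions.
    """
    host_id: int
    cpu_units: float      # total compute units (or MIPS)
    memory_gb: float      # preconfigured RAM in GB
    storage_gb: float     # preconfigured storage in GB
    vm_ids: List[int] = field(default_factory=list)  # VMs currently hosted

# A cluster or data center is simply a collection of connected hosts.
datacenter = [
    Host(host_id=1, cpu_units=16, memory_gb=30.0, storage_gb=3380),   # PM Type 1
    Host(host_id=2, cpu_units=52, memory_gb=136.8, storage_gb=3380),  # PM Type 2
]
```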
A uniform view for different types of VMs
There may be an excessive number of various kinds of PMs, VMs, and their mixtures. If a
standard perspective is not offered, cloud service providers will encounter a vast number of
issues. We demonstrate that it is feasible to have a unified view of various VM types using the frequently cited example of Amazon EC2. The eight VM types listed in the Amazon EC2 web documentation are shown in Table 6.1. Information on Amazon EC2's hardware setup is not public, but on the basis of compute units we can still construct three corresponding types of PMs. For instance, in a real Cloud data center, a PM with 2 × 68.4 GB of RAM can be configured to provide 2 × 1690 GB of storage and 16 cores of 3.25 compute units each (PM Type 2 in Table 6.2). A consistent view of several VM types can be created in this or a related manner; a small illustration follows Table 6.3 below. For heterogeneous virtualization technologies (such as Xen, KVM, and VMware), this type of classification offers a standard picture of virtualized resources and has significant benefits for managing and allocating VMs.
Table 6.1: 8 types of VMs in Amazon EC2

VM type   API name     Compute units                Memory (GB)   Storage (GB)
1-1 (1)   m1.small     1 (1 core × 1 unit)          1.875         211.25
1-2 (2)   m1.large     4 (2 cores × 2 units)        7.5           845
1-3 (3)   m1.xlarge    8 (4 cores × 2 units)        15.0          1690
2-1 (4)   m2.xlarge    6.5 (2 cores × 3.25 units)   17.1          420
2-2 (5)   m2.2xlarge   13 (4 cores × 3.25 units)    34.2          845
2-3 (6)   m2.4xlarge   26 (8 cores × 3.25 units)    68.4          1690
3-1 (7)   c1.medium    5 (2 cores × 2.5 units)      1.7           422.5
3-2 (8)   c1.xlarge    20 (8 cores × 2.5 units)     7.0           1690
Table 6.2: Divisible configuration of three types of PMs (an example)

PM type   CPU (compute units)   Memory (GB)   Storage (GB)
Type1     16                    30.0          3380
Type2     52                    136.8         3380
Type3     40                    14.0          3380
Table 6.3: Random configuration of three types of PMs (an example)

PM type   CPU (compute units)   Memory (GB)   Storage (GB)
Type1     13                    37.0          3000
Type2     53                    137.8         4000
Type3     47                    17.0          2000
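As the small illustration referred to above, the following sketch shows why the divisible configuration of Tables 6.1 and 6.2 simplifies scheduling: for each VM type, CPU, memory, and storage are the same fraction of the corresponding PM type, so a VM collapses to a single normalized capacity share. The dictionary layout and helper name are illustrative assumptions, not part of the paper's system.

```python
# VM types from Table 6.1 and a PM type from Table 6.2:
# (compute units, memory GB, storage GB)
VM_TYPES = {
    "1-1": (1.0, 1.875, 211.25),
    "1-2": (4.0, 7.5, 845.0),
    "1-3": (8.0, 15.0, 1690.0),
}
PM_TYPES = {
    "Type1": (16.0, 30.0, 3380.0),
}

def normalized_share(vm: str, pm: str) -> float:
    """Return the single fraction of PM capacity that a VM occupies.

    In the divisible configuration, the CPU, memory, and storage of a VM
    are the same fraction of the corresponding PM resources, so the three
    per-resource ratios collapse to one number (1/16, 1/4, and 1/2 for
    VM types 1-1, 1-2, and 1-3 on a Type1 PM).
    """
    ratios = [v / p for v, p in zip(VM_TYPES[vm], PM_TYPES[pm])]
    # In the divisible case all ratios are (numerically) equal.
    assert max(ratios) - min(ratios) < 1e-9
    return ratios[0]

print(normalized_share("1-1", "Type1"))  # 0.0625 == 1/16
print(normalized_share("1-3", "Type1"))  # 0.5    == 1/2
```

With the random configuration of Table 6.3, the three ratios would differ, and the problem remains multidimensional.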
Real-time VM request model
Because it makes use of virtualization, the cloud computing environment is an appropriate
solution for real-time VM service [22]. Appropriate VMs are allotted when customers ask for
the execution of their real-time virtual machines in a cloud data center.
Example 1
A real-time VM request can be denoted by the interval vector vmRequestID = (VM typeID, start time, finish time, requested capacity). In Figure 6.2, vm1(1, 0, 6, 0.25) indicates that the requested VM is of Type 1 (encoded as the integer 1), starts at slot 0, finishes at the sixth slot, and requires 25% of the total capacity of a Type1 PM. Similar representations can be used for other requests. Figure 6.2 depicts the life cycles of
VM allocation utilizing two PMs and a slotted-time window arrangement, with PM#1 hosting
VMs 1, 2, and 3 and PM#2 hosting VMs 4, 5, and 6. Keep in mind that each interval must adhere
to the total capacity restriction.
FIGURE 6.2 VM allocations using two PMs
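As a concrete rendering of Example 1, the sketch below (our own illustration; the names are not from the paper) encodes requests as (typeID, start slot, finish slot, capacity) tuples and checks the total-capacity restriction that every slot of a PM must respect.

```python
from dataclasses import dataclass

@dataclass
class VMRequest:
    """A real-time VM request: (VM typeID, start slot, finish slot, capacity)."""
    type_id: int
    start: int       # start time in slots
    finish: int      # finish time in slots
    capacity: float  # fraction of the hosting PM's total capacity (0..1]

def fits_on_pm(requests, new_req, pm_capacity=1.0):
    """Check the total-capacity restriction illustrated in Figure 6.2.

    The new request fits on the PM if, in every slot it occupies, the sum
    of capacities of overlapping requests stays within the PM capacity.
    """
    for t in range(new_req.start, new_req.finish):
        load = sum(r.capacity for r in requests if r.start <= t < r.finish)
        if load + new_req.capacity > pm_capacity + 1e-9:
            return False
    return True

# vm1(1, 0, 6, 0.25) from Example 1, plus two further hypothetical requests.
pm1 = [VMRequest(1, 0, 6, 0.25)]
print(fits_on_pm(pm1, VMRequest(1, 2, 8, 0.5)))  # True: 0.25 + 0.5 <= 1
print(fits_on_pm(pm1, VMRequest(1, 1, 5, 0.8)))  # False: 0.25 + 0.8 > 1
```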
Energy-efficient real-time scheduling
Problem description
We model the real-time VM scheduling problem as a modified interval scheduling problem (ISP). Additional explanation and analysis of fixed ISPs can be found in Ref. [24] and its references. We outline a solution to the modified interval scheduling problem and compare the outcomes to well-known existing solutions. The input is a collection of requests {1, 2, ..., n}, where the ith request is associated with a capacity requirement ci and represents a time period beginning at si and ending at fi. Based on the following assumptions, the objective of energy-efficient scheduling is to satisfy all requests with the smallest possible number of running PMs and overall running time:
1. Except as otherwise noted, all data are deterministic, and time is formatted in slotted windows. As illustrated in Figure 6.3, we divide the complete time interval [0, T] into slots of identical length (s0), so the total number of slots is T/s0 (an integer). The start time si and finish time fi of a request are both integers (in slot units), so a request's interval can be represented in slot format as (start time, finish time). For instance, if s0 = 5 min, an interval with its start time at the third slot and its finish time at the tenth slot takes (10 − 3) × 5 = 35 minutes of actual time to complete.
2. Each request is independent; other than what the start and finish times imply, there are no precedence or priority constraints.
3. Each request's required capacity is a positive real value between 0 and 1; a single PM's capacity is normalized to 1. The capacity can become multidimensional when various resources, such as CPU, memory, and storage, are considered.
4. Assume that each VM request is given to a single PM when it is processed. Therefore,
unless otherwise specified, stopping a request and continuing it on another machine is
not permitted (such as when using migration).
5. Every PM is constantly accessible (each machine is continuously available in the interval [0, ∞)).
6. Assume that each VM request uses its entire requested capacity when it is assigned (the worst-case scenario). When a VM request VMi = (CPU, memory,
Among CPU, memory, storage, and network devices, the CPU accounts for the majority of energy usage, which is the focus of this paper. In this work we employ the power model specified in Eq. (6.2); Eq. (6.3) is obtained by further simplifying Eq. (6.2).
Here Pmin represents the power used by the given PM when its CPU utilization is zero (i.e., when the PM is idle with no VM running). Because workloads vary in a real setting, CPU utilization may fluctuate over time; it is therefore a function of time, denoted u(t). As a result, the total energy consumed by a PM, Ei, can be described by Eq. (6.4) as the integral of the power consumption function over the interval [t0, t1]:

Ei = ∫[t0, t1] P(u(t)) dt    (6.4)

If u(t) is constant over time, for example when the average utilization u is adopted so that u(t) = u, this reduces to Ei = P(u) · (t1 − t0).
1. A cloud data center's overall energy consumption is calculated using Eq. (6.5) as the total energy used by all PMs:

E = Σi Ei    (6.5)

Note that this includes the energy consumed by all VMs running on the PMs.
2. The total power-on time of all PMs during the test period is given by Σi (PMi power-on time), where the power-on time of PMi is the total time the ith PM has been powered on.
For comparison, we'll suppose that each virtual machine uses all of the requested CPU
resources.
If a PM's current CPU utilization is u and it increases to u′ after a VM is created, the energy increase caused by the VM, denoted Evm, can be calculated using Eq. (6.7), where u′ − u is the rise in CPU utilization that results from allocating the VM.
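To make the energy accounting above concrete, the sketch below assumes the commonly used linear power model P(u) = Pmin + (Pmax − Pmin) · u, which is consistent with the definition of Pmin given earlier; the wattage values, slot length, and function names are illustrative assumptions rather than the paper's exact Eqs. (6.2)–(6.7).

```python
def power(u, p_min=162.0, p_max=215.0):
    """Assumed linear power model: idle power p_min plus a term
    proportional to CPU utilization u in [0, 1] (wattages are illustrative)."""
    return p_min + (p_max - p_min) * u

def pm_energy(utilization_per_slot, slot_seconds=300):
    """Energy of one PM over slotted time: a discrete analogue of
    integrating P(u(t)) over the power-on period (cf. Eq. (6.4)).
    A 5-minute slot (300 s) matches the slotted-time example above."""
    return sum(power(u) * slot_seconds for u in utilization_per_slot)

def vm_energy_increase(u_before, u_after, duration_seconds):
    """Energy increase Evm caused by a VM that raises utilization from
    u_before to u_after for its lifetime (cf. the discussion of Eq. (6.7))."""
    return (power(u_after) - power(u_before)) * duration_seconds

# Example: a PM idle for 2 slots, then at 50% utilization for 4 slots.
print(pm_energy([0.0, 0.0, 0.5, 0.5, 0.5, 0.5]))
# Energy added by a VM that raises utilization by 0.25 for 35 minutes.
print(vm_energy_increase(0.5, 0.75, 35 * 60))
```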
Formally, our task of scheduling VMs in real time to minimize the total energy consumption (min Σi Ei) becomes a multidimensional combinatorial optimization problem with constraint satisfaction (meeting capacity and time restrictions), making it an NP-complete problem [26,27].
Theorem 1
In the heterogeneous case, the decision version of the real-time VM scheduling problem is NP-complete.
Remark
Different kinds of PMs and VMs are taken into consideration in the case of a heterogeneous
arrangement. The decision variant of this problem, which asks whether there is a workable
schedule, is demonstrated to be NP-complete in Reference [23].
To reduce complexity, we handle the heterogeneous case by mapping the various VM types to corresponding PM types (i.e., we decompose the heterogeneous case into homogeneous cases, as given in Tables 6.1–6.3). The discussion and findings in the following are therefore all based on a homogeneous instance.
Capacity configuration of VMs and PMs
Since different VM and PM capacity configurations affect both the complexity of the problem and the overall energy consumption, we examine two distinct capacity configurations.
Random capacity configuration of VMs and PMs
The capabilities (CPU, memory, storage, and other parameters) of the VMs and PMs are set at
random in this scenario, which is referred to as the random capacity case. This problem can be
changed into a dynamic multidimensional BPP or interval-packing problem, which is known to
be an NP-complete problem, if we need to take into account CPU, memory, storage, life cycles,
and other aspects of VMs and PMs. See for example Refs. [5,26,27].
Lemma 1
According to the literature, there is no polynomial-time optimal algorithm for minimizing the total number of power-on PMs in offline scheduling in the random capacity case.
Remark
Here, by taking CPU, memory, storage, life cycle, and other elements into account, our real-time
VM scheduling problem may be converted into a conventional multidimensional (interval) BPP.
It is simple to demonstrate that the real-time VM scheduling problem is NP-complete by reducing a well-known NP-complete problem, the multidimensional BPP, to our problem [5,26,27]. Although no exact solution can be found in polynomial time, there are approximation algorithms that can reduce the total number of power-on PMs.
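As one example of such an approximation, the sketch below applies a first-fit style heuristic under the slotted-time model: each request is placed on the first power-on PM whose remaining capacity suffices throughout the request's interval, and a new PM is opened only when necessary. This is our illustration of the idea, not the paper's exact algorithm.

```python
def first_fit_interval_packing(requests):
    """Greedy first-fit heuristic for real-time VM requests.

    Each request is a (start_slot, finish_slot, capacity) tuple with
    capacity normalized to the PM capacity of 1. The request is placed on
    the first already-open PM whose remaining capacity suffices in every
    slot of the request's interval; otherwise a new PM is opened. This is
    an illustrative approximation, not the paper's exact algorithm.
    """
    pms = []  # each PM is a list of the requests assigned to it

    def fits(pm, req):
        start, finish, cap = req
        for t in range(start, finish):
            load = sum(c for (s, f, c) in pm if s <= t < f)
            if load + cap > 1.0 + 1e-9:
                return False
        return True

    # Sorting by start time mimics requests arriving in temporal order.
    for req in sorted(requests, key=lambda r: r[0]):
        for pm in pms:
            if fits(pm, req):
                pm.append(req)
                break
        else:
            pms.append([req])  # open (power on) a new PM
    return pms

# Six hypothetical requests resembling Figure 6.2; two PMs suffice here.
demo = [(0, 6, 0.25), (1, 5, 0.5), (2, 8, 0.25),
        (0, 4, 0.5), (3, 9, 0.5), (5, 10, 0.5)]
print(len(first_fit_interval_packing(demo)))  # prints 2
```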
Divisible capacity configuration of VMs and PMs
As shown in Tables 6.1 and 6.2, (CPU, memory, and storage) capacity is handled as a whole in
this situation. For instance, the combined CPU, memory, and storage of the VM Types 1-1, 1-2,
and 1-3, respectively, are 1/16, 1/4, and 1/2 of the combined CPU, memory, and storage of the
PM Type 1. In a similar vein, the combined capacities of VM Types 2-1 and 2-2 are 1/8 and 1/4,