Power Management of Server Farms

The power management of server farms (Sf) is becoming a relevant problem in economic terms. Server farms total millions of servers all over the world, all of which need to be electrically powered. Research is thus expected to investigate methods to reduce Sf power consumption. However, saving power may result in a loss of performance (high response times), in other words, in a loss of Sf Quality of Service (QoS). By use of an Sf model, this paper investigates Sf power management strategies that seek compromises between power saving and QoS. Various Sf power management policies are studied in combination with the effects of job queueing disciplines. The (policy, discipline) pairs, or strategies, that optimize the Sf power consumption (minimum absorbed Watts), the Sf performance (minimum response time), and the Sf performance-per-Watt (minimum response-time-per-Watt) are identified. By use of the model, the work the server-farm manager has to do to direct the Sf is greatly simplified, since the universe of all possible strategies to choose from is drastically reduced to a very small set of most significant strategies.

Most of the power absorbed by the servers of a farm is wasted. Indeed, due to over-provisioning, servers are found busy (i.e. doing processing work) only 20% to 30% of the time, on average. So, energy saving requires the adoption of management policies that avoid powering the servers when they are not processing, in other words, policies that decide in which state (idle or off) to keep the servers when they are not busy. Two families of server management policies are considered in the literature [2,3,6,9]: static and dynamic policies. Dynamic policies assume time-varying demand patterns (i.e. job arrival rates λ(t) that change over time) and adapt themselves to the changing λ(t). Static policies, instead, are defined for a given λ that does not change with t. In a previous paper [11], static and dynamic policies that optimize the Sf power consumption and performance were considered. In that paper, however, only the effects of various policies π were studied, with no consideration of the effects of the queueing disciplines δ. In this paper only the static case is considered, but the effects of both policies π and disciplines δ are investigated, and the strategy, or combination (π, δ), that optimizes each of the following indices is studied:
• farm power consumption (minimum absorbed Watts),
• farm performance (minimum response time),
• farm performance-per-Watt (minimum response-time-per-Watt).

A busy server in the on state absorbs around 240 W, an idle server about 160 W and an off server zero W. So why not keep the servers in the idle state or in the off state when they are not busy? Because switching a server from off to on incurs a time overhead (the so-called setup time). Thus, any power-saving policy may result in time-wasting problems. As a consequence, the server farm may lose performance (increased response time to the incoming jobs, low throughput of VoIP and streaming packets, etc.) and its QoS may become unacceptable to customers.

Assume the farm consists of n servers. Various static policies for an n-server farm have been investigated in the literature [2,3,6,9]. The authors of [2], in particular, study three different static policies π to manage server farms: the On/Idle (or NeverOff) policy, the On/Off (or InstantOff) policy and the On/Off/Sleep (or Sleep) policy.

Under the On/Idle policy, servers are never turned off. All servers are either on (busy) or idle, and remain in the idle state when there are no jobs to serve. An arriving job that finds a server idle starts service on that server. An arriving job that finds all n servers on (busy) joins a central queue from which the servers pick jobs when they become idle.

Under the On/Off policy, instead, servers are immediately turned off when not in use, thus saving power with respect to On/Idle. As said above, however, there is a setup cost (in terms of time delay and of additional power penalty) for turning on an off server, and this yields an increase in response time. Fig. 1 compares the On/Off and the On/Idle policies [6] in an example case. The On/Idle policy proves to be better in terms of response time (11 versus 39 sec), since the incoming jobs do not suffer setup-time delays, but it consumes more power than the On/Off policy (780 versus 320 W), because an idle server keeps absorbing power while an off one does not. The moral is that to reduce power consumption one has to pay a response-time increase. In the Fig. 1 case, to reduce power consumption from 780 W to 320 W, we pay a response-time increase from 11 to 39 sec.

Finally, the On/Off/Sleep policy is similar to On/Off, except that whenever a server goes idle it goes into a sleep state, where it remains as long as there is no work to process, and it begins to wake up as soon as work arrives.

The quantity ρ = λ/(nµ), with µ the service rate of a single server, denotes the server farm load. It is known [9] that, for system stability, the condition 0 ≤ ρ < 1 is to be satisfied, in other words the condition 0 ≤ λ < nµ. We shall use the following notation to define the Sf parameters: T setup = server mean setup time, P on = busy-server power absorption in the on state, P idle = idle-server power absorption, P off = off-server power absorption, and S = average job service time.
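The load definition, the stability condition and the idle-versus-off trade-off can be illustrated with a short Python sketch. It is only a back-of-the-envelope aid, not the Sf model used in the paper. The function names are hypothetical, µ is taken to be the per-server service rate, and the power drawn during setup is assumed to equal the busy-server figure of 240 W, a value the text does not state.

```python
def farm_load(lam: float, n: int, mu: float) -> float:
    """Return the farm load rho = lam / (n * mu)."""
    if n <= 0 or mu <= 0:
        raise ValueError("n and mu must be positive")
    return lam / (n * mu)


def is_stable(lam: float, n: int, mu: float) -> bool:
    """Check the stability condition 0 <= rho < 1."""
    return 0 <= farm_load(lam, n, mu) < 1


def break_even_idle_gap(t_setup: float,
                        p_idle: float = 160.0,
                        p_setup: float = 240.0) -> float:
    """Idle-gap length (s) at which idling and switching off cost the same energy.

    Idling through a gap of g seconds costs p_idle * g joules.
    Switching off costs nothing during the gap, but the later wake-up
    costs p_setup * t_setup joules (p_setup = 240 W is an ASSUMPTION).
    """
    return p_setup * t_setup / p_idle


if __name__ == "__main__":
    # Hypothetical farm: 30 servers, 1 job/s per server, 15 jobs/s arriving.
    print(farm_load(15.0, 30, 1.0), is_stable(15.0, 30, 1.0))
    # Break-even gaps for the two setup times discussed in the paper.
    print(break_even_idle_gap(1.0), break_even_idle_gap(100.0))
```

Under these assumptions, switching a server off only pays in energy terms when the idle gap exceeds 1.5 times the setup time, which suggests why a long setup time can erase the power advantage of On/Off.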

Server farm power and QoS evaluation indices
As said above, we denote by policy π the server management policy and by queueing discipline δ the criterion according to which the servers pick jobs from the waiting queue when they become idle. The policies π = On/Off, π = On/Idle and π = On/Off/Sleep, and the time-independent and time-dependent disciplines δ, will be considered. Time-independent disciplines are conventional disciplines, such as FIFO (First In First Out), LIFO (Last In First Out) and RAND (Random extraction). Such disciplines are also called abstract disciplines. Time-dependent disciplines, instead, are disciplines in which the servers pick jobs from the waiting queue according to job size, in other words according to their service time S. An example of such a discipline is SPTF (Shortest Processing Time First) [10], also called SJF (Shortest Job First) [9], in which servers pick the jobs of shortest size first. As said above, to reduce the server farm power consumption, one has to pay a debt in the form of an increase in average response time. It is known [9,10] that, in queueing systems, SPTF minimizes the system mean response time. So, we conjecture that by using the SPTF discipline in server farm systems, the Sf debt to pay in response time is smaller than with abstract disciplines. Such a conjecture will be proved in the paper. Moreover, we shall see that, in some circumstances, the response-time benefit one may obtain by moving from the FIFO to the SPTF discipline is larger than the benefit one may obtain by moving from the On/Off to the On/Idle policy. Disciplines δ = FIFO and δ = SPTF will be considered in the paper.

For any given (π, δ) strategy, i.e. policy and discipline combination, we shall use the following notation for the Sf power and QoS indices:
• P(π, δ): the long-run average power absorbed by the Sf under the (π, δ) strategy;
• T(π, δ): the Sf mean response time under the (π, δ) strategy;
• PT(π, δ) = P(π, δ) ⋅ T(π, δ): the mean power by response-time product under the (π, δ) strategy.
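The difference between a time-independent and a time-dependent discipline, and the PT index defined above, can be made concrete with a minimal Python sketch. It is a deliberately simplified setting and not the Sf model of the paper: a single server, no setup time, and all jobs already waiting at time 0. The function names and the job sizes are hypothetical.

```python
from typing import List


def mean_response_time(sizes: List[float], discipline: str) -> float:
    """Mean response time of jobs that are all waiting at time 0.

    FIFO serves the jobs in list order; SPTF serves the shortest job first.
    Single server, no setup time (simplifying ASSUMPTIONS).
    """
    if not sizes:
        raise ValueError("at least one job is required")
    if discipline == "FIFO":
        order = list(sizes)
    elif discipline == "SPTF":
        order = sorted(sizes)
    else:
        raise ValueError(f"unknown discipline: {discipline}")
    clock = 0.0   # current time
    total = 0.0   # sum of job completion times
    for size in order:
        clock += size      # the job completes after its service time
        total += clock     # response time = completion time (arrival at 0)
    return total / len(order)


def pt_index(power_w: float, response_s: float) -> float:
    """PT = P * T, the power by response-time product."""
    return power_w * response_s


if __name__ == "__main__":
    jobs = [3.0, 1.0, 2.0]  # hypothetical service times in seconds
    for d in ("FIFO", "SPTF"):
        print(d, mean_response_time(jobs, d))
```

With these three jobs the total work is the same under both disciplines, but SPTF lets the short jobs leave early, so the mean response time is lower than under FIFO.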

Applied Mechanics and Materials Vol. 492
Tab.1 shows simulation results [8] that compare the Sf power and QoS indices in the low setup case (T setup = 1 sec):
• Looking at the power consumption P, we note that the queueing discipline δ has no effect on P, while the policy π has an effect for low ρ. Indeed, a drastic reduction can be seen (from 6000 W to 4200 W, for low ρ) when moving from On/Idle to On/Off, since when ρ is low the waiting queue is almost empty and thus a large number of servers are in the off state. For high ρ, instead, the power consumption P remains unchanged (P = 7100 W) with the discipline δ, since the queue is always full and thus the servers always remain in the on state.
• Looking at the response time T, we note that both the queueing discipline δ and the policy π have an effect. The effects hold both for low ρ and for high ρ. In the On/Idle case, when ρ is low, there is no waiting in the Sf queue and thus the mean response time T is about the mean job service time (S = 1 sec), while it increases (T = 1.8 sec for δ = FIFO and T = 1.3 sec for δ = SPTF) for high ρ. In the On/Off case, when ρ is low, the mean response time is higher (T = 1.2 sec, with no effect of the discipline since the queue is empty), because almost every arrival finds servers in the off state and thus every job incurs the setup time. For high ρ, instead, the mean response time increases (T = 2 sec for δ = FIFO and T = 1.35 sec for δ = SPTF) due to heavy queueing. As predicted above, we can see that the benefit in response time one may obtain by moving from FIFO to SPTF is larger than the one obtainable by moving from On/Off to On/Idle. Indeed (see high ρ), moving from the (On/Off, FIFO) strategy to the (On/Idle, FIFO) strategy, the response time T changes from 2 to 1.8 sec (a 10% reduction). Moving, instead, from the (On/Idle, FIFO) strategy to the (On/Idle, SPTF) strategy, T changes from 1.8 to 1.3 sec (a reduction of about 28%).
• Looking at the PT index, its values are a consequence of the P and T values. Tab.1 shows that the optimal PT is obtained for the (On/Off, FIFO) and (On/Off, SPTF) strategies when ρ is low, while it is obtained for the (On/Idle, SPTF) strategy only when ρ is high.

Tab.2 shows simulation results [8] that compare the Sf power and QoS indices in the high setup case (T setup = 100 sec):
• Looking at the power consumption P, the table shows that the queueing discipline δ has no effect on P, while the policy π has an effect for low ρ. Contrary to the low setup case (Tab.1), we see an increase of P (from 6000 W to 6500 W, for low ρ) when moving from On/Idle to On/Off, since, even though the queue is almost empty, the high setup time plays a preeminent role in the power consumption for low ρ. Indeed, as seen in the Sf parameters above, P setup is higher than P idle. For high ρ, instead, the power consumption P remains unchanged (P = 7100 W) with the discipline δ, since the queue is always full and thus the servers always remain in the on state.
• Looking at the response time T, both the queueing discipline δ and the policy π have an effect. The effect holds both for low ρ and for high ρ. In the On/Idle case, when ρ is low, there is no waiting in the Sf queue and thus the mean response time T is about the mean job service time (S = 1 sec), while it increases (T = 1.8 sec for δ = FIFO and T = 1.3 sec for δ = SPTF) for high ρ.
In the On/Off case, when ρ is low, the mean response time is higher (T = 12 sec for FIFO, which reduces to 3.4 sec for SPTF, due to the high setup time), since almost every arrival finds the servers in the off state and thus every job incurs the setup time. For high ρ, instead, the mean response time increases (T = 84 sec for δ = FIFO and T = 16.5 sec for δ = SPTF) due to heavy Sf queueing. In this high setup case, the effect of the policy π proves dominant with respect to the effect of the discipline δ. Indeed, moving from the (On/Off, FIFO) strategy to the (On/Idle, FIFO) strategy, the response time T changes from 84 sec to 1.8 sec (an almost 98% reduction), while moving from the (On/Off, FIFO) strategy to the (On/Off, SPTF) strategy, T changes from 84 sec to 16.5 sec (a reduction of about 80%).
• Looking at the PT index, its values are a consequence of the P and T values. Tab.2 shows that the optimal PT is obtained for the (On/Idle, FIFO) and (On/Idle, SPTF) strategies when ρ is low, while it is obtained for the (On/Idle, SPTF) strategy only when ρ is high.

In summary, predicting the Sf management strategy that optimizes
• the Sf power consumption (minimum absorbed Watts), or
• the Sf performance (minimum response time), or
• the Sf performance-per-Watt (minimum response-time-per-Watt)
is a non-trivial task. The most significant policies π are first to be drawn from the universe of all possible policies. Then, for each such policy, the effects of time-dependent and time-independent queueing disciplines are to be studied. On the other hand, once the modeling work has been done, the work the server-farm manager has to perform to direct the Sf is greatly simplified, since the universe of all possible (π, δ) strategies to choose from is drastically reduced to a very small set of most significant strategies.
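The selection of the optimal strategy per index can be reproduced from the figures quoted in the discussion above. The Python sketch below is only a transcription aid: the tables themselves are not reproduced here, so the entries are taken from the prose. Two assumptions are made and flagged in the code: the On/Idle response time at low ρ is set to exactly 1 sec (the text says "about" S = 1 sec), and P = 7100 W is assumed to hold for all four strategies at high ρ. The names RESULTS, best_strategies and pct_reduction are hypothetical.

```python
from typing import Dict, List, Tuple

Strategy = Tuple[str, str]      # (policy, discipline)
Indices = Tuple[float, float]   # (P in W, T in sec)

# Values quoted in the prose. ASSUMPTIONS: T = 1.0 sec for On/Idle at
# low rho, and P = 7100 W for every strategy at high rho.
RESULTS: Dict[Tuple[str, str], Dict[Strategy, Indices]] = {
    ("Tab.1", "low rho"): {
        ("On/Idle", "FIFO"): (6000.0, 1.0), ("On/Idle", "SPTF"): (6000.0, 1.0),
        ("On/Off", "FIFO"): (4200.0, 1.2), ("On/Off", "SPTF"): (4200.0, 1.2),
    },
    ("Tab.1", "high rho"): {
        ("On/Idle", "FIFO"): (7100.0, 1.8), ("On/Idle", "SPTF"): (7100.0, 1.3),
        ("On/Off", "FIFO"): (7100.0, 2.0), ("On/Off", "SPTF"): (7100.0, 1.35),
    },
    ("Tab.2", "low rho"): {
        ("On/Idle", "FIFO"): (6000.0, 1.0), ("On/Idle", "SPTF"): (6000.0, 1.0),
        ("On/Off", "FIFO"): (6500.0, 12.0), ("On/Off", "SPTF"): (6500.0, 3.4),
    },
    ("Tab.2", "high rho"): {
        ("On/Idle", "FIFO"): (7100.0, 1.8), ("On/Idle", "SPTF"): (7100.0, 1.3),
        ("On/Off", "FIFO"): (7100.0, 84.0), ("On/Off", "SPTF"): (7100.0, 16.5),
    },
}


def index_value(p: float, t: float, index: str) -> float:
    """Return P, T or PT = P * T for one strategy."""
    if index == "P":
        return p
    if index == "T":
        return t
    if index == "PT":
        return p * t
    raise ValueError(f"unknown index: {index}")


def best_strategies(case: Tuple[str, str], index: str) -> List[Strategy]:
    """All strategies that minimize the given index (ties are kept)."""
    scores = {s: index_value(p, t, index) for s, (p, t) in RESULTS[case].items()}
    best = min(scores.values())
    return [s for s, v in scores.items() if v == best]


def pct_reduction(old: float, new: float) -> float:
    """Percentage reduction when moving from old to new."""
    return 100.0 * (old - new) / old


if __name__ == "__main__":
    for case in RESULTS:
        print(case, "PT-optimal:", best_strategies(case, "PT"))
    print(round(pct_reduction(84.0, 1.8), 1), round(pct_reduction(84.0, 16.5), 1))
```

Under these assumptions the PT-optimal strategies returned by the sketch agree with those stated in the text for both tables and both load levels.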

Summary
The concept of server farm (Sf) management strategy (π, δ) has been introduced, together with a queueing model of the Sf, to study the combined effect of the Sf power management policy π and of the Sf queueing discipline δ, in order to evaluate various Sf power and QoS indices: 1) the optimal Sf power consumption, or minimum absorbed Watts, 2) the optimal Sf performance, or minimum response time, and 3) the optimal Sf performance-per-Watt, or minimum response-time-per-Watt. The set of the most significant policies π has been drawn from the universe of all possible policies and, for each such policy, the Sf model has been used to predict the effect of time-dependent and time-independent queueing disciplines. Model simulation results give the best management strategies the server-farm manager may use to minimize the farm power consumption while keeping its QoS at its best.

Power and Energy Systems III

Tab.2 Server farm results for high setup-time (T setup = 100 sec)