Towards Self-organising Agent-based Resource Allocation in a Multi-Server Environment

Tino Schlegel, Ryszard Kowalczyk
Swinburne University of Technology
Faculty of Information and Communication Technologies
Hawthorn, 3122 Victoria, Australia
{tschlegel, rkowalczyk}@ict.swin.edu.au

ABSTRACT
Distributed applications require distributed techniques for efficient resource allocation. These techniques need to take into account the heterogeneity and potential unreliability of resources and resource consumers in distributed environments. In this paper we propose a distributed algorithm that solves the resource allocation problem in distributed multi-agent systems. Our solution is based on the self-organisation of agents, which does not require any facilitator or management layer. The resource allocation in the system is a purely emergent effect. We present results of the proposed resource allocation mechanism in a simulated static and dynamic multi-server environment.

Categories and Subject Descriptors
I.2.11 [Distributed Artificial Intelligence]: Coherence and coordination

1. INTRODUCTION
With the increasing popularity of distributed computing technologies such as Grid [12] and Web services [20], the Internet is becoming a powerful computing platform where different software peers (e.g., agents) can use existing computing resources to perform tasks. In this sense, each agent is a resource consumer that acquires a certain amount of resources for the execution of its tasks. It is difficult for a central resource allocation mechanism to collect and manage the information about all shared resources and resource consumers to effectively perform the allocation of resources. Hence, distributed solutions to the resource allocation problem are required. Researchers have recognised these requirements [10] and proposed techniques for distributed resource allocation.
A promising class of such distributed approaches is based on economic market models [4], inspired by the principles of real stock markets. Even though those approaches are distributed, they usually require a facilitator for pricing, resource discovery and dispatching jobs to resources [5, 9]. Another largely unsolved problem of those approaches is the fine-tuning of price, time and budget constraints to enable efficient resource allocation in large, dynamic systems [22]. In this paper we propose a distributed solution to the resource allocation problem based on self-organisation of the resource consumers in a system with limited resources. In our approach, agents dynamically allocate tasks to servers that provide a limited amount of resources, and they select the execution platform for a task autonomously rather than asking a resource broker to do the allocation. All control needed for our algorithm is distributed among the agents in the system. They optimise the resource allocation process continuously over their lifetime in response to changes in the availability of shared resources by learning from past allocation decisions. The only information available to the agents is resource load and allocation success information from past resource allocations; no additional resource load information about servers is disseminated. The basic concept of our solution is inspired by inductive reasoning and bounded rationality as introduced by W. Brian Arthur [2]. The proposed mechanism does not require a central controlling authority or resource management layer, and it does not introduce additional communication between agents to decide which task is allocated on which server. We demonstrate that this mechanism performs well in dynamic systems with a large number of tasks and can easily be adapted to various system sizes. In addition, the overall system performance is not affected if agents or servers fail or become unavailable.
The proposed approach provides an easy way to implement distributed resource allocation and takes into account the tendencies of multi-agent systems towards autonomy, heterogeneity and unreliability of resources and agents. The proposed technique can easily be supplemented by techniques for queuing or rejecting resource allocation requests of agents [11]. Such self-managing capabilities of software agents allow reliable resource allocation even in an environment with unreliable resource providers. This is achieved by the mutual interactions between agents, applying techniques from complex system theory. Self-organisation of all agents leads to a self-organisation of the system resources and is an emergent property of the system [21].

978-81-904262-7-5 (RPS) (c) 2007 IFAAMAS

The remainder of the paper is structured as follows: The next section gives an overview of related work in the areas of load balancing, resource allocation and scheduling. Section 3 describes the model of a multi-agent environment that was used to conduct simulations for a performance evaluation. Sections 4 and 5 describe the distributed resource allocation algorithm and present various experimental results. A summary, conclusions and an outlook on future work conclude the paper.

2. RELATED WORK
Resource allocation is an important problem in the area of computer science. Over the past years, solutions based on different assumptions and constraints have been proposed by different research groups [7, 3, 15, 10]. Generally speaking, resource allocation is a mechanism or policy for the efficient and effective management of the access to a limited resource or set of resources by its consumers. In the simplest case, resource consumers ask a central broker or dispatcher for an available resource to which the resource consumer will be allocated. The broker usually has full knowledge about all system resources. All incoming requests are directed to the broker, who is the sole decision maker.
In those approaches, the resource consumer cannot influence the allocation decision process. Load balancing [3] is a special case of the resource allocation problem using a broker that tries to be fair to all resources by balancing the system load equally among all resource providers. This mechanism works best in a homogeneous system. A simple distributed technique for resource management is capacity planning, which refuses or queues incoming agents to avoid resource overload [11]. From the resource owner's perspective, this technique is important to prevent overload at the resource, but it is not sufficient for effective resource allocation; it can only provide a good supplement to distributed resource allocation mechanisms. Most of today's techniques for resource allocation in grid computing toolkits like Globus [12] or Condor-G [13] coordinate the resource allocation with an auctioneer, arbitrator, dispatcher, scheduler or manager. Those coordinators usually need global knowledge of the state of all system resources. An example of a dynamic resource allocation algorithm is the Cactus project [1] for the allocation of computationally very expensive jobs. The value of distributed solutions to the resource allocation problem has been recognised by research [10]. Inspired by the principles of stock markets, economic market models have been developed for trading resources to regulate supply and demand in the grid. These approaches use different pricing strategies such as posted price models, different auction methods or a commodity market model. Users try to purchase cheap resources required to run their jobs, while providers try to make as much profit as possible and operate the available resources at full capacity. A collection of different distributed resource allocation techniques based on market models is presented in Clearwater [10]. Buyya et al.
developed a resource allocation framework based on the regulation of supply and demand [4] for Nimrod-G [6], with the main focus on job deadlines and budget constraints. The Agent-based Resource Allocation Model (ARAM) for grids is designed to schedule computationally expensive jobs using agents. A drawback of this model is the extensive use of message exchange between agents for periodic monitoring and information exchange within the hierarchical structure. Subtasks of a job migrate through the network until they find a resource that meets the price constraints. The job's migration itinerary is determined by the resources, which are connected in different topologies [17]. The mechanism proposed in this paper eliminates the need for periodic information exchange about resource loads and does not need a connection topology between the resources. There has been considerable work on decentralised resource allocation techniques using game theory published over recent years. Most of them are formulated as repetitive games in an idealistic and simplified environment. For example, Arthur [2] introduced the so-called El Farol bar problem, which does not allow a perfect, logical and rational solution. It is an ill-defined decision problem that assumes and models inductive reasoning. It is probably one of the most studied examples of complex adaptive systems derived from the human way of deciding ill-defined problems. A variation of the El Farol problem is the so-called minority game [8]. In this repetitive decision game, an odd number of agents have to choose between two resources based on past success information, each trying to allocate itself to the resource chosen by the minority. Galstyan et al. [14] studied a variation with more than two resources, changing resource capacities and information from neighbouring agents. They showed that agents can adapt effectively to changing capacities in this environment using a set of simple look-up tables (strategies) per agent.
Another distributed technique that has been employed for solving the resource allocation problem is based on reinforcement learning [18]. Similar to our approach, a set of agents competes for a limited number of resources based only on prior individual experience. In that work, the system objective is to maximise system throughput while ensuring fairness to resources, measured as the average processing time per job unit. A resource allocation approach for sensor networks based on self-organisation techniques and reinforcement learning is presented in [16], with a main focus on optimising the energy consumption of network nodes. We [19] proposed a self-organising load balancing approach for a single server with a focus on optimising the communication costs of mobile agents. A mobile agent will reject a migration to a remote agent server if it expects the destination server to be already overloaded by other agents or server tasks. Agents make their decisions themselves based on forecasts of the server utilisation. In this paper, a solution for a multi-server environment is presented without consideration of communication or migration costs.

3. MODEL DESCRIPTION
We model a distributed multi-agent system as a network of servers L = {l1, . . . , lm}, agents A = {a1, . . . , an} and tasks T = {T1, . . . , Tk}. Each agent has a number of tasks Ti that need to be executed during its lifetime. A task Ti requires U(Ti, t) resources for its execution at time t, independent of its execution server. Resources for the execution of tasks are provided by each server li. A task's execution location is specified by the map L : T × t → L. An agent has to know about the existence of server resources in order to allocate tasks at those resources. We write LS(ai)
The Sixth Intl. Joint Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS 07)

Figure 1: An illustration of our multi-server model with exclusive and shared resources for the agent execution.

to address the set of resources known by agent ai. Resources in the system can be used by all agents for the execution of tasks. The amount of provided resources C(li, t) of each server can vary over time. The resource utilisation of a server li at time t is calculated using equation 1 by adding the resource consumption U(Tj, t) of each task Tj that is executed at the resource at time t. All resource units used in our model represent real metrics such as memory or processor cycles.

U(li, t) = Σ_{j=1..n} U(Tj, t) | L(Tj, t) = li   (1)

In addition to the case that the total amount of system resources is enough to execute all tasks, we are also interested in the case that not enough system resources are provided to fulfil all allocation requests, i.e., the overall shared resource capacity is lower than the amount of resources requested by agents. In this case, some agents must wait with their allocation requests until free resources are expected. The multi-agent system model used for our simulations is illustrated in Fig. 1.

4. SELF-ORGANISING RESOURCE ALLOCATION
The resource allocation algorithm described in this section is integrated in each agent. The only information required to make a resource allocation decision for a task is the server utilisation observed from completed task allocations at those servers. There is no additional dissemination of information about server resource utilisation or free resources. Our solution demonstrates that agents can self-organise in a dynamic environment without active monitoring information that causes a lot of network traffic overhead. Additionally, we do not have any central controlling authority.
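As a toy illustration of the server utilisation in Eq. 1, the load of a server at time t can be computed by summing the consumption of the tasks currently placed on it. All names below (`server_utilisation`, `placement`, `units`) are hypothetical and only sketch the definition:

```python
def server_utilisation(tasks, location, consumption, server, t):
    """Sum the resource consumption U(T_j, t) of every task T_j
    whose execution location L(T_j, t) is the given server (Eq. 1)."""
    return sum(consumption(task, t) for task in tasks
               if location(task, t) == server)

# Toy usage: three tasks on two servers; consumption is constant in time.
tasks = ["T1", "T2", "T3"]
placement = {"T1": "l1", "T2": "l1", "T3": "l2"}
units = {"T1": 10, "T2": 25, "T3": 40}

load_l1 = server_utilisation(tasks, lambda T, t: placement[T],
                             lambda T, t: units[T], "l1", t=0)
# T1 and T2 run on l1, so load_l1 == 10 + 25 == 35
```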
All behaviour that leads to the resource allocation is created by the effective competition of the agents for shared resources and is a purely emergent effect. The agents in our multi-agent system compete for a resource or a set of resources to execute tasks. The collective action of these agents changes the environment and, as time goes by, they have to adapt to these changes to compete more effectively in the newly created environment. Our approach is based on different agent beliefs, represented by predictors, and different information about the environment. Agents prefer a task allocation at a server with free resources. However, there is no way to be sure of the amount of free server resources in advance. All agents have the same preferences, and an agent will allocate a task on a server if it expects enough free resources for its execution. There is no communication between agents; actions taken by agents influence the actions of other agents only indirectly. The applied mechanism is inspired by inductive reasoning and bounded rationality principles [2]. It is derived from the human way of deciding ill-defined problems: humans tend to keep in mind many hypotheses and act on the most plausible one. Therefore, each agent keeps track of the performance of a private collection of predictors and selects the one that is currently most promising for decision making.

4.1 Resource Allocation Algorithm
This section describes the decision mechanism of our self-organising resource allocation. All necessary control is integrated in the agents themselves; there is no higher controlling authority, management layer for decision support, or information distribution. All agents have a set of predictors for each resource to forecast the future resource utilisation of these servers for a potential task allocation. To do so, agents use historical information from past task allocations at those resources.
Based on the forecasted resource utilisation, the agent makes its resource allocation decision. After the task has finished its execution and returned the results back to the agent, the predictor performances are evaluated and the history information is updated. Algorithm 1 shows the resource allocation algorithm for each agent. The agent first predicts the next step's resource load for each server from historical information (lines 3-7). If the predicted resource load plus the task's resource consumption is below the last known server capacity, the server is added to the list of candidates for the allocation. The agent then evaluates whether any free shared resources for the task allocation are expected. In the case that no free resources are expected (line 9), the agent explores resources by allocating the task at a randomly selected server from all servers whose load it cannot predict, in order to gather resource load information. This is the standard case at the beginning of the agent life-cycle, when no information about the environment is available. The resource load prediction itself uses a set of r predictors P(a, l) := {pi | 1 ≤ i ≤ r} per server. One predictor pA ∈ P of each set is called the active predictor; it forecasts the next step's resource load. Each predictor is a function P : H → ℕ ∪ {0} from the space of history data H to a non-negative integer, which is the forecasted value. For example, a predictor could forecast a resource load equal to the average amount of occupied resources during the last executions at this resource. A history H of resource load information is a list of up to m history items hi = (xi, yi), comprising the observation date xi and the observed value yi. The most recent history item is h0.

Hm(li) = ((x0, y0), . . . , (xk, yk)), 0 ≤ k < m   (2)

Our algorithm uses a set of predictors rather than only one, to avoid that all agents make the same decision based on the predicted value, leading to an invalidation of their beliefs.
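The bounded history H_m and a predictor over it can be sketched as follows. The class and function names are hypothetical; as one concrete predictor we use a mean over the n most recent observations, in the spirit of the example above:

```python
from collections import deque

class History:
    """Bounded history H_m of items (observation date x_i, observed
    load y_i); the most recent item sits at index 0."""
    def __init__(self, m=10):
        self.items = deque(maxlen=m)  # oldest item dropped automatically
    def record(self, x, y):
        self.items.appendleft((x, y))

def n_mean_predictor(n):
    """Predictor P: H -> non-negative integer; here, the mean of the
    n most recent observed loads."""
    def predict(history):
        recent = [y for _, y in list(history.items)[:n]]
        if not recent:
            return None  # no forecast possible without history
        return round(sum(recent) / len(recent))
    return predict

h = History(m=10)
for t, load in [(1, 40), (2, 60), (3, 50)]:
    h.record(t, load)
print(n_mean_predictor(2)(h))  # mean of the two most recent loads: 55
```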
Imagine that only one shared resource is known by a number of agents, each using one predictor that forecasts the same value as the last known server resource utilisation. All agents that allocated a task at a server that was slightly overloaded would dismiss another allocation at this server, as they expect the server to be overloaded again based on the predictions. As a result, the server would have a large amount of free resources. A set of different predictors that predict different values avoids this situation of invalidating the beliefs of the agents [19].

Figure 2: (a) Collected resource load information from previous task allocations that is used for future predictions. (b) Predictor's probability distribution for selecting the new active predictor.

An example of resource load information collected from the last 5 visits of an agent at a shared resource can be seen in Fig. 2(a). It shows that the resource was visited frequently, which means free resources for execution were available and an exploration of other servers was unnecessary. This may change in the future, as the resource load has significantly increased recently. In the case where the set of servers predicted to have free resources is not empty (line 13), the agent selects one of those for allocation. We have implemented two alternative algorithms for the selection of a server for the task allocation.
Algorithm 1: Resource allocation algorithm of an agent
1  L ← ∅  // servers with free resources
2  u ← U(T, t+1)  // task resource consumption
3  for all P(a, l) | l ∈ LS(a) do
4      U(l) ← resourceLoadPrediction(P(a, l), t+1)
5      if U(l) + u ≤ C(l) then
6          L ← L ∪ {P(a, l)}
7      end if
8  end for
9  if L = ∅ then
10     // all unpredictable shared resources
11     E ← LS(a) \ {l ∈ LS(a) | P(a, l) ∈ L}
12     allocationServer ← a random element of E
13 else
14     allocationServer ← serverSelection(L)
15 end if
16 return allocationServer

Algorithm 2 shows the first method, a non-deterministic selection according to the predictability of the server resource utilisation. A probability distribution is calculated from the confidence levels of the resource predictions. The confidence level depends on three factors: the accuracy of the active predictor, the amount of historical information about the server, and the average age of the history information (see Eq. 3). The server with the highest confidence level has the biggest chance to be selected as the allocation server.

G(P) = w1 · size(H)/m + w2 · Age(H)/max Age(H) + w3 · g(p)/max g(p)   (3)

where:
  wi      - weights
  size(H) - number of data items in the history
  m       - maximal number of history values
  Age(H)  - average age of the historical data
  g(p)    - see Eq. 4

Algorithm 2: serverSelection(L) - best predictable server
1  for all P(a, l) ∈ L do
2      calculate G(P)
3  end for
4  transform all G(P) into a probability distribution
5  return l ∈ LS selected according to the probability distribution

Algorithm 3: serverSelection(L) - most free resources
1  for all P(a, l) ∈ L do
2      c(l) ← C(l) − U(l)
3  end for
4  return l ∈ LS | c(l) is maximal

The second alternative method for selecting a server from the set of servers predicted to have free resources is deterministic and shown in Algorithm 3: the server with the most expected free resources in L is chosen.
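The confidence-weighted roulette wheel of Algorithm 2 and Eq. 3 can be sketched as below. The dictionary fields (`size`, `age`, `rating`) and equal weights are assumptions for the sake of illustration; the sketch also assumes non-negative ratings so the confidence levels form a valid probability distribution:

```python
import random

def confidence(info, m, max_age, max_rating, w=(1.0, 1.0, 1.0)):
    """Confidence level G(P) of a server's prediction (Eq. 3): a weighted
    sum of the history fill level size(H)/m, the normalised average age
    of the history, and the active predictor's normalised rating g(p)."""
    return (w[0] * info["size"] / m
            + w[1] * info["age"] / max_age
            + w[2] * info["rating"] / max_rating)

def server_selection(candidates, m=10):
    """Sketch of Algorithm 2: roulette-wheel selection among the servers
    predicted to have free resources; higher confidence means a higher
    selection probability, but no server is chosen deterministically."""
    max_age = max(c["age"] for c in candidates.values()) or 1
    max_rating = max(c["rating"] for c in candidates.values()) or 1
    levels = {s: confidence(c, m, max_age, max_rating)
              for s, c in candidates.items()}
    servers = list(levels)
    return random.choices(servers, weights=[levels[s] for s in servers])[0]

candidates = {"l1": {"size": 8, "age": 3, "rating": 5},
              "l2": {"size": 4, "age": 9, "rating": 2}}
print(server_selection(candidates))  # "l1" is selected more often
```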
In the case where all agents predict the most free resources for one particular server, all agents would allocate their tasks at this server, which would invalidate the agents' beliefs. However, our experiments show that different individual history information and the non-deterministic active predictor selection usually prevent this situation. In the case that the resource allocation algorithm does not return any server (Alg. 1, line 16), allocation at a resource is not recommended and the agent will not allocate the task. This case happens only if a resource load prediction is possible for all servers but no free resources are expected. After the task execution has finished, the evaluation process described in Algorithm 4 is performed. This process is divided into three cases. In the first case, the task was not allocated at a resource. Here, the agent cannot decide whether the decision not to allocate the task was correct or not. The agent then removes old historical data. This is necessary for a successful adaptation in the future: if the agent did not delete old historical information, the prediction would always forecast that no free resources are available, and the agent would never allocate a task at one of the resources in the future. Old historical information is removed from the agent's resource history using a decay rate. The decay rate is a cumulative distribution function that gives the probability that a history item is deleted after it has reached a certain age. The current implementation uses a constant probability density function on a configurable domain.

Figure 3: Decay rate of historical information.

Figure 3 shows an example of such a cumulative distribution function for the decay rate. Depending on the environment, the probability density function must be adapted.
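The decay of historical data can be sketched as follows. A constant probability density on an age interval yields a deletion probability that grows linearly with age, as in Fig. 3; the interval bounds `t_min` and `t_max` are assumptions, since the paper only states that the domain is configurable:

```python
import random

def evaporate(history, t_now, t_min=20, t_max=60):
    """Delete each history item (x_i, y_i) with a probability that grows
    linearly with its age between t_min and t_max: items younger than
    t_min are always kept, items older than t_max are always removed."""
    def keep(item):
        age = t_now - item[0]
        p_delete = min(max((age - t_min) / (t_max - t_min), 0.0), 1.0)
        return random.random() >= p_delete
    return [item for item in history if keep(item)]

history = [(100, 30), (70, 55), (30, 80)]  # (observation date, load)
fresh = evaporate(history, t_now=110)
# age 10: always kept; age 40: kept with probability 0.5; age 80: removed
```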
If the number of potential servers per agent is high, historical information must be kept longer to avoid unnecessary re-exploration of already explored resources. In addition, a dynamic environment requires more up-to-date information to make reliable predictions. The second case in the evaluation process (Alg. 4, line 5) describes the actions taken after a server was visited for the first time. The agent creates a new predictor set for this server and records the historical information. All predictors for this set are chosen randomly from a predefined set.

g(p) = Σ_{i=0..l} ri   (4)

where:
  ri = 1 if the ith decision was correct, 0 if the ith outcome is unknown, −1 if the ith decision was wrong

The general case (Alg. 4, line 8) is the evaluation after the agent allocated the task at a resource. The agent evaluates all predictors of the predictor set for this resource by predicting the resource load with all predictors based on the old historical data. Predictors that made a correct prediction, meaning the resource allocation decision was correct, receive a positive rating. This is the case if the resource was not overloaded and free resources for execution were predicted, or if the resource was overloaded and the predictor would have prevented the allocation. All predictors that predicted values that would have led to wrong decisions receive negative ratings. In all other cases, which include the case that no prediction was possible, a neutral rating is given to the predictors. Based on these performance ratings, the confidence levels are calculated using equation 4. The confidence of all predictors that cannot predict with the current historical information about the server is set to zero to prevent their selection as the new active predictor. These values are transformed into a probability distribution, and according to this probability distribution the new active predictor is chosen, implemented as a roulette wheel selection.
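The rating sum g(p) of Eq. 4 and the roulette-wheel choice of the new active predictor can be sketched as follows. Flooring negative confidences at zero is an assumption; the paper only states that predictors unable to predict get confidence zero and that the values are transformed into a probability distribution:

```python
import random

def g(ratings):
    """Confidence g(p) of a predictor (Eq. 4): the sum of its recorded
    ratings r_i, each +1 (correct decision), 0 (unknown outcome) or
    -1 (wrong decision)."""
    return sum(ratings)

def choose_active(predictor_ratings):
    """Roulette-wheel selection of the new active predictor from the
    confidence levels; non-positive confidences get weight zero, so such
    predictors are never selected while any positive candidate exists."""
    weights = {p: max(g(r), 0) for p, r in predictor_ratings.items()}
    total = sum(weights.values())
    if total == 0:
        return random.choice(list(predictor_ratings))  # no information yet
    return random.choices(list(weights), weights=list(weights.values()))[0]

ratings = {"p1": [1, 1, -1, 1], "p2": [1, -1, -1], "p3": [0, 0, 1]}
active = choose_active(ratings)  # "p2" has g = -1, so it is never chosen
```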
Figure 2(b) illustrates the probabilities of a set of 10 predictors, which have been calculated from the predictor confidence levels. Even though predictor 9 has the highest selection probability, it was not chosen as the active predictor by the roulette wheel selection process. This non-deterministic predictor selection prevents the invalidation of the agents' beliefs in case agents have the same set of predictors. The prediction accuracy, that is, the error of the prediction compared to the observed value, is not taken into consideration. Suppose the active predictor predicts slightly above the resource capacity, which leads to no allocation at a resource although enough resources for the execution would in fact be available. A less accurate prediction far below the capacity would lead to the correct decision and is therefore preferred. The last action of the evaluation algorithm (Alg. 4, line 22) updates the history with the latest resource load information of the server. The oldest history data is overwritten if m history values are already recorded for the server.

Algorithm 4: Decision Evaluation
1  if l ∈ LE then
2      for all P(a, l) | l ∈ LS(a) do
3          evaporate old historical data
4      end for
5  else if P(a, l) = null then
6      create(P(a, l))
7      update H(l)
8  else
9      for all p ∈ P(a, l) do
10         pred ← resourceLoadPrediction(p)
11         if (U(l) ≤ C(l) AND pred + U(a, t) ≤ C(l)) OR (U(l) > C(l) AND pred + U(a, t) > C(l)) then
12             addPositiveRating(p)
13         else if (U(l) ≤ C(l) AND pred + U(a, t) > C(l)) OR (U(l) > C(l) AND pred + U(a, t) ≤ C(l)) then
14             addNegativeRating(p)
15         else
16             addNeutralRating(p)
17         end if
18     end for
19     calculate all g(p); g(p) ← 0 if p cannot predict
20     transform all g(p) into a probability distribution
21     pA ← p ∈ P(a, l) selected according to this probability distribution
22     update H(l)
23 end if

4.2 Remarks and Limitations of the Approach
Our prediction mechanism uses a number of different types of simple predictors rather than one sophisticated predictor.
This ensures that agents can compete more effectively in a changing environment. Different types of predictors are suitable for different situations and environments. Therefore, all predictors are evaluated after each decision and a new active predictor is selected. This non-deterministic selection of the new active predictor ensures that the agents' beliefs are not invalidated, which would happen if all predictors led to the same decision. This is especially important if there is only one shared resource available and all agents only have the choice of using this one shared resource or not [19]. Our self-organising approach is robust against failures of resources or agents in the system. If they join or leave, the system self-organises quickly and adapts to the new conditions. There is no classical bottleneck or single point of failure as in centralised mechanisms. The main limitation is the reliance on historical resource utilisation information about other servers. A forecast of the resource utilisation of a remote server is only possible if an agent has a number of historical observations about that shared resource. If the number of servers per agent is very large, there is no efficient way to gather historical information about remote servers. This problem occurs if the amount of provided shared resources is limited and not enough for all resource consumers. In this case, an agent would randomly try all known servers until it finds one with free resources or none is left. In the worst case, by the time all servers have been tried, the historical information about the first servers is already outdated.

5. EXPERIMENTAL EVALUATION
The first part of this section gives a short overview of the setup of our simulation environment. In the rest of the section, results of the experiments are presented and discussed.
All experiments are conducted in a special testbed that simulates and models a multi-agent system. We have implemented this testbed in the Java programming language, independent from any specific agent toolkit. It allows a variety of experiments in stable as well as dynamic environments with a configurable number of agents, tasks and servers. An event-driven model is used to trigger all activities in the system. For all simulations, we limited the number of history data items for each server to 10, the number of performance ratings per predictor to 10 and assigned 10 predictors to every predictor set of each agent. All predictors are chosen randomly from an arbitrary predefined set of 32 predictors of the following types, differing in cycles or window sizes:

- n-cycle predictor: p(n) = yn uses the nth-last history value
- n-mean predictor: p(n) = (1/n) · Σ_{i=1..n} yi uses the mean value of the n last history values
- n-linear regression predictor: p(n, t) = a·t + b uses the linear regression value from the last n history values, where a, b are calculated by least-squares fitting over the last n history data items
- n-distribution predictor: uses a random value from the frequency distribution of the n last history values
- n-mirror predictor: p(n) = 2 · mean(H) − yn uses the mirror image of the nth-last history value around the mean of all history values

The efficiency of our proposed self-organising resource allocation is assessed by the resource load development of each server over the simulation as well as the total resource load development accumulated over all shared resources. Resource loads for each server are calculated using equation 1 as the sum of the resource consumption of all tasks currently executed at this server. The total resource load of the system is calculated as the sum of the resource loads of all resources. The self-organising resource allocation algorithm has random elements.
Because the algorithm has random elements, the presented results show mean values and standard deviations calculated over 100 repeated experiments.

5.1 Experimental Setup
The following parameters have an impact on the resource allocation process. We give an overview of the parameters and a short description.

- Agents: The number of agents involved in the resource allocation. This number varies in the experiments between 650 and 750, depending on the total amount of available system resources.
- Resource consumption: Each task consumes server resources for its execution. The resource consumption is assigned randomly to each task from an interval prior to its allocation. Resource consumption is specified in resource units, which correspond to real-world metrics like memory or processor cycles.
- Agent home server: All agents are located on a home agent server. The resources of those servers are not considered in our simulation and do not affect the resource allocation performance.
- Server resources: The experiments use servers with different amounts of available shared resources. The first experiment is conducted in a static server environment where each server provides a constant amount of shared resources, while the second experiment varies the available server resources during the simulation. The total amount of resources remains constant in both experiments.
- Execution time: The time a task needs for its execution, independent of the execution platform. For this time the task consumes the assigned amount of server resources. This parameter is randomly assigned before the execution.
- Task creation time: The time before the next task is created after a successful or unsuccessful completion. This parameter influences the age of the historical information about resources and has a major influence on the length of the initial adaptation phase. This parameter is randomly assigned after the task was completed.
5.2 Experimental Results
This section shows results from selected experiments that demonstrate the performance of our proposed resource allocation mechanism. The first experiment shows the performance in a stable environment where a number of agents allocate tasks to servers that provide a constant amount of resources. The second experiment was conducted in a dynamic server environment with a constant number of agents.
The first experiment shows our model in a stable 3-server environment that provides a total amount of 7000 resource units. The resource capacity of each server remains constant over the experiment. We used 650 agents with execution times between 1 and 15 time units and task creation times in the interval [0, 30] time units. The task's resource consumption is randomly assigned from the interval [1, 45] resource units. Figure 4 shows the results from 100 repetitions of this experiment. Figure 4(a) shows that the total amount of provided resources is larger than the demand for resources on average. At the beginning of the experiment, all agents allocate their tasks randomly at one of the available servers and explore the available capacities and resource utilisations for about 150 time units. During this initial exploration phase the average resource load of each server is at a similar level. This causes an overload situation at server 1 because of its low capacity of shared resources, and a large amount of free resources on server 2. Agents that allocated tasks to server 1 detect the overload situation and randomly explore other available servers, where they find free resources at server 2. After the learning period, the agents have self-organised themselves in this stable environment and found a stable solution for the allocation of all tasks.
The Sixth Intl. Joint Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS 07)
[Figure 4: Results of experiment 1 in a static 3-server environment averaged over 100 repetitions. Panels: (a) total resource load versus total shared resource capacity; (b) resource load of server 0; (c) resource load of server 1; (d) resource load of server 2.]
The standard deviation of the resource loads is small for each server, which indicates that our distributed approach finds stable solutions in almost every run. This experiment used algorithm 2 for the selection of the active server. We also ran the same experiment with the most-free-resources selection mechanism to select the active server. The resource allocation for each server is similar, and the absolute amount of free resources per server is almost the same.
Experiment 2 was conducted in a dynamic 3-server environment with 750 agents. The amounts of resources of server 0 and server 1 change periodically, while the total amount of available resources remains constant. Server 0 has an initial capacity of 1000 units; server 1 starts with a capacity of 4000 units. The change in capacity starts after 150 time units, which is approximately the end of the learning phase. Figure 5 (b, c, d) shows the behaviour of our self-organising resource allocation in this environment. All agents use the deterministic most-free-resources selection mechanism to select the active server. It can be seen in Fig. 5(b) and 5(c) that the amount of resources allocated to server 0 and server 1 changes periodically with the amount of provided resources.
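The per-server load of equation 1 and the deterministic most-free-resources selection can be sketched as follows (the names are ours; each agent predicts every server's load and picks the server with the largest expected amount of free resources):

```python
def server_load(consumptions):
    """Equation 1: sum of the resource consumption of all tasks
    currently executing at a server."""
    return sum(consumptions)

def most_free_resources(capacities, predicted_loads):
    """Select the server whose predicted free capacity is largest."""
    return max(capacities, key=lambda s: capacities[s] - predicted_loads[s])
```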
These results show that agents can sense available resources in this dynamic environment and are able to adapt to those changes. The resource load development of server 2 (see Fig. 5(d)) shows a periodic change because some agents try to allocate tasks to this server when their previously favoured server reduces its amount of shared resources. The total resource load of all shared resources is constant over the experiments, which indicates that all agents allocate their tasks to one of the shared resources (cf. Fig. 5(a)).
[Figure 5: Results of experiment 2 in a dynamic server environment averaged over 100 repetitions. Panels: (a) total resource load versus total shared resource capacity; (b) resource load of server 0; (c) resource load of server 1; (d) resource load of server 2.]
6. CONCLUSIONS AND FUTURE WORK
In this paper a self-organising distributed resource allocation technique for multi-agent systems was presented. We enable agents to select the execution platform for their tasks themselves before each execution at run-time. In our approach the agents compete for an allocation at one of the available shared resources. Agents sense their server environment and adapt their actions to compete more efficiently in the newly created environment. This process is adaptive and has a strong feedback, as allocation decisions indirectly influence the decisions of other agents. The resource allocation is a purely emergent effect. Our mechanism demonstrates that resource allocation can be done by the effective competition of individual and autonomous agents.
Neither do they need coordination or information from a higher authority, nor is additional direct communication between agents required. This mechanism was inspired by the principles of inductive reasoning and bounded rationality, which enable the agents to adapt their strategies to compete effectively in a dynamic environment. In the case that a server becomes unavailable, the agents can adapt quickly to this new situation by exploring new resources, or remain at the home server if an allocation is not possible. Especially in dynamic and scalable environments such as grid systems, a robust and distributed mechanism for resource allocation is required. Our self-organising resource allocation approach was evaluated with a number of simulation experiments in a dynamic environment of agents and server resources. The presented results for this new approach to strategic migration optimisation are very promising and justify further investigation in a real multi-agent system environment. It is a distributed, scalable and easy-to-understand policy for the regulation of supply and demand of resources. All control is implemented in the agents. A simple decision mechanism based on different beliefs of the agents creates an emergent behaviour that leads to effective resource allocation. This approach can easily be extended or supported by resource balancing/queuing mechanisms provided by resources. Our approach adapts to changes in the environment, but it is not evolutionary: there is no discovery of new strategies by the agents, and the set of predictors stays the same over an agent's whole life. We believe that evolving new strategies could further improve the system's behaviour over a long-term period and could be investigated in the future. Such evolution would be very slow and selective and would not influence the system behaviour in the short-term period covered by our experimental results.
In the near future we will investigate whether an automatic adaptation of the decay rate of historical information in our algorithm is possible and can improve the resource allocation performance. The decay rate is currently predefined and must be altered manually depending on the environment. A large number of shared resources requires older historical information to avoid too frequent resource exploration. In contrast, a dynamic environment with varying capacities requires more up-to-date information to make reliable predictions. We are aware of the long learning phase in environments with a large number of shared resources known by each agent. In the case that agents request more resources than all servers provide, all agents will randomly explore all known servers. This process of acquiring resource load information about all servers can take a long time when not enough shared resources are provided for all tasks. In the worst case, by the time all servers have been explored, the historical information of some servers could already be outdated and the exploration starts again. In this situation, it is difficult for an agent to efficiently gather historical information about all remote servers. This issue needs more investigation in the future.
7. REFERENCES
[1] G. Allen, W. Benger, T. Dramlitsch, T. Goodale, H.-C. Hege, G. Lanfermann, A. Merzky, T. Radke, E. Seidel, and J. Shalf. Cactus Tools for Grid Applications. Cluster Computing, 4:179-188, 2001. Kluwer Academic Publishers, Hingham, MA, USA.
[2] W. B. Arthur. Inductive Reasoning and Bounded Rationality. American Economic Review (Papers and Proceedings), 84(2):406-411, May 1994.
[3] T. Bourke. Server Load Balancing. O'Reilly Media, 1st edition, August 2001.
[4] R. Buyya. Economic-based Distributed Resource Management and Scheduling for Grid Computing. PhD thesis, Monash University, Melbourne, Australia, May 2002.
[5] R. Buyya, D. Abramson, J.
Giddy, and H. Stockinger. Economic Models for Resource Management and Scheduling in Grid Computing. Concurrency and Computation: Practice and Experience (Special Issue on Grid Computing Environments), 14(13-15):1507-1542, 2002.
[6] R. Buyya, S. Chapin, and D. DiNucci. Architectural Models for Resource Management in the Grid. In Proceedings of the First International Workshop on Grid Computing, pages 18-35. Springer LNCS, 2000.
[7] T. L. Casavant and J. G. Kuhl. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Transactions on Software Engineering, 14(2):141-154, February 1988.
[8] D. Challet and Y. Zhang. Emergence of Cooperation and Organization in an Evolutionary Game. Physica A, 246:407-418, 1997.
[9] K.-P. Chow and Y.-K. Kwok. On load balancing for distributed multiagent computing. IEEE Transactions on Parallel and Distributed Systems, 13:787-801, August 2002.
[10] S. H. Clearwater. Market-based Control: A Paradigm for Distributed Resource Allocation. World Scientific, Singapore, 1996.
[11] C. Flüs. Capacity Planning of Mobile Agent Systems Designing Efficient Intranet Applications. PhD thesis, Universität Duisburg-Essen (Germany), February 2005.
[12] I. Foster and C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit. International Journal of Supercomputing Applications, 11(2):115-129, 1997.
[13] J. Frey, T. Tannenbaum, I. Foster, M. Livny, and S. Tuecke. Condor-G: A Computation Management Agent for Multi-Institutional Grids. Cluster Computing, 5(3):237-246, 2002.
[14] A. Galstyan, S. Kolar, and K. Lerman. Resource allocation games with changing resource capacities. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, pages 145-152, Melbourne, Australia, 2003. ACM Press, New York, NY, USA.
[15] C. Georgousopoulos and O. F. Rana. Combining state and model-based approaches for mobile agent load balancing.
In SAC '03: Proceedings of the 2003 ACM Symposium on Applied Computing, pages 878-885, New York, NY, USA, 2003. ACM Press.
[16] G. Mainland, D. C. Parkes, and M. Welsh. Decentralized Adaptive Resource Allocation for Sensor Networks. In Proceedings of the 2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI '05), May 2005.
[17] S. Manvi, M. Birje, and B. Prasad. An Agent-based Resource Allocation Model for Computational Grids. Multiagent and Grid Systems - An International Journal, 1(1):17-27, 2005.
[18] A. Schaerf, Y. Shoham, and M. Tennenholtz. Adaptive Load Balancing: A Study in Multi-Agent Learning. Journal of Artificial Intelligence Research, 2:475-500, 1995.
[19] T. Schlegel, P. Braun, and R. Kowalczyk. Towards Autonomous Mobile Agents with Emergent Migration Behaviour. In Proceedings of the Fifth International Joint Conference on Autonomous Agents & Multi Agent Systems (AAMAS 2006), Hakodate (Japan), pages 585-592. ACM Press, May 2006.
[20] W3C. Web services activity, 2002. http://www.w3.org/2002/ws - last visited 23.10.2006.
[21] M. M. Waldrop. Complexity: The Emerging Science at the Edge of Order and Chaos. Simon & Schuster, 1st edition, 1992.
[22] R. Wolski, J. S. Plank, J. Brevik, and T. Bryan. Analyzing Market-Based Resource Allocation Strategies for the Computational Grid. International Journal of High Performance Computing Applications, 15:258-281. Sage Science Press, 2001.