SMILE: Sound Multi-agent Incremental LEarning ;-)*

Gauvain Bourgne, LAMSADE, UMR 7024 CNRS, University Paris-Dauphine, 75775 Paris Cedex 16
Amal El Fallah Seghrouchni, LIP6, UMR 7606 CNRS, University Paris 6, 104, Av. du président Kennedy, 75116 Paris
Henry Soldano, LIPN, UMR 7030 CNRS, University Paris-Nord, 99 Av. J-B Clément, 93430 Villetaneuse

* The primary author of this paper is a student.

ABSTRACT
This article deals with the problem of collaborative learning in a multi-agent system. Here each agent can incrementally update its beliefs B (the concept representation) so that they are kept consistent with the whole set of information K (the examples) that it has received from the environment or from other agents. We extend this notion of consistency (or soundness) to the whole MAS and discuss how to ensure that, at any moment, the same consistent concept representation is present in each agent. The corresponding protocol is applied to supervised concept learning. The resulting method, SMILE (standing for Sound Multi-agent Incremental LEarning), is described and experimented with here. Surprisingly, some difficult boolean formulas are better learned, given the same learning set, by a multi-agent system than by a single agent.

Categories and Subject Descriptors
I.2.6 [Artificial Intelligence]: Learning - Concept learning; I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence - Multiagent systems

1. INTRODUCTION
This article deals with the problem of collaborative concept learning in a multi-agent system. [6] introduces a characterisation of learning in multi-agent systems according to the level of awareness of the agents. At level 1, agents learn in the system without taking into account the presence of other agents, except through the modifications their actions bring upon the environment. Level 2 implies direct interaction between the agents, as they can exchange messages to improve their learning. Level 3 would further require agents to take into account the competencies of other agents, and to be able to learn from observing the other agents' behaviour (while considering them as independent entities and not as an undifferentiated part of the environment, as in level 1). In this paper we focus on level 2, studying direct interaction between agents involved in a learning process. Each agent is assumed to be able to learn incrementally from the data it receives, meaning that each agent can update its belief set B to keep it consistent with the whole set of information K that it has received from the environment or from other agents. In such a case, we say that the agent is a-consistent. Here, the belief set B represents hypothetical knowledge that can therefore be revised, whereas the set of information K represents certain knowledge, consisting of non-revisable observations and facts. Moreover, we suppose that at least a part BC of the beliefs of each agent is common to all agents and must stay that way. Therefore, an update of this common set BC by an agent must provoke an update of BC for the whole community of agents. This leads us to define the mas-consistency of an agent with respect to the community. The update process of the community's beliefs when one of its members gets new information can then be defined as the consistency maintenance process ensuring that every agent in the community stays mas-consistent. This mas-consistency maintenance process casts the agent getting new information in the role of a learner and implies communication with the other agents, acting as critics.
However, agents are not specialised: each can in turn be a learner or a critic, none of them being confined to a specific role. Pieces of information are distributed among the agents, but can be redundant. There is no central memory.

The work described here has its origin in a former work concerning learning in an intentional multi-agent system using a BDI formalism [6]. In that work, agents had plans, each of them being associated with a context defining in which conditions it can be triggered. Plans (each with its own context) were common to the whole set of agents in the community. Agents had to adapt their plan contexts depending on the failure or success of executed plans, using a learning mechanism and asking other agents for examples (plan successes or failures). However, this work lacked a collective learning protocol enabling a real autonomy of the multi-agent system. The study of such a protocol is the object of the present paper. In section 2 we formally define the mas-consistency of an update mechanism for the whole MAS and propose a generic update mechanism proved to be mas-consistent. In section 3 we describe SMILE, an incremental multi-agent concept learner applying our mas-consistent update mechanism to collaborative concept learning. Section 4 describes experiments with SMILE and discusses several issues, including how the accuracy and the simplicity of the current hypothesis vary when comparing single-agent learning and MAS learning. In section 5 we briefly present some related works, and we conclude in section 6 by discussing further investigations on mas-consistent learning.

2. FORMAL MODEL

2.1 Definitions and framework
In this section, we present a general formulation of collective incremental learning in a cognitive multi-agent system. We represent a MAS as a set of agents r1, ..., rn. Each agent ri has a belief set Bi consisting of all the revisable knowledge it has. Part of this knowledge must be shared with the other agents. The part of Bi that is common to all agents is denoted by BC. This common part creates a dependency between the agents: if an agent ri updates its belief set Bi to B'i, changing in the process BC into B'C, all other agents rk must then update their belief sets Bk to B'k so that B'C ⊆ B'k. Moreover, each agent ri has stored some certain information Ki. We suppose that some consistency property Cons(Bi, Ki) can be verified by the agent itself between its beliefs Bi and its information Ki. As said before, Bi represents knowledge that might be revised, whereas Ki represents observed facts, taken as being true, which can possibly contradict Bi.

Definition 1 (a-consistency of an agent). An agent ri is a-consistent iff Cons(Bi, Ki) is true.

Example 1. Agent r1 has a set of plans which are in the common part BC of B1. Each plan P has a triggering context d(P) (which acts as a pre-condition) and a body. Some piece of information k could be "plan P, triggered in situation s, has failed in spite of s being an instance of d(P)". If this piece of information is added to K1, then agent r1 is not a-consistent anymore: Cons(B1, K1 ∪ {k}) is false.
We also want to define some notion of consistency for the whole MAS, depending on the belief and information sets of its constituent agents. We first define the consistency of an agent ri with respect to its own belief set Bi and information set Ki together with the information sets K1, ..., Kn of the other agents of the MAS. We do so simply by considering what the a-consistency of the agent would be if it had the information of all the other agents. We call this notion mas-consistency:

Definition 2 (mas-consistency of an agent). An agent ri is mas-consistent iff Cons(Bi, Ki ∪ K) is true, where K = ∪_{j ∈ {1,..,n}−{i}} Kj is the set of all information coming from the other agents of the MAS¹.

¹ We will simply write ∪ Kj when the context makes it clear.

Example 2. Continuing the previous example, suppose that the piece of information k is included in the information set K2 of agent r2. As long as k is not transmitted to r1, and thus not added to K1, r1 remains a-consistent. However, r1 is not mas-consistent, as k is in the set K of all information of the MAS.

The global consistency of the MAS is then simply the mas-consistency of all its agents.

Definition 3 (consistency of a MAS). A MAS r1, ..., rn is consistent iff all its agents ri are mas-consistent.

We now define the required properties for a revision mechanism M updating an agent ri when it gets a piece of information k. In the following, we suppose that:
• Update is always possible, that is, an agent can always modify its belief set Bi in order to regain its a-consistency. We say that each agent is locally efficient.
• For any two sets of information K1 and K2, if Cons(Bi, K1) and Cons(Bi, K2) hold, then Cons(Bi, K1 ∪ K2) holds as well. That is, the a-consistency of the agents is additive.
• If a piece of information k concerning the common set BC is consistent with one agent, it is consistent with all agents: for every pair of agents (ri, rj) such that Cons(Bi, Ki) and Cons(Bj, Kj) are true, we have, for every piece of information k: Cons(Bi, Ki ∪ {k}) iff Cons(Bj, Kj ∪ {k}). In such a case, we say that the MAS is coherent.

This last condition simply means that the common belief set BC is independent of the possible differences between the belief sets Bi of the agents. In the simplest case, B1 = ... = Bn = BC. M will also be viewed as an incremental learning mechanism and represented as an application changing Bi into B'i. In the following, we shall write ri(Bi, Ki) for ri when useful.

Definition 4 (a-consistency of a revision). An update mechanism M is a-consistent iff for any agent ri and any piece of information k reaching ri, the a-consistency of this agent is preserved. In other words, iff: ri(Bi, Ki) a-consistent ⇒ ri(B'i, K'i) a-consistent, where B'i = M(Bi) and K'i = Ki ∪ {k}.

In the same way, we define the mas-consistency of a revision mechanism as the a-consistency of this mechanism should the agent dispose of all the information present in the MAS. In the following, we shall write, if needed, ri(Bi, Ki, K) for the agent ri in the MAS r1, ..., rn.

Definition 5 (mas-consistency of a revision). An update mechanism Ms is mas-consistent iff for any agent ri and any piece of information k reaching ri, the mas-consistency of this agent is preserved. In other words, iff: ri(Bi, Ki, K) mas-consistent ⇒ ri(B'i, K'i, K) mas-consistent, where B'i = Ms(Bi), K'i = Ki ∪ {k}, and K = ∪ Kj is the set of all information from the MAS.

Finally, when a mas-consistent mechanism is applied by an agent getting a new piece of information, a desirable side effect of the mechanism should be that all other agents remain mas-consistent after any modification of the common part BC, that is, the MAS itself should become consistent again. This property is defined as follows:

Definition 6 (strong mas-consistency of a revision). An update mechanism Ms is strongly mas-consistent iff
- Ms is mas-consistent, and
- the application of Ms by an agent preserves the consistency of the MAS.
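To fix intuitions, these definitions translate directly into executable predicates. The following Python sketch is purely illustrative and ours: it assumes a user-supplied consistency predicate cons(B, K), and the names Agent, a_consistent, mas_consistent and consistent_mas are not part of the formalism above.

from dataclasses import dataclass, field

@dataclass
class Agent:
    B: object                              # revisable beliefs B_i (including B_C)
    K: set = field(default_factory=set)    # certain, non-revisable information K_i

def a_consistent(agent, cons):
    # Definition 1: Cons(B_i, K_i)
    return cons(agent.B, agent.K)

def mas_consistent(agent, others, cons):
    # Definition 2: Cons(B_i, K_i ∪ K), K being the union of the other agents' K_j
    K = set().union(*(a.K for a in others)) if others else set()
    return cons(agent.B, agent.K | K)

def consistent_mas(agents, cons):
    # Definition 3: every agent of the MAS is mas-consistent
    return all(mas_consistent(a, [b for b in agents if b is not a], cons)
               for a in agents)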
2.2 A strongly mas-consistent update mechanism
The general idea is that, since information is distributed among all the agents of the MAS, a strongly mas-consistent update mechanism Ms must involve some interaction between the learner agent and the other agents. In order to ensure its mas-consistency, Ms consists of reiterated applications by the learner agent ri of an internal a-consistent mechanism M, each followed by some interactions between ri and the other agents, until ri regains its mas-consistency. We describe such a mechanism below, first describing an interaction, then an iteration, and finally stating the termination condition of the mechanism.

The mechanism is triggered by an agent ri upon receipt of a piece of information k disrupting the mas-consistency. We write M(Bi) for the belief set of the learner agent ri after an update, B'C for the common part modified by ri, and B'j for the belief set of another agent rj induced by the modification of its common part BC into B'C. An interaction I(ri, rj) between the learner agent ri and another agent rj, acting as a critic, consists of the following steps:
• agent ri sends the update B'C of the common part of its beliefs. Having applied its update mechanism, ri is a-consistent.
• agent rj checks the modification B'j of its beliefs induced by the update B'C. If this modification preserves its a-consistency, rj adopts it.
• agent rj sends either an acceptance of B'C, or a denial along with one (or more) piece(s) of information k' such that Cons(B'j, {k'}) is false.

An iteration of Ms is then composed of:
• the reception by the learner agent ri of a piece of information and the update M(Bi) restoring its a-consistency;
• a set of interactions I(ri, rj) (in which several critic agents can possibly participate). If at least one piece of information k' is transmitted to ri, the addition of k' necessarily makes ri a-inconsistent, and a new iteration then occurs.

The mechanism Ms ends when no agent can provide such a piece of information k'. When this is the case, the mas-consistency of the learner agent ri is restored.

Proposition 1. Let r1, ..., rn be a consistent MAS in which agent ri receives a piece of information k breaking its a-consistency, and let M be an a-consistent internal update mechanism. The update mechanism Ms described above is strongly mas-consistent.

Proof. The proof derives directly from the description of the mechanism. The mechanism ensures that each time an agent receives an event, its mas-consistency is restored. As the other agents all adopt the final update B'C, they are all mas-consistent, and the MAS is consistent. Therefore Ms is a strongly mas-consistent update mechanism.

In the mechanism Ms described above, the learner agent is the only one that receives and memorizes information during the execution of the mechanism. This ensures that Ms terminates. The pieces of information transmitted by the other agents and memorized by the learner agent are redundant, as they are already present in the MAS, more precisely in the memories of the critic agents that transmitted them. Note that the mechanism Ms proposed here explicitly indicates neither the order nor the scope of the interactions. In the following, we consider that the modification proposal B'C is sent sequentially to the different agents (synchronous mechanism). Moreover, the response of a critic agent will only contain one piece of information inconsistent with the proposed modification; we say that the response of the agent is minimal. Being synchronous with minimal responses, this mechanism Ms minimizes the amount of information transmitted by the agents. A sketch of this synchronous, minimal-response mechanism is given below; we then illustrate it in the case of multi-agent concept learning.
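The following Python sketch (with names of our own choosing) summarizes Ms in this synchronous, minimal-response form. Here update(learner) stands for the internal a-consistent mechanism M, returning the new common part B'C, and find_counterexample(B_C, K) returns one piece of information of K inconsistent with the proposal, or None; both are assumed to be supplied.

def propagate(learner, critics, k, update, find_counterexample):
    learner.K.add(k)                # the new information is certain and memorized
    B_C = update(learner)           # M restores the learner's a-consistency
    accepted = False
    while not accepted:             # one iteration of Ms per loop turn
        accepted = True
        for critic in critics:      # synchronous: critics are consulted in turn
            k2 = find_counterexample(B_C, critic.K)
            if k2 is not None:      # minimal response: a single counterexample k'
                learner.K.add(k2)   # k' becomes redundant: learner and critic both store it
                B_C = update(learner)
                accepted = False
                break
    for critic in critics:          # all agents adopt the accepted update B'_C
        critic.B = B_C
    learner.B = B_C
    return B_C

Termination follows the argument of Proposition 1: each transmitted counterexample k' is added to the learner's memory, and the set of information present in the MAS is finite, so only finitely many iterations can occur.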
3. SOUND MULTI-AGENT INCREMENTAL LEARNING

3.1 The learning task
We experiment with the mechanism proposed above in the case of incremental MAS concept learning. We consider here a hypothesis language in which a hypothesis is a disjunction of terms. Each term is a conjunction of atoms from a set A. An example is represented by a tag + or − and a description² composed of a subset of atoms e ⊆ A. A term covers an example if its constituting atoms are included in the example. A hypothesis covers an example if one of its terms covers it. This representation will be used below for learning boolean formulae. Negative literals are here represented by additional atoms, like not-a. The boolean formula f = (a ∧ b) ∨ (b ∧ ¬c) is then written (a ∧ b) ∨ (b ∧ not-c). A positive example of f, like {not-a, b, not-c}, represents a model of f.

² When no confusion is possible, the word example will be used to refer to the pair (tag, description) as well as to the description alone.
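In this propositional language, terms and example descriptions are simply sets of atoms, so coverage is set inclusion and the lgg of a term and an example is their intersection. The following minimal sketch (the helper names are ours) makes this concrete:

def covers_term(h, e):
    # a term covers an example iff its atoms are included in the example
    return h <= e

def covers(H, e):
    # a hypothesis covers an example iff one of its terms covers it
    return any(covers_term(h, e) for h in H)

def lgg_single(e):
    # lgg(e): the most specific term covering e
    return frozenset(e)

def lgg(h, e):
    # lgg(h, e): the most specific term more general than h that covers e
    return frozenset(h) & frozenset(e)

# f = (a ∧ b) ∨ (b ∧ not-c), and a positive example of f:
H = [frozenset({"a", "b"}), frozenset({"b", "not-c"})]
e = frozenset({"not-a", "b", "not-c"})
assert covers(H, e) and not covers_term(H[0], e)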
3.2 Incremental learning process
The learning process is an update mechanism that, given a current hypothesis H, a memory E = E+ ∪ E− filled with the previously received examples, and a new positive or negative example e, produces a new updated hypothesis. Before this update, the given hypothesis is complete, meaning that it covers all positive examples of E+, and coherent, meaning that it covers no negative example of E−. After the update, the new hypothesis must be complete and coherent with the new memory state E ∪ {e}. We describe below our single-agent update mechanism, inspired by a previous work on incremental learning [7].

In the following, a hypothesis H for the target formula f is a list of terms h, each of them being a conjunction of atoms. H is coherent if all its terms h are coherent, and H is complete if each element of E+ is covered by at least one term h of H. Each term is by construction the lgg (least general generalization) of a subset of positive instances {e1, ..., en} [5], that is, the most specific term covering {e1, ..., en}. The lgg operator is defined by considering examples as terms, so we denote by lgg(e) the most specific term that covers e, and by lgg(h, e) the most specific term which is more general than h and covers e. Restricting the terms to lggs is at the basis of many bottom-up learning algorithms (see for instance [5]). In the typology proposed by [9], our update mechanism is an incremental learner with full instance memory: learning proceeds by successive updates and all examples are stored.

The update mechanism depends on the ongoing hypothesis H, the ongoing examples E+ and E−, and the new example e. There are three possible cases:
• e is positive and H covers e, or e is negative and H does not cover e. No update is needed: H is already complete and coherent with E ∪ {e}.
• e is positive and H does not cover e: e is called a positive counterexample of H. We then seek to generalize in turn the terms h of H. As soon as a correct generalization h' = lgg(h, e) is found, h' replaces h in H. Any term less general than h' is discarded. If no generalization is correct (meaning here coherent), H ∪ lgg(e) replaces H.
• e is negative and H covers e: e is called a negative counterexample of H. Each term h covering e is then discarded from H and replaced by a set of terms {h1, ..., hn} that is, as a whole, coherent with E− ∪ {e} and covers those examples of E+ uncovered by H − {h}. Terms of the final hypothesis H that are less general than others are discarded from H.

We now describe the case where e = e− is a covered negative example. The following functions are used here:
• coveredOnlyBy(h, E+) gives the subset of E+ covered by h and by no other term of H.
• bestCover(h1, h2) gives h1 if h1 covers more examples of uncoveredPos than h2; otherwise it gives h2.
• covered(h) gives the elements of uncoveredPos covered by h.

// Specialization of each h covering e−
for each h of H covering e− do
   H = H − {h}
   uncoveredPos = coveredOnlyBy(h, E+)
   Ar = atoms that are neither in e− nor in h
   while (uncoveredPos ≠ ∅) do
      // seeking the best specialization of h
      best = ⊥                      // ⊥ covers no example
      for each a of Ar do
         hc = h ∧ a                 // candidate specialization of h
         best = bestCover(hc, best)
      endfor
      Ar = Ar − {abest}             // abest is the atom such that best = h ∧ abest
      hi = lgg(covered(best))
      H = H ∪ {hi}
      uncoveredPos = uncoveredPos − covered(best)
   endwhile
endfor
Terms of H that are less general than others are discarded.

Note that this mechanism tends both to make a minimal update of the current hypothesis and to minimize the number of terms in the hypothesis, in particular by discarding terms less general than others after updating a hypothesis.

3.3 Collective learning
If H is the current hypothesis, Ei the current example memory of agent ri, and E the set of all the examples received by the system, the notation of section 2 becomes Bi = BC = H, Ki = Ei and K = E. Cons(H, Ei) states that H is complete and coherent with Ei; in such a case, ri is a-consistent. The piece of information k received by agent ri is here simply an example e along with its tag. If e is such that the current hypothesis H is not complete or coherent with Ei ∪ {e}, then e contradicts H: ri becomes a-inconsistent, and therefore the MAS is not consistent anymore. The update of a hypothesis when a new example arrives is an a-consistent mechanism. Following Proposition 1, this mechanism can be used to produce a strongly mas-consistent mechanism: upon reception of a new example in the MAS by an agent r, an update is possibly needed and, after a set of interactions between r and the other agents, results in a new hypothesis shared by all the agents, which restores the consistency of the MAS, that is, which is complete and coherent with the set E of all the examples present in the MAS. Clearly, by minimizing the number of hypothesis modifications, this synchronous and minimal mechanism minimizes the number of examples received by the learner from the other agents, and therefore the total number of examples stored in the system.

4. EXPERIMENTS
In the following, we learn a boolean formula that is a difficult test for the learning method: the 11-multiplexer (see [4]). It involves 3 boolean address attributes a0, a1, a2 and 8 boolean data attributes d0, ..., d7. The formula f11 is satisfied if the number coded by the 3 address attributes is the index of a data attribute whose value is 1. The formula is the following:

f11 = (a0 ∧ a1 ∧ a2 ∧ d7) ∨ (a0 ∧ a1 ∧ ¬a2 ∧ d6) ∨ (a0 ∧ ¬a1 ∧ a2 ∧ d5) ∨ (a0 ∧ ¬a1 ∧ ¬a2 ∧ d4) ∨ (¬a0 ∧ a1 ∧ a2 ∧ d3) ∨ (¬a0 ∧ a1 ∧ ¬a2 ∧ d2) ∨ (¬a0 ∧ ¬a1 ∧ a2 ∧ d1) ∨ (¬a0 ∧ ¬a1 ∧ ¬a2 ∧ d0).

There are 2048 = 2^11 possible examples, half of which are positive (they satisfy f11) while the other half are negative.
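As a sanity check on these figures, the following sketch enumerates and tags all instances of the 11-multiplexer, using the atom encoding of Section 3.1 (negative literals as not-x); the helper names are our own.

from itertools import product

ATTRS = ["a0", "a1", "a2"] + [f"d{i}" for i in range(8)]

def f11(v):
    # v maps each attribute to 0 or 1; a0 is the most significant address bit
    addr = 4 * v["a0"] + 2 * v["a1"] + v["a2"]
    return v[f"d{addr}"] == 1

examples = []
for bits in product([0, 1], repeat=11):
    v = dict(zip(ATTRS, bits))
    desc = frozenset(x if v[x] else f"not-{x}" for x in ATTRS)
    examples.append(("+" if f11(v) else "-", desc))

assert len(examples) == 2048
assert sum(tag == "+" for tag, _ in examples) == 1024    # exactly half are positive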
An experiment is typically composed of 50 trials. Each trial corresponds to a sequence of 600 examples incrementally learned by a multi-agent system with n agents (an n-MAS). Variables such as accuracy (i.e. the frequency of correct classification of a set of unseen examples), hypothesis size (i.e. the number of terms in the current formula) and number of stored examples are recorded every 25 examples received by the system. In the protocol used here, a new example is sent to a random agent only when the MAS is consistent; the next example is sent, in turn to another agent, once the MAS consistency has been restored. In this way we simulate a kind of slow learning: the frequency of example arrivals is low compared to the time taken by an update.

4.1 Efficiency of MAS concept learning

4.1.1 Execution time
We briefly discuss here the execution time of learning in the MAS. Note that the whole set of actions and interactions in the MAS is simulated on a single processor. Figure 1 shows that execution time depends linearly on the number of agents: at the end of the most active part of learning (200 examples), a 16-MAS has taken 4 times more learning time than a 4-MAS. This execution time represents the whole learning and communication activity, and hints at the cost of maintaining a consistent learning hypothesis in a MAS composed of autonomous agents.

[Figure 1: Execution time of an n-MAS (from n = 2 at the bottom to n = 20 at the top).]

4.1.2 Redundancy in the MAS memory
We now study the distribution of the examples in the MAS memory. Redundancy is written RS = nS/ne, where nS is the total number of examples stored in the MAS, that is, the sum of the sizes of the agents' example memories Ei, and ne is the total number of examples received from the environment by the MAS. In Figure 2, we compare redundancies in MAS of 2 to 20 agents. There is a peak, slowly moving from 80 to 100 examples, that corresponds to the number of examples for which learning is most active. For 20 agents, maximal redundancy is no more than 6, which is far less than the maximal theoretical value of 20. Note that when learning becomes less active, redundancy tends towards its minimal value 1: when there are no more updates, examples are only stored by the agent that receives them.

[Figure 2: Redundancy of examples stored in an n-MAS (from n = 2 at the bottom to n = 20 at the top).]

4.1.3 An n-MAS selects a simpler solution than a single agent
The proposed mechanism tends to minimize the number of terms in the selected hypothesis. During learning, the size of the current hypothesis grows beyond the optimum, and then decreases as the MAS converges. On the multiplexer-11 testbed, the optimal number of terms is 8, but there also exist equivalent formulas with more terms. It is interesting to note that in this case the 10-MAS converges towards an exact solution closer to the optimal number of 8 terms (see Figure 3).
After 1450 examples have been presented, both the 1-MAS and the 10-MAS have exactly learned the concept (their respective accuracies are 0.9999 and 1), but the single agent expresses the result as an 11.0-term DNF on average, whereas the 10-MAS expresses it as an 8.8-term DNF. However, for some other boolean functions we found that during learning a 1-MAS always produces larger hypotheses than a 10-MAS, but that both converge to hypotheses of similar size.

[Figure 3: Size of the hypothesis built by a 1-MAS and a 10-MAS: the M11 case.]

4.1.4 An n-MAS is more accurate than a single agent
Figure 4 shows the improvement brought by a MAS with n agents compared to a single agent. This improvement was not especially expected: whether we have one or n agents, when N examples are given to the MAS it has access to the same amount of information, maintains only one ongoing hypothesis, and uses the same basic revision algorithm whenever an agent has to modify the current hypothesis. Note that while the accuracies of the 1, 2, 4 and 10-MAS are significantly different, getting better as the number of agents increases, there is no clear difference beyond this point: the accuracy curve of the 100-agent MAS is very close to that of the 10-agent MAS.

[Figure 4: Accuracy of an n-MAS: the M11 case (from bottom to top, n = 1, 2, 4, 10, 100).]

4.1.4.1 Boolean formulas.
To evaluate this accuracy improvement, we have experimented with our protocol on other boolean function learning problems. As in the multiplexer-11 case, these functions are learnt in the form of more or less syntactically complex DNF³ (that is, with more or fewer conjunctive terms in the DNF), but they are also more or less difficult to learn, as it can be hard to find one's way through the hypothesis space to reach them. Furthermore, the presence in the description of irrelevant attributes (that is, attributes that do not belong to the target DNF) makes the problem more difficult. The following problems were selected for our experiments: (i) the multiplexer-11 with 9 irrelevant attributes: M11_9, (ii) the 20-multiplexer M20 (with 4 address bits and 16 data bits), (iii) a difficult parity problem (see [4]), Xorp_m: there must be an odd number of bits with value 1 among the p first attributes for the instance to be positive, the m other bits being irrelevant, and (iv) a simple DNF formula (a ∧ b ∧ c) ∨ (c ∧ d ∧ e) ∨ (e ∧ f ∧ g) ∨ (g ∧ h ∧ i) with 19 irrelevant attributes. The following table sums up some information about these problems: the total number of attributes (including irrelevant ones), the number of irrelevant attributes, the minimal number of terms of the corresponding DNF, and the number of learning examples used.

³ Disjunctive Normal Form

Pb            att.  irre. att.  terms  ex.
M11            11        0        8    200
M11_9          20        9        8    200
M20            20        0       16    450
Xor3_25        28       25        4    200
Xor5_5         10        5       16    180
Xor5_15        20       15       16    600
Simple4-9_19   28       19        4    200
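For illustration, a labelled example of the Xorp_m problems can be drawn as in the following sketch (the attribute names x0, ..., x(p+m−1) and the function names are our own conventions, analogous to the multiplexer generator above):

import random

def xor_example(p, m, rng=random):
    # p relevant parity bits followed by m irrelevant bits
    bits = [rng.randint(0, 1) for _ in range(p + m)]
    tag = "+" if sum(bits[:p]) % 2 == 1 else "-"
    attrs = [f"x{i}" for i in range(p + m)]
    desc = frozenset(a if b else f"not-{a}" for a, b in zip(attrs, bits))
    return tag, desc

training_set = [xor_example(5, 15) for _ in range(600)]   # the Xor5_15 setting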
Below we give the accuracy results of our learning mechanism with a single agent (Sm-1) and with a 10-agent MAS (Sm-10), along with the results of two standard algorithms implemented with the learning environment WEKA [16]: JRip (an implementation of RIPPER [2]) and Id3 [12]. For the experiments with JRip and Id3, we measured the mean accuracy over 50 trials, each time randomly separating the examples into a learning set and a test set. JRip and Id3 parameters are the default ones, except that JRip is used without pruning. The following table shows the results:

Pb            JRip   Id3    Sm-1    Sm-10
M11           88.3   80.7   88.7    95.5
M11_9         73.4   67.9   66.8    83.5
M20           67.7   62.7   64.6    78.2
Xor3_25       54.4   55.2   71.4    98.5
Xor5_5        52.6   60.8   71.1    78.3
Xor5_15       50.9   51.93  62.4    96.1
Simple4-9_19  99.9   92.3   87.89   98.21

Clearly, difficult problems are better solved with more agents (see for instance Xor5_15). We think that these benefits, which can be important as the number of agents increases, are due to the fact that each agent really memorizes only part of the total number of examples, and that this part is partly selected by the other agents as counterexamples, which causes a greater number of updates of the current hypothesis and, therefore, a better exploration of the hypothesis space.

4.1.4.2 ML database problems.
We also ran experiments on some non-boolean problems. We considered only two-class (positive/negative) problems, taken from the UCI machine learning database [3]. In all these problems, examples are described as vectors of (attribute, value) pairs. The value domains can be boolean, numeric (totally ordered sets), or nominal (non-ordered sets). An adequate set of atoms A must be constituted for each problem. For instance, if a is a numeric attribute, we define at most k thresholds si, giving k+1 intervals of uniform density⁴. Each distinct threshold si then gives two atoms, a ≤ si and a > si. In our experiments, we took a maximal number of thresholds k = 8. For instance, in the iono problem, there are 34 numeric attributes, and an instance is described with 506 atoms.

⁴ The probability for the value of a to lie in any given interval is constant.
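A possible reading of this discretization, sketched below under our own naming, places the (at most) k thresholds at the empirical quantiles of the observed attribute values, so that the k+1 resulting intervals have uniform density; each kept threshold s then yields the atoms a ≤ s and a > s:

def thresholds(values, k=8):
    # at most k cut points, taken at the empirical quantiles of the values
    vs = sorted(values)
    n = len(vs)
    return sorted({vs[(j * n) // (k + 1)] for j in range(1, k + 1)})

def atoms_for(attr, value, cuts):
    # the atoms describing a numeric attribute of one instance
    return {f"{attr}<={s}" if value <= s else f"{attr}>{s}" for s in cuts}

cuts = thresholds([5.1, 4.9, 6.3, 5.8, 5.0, 6.7, 5.5, 6.0, 4.7, 5.9])
print(sorted(atoms_for("a", 5.6, cuts)))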
Below we give the accuracy results of our system along with previous results. The column Nb ex. refers to the number of examples used for learning⁵. Column (1) gives the minimal and maximal accuracy values of the thirty-three classifiers tested in [8]. Column (2) gives the results of [13], where various learning methods are compared to ensemble learning methods using weighted classifier sets. Columns S-1 and S-10 give the accuracy of SMILE with respectively 1 and 10 agents.

⁵ For ttt and kr-vs-kp, our protocol did not use more than respectively 574 and 958 learning examples, so a second number appears in the column.

Pb        Nb ex.     (1)     (2)        S-1    S-10
ttt       862/574    //      76.2-99.7  99.7   99.9
kr-vs-kp  2876/958   //      91.4-99.4  96.8   97.3
iono      315        //      88.0-91.8  87.2   88.1
bupa      310        57-72   58-69.3    62.5   63.3
breastw   614        91-97   94.3-97.3  94.7   94.7
vote      391        94-96   95.3-96    91.9   92.6
pima      691        //      71.5-73.4  65.0   65.0
heart     243        66-86   77.1-84.1  69.5   70.7

This table shows that the incremental algorithm corresponding to the single-agent case gives honorable results relative to non-incremental classical methods using larger and more complex hypotheses. In some cases, there is an accuracy improvement with a 10-agent MAS. However, with such benchmark data, which are often noisy, the difficulty does not really come from the way in which the search space is explored, and therefore the observed improvement is not always significant. The same kind of phenomenon has been observed with methods dedicated to hard boolean problems [4].

4.2 MAS synchronization
Here we consider that n single agents learn without interactions and, at a given time, start interacting, thus forming a MAS. The purpose is to observe how the agents take advantage of collaboration when they start from different states of beliefs and memories. We compare in this section a 1-MAS, a reference 10-MAS, and a 10-MAS (100sync) whose agents did not communicate during the arrival of the first 100 examples (10 per agent). The three accuracy curves are shown in Figure 5. Comparing the single-agent curve and the synchronized 10-MAS, we observe that soon after the beginning of the synchronization, that is, at 125 examples, the accuracies are identical. This was expected: as soon as an example e received by the MAS contradicts the current hypothesis of the agent ra receiving it, this agent makes an update and its new hypothesis is proposed to the other agents for criticism. This first contradictory example therefore brings the MAS to reach consistency relative to the whole set of examples present in the agents' memories. The higher accuracy of a 10-MAS is obtained later, from the 175th example on. In other words, the benefit of a better exploration of the search space is obtained slightly later in the learning process. Note that this synchronization happens naturally in all situations where agents have, for some reason, a divergence between their hypotheses and the system memory. This includes the fusion of two MAS into a single one, or the arrival of new agents in an existing MAS.

[Figure 5: Accuracies of a 1-MAS, a 10-MAS, and a 10-MAS synchronized after 100 examples.]

4.3 Experiments on asynchronous learning: the effect of a large data stream
In this experiment we relax our slow learning mode: the examples are sent to the MAS at a given rate. The resulting example stream is measured in ms⁻¹, and represents the number of examples sent to the MAS each millisecond. Whenever the stream is too large, the MAS cannot reach consistency upon reception of an example from the environment before a new example arrives. This means that the update process, started by an agent r0 when receiving an example, may be unfinished when a new example is received by r0 or by another agent r1. As a result, a critic agent may, at a given instant, have to send counterexamples of hypotheses sent by various agents. However, since in our setting the agents memorize all the examples they receive, whenever the stream ends the MAS necessarily reaches consistency with respect to all the examples received so far. In our experiments, though its learning curve is slowed down during the intense learning phase (corresponding to low accuracy of the current hypotheses), the MAS still reaches a satisfying hypothesis later on, as there are fewer and fewer counterexamples in the example stream. In Figure 6 we compare the accuracies of two 11-MAS submitted to example streams of different rates when learning the M11 formula. The learning curve of the MAS receiving an example at a 1/33 ms⁻¹ rate is almost unaltered (compare with Figure 4), whereas the 1/16 ms⁻¹ MAS is first severely slowed down before catching up with the first one.

[Figure 6: Accuracies of two asynchronous 11-MAS (1/33 ms⁻¹ and 1/16 ms⁻¹ example rates).]

5. RELATED WORKS
Since 1996 [15], various works have addressed learning in MAS, but rather few concern concept learning. In [11] the MAS performs a form of ensemble learning in which the agents are lazy learners (no explicit representation is maintained) and sell useless examples to other agents. In [10] each agent observes all the examples but only perceives a part of their representation. In mutual online concept learning [14] the agents converge to a unique hypothesis, but each agent produces examples from its own concept representation, thus resulting in a kind of synchronization rather than in pure concept learning.
6. CONCLUSION
We have presented and experimented with a protocol for MAS online concept learning. The main feature of this collaborative learning mechanism is that it maintains a consistency property: though during the learning process each agent only receives and stores, with some limited redundancy, part of the examples received by the MAS, at any moment the current hypothesis is consistent with the whole set of examples. The hypotheses of our experiments do not address the issues of distributed MAS such as faults (for instance, messages could be lost or corrupted) or other failures in general (crashes, byzantine faults, etc.). Nevertheless, our framework is open, i.e., agents can leave or enter the system while the consistency mechanism is preserved. For instance, if we introduce a timeout mechanism, then even when a critic agent crashes or omits to answer, consistency with the other critics (within the remaining agents) is still ensured. In [1], a similar approach has been applied to MAS abduction problems: the hypotheses to maintain, given incomplete information, are then facts or statements. Further work concerns, first, coupling induction and abduction in order to perform collaborative concept learning when examples are only partially observed by each agent, and second, investigating partial memory learning: how learning is preserved whenever one agent or the whole MAS forgets some selected examples.

Acknowledgments
We are very grateful to Dominique Bouthinon for implementing late modifications in SMILE, greatly easing our experiments. Part of this work was performed during the first author's visit to the Atelier De BioInformatique of Paris VI University, France.

7. REFERENCES
[1] G. Bourgne, N. Maudet, and S. Pinson. When agents communicate hypotheses in critical situations. In DALT-2006, May 2006.
[2] W. W. Cohen. Fast effective rule induction. In ICML, pages 115-123, 1995.
[3] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz. UCI repository of machine learning databases, 1998.
[4] S. Esmeir and S. Markovitch. Lookahead-based algorithms for anytime induction of decision trees. In ICML '04, pages 257-264. Morgan Kaufmann, 2004.
[5] J. Fürnkranz. A pathology of bottom-up hill-climbing in inductive rule learning. In ALT, volume 2533 of LNCS, pages 263-277. Springer, 2002.
[6] A. Guerra-Hernández, A. El Fallah-Seghrouchni, and H. Soldano. Learning in BDI multi-agent systems. In CLIMA IV, volume 3259, pages 218-233. Springer Verlag, 2004.
[7] M. Henniche. MGI: an incremental bottom-up algorithm. In IEEE Aust. and New Zealand Conference on Intelligent Information Systems, pages 347-351, 1994.
[8] T.-S. Lim, W.-Y. Loh, and Y.-S. Shih. A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40(3):203-228, 2000.
[9] M. A. Maloof and R. S. Michalski. Incremental learning with partial instance memory. Artif. Intell., 154(1-2):95-126, 2004.
[10] P. J. Modi and W.-M. Shen. Collaborative multiagent learning for classification tasks. In AGENTS '01, pages 37-38. ACM Press, 2001.
[11] S. Ontañón and E. Plaza. Recycling data for multi-agent learning. In ICML '05, pages 633-640. ACM Press, 2005.
[12] J. R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81-106, 1986.
[13] U. Rückert and S. Kramer. Towards tight bounds for rule learning.
In ICML "04 (International conference on Machine learning), page 90, New York, NY, USA, 2004. ACM Press. [14] J. Wang and L. Gasser. Mutual online concept learning for multiple agents. In AAMAS, pages 362-369. ACM Press, 2002. [15] G. Weiß and S. Sen, editors. Adaption and Learning in Multi-Agent Systems, volume 1042 of Lecture Notes in Computer Science. Springer, 1996. [16] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, October 1999. The Sixth Intl. Joint Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS 07) 171