Monday, June 22, 2009

Koubarakis (2003) Multi-Agent Systems and Peer to Peer Computing: Methods, Systems and Challenges

So I must have read this paper about 3 or 4 times now.  It was originally recommended to me by Gianluca Moro as a good paper to read about what Multi-Agent Systems (MAS) research might have to offer peer-to-peer (P2P) research.  In Koubarakis' opinion the Agents field has been slower than some other fields to pick up tools and techniques from the P2P community, which he finds disappointing given that:
deployed P2P systems can be considered an interesting case of MAS as pointed out originally [by Finin & Labrou (2000)].
Koubarakis further suggests specifically that:
MAS can readily offer concepts and techniques that can be useful to P2P computing at the application modeling and design level (e.g., ontologies for describing network resources in a semantically meaningful way, protocols for meaning negotiation, P2P system modeling and design methodologies etc.).
Interestingly I was at an Agents conference in 2000 where Tim Finin spoke at the scalability workshop about how Napster might be "improved" using ontologies, and I remember comments afterwards to the effect that Napster was doing just fine without ontologies.  What I keep coming back to here is the question of whether ontologies and sophisticated protocols for meaning negotiation would provide any short-term benefit for P2P system users and developers.  I have several layers of comments about this in the paper from my various passes over this point, but the summary is that a sophisticated layer of middle agents on which to base applications could allow developers to avoid re-implementing lower layers again and again, but that any individual developer is not going to have much patience with that kind of system if they don't see some immediate benefit.  In addition, in the first instance there are likely to be drawbacks, inasmuch as P2P applications need to operate blindingly fast to produce the best results for the end user, and so P2P protocols are kept simple and robust.  Support for complex negotiations seems like it would slow things down.

Nevertheless, Koubarakis presents a review of what P2P research might offer MAS research and vice versa.  The former makes sense to me; P2P systems can offer lookup services that agent systems use, and basically be an infrastructure component for MAS.  It is in the latter case that I still struggle with understanding the benefits, i.e. what agent research can offer to P2P developers.  I guess it is important to distinguish between P2P researchers and P2P developers, but I will attempt that elsewhere.  Koubarakis' first example is the application of agent-based software engineering methodologies to P2P systems.  I think this is one that I glossed over in earlier readings, and now that I look up the referenced paper by Bertolini et al. (2002) I see (from a very quick skim) that the Tropos software engineering methodology allows a diagrammatic summary of the Napster and Gnutella architectural designs, which the authors then abstract to produce a generic peer-to-peer virtual community pattern, which can in turn be used to support the implementation of particular P2P solutions using the JXTA framework (the JXTA forums are still active, but it is not clear what the status of that project is, particularly since Oracle bought Sun).  Interestingly the example implementation comes from health care, which is an application domain in which I have compared Agents and P2P myself (Tse et al., 2006).  I am not deeply familiar with the effectiveness of agent-oriented software design, but this does seem like an area where agent theory might have something to offer.  At least, some sort of formal approach could be helpful in the design of distributed systems.

Next up is the idea from Tim Finin: build ontologies on top of P2P systems.  The Edutella project is given as an example, although that project appears to have petered out; at least the Edutella website has not been updated since 2004, though there are more recent academic publications on Edutella.  Of course even if the Edutella project has not been a great success, that doesn't mean there can't be some value in building ontologies on top of P2P systems, but I struggle to see what it would be.  This also relates to the whole question of the "Semantic Web", on which I recently came across a Tim O'Reilly post.  O'Reilly is talking about rich data snippets that allow Google results to display more structure.  There is a whole bundle of ideas here, but I should try to finish my summary of Koubarakis' paper before straying into that territory.

Koubarakis specifically mentions the Semantic Web while referring to work by Karl Aberer on local ontologies and local translations among the ontologies of neighboring peers.  Aberer's paper "The Chatty Web: Emergent Semantics Through Gossiping" has 156 citations [ATGSATOP], and there certainly seems to be a rich research vein there.  I see some interesting articles looking at emergent semantics deriving from folksonomies - another area I have published in (Joseph et al., 2009).  A skim of Aberer's article indicates that the problem they are hoping to address is inter-ontology mapping so that, for example, one could send out a query to get project titles from multiple different data sources where the metadata format is potentially different in each case, e.g. multiple XML documents where in one case we have <project><title>My Project</title></project> and in another we have <project-title>Project X</project-title>.  Without reading that paper in more detail it is not clear to me to what extent schema/ontology authors have to provide mappings, and to what extent users give feedback on failed matches; but the authors reference other work on automated ontology matching, which I guess is what this is all about.  Say I want to formulate my query for a flight and send it out to all the online travel sites; I don't have to force them all to use the same schema - there is some process that just handles the translation between the different terms used by each site so that everyone can agree on how to query things like "departure time".  Still, it seems like the simple short-term solution is to have translations provided by third parties, if at all.  It is not clear to me why the effort of automating ontology matching brings great bounty.
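To make the translation idea concrete for myself, here is a minimal sketch in Python of per-link term translation between two peers.  The schemas, the hand-written A-to-B mapping and the example query are all invented for illustration; in the gossiping approach the mappings would presumably be discovered and refined over time rather than supplied up front.

# Minimal sketch of per-link term translation between neighbouring peers.
# All schemas, mappings and queries below are invented for illustration.

# Each peer describes flights in its own vocabulary.
PEER_A_SCHEMA = {"departure_time", "arrival_time", "origin", "destination"}
PEER_B_SCHEMA = {"dep", "arr", "from", "to"}

# A hand-written mapping for the A -> B link; in an emergent-semantics
# setting this would be learned and refined rather than given up front.
A_TO_B = {
    "departure_time": "dep",
    "arrival_time": "arr",
    "origin": "from",
    "destination": "to",
}

def translate_query(query, mapping):
    """Rewrite an {attribute: value} query into the neighbour's vocabulary.

    Attributes with no known translation are dropped, which is one place
    where feedback on failed matches could be used to improve the mapping.
    """
    return {mapping[attr]: value
            for attr, value in query.items()
            if attr in mapping}

if __name__ == "__main__":
    # A query expressed in peer A's terms...
    query = {"departure_time": "2009-06-22T09:00", "origin": "HNL"}
    # ...rewritten before being forwarded to peer B.
    print(translate_query(query, A_TO_B))  # {'dep': '2009-06-22T09:00', 'from': 'HNL'}

Even in this toy form it is obvious how much of the hard part is hidden in where the mapping comes from, which is presumably the point of the emergent semantics work.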

Next up is BestPeer, which apparently improves on a P2P system by adding mobile agents.  I have a long-standing point (Joseph & Kawamura, 2001) about the unpredictable benefits of mobile agents, and I have the BestPeer paper on my reading list, so will discuss that in a future blog post.  The final work mentioned in the section on what agents could offer P2P systems is a theoretical analysis of search in distributed agent systems by Shehory (1999), which I also have on my reading list, so more on that soon.  Overall I think I am being more and more persuaded that there is agent research that can inform P2P researchers, but the more complex question is whether agent research is useful for P2P developers.  My main gripe is that simply citing the list of properties that agents should have (e.g. autonomy, reactivity) is not enough to explain their value.  One has to present mechanisms that support autonomy, reactivity and so forth, and then show how their use brings some specific benefit to the system they are being incorporated into.  I guess the alternative tack here is to say that the agent field has lots of analysis of the behaviour of systems composed of multiple autonomous entities, and attempts at producing design guidelines to handle the development of such systems.

The next section in the paper is on bottom-up approaches to MAS, such as the DIET and BISON projects, which are inspired by natural ecosystems.  This is certainly an interesting area of research and the suggestion seems to be that these lightweight multi-agent platforms could serve as a testbed for P2P systems, although it feels a little back to front given that the granularity of P2P systems is usually smaller than that of even the simplest multi-agent systems.  I think the challenge here is that laboratory-based platforms like these are generally likely to be cut off from real P2P users, unless they achieve critical mass within the research community itself.  Clearly such things can be used as test-beds to provide theoretical results about distributed systems; but any P2P system that is hoping to be used by a non-trivial number of people is probably going to have to be built "close to the metal".  Again I am skirting up against this difference between P2P users, developers and researchers.

The final portion of this paper focuses on Koubarakis' own research on P2P publish/subscribe systems.  Koubarakis' approach is based on the idea that:
The next generation of P2P data sharing systems should be developed in a principled and formal way and classical results from logic and theoretical computer science should be applied
although that makes me think of a chapter in the book "The Next Fifty Years" where Paul Ewald talks about how in medicine fundamental achievements have occurred more through the testing of deductive leaps than by building-block induction, giving examples such as Edward Jenner's discovery of vaccination in the absence of knowledge of viruses as evidence that simply trying to understand the workings of disease at the cellular and biochemical levels may be insufficient to make great leaps.

Actually I'm not sure of the validity of my analogy here, since I had been thinking of Ewald's points as being related to the importance of accidental discovery versus theoretically informed developments, when actually they are slightly different, since the process of generating a hypothesis to test necessarily involves some theoretical input.  I think my concern really stems from the plethora of available theories and the difficulty of assessing the extent to which different theories are experimentally grounded.  Developing P2P systems in a principled and formal way will certainly be attractive to those who are well versed in the principles and formal theories of computer science.  Having spent some time becoming more versed in them myself, I am not convinced that they are purely virtuous.  I feel there is an extent to which theory can end up serving itself rather than serving the development of useful techniques and systems.

In conclusion Koubarakis cites results from his research where they calculate worst-case upper bounds on the complexity of query satisfaction and filtering within their publish/subscribe networks.  I think a lot of my personal confusion in this area comes down to differentiating between systems that are simulations designed to provide support for theoretical results and systems that are frameworks on which one might hope to build applications for use in the real world.
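As a note to self on what "filtering" involves, below is a toy Python sketch of content-based matching of an incoming publication against stored subscriptions.  The attribute names and constraints are invented and the subscription languages Koubarakis actually analyses are considerably richer; the naive linear scan here is exactly the kind of cost a worst-case complexity analysis would put bounds on.

# Toy content-based publish/subscribe filtering.  A subscription is a
# conjunction of (attribute, predicate) constraints; a publication is a
# flat attribute -> value dictionary.  Everything here is invented for
# illustration and is much simpler than the languages in the paper.

def matches(publication, subscription):
    """A publication satisfies a subscription if every constraint holds."""
    return all(attr in publication and test(publication[attr])
               for attr, test in subscription)

def filter_publication(publication, subscriptions):
    """Return the ids of all subscriptions matched by this publication.

    The naive scan is linear in the number of stored subscriptions times
    the constraints per subscription - the sort of cost that worst-case
    bounds describe and that cleverer index structures try to beat.
    """
    return [sub_id for sub_id, sub in subscriptions.items()
            if matches(publication, sub)]

if __name__ == "__main__":
    subscriptions = {
        "alice": [("title", lambda v: "p2p" in v.lower()),
                  ("year", lambda v: int(v) >= 2003)],
        "bob": [("author", lambda v: v == "Koubarakis")],
    }
    publication = {"title": "P2P publish/subscribe", "author": "Koubarakis", "year": "2003"}
    print(filter_publication(publication, subscriptions))  # ['alice', 'bob']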

One of the key things I realise re-reading all these papers is that I am not really interested in industrial software engineering.  I am not really interested in developing techniques that might be used in factories or supply chain management.  I am interested in writing code that everyday end users (including myself) interact with.  It was the potential of the digital butler that got me interested in agents.  P2P systems and search engines were interesting because of the experience they delivered to the end user.  I think that's what I repeatedly struggle with regarding agents research - trying to find something of direct use to the end user.
ResearchBlogging.org
Cited by 12 [ATGSATOP]

Manolis Koubarakis (2003). Multi-agent systems and peer-to-peer computing: Methods, systems, and challenges. Lecture Notes in Computer Science, 2782, 46-61.

References (my scholar system couldn't handle this paper's reference format - didn't want to burn time on fixing that at the moment)

K. Aberer, P. Cudre-Mauroux, and M. Hauswirth. The Chatty Web: Emergent Semantics Through Gossiping. In Twelfth International World Wide Web Conference (WWW2003), May 2003.

D. Bertolini, P. Busetta, A. Molani, M. Nori, and A. Perini. Designing peer-to-peer applications: an agent-oriented approach. In Proceedings of the International Workshop on Agent Technology and Software Engineering (AgeS) - Net Object Days 2002 (NODe02), volume 2592 of Lecture Notes in Artificial Intelligence, pages 1–15. Springer, October 7–10, 2002.

T.W. Finin and Y. Labrou. Napster as a Multi-Agent System. Presentation at the 18th FIPA meeting, University of Maryland Baltimore County, July 2000.

Joseph S.R.H., Yukawa J., Suthers D. & Harada V. (2009) Adapting to the Evolving Vocabularies of Learning Communities. International Journal of Knowledge and Learning.

Joseph S. & Kawamura T. (2001) Why Autonomy Makes the Agent. In Agent Engineering, Eds. Liu J., Zhong N., Tang Y.Y. & Wang P. World Scientific Publishing.

O. Shehory. A Scalable Agent Location Mechanism. In Proceedings of ATAL 1999, pages 162–172, 1999.

Tse B., Raman P. & Joseph S. (2006) Information Flow Analysis in Autonomous Agent and Peer-to-Peer Systems for Self-Organizing Electronic Health Records. In Agents and Peer-to-Peer Computing, Eds. Joseph S.R.H., Despotovic Z., Moro G. & Bergamaschi S. Lecture Notes in Artificial Intelligence, Volume 4461.
