Thursday, June 21, 2007

Background (4): Web Personalization and User Profile


Personalization mechanisms in the literature can be divided into three categories, all of which try to predict a user's interest in a particular item [2].
Demographic: based on personal attributes of the user, such as age, gender, or occupation [1].
Content-based: based on the similarity between the properties of the current item and those of items the user liked in the past [1].
Collaborative: based on the rating patterns of similar users (items chosen by people who liked the same objects as the current user are recommended) [3].
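
To make the collaborative category more concrete, here is a minimal sketch of user-based collaborative filtering. The ratings, the user names and the choice of cosine similarity are my own assumptions for illustration; they are not taken from [3].

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"jazz": 5, "rock": 1, "blues": 4},
    "bob": {"jazz": 4, "rock": 2, "blues": 5, "soul": 5},
    "carol": {"jazz": 1, "rock": 5, "metal": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts, computed over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by the similarity-weighted ratings of other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("alice", ratings))  # ['soul', 'metal']
```

Note that nothing here looks at the properties of the items or at personal data about the users; only the rating patterns are used.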

Advantages and disadvantages of these mechanisms are discussed in [4]. Demographic filtering (recommended) is more adaptable to preference changes than content-based filtering, but it requires information that users are sometimes unwilling to provide [2]. Collaborative filtering is a good alternative to demographic filtering, as it does not rely on information about the users or the items.


Reference
[1] C. Basu, H. Hirsh, W. Cohen, Recommendation as classification: using social and content-based information in recommendation, in: Proceedings of AAAI-98, Menlo Park, CA, USA, 1998, pp. 714–720.

[2] Stegers R., Fekkes P. and Stuckenschmidt H. MusiDB: A personalized search engine for music, Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 2006, pp. 267-275

[3] D. Goldberg, D. Nichols, B. M. Oki, D. Terry, Using collaborative filtering to weave an information tapestry, in: Communications of the ACM, vol. 35, issue 12, ACM Press, New York, USA, 1992, pp. 61–70


[4] R. Burke, Hybrid recommender systems: survey and experiments, in: User Modeling and User-Adapted Interaction, vol. 12, issue 4, Springer, 2002, pp. 331–370.

Wednesday, June 6, 2007

Background (3): Web Personalization

The web is a huge information repository, and finding relevant information in this environment is not a trivial task. Web personalization aims to help users find relevant information and services efficiently. The main issue is that the web server must recognize the user's profile before it can provide personalized services. Different approaches have been proposed to overcome this problem. Current approaches can be divided into server-side accounts, cookies, and identity profiles (e.g., Microsoft Passport). The disadvantage of server-side accounts is that the user has to enter the same information on different websites and remember many usernames and passwords. The problem with cookies is that they are tied to the server technology, whose standards and encoding differ from one web server to another, so they cannot be reused across different services. Cookies are also not meaningful to the user. An identity profile can handle a few services that share the same standard, but it is not applicable to other services on the web. To summarize, current approaches cannot use integrated information about the user across different services. Privacy and security are further issues in the current mechanisms.
The Semantic Web introduces an architecture that is suitable for web personalization. There are different mechanisms that use Semantic Web concepts for web personalization and user profiling, and ontologies have proved to be a handy means of representing user profiles and preferences. [1] introduces an extension of the HTTP GET method that includes a new parameter pointing to the URL of the user’s FOAF [2] file. FOAF files are easy to understand and are based on an open standard format. In this way the web server can learn the user's preferences from the FOAF file. Unlike user-centric identity management, the user profile is portable and can be accessed on the web by different web servers.
Baoyao Zhou et al. introduce a new web usage mining [4] approach that models the web access behavior of users based on access patterns discovered in client-side access logs [3]. This model is transformed into an ontology and can be used to provide personalized web services to the user. The ontology is generated using Formal Concept Analysis [5] based on fuzzy logic. [6] exploits ontologies with fuzzy relations to represent user profiles. This ontology-based personalization is very helpful for complex retrieval tasks in the multimedia domain. It enhances RDF with novel characteristics; the proposed model is a graph whose nodes are concepts and whose edges represent contextual relations between concepts.
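
As a rough illustration of the FOAF idea from [1] described above, the sketch below attaches the URL of a FOAF file to an ordinary GET request and reads it back on the server side. The parameter name `foaf` and the example.org addresses are hypothetical; I have not checked which name or encoding [1] actually uses.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def personalized_url(page_url, foaf_url, param="foaf"):
    """Client side: append the FOAF file URL as an extra GET parameter."""
    separator = "&" if urlsplit(page_url).query else "?"
    return page_url + separator + urlencode({param: foaf_url})

def extract_profile_url(request_url, param="foaf"):
    """Server side: return the FOAF file URL from the request, or None if absent."""
    values = parse_qs(urlsplit(request_url).query).get(param)
    return values[0] if values else None

url = personalized_url(
    "http://shop.example.org/search?q=jazz",
    "http://people.example.org/alice/foaf.rdf",
)
print(url)
print(extract_profile_url(url))
```

The server would then fetch and parse the FOAF file itself; that step is left out here because it depends on the RDF library in use.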

Reference
[1] Ankolekar A., Vrandecic D. Personalizing web surfing with semantically enriched personal profiles. In: Makram Bouzid and Nicola Henze (eds.), Proceedings of the Semantic Web Personalization Workshop. Budva, Montenegro, June 2006.

[2] http://www.foaf-project.org/

[3] Zhou B., Hui C. S., and Fong A. Web Mining Research: A Survey. In: ACM SIGKDD Explorations, 2 (2000) 1-15.

[5] Stumme G., and Maedche A., Ontology Merging for Federated Ontologies on the Semantic Web. In: Workshop on Ontologies and Information Sharing, at IJCAI, Seattle, USA (2001).

[6] Ph. Mylonas, D. Vallet, M. Fernández, P. Castells and Y. Avrithis. Ontology-based Personalization for Multimedia Content. 3rd European Semantic Web Conference - Semantic Web Personalization Workshop, Budva, Montenegro, 11-14 June 2006

Monday, June 4, 2007

Summary of Background 1 and 2

Below is a picture of the whiteboard drawing I made to summarize the previous Background parts 1 and 2 for a presentation. It tries to bring together, at a glance, all the recent attempts at utilizing approximate reasoning in the Semantic Web.

Saturday, May 26, 2007

Background (2): Approximation and Semantic Web

2.1 Source of Uncertainty
The web consists of an immense amount of data. Information retrieval from this extremely large source is not immune to inconsistencies or uncertainties. Uncertainty or imprecision on the web can be due to two main factors. First, even with extremely accurate measurements we are uncertain about the implications. Second, human perception [9] is fundamentally unable to conduct completely accurate measurements.

2.2 Extensions to deal with Uncertainty
To deal with uncertainty, many extensions to OWL and description logic languages have been proposed. The proposed extensions can be divided into the following three categories:
1. Probabilistic Extension
2. Possibilistic Extension
3. Fuzzy Extension
Among the three types of extensions we focus on fuzzy extensions, which have been the most active category among researchers [3, 4, 5, 6]. There is a fundamental difference between the semantics of fuzzy logic and probabilistic logic. In fuzzy logic, a statement can be true to a certain extent, or an entity can belong to a class to a certain degree. This degree is assumed to be known with certainty. In probabilistic reasoning, there is a probability that a statement is true or false, but the statement itself is either true or false, never both and never something in between. Hence fuzzy logic sees the world as continuous instead of binary, while probabilistic logics make a claim about the randomness of the world or the observer’s state of certainty [8].
2006: Fuzzy OWL was proposed at the National Technical University of Athens [1]. Fuzzy OWL is capable of capturing and reasoning about knowledge using their reasoning platform, the Fuzzy Reasoning Engine (FiRE). Fuzzy OWL represents fuzzy classes and properties. A fuzzy class is defined by a membership function that returns a membership degree in [0, 1] for a given object. Fuzzy OWL uses crisp OWL’s syntax for class and property axioms and definitions, and FiRE uses the syntax of the RACER DL engine [2].
2005: The Semantic Web Rule Language (SWRL) is a proposal that combines OWL (DL and Lite) with the Rule Markup Language (RuleML). Fuzzy-SWRL (f-SWRL) is a fuzzy extension of SWRL [7]. In both the antecedent and the consequent of f-SWRL rules, atoms can have weights in [0, 1]. f-SWRL provides a powerful and flexible knowledge representation and is very convenient for multimedia as well as Semantic Web applications.
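
The notion of a fuzzy class defined by a membership function can be sketched in a few lines. The class "Tall" and its thresholds are made up for illustration, and the linear ramp is only one possible shape; nothing here is taken from Fuzzy OWL or FiRE themselves.

```python
def ramp(low, high):
    """Build a membership function: 0 up to `low`, 1 from `high`, linear in between."""
    def membership(x):
        if x <= low:
            return 0.0
        if x >= high:
            return 1.0
        return (x - low) / (high - low)
    return membership

# Hypothetical fuzzy class "Tall", defined over height in centimetres
is_tall = ramp(160.0, 190.0)

for height in (150, 175, 200):
    print(height, is_tall(height))
```

A person of 175 cm belongs to "Tall" to degree 0.5. This is a statement about the extent of membership, not about the probability that the person is tall.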

2.3 Application of Fuzzy in Semantic Web Systems
2007: A recommender system using temporal ontologies is proposed in [10]. The agents in the system provide preference lists and uncertain lists to the user. An uncertain list holds the same type of information as a preference list, but the acquired data is based on the known products in the agent’s ontology. The agent’s ontology contains previous feedback about the products.
2005: Wang and Zhang proposed a framework called the soft Semantic Web Services agent (soft SWS agent) [11], which provides high-quality semantic web services using fuzzy neural networks with genetic algorithms. The core of the soft SWS agent is the intelligent inference engine (IIE), which uses a four-layer fuzzy neural network architecture. Linguistic variables in layer one are transformed into output variables in layer four by the fuzzy process in the layered architecture.
[12] introduces a concept-matching information retrieval system that is capable of “retrieving web pages that are conceptually related to the implicit concepts of the query”. The system uses the Fuzzy Interrelations and Synonymy-Based Concept Representation Model (FIS-CRM) to extract the concepts. The vectors in FIS-CRM hold fuzzy values representing “concept” occurrence instead of term occurrence.

Reference
[1] Stoilos G., Simou N., Stamou G., Kollias S. Uncertainty and the Semantic Web, IEEE Intelligent Systems, 2006

[2] www.sts.tu-harburg.de/~r.f.moeller/racer

[3] P. Vojtas. Fuzzy logic programming. Fuzzy Sets and Systems, 124:361-370, 2001.

[4] R. Ebrahim. Fuzzy logic programming. Fuzzy Sets and Systems, 117:215-230, 2001.

[5] Cristinel Mateis. Extending disjunctive logic programming by t-norms. In LPNMR '99: Proceedings of the 5th International Conference on Logic Programming and Nonmonotonic Reasoning, pages 290-304, London, UK, 1999. Springer-Verlag.

[6] C. V. Damasio, L. M. Pereira. Antitonic logic programs. In 6th International Conference on Logic Programming and Nonmonotonic Reasoning, 2001.

[7] Jeff Z. Pan, Giorgos Stamou, Vassilis Tzouvaras, and Ian Horrocks. f-SWRL: A Fuzzy Extension of SWRL. In Proc. of the International Conference on Artificial Neural Networks (ICANN 2005), Special section on "Intelligent multimedia and semantics", 2005.

[8] Christopher Thomas and Amit Sheth. On the Expressiveness of the Languages for the Semantic Web – Making a Case for ‘A Little More’

[9] Lotfi A. Zadeh, Toward a perception-based theory of probabilistic reasoning with imprecise probabilities, In Journal of Statistical Planning and Inference 105 (2002) 233-264

[10] Trust based Recommender System for the Semantic Web

[11] Wang H., Zhang Y. Extensible Soft Semantic Web Services Agent

[12] Garces J., Olivas P. J., Romero F.P. Concept-Matching IR System Versus Word-Matching Information Retrieval Systems: Considering Fuzzy Interrelations for Indexing Web Pages. Journal of the American Society for Information Science and Technology, 57 (4), pp. 564-576, 2006

Tuesday, May 15, 2007

Background (1): Approximation and Semantic Web

1.1 Representation of uncertainty in ontologies
Semantic web ontologies are based on crisp logic and cannot handle uncertainty. Research has therefore focused on providing well-defined means for semantic web ontologies to express uncertainty and handle incomplete or partial knowledge in a domain. Overlapping concepts, including the amount of overlap, are one of the issues that can be addressed using different methods. Below are two methods for handling uncertainty in semantic web ontologies.
A considerable amount of research has gone into developing a framework that augments and supplements the Web Ontology Language (OWL) for representing and reasoning with uncertainty based on Bayesian networks (BNs) [26], and into its application to ontology mapping. This framework, named BayesOWL, has gone through several iterations since its conception in 2003 [8, 9]. BayesOWL provides a set of rules and procedures for the direct translation of an OWL ontology into a BN directed acyclic graph (DAG). It also provides a method based on the Iterative Proportional Fitting Procedure (IPFP) [19, 7, 6, 34, 2, 4] that incorporates available probability constraints when constructing the conditional probability tables (CPTs) of the BN. The translated BN, which preserves the semantics of the original ontology and is consistent with all the given probability constraints, can support ontology reasoning across ontologies as Bayesian inference [221]. At present, BayesOWL is restricted to translating only OWL-DL concept taxonomies into BNs; an active research group at the Department of Computer Science and Electrical Engineering at the University of Maryland is working on extending the framework to OWL ontologies with property restrictions.
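
To give a feeling for what IPFP does, here is a minimal sketch of the classical procedure on a two-variable joint probability table: the table is rescaled row by row and column by column until its marginals match the given constraints. This is plain IPFP under my own toy numbers, not the variant that BayesOWL applies to the CPTs of a translated BN.

```python
def ipfp(table, row_targets, col_targets, iterations=100, tol=1e-9):
    """Rescale a joint table until its row and column sums match the target marginals."""
    table = [row[:] for row in table]  # work on a copy
    for _ in range(iterations):
        # Step 1: scale each row to its target marginal
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                table[i] = [v * row_targets[i] / total for v in row]
        # Step 2: scale each column to its target marginal
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now fit exactly; stop once the rows fit as well
        row_error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_error < tol:
            break
    return table

# Hypothetical example: start from a uniform joint distribution over two binary variables
uniform = [[0.25, 0.25], [0.25, 0.25]]
fitted = ipfp(uniform, row_targets=[0.3, 0.7], col_targets=[0.6, 0.4])
for row in fitted:
    print([round(v, 4) for v in row])
```

Starting from the uniform table, the result is simply the product of the two marginals; with a non-uniform starting table the procedure keeps as much of the original dependency structure as the constraints allow.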
If ontologies are translated to BNs, then concept mapping between ontologies can be accomplished by evidential reasoning across the translated BNs. This approach to ontology mapping is seen as advantageous over many existing methods in handling uncertainty in the mapping. Markus Holi and Eero Hyvönen introduce a new probabilistic method for representing uncertainty in semantic web ontologies [222]. In their method, degrees of subsumption, i.e., overlap between concepts, can be modeled and computed efficiently using Bayesian networks based on RDF(S) ontologies. Degrees of overlap indicate how well an individual data item matches the query concept, which can be used as a well-defined measure of relevance in information retrieval tasks.
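
The idea of a degree of overlap can be illustrated with plain sets. In this sketch a concept is just the set of its members, and the overlap is the share of the referred concept that also falls inside the selected (query) concept. This set-based reading and the example regions are my own simplification; [222] computes such degrees with Bayesian networks rather than by counting members.

```python
def overlap(selected, referred):
    """Share of the referred concept's members that also belong to the selected concept."""
    if not referred:
        return 0.0
    return len(selected & referred) / len(referred)

# Hypothetical concepts, each represented by the set of its members
europe = {"spain", "italy", "greece", "france", "germany"}
mediterranean = {"spain", "italy", "greece", "turkey", "egypt"}

print(overlap(europe, mediterranean))  # 0.6
print(overlap(europe, {"italy"}))      # 1.0
```

A data item annotated with "mediterranean" would thus match the query concept "europe" to degree 0.6, while an item annotated with "italy" matches fully.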

1.2 Fuzziness for semantic web reasoning
Description logics (DLs) are knowledge representation languages that represent the terminological knowledge of an application domain in a structured and formally well-understood way. The name refers to the concept descriptions used to describe a domain and to the logic-based semantics. The description logic [331] underlying OWL-DL is SHOIN(D). In other words, OWL-DL uses the SHOIN(D) description logic to represent knowledge and reason about it. Straccia presented a fuzzy extension of SHOIN(D), showing that its representation and reasoning capabilities go clearly beyond classical SHOIN(D) [332].
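
In a fuzzy DL, concept membership becomes a degree in [0, 1], so the concept constructors need fuzzy counterparts. The sketch below assumes Zadeh's operators (minimum, maximum and 1 - x), which are one common choice; this post does not record which operators [332] uses, and the concepts and degrees are made up.

```python
def fuzzy_and(a, b):
    """Degree of membership in a concept intersection (Zadeh: minimum)."""
    return min(a, b)

def fuzzy_or(a, b):
    """Degree of membership in a concept union (Zadeh: maximum)."""
    return max(a, b)

def fuzzy_not(a):
    """Degree of membership in a concept complement (Zadeh: 1 - a)."""
    return 1.0 - a

# Hypothetical membership degrees of one hotel in two fuzzy concepts
cheap, central = 0.7, 0.4

print(fuzzy_and(cheap, central))  # membership in "Cheap and Central"
print(fuzzy_or(cheap, central))   # membership in "Cheap or Central"
print(fuzzy_not(cheap))           # membership in "not Cheap"
```

With crisp degrees (only 0 and 1) these operators reduce to the classical conjunction, disjunction and negation.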

1.3 Application of Fuzziness in the architectures proposed for different problems
Acampora and Loia describe a multilayer architecture for designing Ambient Intelligence (AmI) [441] systems that provides efficient and uniform utilization of control activities [442]. This multilayer architecture employs markup-based technologies to transform rough information on sensors, actuators and services into “smart data”. In particular, they use the Fuzzy Markup Language (FML) [417, 418] to provide fuzzy web services. FML is a novel computer language used to model control systems based on fuzzy logic theory. The main feature of FML is its transparency: FML programs can be executed on different hardware without additional effort. This property is fundamental in ubiquitous computing environments, where computers are available throughout the physical environment and appear invisible and transparent to the user.
Nikravesh introduces a new architecture for semantic web search engines based on the Fuzzy Conceptual Model (FCM) [551, 552, 553, 554] to handle the ambiguity and imprecision of the “concept” on the Internet. In the FCM approach, the “concept” is defined by a series of keywords with different weights depending on the importance of each keyword. Ambiguity in concepts can be defined by a set of imprecise concepts. Each imprecise concept can in turn be defined by a set of fuzzy concepts. The fuzzy concepts can then be related to a set of imprecise words given the context. Imprecise words can then be translated into precise words given the ontology and ambiguity resolution through a clarification dialog. By constructing the ontology and fine-tuning the strength of the links (weights), they can construct a fuzzy set that integrates, piece by piece, the imprecise concepts and precise words to define the ambiguous concept [556].
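
The basic FCM idea of a "concept" as a set of weighted keywords can be sketched as follows. The scoring rule (weighted share of matched keywords), the keywords and the weights are my own assumptions for illustration; this is not Nikravesh's actual model.

```python
def concept_match(concept, document_terms):
    """Degree in [0, 1] to which a document matches a concept of weighted keywords."""
    total = sum(concept.values())
    if total == 0:
        return 0.0
    matched = sum(weight for keyword, weight in concept.items()
                  if keyword in document_terms)
    return matched / total

# Hypothetical concept: keyword -> importance weight
lodging = {"hotel": 1.0, "accommodation": 0.5, "resort": 0.5}

print(concept_match(lodging, {"resort", "beach", "hotel"}))  # 0.75
print(concept_match(lodging, {"flight", "ticket"}))          # 0.0
```

Fine-tuning the weights changes how strongly each keyword contributes to the concept, which is the part that the link strengths in the ontology are meant to capture.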

Reference
[2] Bock HH (1989) A Conditional Iterative Proportional Fitting (CIPF) Algorithm with Applications in the Statistical Analysis of Discrete Spatial Data. Bull. ISI, Contributed papers of 47th Session in Paris, 1:141-142

[4] Cramer E (2000) Probability Measures with Given Marginals and Conditionals: I-projections and Conditional Iterative Proportional Fitting. Statistics and Decisions, 18:311-329

[6] Csiszar I (1975) I-divergence Geometry of Probability Distributions and Minimization Problems. The Annals of Probability, 3(1):146-158

[7] Deming WE, Stephan FF (1940) On a Least Square Adjustment of a Sampled Frequency Table when the Expected Marginal Totals are Known. Ann. Math. Statist. 11:427-444

[8] Ding Z, Peng Y (2004) A Probabilistic Extension to Ontology Language OWL. In Proceedings of the 37th Hawaii International Conference on System Sciences. Big Island, HI

[9] Ding Z, Peng Y, Pan R (2004) A Bayesian Approach to Uncertainty Modeling in OWL Ontology. In Proceedings of 2004 International Conference on Advances in Intelligent Systems - Theory and Applications (AISTA2004). Luxembourg-Kirchberg, Luxembourg

[19] Kruithof R (1937) Telefoonverkeersrekening. De Ingenieur 52:E15-E25

[26] Pearl J (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA

[34] Vomlel J (1999) Methods of Probabilistic Knowledge Integration. PhD Thesis, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University

[221] Z. Ding et al. BayesOWL: Uncertainty Modeling in Semantic Web Ontologies. StudFuzz 204, pp 3-29, 2006

[222] M. Holi and E.Hyvonen: Modeling Uncertainty in Semantic Web Taxonomies, StudFuzz 204, pp 31-46, 2006

[331] Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.

[332] U. Straccia. Towards a fuzzy Description Logic for the Semantic Web. pp 167-181, 2005

[417] Acampora G., Loia V., Fuzzy Control Interoperability for Adaptive Domotic Framework. In Proceedings of 2nd IEEE International Conference on Industrial Informatics, (INDIN04), 24-26 June 2004, Berlin, Germany, pp. 184-189.

[418] Acampora G., Loia V., Fuzzy Control Interoperability and Scalability for Adaptive Domotic Framework. In IEEE Transactions on Industrial Informatics, vol.1, issue 2, pages 97-111

[441] Basten, T., Geilen, M., de Groot, H. Ambient Intelligence: Impact on Embedded System Design. Kluwer Academic Publishers, Boston, 2003

[442] Acampora, G.; Loia, V. Enhancing the FML vision for the design of open ambient intelligence environment. In proceedings of Systems, Man and Cybernetics, 2005 IEEE International Conference on Volume 3, 10-12 Oct. 2005 Page(s):2578 - 2583 Vol. 3

[556] M. Nikravesh. Beyond the Semantic Web: Fuzzy Logic-Based Web Intelligence, StudFuzz 204, pp 149-242, 2006
Below is my poster "Semantic Web - the Web Evolution", presented at iCore 2007 in May. It covers my research on an intelligent user agent capable of composing and executing web services in order to arrange a trip based on the user's preferences.

Friday, April 6, 2007

Semantic Web and Approximation

In this post I try to cover the research issues and open problems of the Semantic Web with respect to approximation, and especially approximate reasoning.
There are only a few studies in the literature, and in my view they are far from practical application. One of the most interesting issues for me is fuzziness in Description Logics (DLs) for ontologies. Pavel and Lawrence discuss this in "Fuzzy Rough Approach to Handling Imprecision in Semantic Web Ontologies". This paper shows how rough set methods can handle uncertainty in DLs.
Bartley and Lawrence discuss merging ontologies into a single ontology using Bayes' theorem in "Approximate Metrics For Autonomous Semantic Web Ontology Merging". In addition to each ontology, they use a thesaurus of that ontology to "specify the acceptable values for merging" within their architecture.
In conclusion, since many languages such as OWL DL are based on DLs (OWL DL is based on SHOIN(D)), I believe that a better understanding of DLs is indispensable for Semantic Web researchers.

Friday, March 30, 2007

Semantic Web - the Web Evolution (submitted abstract)

The Internet we use every day has over 300 million users and 3 billion static documents; however, its utilization is rather primitive. Tim Berners-Lee, the inventor of the web and director of the World Wide Web Consortium (W3C), envisions a new future for the web. This future is called the Semantic Web. The new web represents the attempt to “bring web to its full potential” by developing interoperable technologies such as specifications and protocols governing knowledge representation and services of the web.
Today’s web is a huge repository of information for human use, and the Semantic Web is its extension in which web content can be used and interpreted by software agents. The agents will enable automatic finding, sharing and integration of information on the web. The Semantic Web also provides a new level of services. For example, you will be able to find all restaurants within 10 kilometers that serve pizza and are open until 12pm; even a restaurant that lists “Italian food” instead of pizza will appear in your list!
The Semantic Web embraces multiple technologies to perform its tasks, such as manipulation and extraction of information and execution of complex services. The Resource Description Framework (RDF) is a fundamental method of representing basic pieces of information in the Semantic Web. An ontology, a “specification of conceptualization”, is built upon RDF and used to represent information and its semantics in a machine-processable manner. A special XML-based language called the Web Ontology Language (OWL) is used to represent knowledge and services as ontologies.
A number of approaches have already been proposed to address the challenges in planning, composition, and execution of web services. However, the planning problem is far from trivial. An active group of researchers, led by Dr. Marek Reformat in the Electrical and Computer Engineering (ECE) Department at the University of Alberta, is contributing to semantic web technologies. The studies in ECE focus on automatic web service execution and service composition in the presence of uncertainty. Case studies in the domain of travel planning are being constructed. The expected outcome is intelligent user agents that can perform complex services in the travel domain based on user preferences. The results will be applicable in other domains.

Wednesday, March 28, 2007

Pellet Reasoner

Pellet is an OWL-DL reasoner developed by mindswap at the University of Maryland. It is written in Java and is open source. Unfortunately there is no comprehensive documentation for Pellet, but the Pellet user mailing list is quite active and you can find a lot there. Two papers (1, 2) from mindswap might be the only papers that discuss Pellet directly.
In 2004 the W3C defined two kinds of OWL document checkers in its OWL test cases: OWL syntax checkers and OWL consistency checkers. The W3C also defined complete consistency checkers, where complete means that the checker is a decision procedure with respect to the semantics of the document's concepts. Pellet is a complete OWL-DL consistency checker.
Consistency checking means checking for contradictory facts in an ontology. More technically, it checks the Assertional Box (ABox: OWL facts, e.g. equality assertions, property values and types) for consistency with respect to the Terminological Box (TBox: axioms about classes, e.g. disjoint classes and subclasses). It is an important task, but on its own it does not do or execute anything interesting with your ontology.
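
To show what checking an ABox against a TBox means, here is a toy sketch that only knows about subclass and disjointness axioms and type assertions. All class and individual names are hypothetical. A real reasoner such as Pellet uses a tableau algorithm and covers far more of OWL-DL; this is not how Pellet is implemented.

```python
# TBox: axioms about classes (hypothetical)
subclass_of = {"Hotel": "Accommodation", "Hostel": "Accommodation", "Car": "Vehicle"}
disjoint = [("Accommodation", "Vehicle")]

# ABox: facts about individuals (hypothetical)
types = {"ritz": {"Hotel"}, "herbie": {"Car", "Hostel"}}

def ancestors(cls, subclass_of):
    """Return the class together with all of its transitive superclasses."""
    result = {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in result:  # guard against cyclic subclass axioms
            break
        result.add(cls)
    return result

def inconsistencies(types, subclass_of, disjoint):
    """List (individual, class_a, class_b) where an individual falls into two disjoint classes."""
    problems = []
    for individual, asserted in sorted(types.items()):
        inferred = set()
        for cls in asserted:
            inferred |= ancestors(cls, subclass_of)
        for a, b in disjoint:
            if a in inferred and b in inferred:
                problems.append((individual, a, b))
    return problems

print(inconsistencies(types, subclass_of, disjoint))
# [('herbie', 'Accommodation', 'Vehicle')]
```

The individual "herbie" is asserted to be both a Car and a Hostel, which makes it a member of two disjoint classes once the subclass axioms are taken into account.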
In addition to the standard inference services (consistency checking, concept satisfiability, classification and realization), Pellet covers the W3C recommendations (consistency, entailment, and conjunctive query answering). Pellet supports queries in the SPARQL and RDQL languages through its query engine. Pellet also supports the detection of syntactic and semantic defects in ontologies through "species coercion" (e.g. guessing the correct type for a resource without a defined type) and debugging, meaning finding contradictions and the axioms that cause them.

Thursday, March 8, 2007

Thinking about Travel Ontology

In this post I want to take a high-level view of a travel ontology. Let's look at the issue from the point of view that everything is a service. We can then consider three services, ticket reservation, accommodation and car rental, as the main sub-services of any travel service (combining these three has a long history in web services).
This classification assumes that the user has already chosen the destination and duration. Most of the time, however, choosing the most appropriate travel destination and duration for a user based on his own preferences is a big issue in itself (and one that is not well studied in the literature).
We should add activities as a new category alongside the traditional categories in order to differentiate trips (in particular the destination, date and duration of a trip). Activities cover anything that can make a trip more interesting. All information about activities (place, date, cost, etc.) would be represented in the subclasses and slots of the activity concept.
So far we have reviewed the most important parts of any travel service in terms of knowledge representation, but service execution depends directly on the constraints applied by the user. These parameters should be defined in the travel ontology, to be used later by the user agent (an agent on the user's side that performs autonomous or semi-autonomous web service execution). The user agent compares the information retrieved from the travel ontology with the information stored in the user preferences ontology. The user agent can drill in and out of the travel ontology based on the user preferences. A very nice example of filtering data based on user preferences was recently implemented on the Oracle Technology Network (OTN) at OTN Semantic Web; there is also a nice short white paper.
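
The comparison step performed by the user agent could look roughly like the sketch below. The `Offer` structure, the four service categories as plain strings and the single cost constraint per service are all assumptions made for illustration; a real agent would read both sides from the travel and user preferences ontologies.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    service: str  # "ticket", "accommodation", "car" or "activity"
    name: str
    cost: float

def filter_offers(offers, preferences):
    """Keep the offers whose cost does not exceed the user's limit for that service."""
    return [offer for offer in offers
            if offer.service in preferences and offer.cost <= preferences[offer.service]]

# Hypothetical data retrieved from the travel ontology
offers = [
    Offer("ticket", "direct flight", 620.0),
    Offer("ticket", "one-stop flight", 410.0),
    Offer("accommodation", "downtown hotel", 150.0),
    Offer("activity", "guided hike", 45.0),
]

# Hypothetical constraints stored in the user preferences ontology: service -> maximum cost
preferences = {"ticket": 500.0, "accommodation": 200.0, "activity": 50.0}

for offer in filter_offers(offers, preferences):
    print(offer.service, offer.name, offer.cost)
```

Adding activities as a fourth category only requires another entry in the preferences, which is one reason to model them in the same way as the three traditional services.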

Monday, March 5, 2007

Another way of searching for an Ontology

Today I found a very interesting way of searching for ontologies. This time you can use Google: type the word you are looking for an ontology for, followed by filetype:owl. In comparison to Swoogle, this approach returns only OWL files (no DAML and no RDF).
For example, searching for hotel filetype:owl in Google returns more than 80 OWL files. Enjoy!
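
If you want to script this search, the query can be turned into a URL as in the small sketch below. The helper name is my own, and the code only builds the address; it does not fetch or parse any results.

```python
from urllib.parse import urlencode

def ontology_search_url(keyword, filetype="owl"):
    """Build a Google search URL restricted to files of the given type."""
    query = f"{keyword} filetype:{filetype}"
    return "https://www.google.com/search?" + urlencode({"q": query})

print(ontology_search_url("hotel"))
# https://www.google.com/search?q=hotel+filetype%3Aowl
```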

Saturday, March 3, 2007

Finding an Ontology

Although there are not many ontologies online for different domains, it is certainly advisable to search for an ontology for your domain of interest before developing one from scratch.
I think the best place to search for ontologies is Swoogle; there is also an ontology library on the Protege website.