Ontology Server research

On this page you can find all available information related to the ontology server being developed at STAR Lab.

Introduction
The base model
The ontology server architecture
The ontology server implementation
Future work

Downloads

Introduction

The development of the ontology server falls within the DOGMA research framework (follow the link for further information). This page collects all information about the work done here at STAR Lab, ranging from theoretical issues to implementation details. At the moment we are still experimenting with the base model, and an early (and incomplete) version of the ontology server is being implemented. Once this first version is established, we will be able to validate the theoretical model and experiment with the ontology server and with ontologies in general.

The base model

The basic idea behind the model used for the ontology server is that semantics, just as in databases, should be kept outside the ontology (e.g. in a layer around it). Consequently, we aim for a representation that is as simple as possible, combined with a semantic interpretation function that interprets the data in the ontology; this function may differ between applications within the same domain. Constraints and derivation rules are intentionally left outside the ontology.

Essentially, our ontology model consists of five basic elements: contexts, terms, concepts, roles and lexons, as can be seen in the ORM schema below.

 

In the full model there are some extra entities, such as user and version, mainly for administrative reasons. The first prototype we are building now will include neither version control nor user control, but in a later version these (and especially version control) might become crucial.

Element by element, the model is explained below:

  • Ontology: the ontology is the topmost entity. It is needed because the ontology server is intended to hold several ontologies, likely contributed by different persons. An ontology consists of a set of contexts (see further). As attributes, the ontology has a name (mandatory and unique within the ontology server), a contributor, an owner, a status ("under development", "finished", ...) and documentation (an arbitrary string in which the contributor or the owner can record relevant information).

  • Context: a context is a grouping entity, used to group terms and lexons (see further) in the ontology. Within one ontology, every context must have a unique name. Each context is annotated with a term within the meta context for this ontology. The existence of a meta context also makes it possible to define inter-context relations in much the same way as relations between ordinary terms (see further, lexons).

  • Concept: a concept is an entity representing some "thing", the actual entity in the real world. Concepts are identified by ids: every concept has a unique id. A concept also carries one or more "source-key-value" triples, which are its descriptions. The source identifies where the description originates, the key is a string that hints at how the value should be interpreted, and the value is the description itself. Because one concept can have more than one source-key-value triple, its meaning can be described in different ways. As an example, consider WordNet. In WordNet, a synset denotes a set of terms (with their "senses") that are equivalent. Every term also has a glossary, which is an informal description of the meaning of that (particular sense of the) term. We could therefore extract two different descriptions for a concept from WordNet, i.e. two different source-key-value triples: the glossary (Source: WordNet - Key: Glossary - Value: "<informal description denoted as a glossary in WordNet>") and the synset (Source: WordNet - Key: Synset - Value: <enumeration of synonyms forming the synset>).

  • Term: a term is an entity representing a lexical representation of a concept. Within one context a term is unambiguous and, consequently, can be associated with only one concept. Of course, several different terms within one context can refer to the same concept, which implicitly defines these terms as synonyms for this context. Terms in different contexts can also refer to the same concept, and in this way implicitly establish a connection between these two contexts. The terms can be the same (meaning that the term has the same meaning in both contexts) or they can be different (meaning that some other term in the other context has the same meaning). Note that the implicit definition of synonyms does depend on the context (for example, wield and handle are synonyms in the context "weaponry", but clearly not when we consider handle in the context "tools").

  • Lexon: a lexon is a grouping element. It is a triple consisting of a starting term (also called the "headword" of the lexon), a role (relation) and a second term (also called the "tail" of the lexon). A lexon always appears in a context, and describes a relation that is valid in this context (but not necessarily in another context). Because lexons appear in a context, and terms are unambiguous within a context, lexons can also be considered as relations between concepts.
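
The element descriptions above can be pictured with a small in-memory sketch in Java. It is only an illustration under stated assumptions: all class, field and method names (Description, Concept, Lexon, Context, synonymsOf, ...) are hypothetical and are not taken from the STAR Lab implementation, and the "has_part" lexon is a made-up example. The sketch shows two points from the text: a term maps to exactly one concept within a context, and synonyms follow implicitly from terms sharing a concept.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative in-memory types only; every name here is hypothetical.

/** One source-key-value triple describing a concept. */
final class Description {
    final String source, key, value;
    Description(String source, String key, String value) {
        this.source = source; this.key = key; this.value = value;
    }
}

/** A concept: a unique id plus any number of descriptions. */
final class Concept {
    final int id;
    final List<Description> descriptions = new ArrayList<>();
    Concept(int id) { this.id = id; }
}

/** A lexon: head term, role, tail term, valid within one context. */
final class Lexon {
    final String head, role, tail;
    Lexon(String head, String role, String tail) {
        this.head = head; this.role = role; this.tail = tail;
    }
}

/** A context groups terms and lexons; each term denotes exactly one concept. */
final class Context {
    final String name;
    final List<Lexon> lexons = new ArrayList<>();
    private final Map<String, Concept> terms = new HashMap<>();

    Context(String name) { this.name = name; }

    /** Registers a term; rejects a second, different concept for the same term. */
    void addTerm(String term, Concept concept) {
        Concept existing = terms.putIfAbsent(term, concept);
        if (existing != null && existing != concept) {
            throw new IllegalArgumentException(
                "term '" + term + "' already denotes another concept in " + name);
        }
    }

    /** Other terms in this context that refer to the same concept. */
    Set<String> synonymsOf(String term) {
        Set<String> result = new TreeSet<>();
        Concept concept = terms.get(term);
        if (concept == null) return result;
        for (Map.Entry<String, Concept> e : terms.entrySet()) {
            if (e.getValue() == concept && !e.getKey().equals(term)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Adds a lexon; both terms must already exist in this context. */
    void addLexon(String head, String role, String tail) {
        if (!terms.containsKey(head) || !terms.containsKey(tail)) {
            throw new IllegalArgumentException("both terms must exist in " + name);
        }
        lexons.add(new Lexon(head, role, tail));
    }
}

final class BaseModelSketch {
    public static void main(String[] args) {
        Concept wielding = new Concept(1);
        wielding.descriptions.add(
            new Description("WordNet", "Glossary", "<informal description>"));
        Concept toolHandle = new Concept(2);
        Concept hammer = new Concept(3);

        Context weaponry = new Context("weaponry");
        weaponry.addTerm("wield", wielding);
        weaponry.addTerm("handle", wielding);   // same concept: implicit synonym

        Context tools = new Context("tools");
        tools.addTerm("handle", toolHandle);    // same term, different concept
        tools.addTerm("hammer", hammer);
        tools.addLexon("hammer", "has_part", "handle");

        System.out.println(weaponry.synonymsOf("wield"));  // [handle]
        System.out.println(tools.synonymsOf("handle"));    // []
    }
}
```

Keeping the term-to-concept map inside the context is one possible way to enforce that a term is unambiguous per context while still allowing the same term to denote another concept elsewhere.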

The ontology server architecture

In the figure below you can find the general ontology server architecture.

As we can see in the figure, the general architecture components are:

  • Storage: the storage system is where the data is stored. Ideally, storage would be an elementary system that operates at disk level (rather than being built upon the existing file system) and thus provides efficient storage with fast access to the different elements of the ontology base model described above (similar to the way DBMSs manage data).

  • Storage API: provides unified access to the basic structures of the ontology server. The API should be accessible from any high level programming language.

  • Higher level ontology objects: the ontology objects are expressed in a data description language format, or as objects in any high level programming language. They are obtained from the storage API, and can also be stored through the storage API.

  • Applications:  applications can use the ontology server by integrating the ontology objects returned from the storage API in their program code.  
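
The layering described above might be sketched as follows. This is a minimal illustration, not the actual architecture code: the names Storage, InMemoryStorage, StorageApi and ContextObject are hypothetical, and a plain map stands in for the storage layer. The point it tries to show is that the application only sees ontology objects and the storage API, so the storage behind the API can be exchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// All names below are hypothetical; they only mirror the four layers.

/** Storage layer: where the data physically lives. */
interface Storage {
    void write(String key, String value);
    Optional<String> read(String key);
}

/** Stand-in storage backed by a map; a DBMS or disk-level store could replace it. */
final class InMemoryStorage implements Storage {
    private final Map<String, String> data = new HashMap<>();
    public void write(String key, String value) { data.put(key, value); }
    public Optional<String> read(String key) { return Optional.ofNullable(data.get(key)); }
}

/** Higher level ontology object handed to applications. */
final class ContextObject {
    final String ontology, name;
    ContextObject(String ontology, String name) {
        this.ontology = ontology; this.name = name;
    }
}

/** Storage API: the only layer that knows how objects map onto storage. */
final class StorageApi {
    private final Storage storage;
    StorageApi(Storage storage) { this.storage = storage; }

    void store(ContextObject context) {
        storage.write(keyOf(context.ontology, context.name), context.name);
    }

    Optional<ContextObject> loadContext(String ontology, String name) {
        return storage.read(keyOf(ontology, name))
                      .map(stored -> new ContextObject(ontology, stored));
    }

    private static String keyOf(String ontology, String name) {
        return "context/" + ontology + "/" + name;
    }
}

/** Application layer: works with ontology objects, never with Storage directly. */
final class ArchitectureSketch {
    public static void main(String[] args) {
        StorageApi api = new StorageApi(new InMemoryStorage());
        api.store(new ContextObject("demo", "weaponry"));
        System.out.println(api.loadContext("demo", "weaponry").isPresent());  // true
        System.out.println(api.loadContext("demo", "tools").isPresent());     // false
    }
}
```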

The ontology server implementation

As can be observed, the general architecture from the previous section is implemented by the following components:

  • DBMS: we use a Database Management System to implement the Storage. Currently we are using MS SQL Server 7.0, but this DBMS can easily be replaced by any other database product (see also Database API). The ontology server database schema was obtained by transforming the ORM model into a relational database schema using Visio 2000; the result can be seen here. The SQL DDL statements that construct this database schema are given in this SQL script file.

  • Database API: access to the storage is provided by a database API, which we wrote in Java (JDK 1.3). The connection to the DBMS is made using JDBC 2.0 and the JDBC-ODBC driver from Sun that ships with JDK 1.3. The API itself is specified as three different Java interfaces, IDatabaseInsertionAPI, IDatabaseRetrievalAPI and IDatabaseModificationAPI, all extending the general super-interface IDatabaseAPI. (Note that all database interface names start with an I.)

    • IDatabaseAPI: as the common super-interface of all other interfaces listed below, IDatabaseAPI contains the methods to establish and close the connection to the database. Every use of the Database API should start with a call to the establishConnection method. When the Database API is no longer required, the closeConnection method should be called.

    • IDatabaseInsertionAPI: the insertion API provides all basic functionality to add information to the ontology server. Specific methods for adding ontologies, contexts, terms, concepts, lexons, users and versions are included. Remember that methods for establishing and closing a connection to the database are provided through inheritance from IDatabaseAPI.

    • IDatabaseRetrievalAPI: the retrieval interface provides all basic functionality to retrieve information from the ontology server. The specific methods fall into two groups: those for retrieving detailed information about ontologies, contexts, terms, concepts, lexons, users and versions, and those for retrieving grouped information, such as all ontologies in the ontology server, all contexts in an ontology, all terms in a context, all lexons in a context and all users of the ontology server. For more information on how ontology elements are returned, please see the Java Persistent Objects. Remember that methods for establishing and closing a connection to the database are provided through inheritance from IDatabaseAPI.

    • IDatabaseModificationAPI: the modification interface provides all basic functionality to modify information already present in the ontology server. This interface is not yet implemented. It will include specific methods for modifying ontologies, contexts, terms, concepts, lexons, users and versions. Like every specific database interface, the modification interface will inherit from IDatabaseAPI.

    The above interfaces are, unless stated otherwise, implemented by the concrete class DatabaseAPI.

    The full Java implementation is available here: OntologyServer.zip.
    Full Java documentation for the implementation is also available.

    Three notes should be made. First of all, only the basic interfaces are currently implemented, but other specific interfaces can easily be added. The IDatabaseModificationAPI is under development, and more advanced retrieval and insertion interfaces are also under consideration. These advanced interfaces will be created in close coordination with the development of a specific ontology query language (similar to SQL for databases). Secondly, the combination of the JDBC-ODBC driver and potentially poor garbage collection of JDBC objects can lead to increasing delays when using the database API. This can be avoided by re-establishing the connection to the database when access slows down. Finally, although the API is expressed and implemented as Java interfaces and objects, it can be accessed from any programming language using CORBA or a similar technology.

  • Java Persistent Objects: the higher level ontology objects are implemented as Java objects (see package ontologyobjects) that implement specific Java interfaces. These interfaces are IOntology, IContext, ITerm, IConcept, ILexon, IUser and IVersion. All of them inherit from the general ontology object interface IOntologyObject, which currently contains only one method for pretty printing but is intended to contain all methods related to persistency (not yet implemented in the current version). Concrete implementations of all these interfaces are also provided. When the ontology server needs to return chains of objects (for example from the retrieval API), our implementation uses OntologyCollections (see package util). Ontology collections work similarly to Java collections; in addition, they do delayed retrieval from the ontology database storage.

  • Ontology objects expressed in XML format: ontology objects can also be expressed in XML format, using this DTD. This representation in the popular XML data description language enables us to use the whole variety of tools available for XML (see also Ontology manager).

  • Applications: any kind of application can make use of an ontology stored in the ontology server, simply by using the Database API and the ontology objects described above. We ourselves provide two applications that are essential companions to the ontology server:

    • Ontology manager: at this point, the ontology manager only provides support for storing ontologies expressed in XML. We use the Xerces Java Parser for parsing the XML documents containing ontologies. This XML parser uses the Database API to access the ontology server. We added WordNet to the ontology server using the ontology manager (see later).

    • Ontology browser: an experimental ontology browser has been developed, but we feel it is not yet mature enough to make public. Once the ontology browser has matured, it will become available here.
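
The Database API hierarchy and its connection lifecycle, as described in the component list above, could look roughly like the sketch below. The interface names and the method names establishConnection and closeConnection come from the text; everything else is an assumption made for illustration: the parameterless signatures, the addTerm and getAllTerms methods, and the InMemoryDatabaseAPI class, which stands in for the concrete DatabaseAPI class and does not use JDBC at all.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Interface names follow the text; all signatures are assumptions.

interface IDatabaseAPI {
    void establishConnection();   // real parameters (data source, user, ...) unknown
    void closeConnection();
}

interface IDatabaseInsertionAPI extends IDatabaseAPI {
    void addTerm(String context, String term);     // hypothetical method
}

interface IDatabaseRetrievalAPI extends IDatabaseAPI {
    List<String> getAllTerms(String context);      // hypothetical method
}

/** In-memory stand-in for the concrete DatabaseAPI class; no JDBC involved. */
final class InMemoryDatabaseAPI implements IDatabaseInsertionAPI, IDatabaseRetrievalAPI {
    private final Map<String, List<String>> termsByContext = new HashMap<>();
    private boolean connected;

    public void establishConnection() { connected = true; }
    public void closeConnection() { connected = false; }
    boolean isConnected() { return connected; }

    public void addTerm(String context, String term) {
        requireConnection();
        termsByContext.computeIfAbsent(context, k -> new ArrayList<>()).add(term);
    }

    public List<String> getAllTerms(String context) {
        requireConnection();
        // Return a copy so callers cannot modify the stored list.
        return new ArrayList<>(
            termsByContext.getOrDefault(context, Collections.<String>emptyList()));
    }

    private void requireConnection() {
        if (!connected) {
            throw new IllegalStateException("call establishConnection first");
        }
    }
}

final class DatabaseApiSketch {
    public static void main(String[] args) {
        InMemoryDatabaseAPI api = new InMemoryDatabaseAPI();
        api.establishConnection();             // every use starts here
        try {
            api.addTerm("weaponry", "wield");
            api.addTerm("weaponry", "handle");
            System.out.println(api.getAllTerms("weaponry"));  // [wield, handle]
        } finally {
            api.closeConnection();             // release once no longer required
        }
    }
}
```

Wrapping the calls in try/finally is one way to make sure closeConnection is reached even when an insertion or retrieval call fails.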

 
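
The delayed retrieval attributed to OntologyCollections in the Java Persistent Objects item can be illustrated with a lazily loaded list. This is a sketch of the general technique only; LazyList, fetchTerms and the query counter are hypothetical names and do not reflect how the classes in package util are actually written.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of delayed retrieval; not the OntologyCollections code.

/** A read-only list whose elements are fetched on first access. */
final class LazyList<T> extends AbstractList<T> {
    private final Supplier<List<T>> loader;
    private List<T> loaded;          // null until the first access

    LazyList(Supplier<List<T>> loader) { this.loader = loader; }

    private List<T> data() {
        if (loaded == null) {
            loaded = new ArrayList<>(loader.get());   // fetch once, then cache
        }
        return loaded;
    }

    boolean isLoaded() { return loaded != null; }

    @Override public T get(int index) { return data().get(index); }
    @Override public int size() { return data().size(); }
}

final class LazyCollectionSketch {
    static int queries = 0;

    /** Stands in for a retrieval call against the ontology database. */
    static List<String> fetchTerms() {
        queries++;
        return Arrays.asList("wield", "handle");
    }

    public static void main(String[] args) {
        LazyList<String> terms = new LazyList<>(LazyCollectionSketch::fetchTerms);
        System.out.println(queries);       // 0: nothing fetched yet
        System.out.println(terms.size());  // 2: first access triggers the fetch
        System.out.println(terms.get(0));  // wield
        System.out.println(queries);       // 1: later accesses reuse the result
    }
}
```

Because the list extends AbstractList, it can be iterated and passed around like any other Java collection, while the database is only queried if an element is actually needed.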

Future work

  • How to store ontologies described in XML (already implemented, still to be added on the site)

  • WordNet in our ontology server (already implemented, still to be added on the site)

  • Persistence in the real sense of the word for the java ontology objects

  • Ontology browser

  • Ontology query language

Downloads