May 30, 2010
Until financing of the ontology development is provided, no more updates are planned.
Frequently asked questions
The answers below are not intended to be academically rigorous; their purpose is to explain to a person not specialized in semantic web technologies what semantic data is about.
Of course there is much more to tell, and in greater depth. The purpose is therefore not to be complete but to encourage the reader to search for more.
The other pages of this website answer further questions.
Here is the list of questions:
Suppose we implement a semantic web of data. What impact should we expect from this strategic decision?
- As we know from the business case section of the website, production processes are not well controlled in the service sector. Most companies report losses of millions of EUR/USD every year due to inadequate production control. Since production is controlled through the modelling approach, these losses should be reduced to zero.
- With modelling of operations we no longer think in terms of STP (Straight Through Processing).
In the past it was questionable what exactly was being “processed”, and unclear how far “through” reached.
Instead we think in terms of integrated end-to-end processing and control.
This activity approach generates a tremendous competitive cost advantage.
- The control of operations in the service industry shifts from IT departments to business departments.
Thereby the role of programming is very much reduced and replaced by model building.
The knowledge in the model permits processing of data, through techniques accessible to people with no programming background or knowledge.
IT departments will be more focussed on the technical side of the infrastructure than on the business side.
Imagine the current process departments merged with the project, operations and IT departments, with no distinction between those departments anymore.
The cost efficiency increases again.
- The implementation tools in back and front office come out of the box with little customization.
To find the benefit, consider calculating what one interface page costs in terms of functional analysis, technical analysis, building of the interface, transaction building, connections to data sources, translations between the different data sources, testing, regression testing, roll-out, documentation, support and project management. Depending on the technical and organizational environment, the estimated cost saving is at least 75%.
Multiplying that amount by the number of interface pages gives an idea of the benefits.
Considering that most front-office interfaces have a counterpart on mainframe systems, the achievable benefits are again tremendous.
- Risk management is inherent to the better knowledge of products, processes and procedures.
It is difficult to measure the risk management benefits but the shareholders of most financial institutions can tell us how high the benefits could have been.
- Auditors, shareholders, authorities and the public in general will get a picture of high professionalism in process, risk, product and production control. The cost advantage will enable a competitive advantage in marketing.
The result is a higher confidence rate, which ultimately increases share prices, credit ratings, the number of customers and the profit.
- Co-workers need to acquire new skills: modelling requires a thorough understanding of the business processes and, as a complement, of the modelling techniques.
How do you store the following knowledge in a semantic data environment: Eddy Vanderlinden is holder of securities account 1234?
The machine stores the information as a triple: the subject EddyVanderlinden, the predicate "is holder of" and the object 1234.
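As a minimal sketch, the statement can be pictured as one subject-predicate-object triple. The identifiers `EddyVanderlinden` and `isHolderOf`, and the use of a Python named tuple, are illustrative assumptions and not the notation of any particular triple store.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Hypothetical identifiers chosen for this example.
stored = Triple("EddyVanderlinden", "isHolderOf", "1234")
print(stored)
```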
What will be the output to the user of the information included in the previous question ?
The output will be:
eddy vanderlinden is holder of 1234
This output is obtained through the generation of labels based on stored information.
This generation can be automated.
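One way such label generation could be automated is sketched below. It assumes that identifiers are written in CamelCase (for example `isHolderOf`, a hypothetical name) and simply splits them into lower-case words; in practice labels may equally well be stored explicitly as annotations.

```python
import re


def label(identifier: str) -> str:
    """Turn a CamelCase identifier into a lower-case, space-separated label."""
    # Insert a space wherever a lower-case letter is followed by an upper-case one.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", identifier)
    return spaced.lower()


def render(triple: tuple[str, str, str]) -> str:
    """Build the user-facing sentence from the labels of the three parts."""
    return " ".join(label(part) for part in triple)


sentence = render(("EddyVanderlinden", "isHolderOf", "1234"))
print(sentence)  # eddy vanderlinden is holder of 1234
```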
How does the machine know that EddyVanderlinden is a person having a role in securities handling ? How does the machine know 1234 is a securities account ?
EddyVanderlinden and 1234, in the context of the example above, are called individuals.
Individuals with similar characteristics can be grouped into classes.
In doing so we turn the individuals into a concept.
These classes can be given a name.
A named class in semantic data systems is registered as:
To explicitly inform the machine that the individual EddyVanderlinden is an element of the Party class the registration is:
To explicitly inform the machine the individual 1234 is an element of the SecuritiesAccount class the registration is:
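The three registrations can be sketched as triples using the standard terms `rdf:type` and `owl:Class`. Writing the prefixed names as plain Python strings, and the helper `classes_of`, are simplifications made for this example only.

```python
RDF_TYPE = "rdf:type"

triples = {
    # Declaring the named classes.
    ("Party", RDF_TYPE, "owl:Class"),
    ("SecuritiesAccount", RDF_TYPE, "owl:Class"),
    # Declaring the individuals as elements of those classes.
    ("EddyVanderlinden", RDF_TYPE, "Party"),
    ("1234", RDF_TYPE, "SecuritiesAccount"),
}


def classes_of(individual: str) -> set[str]:
    """Return every class the individual is explicitly declared a member of."""
    return {o for s, p, o in triples if s == individual and p == RDF_TYPE}


print(classes_of("EddyVanderlinden"))  # {'Party'}
print(classes_of("1234"))              # {'SecuritiesAccount'}
```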
From the examples above we learn the handling of information through semantic data systems is really different from e.g. entity/relationship procedures. What principles are followed ?
We may imagine all data in one table with 3 core fields: the triples.
The first part of the triple is the subject, followed by the predicate and the object. All three are present in every triple, although the subject may be an unnamed (anonymous) resource.
There are other configured fields, though, for data management and technical reasons: the status of a record, the validity start and end date, index tables to speed up performance, and so on.
Databases designed specifically to store this kind of data are called “triple stores”. They are optimized and structured for performance.
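The "one table with three core fields" idea can be sketched in a few lines. The in-memory list and the `match` helper below are assumptions made for illustration; a real triple store adds indexes, status and validity fields, and a query language on top.

```python
from typing import Optional

Triple = tuple[str, str, str]

# One "table" with three core fields: subject, predicate, object.
table: list[Triple] = [
    ("EddyVanderlinden", "rdf:type", "Party"),
    ("1234", "rdf:type", "SecuritiesAccount"),
    ("EddyVanderlinden", "isHolderOf", "1234"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return all rows matching the pattern; None acts as a wildcard."""
    return [
        row for row in table
        if (s is None or row[0] == s)
        and (p is None or row[1] == p)
        and (o is None or row[2] == o)
    ]


print(match(p="isHolderOf"))        # who holds what?
print(match(s="EddyVanderlinden"))  # everything stated about one subject
```

Every question to the store becomes a pattern over the same three fields, which is why no table design per entity is needed.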
How large can those databases grow? How well do they perform?
These databases can grow practically without limit, bounded only by the number of computers linked to each other. There are already stores in production holding several billion triples.
Can we add some knowledge about the concepts to the classes ?
Yes, we can.
This is done through adding properties to the class (concept).
The example above “is holder of” is an example of an object property.
An object property links 2 individuals.
In the above case “eddy vanderlinden is holder of 1234”, [eddy vanderlinden] is one individual on the left side of the statement, [1234] is the other individual on the right side of the statement.
Thus, the response to the first question is an example of allocation of a (object) property to an individual:
To be able to use the property, it has to be created as a property concept in the ontology.
The creation of an object property is registered as follows:
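A sketch of both steps, creating the property and then allocating it to an individual, follows. `owl:ObjectProperty` is the standard OWL term; the property name `isHolderOf` and the rule that undeclared properties are refused are assumptions of this example, not something every store enforces.

```python
triples: set[tuple[str, str, str]] = set()


def declare_object_property(name: str) -> None:
    """Register the property as a concept in the ontology."""
    triples.add((name, "rdf:type", "owl:ObjectProperty"))


def assert_link(subject: str, prop: str, obj: str) -> None:
    """Link two individuals, refusing properties that were never declared."""
    if (prop, "rdf:type", "owl:ObjectProperty") not in triples:
        raise ValueError(f"undeclared object property: {prop}")
    triples.add((subject, prop, obj))


declare_object_property("isHolderOf")
assert_link("EddyVanderlinden", "isHolderOf", "1234")
print(sorted(triples))
```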
Besides properties whose value is an individual of a class, there are properties whose value has to be supplied each time the property is assigned: the data properties.
Imagine the property “is born on”.
We provide the information about my date of birth as follows in semantic data:
“is born on” does not refer to an individual of any class; instead, it refers to a data type.
In the following registration entries a format is also allocated to the datatype property.
The creation of a data property is registered as follows:
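The sketch below shows what such entries could look like, using the standard terms `owl:DatatypeProperty`, `rdfs:range` and `xsd:date`. The property name `isBornOn` is hypothetical, and the date is a placeholder because the text does not give the real one.

```python
from datetime import date

triples = [
    # Declaration of the data property and of the data type of its values.
    ("isBornOn", "rdf:type", "owl:DatatypeProperty"),
    ("isBornOn", "rdfs:range", "xsd:date"),
]


def assert_birth_date(individual: str, iso_date: str) -> None:
    """Attach a date literal, rejecting text that is not a valid ISO date."""
    date.fromisoformat(iso_date)  # raises ValueError on a malformed date
    triples.append((individual, "isBornOn", f'"{iso_date}"^^xsd:date'))


# Placeholder value only; the real date is not given in the text.
assert_birth_date("EddyVanderlinden", "1970-01-01")
print(triples[-1])
```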
What is the common name of the process of conceptualizing individual occurrences? What do we call the organization and management of knowledge about these concepts?
The building of an ontology.
How does an ontology fulfill the function of a lexicon?
The concepts get a third kind of property: annotation properties.
These can take the form of labels or comments, but one can also create one's own annotation properties or import existing libraries (typically you will hear about the Dublin Core).
The storage:
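Annotations are stored as triples like everything else. In the sketch below `rdfs:label` and `rdfs:comment` are the standard annotation properties, while the comment wording and the `lexicon_entry` helper are invented for illustration.

```python
triples = [
    ("SecuritiesAccount", "rdfs:label", "securities account"),
    ("SecuritiesAccount", "rdfs:comment",
     "An account in which securities are held."),  # illustrative wording
    ("isHolderOf", "rdfs:label", "is holder of"),
]


def lexicon_entry(concept: str) -> dict[str, list[str]]:
    """Collect the annotations of one concept, grouped by annotation property."""
    entry: dict[str, list[str]] = {}
    for s, p, o in triples:
        if s == concept and p in ("rdfs:label", "rdfs:comment"):
            entry.setdefault(p, []).append(o)
    return entry


print(lexicon_entry("SecuritiesAccount"))
```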
How does an ontology fulfill the function of a thesaurus?
First, it fulfills the lexicon function.
Second, the concepts are classified into an ordering system in which the top level is a general concept and each subsequent level holds specializations of the level above.
We refer to higher levels as superclasses. We refer to lower levels as subclasses.
The criterion to allow a super-/subclass ordering is the answer to the question: are all individuals in the subclass also individuals of the superclass?
In earlier days the criteria were much stricter; today the conceptual meaning in a context prevails.
In the Finance ontology, the class of accounts thus starts with the “yearly accounts”, consisting of the balance sheet, the profit and loss statement, and the comments to those account statements.
Under the balance sheet liabilities class we will find the accounts payable. Under the accounts payable we will find the third party account in the double posting system.
Such an ordering is called a taxonomy, recorded as follows:
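A taxonomy of this kind can be sketched with `rdfs:subClassOf` triples. The class names below are hypothetical identifiers derived from the accounting example, and the upward walk is a simple illustration of how superclasses can be looked up.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical class names derived from the accounting example in the text.
taxonomy = [
    ("BalanceSheet", SUBCLASS_OF, "YearlyAccounts"),
    ("BalanceSheetLiabilities", SUBCLASS_OF, "BalanceSheet"),
    ("AccountsPayable", SUBCLASS_OF, "BalanceSheetLiabilities"),
    ("ThirdPartyAccount", SUBCLASS_OF, "AccountsPayable"),
]


def superclasses(cls: str) -> list[str]:
    """Walk upwards through the taxonomy, nearest superclass first."""
    result: list[str] = []
    frontier = [cls]
    while frontier:
        current = frontier.pop(0)
        for s, p, o in taxonomy:
            if s == current and p == SUBCLASS_OF and o not in result:
                result.append(o)
                frontier.append(o)
    return result


print(superclasses("ThirdPartyAccount"))
```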
Third, there are object and datatype properties.
These relate classes to each other or allocate values to properties.
The properties bear meaningful names and carry annotations (labels and comments).
Fourth, we can set restrictions on the individuals allowed to be a member of a class.
Example:
Consider the class LiquidityRiskProfile, which profiles the risk of insufficient liquidity in the trade of the security.
A condition which has to be fulfilled for an individual (profile) to be a member of the class LiquidityRiskProfile is: there exists a property hasRiskArea whose value is a member of either the class Holder or the class Issuer.
An unnamed class has been created by imposing the condition. You find the definition in the condition, or the object of the triple above.
The named class LiquidityRiskProfile is declared a subclass of the unnamed class.
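The condition can be read as a membership test, sketched below. All individual names and the class Broker are invented for the example; only LiquidityRiskProfile, hasRiskArea, Holder and Issuer come from the text. The function checks the necessary condition only: meeting it does not by itself make a profile a member of the class.

```python
# Hypothetical individuals and the classes they belong to.
types = {
    "Holder_1": {"Holder"},
    "Issuer_1": {"Issuer"},
    "Broker_1": {"Broker"},
}

# hasRiskArea links: profile -> individuals it points to.
has_risk_area = {
    "Profile_A": ["Holder_1"],
    "Profile_B": ["Broker_1"],
}


def meets_condition(profile: str) -> bool:
    """True if some hasRiskArea value is a member of Holder or Issuer."""
    allowed = {"Holder", "Issuer"}
    return any(
        types.get(value, set()) & allowed
        for value in has_risk_area.get(profile, [])
    )


print(meets_condition("Profile_A"))  # True
print(meets_condition("Profile_B"))  # False
```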
Fifth, we can define restrictions at the level of the class whereby every individual bearing a particular property is made a member of the class.
These relations are called “necessary and sufficient”.
Example 1: whenever an individual bears the property `is custodian for financial instrument`, the individual will be made a member of an anonymous class, equivalent to the class “Custodian”.
The scope is called existential; the restriction will read: `is custodian for financial instrument` some `Financial instrument`.
Example 2: whenever an individual bears the property `is custodian for financial instrument`, but only when the right side of the property relation is a member of the class FinancialInstrument, the individual will be made a member of an anonymous class, equivalent to the class “Custodian”.
The scope is called universal; the restriction will read: `is custodian for financial instrument` all `Financial instrument`.
Both conditions can be stored together, combining the existential and the universal restriction.
What is stored is: there exists an unnamed class (predicate equivalentClass) whose members all belong to the intersection of:
- the collection of individuals which have at least one property ‘isCustodianForFinancialInstrument’;
- the collection of individuals for which, whenever they have a property ‘isCustodianForFinancialInstrument’, the right side is a member of the class FinancialInstrument.
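The intersection of the two collections can be sketched as a classification rule. The individuals and the class RealEstate are invented for the example. The sketch also assumes that the listed facts are complete (a closed world); an OWL reasoner works under the open-world assumption and would not conclude the universal part from stated facts alone.

```python
LINK = "isCustodianForFinancialInstrument"

# Hypothetical individuals and facts.
types = {"Bond_1": {"FinancialInstrument"}, "Building_1": {"RealEstate"}}
triples = [
    ("BankA", LINK, "Bond_1"),
    ("BankB", LINK, "Bond_1"),
    ("BankB", LINK, "Building_1"),
]


def is_custodian(individual: str) -> bool:
    """Classify as Custodian when both restrictions hold for the stated facts."""
    targets = [o for s, p, o in triples if s == individual and p == LINK]
    # Existential part: at least one occurrence of the property.
    existential = len(targets) > 0
    # Universal part: every right side is a FinancialInstrument.
    universal = all(
        "FinancialInstrument" in types.get(t, set()) for t in targets
    )
    return existential and universal


print(is_custodian("BankA"))  # True
print(is_custodian("BankB"))  # False
```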
Example 3: we can define the number of property-occurrences necessary for the individual to be considered member of the class.
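A cardinality condition of this kind can be sketched as a simple count. The individuals and the minimum of two are assumptions made for the example, and the count covers only explicitly stated, distinctly named values.

```python
PROP = "isCustodianForFinancialInstrument"

# Hypothetical facts.
triples = [
    ("BankA", PROP, "Bond_1"),
    ("BankA", PROP, "Share_1"),
    ("BankB", PROP, "Bond_1"),
]


def occurrences(individual: str, prop: str) -> int:
    """Count the distinct values stated for one property of one individual."""
    return len({o for s, p, o in triples if s == individual and p == prop})


def meets_minimum(individual: str, prop: str, minimum: int) -> bool:
    """True if the property occurs at least `minimum` times."""
    return occurrences(individual, prop) >= minimum


print(meets_minimum("BankA", PROP, 2))  # True
print(meets_minimum("BankB", PROP, 2))  # False
```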
Sixth, we can enrich the properties to allow more expressiveness on the concepts.
- By declaring the range for which a property is valid.
E.g. with the object property “is custodian for financial instrument”, we say the right side of the expression must be a member of the class FinancialInstrument:
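A range declaration is itself a triple using the standard term `rdfs:range`. The sketch below also shows the kind of conclusion a reasoner may draw from it: anything on the right side of the property is inferred to be a FinancialInstrument. The individuals `BankA` and `Bond_1` are invented for the example.

```python
facts = {
    # The range declaration itself.
    ("isCustodianForFinancialInstrument", "rdfs:range", "FinancialInstrument"),
    # One use of the property between two hypothetical individuals.
    ("BankA", "isCustodianForFinancialInstrument", "Bond_1"),
}


def infer_types_from_range(
    triples: set[tuple[str, str, str]],
) -> set[tuple[str, str, str]]:
    """Derive rdf:type statements for the right side of ranged properties."""
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    return {
        (o, "rdf:type", ranges[p])
        for s, p, o in triples
        if p in ranges
    }


inferred = infer_types_from_range(facts)
print(inferred)
```

The inferred statement was never entered by hand, which is a small instance of getting more out of the machine than was put in.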
What is RDF ?
RDF stands for Resource Description Framework.
A resource is anything that can appear as a subject in our triple store.
The description is made through the predicates and objects.
RDF is a W3C standard. It forms the basis of the technologies that allow knowledge modelling in which the description is both computer interpretable and human readable. OWL is built on top of RDF to enhance its expressivity.
Through inferences of reasoning robots, we get more out of the machine than we put in.
Where can I learn more about semantic data ?
Outside our website, there are:
- The book "Semantic Web for the Working Ontologist" by Dean Allemang and Jim Hendler.
Residents of the EU: I had to pay an additional 10 EUR in customs clearance costs, probably because I chose the regular (Belgian) postal service, the cheapest option.
I therefore suggest asking for express delivery and/or comparing the local Amazon sites. There are quite some price differences too.
- After that aperitif, get professional by attending courses with e.g. TopQuadrant.