There seem to be two fundamentally different roles that apply to the use of the term semantics within the business community. These two roles can be identified as the dictionary role and the classification role.
When applying semantics to business processes we need to identify the relationships which exist between members of a set of data objects that form a message within a business process. The failure to describe such relationships as part of the semantics associated with business messages has been one of the causes for misunderstandings between users of existing message sets. This paper seeks to identify what needs to be recorded about the relationships between semantics that are members of a set of semantics.
In the dictionary role of semantics each semantic is considered as a single object. The description of the object is derived without reference to the context in which the object is to be used. Therefore the description itself contains information relating to the context(s) in which it is expected to be employed. For example, the ISO Basic Semantic Register (BSR) defines the Basic Semantic Unit (BSU) AccountsPayableDepartment.Contact.Telephone.Identifier as "The identifier of the telephone for the contact in the accounts payable department."
Data dictionaries of the type developed for BSR make it difficult to reuse data in different contexts. Even if there is an underlying, reusable Basic Semantic Component (BSC), such as Contact.Telephone.Identifier, the existence of a separate dictionary entry for each context in which the identifier is applied makes it difficult to recognize the relationship between all entries that share the BSC.
Note: In the case of BSR this is not helped by the introduction of alternative BSCs, as in Party.Person.Telephone.DirectLine.Identifier, and alternative forms, as in Person.Internal.Telephone.Number.
When defining a data dictionary that uniquely identifies the role played by each data object, the maximum number of distinguishing properties has to be defined.
In the classification role of semantics each term is considered as a member of a hierarchy of concepts. The description of the term identifies a set of characteristics that distinguish this classification from its peers within the higher levels of classification of which it has been designated a part. In technical terms, the classification identifies those properties that mark this term as a hyponym (sub-classification) of its parent hypernym (super-classification). For example, within ebXML core components the definition of Quantity within the context of Packaging is given as "Number of packages".
For classification schemes the properties defined at each level in the hierarchy are additive. Therefore the total definition of the Quantity concept within a specific message is dependent not only on the fact that we are classifying packages, but also on the fact that packaging is a sub-classification (hyponym) of the properties that apply to Products, which are in turn a sub-classification of the properties that apply to the business process of (say) Ordering.
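The additive behaviour described above can be sketched in a few lines of Python. This is an illustration only: the `SCHEME` table, the `total_definition` function, and every property string except "Number of packages" are hypothetical names invented for this sketch and are not taken from ebXML.

```python
# Hypothetical classification scheme. Each term records its parent (hypernym)
# and only the properties that distinguish it from its peers.
SCHEME = {
    "Ordering": {"parent": None, "properties": ["business process: ordering"]},
    "Products": {"parent": "Ordering", "properties": ["describes products"]},
    "Packaging": {"parent": "Products", "properties": ["describes packaging"]},
    "Quantity": {"parent": "Packaging", "properties": ["Number of packages"]},
}


def total_definition(term, scheme=SCHEME):
    """Collect the properties of `term` and all of its hypernyms."""
    chain = []
    while term is not None:
        entry = scheme[term]
        chain.append(entry["properties"])
        term = entry["parent"]
    # Reverse so that the most general properties come first.
    return [prop for props in reversed(chain) for prop in props]


definition = total_definition("Quantity")
```

Each level contributes only its own distinguishing properties, and the full definition is obtained by walking up the hierarchy.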
When defining a semantic for use within a classification scheme, only the minimum set of properties needed to distinguish this concept from its peers should be defined.
When a semantic concept is to be applied as part of a business message it is important to record the relationships between the members (meronyms) of a set of semantics (holonyms).
For example, in most business processes there are at least two parties. Different business processes require that these parties be assigned different role identifiers. Each party needs to be assigned a role within the message. Some of these roles will need to be unique. For example, we cannot have two purchasers for an order, but we may be able to have two clients for a single insurance policy. Therefore for each of the party members within our message we need to be able to determine the relationship of that party to the parent message.
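One way to picture the uniqueness constraint on roles is as a small rule table checked against the parties in a message. The sketch below is hypothetical: the names `MAX_PER_ROLE` and `role_violations` are invented for illustration, and the only rule taken from the text is that an order has at most one purchaser.

```python
from collections import Counter

# Hypothetical rule table: maximum number of parties allowed per role.
# Roles that are not listed (for example "client") may repeat freely.
MAX_PER_ROLE = {"purchaser": 1}


def role_violations(roles, limits=MAX_PER_ROLE):
    """Return the roles that occur more often than their limit allows."""
    counts = Counter(roles)
    return sorted(
        role for role, count in counts.items()
        if role in limits and count > limits[role]
    )


order_problems = role_violations(["purchaser", "purchaser", "supplier"])
policy_problems = role_violations(["client", "client", "insurer"])
```

Here the order with two purchasers is flagged, while the policy with two clients is not.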
Other types of relationships are more dependent on the business process being carried out. For example, when asking for a description of products or services prior to purchasing, either in the form of a catalogue or a quotation, more information needs to be provided than is the case when you are referring to the same items for the purpose of ordering or invoicing. Therefore the set of members from a particular "whole" that is required is often dependent on the context.
We therefore need to consider how to record the relationship between a term and those terms it forms a part of, and those terms that form a part of it. The way we do this must depend on the way in which the terms are to be used. For what follows it will be presumed that the terms will be used in a hierarchical environment, such as that provided by the eXtensible Markup Language (XML). This restriction has been chosen because in such environments it is possible to determine the set of properties that applies at any point in a message by interrogating the properties of each of the ancestors of the term being applied in turn to produce an additive list of applicable properties.
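The interrogation of ancestors can be sketched with Python's standard `xml.etree.ElementTree` module. The message, its attribute names and values, and the function name `inherited_properties` are all assumptions made for this illustration. Modelling properties as XML attributes is only one possible encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; element and attribute names are illustrative only.
MESSAGE = """
<order identifier="PO-1001" date="2001-01-24">
  <purchaser>
    <address country="GB">
      <street>High Street</street>
    </address>
  </purchaser>
</order>
"""


def inherited_properties(root, target):
    """Return (name, attributes) pairs from the root down to `target`."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    chain = []
    node = target
    while node is not None:
        chain.append((node.tag, dict(node.attrib)))
        node = parent_of.get(node)
    chain.reverse()
    return chain


root = ET.fromstring(MESSAGE)
street = root.find("./purchaser/address/street")
properties = inherited_properties(root, street)
```

The resulting list runs from the outermost container to the term itself, which is the additive list of applicable properties the paragraph describes.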
Let us try to identify the properties of a simple purchase order. The order consists of the following subcomponents:
order
 |- purchaser
 |- supplier
 |- product
 |- product
 |- ...
The properties of order could be identifier, date and total-value.
The properties of purchaser could be name, address, contact-numbers and delivery-details.
The properties of supplier could be name, address and contact-numbers.
The properties of product could be identifier, description, quantity and item-cost.
The total set of properties of the order is the sum of all the properties of its component parts. The set of properties of the subcomponents, on the other hand, is not the sum of the properties of the part and those of its parent, because I have deliberately chosen, for this illustrative case, to assign total-value as a property of the order rather than a component part of it. Despite this, it should be noted that to create a unique reference to this instance of the details provided for the purchaser you need to include the fact that these are the details provided for the order with the unique identifier specified for its parent.
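The order example can be written down as a nested data structure to show how the total set of properties is accumulated. The property lists are those given in the example. The dictionary layout, the dotted naming convention and the function name `all_properties` are assumptions made for this sketch.

```python
# Property lists taken from the purchase order example; the nesting and the
# dotted naming convention are assumptions made for illustration.
ORDER = {
    "name": "order",
    "properties": ["identifier", "date", "total-value"],
    "parts": [
        {
            "name": "purchaser",
            "properties": ["name", "address", "contact-numbers",
                           "delivery-details"],
            "parts": [],
        },
        {
            "name": "supplier",
            "properties": ["name", "address", "contact-numbers"],
            "parts": [],
        },
        {
            "name": "product",
            "properties": ["identifier", "description", "quantity",
                           "item-cost"],
            "parts": [],
        },
    ],
}


def all_properties(component, prefix=""):
    """List the qualified properties of a component and all of its parts."""
    path = prefix + component["name"]
    names = [f"{path}.{prop}" for prop in component["properties"]]
    for part in component["parts"]:
        names.extend(all_properties(part, path + "."))
    return names


order_properties = all_properties(ORDER)
```

Note that total-value appears only once, at the level of the order, so it is not among the properties listed for any of the parts.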
But if we go down a level we find another set of issues. For example, the address component could have the following set of properties: room, floor, building, street, place, region and country. Given this, how do we distinguish between the street name assigned to the supplier and that assigned to the purchaser, and does either of them have any relevance on its own? The only way to distinguish a property assigned to the purchaser from a similarly named property of the supplier is to treat the role played by the party as a property of the term. Again, I have been deliberately provocative in my choice of terms. If I had chosen to call them both a party, and assigned that a property of role, the fact that this property would be part of the cumulative set of inherited properties would have been obvious. But this is not the most efficient way to code the message using XML. If we use the role name as the name of the container, rather than the name of its generic type, we can simplify the message without losing any processing power, as long as we are aware that the name assigned to a container is one of its properties. Therefore the set of properties used to uniquely identify a street entry within our simplified order is: order, identifier, purchaser or supplier, and address. Note that this name is as fully qualified as any of those in the data dictionary examples taken from the Basic Semantic Register. Yet we do not need to create a separate dictionary entry for each occurrence of a street, as the context in which each occurrence is found automatically defines the full dictionary definition.
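A minimal sketch of deriving such a fully qualified name from context, rather than from a dictionary entry, is given below. It assumes that the order identifier is carried as an XML attribute and uses a hypothetical bracket-and-dot notation. The sample values and the function name `qualified_names` are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical simplified order; role names are used as container names.
ORDER_XML = """
<order identifier="PO-1001">
  <purchaser><address><street>1 High Street</street></address></purchaser>
  <supplier><address><street>9 Mill Lane</street></address></supplier>
</order>
"""


def qualified_names(element, context=()):
    """Yield (qualified name, text) for every leaf element."""
    label = element.tag
    # Assumption: an identifier, where present, is stored as an attribute.
    if "identifier" in element.attrib:
        label += "[" + element.attrib["identifier"] + "]"
    path = context + (label,)
    children = list(element)
    if not children:
        yield ".".join(path), (element.text or "").strip()
    for child in children:
        yield from qualified_names(child, path)


names = dict(qualified_names(ET.fromstring(ORDER_XML)))
```

The two street entries share a single definition of street, yet receive distinct names because the container names along the path differ.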
By now you may have identified what is happening in the above example. The names assigned to each of the elements in the hierarchy shown above are, in fact, names of relationships. This may not always be clear. For example, whilst it is clear that purchaser and supplier have a relationship with the process of ordering, it is not immediately clear that the same holds true for the repeatable product object. But remember that purchaser and supplier are actually relationships assigned as the roles of a generic object, a party. If we look at the properties inherited by product we see that it is part of an order with a specific identifier. But what distinguishes the properties of one product from the same properties of its siblings? In this case we need to record not only the name of the relationship, but its numeric position within the set of siblings. This number in turn defines a relationship. It defines the first product ordered, the second product ordered, etc. So products do have a relationship with an order, but in this case the relationship is recorded not by the name assigned to the term, but by its position with respect to its siblings.
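The positional relationship can be sketched as follows. The product descriptions, the attribute used for the order identifier and the function name `positional_relationships` are hypothetical and serve only to illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical order with a repeated product element.
ORDER_XML = """
<order identifier="PO-1001">
  <purchaser/>
  <supplier/>
  <product><description>Widget</description></product>
  <product><description>Gadget</description></product>
</order>
"""


def positional_relationships(parent, tag):
    """Pair each repeated child with its 1-based position among its siblings."""
    return list(enumerate(parent.findall(tag), start=1))


order = ET.fromstring(ORDER_XML)
ordered = [
    (position, product.findtext("description"))
    for position, product in positional_relationships(order, "product")
]
```

The position plays the same part for product that the role name plays for purchaser and supplier: it is the property that tells one occurrence from another.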
What can we conclude from the above example? I would contend that:
The properties of a business object include the properties of its children.
The properties of each part of the message include some, but not necessarily all, of the properties of its parents.
Relationships between parts of messages can be recorded as the names of container terms within the hierarchy of terms.
Relationship names form synonyms for generic property sets.
Generic property sets should only be used directly within a message where there is only a single role that the property set can play.
File created: 24th January 2001