Formal Definition of Semantic Concepts

Martin Bryan, The SGML Centre

This paper suggests how the definition of a set of semantics can be based on an understanding of semantics currently employed by ontologists. The paper consists of:

  1. An explanation of the ontological concepts involved
  2. A form for recording the semantics of a concept
  3. An XML DTD for exchanging semantic records

1. Creating Formal Definitions of Semantic Concepts

The meanings of some of the technical terms used in this paper are defined here so that all readers understand the sense in which I am applying them:

concept
An idea or thought that corresponds to some distinct entity or class of entities, or to its essential features, or determines the application of a term, and thus plays a part in the use of reason or language (The New Oxford Dictionary of English)
holonym
A concept of which this concept forms a part
hypernym
Word with a broad meaning which more specific words fall under: a superordinate (Ibid)
hyponym
Word of more specific meaning (Ibid)
meronym
A term that denotes part of something: a member of an information set
ontology
The branch of metaphysics dealing with the nature of being (Ibid)
semantic
Relating to meaning in language or logic (Ibid)
synonym
A word or phrase that means exactly or nearly the same as another word or phrase in the same language (Ibid)
Note: Also used in this paper to cover words in another language which have exactly or nearly the same meaning.
whole
All of something (Ibid): a term used to identify a concept that consists of multiple parts
XML
The Extensible Markup Language defined by the World Wide Web Consortium (W3C)
XML Links
The XML Linking Language defined by W3C
XML Simple Link
XML Link that identifies the location of a single resource
XML Extended Link
XML Link that can identify data from more than one resource

The terms synonym, meronym and holonym are not used here in their strict dictionary senses. This is partly because this paper is specifically concerned with the development of multilingual semantic sets, and therefore with the relationship between terms in different languages, and partly because the relatively new terms meronym and holonym are applied inconsistently. (The latter term is not even defined in any of the current Oxford Dictionaries!) For the purposes of this paper the narrow definitions sometimes applied to meronym and holonym have been extended to cover the concepts of "parts" and "wholes" respectively.

The following diagram illustrates the relationship between the component parts of the semantic model used in this paper:

Relationship between Concept, Synonym, Hypernym, Hyponym, Holonym and Meronym
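
The relationships in the model can also be read as data. The following is a minimal Python sketch, not part of the original proposal: the class name Concept, its field names and the helper missing_inverses are illustrative assumptions. It treats hypernym/hyponym and holonym/meronym as inverse pairs and reports any relationship that has been recorded in one direction only.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node in the semantic model (names here are illustrative only)."""
    title: str
    synonyms: list[str] = field(default_factory=list)
    hypernyms: list[str] = field(default_factory=list)  # broader terms
    hyponyms: list[str] = field(default_factory=list)   # narrower terms
    holonyms: list[str] = field(default_factory=list)   # wholes this concept is part of
    meronyms: list[str] = field(default_factory=list)   # parts of this concept


# Each relation paired with the relation expected on the target concept.
INVERSE = {
    "hypernyms": "hyponyms",
    "hyponyms": "hypernyms",
    "holonyms": "meronyms",
    "meronyms": "holonyms",
}


def missing_inverses(concepts: dict[str, Concept]) -> list[tuple[str, str, str]]:
    """Return (concept, relation, expected entry) triples for one-way links."""
    missing = []
    for name, concept in concepts.items():
        for relation, inverse in INVERSE.items():
            for target in getattr(concept, relation):
                other = concepts.get(target)
                # Targets that are not in this register are simply skipped.
                if other is not None and name not in getattr(other, inverse):
                    missing.append((target, inverse, name))
    return missing


address = Concept("Address", hypernyms=["Location"], meronyms=["Street"])
location = Concept("Location", hyponyms=["Address"])
street = Concept("Street")  # does not yet record Address as a holonym
concepts = {c.title: c for c in (address, location, street)}

print(missing_inverses(concepts))
```

In this small register the check reports that Street should list Address among its holonyms, because Address already lists Street as a meronym.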

2. Form for the Definition of Semantic Concepts

The form shown in this section is designed to formalize the capture of semantics that are to be shared across semantic registries. It is designed to be used as an input source for information that will be used to complete an interchangeable XML description of the concept.

Semantic Concept Definition Form

 1. Title (Principal Name of Concept)  Language  Domains Concept Is Used In


 2. Alternative Names (Synonyms)
Alternative Name  Language  Domains Concept Is Used In






 
 
 3. Formal Definition of Concept








 4. Titles of Broader Terms (Hypernyms)




 5. Titles of Narrower Terms (Hyponyms)




 6. Titles of Concepts this Concept Forms a Part Of (Holonyms)




 7. Titles of Concepts that Form a Part of this Concept (Meronyms)




The following is an example of a completed definition:

 1. Title (Principal Name of Concept)  Language
  Address                              EN
  Adresse                              DE
  Domains Concept Is Used In: Commerce, Correspondence

 2. Alternative Names (Synonyms)
  Alternative Name  Language  Domains Concept Is Used In
  Deliver To        EN        Transportation
  Anschrift         DE        Correspondence
  Privatadresse     DE        Personnel Details



 3. Formal Definition of Concept
  Information objects used to identify where a person, organization or building
 is located.






 4. Titles of Broader Terms (Hypernyms)
  Location
  Place


 5. Titles of Narrower Terms (Hyponyms)
  Postal Address
  Delivery Address


 6. Titles of Concepts this Concept Forms a Part Of (Holonyms)
  Order
  Invoice
  Statement
  Letter
  Delivery Note
  Personnel Details


 7. Titles of Concepts that Form a Part of this Concept (Meronyms)
  RoomID
  BuildingID
  Street
  City
  Region
  Country



3. XML DTD for Exchanging Concept Definitions

The following XML document type definitions (DTDs) define a set of elements that can be used to interchange the contents of a semantic register. Two DTDs are currently defined, one for each of two types of XML Link: Simple Links and Extended Links. It is anticipated that a third DTD, showing how the model can be expressed as an ISO 13250 Topic Map, will be defined later.

For the purposes of interchange it is presumed that each reference to a related concept entered into boxes 4 to 7 will be replaced by a link to the formal definition of the appropriate concept. It is further presumed that each reference to a domain a term is used in will be replaced by a link to a formal definition of the domain.
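
As a hedged illustration of this replacement step, the sketch below turns two box 4 entries into linked elements using Python's standard xml.etree.ElementTree module. The lookup table DEFINITION_LOCATIONS and the helper names are hypothetical; the two locations and the namespace URI are the ones used in the DTDs and examples in this paper.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the link-atts entity in the DTDs of this paper.
XLINK_NS = "http://www.w3.org/XML/XLink/0.9"
ET.register_namespace("xlink", XLINK_NS)

# Hypothetical lookup from a title entered on the form to the location
# of the formal definition of that concept.
DEFINITION_LOCATIONS = {
    "Location": "GenericConcepts.xml#id('Location')",
    "Place": "GeographicConcepts.xml#id('Place')",
}


def link_element(tag: str, label: str, href: str) -> ET.Element:
    """Build an element whose xlink:href points at a formal definition."""
    element = ET.Element(tag, {f"{{{XLINK_NS}}}href": href})
    element.text = label
    return element


def broader_terms(titles: list[str]) -> list[ET.Element]:
    """Replace each box 4 title with a BroaderTerm link element."""
    return [
        link_element("BroaderTerm", title, DEFINITION_LOCATIONS[title])
        for title in titles
    ]


elements = broader_terms(["Location", "Place"])
serialized = [ET.tostring(element, encoding="unicode") for element in elements]
for line in serialized:
    print(line)
```

A title with no entry in the lookup table raises KeyError here, which seems a reasonable way to flag a reference that has no formal definition yet.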

DTD using Simple XML Links

<!DOCTYPE ConceptList [

<!-- Document type for the definition of concepts
        Version 1 - Using XML Simple Links -->
<!-- First Draft developed by Martin Bryan, The SGML Centre, 1999-9-18 -->
<!-- This DTD contains two options at most points in the content model.
        The first name is designed to indicate to an untrained user what
        information is required at this point in the model.
        The second name indicates the formal name for the information
        as used by ontologists. This is intended to provide a more
        formal alternative to the popular element name.         -->

<!ENTITY % Text "(#PCDATA|div|p|ul|ol|table|h3|h4|h5|h6)*"        >
<!ENTITY % HTML-text-elements SYSTEM  "XHTML.DTD"                 >
<!ENTITY % link-atts '
  xmlns:xlink   CDATA                         #FIXED "http://www.w3.org/XML/XLink/0.9"
  xlink:type    (simple|extended|locator|arc) #FIXED "simple"
  xlink:role    CDATA                         #IMPLIED
  xlink:title   CDATA                         #IMPLIED
  xlink:show    (new|parsed|replace)          "replace"
  xlink:actuate (user|auto)                   "user"         '    >
<!ENTITY % ISO8601Date "CDATA"                                    >
<!ENTITY % metadata '
  RecordedBy    CDATA                         #IMPLIED
  WhenRecorded  %ISO8601Date;                 #IMPLIED       '    >


<!ELEMENT ConceptList        (Concept+, Domain*)                  >

<!ELEMENT Concept            ((Title|PrincipleName)+, ConceptDomainRef*,
                              ((AlternativeName|Synonym), AlternateDomainRef?)*,
                              Definition,
                              (BroaderTerm|Hypernym)*,
                              (NarrowerTerm|Hyponym)*,
                              (FormsPartOf|Holonym)*,
                              (HasPart|Meronym)*)                 >
<!ATTLIST Concept            ID              ID        #REQUIRED
                             %metadata;                           >

<!-- No two titles for the same entry may have the same value for
     the xml:lang attribute -->
<!ELEMENT Title              (#PCDATA)                            >
<!ATTLIST Title              xml:lang        CDATA     "EN"
                             %metadata;                           >
<!ELEMENT PrincipleName      (#PCDATA)                            >
<!ATTLIST PrincipleName      xml:lang        CDATA     "EN"
                             %metadata;                           >

<!ELEMENT ConceptDomainRef   (#PCDATA)                            >
<!ATTLIST ConceptDomainRef   xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT AlternativeName    (#PCDATA)                            >
<!ATTLIST AlternativeName    xml:lang        CDATA     "EN"
                             %metadata;                           >
<!ELEMENT Synonym            (#PCDATA)                            >
<!ATTLIST Synonym            xml:lang        CDATA     "EN"
                             %metadata;                           >

<!ELEMENT AlternateDomainRef (#PCDATA)                            >
<!ATTLIST AlternateDomainRef xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT Definition         %Text;                               >
<!ATTLIST Definition         %metadata;                           >

<!ELEMENT BroaderTerm        (#PCDATA)                            >
<!ATTLIST BroaderTerm        xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >
<!ELEMENT Hypernym           (#PCDATA)                            >
<!ATTLIST Hypernym           xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT NarrowerTerm       (#PCDATA)                            >
<!ATTLIST NarrowerTerm       xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >
<!ELEMENT Hyponym            (#PCDATA)                            >
<!ATTLIST Hyponym            xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT FormsPartOf        (#PCDATA)                            >
<!ATTLIST FormsPartOf        xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >
<!ELEMENT Holonym            (#PCDATA)                            >
<!ATTLIST Holonym            xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT HasPart            (#PCDATA)                            >
<!ATTLIST HasPart            xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >
<!ELEMENT Meronym            (#PCDATA)                            >
<!ATTLIST Meronym            xlink:href      CDATA     #REQUIRED
                             %link-atts;
                             %metadata;                           >

<!ELEMENT Domain             (Title+, Definition?)                >
<!ATTLIST Domain             ID              ID        #REQUIRED
                             %metadata;                           >

%HTML-text-elements;

]>
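
The comment in the DTD states that no two titles for the same entry may share an xml:lang value. A DTD cannot enforce that rule, so it has to be checked separately. The following Python sketch shows one possible check; the function name is hypothetical, and it assumes that a title without an explicit xml:lang takes the DTD default of "EN" and that language codes are compared without regard to case.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree reports xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
TITLE_TAGS = ("Title", "PrincipleName")


def duplicate_title_languages(concept: ET.Element) -> list[str]:
    """Return the language codes used by more than one title of a concept."""
    languages = Counter(
        child.get(XML_LANG, "EN").upper()  # assumed default, as in the DTD
        for child in concept
        if child.tag in TITLE_TAGS
    )
    return sorted(lang for lang, count in languages.items() if count > 1)


good = ET.fromstring(
    '<Concept ID="Address">'
    "<PrincipleName>Address</PrincipleName>"
    '<Title xml:lang="DE">Adresse</Title>'
    "</Concept>"
)
bad = ET.fromstring(
    '<Concept ID="Address">'
    "<Title>Address</Title>"
    '<Title xml:lang="en">Location</Title>'
    "</Concept>"
)

print(duplicate_title_languages(good))
print(duplicate_title_languages(bad))
```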

The following example shows how the completed form shown above would be recorded with Version 1 of the DTD, using XML Simple Links:

<ConceptList>
<Concept ID="Address" RecordedBy="Martin Bryan" WhenRecorded="19990918">
<PrincipleName>Address</PrincipleName>
<Title xml:lang="DE">Adresse</Title>
<ConceptDomainRef xlink:href="#id('Domain1')">Commerce</ConceptDomainRef>
<ConceptDomainRef xlink:href="#id('Domain2')">Correspondence</ConceptDomainRef>
<Synonym RecordedBy="Matti Nystrom" WhenRecorded="19990922">Deliver To</Synonym>
<AlternateDomainRef xlink:href="#id('Domain3')">Transportation</AlternateDomainRef>
<AlternativeName RecordedBy="Gerhard Heine" WhenRecorded="19991002" xml:lang="DE">
Anschrift</AlternativeName>
<AlternateDomainRef xlink:href="#id('Domain2')">Correspondence</AlternateDomainRef>
<AlternativeName RecordedBy="Gerhard Heine" WhenRecorded="19991002" xml:lang="DE">
Privatadresse</AlternativeName>
<AlternateDomainRef xlink:href="#id('Domain4')">Personnel Details</AlternateDomainRef>
<Definition>Information objects used to identify where a person, organization
or building is located.</Definition>
<BroaderTerm xlink:href="GenericConcepts.xml#id('Location')">
Location</BroaderTerm>
<Hypernym xlink:href="GeographicConcepts.xml#id('Place')">
Place</Hypernym>
<NarrowerTerm xlink:href="GenericConcepts.xml#id('Post')">
Postal Address</NarrowerTerm>
<Hyponym xlink:href="TransportationConcepts.xml#id('DeliveryPoint')">
Delivery Address</Hyponym>
<FormsPartOf xlink:href="CommercialConcepts.xml#id('Order')">
Order</FormsPartOf>
<FormsPartOf xlink:href="CommercialConcepts.xml#id('Invoice')">
Invoice</FormsPartOf>
<FormsPartOf xlink:href="CommercialConcepts.xml#id('Statement')">
Statement</FormsPartOf>
<FormsPartOf xlink:href="CommercialConcepts.xml#id('Letter')">
Letter</FormsPartOf>
<FormsPartOf xlink:href="CommercialConcepts.xml#id('DeliveryNote')">
Delivery Note</FormsPartOf>
<Holonym xlink:href="PersonnelConcepts.xml#id('PrivateAddress')">
Personnel Details</Holonym>
<HasPart xlink:href="LocationConcepts.xml#id('RoomID')">
RoomID</HasPart>
<HasPart xlink:href="LocationConcepts.xml#id('BuildingID')">
BuildingID</HasPart>
<HasPart xlink:href="LocationConcepts.xml#id('Street')">
Street</HasPart>
<HasPart xlink:href="LocationConcepts.xml#id('City')">
City</HasPart>
<HasPart xlink:href="LocationConcepts.xml#id('Region')">
Region</HasPart>
<Meronym xlink:href="GeographicConcepts.xml#id('Country')">
Country</Meronym>
</Concept>
...
<Domain ID="Domain1">
<Title>Commerce</Title>
<Definition>Information relating to the exchange of goods or services for
financial consideration</Definition>
</Domain>
<Domain ID="Domain2">
<Title>Correspondence</Title>
</Domain>
<Domain ID="Domain3">
<Title>Transportation</Title>
<Definition>Information relating to the movement of goods or equipment</Definition>
</Domain>
<Domain ID="Domain4">
<Title>Personnel Details</Title>
<Definition>Information relating to employees</Definition>
</Domain>
</ConceptList>
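
To show how a receiving system might follow these links, the sketch below parses a trimmed version of the example and resolves a same-document domain reference. It is an illustration only: SAMPLE, split_href and resolve_local are hypothetical names, the xlink prefix is declared explicitly on the root element because the sketch does not read the DTD, and only the id('...') form of locator used in this paper is handled.

```python
import re
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/XML/XLink/0.9}href"

# Trimmed instance; the namespace is declared inline for this sketch.
SAMPLE = """\
<ConceptList xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
<Concept ID="Address">
<PrincipleName>Address</PrincipleName>
<ConceptDomainRef xlink:href="#id('Domain1')">Commerce</ConceptDomainRef>
<Definition>Information objects used to identify where a person, organization
or building is located.</Definition>
<BroaderTerm xlink:href="GenericConcepts.xml#id('Location')">Location</BroaderTerm>
</Concept>
<Domain ID="Domain1">
<Title>Commerce</Title>
</Domain>
</ConceptList>
"""

ID_POINTER = re.compile(r"^(?P<doc>[^#]*)#id\('(?P<id>[^']+)'\)$")


def split_href(href: str) -> tuple[str, str]:
    """Split document#id('name') into (document, name)."""
    match = ID_POINTER.match(href)
    if match is None:
        raise ValueError(f"unsupported link: {href!r}")
    return match.group("doc"), match.group("id")


def resolve_local(root: ET.Element, href: str):
    """Return the element a same-document link points at, or None."""
    document, target_id = split_href(href)
    if document:
        return None  # link into another file; not fetched in this sketch
    for element in root.iter():
        if element.get("ID") == target_id:
            return element
    return None


root = ET.fromstring(SAMPLE)
domain_ref = root.find("Concept/ConceptDomainRef")
domain = resolve_local(root, domain_ref.get(XLINK_HREF))
print(domain.findtext("Title"))
```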

DTD using Extended XML Links

<!DOCTYPE ConceptList [

<!-- Document type for the definition of concepts
        Version 2 - Using XML Extended Links -->
<!-- First Draft developed by Martin Bryan, The SGML Centre, 1999-9-18 -->
<!-- This DTD contains two options at most points in the content model.
        The first name is designed to indicate to an untrained user what
        information is required at this point in the model.
        The second name indicates the formal name for the information
        as used by ontologists. This is intended to provide a more formal
        alternative to the popular element name.                -->

<!ENTITY % Text "(#PCDATA|div|p|ul|ol|table|h3|h4|h5|h6)*"     >
<!ENTITY % HTML-text-elements SYSTEM  "XHTML.DTD"              >
<!ENTITY % link-atts '
  xmlns:xlink   CDATA                         #FIXED "http://www.w3.org/XML/XLink/0.9"
  xlink:type    (simple|extended|locator|arc) #FIXED "extended"
  xlink:role    CDATA                         #IMPLIED
  xlink:title   CDATA                         #IMPLIED
  xlink:show    (new|parsed|replace)          "replace"
  xlink:actuate (user|auto)                   "user"         ' >
<!ENTITY % locator-atts '
  xmlns:xlink   CDATA                         #FIXED "http://www.w3.org/XML/XLink/0.9"
  xlink:type    (locator)                     #FIXED "locator"
  id            ID                            #IMPLIED
  xlink:href    CDATA                         #REQUIRED
  xlink:role    CDATA                         #IMPLIED
  xlink:title   CDATA                         #IMPLIED        ' >
<!ENTITY % ISO8601Date "CDATA"                                  >
<!ENTITY % metadata '
  RecordedBy    CDATA                         #IMPLIED
  WhenRecorded  %ISO8601Date;                 #IMPLIED       '  >

<!ELEMENT ConceptList        (Concept+, Domain*)                >

<!ELEMENT Concept            ((Title|PrincipleName)+, ConceptDomainRefs,
                              ((AlternativeName|Synonym), AlternateDomainRefs?)*,
                              Definition,
                              (BroaderTerms|Hypernyms)*,
                              (NarrowerTerms|Hyponyms)*,
                              (FormsPartOf|Holonyms)*,
                              (HasParts|Meronyms)*)               >
<!ATTLIST Concept            ID              ID        #REQUIRED
                             %metadata;                           >

<!-- No two titles for the same entry may have the same value for
     the xml:lang attribute -->
<!ELEMENT Title              (#PCDATA)                            >
<!ATTLIST Title              xml:lang        CDATA     "EN"
                             %metadata;                           >
<!ELEMENT PrincipleName      (#PCDATA)                            >
<!ATTLIST PrincipleName      xml:lang        CDATA     "EN"
                             %metadata;                           >

<!ELEMENT ConceptDomainRefs  (DomainRef+)                         >
<!ATTLIST ConceptDomainRefs  %link-atts;                          >

<!ELEMENT DomainRef          (#PCDATA)                            >
<!ATTLIST DomainRef          %locator-atts;
                             %metadata;                           >

<!ELEMENT AlternativeName    (#PCDATA)                            >
<!ATTLIST AlternativeName    xml:lang        CDATA     "EN"
                             %metadata;                           >
<!ELEMENT Synonym            (#PCDATA)                            >
<!ATTLIST Synonym            xml:lang        CDATA     "EN"
                             %metadata;                           >

<!ELEMENT AlternateDomainRefs (DomainRef+)                        >
<!ATTLIST AlternateDomainRefs %link-atts;                         >

<!ELEMENT Definition         %Text;                               >
<!ATTLIST Definition         %metadata;                           >

<!ELEMENT BroaderTerms       (BroaderTermRef|HypernymRef)+        >
<!ATTLIST BroaderTerms       %link-atts;                          >
<!ELEMENT Hypernyms          (BroaderTermRef|HypernymRef)+        >
<!ATTLIST Hypernyms          %link-atts;                          >

<!ELEMENT BroaderTermRef     (#PCDATA)                            >
<!ATTLIST BroaderTermRef     %locator-atts;
                             %metadata;                           >
<!ELEMENT HypernymRef        (#PCDATA)                            >
<!ATTLIST HypernymRef        %locator-atts;
                             %metadata;                           >

<!ELEMENT NarrowerTerms      (NarrowerTermRef|HyponymRef)+        >
<!ATTLIST NarrowerTerms      %link-atts;                          >
<!ELEMENT Hyponyms           (HyponymRef|NarrowerTermRef)+        >
<!ATTLIST Hyponyms           %link-atts;                          >

<!ELEMENT NarrowerTermRef    (#PCDATA)                            >
<!ATTLIST NarrowerTermRef    %locator-atts;
                             %metadata;                           >
<!ELEMENT HyponymRef         (#PCDATA)                            >
<!ATTLIST HyponymRef         %locator-atts;
                             %metadata;                           >


<!ELEMENT FormsPartOf        (PartOfRef|HolonymRef)+              >
<!ATTLIST FormsPartOf        %link-atts;                          >
<!ELEMENT Holonyms           (HolonymRef|PartOfRef)+              >
<!ATTLIST Holonyms           %link-atts;                          >

<!ELEMENT PartOfRef          (#PCDATA)                            >
<!ATTLIST PartOfRef          %locator-atts;
                             %metadata;                           >
<!ELEMENT HolonymRef         (#PCDATA)                            >
<!ATTLIST HolonymRef         %locator-atts;
                             %metadata;                           >

<!ELEMENT HasParts           (PartRef|MeronymRef)+                >
<!ATTLIST HasParts           %link-atts;                          >
<!ELEMENT Meronyms           (MeronymRef|PartRef)+                >
<!ATTLIST Meronyms           %link-atts;                          >

<!ELEMENT PartRef            (#PCDATA)                            >
<!ATTLIST PartRef            %locator-atts;
                             %metadata;                           >
<!ELEMENT MeronymRef         (#PCDATA)                            >
<!ATTLIST MeronymRef         %locator-atts;
                             %metadata;                           >

<!ELEMENT Domain             (Title+, Definition?)                >
<!ATTLIST Domain             ID              ID        #REQUIRED
                             %metadata;                           >

%HTML-text-elements;

]>

Using XML Extended Links the example above becomes:

<ConceptList>
<Concept ID="Address" RecordedBy="Martin Bryan" WhenRecorded="19990918">
<PrincipleName>Address</PrincipleName>
<Title xml:lang="DE">Adresse</Title>
<ConceptDomainRefs>
<DomainRef xlink:href="#id('Domain1')">Commerce</DomainRef>
<DomainRef xlink:href="#id('Domain2')">Correspondence</DomainRef>
</ConceptDomainRefs>
<Synonym RecordedBy="Matti Nystrom" WhenRecorded="19990922">
Deliver To</Synonym>
<AlternateDomainRefs>
<DomainRef xlink:href="#id('Domain3')">Transportation</DomainRef>
</AlternateDomainRefs>
<AlternativeName RecordedBy="Gerhard Heine" WhenRecorded="19991002" xml:lang="DE">
Anschrift</AlternativeName>
<AlternateDomainRefs>
<DomainRef xlink:href="#id('Domain2')">Correspondence</DomainRef>
</AlternateDomainRefs>
<AlternativeName RecordedBy="Gerhard Heine" WhenRecorded="19991002" xml:lang="DE">
Privatadresse</AlternativeName>
<AlternateDomainRefs>
<DomainRef xlink:href="#id('Domain4')">Personnel Details</DomainRef>
</AlternateDomainRefs>
<Definition>
Information objects used to identify where a person, organization
or building is located.</Definition>
<BroaderTerms>
<BroaderTermRef xlink:href="GenericConcepts.xml#id('Location')">
Location</BroaderTermRef>
<HypernymRef xlink:href="GeographicConcepts.xml#id('Place')">
Place</HypernymRef>
</BroaderTerms>
<NarrowerTerms>
<NarrowerTermRef xlink:href="GenericConcepts.xml#id('Post')">
Postal Address</NarrowerTermRef>
<HyponymRef xlink:href="TransportationConcepts.xml#id('DeliveryPoint')">
Delivery Address</HyponymRef>
</NarrowerTerms>
<FormsPartOf>
<PartOfRef xlink:href="CommercialConcepts.xml#id('Order')">
Order</PartOfRef>
<PartOfRef xlink:href="CommercialConcepts.xml#id('Invoice')">
Invoice</PartOfRef>
<PartOfRef xlink:href="CommercialConcepts.xml#id('Statement')">
Statement</PartOfRef>
<PartOfRef xlink:href="CommercialConcepts.xml#id('Letter')">
Letter</PartOfRef>
<PartOfRef xlink:href="CommercialConcepts.xml#id('DeliveryNote')">
Delivery Note</PartOfRef>
<HolonymRef xlink:href="PersonnelConcepts.xml#id('PrivateAddress')">
Personnel Details</HolonymRef>
</FormsPartOf>
<HasParts>
<PartRef xlink:href="LocationConcepts.xml#id('RoomID')">
RoomID</PartRef>
<PartRef xlink:href="LocationConcepts.xml#id('BuildingID')">
BuildingID</PartRef>
<PartRef xlink:href="LocationConcepts.xml#id('Street')">
Street</PartRef>
<PartRef xlink:href="LocationConcepts.xml#id('City')">
City</PartRef>
<PartRef xlink:href="LocationConcepts.xml#id('Region')">
Region</PartRef>
<MeronymRef xlink:href="GeographicConcepts.xml#id('Country')">
Country</MeronymRef>
</HasParts>
</Concept>
...
<Domain ID="Domain1">
<Title>Commerce</Title>
<Definition>Information relating to the exchange of goods or services for
financial consideration</Definition>
</Domain>
<Domain ID="Domain2">
<Title>Correspondence</Title>
</Domain>
<Domain ID="Domain3">
<Title>Transportation</Title>
<Definition>Information relating to the movement of goods or equipment</Definition>
</Domain>
<Domain ID="Domain4">
<Title>Personnel Details</Title>
<Definition>Information relating to employees</Definition>
</Domain>
</ConceptList>
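
With Extended Links each relationship is a container holding several locators, so a consumer can read a whole group at once. The Python sketch below, again using the standard xml.etree.ElementTree module, collects the locators of each container in a trimmed Concept. The names SAMPLE and locators_by_link are hypothetical, and the xlink prefix is declared inline because the sketch does not read the DTD.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/XML/XLink/0.9}href"

# Trimmed Concept from the extended-link example; namespace declared inline.
SAMPLE = """\
<Concept ID="Address" xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
<PrincipleName>Address</PrincipleName>
<FormsPartOf>
<PartOfRef xlink:href="CommercialConcepts.xml#id('Order')">
Order</PartOfRef>
<HolonymRef xlink:href="PersonnelConcepts.xml#id('PrivateAddress')">
Personnel Details</HolonymRef>
</FormsPartOf>
<HasParts>
<PartRef xlink:href="LocationConcepts.xml#id('Street')">
Street</PartRef>
</HasParts>
</Concept>
"""


def locators_by_link(concept: ET.Element) -> dict[str, list[tuple[str, str]]]:
    """Map each extended-link container to its (label, href) locators."""
    groups: dict[str, list[tuple[str, str]]] = {}
    for link in concept:
        locators = [
            ((locator.text or "").strip(), locator.get(XLINK_HREF))
            for locator in link
            if locator.get(XLINK_HREF) is not None
        ]
        if locators:  # elements without locators, such as titles, are skipped
            groups.setdefault(link.tag, []).extend(locators)
    return groups


groups = locators_by_link(ET.fromstring(SAMPLE))
for container, locators in groups.items():
    print(container, locators)
```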