Measuring the Dimensions of a Language

© Martin Bryan, IS-Thought, 2003

This paper looks at how we comprehend the relationships between words, and how different types of words have different types of measurable dimensions. It considers:

The Time of Verbs

Verbs are the parts of language used to describe actions and events. Actions and events have a start time and an end time, which can be contiguous, and may be difficult to exactly determine, but will always be relative to some perception of the event they describe. To see what this means, consider the ways in which the period of rain that occurred yesterday between 11:00 and 14:00 was described. In the morning the local weather forecast stated that "It will rain around lunchtime today". When I rang someone at lunchtime I told them "It is raining". Now I am telling you that "It rained yesterday". Note that the form of the verb used depends on the relative time of the statement with respect to the fixed time of the event. For verbs the form does not depend, in English, on the actual time of the event, though there are a few phrases that do. For example, whether I refer to an event as happening in the morning, afternoon or evening depends on the actual time it took place. But some events cannot be described so accurately. For example, the rain referred to above started in the morning and ended in the afternoon, but as the weather forecast predicted, it occurred around, and during, lunchtime, an amorphous category of time that occurs on the borders of mornings and afternoons. And when it is lunchtime in England it is breakfast time in America, and supper time in Australia. The timing of events is, therefore, nearly always relative to the lifetime of the person describing the events, and to get any statement of events in context we have to understand the timeline of the person describing the event.

Given this, how can we record the timing of events accurately? What is the relationship between the tense of a verb and the actual timing of events and how can this be used to position events in a timeline? The use of the future tense tells us that the event will start and end at a time after the time at which the statement was made. The use of the present tense tells us that the start of the event preceded the time at which the statement was made, but will end after the statement was made. The use of the past tense tells us that both the start and end of the event occurred before the statement was made. But what is the relationship between the statement being made and it being recorded? Is the time of the documentary record evidence of the time of the actual event? While in factual material the tense of the documentary record is often sufficient to determine the relative time of an event, for fictional material, where a timeline is being created by an author, the tense of verbs only allows us to determine the relative position of events with respect to other events. Unless one or more of these events is associated with an absolute time we have difficulty in placing them accurately on a timeline.
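The three tense rules above can be read as constraints on where a statement falls relative to an event's interval. The following Python sketch is one way to express them; the function name tense_for, the use of datetime values and the calendar date chosen for the rain example are all illustrative assumptions rather than part of any proposal made in this paper.

```python
from datetime import datetime


def tense_for(start: datetime, end: datetime, statement: datetime) -> str:
    """Return the tense implied by the time of a statement relative to an event."""
    if end < start:
        raise ValueError("an event cannot end before it starts")
    if statement < start:
        return "future"   # the event starts and ends after the statement
    if statement <= end:
        return "present"  # the event has started but has not yet ended
    return "past"         # the event started and ended before the statement


# The rain that fell between 11:00 and 14:00; the date is an arbitrary assumption.
rain_start = datetime(2003, 1, 1, 11, 0)
rain_end = datetime(2003, 1, 1, 14, 0)

forecast = tense_for(rain_start, rain_end, datetime(2003, 1, 1, 7, 30))     # "It will rain"
phone_call = tense_for(rain_start, rain_end, datetime(2003, 1, 1, 12, 30))  # "It is raining"
next_day = tense_for(rain_start, rain_end, datetime(2003, 1, 2, 9, 0))      # "It rained"
```

The mapping only runs cleanly in this direction. Going from a tense and a statement time back to the event yields an inequality on the interval, not its endpoints, which is why at least one event has to be tied to an absolute time before the others can be placed.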

The Space of Nouns

Nouns are the way in which we identify sets of similar instances of "things". They are the labels that allow us to be able to generalize about the world around us rather than having to refer to each specific instance of a thing by a separate identifier. Prisoner 1234 is an individual, but the fact that he is classified as a prisoner tells us how we should react to his presence. Nouns do not have a tense associated with them. In English we do not change the way in which we refer to a prisoner depending on whether he will be a prisoner, is a prisoner or has been a prisoner, even though many who have been prisoners might wish that we could distinguish between these states. As the saying goes, "Once a prisoner always a prisoner".

1 It would be interesting to study how the language of astronauts changes when they have lived for a while in an environment in which there is no "up" or "down". Will they start referring to positions differently? Will they start to assign absolute positions rather than relative ones?

While they are not time-dependent, nouns are space-dependent. How we describe individuals with respect to their nouns is clearly determined by their spatial position, which in turn is determined relative to the observer recording a particular instance. To see what this means, consider a stepped pyramid of cubes, with a base with 5 cubes on each side, followed by a step with 4 cubes per side, a step with 3 cubes per side, and a step with 2 cubes per side capped by a single cube. The centre of the pyramid is clearly the middle cube of the third step. Above that are the four corners of the cubes making up the fourth step, and immediately above these is the single cube that forms the top of the pyramid. Immediately underneath the centre cube are the four corners of the cubes that form the centre of the second step, and the centre cube of the lowermost step. It should be noted, however, that whether something is above or below something else is determined by its position with respect to the direction of gravity. Something that is above something else can fall from, or on, it. Something that is below something else cannot fall on it, but can only serve to resist its fall. If you don't find this convincing, try building the stepped pyramid with only one cube at the bottom: you will find it a much more complicated task than building it the natural way up! Humans invariably refer to things that are up and down relative to each other with respect to the way in which gravity affects the things being referred to.1

But what about the other direction? Does that get treated in exactly the same way? Consider our fourth step, the one made up of four cubes. If we look at the pyramid from one side we see two cubes on the right of the centre line and two on the left, and we can make a clear distinction between the cube in front and the one at the back on each side. But what happens if we look at the pyramid from the opposite side? Now the cube that was clearly on the right at the back before is equally clearly at the front on the left of the square. If we look at the same cube from the other sides it will never be both on the right and at the back at the same time. Only an observer standing on one side can ever see this particular combination of relative positions. Observations made in the horizontal plane are always relative to the position of the observer.
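The dependence of left, right, front and back on the observer can be made concrete with a small Python sketch. The compass names for the four sides, the grid coordinates and the function name describe are assumptions introduced purely for illustration.

```python
def describe(cube, side, n=2):
    """Describe a cube at grid position (x, y) as seen from one side of a step.

    x grows eastwards and y grows northwards; n is the number of cubes per side.
    """
    x, y = cube
    # Map grid coordinates into the observer's own frame: lateral grows to the
    # observer's right, depth grows away from the observer.
    frames = {
        "south": (x, y),
        "north": (n - 1 - x, n - 1 - y),
        "east": (y, n - 1 - x),
        "west": (n - 1 - y, x),
    }
    lateral, depth = frames[side]
    middle = (n - 1) / 2

    def label(value, low, high):
        if value < middle:
            return low
        if value > middle:
            return high
        return "centre"

    return label(lateral, "left", "right"), label(depth, "front", "back")


cube = (1, 1)  # the north-east cube of the four-cube step
views = {side: describe(cube, side) for side in ("south", "north", "east", "west")}
```

With these assumptions the same cube is right and back only for the observer on the south side, and left and front for the observer on the north side. The other two observers each see a mixed combination.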

If we consider the bottom step we can identify another interesting phenomenon. The central cube on either side is neither to the left nor to the right of the pyramid. The blocks adjacent to the central cube are clearly either to the left or the right of this cube, but they have the opposite relationship to the edges of the pyramid, which are defined by the left or right edge of the cubes to the left or right of these intermediate cubes. Here we start to come across the problems related to the relative positioning of related things. Relative position is related to individual occurrences of instances of nouns, rather than to the nouns themselves.

Nouns describe things, but things can be "parts of" a larger thing. Whether something is considered to be part of something or a whole in its own right depends on the worldview of the observer, which often changes over time. For example, to the manufacturer of carburetors the carburetor is a whole consisting of a set of individually named parts. To the car manufacturer it is simply a part of his car that needs no further subdivision to describe it. The space used to describe the relative position of the components of the whole changes depending on your viewpoint with respect to the object being described.

The Relative Position of Properties

Adjectives and adverbs are used to assign properties to nouns and verbs. Properties have their own sets of dimensions, some of which are obvious, some of which are not. Properties such as colour and length have dimensions that can be explained scientifically, using commonly agreed measurement systems such as wavelength, hue and saturation, and feet or metres. Equating the terms used to express the properties to such measurements is, however, no easy matter. How can you distinguish the boundaries of "sunflower yellow" from those of "mustard"? What is the exact length, in metres, of a league?

When it comes to the properties of immeasurable concepts used to describe the moral qualities of things a different set of problems arises. For example, consider the relative positions of a "terrible choice", a "bad choice", a "poor choice", an "irrelevant choice", a "sensible choice", a "reasonable choice", a "good choice" and an "ideal choice". While there is a clear progression along an axis that could be marked with good at one end and bad at the other, placing the qualifying adjectives along this axis will be somewhat subjective. What is the relationship of the words "irrelevant" and "reasonable" to the terms "good" and "bad"? Would a "diabolical choice" be worse or better than a "terrible choice"? Is a "perfect choice" better or worse than an "ideal choice"? Where would an "unwise choice" or an "excellent choice" fit in the sequence? How can we relate the different terms used to qualify the noun "choice", and how does their use in this context compare with the use of the same terms to qualify more concrete nouns?

Adverbs also have clear dimensions. Between the slowest and the fastest members of a group we have slower and faster members relative to some mean. When we say we are travelling more slowly than before we are talking of relative speeds rather than absolute ones. Until we establish what it is that the measurement is being made with respect to we cannot determine the absolute speed of an object. For example, something travelling more slowly than the speed of light can be travelling much faster than the speed of sound. The fact that there is clearly an axis that links the fastest measurements to the slowest does not mean that we can position the speed of any action on a single axis.
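A short Python sketch shows why a comparative adverb carries no absolute value until its reference is known. The function name relative_speed and the object speed used are illustrative assumptions; the two physical constants are the usual values for sound in air at about 20 °C and for light in a vacuum.

```python
SPEED_OF_SOUND = 343.0          # metres per second, in air at about 20 degrees C
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in a vacuum


def relative_speed(speed, reference):
    """Return the comparative adverb for a speed measured against a reference."""
    if speed < reference:
        return "slower"
    if speed > reference:
        return "faster"
    return "equal"


object_speed = 1000.0  # metres per second; an arbitrary illustrative value

against_light = relative_speed(object_speed, SPEED_OF_LIGHT)
against_sound = relative_speed(object_speed, SPEED_OF_SOUND)
```

The same object is "slower" against one reference and "faster" against the other, so the adverb alone cannot be placed on a single absolute axis.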

While ontologies allow us to describe the relationships between sets of nouns and verbs, they rarely allow us to define the relationships between adjectives and nouns, or adverbs and the verbs they are used to qualify. They do not currently allow us to create ordered sequences of terms that indicate the relative significance of qualifying properties, or to indicate how ordering affects the interpretation of phrases such as "a pretty ordinary slow train".

Adding Dimensions to Ontologies

The W3C Web Ontology Language (OWL) is the most recently discussed representation for describing the relationship between classes and the individuals that conform to each class. An OWL ontology consists of an annotated set of axioms and facts. Facts can relate to classes, properties or individual instances. In addition you can define individual values for properties, which can conform to a named datatype. Each of these components can be assigned a unique identifier that allows cross-references to be created between terms. Facts and axioms can both be restricted to applying to specific classes with controlled ranges of data values. Object restrictions can be applied when relating one object to another. The following examples suggest how some of the dimensions identified above could be defined using simple extensions to the OWL syntax.

Because OWL is an application of the W3C Resource Description Framework (RDF) the outermost element of any OWL ontology, which defines the set of namespaces applied to the ontology, has the form:

<rdf:RDF
 xmlns      = "http://www.is-thought.co.uk/words"
 xmlns:word = "http://www.is-thought.co.uk/words"
 xmlns:eval = "http://www.is-thought.co.uk/words/dimensions"
 xmlns:dc   = "http://purl.org/dc/elements/1.1/"
 xmlns:owl  = "http://www.w3.org/2002/07/owl#"
 xmlns:rdf  = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rdfs = "http://www.w3.org/2000/01/rdf-schema#"
 xmlns:xsd  = "http://www.w3.org/2000/10/XMLSchema#" 
>

where the top three namespaces are concerned with extensions proposed in this paper. This "start tag" must be matched by an </rdf:RDF> "end tag" at the end of the file. 

The ontology can be given a name and some descriptive metadata using a heading with the following structure:

<owl:Ontology 
 rdf:about = "http://www.is-thought.co.uk/words"
 dc:creator = "Bryan, Martin"
 dc:publisher = "IS-Thought"
 dc:identifier = "http://www.is-thought.co.uk/dimensions.ont"> 
  <rdfs:comment>
    An example of defining word sets using OWL and Evaluation extensions
  </rdfs:comment>
  <owl:versionInfo>
    $Id: dimensions.htm, v1.0 2003/01/01 10:45 Bryan $
  </owl:versionInfo>
</owl:Ontology>

The header can be followed by a set of basic class definitions such as:

<owl:Class rdf:ID="verb"/> 
<owl:Class rdf:ID="noun"/> 
<owl:Class rdf:ID="adjective"/>
<owl:Class rdf:ID="adverb"/>

We can also create subclasses for these classes, such as:

<owl:Class rdf:ID="presentTense">
  <rdfs:subClassOf rdf:resource="#verb"/>
  <rdfs:label xml:lang="en">Present Tense Form</rdfs:label>
</owl:Class>
<owl:Class rdf:ID="pastTense">
  <rdfs:subClassOf rdf:resource="#verb"/>
  <rdfs:label xml:lang="en">Past Tense Form</rdfs:label>
</owl:Class>
<owl:Class rdf:ID="futureTense">
  <rdfs:subClassOf rdf:resource="#verb"/>
  <rdfs:label xml:lang="en">Future Tense Form</rdfs:label>
</owl:Class>

From the class definitions we can create individual definitions for the different occurrences of words, using declarations of the form:

<noun rdf:ID="choice">
  <rdfs:label xml:lang="en">choice</rdfs:label>
  <rdfs:label xml:lang="fr">choix</rdfs:label>
</noun>
<adjective rdf:ID="good">
  <rdfs:label xml:lang="en">good</rdfs:label>
  <rdfs:label xml:lang="fr">bon</rdfs:label>
</adjective>
<adjective rdf:ID="bad">
  <rdfs:label xml:lang="en">bad</rdfs:label>
  <rdfs:label xml:lang="fr">mauvais</rdfs:label>
</adjective>

Unfortunately, however, OWL does not provide operators that work on instances rather than classes, so you cannot say:

<owl:Class rdf:ID="to-rain"> 
 <rdfs:subClassOf rdf:resource="#verb"/>
 <owl:unionOf rdf:parseType="Collection"> 
  <pastTense rdf:ID="has_rained">
   <rdfs:label xml:lang="en">has rained</rdfs:label>
   <rdfs:label xml:lang="fr">a plu</rdfs:label>
  </pastTense>
  <presentTense rdf:ID="is_raining">
   <rdfs:label xml:lang="en">is raining</rdfs:label>
   <rdfs:label xml:lang="fr">pleut</rdfs:label>
  </presentTense>
  <futureTense rdf:ID="will_rain">
   <rdfs:label xml:lang="en">will rain</rdfs:label>
   <rdfs:label xml:lang="fr">pleuvra</rdfs:label>
  </futureTense>
 </owl:unionOf>
</owl:Class>

To overcome this restriction, this paper proposes that an extension should be made to allow the evaluation of sets of instances, with the following format:

<owl:Class rdf:ID="to-rain"> 
 <rdfs:subClassOf rdf:resource="#verb"/>
 <word:unionOf rdf:parseType="Collection"> 
  <pastTense rdf:ID="has_rained">
   <rdfs:label xml:lang="en">has rained</rdfs:label>
   <rdfs:label xml:lang="fr">a plu</rdfs:label>
  </pastTense>
  <presentTense rdf:ID="is_raining">
   <rdfs:label xml:lang="en">is raining</rdfs:label>
   <rdfs:label xml:lang="fr">pleut</rdfs:label>
  </presentTense>
  <futureTense rdf:ID="will_rain">
   <rdfs:label xml:lang="en">will rain</rdfs:label>
   <rdfs:label xml:lang="fr">pleuvra</rdfs:label>
  </futureTense>
 </word:unionOf>
</owl:Class>

To define valid relationships between terms we can define OWL properties that can be used to link classes of objects by using declarations of the form:

<owl:ObjectProperty rdf:ID="noun-qualifier">
  <rdfs:domain rdf:resource="#noun"/>
  <rdfs:range rdf:resource="#adjective"/>
</owl:ObjectProperty>

If we are forced to retain the OWL restriction that properties can only apply to classes we would need a declaration along the following lines to link together nouns and the adjectives relevant to them:

<owl:Class rdf:ID="choice-phrases">
 <owl:unionOf rdf:parseType="Collection">
  <owl:Class>
   <owl:oneOf rdf:parseType="Collection">
    <owl:Thing rdf:about="#choice"> 
      <noun-qualifier rdf:resource="#good"/>
    </owl:Thing>
    <owl:Thing rdf:about="#choice"> 
      <noun-qualifier rdf:resource="#bad"/>
    </owl:Thing>
   </owl:oneOf>
  </owl:Class>
  <owl:Class rdf:ID="choice">
   <rdfs:subClassOf rdf:resource="#noun"/>
  </owl:Class>
 </owl:unionOf>
</owl:Class>

To reduce the complexity of such declarations, we propose that an equivalent of the oneOf operator be introduced that allows properties to be applied to instances, using declarations of the form:

<noun rdf:ID="choice">
 <word:oneOf rdf:parseType="Collection">
  <noun-qualifier rdf:resource="#good" eval:weighting="50"/>
  <noun-qualifier rdf:resource="#bad" eval:weighting="-50"/>
 </word:oneOf>
</noun>

Note that, in addition to linking the noun to the two adjectives, an evaluation weighting (eval:weighting) has been assigned to each of the descriptive adjectives to indicate where the property value lies along an axis of evaluation that is relative to the noun being qualified by the collection.
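Because the weighting is relative to the noun being qualified, one way to hold it in a program is as a table keyed by noun and adjective together. The Python sketch below uses only the two weightings given above; the names WEIGHTINGS, qualifiers_for and compare are hypothetical and are not part of the proposed extension.

```python
# Weightings are stored per (noun, adjective) pair because the axis of
# evaluation is relative to the noun being qualified.
WEIGHTINGS = {
    ("choice", "good"): 50,
    ("choice", "bad"): -50,
}


def qualifiers_for(noun, weightings=WEIGHTINGS):
    """Return the adjectives recorded for a noun, from lowest to highest weighting."""
    pairs = [(adj, w) for (n, adj), w in weightings.items() if n == noun]
    return [adj for adj, _ in sorted(pairs, key=lambda pair: pair[1])]


def compare(noun, first, second, weightings=WEIGHTINGS):
    """Return a positive number if `first` ranks above `second` for this noun."""
    return weightings[(noun, first)] - weightings[(noun, second)]


ordering = qualifiers_for("choice")
difference = compare("choice", "good", "bad")
```

Keying on the pair leaves room for the same adjective to carry a different weighting when it qualifies a different noun.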

For related adverbs a similar solution is available, as illustrated in the following example, for which an evaluation weighted OWL enumerated class has been defined as:

<owl:Class rdf:ID="relative-speed">
 <owl:oneOf rdf:parseType="Collection">
  <adverb rdf:ID="fastest" eval:weighting="75">
   <rdfs:label xml:lang="en">fastest</rdfs:label>
   <rdfs:label xml:lang="fr">plus rapidement</rdfs:label>
  </adverb>
  <adverb rdf:ID="faster" eval:weighting="50">
   <rdfs:label xml:lang="en">faster</rdfs:label>
   <rdfs:label xml:lang="fr">très rapidement</rdfs:label>
  </adverb>
  <adverb rdf:ID="fast" eval:weighting="25">
   <rdfs:label xml:lang="en">fast</rdfs:label>
   <rdfs:label xml:lang="fr">rapidement</rdfs:label>
  </adverb>
  <adverb rdf:ID="slow" eval:weighting="-25">
   <rdfs:label xml:lang="en">slow</rdfs:label>
   <rdfs:label xml:lang="fr">ralenti</rdfs:label>
  </adverb>
  <adverb rdf:ID="slower" eval:weighting="-50">
   <rdfs:label xml:lang="en">slower</rdfs:label>
   <rdfs:label xml:lang="fr">lentement</rdfs:label>
  </adverb>
  <adverb rdf:ID="slowest" eval:weighting="-75">
   <rdfs:label xml:lang="en">slowest</rdfs:label>
   <rdfs:label xml:lang="fr">très lentement</rdfs:label>
  </adverb>
 </owl:oneOf>
</owl:Class>
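As a rough check that such weightings can be recovered mechanically, the following Python sketch parses a trimmed, well-formed copy of the enumerated class with the standard library's xml.etree.ElementTree and orders the members by eval:weighting. The function name weighted_members is hypothetical, the labels are omitted for brevity, and the eval namespace is the one proposed in this paper rather than anything an OWL tool would recognise.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "eval": "http://www.is-thought.co.uk/words/dimensions",
}

# A trimmed copy of the relative-speed class, deliberately listed out of order.
DOCUMENT = """
<rdf:RDF xmlns="http://www.is-thought.co.uk/words"
         xmlns:eval="http://www.is-thought.co.uk/words/dimensions"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <owl:Class rdf:ID="relative-speed">
    <owl:oneOf rdf:parseType="Collection">
      <adverb rdf:ID="slow" eval:weighting="-25"/>
      <adverb rdf:ID="fastest" eval:weighting="75"/>
      <adverb rdf:ID="faster" eval:weighting="50"/>
    </owl:oneOf>
  </owl:Class>
</rdf:RDF>
"""


def weighted_members(xml_text, class_id):
    """Return (identifier, weighting) pairs for a class, lowest weighting first."""
    root = ET.fromstring(xml_text.strip())
    rdf_id = "{%s}ID" % NS["rdf"]
    weighting = "{%s}weighting" % NS["eval"]
    members = []
    for cls in root.findall("owl:Class", NS):
        if cls.get(rdf_id) != class_id:
            continue
        # Every child of owl:oneOf is treated as a member of the enumeration.
        for member in cls.findall("owl:oneOf/*", NS):
            members.append((member.get(rdf_id), int(member.get(weighting))))
    return sorted(members, key=lambda pair: pair[1])


ordered = weighted_members(DOCUMENT, "relative-speed")
```

This treats the file as plain XML and does not interpret it as RDF, which is enough to read the weightings but would not survive alternative RDF/XML serialisations of the same graph.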

This class may need to be combined with many different words, but OWL fails to provide a technique for associating classes of qualifying words with an individual noun or verb. Two extensions are required for this to be possible, one that defines a fixed sequence of word types, such as:

<word:sequence>
  <owl:Class rdf:about="#relative-speed"/>
  <owl:Class rdf:about="#form-of-transport"/>
</word:sequence>

and one which allows the classes to occur in any order:

<word:anyOf>
  <owl:Class rdf:about="#colours" word:usage="optional"/>
  <owl:Class rdf:about="#relative-speed" word:usage="optional"/>
  <owl:Class rdf:about="#form-of-transport" word:usage="required"/>
</word:anyOf>

The word:usage attribute allows users to define whether a class or instance is optional within the set of words or whether its presence is required. It is used to simplify, and extend to classes and instances, the owl:cardinality construct.
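Since the extensions have no formal definition yet, the following Python sketch shows only one possible reading of word:sequence and word:anyOf. It assumes that a phrase has already been reduced to the list of classes its words belong to, that a sequence must match exactly, and that each class in an anyOf group may occur at most once. The function names are hypothetical.

```python
def matches_sequence(word_classes, sequence):
    """True if the phrase's word classes occur in exactly the given order."""
    return list(word_classes) == list(sequence)


def matches_any_of(word_classes, usage):
    """True if every class is known, none repeats, and all required classes occur.

    usage maps a class name to "required" or "optional".
    """
    seen = set()
    for cls in word_classes:
        if cls not in usage or cls in seen:
            return False
        seen.add(cls)
    return all(cls in seen for cls, use in usage.items() if use == "required")


SEQUENCE = ["relative-speed", "form-of-transport"]
ANY_OF = {
    "colours": "optional",
    "relative-speed": "optional",
    "form-of-transport": "required",
}

in_order = matches_sequence(["relative-speed", "form-of-transport"], SEQUENCE)
reversed_order = matches_sequence(["form-of-transport", "relative-speed"], SEQUENCE)
any_order = matches_any_of(["form-of-transport", "colours"], ANY_OF)
missing_required = matches_any_of(["colours", "relative-speed"], ANY_OF)
```

Under this reading the required transport class must always appear, while the colour and speed qualifiers may be present or absent in any position.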

At present formal definitions of the extensions have not been completed, but when they are it is expected that other OWL complex class constructs will be adapted for use with instances as well as classes.