
PhD Thesis © Andreas Dieberger 1994, 2000
Navigation in Textual Virtual Environments using a City Metaphor



2. Spatial cognition of humans

To become completely lost is perhaps a rather rare experience for most people in the modern city. We are supported by the presence of others and by special way-finding devices: maps, street numbers, route signs, bus placards. But let the mishap of disorientation once occur, and the sense of anxiety and even terror that accompanies it reveals to us how closely it is linked to our sense of balance and well-being. The very word "lost" in our language means much more than simple geographical uncertainty; it carries overtones of utter disaster.
[LYNC60, p.4]

This chapter gives an overview of how the concept of "space" has evolved in philosophy and mathematics. The mathematical view of space provides a terminology that is useful for distinguishing between the properties of real and virtual environments. The remainder of the chapter is devoted to psychological issues in the perception of space and to how humans memorize spatial relationships. Different sources of spatial perception are covered, such as direct visual perception, perception using the other senses and perception using language. The section on spatial memory reviews several models of spatial memory and describes how these models try to explain systematic distortions in spatial memory. The final section of this chapter looks at navigation and at whether it is possible without any spatial memory.

2.1. What is "space"?

Perhaps the most widely accepted conception of space is that of a container or framework where things exist. It is so accepted because it agrees with what people have been taught in school and with what people expect in every day experience - both in common life and in technical work such as engineering, architecture and planning. People can put things in some space or place, can move them along that space from place to place, and thus get the idea that space is something that is there before things begin to fill it. Something that exists regardless of things and previously to their existence: it exists even if empty. [NUNE91, p.12]

2.1.1. Philosophical and physical concepts of space

Space is often seen as a relation on a set of objects. It is possible to create different types of space dependent upon how this relation is defined. The naive view of space sees it as an empty container. Modern philosophy sees space either as a necessary form of perception or as a continuum that is bound to the existence of matter and is structured by the presence of that matter.

In the European philosophical tradition space is seen as a three-dimensional, continuous and homogeneous entity. Philosophy focuses on three problems: whether space is finite or infinite, the possibility or impossibility of empty space, and the "reality" or "subjectivity" of space.

The concept of space is closely related to geometry. Euclid systematized and generalized the entire geometrical knowledge of antiquity in his "Elements". Space that follows those basic geometric rules and man's everyday experiences is called Euclidean space. The Euclidean space concept is not consistent with contemporary knowledge from cosmology and physics, but it is still an adequate concept of space for most everyday phenomena.

Democritus postulated an empty space with moving atoms. This theory accepts an objective existence of space and time. His theory involved the belief in the existence of minimal, indivisible units of space and time. This belief was grounded in the theory of numbers, which states that everything can be reduced to series of natural numbers and ratios between them. Democritus sees space as infinite and time as eternal.

In the Middle Ages it was assumed that empty space could not exist. This view (horror vacui) was based on Aristotle's view, which sees space as the place of all things and therefore as a container.

Bruno and Galilei advocated infinite space, which brought them into conflict with the church. The principle of relativity, formulated by Galilei, states that space and time are homogeneous for mechanical processes in every inertial system.

Descartes saw "extension" as the basic principle of space and all things and reduced space to small particles called corpuscles. This theory comes close to the 19th-century theory of a world-ether.

According to Newton, three-dimensional space has no inherent physical properties but only Euclidean geometric ones, and is therefore absolute and independent of the matter in it. Absolute space is therefore seen as an immaterial container for objects.

Toland, an English materialist, however, rejected an absolute space that can exist independently of matter. Leibniz also saw the absolute conception of space as one of the shortcomings of the materialistic conception. He postulated that space and time are relations between objects and processes. Space and time are therefore bound closely to matter and possess no absolute reality in its absence. As a final consequence, Leibniz sees space and time as subjective perceptions, although they correspond to an objective order of the things in the world.

Kant saw space and time as a purely subjective form of perception. His ideas on space and time were a strong criticism of both the absolute and the relative conceptions of space. His ideas, expressed in his "Critique of Pure Reason", can therefore be seen as one attempt to shed new light on an old debate.

Kant's central thesis is that space and time cannot be thought of as if they were ordinary - physical, empirical - objects or events. Space and time are rather structures or systems to record and order the observations about things and events. As such, space and time do not belong in themselves to the empirical world, but are part of the mental equipment developed to capture and reason about the real world. [NUNE94, p.18] In other words, Kant thought that the order we find in nature is the order that exists in our minds, an order which is embedded in, or reflects, the structure of the mind itself. [NUNE94, p.19] Therefore only space and time together make perception possible. In Kant's concept, space is independent of its contents.

Einstein later disproved this assumption of independence of space and time from matter in his general theory of relativity. The contents of space influence space's structure. It should be mentioned that the theory of relativity does not imply that space without matter is without structure, but space is bent by the presence of matter. This leads to phenomena which are seen as empirical proof of the general theory of relativity, for example the bending of light that passes close to a star.

The development of the concept of space started out with the general concept of a container. The main question for philosophy has always been whether space without content is still space. The current view is that matter influences the structure of space. Space is no longer seen as a three-dimensional concept but as a four-dimensional space-time continuum. The philosophical view of space is heavily influenced by physics.

2.1.2. The mathematical view of space

This section looks at the mathematical definition of space, which is more general than the philosophical or physical view. The everyday space we are used to navigating is basically a three-dimensional Euclidean space, which is an example of a metric space. A more general space concept can be defined by means of set theory and the term "neighborhood". This leads to the topological space explained in section 2.1.2.2. A formal way to treat topological spaces is graph theory, which is presented in a separate section.

Mathematically, "Euclidean" space is characterized by orthogonal dimensions, where the number of dimensions is three for the space we are used to living in. Every point in space is defined by coordinates in an orthogonal coordinate system. Using analytical geometry, distances between points can be defined. The same construction with more or fewer than three dimensions yields an n-dimensional vector space.

Space is a relation defined on a set of objects. Spatial objects can be defined as a set of spatial locations. Relations on a set of such objects often have the form of a "distance". The concept of distance seems clear at first sight, but distance is a very general concept that does not necessarily have much in common with what we see as "distance" in everyday life.

2.1.2.1. Metric spaces

Metric spaces are spaces that define a metric distance relation (a "metric") on the objects of that space. If the elements of a set are n-tuples, those tuples can be seen as points in n-dimensional space. A function d(a,b) -> R defined on an ordered pair of such tuples is called a metric if it obeys the following rules for every triple of points a, b, c in the space under consideration:

  1. d(a,b) >= 0, and d(a,b) = 0 if and only if a = b (non-negativity and identity).
  2. d(a,b) = d(b,a) (symmetry).
  3. d(a,c) <= d(a,b) + d(b,c) (triangle inequality).

An example of a metric is the Euclidean metric or Euclidean distance, which is used to define Euclidean space. For 2-dimensional space it is defined by Pythagoras' theorem.

$$d(a,b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

The next formula shows a generalization of this theorem for n-dimensional space. The index i signifies the i-th dimension of the n-dimensional vector representing a point.

$$d(a,b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$

Another metric is the Manhattan metric. It is sometimes called the cab-driver metric. For n-dimensional space it is defined by

$$d(a,b) = \sum_{i=1}^{n} |a_i - b_i|$$

This distance function calculates the distance a cab driver has to drive to reach a certain house in a rectangular grid of houses. This function has the properties of a metric, but its results may differ from those of the Euclidean metric (see figure 1).


Fig.1: Euclidean (a) and Manhattan metric (b).
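
The difference between the two metrics can be made concrete with a small sketch. The function names and the two house positions below are illustrative assumptions, not part of the original text; points are assumed to be tuples of equal length.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    """Manhattan (cab-driver) distance: sum of absolute differences per dimension."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

# Two hypothetical house positions on a rectangular street grid.
house_a = (0, 0)
house_b = (3, 4)

print(euclidean(house_a, house_b))  # 5.0
print(manhattan(house_a, house_b))  # 7
```

For these two points the cab driver has to cover 7 blocks, while the straight-line distance is 5, which illustrates that the two metrics can disagree on the same pair of points.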

An ε-neighborhood of a point x in a metric space M is defined as the set

$$N_\varepsilon(x) = \{\, y \in M \mid d(x,y) < \varepsilon \,\}$$

A neighborhood defines a set of points in space, which fulfill a closeness-property according to the distance function defined on that space.
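
As a rough sketch of this closeness property, the following code collects all points of a finite point set that lie in the neighborhood of a given point. It assumes the usual open-ball definition (distance strictly less than ε); the function names and the sample points are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    """Manhattan distance between two points given as equal-length tuples."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def neighborhood(x, points, eps, dist=euclidean):
    """Return the points whose distance from x is strictly less than eps."""
    return [p for p in points if dist(x, p) < eps]

# Hypothetical sample points in the plane.
points = [(0, 0), (1, 0), (1, 1), (3, 0)]

print(neighborhood((0, 0), points, 1.5))                  # [(0, 0), (1, 0), (1, 1)]
print(neighborhood((0, 0), points, 1.5, dist=manhattan))  # [(0, 0), (1, 0)]
```

The second call shows that the same ε can select a different set of points once the distance function is exchanged.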

The concept of space as defined in this section is very close to our everyday experiences. The world we live in is viewed as a three-dimensional Euclidean space. Generalization of this space concept to more dimensions is possible. Euclidean space is defined using the distance relation of the Euclidean metric. Properties of distance metrics have been defined.

2.1.2.2. Topological space

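Topological spaces are based on sets and a neighborhood concept. A topology T on a non-empty set X is a system of subsets of X that fulfills the following axioms:

  1. The empty set and X itself belong to T.
  2. The union of any collection of sets in T belongs to T.
  3. The intersection of finitely many sets in T belongs to T.

The pair (X,T) is called a topological space. Neighborhoods in topological space follow four neighborhood axioms:

  1. Every point x has at least one neighborhood, and x is contained in each of its neighborhoods.
  2. Every set that contains a neighborhood of x is itself a neighborhood of x.
  3. The intersection of two neighborhoods of x is a neighborhood of x.
  4. Every neighborhood U of x contains a neighborhood V of x such that U is a neighborhood of every point of V.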
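
For a finite set the topology axioms can be checked mechanically. The sketch below assumes the standard formulation (the empty set and X belong to T, and T is closed under unions and finite intersections); the function name and the example families are illustrative only.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the standard topology axioms for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(s) for s in T}
    # The empty set and the whole set must be members of T.
    if frozenset() not in T or X not in T:
        return False
    # Every member of T must be a subset of X.
    if any(not s <= X for s in T):
        return False
    # On a finite family, closure under pairwise unions and intersections
    # implies closure under arbitrary unions and finite intersections.
    return all((a | b) in T and (a & b) in T for a, b in combinations(T, 2))

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, {1, 2, 3}]))  # True
print(is_topology(X, [set(), {1}, {2}, {1, 2, 3}]))     # False, the union {1, 2} is missing
```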

2.1.2.3. Graph theory

An example of a visual representation of a topological space is the map of a subway system. It contains the information needed to get from a point a to a point b. The locations of the points shown in the map do not necessarily correspond to the locations on a geographical map of the city. Instead the locations are somewhat arbitrary. The relation or distance function defined on this non-metric space is one of connectedness. There still is a "distance" conveyed in this map, as the number of stops passed on the way from one point on the map to another gives a rough indication of the time needed to travel. This distance assessment is subjective, based on real-world experience, and may change with the time of day. It is not a metric distance function, as one or more of the metric properties may not hold.

This example shows one general principle of topological spaces: A topological space is one in which there is some arbitrariness in the positioning of locations and arcs and where the only relation that matters is contiguity. [GATR91, p. 122]

A graph G is a collection of nodes and a collection of edges. Nodes are sometimes called points or vertices. Two edges can meet only at their end nodes. If edges have no direction they are sometimes called links. If they have a direction they are called arcs. A graph consisting of directed arcs is called a directed graph.

There are several ways to describe graphs. One of them is to describe them in graphical form. The left graph in figure 2 is a directed graph, the right one an undirected graph:


Fig. 2: A directed (left) and an undirected graph (right).

Matrix representation of graphs

A graph can be described as a set of nodes and a set of correspondences, like: A is connected to B. Another way to describe a graph is by giving its adjacency matrix. Table 1 shows the adjacency matrix for the left graph in figure 2. In this matrix a value of 1 means "x has an arc to y".
     A  B  C  D
  A  0  1  1  1
  B  0  0  0  1
  C  0  0  0  0
  D  0  1  1  0

Table 1: Adjacency matrix for the left graph in figure 2.
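
The two descriptions, a list of correspondences and an adjacency matrix, are easy to convert into each other. The following sketch builds the matrix of Table 1 from a list of arcs; the arc list is read off Table 1, and the function name is an illustrative choice.

```python
NODES = ["A", "B", "C", "D"]

# Arcs read off Table 1: (x, y) means "x has an arc to y".
ARCS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("D", "B"), ("D", "C")]

def adjacency_matrix(nodes, arcs):
    """Build a 0/1 matrix with a 1 at [x][y] wherever x has an arc to y."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in arcs:
        matrix[index[source]][index[target]] = 1
    return matrix

adjacency = adjacency_matrix(NODES, ARCS)
for name, row in zip(NODES, adjacency):
    print(name, row)
```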

A path is any sequence of arcs where the final node of one arc is the initial node of the next one. In the left graph of figure 2 a possible path is A, B, D, C. A "simple path" is a path that does not use the same arc more than once and an "elementary path" is one that does not use the same node more than once. A path is closed if its starting node and its ending node are the same node. A chain is the undirected counterpart of a path. The order of a node is the number of edges ending in that node. A node is even if its order is even and odd if its order is odd.
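
These definitions translate directly into small predicates. The sketch below assumes that a path is given as a list of node names and uses the arcs of Table 1; all function names are hypothetical.

```python
# Arcs of the left graph in figure 2, as listed in Table 1.
ARCS = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("D", "B"), ("D", "C")}

def arcs_of(nodes):
    """Turn a node sequence into the sequence of arcs it traverses."""
    return list(zip(nodes, nodes[1:]))

def is_path(nodes):
    """Every consecutive pair of nodes must be connected by an arc."""
    return len(nodes) >= 2 and all(arc in ARCS for arc in arcs_of(nodes))

def is_simple(nodes):
    """A simple path uses no arc more than once."""
    arcs = arcs_of(nodes)
    return is_path(nodes) and len(arcs) == len(set(arcs))

def is_elementary(nodes):
    """An elementary path uses no node more than once."""
    return is_path(nodes) and len(nodes) == len(set(nodes))

def is_closed(nodes):
    """A closed path starts and ends at the same node."""
    return is_path(nodes) and nodes[0] == nodes[-1]

print(is_path(["A", "B", "D", "C"]))              # True
print(is_elementary(["A", "B", "D", "C"]))        # True
print(is_simple(["A", "B", "D", "B", "D", "C"]))  # False, the arc B-D is used twice
print(is_closed(["B", "D", "B"]))                 # True
```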

A graph is said to be complete if for every pair of nodes there is a link between those nodes. A graph is symmetric if for every pair of nodes a and b with an arc from a to b there is also an arc from b to a. A graph is a strong graph if for every two distinct nodes a and b there is at least one path from a to b.
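
The three properties can be tested on an adjacency matrix as sketched below. Two assumptions are made that the text does not spell out: for a directed graph, "a link between two nodes" is taken to mean an arc in at least one direction, and reachability is computed by a simple depth-first search. The three-node cycle is a hypothetical second example.

```python
def reachable(adjacency, start):
    """Indices of all nodes reachable from start by following one or more arcs."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for target, has_arc in enumerate(adjacency[node]):
            if has_arc and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def is_complete(adjacency):
    """Every pair of distinct nodes is joined by an arc in at least one direction."""
    n = len(adjacency)
    return all(adjacency[i][j] or adjacency[j][i]
               for i in range(n) for j in range(i + 1, n))

def is_symmetric(adjacency):
    """Every arc from a to b is matched by an arc from b to a."""
    n = len(adjacency)
    return all(adjacency[i][j] == adjacency[j][i]
               for i in range(n) for j in range(n))

def is_strong(adjacency):
    """Every node can reach every other node."""
    n = len(adjacency)
    return all(set(range(n)) - {i} <= reachable(adjacency, i) for i in range(n))

# Adjacency matrix of Table 1 and a hypothetical directed three-node cycle.
TABLE_1 = [[0, 1, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 1, 0]]
CYCLE = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

print(is_complete(TABLE_1), is_symmetric(TABLE_1), is_strong(TABLE_1))  # False False False
print(is_complete(CYCLE), is_symmetric(CYCLE), is_strong(CYCLE))        # True False True
```

The graph of Table 1 is not strong because node C has no outgoing arcs, so nothing can be reached from it.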

The reachability matrix shows if a node b is reachable from a node a. In the case of the left graph of figure 2 the reachability matrix is almost identical to the adjacency matrix (see table 2).
     A  B  C  D
  A  0  1  1  1
  B  0  0  1  1
  C  0  0  0  0
  D  0  1  1  0

Table 2: Reachability matrix for the left graph in figure 2.
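
A reachability matrix can be derived from an adjacency matrix, for instance with Warshall's transitive-closure algorithm. This is one possible technique, not necessarily the one used to produce Table 2. The sketch assumes that only pairs of distinct nodes are of interest, so the diagonal is left at 0 as in Table 2, even though B and D lie on a cycle.

```python
def reachability_matrix(adjacency):
    """Transitive closure of a 0/1 adjacency matrix (Warshall's algorithm)."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # i reaches j if i reaches k and k reaches j.
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    # Assumption: only distinct node pairs are reported, so reset the diagonal.
    for i in range(n):
        reach[i][i] = 0
    return reach

# Adjacency matrix of Table 1.
TABLE_1 = [[0, 1, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 1, 0]]

for name, row in zip("ABCD", reachability_matrix(TABLE_1)):
    print(name, row)
```

The only entry that differs from the adjacency matrix is the one for B and C: C is reachable from B via D although there is no direct arc.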

Directed graphs

Every transport network is a directed graph, as the goods to be transported can move in only one direction. The arcs or links in the graphs described so far have no "length". Graph theory is sufficiently general to describe graphs with directed links or weighted links.

A distance between nodes in the graph could be calculated by a reachability function which yields a distance of 1 if b is reachable from a and 0 if it is not. Another distance function can be defined by using the length of the shortest path from a to b. It is possible to attach weights to arcs and thus to introduce arc lengths.
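
Both shortest-path variants can be sketched briefly. The unweighted version counts arcs using breadth-first search; the weighted version uses Dijkstra's algorithm. The graph structure follows Table 1, but the arc lengths are invented purely for illustration.

```python
from collections import deque
import heapq

# Left graph of figure 2 as successor lists (taken from Table 1).
GRAPH = {"A": ["B", "C", "D"], "B": ["D"], "C": [], "D": ["B", "C"]}

# The same arcs with hypothetical lengths attached.
WEIGHTED = {"A": {"B": 2, "C": 9, "D": 7}, "B": {"D": 1}, "C": {}, "D": {"B": 1, "C": 3}}

def hop_distance(graph, start, goal):
    """Number of arcs on a shortest path, or None if goal is unreachable."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return distances[node]
        for nxt in graph[node]:
            if nxt not in distances:
                distances[nxt] = distances[node] + 1
                queue.append(nxt)
    return None

def weighted_distance(graph, start, goal):
    """Length of a shortest weighted path (Dijkstra), or None if unreachable."""
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in graph[node].items():
            candidate = dist + length
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return None

print(hop_distance(GRAPH, "B", "C"))          # 2
print(hop_distance(GRAPH, "C", "B"))          # None
print(weighted_distance(WEIGHTED, "A", "C"))  # 6
```

In a directed graph such a distance is generally not symmetric: C is two arcs away from B, but B cannot be reached from C at all. It therefore does not satisfy the properties of a metric.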

Typical examples of directed graphs are hypertext systems, which will be examined in section 5.1., and textual virtual environments (section 4.4.2.). In those systems pieces of information, called nodes, are interlinked by connections called links. Ordinarily those links are directed links.

In graphs it is possible to define distance functions. Simple examples are again simple reachability or adjacency functions. Attaching distances to arcs is also possible. In the case of a textual virtual environment where nodes show a distinct position, the distances between two such nodes should be determined by the spatial concept of the virtual space and by the positions of the nodes. Another possibility to define distances is by calculating the similarity of the contents of nodes. This approach is becoming common practice in hypertext systems.
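
A content-based distance can be sketched with a simple word-overlap measure. The Jaccard distance used here is only one of many possible similarity measures and is not claimed to be the one used by any particular hypertext system; the node texts are invented.

```python
def jaccard_distance(text_a, text_b):
    """1 minus the share of words common to both texts; 0.0 means identical word sets."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)

# Hypothetical contents of three nodes.
market_square = "the old market square"
market_hall = "the new market hall"
harbour = "ships leave from pier seven"

print(jaccard_distance(market_square, market_hall))  # roughly 0.67
print(jaccard_distance(market_square, harbour))      # 1.0
```

Nodes with overlapping vocabulary end up closer to each other than nodes that share no words, independently of whether a link connects them.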

Conclusion

Philosophically space is seen as one of the prerequisites of perception and as a mirror of our thought processes. While Kant assumed space to be independent from its contents, Einstein proved in his general theory of relativity that matter influences and distorts space.

A more general concept of space is the mathematical view of space. Mathematically, spaces are defined as a system of relations on a set of elements. If those set elements are n-tuples, the elements can be interpreted as points in an n-dimensional space. An example of a system of relations between points in such a space is the distance relation. Distance functions showing the properties of a metric are called a "metric". Spaces defined using a metric distance function are called metric spaces. A commonly known example is the Euclidean distance function, which defines Euclidean space.

Topological space is defined on sets and a neighborhood concept. In topological spaces the relation of contiguity is considered. Graphs can be represented graphically or in table-form as an adjacency matrix. Directed graphs are graphs where the edges have a directionality. Distances can be defined in graphs. Examples for directed graphs are hypertext systems and virtual environments.

2.2. Spatial perception

This section focuses on spatial perception. Spatial perception is different from other perception as it is always possible to verify it by making use of several senses [FREK91]. This fact makes us take any spatial perception for particularly real. Spatial perception is possible using the sense of vision or using the other senses. It is possible to perceive spatially using abstractions and using language. These issues are covered in separate subsections.

2.2.1. Experiencing space using vision

The visual system enables humans to perceive objects and the space they are contained in as three-dimensional objects. Spatial vision is not the only possibility to see an object spatially. Using perspective it is possible to represent spatial objects on a flat surface as a drawing. Such a drawing is a transformation of spatial objects into a representation in a flat space. Our knowledge of the real world helps us in perceiving such a flat picture as a spatial construct.

A flat representation contains less spatial information than the real object - the missing information is provided by the picture-processing done in our heads. By delivering wrong cues to this processing it is possible to deceive spatial perception, as is done in many pictures of M. C. Escher. Spatial perception from a picture alone therefore can be misleading (see figure 3).


Fig. 3: Waterfall by M. C. Escher.

There is an even less spatial representation of spaces that can be perceived using the visual system: abstract representation of space using language. This possibility will be examined in section 2.2.3.

The human visual system is built to see spatially. Using spatial knowledge humans are able to see spatially when confronted with a representation of a spatial scene - for instance a perspective drawing. Such a drawing is an abstraction and can deceive the spatial perception.

2.2.2. Experiencing space using different senses

It is possible to perceive spatially without using the eyes. As Freksa points out we are able to perceive spatially with several of our senses which makes us more confident in our spatial perception:
Our knowledge about physical space differs from all other knowledge in a very significant way: we can perceive space directly through various channels conveying distinct modalities. Unlike in the case of other perceivable domains, spatial knowledge obtained through one channel can be verified or refuted by perception through the other channels. As a consequence, we are disproportionately confident about what we know about space: we take it for real. [FREK91, p.361]

Hearing sound spatially is perhaps the strongest spatial perception besides the visual one. The resolution of this perception is much coarser than the visual system but we are able to "hear" what place we are in and we can locate sound sources approximately. The acoustic sense allows humans to monitor the environment as it perceives from all directions at the same time.

Spatial perception can be induced by the kinesthetic sense when moving at high velocity. When moving in our environment, the various senses allowing for spatial perception work together to give us a multimodal spatial perception of that environment.

2.2.3. Experiencing space using abstractions and language

Space can be perceived from abstractions. Iconics are very abstract drawings of an object or a situation. Those iconics often do not represent one particular object or situation but a class of objects or situations.

Another commonly used abstraction to represent space is the map. Maps represent large spatial constructs at a high level of abstraction. While everybody can perceive spatially from pictures and simple drawings, the reading of maps is a skill that has to be learnt.

The most abstract way to represent objects is through the use of language, either in spoken or written form. Written language evolved from iconics [BOLT91] [MCCL94]. Japanese kanji and written Chinese use symbolic writing even today.

Sharing spatial information about dangerous places and places to find food and shelter probably was one of the first uses of language and so we have a very rich vocabulary to describe environments and spatial relations.

When reading textual descriptions of spaces or maps we are able to construct a mental representation of the space described. Perceiving spatially using descriptions is possible because the descriptions can invoke spatial images in our mind which induce a spatial perception.

Language does not specify space as completely or precisely as (direct) perception. Descriptions often do not exactly specify distances between objects or exact orientation. Therefore the spatial perception from a description relies on general knowledge and assumptions.

2.2.3.1. Reference frames

Descriptions of relations between objects need to be related to a reference frame. Reference frames are tightly coupled to either the human or the environment the individual moves in.

An example for reference frames was described by Kevin Lynch: "The system used on the North China plain is a strictly regular one. It has deep magical connotations: north being equated with black and evil, south with red, joy, life, and the sun. It controls very strictly the placing of all religious objects and permanent structures. Indeed, the chief use of the "south-pointing needle", a Chinese invention was not for navigation at sea, but for the orientation of buildings. So pervasive is this system that the country people on this flat land give their direction by compass points, and not by right or left, as would be natural to us. The organizing system does not center on the individual, moving and turning with him, but is fixed, universal, and outside the person." [LYNC60, p.128]

The system Lynch describes is a NSEW-reference frame (see below). Bryant proposed a classification of reference frames for his Spatial Representation Scheme (SRS). He sees reference frames as "coordinate systems in which locations can be specified along three dimensions" [BRYA92, p.3]:

The allocentric reference frame can be further classified into several common types of reference frames. These are according to [PEDE93]: Reference frames can be symmetric or asymmetric. Asymmetric systems use one dominant axis, for instance uphill, whereas the symmetric systems use equivalent axes.

The choice of which reference frame to use in a description depends on the language used, on cultural background, and on the situation at hand. This was demonstrated in a study by Pederson. The study looked at two Tamil linguistic systems, one of which is used mainly in cities whereas the other system is in use mainly in the countryside. These two systems show differences in the use of reference frames and therefore in the reference to objects in space [PEDE93]. This shows that descriptions of spatial relationships often depend on context and cultural background of the provider of the description.

Formal types of reference make it possible to point out objects and spatial relationships precisely. Everyday language is far less precise. Our language mostly provides words for coarse-level descriptions of the environment such as "next to", "between", "to the left of" or "on top of". Those words always describe location in relation to another element. Other words like "within", "contains" or "borders" are descriptions in relation to a reference frame that is often assumed only implicitly.

When spatial relationships are described, the relations between objects must be expressed with respect to a frame of reference. According to a classification scheme proposed by Bryant there are several types of reference frames. The most important of them is the allocentric reference frame. The choice of which reference frame to use in a description depends on the situation and on the cultural background of the describer.

2.2.3.2. Scale

Closely related to spatial perception and to the choice of reference frames is scale. Properties of space, or relations between objects in space, are normally seen as scale-independent when studied as formal problems. Thought and behavior in space are not scale-independent when studied as problems of perception.

Montello defines scale as the ratio between the dimensions of a representation and those of the thing that it represents [MONT93, p.313]. He therefore sees scale as the relative size of a representation. According to Montello, the terms large-scale and large-size can be confused, because in one case size is considered relative to the real environment whereas in the other it is considered relative to a person. Montello therefore introduces a different terminology that avoids this ambiguity.

He uses a four level classification of psychological spaces. He bases this classification on the earlier scale concepts of Ittelson, Mandler and Zubin. Montello distinguishes classes of spaces according to the "projective size" of the space relative to the human body and not according to the actual or apparent absolute size. Therefore a large scale space viewed from a distance may become a smaller-scale space:

  1. Figural space is projectively smaller than the body. The properties of this type of space can be perceived directly from one place without movement. It can be subdivided into object space and pictorial space. Pictorial spaces are flat, whereas object spaces are three-dimensional. Examples for objects in figural space are pictures, small objects and distant landmarks. Sometimes objects in figural space may be haptically manipulated to apprehend their spatial properties. However no movement of the body is necessary to apprehend those spaces.
  2. Vista space is projectively as large as or larger than the body. Still, it may be visually perceived from a single location without movement. Examples for vista spaces are single rooms, town squares, small valleys and the horizon.
  3. Environmental space is projectively larger than the body and surrounds it. It is too large and partially too obscured to be apprehended without locomotion. It requires integration of information over significant periods of time. Examples for such spaces are buildings, neighborhoods, and cities. Although they cannot be perceived in a brief amount of time they still can be apprehended from direct experience.
  4. Geographical space is projectively much larger than the body and must be learned using a symbolic representation like a map or a model. Those symbolic representations often are objects in figural space - that is they are haptically manipulable or are pictorial spaces. Examples for geographical spaces are countries or the solar system. A similar situation is given when considering very small scale spaces: molecular or atomic spaces may be apprehended only symbolically but they are much smaller than any haptic space.
This distinction is important in the context of spatial communication and it can play an important role in the optimal design of spatial information systems.

An important task involving spatial information is to communicate spatial relationships to other people. Communication of spatial relationships is closely related to the issue of scale as many verbal and gestural descriptions of space contain an assessment of scale.

The simulation of spaces on a computer relies on a clear concept of scale too. A representation of a space on a computer screen is a tabletop-scale space representing larger or smaller spaces. Computer screens normally cannot be seen as environments but they are spatial simulations of environments. This is not evident in systems that try to immerse the user in a synthetic environment using a stereo-representation of a space (see section 4.4.). The way information is represented in those systems may well influence the perception of scale.

The scale of spaces can influence the choice of reference frames. Montello distinguishes four types of spaces based on the "projective size" of those spaces. Verbal or gestural descriptions often contain implicit assessments of scale. A representation of a space can influence the perceived scale of the space and therefore the way it is described.

2.2.3.3. Perspective in descriptions of space

Besides reference frames and scales a description of space depends on the "perspective" of the describer. This perspective is influenced by the goals of the describer.

A common example of a textual description of an environment is a guide-book. In an informal review of guide-books done by Tversky two main types of perspective were discovered:

A route perspective takes readers on a mental tour of the environment, describing landmarks with respect to the (mentally) changing position of the reader in terms of the reader's front, back, left, and right. A survey perspective gives the readers a bird's eye view, and describes landmarks relative to one another in terms of north, south, east and west. These two perspectives have parallels with two major means of learning about environments, the first through exploration, and the second through maps. [TVER93, p.19]

The route perspective often uses a different reference frame in the description than the survey perspective does. It is very likely that the route perspective uses reference frames like those proposed by Bryant (egocentric, allocentric or external reference frames), whereas a survey perspective relies on a fixed reference frame such as NSEW.

Conclusion

Space can be perceived in a variety of ways. This makes it possible to verify spatial perception and makes people very confident in spatial perception. It is possible to perceive space from abstractions, like maps, perspective drawings and language descriptions. Language descriptions of space make use of reference frames. The choice of reference frame in a description depends on cultural issues, on the situation, and on the scale of the space described. Representing space can change the perceived scale of the space. The scale of the space and the perspective of the describer influence how the space is described. Representations of space have to be designed by keeping these issues in mind because otherwise the representation can lead to a wrong perception of the space.

2.3. Spatial memory

Human spatial behavior is dependent on the individual's mental representation of the spatial environment [DoSt73], [SHUM90]. The representation is used to direct action and the experiences are used to further modify the representation.

The various models of the ability to learn an environment and to remember spatial relationships form the basis to understanding misjudgments in spatial relationships. This section looks at diverse mental representations. Most models differ in the sequence in which properties of spaces are learned. This sequence is of importance to virtual environments which are designed to ease navigational tasks.

2.3.1. Cognitive maps

The common view of how people represent spatial relationships in an area mentally is based on the concept of the cognitive map. This term hints towards a map-like construct in our minds that we are able to look at to answer questions about the area represented. This concept is not generally accepted today. The following citation avoids the term altogether and outlines the tasks the mental representation is needed for - be it a cognitive map or something else.

A property of the physical environment of distinguished psychological importance is the fact that the environment completely surround us. Thus it is not possible for us to experience or perceive all of it at any one instance. We can only turn our attention to discrete aspects of the environment at successive points in time. However, in order for our behavior to be appropriate, effective, or adequate in relation to the physical environment, it is necessary for it to proceed in a continuous fashion. To explain the way in which this discrete experience can produce continuous interaction it is necessary to postulate some representational process on the part of the individual. This "representation" must amalgamate experience into a form which links discontinuities in perception and allows extrapolations to facilitate preparation for future action.

In the case of an individual coping with a city, it is further necessary to assume that the postulated amalgamation of experiences is somehow summarized. This summary consists of a pattern or structure, the resultant effect of which is to organize the 'representations' of the experiences of the city and their implications.

Note that the term 'representation' is in quotation marks. This is because there is nothing within our description of it to say that it must have any direct relationship to the usual methods of representing cities. [CaTa75, pp. 59-60]

Learning route-knowledge

Way-finding is the ability to learn a route through the environment. The work of Piaget and others on learning this route-knowledge forms a basis for most theories of way-finding which were accepted for a long time [BLAD93]. According to those theories the cognitive map is learned in small steps:

  1. First, landmarks are learnt. These are objects or prominent points that act as orientation aids: they have a special meaning, can be seen from farther away and can be distinguished easily from other points in the environment.
  2. Only after landmarks are learnt are linear elements, like routes, learnt.
  3. Finally metric survey information is learnt.
This information is called a cognitive map to distinguish it from true maps. Piaget's work is based mainly on research on children and suggests several stages in the development of environmental knowledge. Partly because of the focus on non-adults, this theory and its offshoots are no longer entirely accepted, and the need for a new approach is advocated [BLAD93].

Today it is accepted that children and adults can perform way-finding tasks successfully with little experience and that landmarks and routes are learnt conjointly and very quickly.

As maps they [the cognitive maps] are presumed to be coherent wholes that reflect spatial relations among elements. As mental constructs available to mental inspection, cognitive maps are presumed to be like real maps available to real inspection, as well as like images which, according to the classical view of mental imagery, are like internalized representations. [TVER93, p.14].

This description states that cognitive maps are presumed to be coherent wholes. Spatial memory, however, has certain flaws that cannot be explained by a single coherent map-like construct like the cognitive map. Tversky reviews several studies which indicate that such a map-like construct is not very probable [TVER93] (see also section 2.3.5.). She therefore describes two other, constructivist views of how spatial environments could be represented mentally: the cognitive collage and the spatial mental model.

This section introduced the concept of the cognitive map and explained why it is assumed that the cognitive map is not a very adequate model for spatial memory. Piaget's model of how children acquire route-knowledge was introduced.

2.3.2. Cognitive collage

The constructivist view assumes that people acquire disparate pieces of knowledge about environments. This knowledge is used when describing routes and when making judgments about locations, directions and distances. Those pieces of knowledge include various kinds of information like memories of maps, recollections of journeys, directions, facts and more. For environments that are not known in full detail, the information may be in different forms, some of them not map-like at all.

Tversky describes this representation as follows:
In these cases, rather than resembling maps, people's internal representations seem to be more like collages. Collages are thematic overlays of multimedia from different points of view. [TVER93, p.15].

As these constructs represent spatial relationships from various points of view, they do not contain coherent metric information. The term metric here does not necessarily refer to the properties of a metric distance function like those presented in section 2.1.2.

The cognitive collage, as an alternative representation of spatial knowledge, sees spatial memory as a multimedia representation of disparate pieces of spatial information. Cognitive collages do not contain coherent metric information.

2.3.3. Spatial mental models

People seem to have a rather accurate representation of spatial layouts when environments are simple or well-learnt. This layout information cannot be explained well using the cognitive collage model. Therefore a spatial mental model was proposed. Such mental models capture the spatial relations coherently and allow perspective-taking, reorientation and spatial inferences.

Unlike cognitive maps however they [mental models] may not preserve metric information. Unlike cognitive collages, they do preserve coarse spatial relations coherently. These are relations that are easily comprehended from language as well as from direct experience. [TVER93, p.15].

The spatial mental model captures the inexact way people often speak about spatial relationships. That is, such models contain metric information, but only at a coarse level. The spatial mental model is an adequate model for representing the spatial knowledge used in descriptions of spaces, where terms like "next to", "near" and so forth are commonly used.

2.3.4. The TOUR model

Kuipers bases his TOUR model of spatial memory on five different categories of spatial knowledge: routes, topological structure of street networks, the relative position of two places, dividing boundaries and regions. This knowledge is enhanced with a set of inference rules for way-finding in the environment. [KUIP78]

Kuipers suggests in his TOUR model that the mental representation of spatial knowledge is more like many maps in the head, loosely related, for the cognitive map certainly lacks the global consistency of a single printed map [KUIP78, p.132].

This mental representation is like a network made up of streets and intersections where the exact shapes and lengths of the links in the network are often unimportant. A third component of the mental representation he proposed is something like a catalog of routes. The TOUR model shows well that the mental representation frequently contains distortions as the exact lengths and shapes of links are often unimportant.
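The network-plus-route-catalog idea can be illustrated with a minimal sketch. This is not Kuipers's implementation: the class name StreetNetwork, its methods and the place names are hypothetical, and a plain breadth-first search stands in for the TOUR model's inference rules. The point is only that a usable route can be found from topology alone, without storing any lengths or shapes of links.

```python
from collections import deque


class StreetNetwork:
    """Hypothetical sketch: places joined by named streets, no lengths or shapes."""

    def __init__(self):
        self.links = {}   # place -> list of (street, neighbouring place)
        self.routes = {}  # route catalog: (start, goal) -> list of (street, place)

    def add_street_segment(self, street, a, b):
        """Record that `street` connects places `a` and `b` in both directions."""
        self.links.setdefault(a, []).append((street, b))
        self.links.setdefault(b, []).append((street, a))

    def find_route(self, start, goal):
        """Return a list of (street, place) steps, or None if no route is known."""
        if (start, goal) in self.routes:      # a remembered route is reused as is
            return self.routes[(start, goal)]
        came_from = {start: None}             # place -> (street, previous place)
        queue = deque([start])
        while queue:                          # breadth-first search over topology
            place = queue.popleft()
            if place == goal:
                break
            for street, neighbour in self.links.get(place, []):
                if neighbour not in came_from:
                    came_from[neighbour] = (street, place)
                    queue.append(neighbour)
        if goal not in came_from:
            return None
        steps = []
        place = goal
        while came_from[place] is not None:   # walk back from goal to start
            street, previous = came_from[place]
            steps.append((street, place))
            place = previous
        steps.reverse()
        self.routes[(start, goal)] = steps    # add the new route to the catalog
        return steps


network = StreetNetwork()
network.add_street_segment("Main St", "station", "square")
network.add_street_segment("Main St", "square", "library")
network.add_street_segment("Park Rd", "square", "park")
print(network.find_route("station", "park"))
```

Because no distances are stored, nothing in this structure prevents two links of very different real length from being treated alike, which is one way to picture how such a representation tolerates distortions.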

Yet another mental representation, the Spatial Representation System (SRS), is proposed by Bryant [BRYA92]. It is separate from other memory processes. The basic idea is that people create the same sorts of cognitive maps and mental spatial models from verbal descriptions and from direct observation. This suggests that people have a distinct spatial representation system that creates spatial models from disparate sources of input and is independent of memory systems for other domains of knowledge. The primary role of the Spatial Representation System is to organize spatial information in a general form that can be accessed by either perceptual or linguistic mechanisms. [BRYA92]

2.3.5. Distortions in spatial memory

Depending on the model of spatial memory used, systematic mistakes in spatial memory can be explained. As spatial knowledge is acquired in small discontinuous steps, the pieces of the cognitive representation are not always strongly related to each other. This leads to various systematic errors in the estimation of distances and of the alignment of objects. This section describes two of these systematic errors.

Systematic errors in distance perception

In [TVER93] Tversky reviews several studies about systematic errors in distance perception. For example, distances between functionally grouped buildings are perceived as being smaller than distances between buildings that do not belong to the same functional class. This fact is seen as evidence that the mental representation is structured hierarchically. Similarly, distances between landmarks located farther away are perceived as being smaller than distances between landmarks that are close by.

Particularly interesting is an asymmetric distortion in the perception of distances between a landmark and another point of reference: The distance from the point of reference to the landmark is perceived as being smaller than the distance from the landmark to the point of reference.

Another systematic error in the estimation of distances occurs when estimating the length of a route that features many barriers, detours or turns, or that has more "clutter". Typically such a route is perceived as being longer than an equally long route which features fewer turns and less "clutter".

Systematic errors in the perception of alignment

Another typical error in spatial memory concerns the rotation of areas relative to a frame of reference. Several models of spatial memory assume only partly or weakly related items in the mental representation. The relation of such pieces to each other is learnt only vaguely. As a consequence people show a tendency to align objects with axes even if these objects are not aligned in this way in reality. It is also common to straighten out irregular features like rivers and streets.

Some of these systematic mistakes are observable only in environments that allow much freedom in the alignment of objects. If an environment is described mainly using language, the degree of freedom for rotation is reduced, or the rotation is not mentioned explicitly at all. To be exact, language itself does not impose restrictions on how this kind of information is described. However, in natural language the possible set of values for orientations and positions is restricted by the descriptions. This can be overcome only by applying a more formal language.

Another way to communicate this inexact type of information is by making use of gestures. This type of mixed-mode communication is used in a system under development at the MIT Media Lab. The system uses verbal and gestural information in conjunction to control an application that moves objects in a virtual space. The system is controlled mainly by spoken language, but directions, distances and amounts of rotation are input into the system using gestures. A typical command in this system is "Tilt the floor like that". The gesture "shows" the system the amount and plane of tilt. The system thus avoids using a formal command language to state the exact number of degrees by which to tilt the floor [KoSp94].

This type of communication is very common among humans. A typical example of such gestures is given by Desmond Morris in his book "Peoplewatching". When asked about directions, people commonly use a pointing gesture. The angle the pointing arm takes in relation to the floor, that is, how far upwards people point, is directly related to the distance of the location pointed out.

Conclusion

This section describes several models of spatial memory. The various models try to explain systematic mistakes in spatial memory. The most commonly used model, the cognitive map, is seen as a map-like construct. It does not explain systematic errors in spatial memory well. The cognitive collage sees the mental representation as a loose collection of multimedia information with little metric information. The spatial mental model assumes metric information of a coarse kind to be present. This model explains the coarse use of spatial relationships in language descriptions well. The TOUR model is a computational model of spatial memory. It is based on five different kinds of spatial information. It shows well that spatial memory often contains distortions.

2.4. Navigation as an activity in space

According to a lexicon, navigation is the task of plotting a route and steering a ship or aircraft along it. A simpler definition is that navigation means answering two questions: a "where" question and a "how to get to" question. The questions "where" and "how to get to" directly hint that navigation is concerned with spatial concepts. Navigation therefore is an activity in space.

The lexicon definition of navigation sees the navigational task mainly as a planning task. In this thesis, however, navigation is not always associated with planning. In the context of computers it is common to use the term navigation also for a less planned, more or less tentative kind of way-finding. That means every task which aims at locating objects in space or at "navigating to that object" is an act of navigation, even if this activity is not planned. Even proceeding by random decisions about where to go at various points on a route is seen as navigation. The term navigation is used in the computing domain even when the task has no clear spatial aspect.

Is navigation without memory possible?

Navigation is not necessarily a planning activity. Simply following a red line painted on the floor certainly is a task of navigation, although this type of navigation requires neither memory nor intelligence. It is the same kind of navigation that robots delivering raw materials in factories sometimes make use of. (In Boston such a red line indicates the so-called Freedom Trail, a route almost every tourist walks as it contains most major sights in the city.)

Another example of navigation without memory is finding the exit of a maze by always choosing the left turn. As long as the maze has no loops, this strategy is bound to find the exit of the maze.
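The left-turn strategy can be sketched in a few lines of code. The sketch below is an illustration under stated assumptions, not part of the original work: the maze is a hypothetical grid of characters ('#' for walls, 'S' for the start, 'E' for the exit), and the function name follow_left_wall is invented for this example. At every step the walker consults only its current position and heading. The list of visited cells is kept purely so the walk can be inspected afterwards and is never used to make a decision.

```python
# Headings in clockwise order as (row step, column step): north, east, south, west.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def follow_left_wall(maze, heading=0, max_steps=10_000):
    """Walk from 'S' to 'E', always preferring the leftmost open direction.

    Returns the list of visited cells, or None if the exit was not reached.
    """
    pos = next((r, c) for r, row in enumerate(maze)
               for c, ch in enumerate(row) if ch == "S")
    path = [pos]  # recorded for inspection only, never consulted while walking
    for _ in range(max_steps):
        r, c = pos
        if maze[r][c] == "E":
            return path
        # Try left, straight, right, then back, relative to the current heading.
        for turn in (-1, 0, 1, 2):
            h = (heading + turn) % 4
            nr, nc = r + HEADINGS[h][0], c + HEADINGS[h][1]
            if 0 <= nr < len(maze) and 0 <= nc < len(maze[nr]) and maze[nr][nc] != "#":
                heading, pos = h, (nr, nc)
                path.append(pos)
                break
        else:
            return None  # walled in on all four sides
    return None  # step limit reached, for example in a maze with loops


# A small loop-free maze with one dead end below the start.
MAZE = [
    "#########",
    "#S  #   #",
    "# # # # #",
    "# #   #E#",
    "#########",
]

walk = follow_left_wall(MAZE)
print(len(walk), walk[-1])
```

The step limit is a safeguard for mazes that do contain loops, where the walker may circle without ever reaching the exit.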

According to Montello (personal communication), Rodney Brooks of MIT argues that robots can be designed without programming any "cognitive map" or similar type of memory. Also according to Montello, James J. Gibson, a psychologist, proposed a "direct" theory of perception in which information is "picked up" from the environment without the need for comparison to memory, and without the need for interpretation with respect to internal representations stored in memory. The examples of the maze and of the robot are examples of navigation using this "direct" perception.

2.5. Conclusion

This chapter describes human spatial perception and memory. Spatial perception can arise from direct perception or from abstractions and language. Language descriptions of space depend on the use of reference frames. These in turn depend on the situation, the cultural background, the perspective of the describer and the scale of the space. Diverse representations of space can change the perception of scale and therefore influence the description of space.

Several models for spatial memory have been described. The various models try to explain systematic mistakes in human spatial memory in distance and alignment estimations. The frequently used concept of the cognitive map is not an adequate model as it cannot explain these errors well. Other models explained are cognitive collages, spatial mental models, the TOUR model and the Spatial Representation System. These manage to explain systematic errors better as they do not see spatial memory as a coherent whole but as collections of various types of information with more or less metric information.



last modified on 10/31/96
Andreas Dieberger
andreas.dieberger@acm.org