This chapter gives an overview of how the concept of "space" has evolved in philosophy and mathematics. The mathematical view of space provides a terminology useful for distinguishing between the properties of real and virtual environments. The remainder of the chapter is devoted to psychological issues in the perception of space and to how humans memorize spatial relationships. Different sources of spatial perception are covered, such as direct visual perception, perception using the other senses and perception using language. The section about spatial memory reviews several models of spatial memory and describes how these models try to explain systematic distortions in spatial memory. The final section of this chapter looks at navigation and at whether it is possible without any spatial memory.
In the European philosophical tradition space is seen as a three-dimensional, continuous and homogeneous entity. Philosophy focuses on three problems: whether space is finite or infinite, whether empty space is possible, and the "reality" or "subjectivity" of space.
The concept of space is closely related to geometry. Euclid systematized and generalized the entire geometrical knowledge of antiquity in his "Elements". Space that follows those basic geometric rules and man's everyday experiences is called Euclidean space. The Euclidean space concept is not consistent with contemporary knowledge from cosmology and physics - still it is an adequate concept of space for most everyday phenomena.
Democritus postulated an empty space with moving atoms. This theory accepts an objective existence of space and time. His theory involved the belief in the existence of minimal, indivisible units of space and time. This belief was grounded in the theory of numbers, which states that everything can be reduced to series of natural numbers and ratios between them. Democritus sees space as infinite and time as eternal.
In the Middle Ages it was assumed that empty space could not exist. This view (horror vacui) was based on Aristotle's view, which sees space as the place of all things and therefore as a container.
Bruno and Galilei advocated infinite space, which brought them into conflict with the church. The principle of relativity, formulated by Galilei, states that space and time are homogeneous for mechanical processes in every inertial system.
Descartes saw "extension" as the basic principle of space and all things and reduced space to small particles called corpuscles. This theory comes close to the theory of a world-ether from the 19th century.
According to Newton three-dimensional space has no inherent physical properties but only Euclidean geometric ones, and is therefore absolute and independent of the matter in it. Absolute space is therefore seen as an immaterial container for objects.
Toland, an English materialist, however rejected an absolute space that can exist independently of matter. Leibniz also saw the absolute conception of space as one of the shortcomings of the materialistic conception. He postulated that space and time are relations between objects and processes. Space and time are therefore bound closely to matter and possess no absolute reality in its absence. As a last consequence Leibniz sees space and time as subjective perceptions, although they correspond to an objective order of the things in the world.
Kant saw space and time as a purely subjective form of perception. His ideas on space and time were a strong criticism of both the absolute and the relative conceptions of space. His ideas, expressed in his "Critique of Pure Reason", can therefore be seen as one attempt to shed new light on an old debate.
Kant's central thesis is that space and time cannot be thought of as if they were ordinary - physical, empirical - objects or events. Space and time are rather structures or systems to record and order the observations about things and events. As such, space and time do not belong in themselves to the empirical world, but are part of the mental equipment developed to capture and reason about the real world. [NUNE94, p.18] In other words, Kant thought that the order we find in nature is the order that exists in our minds, an order which is embedded in or reflects the structure of the mind itself. [NUNE94, p.19] Therefore only space and time together make perception possible. In Kant's concept, space is independent of its contents.
Einstein later disproved this assumption of independence of space and time from matter in his general theory of relativity. The contents of space influence space's structure. It should be mentioned that the theory of relativity does not imply that space without matter is without structure, but space is bent by the presence of matter. This leads to phenomena which are seen as empirical proof of the general theory of relativity, for example the bending of light that passes close to a star.
The development of the concept of space started out with the general concept of a container. The main question for philosophy has always been whether space without content is still space or not. The current view is that matter influences the structure of space. Space is no longer seen as a three-dimensional concept but instead as a four-dimensional space-time continuum. The philosophical view of space is heavily influenced by physics.
Mathematically "Euclidean" space is characterized by orthogonal dimensions, where the number of dimensions is three for the space we are used to living in. Every point in space is defined by coordinates in an orthogonal coordinate system. Using analytical geometry, distances between points can be defined. A space with more or fewer than three dimensions is called a vector space.
Space is a relation defined on a set of objects. Spatial objects can be defined as a set of spatial locations. Relations on a set of such objects often have the form of a "distance". The concept of distance seems clear at first sight, but distance is a very general concept that does not necessarily have much in common with what we see as "distance" in everyday life.
The next formula shows the Euclidean metric, a generalization of the Pythagorean theorem to n-dimensional space. The index i signifies the i-th dimension of the n-dimensional vectors x and y representing two points.

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
Another metric is the Manhattan metric, sometimes called the cab-driver metric. For n-dimensional space it is defined by

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$
This distance function calculates the distance a cab driver has to drive to reach a certain house in a rectangular grid of houses. This function has the properties of a metric, but its results may differ from those of the Euclidean metric (see figure 1).
Fig.1: Euclidean (a) and Manhattan metric (b).
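The difference between the two metrics can be made concrete with a short Python sketch. The function names `euclidean` and `manhattan` and the sample coordinates are illustrative assumptions, not taken from the original text.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Straight-line distance: square root of the summed squared differences."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def manhattan(x: Sequence[float], y: Sequence[float]) -> float:
    """Grid distance: sum of the absolute differences in each dimension."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


# Hypothetical example: a house three blocks east and four blocks north.
start, house = (0, 0), (3, 4)
print(euclidean(start, house))  # 5.0
print(manhattan(start, house))  # 7
```

For the same pair of points the cab driver covers seven blocks, while the straight-line distance is five.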
An ε-neighborhood of a point x in a metric space M with distance function d is defined as the set

$$N_\varepsilon(x) = \{\, y \in M : d(x, y) < \varepsilon \,\}$$
A neighborhood defines a set of points in space, which fulfill a closeness-property according to the distance function defined on that space.
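As a rough illustration, a neighborhood can be computed for a finite sample of points by filtering on the distance function. This is only a sketch: in a continuous space the neighborhood contains infinitely many points, and the names `neighborhood`, `sample` and the chosen radius are assumptions made for the example.

```python
from typing import Callable, Iterable, List, Sequence

Point = Sequence[float]


def manhattan(x: Point, y: Point) -> float:
    """Sum of the absolute coordinate differences."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def neighborhood(center: Point, points: Iterable[Point], eps: float,
                 dist: Callable[[Point, Point], float]) -> List[Point]:
    """Return the sample points whose distance from `center` is below `eps`."""
    return [p for p in points if dist(center, p) < eps]


sample = [(0, 0), (1, 1), (2, 0), (3, 3)]
print(neighborhood((0, 0), sample, 2.5, manhattan))  # [(0, 0), (1, 1), (2, 0)]
```

Passing a different distance function changes which points count as "close", which is the sense in which the neighborhood depends on the metric defined on the space.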
The concept of space as defined in this section is very close to our everyday experiences. The world we live in is viewed as a three-dimensional Euclidean space. Generalization of this space concept to more dimensions is possible. Euclidean space is defined using the distance relation of the Euclidean metric. Properties of distance metrics have been defined.
This example shows one general principle of topological spaces: A topological space is one in which there is some arbitrariness in the positioning of locations and arcs and where the only relation that matters is contiguity. [GATR91, p. 122]
A graph G is a collection of nodes and a collection of edges. Nodes are sometimes called points or vertices. Two edges can meet only in their ending nodes. If edges have no direction they are sometimes called links. If they have a direction they are called arcs. A graph consisting of directed arcs is called a directed graph.
There are several ways to describe graphs, one of which is to describe them in graphical form. The left graph in figure 2 is a directed graph, the right one an undirected graph:
Fig. 2: A directed (left) and an undirected graph (right).
Matrix representation of graphs
A graph can be described as a set of nodes and a set of correspondences, like: A is connected to B. Another way to describe a graph is by giving its adjacency matrix. Table 1 shows the adjacency matrix for the left graph in figure 2. In this matrix a value of 1 means "x has an arc to y".
| | A | B | C | D |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 1 |
| B | 0 | 0 | 0 | 1 |
| C | 0 | 0 | 0 | 0 |
| D | 0 | 1 | 1 | 0 |
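The adjacency matrix of Table 1 can be held directly as a nested list. The following Python sketch assumes that rows are start nodes and columns are end nodes; the names `NODES`, `ADJACENCY`, `has_arc` and `arcs` are illustrative.

```python
NODES = ["A", "B", "C", "D"]

# Rows are start nodes, columns are end nodes (values copied from Table 1).
ADJACENCY = [
    [0, 1, 1, 1],  # A
    [0, 0, 0, 1],  # B
    [0, 0, 0, 0],  # C
    [0, 1, 1, 0],  # D
]


def has_arc(start: str, end: str) -> bool:
    """True if the matrix records an arc from `start` to `end`."""
    return ADJACENCY[NODES.index(start)][NODES.index(end)] == 1


def arcs():
    """List every arc as a (start, end) pair, i.e. the 'set of correspondences'."""
    return [(a, b) for a in NODES for b in NODES if has_arc(a, b)]


print(arcs())
```

The list returned by `arcs()` is the same graph in the "A is connected to B" form, so the two descriptions can be converted into each other.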
A path is any sequence of arcs where the final node of one arc is the initial node of the next one. In the left graph of figure 2 a possible path is A, B, D, C. A "simple path" is a path that does not use the same arc more than once and an "elementary path" is one that does not use the same node more than once. A path is closed if the starting node and the ending node of the path are the same node. A chain is the undirected counterpart to a path. The order of a node is the number of edges ending in that node. A node is even if its order is even and odd if its order is odd.
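These definitions translate almost literally into code. The sketch below represents a path as a sequence of nodes and checks it against the arcs of the left graph in figure 2 as recorded in Table 1; the function names are assumptions chosen for the example.

```python
# Arcs of the left graph in figure 2, as recorded in Table 1.
ARCS = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("D", "B"), ("D", "C")}


def is_path(nodes):
    """Every consecutive pair of nodes must be joined by an arc."""
    return len(nodes) >= 2 and all(step in ARCS for step in zip(nodes, nodes[1:]))


def is_simple(nodes):
    """A simple path uses no arc more than once."""
    steps = list(zip(nodes, nodes[1:]))
    return is_path(nodes) and len(steps) == len(set(steps))


def is_elementary(nodes):
    """An elementary path visits no node more than once."""
    return is_path(nodes) and len(nodes) == len(set(nodes))


def is_closed(nodes):
    """A closed path ends at the node where it started."""
    return is_path(nodes) and nodes[0] == nodes[-1]


print(is_path(["A", "B", "D", "C"]))    # True
print(is_closed(["B", "D", "B"]))       # True
print(is_elementary(["B", "D", "B"]))   # False
```

Read literally, the definition of an elementary path excludes closed paths, because the start node appears twice; whether the original intends that is not stated.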
A graph is said to be complete if for every pair of nodes there is a link between those nodes. A graph is symmetric if for every pair of nodes a and b with an arc from a to b there is also an arc from b to a. A graph is a strong graph if for every two distinct nodes a and b there is at least one path from a to b.
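The three properties can be checked mechanically. The following sketch assumes that a graph is given as a list of nodes and a set of (start, end) arcs, and that for completeness a link in either direction counts; both are assumptions of the example rather than statements from the text.

```python
from itertools import permutations


def is_symmetric(nodes, arcs):
    """Every arc a -> b is matched by an arc b -> a."""
    return all((b, a) in arcs for (a, b) in arcs)


def is_complete(nodes, arcs):
    """Every pair of distinct nodes is joined in at least one direction."""
    return all((a, b) in arcs or (b, a) in arcs for a, b in permutations(nodes, 2))


def reachable_from(start, arcs):
    """Nodes that can be reached from `start` by following one or more arcs."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for a, b in arcs:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def is_strong(nodes, arcs):
    """Every node can reach every other node."""
    return all(set(nodes) - {n} <= reachable_from(n, arcs) for n in nodes)


# The left graph of figure 2 (arcs taken from Table 1).
NODES = ["A", "B", "C", "D"]
ARCS = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("D", "B"), ("D", "C")}
print(is_symmetric(NODES, ARCS), is_complete(NODES, ARCS), is_strong(NODES, ARCS))
# False False False
```

The example graph fails all three tests: A has an arc to B but not the reverse, B and C are not linked at all, and no arc leaves C.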
The reachability matrix shows if a node b is reachable from a node a. In the case of the left graph of figure 2 the reachability matrix is almost identical to the adjacency matrix (see table 2).
| | A | B | C | D |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 1 |
| B | 0 | 0 | 1 | 1 |
| C | 0 | 0 | 0 | 0 |
| D | 0 | 1 | 1 | 0 |
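A reachability matrix can be derived from an adjacency matrix by computing its transitive closure. The sketch below uses Warshall's algorithm, which is a choice made for this example and is not named in the text. It also assumes, as Table 2 appears to, that a node is not listed as reachable from itself even when a cycle such as B, D, B exists.

```python
ADJACENCY = [
    [0, 1, 1, 1],  # A
    [0, 0, 0, 1],  # B
    [0, 0, 0, 0],  # C
    [0, 1, 1, 0],  # D
]


def reachability(adjacency):
    """Mark b as reachable from a if some path a -> ... -> b exists."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]
    for k in range(n):          # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    # Assumption: like Table 2, leave the diagonal at zero.
    for i in range(n):
        reach[i][i] = 0
    return reach


for row in reachability(ADJACENCY):
    print(row)
```

With the adjacency values of Table 1 as input, the only entry that changes is the one from B to C, which becomes reachable via D.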
Directed graphs
Every transport network is a directed graph, as the goods to be transported can move in only one direction. The arcs or links in the graphs described so far have no "length". Graph theory is sufficiently general to describe graphs with directed links or weighted links.
A distance between nodes in the graph could be calculated by a reachability function which yields a distance of 1 if b is reachable from a and 0 if it is not. Another distance function can be defined by using the length of the shortest path from a to b. It is possible to attach weights to arcs and thus to introduce arc lengths.
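A shortest-path distance over weighted arcs can be sketched with Dijkstra's algorithm, which is a technique chosen for this example and is not named in the text. The arcs follow Table 1, but the arc lengths in `WEIGHTED` are invented purely for illustration.

```python
import heapq
import math


def shortest_distance(graph, start, goal):
    """Dijkstra's algorithm over a dict of {node: {neighbour: arc_length}}."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph.get(node, {}).items():
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf  # goal cannot be reached from start


# Arcs follow Table 1; the lengths are hypothetical.
WEIGHTED = {
    "A": {"B": 2, "C": 9, "D": 5},
    "B": {"D": 1},
    "C": {},
    "D": {"B": 1, "C": 3},
}
print(shortest_distance(WEIGHTED, "A", "C"))  # 6
print(shortest_distance(WEIGHTED, "C", "A"))  # inf
```

Setting every arc length to 1 turns the result into the number of arcs on the shortest path. In a directed graph this distance is generally not symmetric, as the two calls above show.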
Typical examples of weighted graphs are hypertext systems, which will be examined in section 5.1., and textual virtual environments (section 4.4.2.). In those systems pieces of information, called nodes, are interlinked by connections called links. Ordinarily those links are directed.
In graphs it is possible to define distance functions. Simple examples are again simple reachability or adjacency functions. Attaching distances to arcs is also possible. In the case of a textual virtual environment where nodes show a distinct position, the distances between two such nodes should be determined by the spatial concept of the virtual space and by the positions of the nodes. Another possibility to define distances is by calculating the similarity of the contents of nodes. This approach is becoming common practice in hypertext systems.
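One possible way to derive a distance from the similarity of node contents is the cosine distance between word-count vectors. The text does not name a particular similarity measure, so the measure, the function name `cosine_distance` and the sample node texts below are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_distance(text_a: str, text_b: str) -> float:
    """1 minus the cosine similarity of the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    if norm == 0:
        return 1.0  # an empty node shares nothing with any other node
    return 1.0 - dot / norm


# Hypothetical node contents.
node_1 = "maps represent space at a high level of abstraction"
node_2 = "maps represent large spaces at a high level"
node_3 = "robots follow a red line on the floor"
print(cosine_distance(node_1, node_2) < cosine_distance(node_1, node_3))  # True
```

Nodes with overlapping vocabulary end up "closer" than nodes on unrelated topics, independently of whether a link connects them.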
Conclusion
Philosophically space is seen as one of the prerequisites of perception and as a mirror of our thought processes. While Kant assumed space to be independent from its contents, Einstein proved in his general theory of relativity that matter influences and distorts space.
A more general concept of space is the mathematical view of space. Mathematically, spaces are defined as a system of relations on a set of elements. If those elements are n-tuples, they can be interpreted as points in an n-dimensional space. An example of a relation between points in such a space is the distance relation. A distance function that has the properties of a metric is called a "metric". Spaces defined using a metric distance function are called metric spaces. A commonly known example is the Euclidean distance function, which defines Euclidean space.
Topological space is defined on sets and a neighborhood concept. In topological spaces the relation of contiguity is considered. Graphs can be represented graphically or in table-form as an adjacency matrix. Directed graphs are graphs where the edges have a directionality. Distances can be defined in graphs. Examples for directed graphs are hypertext systems and virtual environments.
A flat representation contains less spatial information than the real object - the missing information is provided by the picture-processing done in our heads. By delivering wrong cues to this processing it is possible to deceive spatial perception, as is done in many pictures of M. C. Escher. Spatial perception from a picture alone therefore can be misleading (see figure 3).
Fig. 3: Waterfall by M. C. Escher.
There is an even less spatial representation of spaces that can be perceived using the visual system: abstract representation of space using language. This possibility will be examined in section 2.2.3.
The human visual system is built to see spatially. Using spatial knowledge humans are able to see spatially when confronted with a representation of a spatial scene - for instance a perspective drawing. Such a drawing is an abstraction and can deceive the spatial perception.
Hearing sound spatially is perhaps the strongest spatial perception besides the visual one. The resolution of this perception is much coarser than that of the visual system, but we are able to "hear" what place we are in and we can locate sound sources approximately. The acoustic sense allows humans to monitor the environment, as it perceives from all directions at the same time.
Spatial perception can be induced by the kinesthetic sense when moving at high velocity. When moving in our environment the various senses allowing for spatial perception work together to give us a multimedial spatial perception of that environment.
Maps are another commonly used abstraction to represent space. They represent large spatial constructs at a high level of abstraction. While everybody can perceive spatially from pictures and simple drawings, the reading of maps is a skill that has to be learnt.
The most abstract way to represent objects is through the use of language, either in spoken or written form. Written language evolved from iconics [BOLT91] [MCCL94]. Japanese kanji and written Chinese still use symbolic writing today.
Sharing spatial information about dangerous places and places to find food and shelter probably was one of the first uses of language and so we have a very rich vocabulary to describe environments and spatial relations.
When reading textual descriptions of spaces or maps we are able to construct a mental representation of the space described. Perceiving spatially using descriptions is possible because the descriptions can invoke spatial images in our mind which induce a spatial perception.
Language does not specify space as completely or precisely as (direct) perception. Descriptions often do not exactly specify distances between objects or exact orientation. Therefore the spatial perception from a description relies on general knowledge and assumptions.
An example for reference frames was described by Kevin Lynch: "The system used on the North China plain is a strictly regular one. It has deep magical connotations: north being equated with black and evil, south with red, joy, life, and the sun. It controls very strictly the placing of all religious objects and permanent structures. Indeed, the chief use of the "south-pointing needle", a Chinese invention was not for navigation at sea, but for the orientation of buildings. So pervasive is this system that the country people on this flat land give their direction by compass points, and not by right or left, as would be natural to us. The organizing system does not center on the individual, moving and turning with him, but is fixed, universal, and outside the person." [LYNC60, p.128]
The system Lynch describes is a NSEW reference frame (see below). Bryant proposed a classification of reference frames for his Spatial Representation System (SRS). He sees reference frames as "coordinate systems in which locations can be specified along three dimensions" [BRYA92, p.3]:
The choice of which reference frame to use in a description depends on the language used, on cultural background, and on the situation at hand. This was demonstrated in a study by Pederson. The study looked at two Tamil linguistic systems, one of which is used mainly in cities whereas the other system is in use mainly in the countryside. These two systems show differences in the use of reference frames and therefore in the reference to objects in space [PEDE93]. This shows that descriptions of spatial relationships often depend on context and cultural background of the provider of the description.
By means of formal types of reference it is possible to point out objects and spatial relationships precisely. Everyday language is far less precise. Our language mostly provides words for coarse-level descriptions of the environment like "next to", "between", "to the left of" or "on top of". Those words always describe location as a relation to another element. Other words like "within", "contains" or "borders" are descriptions in relation to a reference frame that is often assumed only implicitly.
When describing spatial relationships, relations between objects must be described according to a frame of reference. According to a classification scheme proposed by Bryant there are several types of reference frames. The most important of them is the allocentric reference frame. The choice of which reference frame to use in a description depends on the situation and on the cultural background of the describer.
Montello defines scale as the ratio between the dimensions of a representation and those of the thing that it represents [MONT93, p.313]. He therefore sees scale as the relative size of a representation. According to Montello there can be confusion between large-scale and large-size, because in one case size is considered relative to the real environment whereas in the other it is considered relative to a person. Montello therefore introduces a different terminology that avoids this ambiguity.
He uses a four level classification of psychological spaces. He bases this classification on the earlier scale concepts of Ittelson, Mandler and Zubin. Montello distinguishes classes of spaces according to the "projective size" of the space relative to the human body and not according to the actual or apparent absolute size. Therefore a large scale space viewed from a distance may become a smaller-scale space:
An important task involving spatial information is to communicate spatial relationships to other people. Communication of spatial relationships is closely related to the issue of scale as many verbal and gestural descriptions of space contain an assessment of scale.
The simulation of spaces on a computer relies on a clear concept of scale too. A representation of a space on a computer screen is a tabletop-scale space representing larger or smaller spaces. Computer screens normally cannot be seen as environments but they are spatial simulations of environments. This is not evident in systems that try to immerse the user in a synthetic environment using a stereo-representation of a space (see section 4.4.). The way information is represented in those systems may well influence the perception of scale.
The scale of spaces can influence the choice of reference frames. Montello distinguishes four types of spaces based on the "projective size" of those spaces. Verbal or gestural descriptions often contain implicit assessments of scale. A representation of a space can influence the perceived scale of the space and therefore the way it is described.
A common example of a textual description of an environment is a guide-book. In an informal review of guide-books done by Tversky two main types of perspective were discovered:
A route perspective takes readers on a mental tour of the environment, describing landmarks with respect to the (mentally) changing position of the reader in terms of the reader's front, back, left, and right. A survey perspective gives the readers a bird's eye view, and describes landmarks relative to one another in terms of north, south, east and west. These two perspectives have parallels with two major means of learning about environments, the first through exploration, and the second through maps. [TVER93, p.19]
The route perspective often uses a different reference frame in the description than the survey perspective does. It is very likely that a route perspective uses reference frames like those proposed by Bryant (egocentric, allocentric or external reference frames) whereas a survey perspective relies on a fixed reference frame such as NSEW.
Conclusion
Space can be perceived in a variety of ways. This makes it possible to verify spatial perception and makes people very confident in spatial perception. It is possible to perceive space from abstractions, like maps, perspective drawings and language descriptions. Language descriptions of space make use of reference frames. The choice of reference frame in a description depends on cultural issues, on the situation, and on the scale of the space described. Representing space can change the perceived scale of the space. The scale of the space and the perspective of the describer influence how the space is described. Representations of space have to be designed by keeping these issues in mind because otherwise the representation can lead to a wrong perception of the space.
The various models of the ability to learn an environment and to remember spatial relationships form the basis to understanding misjudgments in spatial relationships. This section looks at diverse mental representations. Most models differ in the sequence in which properties of spaces are learned. This sequence is of importance to virtual environments which are designed to ease navigational tasks.
A property of the physical environment of distinguished psychological importance is the fact that the environment completely surrounds us. Thus it is not possible for us to experience or perceive all of it at any one instance. We can only turn our attention to discrete aspects of the environment at successive points in time. However, in order for our behavior to be appropriate, effective, or adequate in relation to the physical environment, it is necessary for it to proceed in a continuous fashion. To explain the way in which this discrete experience can produce continuous interaction it is necessary to postulate some representational process on the part of the individual. This "representation" must amalgamate experience into a form which links discontinuities in perception and allows extrapolations to facilitate preparation for future action.
In the case of an individual coping with a city, it is further necessary to assume that the postulated amalgamation of experiences is somehow summarized. This summary consists of a pattern or structure, the resultant effect of which is to organize the 'representations' of the experiences of the city and their implications.
Note that the term 'representation' is in quotation marks. This is because there is nothing within our description of it to say that it must have any direct relationship to the usual methods of representing cities. [CaTa75, pp. 59-60]
Learning route-knowledge
Way-finding is the ability to learn a route through the environment. The work of Piaget and others on learning this route-knowledge forms a basis for most theories of way-finding which were accepted for a long time [BLAD93]. According to those theories the cognitive map is learned in small steps:
Today it is accepted that children and adults can perform way-finding tasks successfully with little experience and that landmarks and routes are learnt conjointly and very quickly.
As maps they [the cognitive maps] are presumed to be coherent wholes that reflect spatial relations among elements. As mental constructs available to mental inspection, cognitive maps are presumed to be like real maps available to real inspection, as well as like images which, according to the classical view of mental imagery, are like internalized representations. [TVER93, p.14].
This description states that cognitive maps are presumed to be coherent wholes. Spatial memory however has certain flaws which cannot be explained by a single coherent map-like construct like the cognitive map. Tversky reviews several studies which indicate that such a map-like construct is not very probable [TVER93] (see also section 2.3.5.). She therefore describes two other, constructivist views of how spatial environments could be represented mentally: the cognitive collage and the spatial mental model.
This section introduced the concept of the cognitive map and explained why it is assumed that the cognitive map is not a very adequate model for spatial memory. Piaget's model of how children acquire route-knowledge was introduced.
Tversky describes this representation as follows:
In these cases, rather than resembling maps, people's internal representations seem to be more like collages. Collages are thematic overlays of multimedia from different points of view. [TVER93, p.15].
As these constructs represent spatial relationships from various points of view they do not contain coherent metric information. The term metric here does not necessarily refer to the properties of a metric distance function like those presented in section 2.1.2.
The cognitive collage as an alternative representation of spatial knowledge sees spatial memory as a multimedial representation of disparate pieces of spatial information. Cognitive collages do not contain coherent metric information.
Unlike cognitive maps however they [mental models] may not preserve metric information. Unlike cognitive collages, they do preserve coarse spatial relations coherently. These are relations that are easily comprehended from language as well as from direct experience. [TVER93, p.15].
The spatial mental model captures the inexact way people often speak about spatial relationships. That is, such models contain metric information but only on a coarse level. The spatial mental model is an adequate model for representing spatial knowledge used in descriptions of spaces where terms like "next to", "near" and so forth are commonly used.
Kuipers suggests in his TOUR model that the mental representation of spatial knowledge is more like many maps in the head, loosely related, for the cognitive map certainly lacks the global consistency of a single printed map [KUIP78, p.132].
This mental representation is like a network made up of streets and intersections where the exact shapes and lengths of the links in the network are often unimportant. A third component of the mental representation he proposed is something like a catalog of routes. The TOUR model shows well that the mental representation frequently contains distortions as the exact lengths and shapes of links are often unimportant.
Yet another mental representation, the Spatial Representation System (SRS) is proposed by Bryant [BRYA92]. It is separated from any other memory processes. The basic idea is that people create the same sorts of cognitive maps and mental spatial models from verbal descriptions and direct observation. This suggests that people have a distinct spatial representation system that creates spatial models from disparate sources of input and is independent of memory systems for other domains of knowledge. The primary role of the Spatial Representation System is to organize spatial information in a general form that can be accessed by either perceptual or linguistic mechanisms. [BRYA92]
Systematic errors in distance perception
In [TVER93] Tversky reviews several studies about systematic errors in distance perception. For example distances between functionally grouped buildings are perceived as being smaller than distances between buildings that do not belong to the same functional class. This fact is seen as evidence that the mental representation is structured hierarchically. Similarly distances between landmarks located farther away are perceived as being smaller than distances between landmarks that are close by.
Particularly interesting is an asymmetric distortion in the perception of distances between a landmark and another point of reference: The distance from the point of reference to the landmark is perceived as being smaller than the distance from the landmark to the point of reference.
Another systematic error concerning the estimation of distances occurs when estimating the length of a route that features many barriers, detours or turns, or that has more "clutter". Typically such a route is perceived as being longer than an equally long route which features fewer turns and less "clutter".
Systematic errors in the perception of alignment
Another typical error in the spatial memory concerns the rotation of areas according to a frame of reference. Several models of spatial memory assume partly or only weakly related items in the mental representation. The relation of such pieces to each other is learnt only vaguely. As a consequence people show a tendency to straighten objects according to axes even if these objects are not aligned properly in reality. It is also common to straighten out irregular features like rivers and streets.
These systematic mistakes partly are observable only in environments that show much freedom in alignment of objects. If an environment is described mainly using language the degree of freedom for rotation is reduced or the rotation is not mentioned explicitly at all. To be exact - language itself does not impose restrictions on how this kind of information is described. However in natural language the possible set of values for orientations and positions is restricted by the descriptions. This can be overcome only by application of a more formal language.
Another way to communicate this inexact type of information is by making use of gestures. This type of mixed-mode communication is used in a system under development at the MIT MediaLab. The system uses verbal and gestural information in conjunction to control an application moving objects in a virtual space. The system is controlled mainly by spoken language, but directions, distances and amounts of rotation are input into the system using gestures. A typical command in this system is "Tilt the floor like that". Using the gesture the amount and plane of tilt is "shown" to the system. The system thus avoids using a formal command language to state an exact number of degrees to tilt the floor [KoSp94].
This type of communication is very common in human communication. A typical example of such gestures is given by Desmond Morris in his book "Peoplewatching". People when asked about directions commonly use a pointing gesture. The angle the pointing arm takes in relation to the floor, that is how much people point up, is directly related to the distance of the location pointed out.
Conclusion
This section describes several models of spatial memory. The various models try to explain systematic mistakes in spatial memory. The most commonly used model - the cognitive map - is seen as a map-like construct. It does not explain systematic errors in spatial memory well. The cognitive collage sees the mental representation as a loose collection of multimedial information with little metric information. The spatial mental model assumes metric information of a coarse kind to be present. This model explains the coarse use of spatial relationships in language descriptions well. The TOUR model is a computational model of spatial memory. It is based on five different kinds of spatial information. It shows well that spatial memory often contains distortions.
The lexicon definition of navigation sees the navigational task mainly as a planning task. In this thesis navigation is not always associated with planning however. In the context of computers it is common to use the term navigation also for a less planful and more or less tentative way finding. That means every task which aims at locating objects in space or to "navigate to that object" is an act of navigation even if this activity is not planned. It is even seen as navigation to proceed using random decisions about where to navigate at various points on a route. The term navigation is used in the computing domain even without a clear spatial aspect of the task.
Is navigation without memory possible?
Navigation is not necessarily a planning activity. Simply following a red line painted on the floor certainly is a task of navigation although this type of navigation requires neither memory nor intelligence - it is the same kind of navigation that robots delivering raw materials in factories sometimes make use of. (In Boston such a red line indicates the so-called Freedom Trail -- a route almost every tourist walks as it contains most major sights in the city.)
Another example of navigation without memory is finding the exit of a maze by always choosing the left turn. As long as the maze has no loops this strategy is bound to find the exit of the maze.
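The left-turn strategy can be sketched as a small grid walker. The maze layout, the symbols `S`, `E` and `#`, and all function names below are assumptions made for this example. Between steps the walker keeps only its current cell and heading; the `path` list is recorded for inspection and is never consulted when choosing a move.

```python
# Hypothetical maze without loops: '#' is a wall, 'S' the start, 'E' the exit.
MAZE = [
    "#########",
    "#S#     #",
    "# # # ###",
    "#   #   E",
    "#########",
]

# Headings in clockwise order: north, east, south, west, as (row, column) steps.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def find(maze, symbol):
    """Return the (row, column) of the first occurrence of `symbol`."""
    for r, row in enumerate(maze):
        c = row.find(symbol)
        if c != -1:
            return r, c
    raise ValueError(f"{symbol!r} not found in maze")


def is_open(maze, r, c):
    """A cell can be entered if it lies inside the grid and is not a wall."""
    return 0 <= r < len(maze) and 0 <= c < len(maze[r]) and maze[r][c] != "#"


def left_hand_walk(maze, max_steps=10_000):
    """Walk from 'S' to 'E', preferring left, then straight, right, back."""
    pos, exit_pos = find(maze, "S"), find(maze, "E")
    heading = 0  # arbitrary initial heading (north)
    path = [pos]
    for _ in range(max_steps):
        if pos == exit_pos:
            return path
        for turn in (-1, 0, 1, 2):  # left, straight, right, back
            new_heading = (heading + turn) % 4
            dr, dc = HEADINGS[new_heading]
            nxt = (pos[0] + dr, pos[1] + dc)
            if is_open(maze, *nxt):
                pos, heading = nxt, new_heading
                path.append(pos)
                break
        else:
            raise RuntimeError("walker is walled in on all sides")
    raise RuntimeError("step limit reached; the maze may contain loops")


route = left_hand_walk(MAZE)
print(route[-1])  # (3, 8)
```

In this maze the walker first runs into the dead end at the top right, turns around, and only then takes the branch leading to the exit. The step limit guards against mazes with loops, where the strategy is not guaranteed to succeed.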
According to Montello (personal communication) Rodney Brooks from MIT argues that robots can be designed without programming any "cognitive map" or similar type of memory. Also according to Montello, James J. Gibson, a psychologist, proposed a "direct" theory of perception in which information is "picked up" from the environment without the need for comparison to memory, and without the need for interpretation with respect to internal representations stored in memory. The examples of the maze and of the robot are examples of navigation using this "direct" perception.
Several models for spatial memory have been described. The various models try to explain systematic mistakes in human spatial memory in distance and alignment estimations. The frequently used concept of the cognitive map is not an adequate model as it cannot explain these errors well. Other models explained are cognitive collages, spatial mental models, the TOUR model and the Spatial Representation System. These manage to explain systematic errors better as they do not see spatial memory as a coherent whole but as collections of various types of information with more or less metric information.