This chapter introduces the concept of the human-computer interface and the idea of using a spatial concept to design such an interface. The human-computer interface (or user interface) is the connection between a user and an object to be used by this user. This object may be as simple as a light switch. In the context of this thesis, the user interfaces of interest are those that allow people to operate computers to organize and navigate information spaces.
In computer systems, navigation does not involve physical movement of the user - instead, information or the representation of an environment is moved in front of the user's eyes on a screen. The navigation task in computer environments therefore concerns moving the environment around the user.
Still, the main questions in navigating computer environments are similar. The task of navigating a computer environment therefore consists of answering the questions:
Navigational strategies in computer systems
Navigation in computer systems is similar to navigation in real world environments as it allows the same navigational strategies to be used:
Such text-based user interfaces were later enriched with menus which present the available commands. As the commands are visible to the user, she does not have to recall the exact command text but only to recognize it. Selecting a command from a menu is done using a pointing device or a few keystrokes.
As was pointed out in the last section the use of metaphors is one of several ways to control the complexity of the user interface:
An alternate approach to controlling the complexity of user interfaces is to design interface actions, procedures and concepts to exploit specific prior knowledge that users have of other domains, for example, to design an office information system using the metaphor of a desktop. Instead of reducing the absolute complexity of an interface, this approach seeks to increase the initial familiarity of actions, procedures and concepts that are already known. The use of interface metaphors has dramatically impacted actual user interface design practice. [CaMK88, p.67]
It should be pointed out that whatever representation is chosen in an interface is always a metaphorical representation. The "object" we are referring to truly consists only of streams of bits in the computer system, in total absence of concepts such as files, folders, desktops, rooms and so forth. To be exact - even the concept of the bit is a metaphor.
Metaphors, by definition, must provide imperfect mappings to their target domains. If a text-editor truly appeared and functioned as a typewriter in every detail, it would be a typewriter. The inevitable mismatches of the metaphor and its target are a source of new complexities for users. [CaMK88, p.69]
These mismatches are often an important part of what makes a metaphor powerful. If they are designed well, mismatches can contribute considerably to making a system useful. The user interface principle of forgiveness is particularly important in metaphor mismatches - it allows the user to explore those unfamiliar features of the system, and by exploring them she easily learns to use them for her own benefit.
To handle metaphors and metaphor mismatches more efficiently in design, several approaches have been proposed [CaMK88].
Formal representation of metaphors
As an example of a formalization of metaphors, this section looks at a formal treatment of metaphors as algebras. It was mentioned already that the structural approach treats metaphors formally as a mapping from a source domain into a target domain. The mapped elements of the metaphor are the objects and operations. The kinds of structures that are preserved during this metaphorical mapping are called image-schemata. Several approaches to formalizing user interface metaphors have been described in the literature (for instance [INDU92]).
Kuhn and Frank describe a formal approach based on algebraic specifications [KuFr91], see also [EhGL89]. In this approach metaphor domains can be formalized by algebras, metaphorical mappings by morphisms, and image-schemata by categories. An algebraic specification describes objects in terms of operations. It consists of three parts [KuFr91]: a list of sorts, a list of operators defined on those sorts, and a list of axioms (equations).
As an example consider the abstract specification of a desktop. The sorts used in the example are Desktop, Folder and Bool. Operators are defined to create a new desktop, to put a folder on the desktop, to get a folder from the desktop and to check whether a folder is on the desktop.
The formalization defines the exact meaning of each operation using equations (the axioms). The meaning is therefore independent of any interpretation of the names of objects and operators. (The example is taken from [KuFr91, p. 422].)
Desktop
  Sorts  Desktop, Folder, Bool
  Ops    new: -> Desktop
         put: Desktop x Folder -> Desktop
         get: Desktop x Folder -> Folder
         on:  Desktop x Folder -> Bool
  Eqs    on(new,folder) = false
         on(put(desktop,folder1),folder2) =
           if folder1 = folder2 then true else on(desktop,folder2)
         get(put(desktop,folder1),folder2) =
           if folder1 = folder2 then desktop else put(get(desktop,folder2), folder1)
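The equations of the Desktop specification can be tried out directly. The following is a minimal sketch, not the formalism of [KuFr91] itself: it assumes that desktops are represented as terms built from `new` and `put`, that folders are plain strings, and that `get` yields a desktop (as the right-hand sides of the equations suggest, although the signature lists Folder as its result sort). The class names `New` and `Put` are illustrative choices.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class New:
    """The constant `new`: an empty desktop."""


@dataclass(frozen=True)
class Put:
    """The term put(desktop, folder)."""
    desktop: "Desktop"
    folder: str  # assumption: folders are identified by name


Desktop = Union[New, Put]


def on(desktop: Desktop, folder: str) -> bool:
    # on(new, folder) = false
    if isinstance(desktop, New):
        return False
    # on(put(d, f1), f2) = if f1 = f2 then true else on(d, f2)
    if desktop.folder == folder:
        return True
    return on(desktop.desktop, folder)


def get(desktop: Desktop, folder: str) -> Desktop:
    # The specification gives no equation for get(new, folder).
    if isinstance(desktop, New):
        raise ValueError("get is not specified for the empty desktop")
    # get(put(d, f1), f2) = if f1 = f2 then d else put(get(d, f2), f1)
    if desktop.folder == folder:
        return desktop.desktop
    return Put(get(desktop.desktop, folder), desktop.folder)


# Usage: put two folders on a new desktop, then take one away again.
d = Put(Put(New(), "letters"), "bills")
remaining = get(d, "letters")
```

Because every function body is a transcription of one equation, any behaviour that the sketch cannot compute (such as taking a folder from an empty desktop) corresponds to a case the specification leaves open.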
When an electronic desktop - for instance the Macintosh desktop - is defined in the same way, we get the following specification. It is identical to the previous specification except for the prefix "E" in the sorts, because the example specifies only image-schemata:
EDesktop
  Sorts  EDesktop, EFolder, Bool
  Ops    new: -> EDesktop
         put: EDesktop x EFolder -> EDesktop
         get: EDesktop x EFolder -> EFolder
         on:  EDesktop x EFolder -> Bool
  Eqs    on(new,efolder) = false
         on(put(edesktop,efolder1),efolder2) =
           if efolder1 = efolder2 then true else on(edesktop,efolder2)
         get(put(edesktop,efolder1),efolder2) =
           if efolder1 = efolder2 then edesktop else put(get(edesktop,efolder2), efolder1)
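The idea that a metaphorical mapping is a morphism between the two algebras can be illustrated with a small check. The sketch below is an assumption-laden illustration rather than part of [KuFr91]: the source desktop is modelled as a tuple of folder names, the electronic desktop as a frozenset of names prefixed with "E:", and the functions `map_folder` and `map_desktop` are hypothetical names for the mapping. A mapping is structure-preserving if applying an operation and then mapping gives the same result as mapping first and then applying the corresponding operation in the target domain.

```python
# Source domain: a physical desktop, modelled as a tuple of folder names.
def put(desktop, folder):
    return desktop + (folder,)


def on(desktop, folder):
    return folder in desktop


# Target domain: an electronic desktop, modelled as a frozenset of efolders.
def eput(edesktop, efolder):
    return edesktop | {efolder}


def eon(edesktop, efolder):
    return efolder in edesktop


# The metaphorical mapping (hypothetical): one function per sort.
def map_folder(folder):
    return "E:" + folder


def map_desktop(desktop):
    return frozenset(map_folder(f) for f in desktop)


def preserves_structure(desktop, folder):
    """Check the morphism condition for put and on at one sample point."""
    put_ok = map_desktop(put(desktop, folder)) == eput(
        map_desktop(desktop), map_folder(folder)
    )
    on_ok = on(desktop, folder) == eon(map_desktop(desktop), map_folder(folder))
    return put_ok and on_ok
```

Checking a handful of sample desktops in this way does not prove that the mapping is a morphism, but a single failing sample would expose a mismatch between the source and target domains.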
A formal representation of every operation easily makes differences in the source and target domains of the metaphor explicit. A full specification of a metaphor will include many additional operators. It is a design decision which operations to include and which to omit.
In an actual interface design process, the designer has to decide which features of a source domain are to be considered salient and which are not [KuFr91, p.423]. An example of a feature which is normally left out of electronic desktops is the fact that objects can fall off a real desktop. For more detail see the next section, which focuses on these "richness issues" in user interface metaphors.
Image-schemas are those structures in a metaphorical mapping that stay invariant under the mapping. Examples for such schemas in the context of the desktop metaphor are the schema of the surface to put something on and the container to put something in.
If image-schemas are common to different parts of a composite metaphor, the composite metaphor is more likely to be perceived as coherent than if this is not the case [KuFr91]. In any metaphor the algebraic specification defines an image-schema or a combination of several of them. This specification of the pure schema can then be enriched towards specifications of the complete source and target domains.
This section briefly described three approaches to handle metaphors: the operational, the structural and the pragmatic approach. A way of formalizing metaphors as mappings from a source domain to a target domain using algebras was described as an example of the structural approach.
An exact simulation of a typewriter performs only typewriter functions - nothing else. An exact simulation of a book on a computer would force the user to slowly turn one page after the other: (...) imagining an online documentation system that displays a document as a book; users display the next page of text by pointing (with the mouse) to the corner of the page, depressing the mouse button to 'grab' the corner, and pulling it across the screen to the other side, with an accurate animation of the whole sequence. [JOHN87, p.21]
Another example is an electronic mailing system: Would it be appropriate for users to have to affix electronic stamps, obtained from an electronic 'post office', to a document before sending it on its electronic way? That depends upon how conscious one wanted users to be of the cost of mailing an object. If, as in most electronic mail systems, users are not charged on a per-mailing basis or not at all, then adding such detail seems silly. On the other hand, if users are for some reason charged on a per-mailing basis, perhaps even depending upon the size of the mailed object, such a design might well be superior to the one used in most commercial time-sharing systems, in which the dollar balance in user's accounts decreases invisibly, requiring users to ask the system repeatedly for the amount remaining. [JOHN87, p.22]
The appropriate level of detail in a metaphor therefore must be determined on a case-by-case basis. Richness issues in metaphors concern not only the question of which aspects of the real world to include in or exclude from a metaphor, but also where to go beyond the source domain, as is looked at in the next section.
Several application programs even today create richer object representations for their files. Instead of icons, which are small symbolic abstract representations of the file type, they use document proxies, which are small pictorial representations of the contents of the file. An example is Adobe Photoshop, which uses icons that are small-scale versions of picture files. Such proxies are often easier to identify on the screen than icons with names. This need for richer representations is clearly recognized in the literature [HoSa93], [MaSW92]. The challenge is to enrich user interfaces without overloading them with so much information that they become unusable because of "information clutter". Figure 13 shows the development of file representations from the file name over the generic file icon to typed icons and further to document proxies.
Fig. 13: Development of object representations.
This section focused on the richness issue in user interfaces. User interfaces as mappings are bound to be incomplete mappings. It is a design issue which features to map and which not to map. A complete mapping normally does not make sense as it would map the disadvantages of the source domain too.
Conclusion
This section described the human-computer interface. Modern user interfaces are based on a set of principles like metaphors and direct manipulation. Metaphors are mappings from a source domain to a target domain. Using a structural approach to metaphors, they can be formalized as algebras. These consist of a list of "sorts", a list of operators defined on those sorts and a list of axioms (equations). Source and target domains of metaphors can be formally described using abstract specifications, which make differences in the source and target domains explicit. The structures that are identical in both domains are called image-schemas. A metaphor can formally be defined by first specifying the image-schemas and then enriching the domains to include additional operators. In many cases it is reasonable to enrich the metaphor beyond the source domain. These additional features are termed "magic features". Magic features can make metaphors much more useful, but they must be pointed out to users and they should be designed so that their workings are easily comprehensible.
I define spatial metaphors as user interface metaphors which make their space concept explicit and where the spatial relation that is expressed in this space has or can have a defined meaning.
There are many user interface metaphors that do not fulfill this definition fully. Most spatial metaphors allow users to rearrange objects in a sort of "space", but this rearranging has no meaning - that is, an object A is in no way different if it is positioned to the left of, to the right of, or below an object B. In a true spatial metaphor the spatial position of an object and the arrangement of several objects have a certain meaning. This meaning can be predefined or it can evolve from working with the system. This spatial conception has to be consistent throughout the system - otherwise it communicates no information. A consistent spatial conception in the user interface can serve as a mnemonic system and as a means of communication.
It is not the computer system itself that defines this space for the user but the user interface. Only a few user interfaces make those spaces explicit. In those systems the computer system can be seen as a real "environment". This situation makes the term "navigation" even more logical in the context of a computer system.
An example of this is - again - the Macintosh user interface. On the desktop, objects can be moved freely and the spatial concept is not at all clear. While one user decides to group objects on the desktop simply on a space-available basis, another one groups related items.
When objects on a desktop are arranged mainly on a space-available basis, this arrangement expresses nothing. In the following figure there are several groups of objects, two of which can be seen as one larger group. Another group consists of three objects and there is even one object that stands on its own. Note that the groups consist of objects of different types - another possible arrangement therefore is based on the object type (see figure 14).
Fig. 14 - Arrangement of objects according to meaning.
The semantics of the spatialization is not explicitly given, but it is easily understood because of cultural habits - most people would arrange objects like that when "they belong together". But as long as the semantics of the spatialization is not clearly defined or immediately understandable because of conventions, the spatialization is useful only for the person who created it.
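The convention that objects placed close together "belong together" can be made explicit in software. The following is a rough sketch of one possible reading, assuming icons are points on a 2D desktop and that two icons belong to the same group when they are linked by a chain of gaps no larger than a chosen threshold; the function name `proximity_groups` and the parameter `max_gap` are illustrative and not taken from any of the systems discussed.

```python
from math import hypot


def proximity_groups(positions, max_gap):
    """Group icon names that are chained together by gaps of at most max_gap.

    positions maps an icon name to its (x, y) position on the desktop.
    """
    unvisited = set(positions)
    groups = []
    while unvisited:
        seed = min(unvisited)  # pick deterministically for repeatable output
        unvisited.remove(seed)
        group, frontier = {seed}, [seed]
        while frontier:
            cx, cy = positions[frontier.pop()]
            # Collect all still-ungrouped icons close to the current one.
            near = {
                name
                for name in unvisited
                if hypot(positions[name][0] - cx, positions[name][1] - cy) <= max_gap
            }
            unvisited -= near
            group |= near
            frontier.extend(near)
        groups.append(sorted(group))
    return groups


# Usage: three icons in a row and one isolated icon.
icons = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (10, 10)}
groups = proximity_groups(icons, max_gap=1.5)
```

Such a rule only recovers the groups; what a group means still has to be defined by the interface or agreed upon by its users.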
The special cognitive reality of space (...) makes the spatial domain particularly suitable as a medium for conveying knowledge, since its properties are universal to different cognitive systems. Thus, the spatial domain can be used particularly well as the source domain for metaphors with a non-perceivable or abstract target domain. In this way, the properties of physical space can be used as vehicle for conveying non-spatial concepts, provided there exists a mapping from the non-spatial to the spatial domain. (...) I propose that our knowledge about the organization of space serves as a "cognitive interface" between abstract and non-perceptual knowledge and the "real world". In other words, we may interpret non-spatial concepts by mentally transforming them into spatial concepts (i.e. understanding them in terms of spatial concepts), carrying out mental operations in this "visualizable" and "graspable" domain and transforming the results into the original domain. [FREK91, p.362]
Text-based systems
In a spatial user interface objects have a representation and a distinct spatial position. The word "representation" hints at a graphical representation as was shown in the examples of section 4.3.2. However this representation can be entirely text-based. For example in text-based user interfaces like MS-DOS the representation of a file object consists of a filename. Providing the filename along with the path gives the file a "location". However as this user interface does not use a spatial metaphor this location is without meaning.
Text-adventure games like Zork are entirely text-based systems that use the metaphor of a landscape. The system presents descriptions of locations and possible "exits" that lead to other locations. Those exits commonly use directions like "up", "down", "west" or "north" and therefore create an explicit spatial context. Objects, rooms and activities in the game are described entirely using text. At first sight it seems surprising that these game environments manage to create a very realistic spatial environment by using textual descriptions only; on second thought the same is true for a good book (see also chapter 6). These systems are seldom considered "spatial user interfaces". However, they show the potential to act as information organizers [DiTr93a], [DiTr93b], [MaOs93].
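The spatial structure of such a text-based environment amounts to a set of locations connected by named exits. As a minimal sketch (the room names and the table `EXITS` are invented for illustration and are not taken from Zork), locations and their exits can be stored in a dictionary, and navigation becomes a lookup per direction:

```python
# Each location lists its exits as direction -> destination (hypothetical map).
EXITS = {
    "hall": {"north": "library", "up": "attic"},
    "library": {"south": "hall"},
    "attic": {"down": "hall"},
}


def move(location, direction):
    """Follow an exit; stay in place if there is no exit in that direction."""
    return EXITS[location].get(direction, location)


def walk(start, directions):
    """Apply a sequence of directions and return the final location."""
    location = start
    for direction in directions:
        location = move(location, direction)
    return location


# Usage: go to the library, come back, then climb to the attic.
final_location = walk("hall", ["north", "south", "up"])
```

Note that nothing in this structure forces the space to be geometrically consistent: going "north" and then "south" returns to the starting point only because the table was written that way.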
Desktop based metaphors
Most spatial systems are based on graphic representations. One of the first graphical spatial user interfaces - using the desktop metaphor - was used in the Star system created at Xerox PARC (see for instance [BRSV87]). The desktop is an example of how a user interface metaphor can be used to help users understand the complexity of file management in terms of an office-based metaphor. The desktop metaphor was further developed in systems like the Macintosh Finder desktop interface.
Room-based metaphors
Extending the desktop metaphor to a larger space leads to the metaphor of the room. An example is the Rooms system [HeCa86]. Objects are located in a closed space called a room. Objects sharing the same room show a closeness relation and there is certain functionality that operates on the objects in this closed space. Rooms represent clusters of objects belonging together. Users are able to navigate those spaces and thus navigate from one context, or one set of tasks and related applications, to another. This is like moving from a living room, which is dedicated to one set of tasks, to the work room, which is dedicated to an entirely different set of tasks. Accordingly the various necessary tools are present or absent in those rooms.
Such spatial user interfaces are very promising at first sight - for instance there is much "room" to put things. Those things can be moved closer or further away and still may be visible for direct manipulation. When moving within this space the users can shift their focus or context from one set of things to another set.
However there are several open questions like: How shall users move? How big should such places be? How shall objects be represented? Do we want or need all that information to be visible? Those questions directly lead to researching navigation in those spaces.
Another room-based spatial metaphor was used in the "3D/rooms" project at Xerox PARC [CLAR92]. Rooms are "multiple virtual worksets" connected by doors and windows. Within the rooms there again are various tools. This project uses 3-dimensional visualizations like the perspective wall or cone-trees [RoCM93]. The basic idea of the rooms system and of its predecessor BigScreen [FeSh91] is to break the working area of users into spaces that are task-related, like a room for text-processing and another room for graphics-processing. In each room the right tools are present and tools not needed to perform the task are absent - this again reduces the complexity of the separate spaces.
While rooms in these systems are abstract, other systems like the ARK workspace, an extension to the Macintosh desktop, and Magic Cap try to extend the desktop metaphor to a real room and beyond.
Another system using a metaphor that is close to a rooms concept has been proposed in [FAHL93], [BeFa93], or [BBCH93]. This system is interesting as it defines a small space around the user. This space is an explicit realization of a "private sphere". It is called "focus" and is used to represent the user's interest in objects in space. A related concept is the "nimbus", which shows how much users project their presence to others. These two concepts are extended to the notion of "aura", "an enabler of communication between two people in a virtual space". If the auras of two users intersect, a conference is created. In this system this personal space has an explicit meaning - the intent to communicate. The system further provides blackboards and meeting-tables which possess a large aura and allow several people to communicate in this space. For a more detailed description of this metaphor see [FAHL93], [BeFa93] or [BBCH93].
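The rule that intersecting auras create a conference can be sketched in a few lines. The sketch assumes, purely for illustration, that an aura is a circle around a position in a 2D space and that shared objects such as meeting-tables are treated like participants with a large radius; the class `Participant` and the function names are hypothetical and do not reflect how the cited systems are implemented.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Participant:
    name: str
    x: float
    y: float
    aura_radius: float  # assumption: the aura is a circle of this radius


def auras_intersect(a, b):
    """Two circular auras intersect if the centres are no further apart
    than the sum of the radii."""
    return hypot(a.x - b.x, a.y - b.y) <= a.aura_radius + b.aura_radius


def conferences(participants):
    """Return the name pairs whose auras intersect."""
    return [
        (a.name, b.name)
        for a, b in combinations(participants, 2)
        if auras_intersect(a, b)
    ]


# Usage: two users standing close together and one user near a meeting-table.
space = [
    Participant("alice", 0.0, 0.0, 1.0),
    Participant("bob", 1.5, 0.0, 1.0),
    Participant("table", 10.0, 0.0, 3.0),
    Participant("carol", 8.0, 0.0, 0.5),
]
pairs = conferences(space)
```

The large radius of the table is what lets a user join a conversation from further away than would be possible with another user's personal aura.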
Landscape metaphors
Extending the space of rooms again to a larger real-world construct leads to metaphors of houses, cities and landscapes. An example is the Magic Cap system, where it is possible to walk out of an office-room and to leave the house. In the system several houses are available that - similar to the rooms metaphor - provide diverse functionality. Other systems based on similar concepts are Apple's eWorld system or the WebWorld system. (The WebWorld can be reached at: http://sailfish.peregrine.com/WebWorld.welcome.html) The WebWorld system is an example of a system where the spatial concept is not well defined, which makes navigation difficult.
A house metaphor was used in a system called the BOOKHouse [PeGo90]. BOOKHouse used the metaphor of a library to provide easy access to a public library. This system had a strong focus on icons to describe clusters of related documents. To each icon a set of search keywords was associated for searching the library. Books in the library were represented on shelves like users see them when using a real library. In a talk at Hypermedia '93 in Zürich Mrs. Pejtersen - one of the developers of the system - described how this representation of the objects biased selection of books. Books that were not read for years before introduction of the system were suddenly the "runners" when those books got an appealing outfit.
A landscape based approach is described in [FLOR90]. Florin concentrates on how different types of data are represented using five embedded metaphors.
In Information Farming data is represented as fields. However, in this metaphor those fields contain only data which is frequently used and which is not raw data. Raw data is held in information swamps or forests or, if it is data coming in continuously, it can be visualized as information streams or rivers. Static data is collected in barns. All those elements are joined together by roads and plazas - social places inhabited by farmers and their agents. The Information Farming metaphor supports collaborative work because several users, depicted as "avatars", may work together in the landscape. Private work and private social transactions take place in houses which have an owner who controls access. [BERN93]
Representing the user in the space
Most current user interfaces use a representation of a "world" of objects. The user is present in this world by virtue of a pointer. This representation of the user can interact with objects, touch them, move them and so forth. Another example of user representation is realized in text-based adventure games (see also sections 4.4.2 and 6.2). In those systems users commonly possess an easily changeable full-body representation which can be seen by the other users. This body representation communicates activities of the player to other users. Such a full-body representation is sometimes called an "Avatar". (Avatar is a term from Hindu mythology and signifies a manifestation of a deity, a "god walking the earth". Generally it signifies Vishnu, the pervader and sustainer in the Hindu pantheon. Avatar can also be seen as a visual manifestation of an abstract concept. In the context of virtual environments all these meanings seem to fit in a way.)
The Information Farming metaphor is an example of a system making explicit use of the Avatar concept [BERN93]. Every avatar must obey a set of laws that restrict the Avatars' activities in space. Avatars can be in only one location at a time for instance. Avatars may use animals (agents) to help them.
This review of spatial user interface metaphors is far from being complete. As spatial user interfaces are rather popular today there are many new ideas brought forth every day. Among other metaphors that have been proposed are information oceans, liquid architecture [NOVA91], information forests [RIFA94], more abstract information landscapes like [CHAL93], travel metaphors [HaAL87], [HaAL88] and many others like [BRIL93], [DaSi93], [ERIC93a], [FAIR93], [LaMa91], [MaSh94], [MuPi93], [PaSu93], [PEMB93], [SmWi93], [TSPM91], and [VÄÄN93b].
Conclusion
This section defined the ideal spatial user interface as a user interface that makes its spatial conception explicit and where the spatial relationship between the objects represented has significance. Advantages of spatial user interfaces are the amount of space to organize objects and the use of spatial arrangement as a tool for communication. The last subsection gave a review of spatialized user interfaces in the literature. These systems can be classified into text-based and graphical systems. Text-based systems commonly use a house or landscape metaphor. The graphical systems can be classified according to metaphor as desktop, room, and landscape based metaphors.
In Virtual Reality systems the focus is on the creation of a virtual world. Instead of the term "Virtual Reality" this text in most cases uses the term "Virtual Environment". (The field of Virtual Reality has seen much "hype" in recent years. Many people seriously working in this field refrain from using the term "Virtual Reality" and prefer the term "Virtual Environment" to distance themselves from the Cyberspace mania.) There are several types of virtual environments, distinguished by the information channel they make use of and by the level of "presence" they achieve.
Virtual environments can be classified into
In this virtual environment it is impossible to move around at will. Instead the user sits in front of the screen and the environment moves in front of her. Still, users often perceive motion in the environment as if they were moving themselves.
Although these systems often have strong immersive effects they are seen as non-immersive. The reason is that the users very easily can "drop out" of the simulated world simply by looking away from the monitor.
Technically a head-mounted display is a pair of monitors with additional optical elements attached to a helmet such that stereo-pictures can be viewed more or less comfortably. The additional optics ensure that the monitors, which otherwise are too close to the eye to be viewed in focus, appear to be farther away and cover a broader viewing field. Head-mounted displays try to achieve an overall viewing field of at least 90°. A viewing field of this size is no longer perceived as a picture in front of the user but as a surrounding scene. This system therefore easily creates the illusion of being in a virtual world. [KALA93]
The head-mounted display also contains a position sensor. The data of this sensor, which is called a "Polhemus device", allows the computer running the virtual environment software to sense if the user moves around, turns the head or looks up or down. The system then displays the appropriate view in the head-mounted display so that the user gets the impression of moving in an environment that entirely surrounds her.
Mankind is a very optically oriented species and the optical stimuli of the system are strong enough to make people believe they are in this virtual world. This effect is even strong enough to cause nausea when users look down a virtual crevice or walk a plank over a virtual abyss.
A variant of the head-mounted display is the boom-mounted display, which has a stereo-display mounted on a supporting arm. This setup allows the use of heavier and better monitors and more accurate position tracking.
In most immersive or non-immersive systems virtual objects are present in the visual or acoustic sensory channel. As those systems normally do not provide feedback along other sensory channels, it is not possible to "feel" objects, and the user can simply move through an object - there is no wall the user runs into, as the necessary output devices for virtual environments are not fully developed yet.
Realizing spatial user interfaces in immersive virtual environments
It is often stated that (immersive) virtual environments are the ultimate user interfaces since users interact directly with objects of the real or virtual life. In a study comparing subjects with and without previous computer experience it was observed that there was no significant difference in the time to master a simple manipulation task in an immersive virtual environment [FURN93]. This is seen as a hint that virtual environments really could be a way to make the interaction with machines easier.
Realizing a spatial user interface in a virtual environment puts tremendous possibilities into the hands of the designer of the user interface. There suddenly is no restriction to objects lying on a surface as they can hover in space, navigation from a place to another place can be realized easily using magic features and users have no problem flying high over the workspace to get an overview. The graphical virtual environment is at first sight the ideal basis for realizing magic features in any graphical user interface. For instance teleportation -- instantaneous travel from one location to another -- can be realized by simply switching the environment.
However, the representation of the environment influences the user's perception of the environment. A common application for virtual environments is to visualize a house yet to be built. The customer can walk through the house and view it as if it were built already. In such an application the spatial perception of the house in the virtual environment should be identical to that of the real house - otherwise decisions based on the visualization are bound to be incorrect. As [HENR92] found, present virtual environment systems often cause a distorted perception of space.
The user therefore sees the surrounding real environment. Additionally the software can display information in the viewing field. In a library such a system could display a red rectangle at the position in the viewing field where the user sees a certain book on a shelf. This way the system could point out objects in the real environment. Another use of such a system is to display documentation in virtual windows within a real scene.
As was mentioned already virtual environments today are based mainly on visual inputs and 3D sound. The other senses seldom are supported. The augmented reality system has the advantage that those other senses do not need to be supported explicitly since the user is not manipulating virtual but real objects.
In an augmented reality system objects seen in the real world certainly are really there and - for instance - an apple seen through the augmented-reality display feels like an apple, smells like an apple and tastes like an apple.
Augmented reality thus elegantly circumvents the problem of non-visual and non-acoustic feedback. However, there is one drawback: nature's laws are fixed in the real world but not in the virtual world. While the user cannot walk through a wall or simply take off for a flight around the house in the augmented reality system, these features can be easily realized in an immersive or non-immersive environment.
The main difference between the three types of graphical virtual environments presented in this section therefore is the amount of reality muted from the senses of the user. In the immersive environments the system tries to replace all real-world perception by virtual stimuli. In the non-immersive environment it is easy for the user simply to turn away from the monitor and the augmented reality system does not block reality at all but enhances it.
There is an ever growing field of auditory displays that try to create a spatial environment mainly using sound cues. Most systems use the metaphor of a house. For examples of such systems see [LUMB93], [PORT94], [MyEd92], [MYNA92] and [MyWe94].
While the research on acoustic virtual environments focuses on visually impaired users the lessons learnt from these systems may be used one day to make visual virtual environments places that are easier to navigate and more enjoyable to be in. There are strong tendencies in user interface design to communicate with the user using a much broader bandwidth of sensory stimuli. The use of an acoustic virtual environment to enhance user interfaces may be a good start for many navigational tasks [BrWE94] or for monitoring background activities [COHE93]. That such an enhancement makes the use of the system easier can be seen in many game programs [PGST94].