Cybernetic and Connectionist Conceptions of the Mind

Comments on Barry Smith, "The Connectionist Mind: A Study of Hayekian Psychology"

Peter Cariani

Electronic-symposium, archive: http://maelstrom.stjohns.edu/CGI/wa.exe?A1=ind9711&L=hayek-l

1. Overview

I should say at the outset that I found Barry Smith’s paper quite refreshing and enjoyable, and I find myself agreeing with large parts of it. Hayek clearly sees the brain as a general correlational mechanism for coordinating action contingent upon behavior. I agree that the account that Hayek presents in The Sensory Order is largely consistent with contemporary connectionism, but I’m not sure what it adds beyond what Troland and Boring had said two decades earlier (Troland, 1929; Boring, 1933; Boring, 1942).

I don’t think connectionism is the last word, however. There are alternative ways that the brain might work that are equally compatible with the general idea of the brain as a correlation machine. Connectionism in most respects is a big improvement over symbolic AI because it reintroduces a certain dynamism into models of mental processes that was lacking in purely logic-based approaches. This dynamism is due both to the reinstatement of adaptivity and self-stabilization as a central aspect of the functioning of neural networks, and to the reconnection of these networks to the external world. At the very least, the connectionist resurgence has opened up the range of models for the brain and for mental phenomena that can be seriously discussed. At the same time, there are a number of very basic assumptions about the nature of neural codes that have been tacitly made by connectionist models (and by Hayek) that are open to question.

2. Symbolic artificial intelligence (AI), connectionism, and cybernetics

One should always be wary of dichotomies, because they invariably give the illusion that their categories are exhaustive. So it is with theories of the brain. I think if we are to understand Hayek’s position on these matters, it’s useful to review the kinds of ideas that were around before the current paradigms.

Most contemporary accounts of theoretical neuroscience and cognitive science focus on symbolic AI and connectionism, because these are the strands that dominate the scene today. This reflects the predominance of symbolic AI from roughly 1965 to 1985 and the resurgence of neural network approaches from 1985 to the present. However, if we look back before symbolic AI attained hegemony over research in the cognitive and information sciences, there was a greater diversity of thought about the brain. A number of other intellectual strands co-existed with the purely logical, symbolic approaches and the neural networks that were the precursors of symbolic AI and connectionism. These included hierarchical and homeostatic feedback control systems (Cannon, Weiss, Bertalanffy, Wiener, Ashby (Ashby, 1956; Ashby, 1960), and later McFarland and Powers (Powers, 1973)), recurrent neural networks (Hebb, McCulloch, and Pitts (McCulloch, 1951; Pitts and McCulloch, 1965)), time-delay neural networks (Licklider, 1951), adaptive timing nets (MacKay, 1962), self-organizing systems (Pask, 1961; von Foerster, 1984), systems of coupled oscillators (Greene, 1962), relational networks of biological and neural processes (Rashevsky, 1948), notions of mass action (Beurle, 1956; later Blum and Walter Freeman) and statistical orders (later, Roy John (John, 1967; John, 1972; Thatcher and John, 1977)), Gestaltist field theories (Wolfgang Kohler), and autonomous robots (Grey Walter).

Many of these ideas came under the rubric of "Cybernetics," in which analog, circular-causal processes formed the building blocks for adaptive, interactionist machine-environment dynamics. Others came under "theoretical biology." Hayek implicitly refers to many of them in Chapter IV of his book, and in the context of his discussion of the ‘feed-back principle’ (p. 95, section 4.54). The general approach of the time was distinctly biological in flavor, emphasizing the constant interactions of organisms and machines with their environments, and the ongoing (re)production of functional relations of organisms and brains by themselves. Broadly speaking, the underlying philosophy of cybernetics was Aristotelian: very much concerned with how meanings and purposes can come to be embedded in material systems.

Contrasted with the cybernetic approaches were logic-based approaches to reasoning that were part of a larger wave of Platonism that gained momentum in the 1930’s. Over the course of this evolution, distinctions between analytical truths (logical or conventional necessity) and empirical truths (contingent upon observation) were abolished, such that "observation terms" and notions of semantics external to notational systems were replaced by internal, logical semantics. (One sees this in Rudolf Carnap’s trajectory.) In this way, "semantic" relations between the symbols in the notational systems and the material world became syntacticized. One of the consequences of this turn was to isolate the mind and questions of intelligence from the outside world, relegating the processes of perceiving and acting to secondary, tertiary, and even inconsequential status. In symbolic AI it was held that the really important problems were those of symbolic reasoning, and that the sensors and effectors of devices were peripheral to the central questions of mind. The distinction between analog and digital processes was argued to be inessential, on the grounds that, in principle, any analog process could be simulated in a digital computer. Even in arenas like machine vision, the distal stimulus was quickly pixelized and reduced to a set of features that could be logically manipulated. In theories of science, the earlier conceptions of the scientific model that made clear distinctions between operations of measurement and those of formal computation (e.g. von Helmholtz, Hertz’s commutation diagram, Mach, Bridgman, Bohr) eventually gave way to accounts of science that claimed that no observables are even needed (e.g. Churchland, 1985).

Partly because of the rising technological and economic power of the digital computer as the pre-eminent information-processing technology of the day, and partly because of the (some would say arrogant) attitudes of its advocates, the ascendancy of symbolic approaches to dominant positions within the research community in the mid-1960’s coincided with abrupt reductions of funding to the alternatives (see the paper by Dreyfus and Dreyfus that Barry Smith cites). Undoubtedly, the Vietnam War also played a role. Slowly, some of these paradigms are re-emerging into the general discourse, but after several decades in the funding desert, many of the ideas and the lessons learned have been completely forgotten. In many cases ideas were developed, such as von Foerster’s notion of mental states as eigenstates of a dynamical system, that have yet to be rehabilitated. A few other paradigms that were not based on symbol manipulation, such as the Gestaltist and Gibsonian traditions, managed to survive on the margins of psychology by finding supportive niches.

3. Connectionism

In the mid-1980’s, with the development of massively parallel machines, the failure of expert systems approaches, and the rediscovery of back-propagation, there came renewed interest and funding support for adaptive approaches to problem-solving. These included neural networks and a host of other kinds of trainable machines. In many ways the resulting connectionist paradigm is qualitatively different from symbolic AI, in that the structure of the system arises through learning and experience rather than through direct programming. Associationist accounts for behavior are now again permitted alongside pre-programmed symbolic faculties. Glasnost is a good thing.

On the other hand, the basic connectionist representations consist of discrete, pixelized features and the processes are still digital computations, albeit parallel microscopic ones. While we now better recognize the severe limitations of direct programming (and of command economies), we are still casting our models of the brain in terms of discrete processes that digital computers can most easily carry out. Thus far, there has been no similar large-scale revival of analog-based architectures (Carver Mead’s excellent work notwithstanding), those that would be needed to drive the kinds of operations that Gestaltists and Gibsonians envision.

On one level, the debate between the proponents of symbolic AI and connectionism really boils down to an internal dispute within the broader computationalist program over which kinds of computational architectures are better. Barry Smith highlights the transparent 1:1 relationship between symbolic categories and symbols in symbolic AI systems and, in contrast, the complicated, opaque structure of connectionist networks. This is indeed a problem, because this complicated, opaque structure makes the informational equivalence classes themselves hard to access. One runs into this problem in practice when deciding what methods to use to detect order in messy data (e.g. neural spike train data). Does one use a straightforward analytical method, whose operations are simple enough to understand and whose results are straightforwardly interpreted, or does one use a neural network that may exploit hidden order in the data, but that leaves its user largely without any real grasp of the nature of that order? On the other hand, I think probably too much is made of this distinction. After all, a connectionist net is a completely formal, mathematical object, and while its behavior might not be as readily understood as that of its symbolic counterpart, it can in principle be exhaustively described, with its equivalence classes explicitly mapped out.

I’m similarly dubious that the "knowing that"/ "knowing how" distinction that Barry Smith mentions separates the two approaches. In the computational context, this is the distinction between storing an explicit representation in memory or using a procedure to recompute it (cf. section 3.50 in Hayek’s book). Both the weight matrices of connectionist nets and the inferential procedures of symbolic AI programs can span this memory-processor-based continuum.

4. Connecting minds and machines to the world

The main advantage of an adaptive system over a nonadaptive one is that the designer no longer needs to foresee all of the particular interconnections and parameter settings that one needs to solve a problem. The programmer explicitly designs a nonadaptive logic-based AI program. For a trainable machine or neural network, however, the programmer specifies a set of parameters and a learning rule. Provided that the learning rule is judiciously chosen, the program adjusts its decision rule so as to improve it with experience. Thus, the (syntactic) rules that govern the operation of the machine are fixed in the first case, but are adaptively specified in the second.

In both of these cases, however, if the machines are to solve real world problems, like classifying cars on the freeway, the human programmer-designer must also specify which aspects of the external world are to be mapped onto the symbols in the machine, i.e. s/he must specify the external semantics of the system. Among all of the possible properties of vehicles (e.g. color, weight, size, number of passengers, number of wheels, composition of the exhaust, speed, etc.), some properties adequate to the task must be specified.
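The contrast between a fixed decision rule and an adaptively specified one can be made concrete with a minimal sketch in Python. Everything in it is an assumption made for illustration: the two vehicle features (length in metres, number of axles), the toy data, and the use of the classic perceptron rule as the learning rule. In both functions the designer has already decided which properties of the world are measured; only in the second are the decision weights set by experience.

```python
# Illustrative sketch only. Features, data, and the perceptron rule are
# hypothetical choices, not taken from any system discussed in the text.

# Designer-chosen external semantics: (length in metres, number of axles).
SAMPLES = [(4.0, 2), (4.5, 2), (5.0, 2), (8.0, 3), (12.0, 5), (16.0, 5)]
LABELS = [0, 0, 0, 1, 1, 1]          # 0 = car, 1 = truck


def fixed_rule(x):
    """Nonadaptive: the programmer writes the decision rule by hand."""
    length_m, _axles = x
    return 1 if length_m > 6.0 else 0


def predict(w, b, x):
    """Threshold unit: weighted sum of the inputs compared with zero."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, labels, epochs=10000, lr=1.0):
    """Adaptive: the programmer supplies parameters and a learning rule;
    the weights themselves are adjusted from labelled experience."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            err = y - predict(w, b, x)      # -1, 0 or +1
            if err != 0:
                mistakes += 1
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
        if mistakes == 0:                   # every training case is correct
            break
    return w, b


w, b = train_perceptron(SAMPLES, LABELS)
print("learned weights:", w, "bias:", b)
print("training cases reproduced:",
      [predict(w, b, x) for x in SAMPLES] == LABELS)
```

Note what the learning rule cannot do: it never proposes a third feature such as exhaust composition. The set of measured properties stays exactly as the designer fixed it.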

A major problem for both symbolic AI and connectionist systems is how one gets (new) semantic relations into the system. Symbolic AI systems suffer from their "closed world" assumption that all relevant distinctions are already encoded by the programmer into the notational system. Connectionist systems push this back several levels, by making the encodings more primitive sensory or motor distinctions. But both kinds of systems assume that all of the primitives of the system are given beforehand, that ideas are simply logical (re)combinations of pre-existing sensory features, motor actions, and cognitive representations. Virtually all of our cognitive models are founded on localist models of perception whose outputs are "perceptual atoms" or "sensory pixels."

Alternately, it’s possible to envision devices that adaptively construct their own interfaces with the external world. Each of us has an immune system that generates many possible antibodies and differentially replicates those that are effective in recognizing foreign entities. Antibodies are adaptively constructed molecular sensors. Similarly, artificial devices have been built that adaptively construct new sensors, thereby constructing their own perceptual primitives (Pask, 1958; Pask, 1959; Pask, 1960; Pask, 1961; Cariani, 1993). Pask’s demonstration electrochemical device incorporated some economic principles in its operation, allocating material resources to build structure on the basis of performance. In terms of the brain, we do not conceive of new sensory organs per se being constructed anew. But there are ways of envisioning analog neural representations of sensory information that contain much of the richness of the structure of the distal stimulus. One thinks of the formation of neural assemblies as "internal sensors", to detect those aspects of the external environment that have been retained in the internal analog representation. In effect, the brain brings the environment within and then constructs various kinds of sensors in order to detect particular invariances. Thus it is possible to conceive of systems that not only adapt within pre-existing categories, but which form new perceptual categories de novo. This can be seen in terms of the creation of new feature primitives (Pask, 1959; Pask, 1961; Cariani, 1992; Schyns, Goldstone and Thibaut, in press), or in perceptual learning (Gibson, 1991).

In any case AI, whether symbolic or connectionist, desperately needs to solve the problem of how to give a machine semantic as well as syntactic autonomy, i.e. the ability to independently interact with the world and to form its own external semantic linkages with it. I think we will solve Dreyfus’ "frame problem" (Dreyfus, 1979; Dreyfus and Dreyfus, 1987) only when we get out of our simulated, "toy" environments and into messy, ill-defined, nitty-gritty real world environments where categories are not pre-established and where the ability to formulate new ones is paramount. The problem itself is not intractable once the various layers of Platonic hubris can be shaken off.

5. Neural codes -- what kind of correlation machine is the brain?

For the most part, Hayek avoids specific hypotheses about how information is represented in the brain. I think this issue may have been largely beside the core point of the book, viz. the concept of humans and animals as self-organizing hedonic correlators that are not unlike market systems (or generally how coherent, spontaneous orders are possible in nature). He simply assumed that sensory information is represented by means of perceptual atoms that are then associated in some way that is contingent upon experience. I agree with Hayek’s conception of the brain as an extremely plastic general correlator of sensory inputs, with his relational psychology, and with his assumption (following the Gestaltists) of psychoneural isomorphism (i.e. that every sensory distinction that we perceive has a corresponding distinction in the organization of neural firing patterns). The main point of divergence between us, I think, would be regarding the basic neural codes and computations of the central nervous system. Secondary disagreements would be around his ideas concerning the functions of consciousness (his idea that it has some particular adaptive function vs. being an aspect of the coherent functional organization of a material substrate, as in a hylomorphic view, (Modrak, 1987)).

Hayek adopted the mainstream account of neural coding of the time (e.g. Adrian, 1928). Neurally speaking, this assumption comes down from notions of signalling via labelled lines or "rate-place" codes (the kind of information conveyed in a neural message is given by the identity of the neuron from which it was sent, while specific alternative messages are conveyed by the rate of spike arrivals in the spike train message). Which neurons fire how much is all that is supposed to matter. Here it is assumed that the time structure of spike trains carries no information about the stimulus.

However, if one actually looks at the discharge patterns of sensory neurons in many different modalities, one sees that there is a great deal of time structure that is related to the stimulus (for reviews see Perkell and Bullock, 1968; Uttal, 1973; Wasserman, 1992; Cariani, 1995; Rieke et al., 1997). Such time patterns were seen early on in the auditory system (Wever and Bray, 1937; Wever, 1949), and Boring (1930, 1942) and Troland (1929) discussed various coding possibilities. (Troland actually had a temporal modulation theory of the coding of color, so as to explain the colors created by the black-and-white Benham Top as it spins.) One sees these time patterns in places where the stimulus itself has time structure: in the auditory system, the parts of the somatosensory system that register vibrations, and the electroceptive systems of weakly-electric fish. One also sees stimulus-related time patterns in places like color vision and the chemical senses, where the stimulus itself does not impress its own time structure, but one is created by the action of early lateral inhibitory systems.

The labelled line or "place-coding" assumption has many consequences for how neural networks are conceived. Effectively it means that the signals conveyed are scalar quantities -- one signal per neuron -- and that distinctions are made by different spatial patterns of excitation. These are analyzed by adjusting the weights on respective inputs. It means that maintaining precise point-to-point connectivity is exceedingly important to maintain the functional coherence of the network. Connectionist networks look like telegraph networks because of their basic assumptions about neural codes.

On the other hand, if information is encoded in the time patterns of spikes, then different time patterns can be conveyed simultaneously in the same spike train, i.e. information can be multiplexed, and one is no longer tied down to particular point-to-point pathways. This makes broadcast models of functional integration possible. Temporal pattern codes also permit multidimensional representations in one spike train, so that different sensory qualities can potentially have their own characteristic patterns (like having different frequencies in a radio network). This means ultimately that different kinds of information can be kept apart or combined at will, and that elaborate spatial organizations for sorting out different attributes (like the pitch, timbre, loudness, location, and duration of a sound) are no longer needed. The higher dimensional space of coded attributes then begins to look more like a symbolic notation with different type-categories, each of which adds a new dimension to a notation system.
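As a toy illustration of multiplexing, the sketch below merges two periodic spike patterns into a single train and then reads both periods back out of an all-order interval histogram, whereas the mean firing rate reduces the same train to one number. The 5 ms and 7 ms periods, the 350 ms duration, and the function names are hypothetical choices, and real spike trains are noisy; this is a sketch of the principle, not a model of any particular neuron.

```python
from collections import Counter


def periodic_train(period_ms, duration_ms):
    """Spike times (ms) of an idealized, jitter-free periodic pattern."""
    return set(range(0, duration_ms, period_ms))


def mean_rate(spikes, duration_ms):
    """Rate code: a single scalar per neuron, in spikes per second."""
    return 1000.0 * len(spikes) / duration_ms


def all_order_intervals(spikes, max_lag_ms):
    """Temporal code: count the intervals between all pairs of spikes,
    not only between neighbouring ones, up to max_lag_ms."""
    times = sorted(spikes)
    hist = Counter()
    for i, t1 in enumerate(times):
        for t2 in times[i + 1:]:
            lag = t2 - t1
            if lag > max_lag_ms:
                break               # times are sorted, later lags are larger
            hist[lag] += 1
    return hist


DURATION_MS = 350
# Two hypothetical "messages" sharing one line: a 5 ms and a 7 ms pattern.
multiplexed = periodic_train(5, DURATION_MS) | periodic_train(7, DURATION_MS)

rate = mean_rate(multiplexed, DURATION_MS)
hist = all_order_intervals(multiplexed, max_lag_ms=9)
dominant = sorted(lag for lag, _count in hist.most_common(2))

print(f"mean rate: {rate:.0f} spikes/s")
print("two most frequent intervals (ms):", dominant)
```

In this idealized case the two most frequent intervals are the two original periods, so a receiver tuned to either interval could pick out its own pattern from the shared train.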

It may also be the case that the structure of perceptual spaces is a direct consequence of the neural codes that are involved. In audition, for example, we readily hear the octave and other harmonic relations that make up melodies and harmonies. Rather than being the end-product of a complex associative process, such perceptual relations may be the direct consequence of the neural codes that the auditory system uses to represent sounds (i.e. those based on time intervals between spikes). This is not to say that associative processes do not exist, but to say that the neural codes may provide structure that allows us to more readily detect the invariances in the world around us (our powers of correlation may be aided by the analysis of temporal correlations in our neural representations). Rather than breaking up the auditory stimulus into perceptual atoms and reconstructing it from them, interval codes exhibit ‘emergent’ properties when they are combined (e.g. the periodicity corresponding to the missing fundamental). Connectionist systems miss all this. One needs time-delay neural networks or timing nets to do the analysis.
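The missing fundamental can be illustrated numerically. The sketch below builds a tone from the 3rd, 4th and 5th harmonics of 200 Hz, confirms that it contains no component at 200 Hz, and then finds the delay at which the waveform best matches a delayed copy of itself. Using the autocorrelation of the waveform as a stand-in for a distribution of interspike intervals is a simplifying assumption, as are the sampling rate, the choice of harmonics, and the search range.

```python
import math

FS = 20000               # sampling rate in Hz (assumed)
F0 = 200                 # fundamental in Hz, absent from the stimulus
HARMONICS = (3, 4, 5)    # only 600, 800 and 1000 Hz are presented
WINDOW = 800             # samples per sum: 8 whole periods of F0


def tone(n_samples):
    """Sum of equal-amplitude harmonics of F0, without F0 itself."""
    return [sum(math.cos(2 * math.pi * k * F0 * n / FS) for k in HARMONICS)
            for n in range(n_samples)]


def magnitude_at(x, freq_hz):
    """Size of the spectral component at freq_hz over the analysis window."""
    re = sum(x[n] * math.cos(2 * math.pi * freq_hz * n / FS)
             for n in range(WINDOW))
    im = sum(x[n] * math.sin(2 * math.pi * freq_hz * n / FS)
             for n in range(WINDOW))
    return math.hypot(re, im) / WINDOW


def autocorrelation(x, lag):
    """Correlate the signal with a copy of itself delayed by `lag` samples."""
    return sum(x[n] * x[n + lag] for n in range(WINDOW))


def dominant_interval_ms(x, min_lag, max_lag):
    """Delay (ms), within the searched range, with the largest autocorrelation."""
    best = max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(x, lag))
    return 1000.0 * best / FS


MIN_LAG, MAX_LAG = 20, 160           # search delays from 1 ms to 8 ms
signal = tone(WINDOW + MAX_LAG)

print("component at F0:  ", round(magnitude_at(signal, F0), 6))
print("component at 3*F0:", round(magnitude_at(signal, 3 * F0), 6))
print("dominant interval (ms):",
      dominant_interval_ms(signal, MIN_LAG, MAX_LAG))
```

The dominant interval is 5 ms, the period of the absent 200 Hz fundamental. No association between separately detected harmonics is needed to obtain it; the common periodicity is already present in the time structure of the combined signal.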

The upshot of all of this is that connectionist neural net models may not represent "the end of history" for the development of neural networks. The neural networks that have been developed represent only a small portion of the possible ways in which the brain might be organized. Ultimately I think we will need to incorporate the insights of the Gestaltists, Gibsonians, cyberneticists, and others into our general neural network paradigms.

6. Minds and markets

One of the very biological aspects of markets is that they allocate material resources for the growth of the physical substrates that produce for them. In the brain, impulse traffic similarly regulates neural growth, such that neurons receive resources for their metabolic upkeep and extension. (See Purves, Body & Brain: A Trophic Theory of Neural Connections, Harvard U. Press, 1988.) This is fundamentally different from the kinds of informational feedback relations that obtain in a computer-simulated neural network. Whereas money is the common currency of the market, impulse traffic/excitability resources are the common currencies of the brain. And yes, both systems depend strongly on informational and material transactions that are constantly valuated... More could be said about the analogy between minds and markets, but we'll have to explore these issues later.

References

Adrian, E.D. 1928. The Basis of Sensation. London: Christophers.

Ashby, W. Ross. 1956. An Introduction to Cybernetics. London: Chapman and Hall.

Ashby, W. Ross. 1960. Design for a Brain. London: Chapman and Hall.

Beurle, R. L. 1956. Properties of a mass of cells capable of regenerating pulses. Phil. Trans. Roy. Soc. London B240 : 55-94.

Boring, Edwin G. 1933. The Physical Dimensions of Consciousness. New York: Dover.

Boring, Edwin G. 1942. Sensation and Perception in the History of Experimental Psychology. New York: Appleton-Century-Crofts.

Cariani, Peter. 1992. Some epistemological implications of devices which construct their own sensors and effectors. In Towards a Practice of Autonomous Systems. Edited by F. Varela and P. Bourgine. 484-493. Cambridge, MA: MIT Press.

Cariani, Peter. 1993. To evolve an ear: epistemological implications of Gordon Pask's electrochemical devices. Systems Research 10 (3) : 19-33.

Cariani, P. 1995. As if time really mattered: temporal strategies for neural coding of sensory information. Communication and Cognition - Artificial Intelligence (CC-AI) 12 (1-2) : 161-229.

Cariani, P. 1997. Neural representation of pitch through autocorrelation. Proc., Audio Engineering Society Meeting (AES), New York, September, 1997.

Cariani, Peter A., and Bertrand Delgutte. 1996. Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. II. Pitch shift, pitch ambiguity, phase-invariance, pitch circularity, and the dominance region for pitch. J. Neurophysiology 76 (3) : 1698-1734.

Churchland, Paul M. 1985. The ontological status of observables. In Images of Science. Edited by P. M. Churchland and C. A. Hooker. Chicago: University of Chicago.

Dreyfus, Hubert L. 1979. What Computers Can't Do. New York: Harper & Row.

Dreyfus, Hubert L., and Stuart E. Dreyfus. 1987. How to stop worrying about the frame problem even though it's computationally insoluble. In The Robot's Dilemma: The Frame Problem in Artificial Intelligence. Edited by Z. Pylyshyn. 95-112. Norwood, NJ: Ablex.

von Foerster, Heinz. 1984. Observing Systems. Seaside, CA: Intersystems Press.

Gibson, Eleanor J. 1991. An Odyssey in Learning and Perception. Cambridge: MIT Press.

Greene, Peter H. 1962. On looking for neural networks and "cell assemblies" that underlie behavior. I. Mathematical model. II. Neural realization of a mathematical model. Bull. Math. Biophys. 24 : 247-275, 395-411.

Grossberg, Stephen. 1988. The Adaptive Brain, Vols I & II. New York: Elsevier.

John, E. R. 1967. Mechanisms of Memory. New York: Wiley.

John, E. R. 1972. Switchboard vs. statistical theories of learning and memory. Science 177 : 850-864.

Licklider, J.C.R. 1951. A duplex theory of pitch perception. Experientia VII (4) : 128-134.

MacKay, D.M. 1962. Self-organization in the time domain. In Self-Organizing Systems 1962. Edited by M. C. Yovits, G. T. Jacobi and G. D. Goldstein. 37-48. Washington, D.C.: Spartan Books.

McCulloch, W.S. 1951. Why the Mind is in the Head. In Cerebral Mechanisms of Behavior (the Hixon Symposium). Edited by L. A. Jeffress. 42-111. New York: Wiley.

Modrak, Deborah K. 1987. Aristotle: The Power of Perception. Chicago: University of Chicago.

Pask, Gordon. 1958. The growth process inside the cybernetic machine. Namur, Belgium.

Pask, Gordon. 1959. Physical analogues to the growth of a concept. In Mechanization of Thought Processes, Vol II. 765-794. London: H.M.S.O.

Pask, Gordon. 1960. The natural history of networks. In Self-Organizing Systems. Edited by M. C. Yovits and S. Cameron. 232-263. New York: Pergamon Press.

Pask, Gordon. 1961. An Approach to Cybernetics. Science Today Series. New York: Harper & Brothers.

Perkell, D.H, and T.H. Bullock. 1968. Neural Coding. Neurosciences Research Program Bulletin 6 (3) : 221-348.

Pitts, Walter, and Warren S. McCulloch. 1965. How we know universals: the perception of auditory and visual forms (1947). In Embodiments of Mind. Edited by W. S. McCulloch. 46-66. Cambridge: MIT Press.

Powers, William. 1973. Behavior: The Control of Perception. New York: Aldine.

Rashevsky, Nicholas. 1948. Mathematical Biophysics. Chicago: University of Chicago Press.

Rieke, Fred, David Warland, Rob de Ruyter van Steveninck, and William Bialek. 1997. Spikes: Exploring the Neural Code. Cambridge: MIT Press.

Schyns, Phillippe G., Robert L. Goldstone, and Jean-Pierre Thibaut. in press. The development of features in object concepts. Behavioral and Brain Sciences.

Thatcher, Robert W., and E. Roy John. 1977. Functional Neuroscience, Vol. I. Foundations of Cognitive Processes. Hillsdale, NJ: Lawrence Erlbaum.

Troland, Leonard T. 1929. The Principles of Psychophysiology, Vols I-III. New York: D. Van Nostrand.

Uttal, W.R. 1973. The Psychobiology of Sensory Coding. New York: Harper and Row.

Uttal, W.R. 1988. On Seeing Forms. Hillsdale, NJ: Lawrence Erlbaum.

Wasserman, Gerald S. 1992. Isomorphism, task dependence, and the multiple meaning theory of neural coding. Biol. Signals 1 : 117-142.

Wever, Ernest Glen. 1949. Theory of Hearing. New York: Wiley.

Wever, Ernest Glen, and Charles W. Bray. 1937. The perception of low tones and the resonance-volley theory. J. Psychol. 3 : 101-114.