Connections and Symbols. S. Pinker & J. Mehler (Eds.), 1988. Cambridge, MA: MIT Press.
Stevan Harnad
Department of Psychology
University of Southampton
Highfield, Southampton SO17 1BJ
UNITED KINGDOM
This is a paperback reissue of a 1988 special issue of Cognition - dated but still of interest. The book consists of three chapters, each making one major negative point about connectionism. Fodor & Pylyshyn (F&P) argue that connectionist networks (henceforth 'nets') are not good models for cognition because they lack 'systematicity'; Pinker & Prince (P&P) argue that nets are not good substitutes for rule-based models of linguistic ability; and Lachter & Bever (L&B) argue that nets can only model the associative relations between cognitive structures, not the structures themselves.
What is at issue here, and how valid are these objections? Two partly distinct theoretical approaches to cognition are being favoured by these authors, in preference to nets. F&P prefer the 'symbolic theory of cognition', according to which thinking is just the ruleful manipulation of symbols (scratches on paper, holes in a tape, states of flip-flops in a machine) on the basis of their shapes, as in a computer or the idealized Turing Machine. The symbol manipulation is purely formal or syntactic; it is not based on the meaning of the symbols. Yet the symbols and the symbol manipulations can be given a systematic semantic interpretation - for example, they can be interpreted as representing objects or states of affairs. This capacity to sustain a coherent, systematic interpretation, as a whole and in each of the combinations and recombinations of its parts, is the property of 'systematicity'. F&P point out that, unlike minds and Turing machines, nets, which 'represent' things only as dynamic activity and connectivity among units, lack this property of systematicity, thereby making them unsuitable models for cognition.
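To make the property concrete, here is a toy illustration (a sketch of my own, in Python, not F&P's formalism): a miniature symbol system whose strings are generated and checked purely on the basis of the shapes of their parts, yet every well-formed combination and recombination of those parts sustains a coherent, compositional interpretation.

    NAMES = {"john", "mary"}           # atomic symbols
    RELATIONS = {"loves", "fears"}     # atomic symbols

    def well_formed(s):
        # Checked by shape alone: NAME RELATION NAME.
        parts = s.split()
        return (len(parts) == 3 and parts[0] in NAMES
                and parts[1] in RELATIONS and parts[2] in NAMES)

    def interpret(s):
        # A systematic interpretation, compositional over the parts.
        agent, relation, patient = s.split()
        return (relation.upper(), agent, patient)

    # Systematicity: any system that can form and interpret
    # 'john loves mary' can, by the very same formal rules, form and
    # interpret 'mary loves john'.
    for s in ("john loves mary", "mary loves john", "mary fears john"):
        assert well_formed(s)
        print(s, "->", interpret(s))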
The other theoretical approach, advocated in preference to nets by P&P and L&B, is Chomskian grammar, which consists of rules and representations that allow a person to exhibit linguistic (mostly syntactic) competence - among other things, the ability to produce and recognize syntactically correct utterances. Unless nets build in these rules and representations in advance (in which case their net properties are of secondary importance), the argument goes, they cannot provide adequate models for linguistic competence.
How valid are these criticisms? First, it must be pointed out that all three research programs - the symbolic theory of cognition, the connectionist theory of cognition, and the Chomskian theory of language - are incomplete, the first two radically so. Hence to the extent that the authors' arguments are based on what nets have not done, as opposed to what they provably cannot do, they are not very compelling; and unfortunately that extent is almost total. For one can reply to F&P that it is not at all clear that one cannot train nets to become systematic symbol systems. One can reply to P&P that it is not at all clear that nets cannot learn to represent and follow all the learnable rules of syntax (unlearnable rules are another matter, to which we will return). And L&B certainly have not proven that the structures that currently have to be prewired into nets could not instead be learned by them.
One could make these replies, but one could also point out that the existence of these learning capacities has not been demonstrated either-in nets or any other learning mechanism. So a lot of the discussion here is nondemonstrative, and based on bets about what the other side will or will not ultimately be able to do. Is that all there is to these issues then: carping among embryonic rivals? Perhaps, but some of the issues have deeper interest too.
Consider the issue of built-in structure: here there is the question of theft versus honest toil (Harnad, 1976). Whenever a theory simply bestows structure on the mind a priori, it is making an extravagant claim. To make it clear how extravagant, there is a quite natural way of construing Chomsky's theory as the 'Big Bang Theory of the Origin of Grammar': If the 'poverty of the stimulus' argument (to the effect that the data available to the child are too impoverished for grammar to be learned from them; see Chomsky, 1980; Lightfoot, 1989) is correct, then not only can grammar not be learned by the child, but it cannot have been 'learned' by the species through evolution either (though cf. Pinker & Bloom, 1990), leaving no alternative but that it must have been inherent in the 'structure' of the Big Bang and merely unravelled by the causal contingencies of cosmology and history. Let us call such extravagant nativism a kind of 'theft', in contrast to accounts - albeit incomplete ones - that endeavour to come by the same structures (or others that generate the same competence) by 'honest toil', namely, through learning from the available data.
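The underdetermination at issue can be put in miniature (a toy of my own, not a linguistic argument): two different rules can agree on every piece of the learner's data and yet diverge on unseen cases, so the data alone cannot decide between them.

    # The 'available evidence': finite, all positive instances.
    data = ["ab", "aabb", "aaabbb"]

    # Hypothesis 1: strings of the form a^n b^n.
    rule_1 = lambda s: s == "a" * (len(s) // 2) + "b" * (len(s) // 2)
    # Hypothesis 2: any string starting with 'a' and ending with 'b'.
    rule_2 = lambda s: s.startswith("a") and s.endswith("b")

    assert all(rule_1(s) and rule_2(s) for s in data)  # both fit all the data
    print(rule_1("aabbb"), rule_2("aabbb"))            # yet diverge: False True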
If the arguments and evidence for the poverty of the stimulus (in grammar) can be taken to have the force of a proof, then those who toil with nets or any other inductive approach have as much hope of succeeding as those who would trisect the angle with compass and straightedge. However, it is not at all clear that nonlearnability has the status of a deductive proof just yet. Besides, the rules underlying the particular aspect of grammar that P&P are considering - the past tense transformation - are not even underwritten by a poverty-of-the-stimulus argument: they are not unlearnable in principle. And the fact that one particular embryonic net (Rumelhart & McClelland, 1986) did not happen to do a very good job of learning them is certainly no basis for believing that no net could do that kind of thing in principle (see Plunkett & Marchman, 1989, for more recent work of this kind).
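For concreteness, here is a minimal sketch of the kind of learning at issue (my own toy, bearing no resemblance to Rumelhart & McClelland's Wickelfeature model): a one-layer net trained by gradient descent to decide whether a verb stem takes the regular '-ed' past tense, using invented final-letter features and a handful of made-up training items.

    import math

    # Toy training pairs: verb stem -> 1 if it takes regular '-ed', else 0.
    TRAIN = [("walk", 1), ("talk", 1), ("jump", 1), ("play", 1),
             ("sing", 0), ("ring", 0), ("go", 0), ("eat", 0)]

    def features(verb):
        # Crude final-letter features standing in for phonological cues.
        return {verb[-2:]: 1.0, verb[-1]: 1.0}

    weights = {}
    for epoch in range(200):                   # plain gradient descent
        for verb, regular in TRAIN:
            f = features(verb)
            z = sum(weights.get(k, 0.0) * v for k, v in f.items())
            p = 1 / (1 + math.exp(-z))         # predicted P(regular)
            for k, v in f.items():             # logistic-loss update
                weights[k] = weights.get(k, 0.0) + 0.1 * (regular - p) * v

    for verb in ("walk", "sing", "bring"):     # 'bring' was never seen
        z = sum(weights.get(k, 0.0) for k in features(verb))
        print(verb, round(1 / (1 + math.exp(-z)), 2))

The net correctly guesses that the unseen 'bring' patterns with 'sing' and 'ring'; whether this kind of generalization scales up to the full morphology is, of course, exactly what is in dispute.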
And what of systematicity? Here the problem may be deeper, and I, at least, think it resides partly in the miscasting of symbolism and connectionism as rivals for cognitive hegemony rather than as potential collaborators. Let us quickly set aside the two camps' trivial attempts to subsume one another. Symbolists say nets are just an alternative hardware for doing computation. That may be one possibility, but it is certainly not the one connectionists are exploiting when they use nets to learn. Connectionists say nets have the power of Turing machines. Perhaps, but not when they are used as nets rather than as hardware for implementing a Turing machine. Symbolists say most nets are just symbolically simulated nets anyway. True, but they could be implemented as real nets too. Yet this does point out one absurdity of the opposition between symbols and nets, for if symbolically simulated nets and real parallel-distributed nets have exactly the same performance power, it becomes clear that nets are really just a family of computational algorithms - not the quasiverbal ones favored by, say, artificial intelligence, but algorithms nonetheless (mostly for learning) - and that connectionism is really just the empirical and formal research program (still embryonic) of exploring the performance power of those algorithms.
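The point about simulation can be made concrete in a few lines (a sketch of my own, with arbitrary toy weights): whether a net's unit activations are computed simultaneously in parallel hardware or one at a time in a serial loop, the function computed is identical.

    import math

    # Arbitrary toy weights: 2 output units, 3 input units.
    W = [[0.5, -1.0, 2.0],
         [1.5, 0.25, -0.5]]

    def forward(x):
        # A serial, symbol-by-symbol simulation of the net: each unit's
        # activation could just as well be computed simultaneously in
        # parallel hardware; the function computed is the same either way.
        return [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
                for row in W]

    print(forward([1.0, 0.0, 1.0]))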
But what role might such algorithms play in a complete cognitive economy? Sure, symbol systems have the virtue of systematicity: They and all their parts function in a way that supports a coherent semantic interpretation. But as candidates for explaining how minds work, symbol systems have been shown to have a critical shortcoming too (Searle, 1980): The interpretations of their symbols are not intrinsic to the symbol systems themselves. They depend on the interpretations we make with our minds. And this kind of parasitism makes them ineligible as autonomous models of our minds, on pain of infinite regress.
Consider an example. We have a symbol system that contains a symbol for 'chair'. The system has systematicity, so that symbol enters into combinations with other symbols, say, the symbol 'sit', such that 'A chair is something you can sit on' is a well-formed string in the system, interpretable as meaning exactly what it says, and so on. But then does the symbol string 'A chair is something you can sit on', which can be interpreted as meaning that a chair is something you can sit on, really mean that a chair is something you can sit on - mean it in the sense that I mean it when I say it or think it or believe it or know it? In other words, is the symbol string, together with its functional relations to the rest of the symbol system (be they ever so systematic and interpretable), a viable candidate for embodying the thought that a chair is something you can sit on, the way I am thinking it now?
Elsewhere (Harnad, 1989, 1990) I have given reasons why the answer to this question is No: the symbols in a pure symbol system are 'ungrounded', in much the way that the symbols in a Chinese/Chinese dictionary are ungrounded: if you don't know Chinese already, you can keep going from symbol to symbol without ever arriving at a meaning. What is it for symbols to be grounded? In brief, although it may not be a sufficient condition for a symbol string and its causal connections with the rest of the system really to constitute a thought, it is surely a necessary condition that the system be able to pick out the real objects and states of affairs to which that string (and all the others with which it is systematically related) refers. Otherwise the system is merely hanging from a (systematic) skyhook, wrapped up in our projected interpretations.
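The dictionary-go-round can be put in a few lines (my own toy stand-in for the Chinese/Chinese dictionary): every symbol is defined only in terms of other symbols, so lookup never bottoms out in anything but more symbols.

    # Every symbol is 'defined' only by pointing to further symbols.
    dictionary = {"chair": ["seat"], "seat": ["chair", "sit"],
                  "sit": ["rest", "seat"], "rest": ["sit"]}

    symbol, trail = "chair", []
    for _ in range(8):                  # chase definitions for a while
        trail.append(symbol)
        symbol = dictionary[symbol][0]  # always just another symbol
    print(" -> ".join(trail))           # chair -> seat -> chair -> seat ...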
So if symbols must be grounded in the capacity to pick out their referents from their sensory projections, nets are a natural candidate for mediating this capacity. I have proposed that nets might mediate the learning (supervised, with feedback) of the invariants that allow us to sort, categorize and name objects from their sensory and sensorimotor projections. The names of those categories can then serve as grounded elementary symbols in a symbol system in which they are subsequently combined and recombined systematically into higher-order categories, as they are in natural languages. Such a system, however, would be a hybrid nonsymbolic/symbolic one, with the nonsymbolic part (the sensorimotor projections and analogue transformations of them, plus the invariance detectors, mediated by nets) being primary - the 'groundwork' of cognition, so to speak (Harnad, 1987). The system would be hybrid because, unlike in a pure symbol system, where the symbol manipulations are based solely on the arbitrary shapes of the symbols, in the hybrid system additional constraints would arise from the connections between the elementary symbols and the objects they refer to, as mediated by the non-arbitrary shapes of the invariance detectors that pick them out from their sensorimotor projections.
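Here, for concreteness, is a minimal sketch of the proposed hybrid (the data, categories and numbers are all invented for illustration): a nonsymbolic invariance detector grounds elementary names in toy 'sensory projections', and a symbolic layer then combines those grounded names into a higher-order category ('zebra' = 'horse' plus 'stripes').

    # Nonsymbolic groundwork: an invariance detector, here a trivial
    # nearest-centroid classifier over toy 'sensory projection' vectors
    # (in a real system these invariants would be learned, e.g. by a net).
    centroids = {"horse": (1.0, 0.0), "stripes": (0.0, 1.0)}

    def name(projection):
        return min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(centroids[c], projection)))

    # Symbolic layer: a higher-order category defined over grounded names.
    definitions = {"zebra": {"horse", "stripes"}}

    def classify(projections):
        grounded = {name(p) for p in projections}   # names earned by toil
        return [hi for hi, parts in definitions.items() if parts <= grounded]

    print(classify([(0.9, 0.1), (0.1, 0.95)]))      # -> ['zebra']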
In such a hybrid system it would not be a question of whether 'a chair is something you can sit on' is represented symbolically, as a symbol string with functional relations to other symbol strings, or connectionistically, as a dynamic pattern of weights and activations in a parallel distributed system. It would be some of both, grounded in a systematic, bottom-up way. The problems (mostly about unproved capacity) raised by Fodor & Pylyshyn, Pinker & Prince and Lachter & Bever, preoccupied as they are with the competitive question - symbols versus connections - become non-problems in a cooperative approach.
References
Chomsky, N. (1980) Rules and representations. Behavioral and Brain Sciences, 3, 1-61.
Harnad, S. (1976) Induction, evolution and accountability. In S. Harnad, H. Steklis & J. B. Lancaster (Eds.), Origins and Evolution of Language and Speech. Annals of the New York Academy of Sciences, 280.
Harnad, S. (Ed.) (1987) Categorical Perception: The Groundwork of Cognition. New York: Cambridge University Press. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad87.cpreview.html
Harnad, S. (1989) Minds, machines and Searle. Journal of Experimental and Theoretical Artificial Intelligence, 1, 5-25. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad89.searle.html
Harnad, S. (1990) The symbol grounding problem. Physica D, 42, 335-346. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad90.sgproblem.html
Lightfoot, D. (1989) The child's trigger experience: degree-0 learnability. Behavioral and Brain Sciences, 12, 321-375.
Plunkett, K. & Marchman, V. (1989) Pattern association in a back-propagation network: implications for child language acquisition. Technical Report 89-02, Center for Research in Language, University of California, San Diego, La Jolla, CA.
Pinker, S. & Bloom, P. (1990) Natural language and natural selection. Behavioral and Brain Sciences, 13(4), 707-784. http://www.cogsci.soton.ac.uk/bbs/Archive/bbs.pinker.html
Rumelhart, D. E. & McClelland, J. L. (1986) On learning the past tense of English verbs. In J. L. McClelland, D. E. Rumelhart and the PDP Research Group (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 2: Psychological and Biological Models. Cambridge, MA: Bradford/MIT Press.
Searle, J. R. (1980) Minds, brains and programs. Behavioral and Brain Sciences, 3, 417-457. http://www.cogsci.soton.ac.uk/bbs/Archive/bbs.searle2.html