From International Studies in Philosophy of Science, Vol. 5, pp. 1-15 (1991)
Reproduced here by kind permission of the Editor
The word representation, as it has come to be used in cognitive science, allows for a puzzling range of opinions about when it is applicable. Some AI workers are content to assert blandly that thermostats have beliefs, albeit very simple and primitive ones, and so are capable of embodying representations. At the opposite extreme, John Searle has claimed that no computer, however sophisticated, will ever really be capable of genuine intentional representation, even if it can pass the Turing test. But the rejection of the Turing test suggests the further question: if representation can't ever get into a computer, how does it get into the brain?
In this paper, my central aim is to find a principled criterion, along lines that make biological sense, for deciding just when it becomes theoretically plausible to ascribe to some process or state a representational role. Some representations intuitively involve a "mind to world" direction of fit: they aim at knowledge or true belief. Other representations involve a "world to mind" direction of fit.1 These are typically rules or instructions that the system is supposed to follow in order to reach a usefully different state. Although the problem of what counts as genuine representation arises for both types, my focus in this paper will be on representations in the context of a mind to world direction of fit (though in that phrase `world' must be taken to include mind). I shall be particularly concerned with how representation is to be understood in relation to a certain conception of connectionist architecture, namely that recently defended by Paul Smolensky (1988). At the end of the paper, I shall offer some speculations about the consequences that accepting such a connectionist architecture might have for the level of understanding we might legitimately hope for in cognitive science.
Seeing, Cooking, Riding
A first approach to the problem I have in mind can begin with David Marr's beautiful work on vision. This work is based on the assumption that vision is a complex computational task, in which information at each of several levels is processed in such a way as to yield a differently organized level of information. At the first level, for example, the array of light intensities on adjoining pixels is first blurred, to a larger or smaller extent, and the blurred image is then massaged to detect edges. The process of edge detection essentially involves taking the second derivative of brightness along the two dimensions of the projected image. The first derivative measures the gradient of intensity, while the second identifies zero crossings, that is, the loci of points at which the second derivative changes sign and the gradient of intensity is at its steepest (Marr, 1982).
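For concreteness, here is a minimal sketch of that pipeline -- Gaussian blurring followed by a second spatial derivative, with edges read off as zero crossings. It is my own illustration, not Marr's code: it assumes Python with NumPy and SciPy, and the image is simply any two-dimensional array of grey-level intensities.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def zero_crossings(image, sigma=2.0):
    """Blur the grey-level image, take its second derivative (a Laplacian),
    and mark the pixels where that second derivative changes sign."""
    blurred = gaussian_filter(image.astype(float), sigma=sigma)  # blur to a larger or smaller extent
    second_deriv = laplace(blurred)                              # second derivative along both dimensions
    signs = np.sign(second_deriv)
    edges = np.zeros(image.shape, dtype=bool)
    edges[:, :-1] |= signs[:, :-1] * signs[:, 1:] < 0            # sign change between horizontal neighbours
    edges[:-1, :] |= signs[:-1, :] * signs[1:, :] < 0            # sign change between vertical neighbours
    return edges
```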
All this is easy to program, provided we know how to feed into the machine both the relevant formulae and the data relating to the grey level image. But what is the relation between what is being thus modeled on the computer and the phenomenon it models? The question arises, in particular, of the sense in which we must suppose that the eye -- and the computer in which it is modeled -- is equipped with the formula that we have programmed into the computer. Must we suppose that the eye knows calculus?2
There are two paradigms at opposite poles. The interesting cases lie somewhere in between. At one end, think of a falling stone; at the other, think of a person cooking, who is explicitly appealing to a recipe that has been memorized. The first illustrates that merely conforming to a physical law is not knowing it: if it were, then stones would have to know calculus too, in order to be so flawlessly computing their velocity, then executing it at every instant of free fall. On the other hand, the ability to formulate a rule, together with the ability to give a justification for it and to apply it at will, is uncontroversially sufficient for an attribution of knowledge, intentionality, and representation.
Here are some problematic intermediate cases:
We obviously don't want to say, simply, yes: that we have knowledge of the laws of kinematics, or that the soap bubble knows calculus, or the eye has knowledge of calculus. To say that would seem almost a kind of joke, in the same way as we might remark on someone's impeccable mastery of the laws of gravity when he (involuntarily) falls over.
But why not? Involuntarily falling over is not behavior, and it seems intuitively a requirement on the attribution of knowledge on the basis of behavior that it really be an instance of behavior which the knowledge is supposed to explain. That common-sense observation, as we shall see, lies close to the right answer to my central question.
When we compare the computer's "knowledge" of explicit rules with that of the cook, the suspicion may arise that the two are no more than homonymously related. I shall shortly ask what exactly is involved in saying that a machine is applying an explicit rule. But first, a preliminary question must be addressed about the relation between computation and representation.
Computation and Representation
The theory of mind inspired by the classical model of AI is frequently referred to as computationalist. The force of this term is this: mental activity is conceived on the model of a sequence of rule-governed manipulations of certain units of fixed types, according to certain syntactic rules. The power of a universal Turing machine to compute anything computable, and to imitate any other machine capable of carrying out an effective procedure, lends promise to this thesis, serving as a guarantor of the power of computationalist models. But such power would be empty unless the possibility existed of interpreting the syntactic counters in question -- unless, that is, they could be viewed not merely as syntactic counters but also as representations. Should we then say that representation and computation are equivalent conditions?
An objection to this might be that while syntactic manipulation -- pure computation -- would indeed be pointless without representation, the latter might exist without syntax. Is not a photograph, for example, a kind of representation without syntax? Perhaps analog representations in general don't really have a syntax. For that matter, Douglas Hofstadter has suggested that thinking itself (which presumably involves representation) occurs at a level completely different from computation:
What Hofstadter means, I surmise, is that real thought actually doesn't consist in the strictly formal manipulation of symbols. This is an issue I shall come to presently. In any case, I shall take for granted in the following pages that while all computation would have to involve representation, the converse may not hold.
Let us look, then, at some putative requirements for the existence of representation.
Explicitness: A manual in the head?
First, does every rule that is represented have to be explicit? The distinction drawn a couple of pages ago between those systems that involve explicit rule following and those that don't can apply within a system as well as between systems. In fact, as Lewis Carroll showed in his fable of Achilles and the Tortoise (Carroll, n.d.), it is impossible for a system to convert every rule to which it conforms into an explicit premise.
Achilles, you will recall, presents the Tortoise with an instance of Modus Ponens. The Tortoise professes to accept the premises,
In every computational engine, that fact must of necessity be taken into account: the actual execution of any program rests on what Pylyshyn terms a functional architecture, that is, a repertoire of "basic computational resources" (Pylyshyn, 1985, p. 259), components of the semantic engine that just work in a certain way not because they have been programmed to do so but merely because of the laws of nature to which they are subject.
Robert Cummins has clarified this distinction in the following terms. We need to distinguish, he suggests, between an analysis of a process and a cause of the process's actually occurring. The analysis of the process will describe what occurs; but the description won't necessarily have any causal power on its own. By contrast, the instantiation of a process (such as the soap bubble computer instantiating a solution to a certain least path problem) needn't contain any analytical description of that process. Now call the mere instantiation by a system S of a process P, "E-representation."
On the other hand, that intuition doesn't really take us very far towards a criterion of demarcation. For some E-representations are worse candidates than others. Consider this example from Hofstadter: "When a computer's operating system begins thrashing... at around 35 users, do you go find the systems programmer and say, 'Hey, go raise the thrashing-number in memory from 35 to 60, okay?' No, you don't." (Hofstadter, 1985, p. 642) Now the "thrashing number" is represented in the machine in some sense: it is "embodied," we might say, in the hardware, and could be changed by modifying the configuration, adding memory, or whatnot. That is enough for E-representation; but surely any claim to genuine representation in that case is worse than that of the soap bubble computer. Some other cases of E-representation, by contrast, might be genuine representation after all. We need to look for further relevant conditions.
Besides explicitness, which I have just rejected, three other conditions might plausibly be imposed on what is to count as genuine representation: consciousness, syntactic and semantic complexity, and digitality.
Consciousness
Few people now take consciousness seriously as a condition of genuine intentionality. But many have -- such as Descartes -- and some -- such as John Searle (1980) -- still do. Much of what is most characteristically described as thinking, however, goes on without consciousness. Some sort of problem solving is involved in many responses to environmental stimuli, ranging from relatively complex tasks such as maneuvering a car through traffic to reflex adaptations to particular circumstances, such as the eye's capacity to orient its focus in response to detection of motion in the peripheral visual field. Sometimes the sorts of rules that govern our behavior, even when in some sense they seem to be the sort of thing that might be made conscious -- such as the bicycle riding rule -- are clearly unusable in that form. And even many of our more usable rules -- the rules of probability, for example -- are in fact much easier to learn to apply mechanically than they are to assimilate to the point of being used "naturally." Perhaps the rules of probability are in some sense "represented," or rather implemented, in us, but not in a way that we can, as it were, make contact with by learning them or bringing them to consciousness (see Kahneman and Tversky, 1982). Much the same is true of the "rules" of grammar and phonology that underlie our ability to produce and understand speech. Most conclusive in this regard are the experiments of Lackner and Garrett (1973) on dichotic listening, in which the input to the unattended channel, which is quite inaccessible to consciousness, nevertheless contributes to the disambiguation of the input to the "conscious" channel. In short, we have come to accept as a commonplace Lashley's dictum that "no mental activity is ever conscious." Certainly, then, some mental representation need not be.
Syntactic/Semantic Complexity
In the classical computationalist paradigm, the notion of a physical symbol system (see Newell, 1980) offers a neat way of explaining the origin of representation in causal systems, without running afoul of Cummins's distinction between analysis and instantiation. The central doctrine on which this rests is that complex mental representations are built up of simple parts in accordance with syntactic rules, and that the semantic content of a molecular representation is a function of the semantic contents of its parts and its syntactic structure. Mental operations are governed by the structure of the representations to which they apply (Fodor and Pylyshyn, 1988, pp. 11-12). In this scheme, "the physical counterparts of the symbols, and their structural properties, cause the system's behavior" (p. 14). They do so, ultimately, because of their functional architecture, that is, in virtue of their causal properties.
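A toy illustration, invented here purely for exposition and not drawn from Fodor and Pylyshyn, may make the doctrine concrete: the content of a molecular representation is computed from the contents of its atomic parts together with its syntactic structure, and systematicity falls out for free.

```python
# Contents of the atomic symbols (an arbitrary miniature lexicon).
ATOMIC_CONTENT = {"John": "JOHN", "Mary": "MARY", "loves": "LOVES"}

def content(expr):
    """Content of an expression: a lexical lookup for atoms, and a
    structure-driven combination for (subject, relation, object) triples."""
    if isinstance(expr, str):
        return ATOMIC_CONTENT[expr]
    subject, relation, obj = expr
    return (content(relation), content(subject), content(obj))

# Any system that can build ("John", "loves", "Mary") can, by the very same
# rules, build ("Mary", "loves", "John"): aRb and bRa come as a package.
print(content(("John", "loves", "Mary")))   # ('LOVES', 'JOHN', 'MARY')
print(content(("Mary", "loves", "John")))   # ('LOVES', 'MARY', 'JOHN')
```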
Fodor and Pylyshyn emphasize that this way of conceiving of thought assimilates it, in effect, to the process of formal inference:
The other problem is that the production system model is an implausible one for much of what we humans -- as animals -- can do. (It is a pregnant irony that computers are now relatively good at some of the reasoning tasks that Descartes thought the secure privilege of humans, while they are especially inept at the "merely animal" functions that he thought could be accounted for mechanically.) Bicycle riding is only one of myriad tasks that can't readily be learned by applying any explicit rule, even when we know what the rule should be. Moreover, there are also plenty of tasks, particularly those that can be grouped under the general heading of evidential inference, for which we have not even been able to devise appropriate rules, let alone apply them. Fodor and Pylyshyn admit that classical models don't take care of "evidential" inferences. But they seem to think this is no special difficulty for the classical paradigm, because "the problem about evidential logic isn't that we've got one that we don't know how to implement; it's that we haven't got one" (Fodor & Pylyshyn, 1988, p. 30). But that's just it: if we have failed in so many efforts to discover the formal principles of evidential logic, that's possibly because we don't use formal principles in drawing such inferences.
To lay too much stress on such facts, one might object, is to fall prey to the delusion of phenomenology: just because we aren't aware of applying such rules doesn't mean we are not doing so. I have already argued, after all, that representation needs neither consciousness nor explicitness. Perhaps we haven't found the right inferential rules because we haven't looked hard enough. That seems to be Fodor and Pylyshyn's view: "That infraverbal cognition is pretty generally systematic seems... to be about as secure as any empirical premise in this area can be" (p. 41). Their reason for that act of faith is that even "animal thought is largely systematic: the organism that can perceive (hence learn) that aRb can generally perceive (/learn) that bRa" (p. 44).
To be sure, the appearance of systematicity must be explained. In itself, however, systematicity requires something less than the strict compositionality to which Fodor and Pylyshyn assimilate it, and which derives from the Language of Thought hypothesis. What Fodor and Pylyshyn are actually demanding here, I suspect, is something other than mere systematicity. It is digitality.
Digitality
Digitality is actually one of Plato's discoveries. In essence it is the substitution of a three-term relation for a two-term relation in the real definition of resemblance. Under what circumstances would that be particularly useful? The answer is: when you want a taxonomy to remain stable through multiple replication. Any public information needing to be reproduced over and over again will degrade hopelessly fast unless it is digitized. If you want to reproduce some complex thing, copying exactly is an unattainable ideal. However careful you may be, errors will creep in. And those errors will be additive, taking you on a random walk which will inexorably lead arbitrarily far from its origin. By contrast, if we compare each thing to a paradigm and not to the latest in a series of copies (provided the paradigms have been suitably chosen to avoid ambiguity), reproductive errors will not accumulate.
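A toy simulation, entirely of my own devising, makes the contrast vivid. A numeric "trait" is copied ten thousand times with small Gaussian noise: copying the latest copy each time produces a random walk away from the original, while snapping each copy back to the nearest of a fixed set of paradigms corrects the errors instead of accumulating them.

```python
import random

PARADIGMS = [0.0, 1.0]   # a tiny, arbitrary "alphabet" of paradigm values

def analog_copying(original=0.0, noise=0.05, generations=10_000):
    """Copy the latest copy each time: errors add up into a random walk."""
    value = original
    for _ in range(generations):
        value += random.gauss(0.0, noise)
    return abs(value - original)

def digital_copying(original=0.0, noise=0.05, generations=10_000):
    """Copy, then compare the result to the paradigms and keep the nearest one."""
    value = original
    for _ in range(generations):
        value += random.gauss(0.0, noise)
        value = min(PARADIGMS, key=lambda p: abs(p - value))
    return abs(value - original)

print("drift after analog copying: ", analog_copying())   # on the order of noise * sqrt(generations): several units
print("drift after digital copying:", digital_copying())  # almost always exactly 0.0
```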
That, I suggest, is what lies behind the insistence of Fodor and Pylyshyn and other defenders of the classical paradigm that there must be a language of thought that is systematic. But if this is right, then their argument is just slightly beside the point. For then what you really need the digital processes of language for is the cultural transmission of information, but not necessarily for the information that is processed internally.3 (Smolensky, 1988, p. 4a). Because language is, at different levels, (relatively) digital, it can embody the "cultural program" running on the individual "virtual machines" constituted by individual members of a social group (ibid.). But that leaves open the possibility, advocated by the Connectionists, that there exists a different mode of information processing, either in addition to or underlying the performance of that part of the brain that functions like a production system. That "intuitive processor" must also, of course, be systematic, but it doesn't have to be digital. Causal laws of nature, after all, are not digital, but they are highly systematic.
Smolensky's challenge to the role of language
Smolensky (1988) is meant to address both of the problems faced by classical architecture. He suggests that the "intuitive processor" which is "presumably responsible for all of animal behavior and a huge portion of human behavior" (5a) has a connectionist architecture, and claims to explain the genesis of genuine representation.
For our purposes, the crucial features of connectionist architecture are these:
A connectionist network is a network of nodes, typically arranged in several layers, one of which is labelled the Input layer and another of which is the Output layer. All the nodes in any one layer are connected to all the nodes in the adjoining layer or layers, but the connection strength can be varied, so that the level of transmitted activation (or, in stochastic versions, the probability of transmission) can vary between 0 and 1. In the case of intermediate layers, there may be no direct access to their pattern of activation at any particular time. Activation of any particular node is determined by the activation of the nodes to which it is connected, and typically depends on its own particular threshold. Given that threshold, whether or not a node is active is determined by the sum of the activations of the nodes to which it is connected, weighted by their connection strengths. The interest of such networks is that the initial connection strengths can be assigned at random, and that their input-output functions can be modified by certain systematic tricks, such as "backward error propagation" (Rumelhart et al., 1985), which result in modification of those weights or connection strengths. In that way, the behavior of the connectionist network as a whole can be modified by learning, and the network as a whole can be said to store information in its connection strengths.
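For concreteness, here is a minimal, schematic sketch, in illustrative Python, of the kind of unit and layer just described; the layer sizes, random weights, and thresholds are arbitrary, and learning is only gestured at in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in_hidden = rng.uniform(-1.0, 1.0, size=(4, 3))   # input-to-hidden connection strengths, assigned at random
W_hidden_out = rng.uniform(-1.0, 1.0, size=(3, 2))  # hidden-to-output connection strengths

def layer(activations, weights, threshold=0.0):
    """Each unit sums the incoming activations, weighted by connection strength,
    and is active (1.0) just in case that sum reaches its threshold."""
    return (activations @ weights >= threshold).astype(float)

x = np.array([1.0, 0.0, 1.0, 1.0])      # an input pattern
hidden = layer(x, W_in_hidden)           # pattern of activation on the intermediate layer
output = layer(hidden, W_hidden_out)     # pattern of activation on the output layer
print(hidden, output)
# Learning -- e.g. backward error propagation -- consists in nudging the two weight
# matrices so that, over many examples, the output pattern approaches a desired pattern.
```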
What is misleading about these examples is that the nodes represent identifiable features of the objects to be recognized, and that these features have, in effect, been just as securely "canned" as any category embodied in a classical von Neumann computer program. But the prospect that Smolensky has in mind is actually a good deal more radical. He rejects both the classical or "implementationist" position, according to which connectionist networks may simply be ways of executing a classical von Neumann type program, and any "eliminativist" position which would aim to do away with the level of conceptual computation altogether. Instead, he claims that
The LIFO-FILOcal Principle
Actually, both the glories and the miseries of classical architecture models are striking. They can be explained, I suggest, in terms of the LIFO-FILOcal principle: Last In, First Out/First In, Last Out. What we understand best about our minds is what our brains invented last, while we are likely to pierce last the secret of those skills that our brains evolved first.
The glories of the symbolic paradigm derive from the power of logic and sequential reasoning: a piece of intellectual technology more fabulous than the wheel, and more recent. It is an elaboration of what was, as far as I know, an invention of Aristotle's: namely the invention of formal linguistic representations of abstract patterns of inference. That idea, which lies at the core of classical Artificial Intelligence, is in turn based on Plato's discovery of a world of abstract objects that can be modeled in the real world, but never identified with anything in the real world. The modern concept of functionalism was anticipated by Plato's notion that certain realities can indifferently be modeled in many different material supports, because they are not ultimately identifiable with any of them. We now know, thanks to Church and Turing, that the elaboration of Aristotle's discovery is so powerful that one type of device -- the universal Turing machine -- can be made to do anything, given world enough and time, that is effectively computable.4
Now it's that last, incredibly recent achievement (which it is impossible to imagine without language) that computer science has naturally tackled first, not only because it is so powerful but also because it is (relatively) so easy. And it is easy, of course, because we have invented it (or discovered the basic principles that govern it). By the same token, however, these are not procedures that we are naturally very good at: it takes a lot of practice to do a little strictly deductive logical inference, and even then we make mistakes.
On the other hand, such skills as seeing and moving are things we're very good at: we've been practicing those for millions of years. We find it very easy to see, but enormously more difficult to discover the procedures that we use to do it. For several reasons we shouldn't wonder at this: they are more deeply buried, and we didn't consciously make them up. Because they evolved, moreover, we cannot expect them to be especially simple. For natural selection is the ultimate anarchy of hackers, and every programmer knows how a program that a few dozen hackers have tinkered with, let alone a few million, can become hopelessly opaque. The devices hacked together by evolution will sometimes be baroque in the extreme.
Such, then, we should expect Smolensky's "intuitive processor" to be. And among the reasons for thinking that this intuitive processor works along the lines of a connectionist machine is the fact that such networks have, in some simple cases, apparently been able to perform some categorization, at various levels of supervision. And such categorization presumably involves representation. Whether such representation can be labeled "intrinsic" or genuine will presumably depend on the exact nature of the grounds for its ascription.5 But remember that on Smolensky's view there are not just two levels -- the conceptual level, and the level of hardware -- but three: the conceptual, the neural, and in between the subconceptual level, which, according to his hypothesis, is best described in terms of the subsymbolic paradigm. And that paradigm differs from the symbolic paradigm in the relation of the conceptual level to lower levels. In the symbolic paradigm, there is no switch to a lower level: lower levels are simply either subroutines at the same (conceptual) level, or else they are merely implementational, and of no more conceptual relevance than the weight of the computer is to the program it is running. By contrast, "subsymbolic explanations rely crucially on a semantic (`dimensional') shift that accompanies the shift from the conceptual to the subconceptual levels":
But this difference returns us to our central problem: if the intermediate level involves numerical vectors, why is it -- as both Smolensky and his opponents, Fodor and Pylyshyn, agree (see Smolensky, 1988, p. 14d; Fodor and Pylyshyn, 1988, p. 11) -- a genuinely representational level, and not merely an underlying physical level? Why is the "representation" involved at that level not merely a form of E-representation analogous to the stone's "representation" of gravity?
Why is the subconceptual level representational?
Smolensky's answer is in terms of two factors: teleology and complexity:
As for the criterion of teleology, it is not immediately obvious how it can escape the force of Searle's problem either. For what, in this context, is a "goal condition"? In relation to what or whom is it a goal? If the goal is defined by reference to some external factor, then it's not intrinsic. But in virtue of what can it be claimed to be intrinsic? And besides, what scientific legitimacy can we allow to the notion of a goal, where this is explicitly not assimilable to anyone's conscious or explicit intention?7
The suggestion needs to be supplemented with an analysis of teleology from a more explicitly evolutionary perspective.
An Evolutionary Criterion
The kind of account we need is one popularized in the recent literature on teleology,8 where the notion of a goal-directed process is analyzed in terms of a certain etiology of the process in question. Very roughly,
Just such a criterion has recently been suggested by Mohan Matthen (1989) to illuminate the notion of semantic content. He introduces it to address what he calls the "Parallelism problem": why is it that we are caused to go from one belief to another in a way that matches the logical relations between the contents of the respective beliefs? Matthen points out that the Fodor and Pylyshyn answer to this problem -- the requirement of compositionality discussed above -- only yields a partial answer. It doesn't actually explain why syntax follows logic: why "the syntactic rules we follow happen to be rules that have a measure of semantic validity" (Matthen, 1989, p. 563). Matthen's answer is that we must understand the very notion of content in terms of evolutionary selection. Specifically, in typical cases, selection will favour a correlation between the content of a detector state and the content of a corresponding effector state (Matthen, 1989, p. 567). Crudely put, if you do what's relevant to your situation, you'll survive; if your inference patterns lead to irrelevancies, you will not.
Now this explanation is designed to solve a problem about Smolensky's "conscious rule interpreter". But clearly the same evolutionary explanation can be extended to the "intuitive processor." And this idea can at last allow us to differentiate between the eye's "knowledge" of calculus and Smolensky's intermediate subsymbolic level, on the one hand, and the mere conformity to physical law exemplified by a falling stone or a soap bubble computer on the other. It gives a rationale for our initial common-sense intuition that just falling over couldn't require knowledge or representation of any laws of kinematics, because it wasn't behavior. The first two sorts of device, but not the last two, are there because they serve certain functions. (One example given by Smolensky is the "prediction goal: Given some partial information about the environmental state, correctly infer missing information" (Smolensky, 1988, p. 15b).) Similarly, Marr's zero-crossing detectors are, of course, at some level of analysis just the working out of certain physical laws. But the presence of cells that behave just so must be accounted for in evolutionary terms. Or at least that is the hypothesis on which the claim that they are truly representational must rest.
Understanding understanding
I have tried to contribute to an elucidation of the general conditions under which it is reasonable to ascribe genuine representation. The resulting criterion has, in particular, supported Smolensky's claim that we may need, to account for our infra-linguistic capacities, a level of analysis which is above the physical, because it instantiates genuine semantic or representational characteristics, but which cannot be identified with the usual "conceptual" level more directly explained by the symbolic paradigm.
This perspective may, however, entail an interestingly high price to pay in terms of the level of understanding to which we may aspire with respect to our own mental representations.
Lord Kelvin once wrote:
Lord Kelvin's remarks conjure up an image of a do-it-yourself world, where you could build anything you could design in your home lab. Maybe, in the relevant sense, such building just isn't possible any more. But the kind of building that was envisaged by the classical AI program was certainly of that kind: you could understand everything, precisely because you could build it out of perfectly transparent, Platonic, logical devices. Not so on the connectionist program, where the builders are sometimes so proud of not having programmed the machines that they in effect boast of not understanding exactly how they work -- even though they built them. So building something is perhaps no longer a sufficient condition of understanding.9,10 The best we can do is to understand it "in principle," rather as we understand certain physical events in principle in terms of gravity, wind resistance, friction, etc., without actually being able to read all the relevant parameters precisely enough to get any prediction.
Connectionism may yet solve the two hardest general problems of cognitive science -- the technical problem of modeling intuitive knowledge, and the philosophical problem of how meaning gets into the brain. But it may have bought those solutions at a high price: the price of a radical lowering of the standard of understanding we can expect in cognitive science. Part of this lowering of expectations can be attributed to the stochastic nature of the neural activity which underpins cognition: in that sense, the situation is much as it is in classical thermodynamics in relation to statistical thermodynamics: we know about pressure well enough, and we know what sorts of statistical events underlie the phenomena of gas volume and pressure. But we don't know in detail what happens in any particular case, and couldn't possibly ever do so. On the other hand, we don't much care, either: and that's because the reduction in that case is complete. If we are interested in pressure and volume, accurate statistical information will be enough to tell us all we want. (We won't be able to say even this much if we are dealing with processes that are sensitive to quantum effects.) If Smolensky is right, however, and the level of cognitive discourse is only imprecisely predicted by the hard formalism that governs the subsymbolic level, then it seems we must adopt Steve Stich's verdict that "in those domains where connectionist models prove to be empirically superior to symbolic alternatives, the inference to be drawn is that mental symbols do not exist" (Stich, 54). But if that's so, and if, given the requirements of our linguistic code, we will forever continue to need to talk as if they did, then we will have one more principled reason for thinking that in some strong sense we can never understand ourselves.
Bennett, Jonathan (1976) Linguistic Behavior (Cambridge, Cambridge University Press).
Carroll, Lewis (n.d.) "What the Tortoise said to Achilles," in The Complete Works of Lewis Carroll (New York, The Modern Library), pp. 1225-1230.
Cherniak, Christopher (1986) Minimal Rationality. (Cambridge, MA, MIT Press: A Bradford Book)
Cummins, Robert (1983) Psychological Explanation (Cambridge, MA, MIT Press: A Bradford Book). (See esp. II.3: "Representation and internal manuals".)
Dewdney, A.K. (1985) "Analog Gadgets That Solve a Diversity of Problems and Raise an Array of Questions," Scientific American 252, 5, pp. 18-24.
Fodor, Jerry A. and Pylyshyn, Zenon (1988) "Connectionism and Cognitive Architecture," in Connections and Symbols, ed. Steven Pinker and Jacques Mehler (Cambridge, MA, MIT Press) (= Cognition Special Issue, 28).
Hofstadter, Douglas (1985) "Waking Up from the Boolean Dream," in Metamagical Themas (New York, Basic Books).
Johnson-Laird, Philip (1988) The Computer and the Mind: an Introduction to Cognitive Science (Cambridge, MA, Harvard University Press).
Kahneman, David & Amos Tversky (1982) "On the study of statistical intuitions," in Kahneman, D., Slovic, P., and Tversky, A., eds., Judgment Under Uncertainty: Heuristics and Biases (Cambridge and New York, Cambridge University Press).
Kosslyn, Stephen M. and Gary Hatfield (1984) "Representation without Symbol Systems," Social Research 51, pp. 1019-1045.
Lackner, J.R. and M. Garrett (1973) "Resolving ambiguity: effects of biasing context in the unattended ear," Cognition, pp. 359-372.
Marr, David (1982) Vision (San Francisco, Freeman).
Matthen, Mohan (1989) "Intentional Parallelism and the Two-Level Structure of Evolutionary Theory," in Issues in Evolutionary Epistemology, ed. C. Hooker (Albany, SUNY Press).
Newell, Allen (1980) "Physical Symbol Systems," Cognitive Science 4, pp.135-183.
Pylyshyn, Zenon (1985) Computation and Cognition 2nd ed. (Cambridge, MA, MIT Press: A Bradford Book).
Reeke, George and Edelman, Gerald (1988) "Real Brains and Artificial Intelligence". In Artificial Intelligence, special issue of Daedalus, Winter 1988. Republished by MIT Press 1989.
Rumelhart, D.E., G.E. Hinton, and R.J. Williams (1985) "Learning Internal Representations by Error Propagation," in Rumelhart and McClelland (1986).
Rumelhart, David E., James McClelland, and the PDP Research Group (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations (Cambridge, MA, MIT Press: A Bradford Book).
Searle, John R. (1983) Intentionality: An Essay in the Philosophy of Mind (Cambridge, Cambridge University Press).
Searle, John (1980) "Minds, Brains and Programs," Behavioral and Brain Sciences 3, pp. 417-424.
Smolensky, Paul (1988) "On the proper treatment of connectionism," Behavioral and Brain Sciences 11. (References to this article include "quadrants": a and b represent the top and bottom halves of the first column, c and d of the second.)
Taylor, Charles (1964) The Explanation of Behaviour, International Library of Philosophical and Scientific Method (London, Routledge and Kegan Paul).
Wright, Larry (1973) "Functions," Philosophical Review 82, pp.139-168.