Fodor and Pylyshyn Refuted:
Compositionality, Systematicity, and the Power of Distributed Representation

Ken Marable
Bachelor's Thesis
9-4-95

Contents

Bibliography
Footnotes


"I'm the only President you've got." - Lyndon B. Johnson

"The Earth is the only center of the universe you've got." - a typical reply to Copernicus

1. Introduction

Connectionism is dead. At the very least, Jerry Fodor and Zenon Pylyshyn (1988) claim that connectionism as a cognitive architecture is utterly fruitless and no more time should be wasted trying to save it. It can technically still be used as an implementation model, but even then it is not an especially beneficial theory.

I disagree. I believe that connectionism is not only alive and thriving but is beginning to suffocate classicism as a new direction for artificial intelligence research. For the purposes of this paper, however, I will merely attempt to show that despite Fodor and Pylyshyn's critique, connectionism is a viable alternative with quite a bit of promise.

First off, let's see just what is at issue here. Artificial intelligence has been an area of intense interest for psychologists, computer scientists, and philosophers. With the emergence of the unified discipline of cognitive science, these fields have been able to interchange ideas far more than ever in the past. The growth of knowledge in this field has been staggering over the past several decades alone.

Not only has significant progress been made towards intelligent machines, but we have also learned much about our own mind in the process. Models of human cognition have come and gone, but we do seem to be narrowing the range. Parallel to this have been the varying models of potential computer cognition. With a substantial amount of detail variation and some occasional, but short-lived, alternatives, a primary model for both human and computer thought has been developed.

The dominant models in each field are remarkably, and I doubt very coincidentally, quite similar and, moreover, are easily compatible with each other. The primary style of computing that nearly all artificial intelligence research uses is the von Neumann style. The most relevant point is that it is a serial, symbolic style; a particular form of the more generalized Turing Machine. Although this is a very broad category, only a general understanding of it is necessary for my purposes.

Whereas von Neumann and Turing Machine models have pretty much dominated computer science since its onset, models of human cognition were less consistent. However, for approximately the past 20 years, mild variations of Fodor's (1975) Language of Thought model have reigned. I will explain it in detail in the next section, but suffice it to say, it is quite serial and symbolic as well. Both the computer and the cognitive models deal with a series of processes over discrete symbolic representations.

There has always been an upstart or two to challenge each view, though usually with little or no success. Recently, a new breakthrough within computer science has begun a revolution in nearly all the disciplines of cognitive science. It has been called a clear example of a Kuhnian paradigm shift for each of those fields. It is connectionism.

There were ancestors to connectionism in both computer science and philosophy of mind dating back to Kant, but, as Fodor and Pylyshyn so gladly point out, they failed. Why should this new version of an old theory rear its ugly head again and still be taken seriously? It should be because now computer scientists have developed sophisticated models that are able to empirically prove what were merely wild speculations and highly abstract arguments. Previously, there were only guesses and hypotheses of what these models could do, but now we can directly see. Even though computer scientists are now able to directly test some philosophical ideas, the greatest advances have been entirely new discoveries made with the connectionist networks that have surprised and inspired philosophers.

Later in the paper, I will give a better explanation of the broad field of connectionism, but first a quick glimpse for those new to the paradigm. Connectionist models are not serial in nature, but primarily parallel networks of highly interconnected nodes. These nodes, which can be vast in number, are very simple processors, each often just summing its input, perhaps performing a simple calculation, and then giving the appropriate output. This is a very different picture from the serial models, in which a single, complex processor does all of the work.

To better understand the differences between the two let's compare how they would each handle a certain operation - data retrieval for instance. Rumelhart and McClelland developed a network that stores and retrieves information, in this case information about the various members of the Jets and Sharks gangs (1986). A serial machine would best store the information in a list-like database as shown in Figure 1. The computer would run a specific retrieval program that would scan certain rows and columns searching for keywords to get the desired information.
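As a rough illustration of the serial approach, the sketch below stores records in a list and scans them one row at a time. The two records use only the characteristics this paper attributes to Ralph and Art; the field names and the `find` helper are assumptions made for the example and are not taken from Rumelhart and McClelland.

```python
# Hypothetical list-like database; the field names are illustrative assumptions.
RECORDS = [
    {"name": "Ralph", "gang": "Jet", "age": "30s", "education": "Junior High",
     "marital_status": "single", "profession": "Pusher"},
    {"name": "Art", "gang": "Jet", "age": "40s", "education": "Junior High",
     "marital_status": "single", "profession": "Pusher"},
]


def find(records, field, value):
    """Scan every row in order and return the rows whose field matches."""
    matches = []
    for row in records:  # one row at a time: a serial scan
        if row[field] == value:
            matches.append(row)
    return matches


ralph = find(RECORDS, "name", "Ralph")[0]
jets = [row["name"] for row in find(RECORDS, "gang", "Jet")]
```

Every query here is answered by the same retrieval routine walking the table; nothing about one record influences the retrieval of another.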

The network created by Rumelhart and McClelland works much differently. A scaled down version (for clarity) is diagrammed in Figure 2. Here the nodes are divided into groups. The groups are: name, age, "profession", marital status, gang affiliation, and education. These are connected by way of hidden nodes. These nodes are considered "hidden" because they do not receive direct input or have any sort of meaningful output. They are useful for connecting the different groups and, in some other connectionist network designs (as I will discuss later) they are essential for computations.

To retrieve information about a certain feature, you merely activate that node. For example, you can activate the Ralph node and Ralph's characteristics will also be activated: Junior High, Jet, single, Pusher, and 30's. Since the activation is two-way, there is an added benefit that can be of more use in other networks but is illustrated here. Art is a single Jet pusher with a Junior High education but is in his 40's, so when each of those nodes is activated, activation trickles back to the Art node. This activation is decreased in level enough that it does not greatly affect the rest of the network, but it is still significant, and its level is influenced by exactly how similar Ralph is to Art. This effect is essential to connectionist systems' ability to generalize, as I will address towards the end of my paper. You can retrieve any sort of information from this network. You can find out not only which of the individuals are Jets by activating that node but also, due to the two-way activation and generalizing, which profession is most prevalent among them, just by stimulating that single node. So it is quite apparent that connectionism and classical systems approach computation far differently.
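The retrieval just described can be sketched in a few lines. This is a deliberately simplified toy, not Rumelhart and McClelland's actual model: it has no hidden nodes, no inhibition, and no settling process. The `LINKS` table, the `spread` function, and the `decay` value of 0.2 are all assumptions chosen for illustration.

```python
# Hypothetical two-person network; every link is symmetric (two-way).
LINKS = {
    "Ralph": ["Jet", "30s", "Junior High", "single", "Pusher"],
    "Art": ["Jet", "40s", "Junior High", "single", "Pusher"],
}


def neighbours(node):
    """Return every node connected to `node`, in either direction."""
    if node in LINKS:
        return list(LINKS[node])
    return [person for person, props in LINKS.items() if node in props]


def spread(start, steps=2, decay=0.2):
    """Set `start` to 1.0 and pass a decayed share of activation along links."""
    activation = {start: 1.0}
    frontier = {start: 1.0}
    for _ in range(steps):
        incoming = {}
        for node, level in frontier.items():
            for other in neighbours(node):
                if other != start:  # the starting node is held fixed
                    incoming[other] = incoming.get(other, 0.0) + level * decay
        for node, level in incoming.items():
            activation[node] = activation.get(node, 0.0) + level
        frontier = incoming
    return activation


act = spread("Ralph")
```

In this toy, activating Ralph first activates his five characteristics, and the four that Art shares each pass a smaller amount back to the Art node. Art therefore ends up weakly active, in proportion to how many features he shares with Ralph, while the unshared "40s" node receives nothing.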

Connectionism was originally inspired by neuroscience, but the analogy should not be taken too far, at least not presently. The ultimate goal of many connectionists is to reduce these nodes to electronic neurons and therefore bridge the gap between neuroscience and cognitive psychology. Others - in fact a great majority - are content with an inexact structural similarity between the neural level and the cognitive level. It is important to emphasize that connectionism is still in its infancy. The general foundation has only been laid in the past few years and future directions are still being fine-tuned.

The problem now arises, which model of human cognition is correct? It would appear that connectionist networks are incompatible with the Language of Thought model. This is precisely the main thrust of Fodor and Pylyshyn's critique. Two main options are open: a) prove that connectionism, or a related form of it, is indeed compatible with Language of Thought, or b) prove that the connectionist model of cognition is better than Language of Thought. The first option is not very attractive to connectionists, and the latter is forbidding to say the least. However, these are actually only the two ends of a continuum of possible positions. A third position, intermediate between the two, also exists, according to which it is possible to form a new cognitive model that is at least as good as Language of Thought, if not better in some ways, while still preserving some of the best aspects of it. I shall argue for this third position.

Before we dive in, let's do some road-mapping for this paper. For starters, I will explain in greater detail the Language of Thought model and some of Fodor's main arguments in favor of it. I will then summarize Fodor and Pylyshyn's critique of connectionism, presenting their arguments as to why connectionism cannot support a Language of Thought. The primary issues are the way in which connectionism handles both mental representations and mental processes.

I then begin the defensive portion of the paper with an in-depth examination of how connectionism actually handles complex mental representations and a look at the preliminary work on structured mental processes in connectionist networks. From these I hope to show that Fodor and Pylyshyn's argument fails and that connectionism is at least a viable alternative. To finish up, I will make a comparison between the two approaches as they stand now and in their potential.

2. The Characteristics of Language of Thought

2.1. General Background for LOT

So just what is language of thought? It is within the broad category that Fodor calls intentional realism. He uses this term to cover the broad range of views that fundamentally all support physical intentionality; in other words, the view that the mental is a physical thing. Mental states can also be intentional: they can have meaning. This includes beliefs, desires, and other such "mental" stuff. Intentional realists thus include the majority of the philosophy of mind community (though it is not without its major dissenters).

Fodor begins to depart by claiming that mental states (beliefs and desires in particular) have causal roles for behavior, but that meaning does not arise from causal relations. Fodor believes in a more innate style of semantics. He believes that representations gain their meaning as an intrinsic property by virtue of causal connections with what they represent and that relations with other representations are inconsequential. He does not see intentionality arising from the relations between mental states, a view that he calls "functional-role semantics". I will not belabor this point right now; I will go into more detail about it later, since Fodor and Pylyshyn criticize connectionism on the very point that meaning is determined by relations. Suffice it to say that Fodor believes that language of thought models cannot gain intentionality through internal relations.

2.2. Combinatorial Structure

Some of the criteria specific to LOT are much more mainstream, but as I will show later, they are still far from entirely accepted. Fodor goes into more detail on them, but these criteria can be summarized by the statement that a language of thought must have combinatorial structure for its mental states and representations.

To have this combinatorial structure means that mental states can have an internal structure consisting of other mental states. Mental states can be combined with other mental states to become more complex states. For example, the thought "raise my left arm and hop on my right leg" is a complex mental state composed of the two mental states "raise my left arm" and "hop on my right leg".1

For clarity, we will only deal with intentions to make something true. So the intentional state of wanting to raise my left hand only means that I want to make it true that my left hand is raised. Whether or not this can cover all intentions remains to be seen, but it does seem to be an apt idealization. This move is useful in dealing with such things as unfulfilled intentions and complex, abstract intentions.

A clear analogy of this combinatorial structure was presented by Steven Schiffer (Fodor 1987, Sterelny 1991). He developed the notion of an intention box. Being in an intentional state could be seen as putting something into the intention box. The intention box "churns and gurgles" and at least attempts to make it happen. There may be intermittent gaps that prevent the intention from being realized, so our intention boxes can only do so much (e.g. I can be in the intentional state that I will be Emperor of the Universe all I want, but it sure is not going to happen any time soon).

For example, if I am in the intentional state of wanting to raise my left hand, I simply put the proper mental state thing into my intention box and voila! My left hand raises. Now for raising my left hand and hopping on my right foot, I put a mental state thing into my intention box but, under the language of thought theory, this mental state thing is made up of the mental states of the two individual actions conjoined in the proper fashion. Those who disagree with a language of thought on this point would want to have a third, completely unrelated mental state for raising my left hand and hopping on my right foot. This mental state would have absolutely no relation at all to the other two except by coincidentally similar behaviors.

2.3. Arguments for Combinatorial Structure

Fodor's main strategy in supporting a language of thought is to argue against the plausibility of non-languages of thought2 (a position supported in Fodor's articles by "Aunty"). Since only language of thought is left, it is the best we have so far. As Fodor puts it in a famous quote from Lyndon Johnson, "I'm the only President you've got" (Fodor p.27, 1975). I must give Fodor credit in that this is not his entire strategy. He does show that on the points that non-language of thought models fail, the language of thought model does quite well. However, Fodor's main technique is that of shooting down everyone until you are the only one left standing. Fodor offers three main arguments for a language of thought. The first is a methodological attack on the non-language of thought intentional realists. Then he looks at psychological processes, and finally at the issue of systematicity. On all three, surprisingly enough, language of thought not only does quite well, but AIR [Aunty's Intentional Realism] handles them quite poorly.

2.3.1. Argument from Methodology

First the methodological attack. The AIR model, which is quite opposed to combinatorial structure, relies upon holistic mental states. These cannot be divided into constituent parts (e.g., raising my left hand and hopping on my right foot is a whole, indivisible mental state unrelated to the two that a language of thought would see as constituents of it).

Fodor believes that this violates a fundamental rule of scientific inference that wants to postulate the least number of "accidents".

Principle P: Suppose there is a kind of event c1 of which the normal effect is a kind of e1; and a kind of event c2 of which the normal effect is a kind of event e2; and a kind of event c3 of which the normal effect is a complex event e1 & e2. Viz.:
c1 => e1
c2 => e2
c3 => e1 & e2
Then, ceteris paribus, it is reasonable to infer that c3 is a complex event whose constituents include c1 and c2. (Fodor p.141, 1987).
This is a particular case of a more general principle known as Occam's Razor. It states that it is best to have the simplest solution possible, with the fewest unseen factors possible. The AIR violates Occam's Razor by postulating far more mental states than a language of thought with its combinatorial structures.

Also, the similarity in behavior between the mental state of raising my left hand and hopping on my right foot and the two separate mental states of raising my left hand and hopping on my right foot is purely accidental in AIR. They are unrelated mental states and any similarity in outcome is coincidence. Occam's Razor was developed to shave off such ad hoc explanations as this. It seems quite clear that this generalized theory of AIR is methodologically vulnerable at best.

2.3.2. Argument from Psychological Processes

The second main support of LOT/objection to AIR is how each deals with mental processes. Aunty hates the entire idea of mental representation. She refers to it as "ontological promiscuity" (Fodor p.144, 1987). For example, Aunty believes that when someone talks to you, there is only the utterance and an "Unknown Neurological Mechanism" that works on the utterance between the ear and the conscious self so that it is heard already analyzed. There are no representations of any sort. The listener hears it analyzed already, so any mental representations would be superfluous.

Both Fodor (1987) and Kim Sterelny (1991) argue that mental representations are necessary. Sterelny cites problem solving in support of this. People use a hypothesis/test method for problem solving (this includes the full range from abstract psychology experiments in the area to mundane analysis of what is happening in one's environment). This methodology requires some form of representation in which to formulate hypotheses to test. If all of our hypothesis testing had to be done externally rather than mentally, then humans would most likely not have evolved quite as far as we have. Mental trial-and-error has definite survival advantages over external trial-and-error.

Fodor sticks with the sentence comprehension examples and argues that mental processes are defined by the mental representations over which they function (1987). Mental processes, on his account, are a series of mental representations and a transition from one to another is completed by performing operations on the representations. For example, Wh- questions such as "Who is John?" can be easily converted to "John is who?" by performing simple operations on a mental representation that can be seen as a basic parsing tree. Aunty says a somewhat similar thing with her "Unknown Neurological Mechanism" that converts the expression before the listener hears it. Aunty's explanation is just hand-waving, whereas Fodor offers an explanation for how this occurs - a language of thought.

It is widely supported that mental processes are computational, so there must be some sort of structure that allows one to transfer parts to and from other structures without altering the rest. The parsing tree model of mental representations is compatible with this (and not with AIR), and, Fodor tells us, appears to be the best explanation so far.

2.3.3. Argument from Systematicity and Compositionality

Last, Fodor argues that the systematicity of thought requires a language of thought. One of the major features of language of thought that other theories have trouble dealing with is the ability to create new thoughts systematically - having novel mental states without having to randomly put stuff together until we get something that happens to work.

Systematicity is illustrated by the ability to move parts of complex structures to create new ones. Being able to have the mental state John loves Michael also means that you necessarily can have the thought Michael loves John. It is quite apparent that thought is systematic. Parts can be rearranged, removed, and added according to a set of rules to create novel mental states.

AIR must rely on a phrase-book type of system. Since AIR's mental states are indivisible they cannot be systematic. All that Aunty can do is memorize a massive list of sentences, just like a non-native speaker reading from a large phrase-book. However this phrase book would have to be astronomically expansive from the beginning unless you can somehow add new thoughts to it. With thoughts being whole objects, a theory explaining the addition of new ones has its work cut out for it.

With all thoughts coming from a massive phrase book, it is also possible, and even likely, that the situation of being able to think John loves Michael but being utterly unable, no matter how hard you tried, to ever think Michael loves John, could arise. This situation not only does not seem to occur in any moderately complex organisms, but it is implausible even to conceive of it occurring. This position is not seriously supported by anyone that I have come across.

Language of thought models with their combinatorial semantics can explain the systematicity of thought just as it does for public languages. However, in a (somewhat) noble move, Fodor points out a flaw in this reasoning. The argument that thought has combinatorial structure because of systematicity goes along as follows:
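The contrast between a combinatorial structure and a phrase book can be made concrete with a small sketch. The `Loves` type, the `swap` operation, and the `PHRASE_BOOK` table are hypothetical names invented for this illustration; they are not drawn from Fodor.

```python
from typing import NamedTuple


class Loves(NamedTuple):
    """A structured thought whose constituents are separately accessible."""
    lover: str
    loved: str


def swap(thought: Loves) -> Loves:
    """A structure-sensitive operation: exchange the two constituents."""
    return Loves(lover=thought.loved, loved=thought.lover)


# Phrase-book style: each thought is an unanalysed whole under an opaque key.
PHRASE_BOOK = {"thought_17": "John loves Michael"}


def phrase_book_can_think(sentence: str) -> bool:
    """A phrase book can only produce the entries it already contains."""
    return sentence in PHRASE_BOOK.values()


structured = Loves("John", "Michael")
converse = swap(structured)
```

With the structured form, the ability to think that John loves Michael brings the converse thought along for free, because one rule applies to any pair of constituents. The phrase book holds only what was explicitly entered, so nothing prevents it from containing one thought without the other.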

(a) There's a certain property [systematicity] that linguistic capacities have in virtue of the fact that natural languages have a combinatorial semantics.
(b) Thought has this property too.
(c) So thought too must have a combinatorial semantics. (Fodor p.148, 1987)
The problem here is that pesky little logical fallacy called affirming the consequent (P -> Q, Q / P). "Since language has combinatorial semantics, it is systematic. Thought is systematic also, therefore thought also has combinatorial semantics." Fodor admits this, but says that it is alright here.

As Fodor puts it, "one man's affirming the consequent is another man's inference to the best explanation" (Fodor p.149, 1987). Since Fodor is only trying to prove that LOT is better than his opponents, he merely has to show that it explains the facts better than the rest. He does not have to prove that it is necessarily true. To further show the power of affirming the consequent to get a conditionally true conclusion, all of science is based completely on this logical fallacy. (If my theory is true, then the evidence will come out this way. The evidence came out this way, so my theory is true.)

Compositionality is quite similar to systematicity and Fodor and Pylyshyn even say that "perhaps they should be viewed as aspects of a single phenomenon" (1988, p.41). Pretty much what systematicity is to mental processes, compositionality is to mental representations. It is the clearly defined structure that the mental processes operate over. A mental representation is compositional if and only if it is structured in such a way that the information in it is accessible to mental processes. So the systematicity of mental processes depends on the compositionality of its representations. The compositional structure of the representations depends in turn on combinatorial syntax. The arguments surrounding systematicity apply directly to compositionality, at least for now.

It is quite clear that language of thought models provide a better explanation of systematicity and compositionality than AIR's phrase book model. Not only is language of thought better than AIR, it is apparent that due to systematicity, AIR leads to unacceptable conclusions. To conclude this portion, keep in mind that the characteristics that a language of thought requires are systematicity in mental processes and compositionality of mental states, both of which derive from a language of thought's combinatorial syntax and semantics. Now I will look at Fodor and Pylyshyn's analysis of artificial intelligence with connectionism as their main target.

3. The Attack of Fodor and Pylyshyn

3.1. Fodor and Pylyshyn's Version of Connectionism

I will now offer an overview of Fodor and Pylyshyn's arguments against connectionism, of course, with the trimming of extraneous connectionist bashing that fails to form any sort of argument. However, you may have noticed that I am starting with their critique of connectionism without really laying out the connectionist view. This is intentional. My main objection to Fodor and Pylyshyn's critique is the form of connectionism they assume. So let's first take a look at the argument on their terms before we look at the connectionist views that people actually believe in.

Fodor and Pylyshyn define connectionist systems as a large network of nodes that sum all of their input and then output some value according to a certain simple function. Rather than having some single, complex processor to carry out the mental functions, there are a vast number of very simple processors that together carry out the mental functions. This is the fundamental difference between connectionism and classical systems. This apparently small shift in processing style actually leads to some very large differences on key issues.

It will be important to note that Fodor and Pylyshyn discuss a localist version of connectionism. This means that semantics is assigned directly to individual nodes as opposed to being distributed over a large number of nodes. So in a localist network, each node would have a specifically assigned meaning, whereas in a distributed network groups of nodes would be assigned meaning.
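The localist/distributed distinction can be sketched with two toy encodings. The concept names reuse the labels A, B, and A & B that appear in this paper; the particular activation patterns, especially the distributed ones, are arbitrary values made up for the example.

```python
# Localist: exactly one unit stands for each concept.
LOCALIST = {
    "A & B": (1, 0, 0),
    "A": (0, 1, 0),
    "B": (0, 0, 1),
}

# Distributed: each concept is a pattern over several units (arbitrary values).
DISTRIBUTED = {
    "A & B": (1, 1, 1, 0),
    "A": (1, 1, 0, 0),
    "B": (0, 1, 1, 0),
}


def is_localist(code):
    """True if every pattern has one active unit and no unit is reused."""
    active = [pattern.index(1) for pattern in code.values()
              if sum(pattern) == 1]
    return len(active) == len(code) and len(set(active)) == len(code)


def overlap(p, q):
    """Count the units that are active in both patterns."""
    return sum(x & y for x, y in zip(p, q))
```

In the localist code no two concepts ever share an active unit, so each meaning can be read off a single node. In the distributed code no single unit carries a meaning on its own, and different concepts can share active units.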

Fodor and Pylyshyn see this shift to distributed representations as irrelevant to the issue at hand. In their eyes, distributed representations have no advantage over localist models. They are both still fundamentally connectionist and consequently their arguments should apply similarly. This point is clearly made in their first footnote:

The difference between Connectionist networks in which the state of a single node encodes properties of the world (i.e., the so-called 'localist' networks) and ones in which the pattern of states of an entire population of units does the encoding (the so-called 'distributed' representation networks) is considered to be important by many people working on Connectionist models. Although Connectionists debate the relative merits of localist (or 'compact') versus distributed representations, the distinction will usually be of little consequence for our purposes, for reasons we give later. (Fodor & Pylyshyn, 1988 p.5)
In actuality, there is little debate amongst connectionists. Nearly all believe that distributed representations are not only the better of the two forms for connectionist systems, but also believe that they are connectionism's saving grace. I will get to that later, though. For now we will deal exclusively with localist networks.

Fodor and Pylyshyn then point out that the only relevant "primitive relation" in connectionism is the causal relation between nodes (i.e., such and such node affects these other nodes in such and such a way). Classical systems not only recognize causal relations but also "a range of structural relations, of which constituency is paradigmatic" (Fodor and Pylyshyn, 1988, p.12). This fact, that classical systems have constituency and other structural properties and connectionist systems do not, is the focal point of their argument: after all, constituency is where Aunty failed.

Further clarifying connectionism (mostly through contrasting it with classical systems), Fodor and Pylyshyn see the distinction becoming most apparent and relevant with regard to issues of mental representation and mental processes. Classical systems have a combinatorial syntax and semantics for their representations whereas connectionist networks do not. Fodor and Pylyshyn supposedly establish this point with the classic arguments for language of thought that I presented earlier - compositionality and systematicity.

3.2. Some New Twists to an Old Argument

The model used by Fodor and Pylyshyn as an allegedly typical case of connectionism is reprinted as Figure 3. It is a simple logical inference machine, a paradigmatic case of language of thought. It is a network in which the complex predicate A & B can be broken down to either or both of its constituents, A and B. Understandably, in reality such a simple network would only be part of a larger machine, but Fodor and Pylyshyn use it as a connectionist network stripped to its essence, with no frills at all. If this "purified" connectionist system can support a language of thought, then there's no problem at all. However, if it cannot, then connectionism is just out of luck.

They now use the compositionality of mental representations and the systematicity of mental processes as the two primary tests of connectionism and its potential for a language of thought. These two characteristics are the primary indicators of combinatorial structure, the foundation of a language of thought. If and only if our test network can possess these, can it then support a language of thought.

3.2.1. The Attack of Structured Mental Representations

This objection deals with how connectionism handles compositionality. Just as a quick reminder, compositionality is the form of structure that is necessary in mental representations. It is the constituent structure that allows mental processes to be systematic.

Fodor and Pylyshyn's logical inference network does not possess any constituent structure, or any internal structure whatsoever. Each node is atomic. Even node 1, which represents the molecular statement A & B, is itself atomic. It is just a single site of activation and lacks structure of any kind, let alone compositional structure.

One might be tempted to say that since node 1 represents A & B, a statement that obviously has constituent structure, this representation can then be split into its individual parts. However, this move misinterprets the nature of this connectionist system. The fault lies in the fact that the characteristics of the labels (i.e. compositionality) are being attributed to the representations (which have no internal structure). The logical statements A, B, A & B are all node labels. These labels are conventions attached to them by the programmers to make the network's activity meaningful.

The representations are merely the nodes themselves. Each representation is a simple yes/no, on/off, 1/0 level of activation. Since this network is a localist one, each node is by definition a representation. With Fodor and Pylyshyn's formulation of it, as with most (but not all) localist nets, there are no other levels of representation. There is only the activation of individual nodes.

It is impossible for a system with compositionality to have representations that are all atomic. The two are mutually exclusive. In fact, without molecular representations, there can be no structure at all present in them. Fodor and Pylyshyn clearly point out that a base, primitive object that cannot be broken down obviously cannot have any structure at all. Language of thought gets around this by developing molecular representations from the base atomic ones. Localist connectionist networks, at least this particular one, are not able to make this move.

3.2.2. The Attack of the Structured Mental Processes

So now that it is apparent that the mental representations of our connectionist network necessarily lack compositionality, it is merely beating a dead horse to show that they lack systematicity as well. Quite simply, systematic mental processes rely upon compositionality. If compositionality is not present, then systematicity must be lacking also.

It is possible to perform operations that end up exhibiting systematic rules, as our logical inference device does. However, these are not intrinsic operations. Again, the particular labels are the source of systematicity and not the representations themselves.

The labels of the system play no causal role whatsoever. The fact that node 1, when activated, tends to activate node 2 and node 3, is the only causal relation present. The particular node labels are utterly irrelevant. We could relabel node 1 'Bill the Clown', node 2 'Bill', and node 3 'clown'. It would appear to be systematic still. But what about 'Bill the Clown', 'elephant', and 'lampshade' being the respective node labels? 'Elephant' and 'lampshade' are not constituents of 'Bill the Clown', yet the causal relations among the nodes remain unchanged. To be anthropomorphic, the computational system couldn't care less what labels we humans put onto it; it will compute exactly the same way. This is another obvious point that Fodor and Pylyshyn elaborate.
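The point about labels can be shown with a minimal sketch. The only structure assumed is the one stated in the paragraph above, namely that node 1 tends to activate nodes 2 and 3; the `EXCITES` table and the function names are illustrative inventions.

```python
# Causal structure only: node 1 excites nodes 2 and 3.
EXCITES = {1: [2, 3], 2: [], 3: []}


def run(active):
    """Return all nodes that end up active, following causal links only."""
    result = set(active)
    pending = list(active)
    while pending:
        node = pending.pop()
        for target in EXCITES[node]:
            if target not in result:
                result.add(target)
                pending.append(target)
    return result


# Two labelings of the same network; `run` never consults either of them.
LABELS_LOGIC = {1: "A & B", 2: "A", 3: "B"}
LABELS_ODD = {1: "Bill the Clown", 2: "elephant", 3: "lampshade"}


def describe(active, labels):
    """Translate the active nodes into whatever labels an observer prefers."""
    return sorted(labels[node] for node in run(active))
```

The `run` function computes identically under both labelings, since the labels are applied only afterwards by `describe`. Any appearance of inferring A and B from A & B comes from the labeling rather than from anything inside the nodes.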

So just what process do connectionist networks operate by? Fodor and Pylyshyn claim that they operate by association alone. They are trained to relate representations according to statistical relations that emerge from experience. A network will be so trained that if it gets a certain input it should produce a certain output. If it does not, then the network is altered so that it will do better next time. In other words, connectionist networks (supposedly) are trained only to associate a certain output with a certain input. However, association is not a structure sensitive operation. There is nothing causally relevant within the representations, only their association to other representations. This is not a viable option for systematic processes either.
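To make the "alter the network so it does better next time" idea concrete, here is a minimal sketch of one simple error-correction scheme, the perceptron rule. The choice of rule, the task (logical OR), and all names are assumptions made for illustration; the networks under discussion generally use other training methods.

```python
# Illustrative input/output pairs (logical OR). The task is an assumption,
# chosen only because it is small enough to follow by hand.
TRAINING_SET = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(training_set, epochs=10, rate=1):
    """Nudge the weights after every wrong answer (perceptron rule)."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in training_set:
            error = target - predict(weights, bias, inputs)
            # When the answer was right, error is 0 and nothing changes.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


weights, bias = train(TRAINING_SET)
for inputs, target in TRAINING_SET:
    print(inputs, "->", predict(weights, bias, inputs), "wanted", target)
```

Nothing in this procedure looks inside a representation. It only strengthens or weakens the link between a given input and a given output, which is exactly the feature Fodor and Pylyshyn object to.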

3.3 Fodor and Pylyshyn's Conclusions

It would seem that our logical inference network just cannot support a language of thought. It not only lacks constituent structure in its representations and systematic structure in its processes, it lacks any kind of structure whatsoever. Apparently connectionism is out of luck as a model of human cognition and consequently as a method for artificial intelligence research.

Fodor and Pylyshyn do not completely nail the lid shut on connectionism, though. They are kind enough to present the only alternatives they see open to die-hard connectionists:

1) Try to show that unstructured mental representations are the correct model for cognitive processes.
2) Rely on structured mental representations but continue to use an associational account of mental processes.
3) Use connectionism only as an implementation theory for a classical architecture.
4) Give up on networks as a model for cognitive processes in general and use them only for certain less cognitive mental processes (e.g., perception).

Option 1 says that even though structured mental representations are very useful and have been generally accepted for years, the notion is wrong - actual human mentality does not involve systematicity and compositionality. However, this seems hopeless at worst and unattractive at best. The second option is similar to the first except that it throws out only systematicity and keeps compositionality. Again, this counter-productiveness is not a good sign and makes this option seem equally bleak. The third option is Fodor and Pylyshyn's favorite, since it pretty much gives in to their argument. It states that connectionism fails as a cognitive theory, but allows that computer scientists can use it if they really want to, though only to implement a primarily classical system. The last one is to give up all the higher cognitive processes like language and reason and only use connectionist systems to model lower processes that are only generally "mental", such as perception and reflexes. This is also looked upon favorably by Fodor and Pylyshyn, but they feel that instances where connectionist networks would model the process better than language of thought are quite rare.

My argument from here on does not follow any of these options. Instead I use the work of many other philosophers and computer scientists to create a fifth one: a connectionist variant of the language of thought. However, I will also touch on the last two options that Fodor and Pylyshyn offer. It would seem that the third can be used to refute their entire argument (as I will argue in the next section), and the fourth option may be more prevalent than they realize (which I will discuss briefly in the last section).

4. Fodor and Pylyshyn are Wrong

4.1. Chalmers' "Simple Refutation"

David Chalmers offers "a particularly simple refutation of Fodor and Pylyshyn's argument" (1990b, p. 340). I think it will be nice to start off with it, both to take some of the steam out of them right from the start and to introduce the main problem with all of Fodor and Pylyshyn's arguments against connectionism. Even granting Fodor and Pylyshyn's premises and argument, argues Chalmers, their conclusion not only does not follow but is also simply false. Connectionism can be a viable cognitive model.

Fodor and Pylyshyn argue that a connectionist network cannot support a language of thought. Their logical inference example shows a case where this is obviously true. So their conclusion is that connectionism fails as a form of cognitive architecture. They explicitly offer the possibility, however, that connectionism is a viable option as an implementational architecture. More simply stated, Fodor and Pylyshyn believe that connectionism has no place in philosophy and should be confined solely to the computer science domain. This may very well be what they set out to prove, but they did not succeed.

Their arguments, as they stand, prove that no connectionist network can support a language of thought. It is not possible to develop any structure for mental representations and processes within a connectionist network. Fodor and Pylyshyn argue that due to its associational basis, it is impossible to form any structured mental processes and representations in a connectionist system. I emphasize no connectionist network because ones that are attempted implementations of classical cognitive architectures are still fundamentally connectionist networks. Therefore, it is impossible to implement a classical cognitive architecture (and therefore a language of thought) on a fundamentally connectionist network. However, Fodor and Pylyshyn then claim that they proved that no connectionist systems can instantiate a language of thought except for implementations of classical systems. So what their argument proves goes far beyond what they conclude from it, not to mention that it also goes beyond what is generally accepted.

The analogy that Chalmers draws is his mad scientist who proves that Earth is the only inhabited planet in the universe. First this scientist runs through an a priori proof that the proper biochemical reactions for life to ever evolve cannot occur. It is necessarily impossible for life to exist. However, life obviously exists on Earth. So the Earth is the only planet with life on it! If you are thinking ad hoc, you are right. This maneuver is the one that Fodor and Pylyshyn subtly use.

This is a big problem for Fodor and Pylyshyn. They could say that they were wrong in their conclusion and that implementations of classical architectures are not an option either. However, connectionist implementations of classical architectures that work just as well as standard implementations are entirely possible. As computational systems, connectionist networks and von Neumann/Turing machines are equivalent. Both can compute anything computable. Consequently, Fodor and Pylyshyn are out of luck there.

Their next option is to claim that their argument applies only to cognitive architectures and not implementational ones. In fact, this seems to be the most likely direction they would go. Still no luck though. Their arguments are too strong and end up attacking connectionism as an implementational architecture as well. They prove that their example cannot support structured representations or processes of any sort as either a cognitive or an implementational architecture. I do not contest this. The example they use cannot implement a classical cognitive architecture, and therefore, even indirectly, a language of thought.

4.2. Fodor and Pylyshyn's Error

So just where do they go wrong? It is in their example. The logical inference network that they test their objections against is a textbook example of a straw person argument. That Fodor and Pylyshyn show that some connectionist systems (localist ones in particular) cannot support a language of thought does not mean that none can. They simply looked at the wrong kind of system. I have not come across any connectionist who would believe that such a localist, association-based network could model human cognition.

The problem is not that the example is so simple and small. The fallacy lies in its purely localist nature. Plenty of useful and well-documented connectionist networks use atomic, localist representations. However, none of them are attempts at modelling cognition. By analogy, I could very well look quite in-depth into this word processing program that I am currently using, claim that it cannot be even remotely cognitive, and conclude that no von Neumann computer (or Turing machine, for that matter) can be cognitive.

Sounds quite ridiculous, doesn't it? It parallels Fodor and Pylyshyn's argument exactly. They take an extreme example of connectionism that no true connectionist would support as a model of cognition, and prove that it cannot support a language of thought. Since all other aspects of connectionism are "confusions and irrelevancies" (Fodor and Pylyshyn 1988, p. 6), all connectionist networks are incapable of supporting a language of thought.

Even after the substantial literature on the subject, Fodor still fails to see its importance. He still claims that "connectionists clearly assume that there are elementary mental representations (typically labeled nodes), and that these have both semantic and causal properties" (1994, p. 96). However, this is far from being assumed at all. In fact it is the very thing connectionists argue against.

When one asks what is the deepest philosophical commitment of the connectionist movement, the answer is surely this: the rejection of the atomic symbol as the bearer of meaning. Connectionists feel that atomic tokens simply do not carry enough information with them to be useful in modeling human cognition. Rather, distributed, subdivisible, malleable representations are the cornerstone of the connectionist endeavor. For this reason, localist networks are regarded by many connectionists as not really connectionist at all. These networks employ precisely the traditional notion of atomic symbols, with a new twist added by connecting them by associative links. (Chalmers, 1990b, p. 343)

This sums up quite nicely the true foundation of the connectionist movement. It is not association, as Fodor and Pylyshyn would have you believe, but distribution that is essential.

Fodor and Pylyshyn's model, though, is only half-way to the true models of cognitive architecture that are supported by connectionists. Their example is actually atomic symbols connected by associations. As Chalmers puts it, it is "symbolic AI with soft constraints" (Chalmers, 1990b, p. 343). It is a blend of the two positions, using the representation style of one and the processing style of the other. On top of this, the two characteristics that Fodor and Pylyshyn decide to blend are the weakest of each. In the classical view, atomic symbols are pretty powerless unless they are combined using systematic rules. In the connectionist view, the associations between representations are also quite powerless except for the fact that the representations are distributed ones with a great deal of internal structure. Clearly, Fodor and Pylyshyn's example is not the standard for connectionism.

What Fodor and Pylyshyn overlook is the primary characteristic of potentially cognitive networks - distributed representation. They brush it off nonchalantly. As we will see, though, it can be a powerful, not to mention relevant, element.

4.3. A New Hope: Distributed Representations

So just what is a distributed representation and why is it so important? Contrary to Fodor and Pylyshyn, distributed representations are the fundamental basis of the connectionist movement. The primary direction for AI research followed by connectionists is the attempt to find a new basis for meaning. As I will elaborate on in the last section, there is a growing dissatisfaction with atomic symbols. Distributed representations are the connectionist movement's alternative.

With distributed representation, the basic level of representation is not a single node, but the pattern of activation over a group of nodes. This is a far cry from the atomic tokens within classical AI, and far more difficult to understand. Atomic tokens are easily understood, most likely due to their parallel with variables in computer programs and simple algebra. Understanding distributed representations requires a massive theoretical shift.

For one thing, the nodes themselves are not exactly the representation. They are merely the objects used to instantiate the representations. Consequently, a given set of nodes within a network is able to instantiate a large variety of representations. To give a simplified example, say that we have a connectionist network within which a group of 5 nodes are responsible for representations. The rest of the network can simply be a camera for input, and a speech box that outputs "Hey, that's a ______!". Now particular patterns of activation will correspond to particular representations. For example, 01101 could represent cat, 01100 could represent dog, and 10011 rock. Whenever those 5 nodes match one of those activation patterns, they are said to represent that thing. The 5 nodes could represent any of them or none of them. It's the particular patterns that are the representations.

It is quite a change in representational structure to say the least. All that is really necessary to understand at the beginning is that representations and nodes are not assigned on a one-to-one basis as in the Fodor and Pylyshyn example. Instead, the activation pattern over a group of nodes is the representation.
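The 5-node example can be written out as a short sketch. The pattern table comes straight from the example above; the function names and the fallback message for an unrecognized pattern are my own assumptions.

```python
# Activation patterns over the same five nodes, as in the example above.
PATTERNS = {
    (0, 1, 1, 0, 1): "cat",
    (0, 1, 1, 0, 0): "dog",
    (1, 0, 0, 1, 1): "rock",
}


def interpret(activations):
    """Return what the five nodes currently represent, or None if nothing."""
    return PATTERNS.get(tuple(activations))


def announce(activations):
    """Mimic the speech box attached to the hypothetical network."""
    thing = interpret(activations)
    if thing is None:
        return "(no recognized pattern)"
    return f"Hey, that's a {thing}!"


print(announce([0, 1, 1, 0, 1]))
print(announce([0, 1, 1, 0, 0]))
print(announce([1, 1, 1, 1, 1]))
```

No single node appears in the table. The lookup key is always the whole pattern, so the same five nodes can carry any of the three representations or none of them.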

It is also important to note that these representations are complex without being molecular. According to classical AI, there are only atomic and molecular representations. Atomic representations are indivisible, simple pieces. One piece with no internal structure. Molecular representations are directly composed of atomic or other molecular representations. You can remove a part and it will merely be the same representation missing that piece. For example, a representation may go from "Ralph ate the meat that was spoiled." to "Ralph ate the meat."

Distributed representations possess characteristics of both atomic and molecular representations. They are complex, being composed of the activation pattern of 2 or more nodes: in the above examples, each representation had 5 distinct parts. However, qua representations, they are indivisible. A distributed representation depends for its identity on all of the activations. You cannot remove even one without altering the entire representation. In the distributed cat representation, each node does not correspond to a particular part or feature of the cat: the representation is one holistic entirety. So distributed representations are complex yet indivisible. When I look at connectionist compositionality in a little while, I will go more in depth into these two different forms of complexity and internal structure.

4.3.1. Just What Difference Would Distribution Make?

You may now be wondering just what benefits distributed representations would have other than scaring off your opposition with techno babble. There are many in fact. The main reason why connectionists first began to pursue them is that a group of nodes can carry vastly more information than any single node can. This is simply obvious and a very good reason to prefer them. The information may not be literally present in the representations, but far more can be encoded within a large group of nodes than within a single on/off.

Another very large benefit of distributed representations is their ability to generalize. Between the training methods and the very nature of connectionist networks, systematic rules naturally emerge, and the network will generalize beyond its training set. Localist connectionist systems cannot do this at all. It is also interesting to note that while classical models are able to do this, they must be specially modified to do so. With distributed representational networks, this generalizing is automatic.

Lastly, a major bonus for distributed representations is that they are able to perform operations and possess properties that localist models cannot. The type of operations and processes that Fodor and Pylyshyn are looking for in order to support a language of thought fall within this group. As they clearly showed, localist models cannot account for the big two: compositionality and systematicity. I will show later that connectionist networks that use distributed representations can have compositionality and systematicity.

4.3.2. The (Brief) Argument Against Distribution

Fodor and Pylyshyn do mention distributed representations. However, it is quite clear that they misunderstand the entire basis of them. In two footnotes, they state that the "[localist/distributed] distinction will usually be of little consequence to our discussion" (Fodor and Pylyshyn, 1988, p.5) and that "nothing relevant to this discussion is changed" (Fodor and Pylyshyn 1988, p.15) if their example were to be a distributed representational one rather than localist. Clearly, they merely take it on assumption that localist versus distributed is only a matter of implementation.

This assumption is very wrong. In even the earliest connectionist research, it was clear that having representations distributed over many nodes rather than being on a one-to-one relation allowed for far more flexibility and possible characteristics. Apparently, not all distributed representations can be reduced to localist ones. There is a relevant theoretical difference involved, not just an implementational one.

Fodor and Pylyshyn do actually argue against distributed representations, but very briefly. Their argument is solely on the basis of compositionality. They claim that since the individual parts of the representation (each activation value) are not separable and "semantically evaluable" on their own, the internal structure involved is irrelevant. Representations must be divisible in order to be considered compositional. For example, with the distributed cat representation, the individual parts (each 1 and 0) carry no meaning (are not semantically evaluable) so they are irrelevant to the issue. So, again, distributed representations supposedly add nothing new to the issue. Arguing against this will be the focus of my next section.

5. Connectionist Compositionality

5.1. Brands of Compositionality

Our two main questions now are whether or not distributed representations can possess compositionality and systematicity. To date, more work has been done on compositionality, so we shall look at that first. For starters, we are familiar with Fodor's form of compositionality where a complex representation is composed of constituents each of which is either atomic or another complex representation. This is the traditional version. It involves recursively adding the new parts on so that the constituent parts are literally present within the representation. For example, ((P&Q)&R) is created from adding 'P', 'Q', and 'R' together in that order. The final, complex representation still has each of the original constituents preserved within it.

However, compositionality itself is a bit broader than this. Constituency is just one method to bring about compositionality. The actual definition of compositionality that I will use is:

The ability to create a representation of a structured object in such a way as to preserve the original structure in a usable form.

I feel that Fodor and Pylyshyn would be unlikely to argue with this. It does not specify the method for forming these representations, but they would say that the only way to do so is through constituent structure, or "concatenative compositionality" as Tim van Gelder has called it (199?, 1990).

I believe that compositionality can take other forms. Constituent structure is the most widely accepted and perhaps even paradigmatic version, but I feel that it is not a necessary one. There is a more general, catch-all category that involves any method of representing a structured object in a way different from Fodor's concatenative compositionality. Tim van Gelder has called this alternative "functional compositionality". A representation of a structured object is created with all relevant information accessible, but not literally present.

The reason that Fodor and others do not believe compositionality can exist without the constituent parts being literally present is that they confuse where the structure must be. Functional compositionality is representation of structure rather than a structured representation. All of the information concerning the original structure is accessible.

All that is needed is functional structure. For a representation to be useful and compositional, it must merely contain the information concerning the original structure. Fodor did not realize this. He thought that the only way to do this was to have the representation directly mirror the original structure. There is no reason for this other than it has worked well so far.

Connectionism's problem is with concatenative compositionality. Using the concatenative method, to represent an object of arbitrary size, you would need an arbitrary amount of space to represent it in. This is not feasible in a connectionist network. The number of nodes is normally specified at the outset, and all representations must fit within a given number of nodes. Programming a connectionist network to change its size as necessary is an extremely challenging task at best, and such a network may even be impossible to train properly at worst.

Connectionists are consequently left with having to develop a form of functional compositionality. This form would have to be able to represent an object of arbitrary size with a fixed-size representation. The answer to this is, of course, distributed representations. If a network is trained in the proper manner, it is possible to consistently form representations of objects of varying size as well as consistently recover them with the original structure still intact.

5.2. The Possibilities of Functional Compositionality

There are three models in particular that have tried (rather successfully in my opinion) to create connectionist compositionality. They are: Pollack and his Recursive Auto-Associative Memory (RAAM), Hinton's Representational Hierarchy by Reduced Description, and the Tensor Product Vectors of Smolensky (van Gelder, 1990). All three of these differ in details, but they share a fundamental similarity by relying on functional compositionality.

With each of these models, there is an input which is to be represented. This input is then encoded into a distributed representation. It is also possible to output the original pattern of input given only the distributed representation form of it. They all have their own method for doing this, but they each perform the same basic function - transform the input into a distributed representation. The differences between these models are irrelevant to our discussion.

The way that these representations can be understood is by picturing them as vectors within a multi-dimensional space. Sounds easy, huh? It actually is not too bad. Going back to our 5 node example with the cat, dog, and rock representations, each of these representations would correspond to a vector within a 5 dimensional space (corresponding to the number of nodes). The activation values of the nodes are the vector's coordinates within this space. This move is helpful in understanding the role of relations between representations as well as the method for achieving compositionality.

If two representations are very similar, then they are near each other within this multi-dimensional space. For example, dogs and cats are somewhat similar (at least more so than dogs and rocks); consequently, their activation patterns are quite similar (differing in only one value).3 Vector mathematics is quite handy with connectionist models. The amount of similarity and difference can be determined quantitatively. Also, the method of combining two representations without literally attaching one to the other most often uses vector multiplication. The two vectors are combined to make a third one. This third representation bears no surface resemblance to either of them, but both are contained within it and are always fully recoverable.
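The claim that similarity can be determined quantitatively is easy to check on the cat, dog, and rock patterns. The sketch below uses two common distance measures; choosing these particular measures is my own assumption, since many others would serve.

```python
# The three example patterns, treated as points in a 5-dimensional space.
PATTERNS = {
    "cat": (0, 1, 1, 0, 1),
    "dog": (0, 1, 1, 0, 0),
    "rock": (1, 0, 0, 1, 1),
}


def hamming(a, b):
    """Count the nodes whose activation differs between two patterns."""
    return sum(1 for x, y in zip(a, b) if x != y)


def euclidean(a, b):
    """Straight-line distance between two patterns in activation space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


for first, second in (("cat", "dog"), ("cat", "rock"), ("dog", "rock")):
    a, b = PATTERNS[first], PATTERNS[second]
    print(first, second, hamming(a, b), round(euclidean(a, b), 3))
```

On these patterns cat and dog differ in one node, cat and rock in four, and dog and rock in all five, which matches the intuitive ordering of similarity.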

On a slight side note, this also helps to understand the natural generalizing ability of connectionist networks. Training a network on one representation spills over to all other representations to a degree that is directly proportional to their similarity. The more similar they are the more the training of one will affect the other. So training our network on cat would greatly affect its training on dog but hardly affect rock since it is quite unrelated.

To return to compositionality, this encoding of representations is not done randomly. This generalizing ability causes the network to settle towards a set of connections where the representations are ordered according to their similarity. The coordinates within the multi-dimensional space are assigned according to systematic rules that naturally develop.

For combining two representations, concatenative compositionality would simply stick them together. This type of functional compositionality blends them together instead. Using straightforward vector mathematics, the two vectors are combined into a third one. Within this new representation, neither of the originals is literally present. All of the information is still contained, however, and can be recovered just as easily. This is a peculiarly connectionist way of storing and retrieving information - through distributed representations. This fulfills all the requirements of the general definition of compositionality.
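One way to picture this blending is a stripped-down version of the tensor product idea: each pattern is multiplied by a "role" vector, and the results are added together. This is a sketch under my own assumptions (the role vectors, the function names, and the use of exactly two roles are all hypothetical), and a trained network such as a RAAM learns its encoding rather than applying a fixed formula like this one.

```python
CAT = (0, 1, 1, 0, 1)
DOG = (0, 1, 1, 0, 0)

# Hypothetical role vectors. They are orthogonal to each other, which is
# what makes exact recovery possible in this sketch.
ROLE_A = (1, 1)
ROLE_B = (1, -1)


def bind(filler, role):
    """Outer product of a filler pattern with a role vector."""
    return [[f * r for r in role] for f in filler]


def superimpose(first, second):
    """Add two bindings element by element into one blended pattern."""
    return [[a + b for a, b in zip(row1, row2)]
            for row1, row2 in zip(first, second)]


def unbind(combined, role):
    """Recover the filler bound to `role` by projecting onto that role."""
    norm = sum(r * r for r in role)
    return [sum(c * r for c, r in zip(row, role)) / norm for row in combined]


combined = superimpose(bind(CAT, ROLE_A), bind(DOG, ROLE_B))
print(combined)
print(unbind(combined, ROLE_A))
print(unbind(combined, ROLE_B))
```

Neither column of the blended pattern is the cat pattern or the dog pattern, yet each can be pulled back out by its role. Each one can also be pulled out on its own, without decoding the other.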

Furthermore, it is impossible to create a model of any of these networks, or even come close, with a localist connectionist network. There is no way of capturing all of the information these distributed representations contain. To go even a step further, this functional compositionality can be richer in meaning than its concatenative counterpart. Distributed representations are sensitive to more subtle and various similarities among representations than just constituency. Classical, concatenative compositionality handles constituent relations extremely well, but does little else. Distributed representations can handle constituent relations (not as naturally as classical ones, but still rather well) and can also contain the information relevant to a whole host of other similarities.

6. Connectionist Systematicity

6.1. Groundwork

Now that it seems apparent that connectionist networks can possess compositionality, what about systematicity? Fortunately, the problem is not too daunting. Systematicity relies quite heavily on compositionality, and with that, we are already well on our way. Given representations that contain all the relevant internal structure, it should, in principle, be possible to have the system perform systematic operations on those representations. The next step is simply to show that this can, indeed, be done. As van Gelder put it, "Having designed their flying machines, connectionists now need to show they can actually stay in the air" (van Gelder 1990, p.380).

This is easier said than done, though. Since most work has been focused on connectionist compositionality, and that has only been achieved recently, very little has been done in the way of systematicity. Most networks will use a distributed representation until it is necessary to perform any systematic processes on it, then convert it over to the more traditional version.

This leaves one to wonder, perhaps systematicity only works with concatenative compositionality. Even though all of the same information is contained with functional compositionality, the particular structure of the concatenative version is necessary. Since the information in the original structure is merely recoverable within a distributed representation rather than being literally present, there is some plausibility to this notion.

I, surprisingly enough, disagree. Functional compositionality is all that is necessary for systematicity, on one condition. The constituent parts of the distributed representation must be individually recoverable and analyzable. In other words, you do not have to decode the entire thing just to get a single piece of it. You can remove a piece and leave the rest of it in a distributed format. If the representation must be transformed back to its original state entirely to access any bit of it, then storing it as a distributed representation is unnecessary. It has no advantage over classical systems and is most likely inferior.

If a system can create a representation that contains all the information concerning the original structure and can perform operations on that representation using that information without completely converting it, then the system will possess systematicity. This also sidesteps Fodor and Pylyshyn's objection that connectionist networks rely solely on association for their processes. If a process uses the information contained within the representation, then it is not purely "Associationist".

6.2. First Steps: Syntactic Transformations

Whether or not there is a type of distributed representation that allows for the accessing of individual parts is a very challenging theoretical question. Empirically, however, it is much easier. The biggest problem is waiting for someone to do it. Well, Pollack did it.

Pollack was able to make a version of his RAAM architecture that can have simple syntactic transformations performed upon it (Chalmers, 1990a). In this case, the network is able to create distributed representations of sentences, and then transform them from the active to passive voice (or vice versa). This transformation is done entirely on the distributed representation; it is never restored back to its original state.

The network works best at a 3N-N-3N set-up as diagrammed in Figure 4. The 3N is the input layer where the sentence, in its original state, is first fed into the network. The middle layer is the hidden layer where the distributed representations are instantiated. The third layer is the output layer for the transformed sentence. Within this simple architecture is a very basic version of systematicity.
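To give a feel for the 3N-N-3N shape, here is a sketch of the layer sizes only. Everything specific in it is an assumption made for illustration: the value of N, the random untrained weights, the tanh activation, and the function names. It shows how 3N input values are squeezed through N hidden values and expanded back to 3N outputs; it does not reproduce the trained network or its results.

```python
import math
import random

N = 4  # hypothetical size; the real network's N is not assumed here
rng = random.Random(0)  # fixed seed so the sketch is repeatable


def random_layer(n_in, n_out):
    """Create an untrained weight matrix with n_out rows of n_in weights."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def forward(layer, inputs):
    """Compute each unit's activation from the weighted sum of its inputs."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)))
            for row in layer]


encoder = random_layer(3 * N, N)  # input layer (3N) -> hidden layer (N)
decoder = random_layer(N, 3 * N)  # hidden layer (N) -> output layer (3N)

sentence = [rng.choice((0, 1)) for _ in range(3 * N)]  # stand-in input
hidden = forward(encoder, sentence)  # the distributed representation
output = forward(decoder, hidden)

print(len(sentence), len(hidden), len(output))
```

The point of the bottleneck is that whatever the output layer produces must be computed from the N hidden values alone, so the information about the whole input has to be packed into that smaller pattern.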

In addition to this, the system is very reliable, achieving 100% accuracy in its transformations. This accuracy includes sentences (within its vocabulary) that it was never trained on. Even with what it generalizes to, it is perfectly accurate.

This clearly shows that it is possible to perform operations on a distributed representation using the information contained within it. These operations are not merely statistical tricks, where it learned certain associations and correlations. That could not account for the accuracy of the generalized instances. These transformations make use of the information of the original structure that is encoded within the representations, even though the representation itself does not have that structure. Our flying machines have stayed in the air. Connectionist networks can possess systematicity.

7. Where Do We End Up?

7.1. Differing Cognitive Models: The Hard and Soft Approaches

To recap things, I have just finished showing how connectionist networks that rely on distributed representations, as opposed to localist ones, can possess functional compositionality. This is a more generalized form of compositionality than the one Fodor prefers, concatenative compositionality. However, it has all of the relevant attributes and is entirely capable of encoding and decoding a representation consistently. From there I showed that this new form of compositionality can yield systematicity. Pollack even gave a clear empirical example of a connectionist network that performed syntactic transformations, a paradigmatic operation for language of thought.

Basically, it seems that Fodor and Pylyshyn were wrong and that connectionism is a possibility for cognitive modeling. Now we are left with their last argument, mostly presented by Fodor himself. He claims that the ultimate argument for the classical view of language of thought is that it is the only viable alternative around. A few others may try to pop up their heads briefly, but then don't last. Therefore, since the classical view is the only one around, it would seem to be the winner for now.

Having been shown to possess compositionality and systematicity, the two criteria set down by Fodor and Pylyshyn themselves, connectionism has become a viable alternative. It seems that the classical view is no longer the only option around. So which will it be then?

Smolensky (1987) addressed this issue, although from a broader perspective, in what he calls the Paradox of Cognition. He claims that there are two distinct approaches to cognitive modeling: (i) the hard approach - the mind seems to be characterized by rules, and (ii) the soft approach - rules alone do not seem to be enough. This is the paradox. On the one hand it seems that the mind follows clear rules; on the other hand, the rules only seem able to explain things up to a point. The matter of whether or not the mind is only a set of rules seems paradoxical. This is parallel to the classical/connectionist debate, since the classical endeavor follows the hard approach and the connectionist the soft.

In actuality the answer most likely lies somewhere in the middle: the majority of the mind is characterized by rules, but not all of it is. Some parts lie below the level of rules. By following both approaches at the same time, it should be much easier to find that spot. Also, the problems of one are compensated by the benefits of the other. They complement each other quite well.

Unfortunately, the vast majority of the research has been done following the hard/classical view, and very little has been done on the soft approach. What work has been done on the soft approach has mostly come in recent years. So there does seem to be an imbalance here, an imbalance that Fodor and Pylyshyn wish to preserve.

The soft side is quickly growing, though, for reasons I will address shortly. Connectionism is quite compatible with the soft approach, and it has finally proven itself to be at least a viable alternative. The issue now is: of these two alternatives, which, if any, is better?

7.2. The Actual Lure of Connectionism

Fodor and Pylyshyn offered several explanations for the lure of connectionism. Most, however, were not very substantial. They mostly dealt with facts of implementation and misconceptions about either side, and only touched on the two I will talk about in a moment. The massive recent growth of the connectionist movement, however, must be for good reason. As we have seen, connectionism is gaining ground, but still lagging behind. The two biggest reasons for its appeal, in my opinion, are the nature of other mental processes and the problems with classical AI.

7.2.1. Those Overlooked Cognitive Abilities

Fodor has consistently used analogies to the English language to better explain his views. However, with his theory of mind involving a language of thought, constantly using these examples can easily lead to trying to explain everything possible linguistically. Fodor has come to see language as the basis for all intelligence.

I feel that language is an enormous part of thinking and helps humans to organize our thoughts better, but I do not see why something that has all the same brain power as us, but no language, could not be intelligent or possess any kind of conscious thought. Supporters of this view have taken linguistic ability (which is allegedly unique to humans) and, because of its human uniqueness, claimed that it is the most important aspect of intelligence and even consciousness.

Although language is very important, I have not seen any conclusive evidence to support the notion that it is the basis for all intelligence. There are many, many mental processes that are not even remotely linguistic in nature. A partial list by Chalmers of mental processes in which compositionality is unimportant includes: "perception, categorization, motor control, memory, similarity judgements, associations, and attention" (Chalmers 1990b, p. 347).

For a machine to be fully intelligent, it must incorporate all or nearly all of the characteristics of the mental, whether they are linguistic or not. Classical AI handles the language-like aspects better, and connectionism does surprisingly well at most of the other aspects. So a full AI system would most likely incorporate parts of both.

7.2.2. That Pesky Chinese Room

The Chinese Room argument was originally presented by John Searle in 1980 and has been widely debated since. The argument itself has many problems of its own, but the basic idea behind it nicely captures the general concern over classical AI. The Chinese Room argument primarily attacks the claim that atomic symbols are able to possess meaning without internal structure. This is one of the biggest objections to classical AI. I will not go into any further detail, but suffice it to say, classical AI is not without its own problems. Interestingly enough, connectionism seems to handle those problems quite well.

Also, classical AI has pretty much dominated the research for decades. Most of its failings are quite apparent now. With most of these problems lingering and little or no progress made to resolve them, many researchers are seeking alternative theories. This general dissatisfaction with classical AI and the growing desire for an alternative have fanned the flames of connectionism.

7.3. Concluding Remarks

Where do we end up? Fodor and Pylyshyn set out to establish that connectionism cannot support a language of thought because of its lack of compositionality and systematicity. As we have seen, however, distributed representations can possess both, even if the compositionality involved is of a more general and functional variety than Fodor's concatenative version.

We seem to be left with two options, classical and connectionist AI. I think that the final answer will be a blend of both (even if it is skewed more towards connectionism). At the very least, connectionism will offer a brand new perspective on the issues. However, connectionism seems to be faring better than that, and may someday be an equal alternative to classical AI, or perhaps even take over the lead and become the "establishment" that young students argue against.

It would seem that the classical dominance is at an end. Connectionism is carving out its own place, even if it is with the "lesser" mental functions. Contrary to Fodor and Pylyshyn, it is a force that demands attention, not to mention that the amount of controversy stirred up is revitalizing the field. Many people are dissatisfied with the recent lack of progress in classical AI. Far from being dead, connectionism is alive and thriving extremely well.


Bibliography

Chalmers, D. (1990a). Syntactic transformations on distributed representations. Connection Science, 2: 53-62.

Chalmers, D. (1990b). Why Fodor and Pylyshyn were wrong: The simplest refutation. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society, pp. 340-347.

Elman, J. (1990). Structured representations and connectionist models. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society, pp. 17-23.

Fodor, J. (1975). The Language of Thought. Scranton, PA: Crowell.

Fodor, J. (1987). Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.

Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.

Fodor, J. (1994). Concepts: A potboiler. Cognition, 50: 95-113.

Fodor, J. & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28: 3-71.

Pollack, J. (1988). Recursive auto-associative memory: Devising compositional distributed representations. In Proceedings of the Tenth Annual Conference of the Cognitive Science Society, pp. 33-39.

Pollack, J. (1990). Recursive distributed representations. Artificial Intelligence, 46: 77-105.

Rumelhart, D., McClelland, J., and the PDP Research Group. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition Vols. 1-2. Cambridge, MA: MIT Press.

Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3: 417-458.

Smolensky, P. (1987). The constituent structure of connectionist mental states: A reply to Fodor and Pylyshyn. Southern Journal of Philosophy, 26: 137-163.

Sterelny, K. (1991). The Representational Theory of Mind: An Introduction. Cambridge, MA: Blackwell.

Van Gelder, T. (199?). Compositionality and the explanation of cognitive process.

Van Gelder, T. (1990). Compositionality: A connectionist variation on a theme. Cognitive Science, 14: 355-384.


Footnotes

  1. These mental states may or may not be composed of further, smaller mental states, such as ones that represent legs in general, my right leg in particular, the act of hopping, and so on. However, for the time being, this point is pretty much irrelevant. The arguments work just as well with these simplified, concrete thoughts.
  2. For the remainder of this section, I will refer to this as AIR - Aunty's Intentional Realism. Not all non-LOT intentional realists support Aunty's view. However, it does offer a good contrast to the LOT model, so I will use it simply because it helps to clarify Fodor's position.
  3. I must admit that these values are arbitrarily assigned by me. I do this more for simplicity than anything else. Using actual examples would complicate the issue and would require much more understanding of the computer science side of this issue. Suffice it to say, this is in principle what happens with these networks when they are trained.