First posted 9.9.99
Updated 25.4.01
by Craig Webster
In a fascinating and farsighted report written in 1948 Alan Turing suggested that the infant human cortex was what he called an unorganized machine (1). In the following, I discuss Turing's ground-breaking ideas on unorganized machines and intelligence, with working examples of such machines, which I derived in 1994.
Turing defined the class of unorganized machines as largely random in their initial construction, but capable of being trained to perform particular tasks. There is good reason to consider the cortex unorganized in this sense. The DNA which controls the construction of the central nervous system has insufficient storage capacity to specify exactly the position and connectivity of every neurone. And because brain function is not hard-wired before birth, we are able to learn language and other socially important behaviours, which carry great evolutionary advantage.
Turing gives two examples of artificial unorganized machines, which he claims are about the simplest possible models of the nervous system. The first are A-type machines: randomly connected networks of NAND gates, in which every node has two states (representing 0 or 1), two inputs and any number of outputs. The second Turing calls B-type machines. These are derived from any A-type network by intersecting every inter-node connection with a construction of three further A-type nodes, which together form a connection modifier, as shown in Figure 1. Nodes are shown as circles, and in Figure 1 the connection modifier intersects a length of circuit cd. Arrowheads on connecting circuits show the direction in which binary pulses flow. The values in nodes x and y in Figure 1 affect the behaviour of the circuit they are connected to, acting like a kind of memory unit. B-type networks are therefore a special and more interesting case of A-type networks: the connection modifiers greatly facilitate training by an external agent, because they allow functional modifications to be made at any point in the network. This is an advantage that would be very unlikely to arise spontaneously in a large randomly connected A-type network.
Figure 2 below is a non-randomly connected example of a functional B-type network, which counts to 10 and then starts again. The red circles marked m show three of the connection modifiers, one of which has been placed on every connection between nodes in the network. For clarity, connection modifiers (Figure 1) are reduced to three small circles in Figure 2. All connection modifiers have been set to the correct values for the network to operate as a decimal counter. The blue circle marked s indicates the start node, which holds a value of 1. When this B-type machine operates, the value at s traverses the network in an anti-clockwise direction.
Figure 2 
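The A-type machines described above are simple enough to simulate in a few lines. The following is a minimal sketch in Python, not Turing's own formulation: the names make_a_type and step are hypothetical, and it assumes that all nodes update together on a common clock tick and that a node may draw both of its inputs from the same source, including itself.

```python
import random

def make_a_type(n_nodes, seed=0):
    """Build a random A-type network: each node draws its two inputs at random."""
    rng = random.Random(seed)
    # wiring[i] = (a, b): node i reads the outputs of nodes a and b.
    wiring = [(rng.randrange(n_nodes), rng.randrange(n_nodes))
              for _ in range(n_nodes)]
    # Every node starts in a random state, 0 or 1.
    state = [rng.randint(0, 1) for _ in range(n_nodes)]
    return wiring, state

def step(wiring, state):
    """Advance one clock tick: every node becomes the NAND of its two inputs."""
    return [1 - (state[a] & state[b]) for a, b in wiring]

if __name__ == "__main__":
    wiring, state = make_a_type(8, seed=1)
    for t in range(5):
        print(t, "".join(map(str, state)))
        state = step(wiring, state)
```

A node wired to itself on both inputs simply inverts its own value on every tick, which gives the smallest possible oscillating network. The sketch does not model B-type connection modifiers, whose exact wiring is given only in Figure 1.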
A B-type cortex would begin with a very large number of nodes and follow a developmental path with the same delicate mix of the random and determined as a living brain. At a magnification where individual nodes and connections could be seen, the resulting very large B-type network would typically look much like a bowl of spaghetti. Such a disorderly structure is prone to forming feedback loops of varying lengths which take varying times to traverse, thus forming possible delay or memory circuits. In a large network these loops can lead to greatly varying patterns of activity, regardless of input, since activity can be perpetually recycled in a complex manner. The activity in many conventional neural networks stops when the output layer settles into a stable pattern; the equivalent of a Turing Machine halting, its computation over. But just as the brain does not halt, large B-type networks will tend not to either.
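The effect of a feedback loop can be seen even in a tiny network. The sketch below is an illustration under stated assumptions rather than anything taken from Turing's report: the function name find_cycle is hypothetical, nodes are plain NAND gates updated synchronously, and there is no external input. It follows a network until a state repeats, then reports how many steps passed before the repeating pattern began and how long that pattern is.

```python
def find_cycle(wiring, state):
    """Return (transient, period) for a synchronous NAND network with no input.

    wiring[i] = (a, b) means node i computes NAND of nodes a and b.
    A finite deterministic network must revisit a state sooner or later;
    from then on its activity repeats with a fixed period.
    """
    seen = {}            # state -> time step at which it first appeared
    current = tuple(state)
    t = 0
    while current not in seen:
        seen[current] = t
        current = tuple(1 - (current[a] & current[b]) for a, b in wiring)
        t += 1
    first_visit = seen[current]
    return first_visit, t - first_visit

if __name__ == "__main__":
    # Three nodes in a ring; each reads its predecessor on both inputs,
    # so each acts as an inverter and the ring is a closed feedback loop.
    ring = [(2, 2), (0, 0), (1, 1)]
    print(find_cycle(ring, [1, 0, 0]))
```

In this ring the activity never settles. A single pulse keeps circulating, and the network only returns to its starting state after six ticks. Longer loops give longer periods, which is one way of seeing how loops of different lengths can act as delay or memory circuits.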
Self-stimulating feedback loops, variable length circuits and continuously varying patterns of activity are all known features of the central nervous system and have been implicated in many cognitive processes and in intelligence itself (4). A recent view of the brain called Dynamicism stresses the central importance of self-stimulating feedback loops in almost every aspect of brain function (5). According to this view, information is not encoded in individual cells, but rather in waves of excitation which sweep the brain like ripples on a pond. New stimulus causes new ripples, but also interferes with the old ripples which are memories, making the overall activity pattern distributed and complex. Your brain is continuously active, even when you are asleep, as the majority of inputs to a neurone come from feedback loops, not from the world. B-type networks with their propensity to form loops of various lengths may be well suited to model the kind of massive, widespread feedback and interacting waves of activity that modern theories like Dynamicism imply.
Genetic algorithms (GAs) are an efficient kind of search which allows a desirable set of values to be found in the very large space of all possible values for a particular problem. A GA mimics the process of natural selection by setting up a population of artificial "organisms" and allowing them to reproduce according to selection pressures defined by the user. For example, if we intended to produce a B-type network capable of binary addition by this method, we would create a population of randomly connected B-type networks and test each in turn on each of the four possible cases of the goal task (0+0, 0+1, 1+0 and 1+1). Each B-type network would get a score in the range 0 to 4, depending on the number of additions it got right. Initially some individual networks in the population would score better than others, even if apparently only by chance. These scores would become the fitness measures of the individual networks and dictate their number of offspring. The fittest networks would be over-represented in the next generation, while poorer-scoring networks would be under-represented or drop out of the population altogether. Artificial sexual reproduction is then typically employed: artificial genes are swapped between parents which have been ranked in order of fitness and paired off, producing new networks which are composites of their parents. If this test-and-reproduce cycle is repeated for many generations, individual networks will become better at the goal task until eventually a network is created which gains a perfect score.
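The test-and-reproduce cycle described above can be sketched in Python. This is a simplified illustration and not the program used to evolve the networks discussed here. It makes several assumptions: the networks are small feed-forward chains of NAND gates rather than recurrent B-type networks, the last two gates are read as the sum and carry bits, parents are drawn at random from the fitter half of the population instead of being paired strictly by rank, and a small mutation rate (not mentioned in the text) is added to keep variety in the population. All names and parameter values are hypothetical.

```python
import random

N_GATES = 5
CASES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # the four additions to get right

def run(genome, x, y):
    """Evaluate a feed-forward NAND network; gate i may read any earlier signal."""
    signals = [x, y]
    for a, b in genome:
        signals.append(1 - (signals[a] & signals[b]))
    return signals[-2], signals[-1]          # (sum bit, carry bit)

def fitness(genome):
    """Score 0 to 4: the number of cases where sum and carry are both correct."""
    return sum(run(genome, x, y) == ((x + y) % 2, (x + y) // 2)
               for x, y in CASES)

def random_gene(i, rng):
    # Gate i may connect to the two inputs or to any of the i earlier gates.
    return (rng.randrange(2 + i), rng.randrange(2 + i))

def crossover(p, q, rng):
    """One-point crossover: the child takes the head of p and the tail of q."""
    cut = rng.randrange(1, N_GATES)
    return p[:cut] + q[cut:]

def mutate(genome, rng, rate=0.1):
    return [random_gene(i, rng) if rng.random() < rate else gene
            for i, gene in enumerate(genome)]

def evolve(pop_size=60, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[random_gene(i, rng) for i in range(N_GATES)]
           for _ in range(pop_size)]
    history = []                              # best score in each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        if history[-1] == len(CASES):
            break                             # a perfect adder has appeared
        parents = pop[:pop_size // 2]         # poorer scorers drop out
        children = [pop[0]]                   # keep the best network unchanged
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            children.append(mutate(crossover(p, q, rng), rng))
        pop = children
    return max(pop, key=fitness), history

if __name__ == "__main__":
    best, history = evolve()
    print("best score:", fitness(best), "after", len(history), "generations")
```

Because the best network of each generation is carried over unchanged, the best score can never fall from one generation to the next. Whether a perfect score is reached within the generation limit depends on the random seed and the parameter values chosen.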
This process is in effect a replacement for training, since it progressively discards networks which cannot perform the task until it comes across a network configuration which can. The resulting network is the end point of a long search. This is appropriate for constructing networks with a fixed function for simple tasks such as binary addition. However, GAs can also be used to find network configurations for more complex tasks, such as creating a network which is capable of human-like adaptation or learning. Although this is considerably more difficult, a GA could be used to construct a network with an appropriate structure to allow it to carry out the more subtle self-modifications involved in learning by itself or heeding instruction. Such a B-type network would be Turing's unorganized-machine cortex, ready to assimilate information from its environment just as a human brain is primed to learn language after birth.