Neural Computation and Psychology 8 (NCPW8) - Abstracts

 

Session 1: Memory and Learning (Thursday 28th August, 9:15-12:00)

Short/long term memory in terms of activation versus weight based processes.

Marius Usher (presenting author) and Eddy Davelaar

Dept. Psychology, Birkbeck College, Malet Street, London, WC1E 7HX (e.davelaar@psychology.bbk.ac.uk, m.usher@bbk.ac.uk)

 

The distinction between short-term and long-term memory processes is currently debated because a series of effects previously associated with STM, such as recency (higher recall for the last items in a list), have also been found in long-term memory tasks. I will describe a Hebbian account of STM and LTM in which the two types of memory correspond to activation-based versus synaptic (weight-based) processes. A computational model will be presented in which the activation-based processes drive the formation of synaptic changes (episodic traces) between item representations and a changing context. The theory makes predictions about the shape of the serial position function in immediate free recall and about dissociations between short-term and long-term recency, which are supported by a series of experiments. The results may also help to explain specific patterns of neuropsychological memory deficits (amnesia, frontal-lobe lesions).

On the nature of active memory

Eddy J. Davelaar (presenting author) and Marius Usher

Dept. Psychology, Birkbeck College, Malet Street, London, WC1E 7HX (e.davelaar@psychology.bbk.ac.uk, m.usher@bbk.ac.uk)

 

In recent work, we have focused on the role of active memory in free recall tasks (Davelaar & Usher, 2002; Usher & Cohen, 1999). We have shown that a model based on units with self-recurrent excitation and global inhibition explains how a short-term buffer could be implemented in the brain. This limited-capacity buffer is referred to as an active memory component, as it constitutes the activated part of long-term semantic memory, and it provides a contribution to episodic memory in addition to the one from contextual learning, which is implemented as a synaptic-based system. In this talk, the relation between the buffer parameters and three aspects will be discussed. First, it will be shown how the model parameters relate to neural processes that are deficient in patients with brain damage. To this end, deviant serial position functions documented for patients with frontal, medial-temporal or striatal brain damage will be simulated. Second, it will be shown how the model's behavior is affected by the neuromodulation of its inputs, accounting for a variety of working memory processes, such as updating versus maintenance of information and ignoring irrelevant information. Third, the activation-based memory buffer will be compared with the mathematical buffers used in some computational models of free recall (e.g. SAM; Raaijmakers & Shiffrin, 1980). These three aspects (relation to neuropsychology, working memory, mathematical buffers) shape the domain in which our implementation of active memory can be applied and can thereby provide a means to bridge the separate literatures on short-term and working memory.
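The buffer mechanism described above (self-recurrent excitation plus global inhibition) can be illustrated with a minimal sketch. The update rule, the squashing function and all parameter values below are assumptions chosen for illustration, not the authors' published implementation. The sketch only shows that a unit with strong enough self-excitation stays active after its input is removed, whereas a unit with weak self-excitation decays.

```python
import numpy as np

def squash(x):
    """Assumed output function F(x) = x / (1 + x) for x > 0, else 0."""
    x = np.maximum(x, 0.0)
    return x / (1.0 + x)

def run_buffer(inputs, alpha=2.0, beta=0.2, lam=0.98):
    """Iterate x_i <- lam*x_i + (1-lam)*(alpha*F(x_i) - beta*sum_{j!=i} F(x_j) + I_i).

    alpha: self-recurrent excitation, beta: global inhibition,
    lam: leak/integration constant (all hypothetical values).
    inputs has shape (T, n_units); returns activations of the same shape.
    """
    inputs = np.asarray(inputs, dtype=float)
    x = np.zeros(inputs.shape[1])
    trajectory = np.empty_like(inputs)
    for t, external in enumerate(inputs):
        f = squash(x)
        inhibition = beta * (f.sum() - f)   # inhibition from all other units
        x = lam * x + (1.0 - lam) * (alpha * f - inhibition + external)
        trajectory[t] = x
    return trajectory

def present_then_delay(alpha, n_units=4, on_steps=500, off_steps=1000, strength=0.33):
    """Drive unit 0 for on_steps, then remove the input for off_steps."""
    inputs = np.zeros((on_steps + off_steps, n_units))
    inputs[:on_steps, 0] = strength
    return run_buffer(inputs, alpha=alpha)

strong = present_then_delay(alpha=2.0)   # self-excitation sustains the item
weak = present_then_delay(alpha=0.5)     # activation leaks away after input offset
print("final activation with strong self-excitation:", strong[-1, 0])
print("final activation with weak self-excitation:", weak[-1, 0])
```

With several items presented in succession, the inhibition term makes active units compete, which is the source of the limited capacity; the capacity obtained depends on the assumed parameters.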

Effect of the learning material structure on retroactive and proactive interference in humans: When the self-refreshing neural network mechanism provides new insights

Serban C. Musca (presenting author), Stephane Rousset and Bernard Ans

Laboratoire de Psychologie et NeuroCognition, Universite Pierre Mendes France, BP 47, 38040 Grenoble Cedex 9, France (Serban.Musca@upmf-grenoble.fr)

 

Although retroactive interference (RI) is a well-known phenomenon in humans, the differential effect of the structure of the learning material on RI has seldom been addressed. Mirman and Spivey (2001) investigated this effect and reported behavioural results that show more RI for subjects exposed to structured items than for those exposed to unstructured items. These authors claim that two complementary memory systems working in radically different ways (localized vs distributed) are required to account for the behavioural results they reported. However, in the behavioural experiment of Mirman and Spivey (2001) the level of proactive interference (PI) was not controlled. Using the same paradigm, we investigated the influence of the nature of the to-be-learned material on RI. Subjects learned meaningless A-B associations, took a forced-choice recognition test, then learned meaningless C-D associations and took a forced-choice recognition test on all the learned associations. Subjects learned sequentially either two lists of 'structured' or two lists of 'unstructured' associations. Unstructured associations are arbitrary; structured associations are built using two simple rules per list. If the PI level is left uncontrolled, no influence of the nature of the to-be-learned material on RI is found. Controlling for PI level, more RI was found for subjects exposed to unstructured items than for those exposed to structured items, a result opposite to that of Mirman and Spivey (2001). Two control experiments confirmed that the subjects in the 'Structured' condition could generalize to novel associations, and that they learned the items at the exemplar level, as the subjects in the 'Unstructured' condition did. A first simulation using a classical three-layer backpropagation network produced a pattern of RI results that qualitatively mirrored the structure effect we found in humans. However, the amount of RI was high. Moreover, the network did not exhibit PI but a proactive advantage: training on List2 associations gave better results after training on List1. In a second simulation the memory self-refreshing neural network model of Ans and Rousset (1997, 2000) was used. As expected, the amount of RI was smaller and the structure effect on RI was still present. Moreover, PI was observed, as was a structure effect on PI. Furthermore, as in the behavioural data, the structure effect on RI and the structure effect on PI were negatively correlated.

Limited capacity dimensional attention and the configural-cue model of stimulus representation

Paul D. Bartos

The Psychology Department, Faculty of Social Sciences, The Open University, Milton Keynes, MK7 6AA. (p.d.bartos@open.ac.uk)

 

Gluck and Bower's (1988) configural cue model is an associative network that represents stimuli using independent nodes for each feature and feature combination within the stimulus. This form of powerset representation was first used to simulate performance on non-linear discrimination tasks by Wagner and Rescorla (1972). While conceptually influential in associative learning and categorisation research, the configural cue model of stimulus representation is highly limited in its ability to simulate a variety of experimental findings. One of its main limitations is the lack of any clear method for incorporating secondary learning processes such as selective attention. A new approach to the configural cue model is proposed in which node activation depends on the average characteristics of a sequential component or dimensional sampling process. This process may be described as a Markov process. The associability of, and contribution from, singlet nodes are controlled by component sampling probabilities; for configural nodes, the same characteristics are governed by the probability of sequentially sampling their member components.

Learning algorithms based on error reduction may be applied to alter the matrix of transition probabilities governing the behaviour of the sampling process on each trial. This allows the model to qualitatively simulate learning effects that seem to be based on limited-capacity dimensional attention. Because the distribution of capacity across dimensions by the model is based on learnt relationships between instantiated dimensions, the approach also allows the model to be used to simulate attention learning and associative learning with stimuli that vary in their dimensionality. This represents a potential advance over many models used in category learning research, where dominant models are either applicable only to stimuli that do not vary in their dimensionality (such as ALCOVE (Kruschke, 1992)), or make use of component-cue stimulus representations that are incapable of learning non-linear discriminations (such as EXIT (Kruschke, 2001)).
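The powerset representation mentioned in this abstract can be made concrete with a short sketch: one node for every single feature and every combination of features in a stimulus. The function name and the example features are hypothetical, and the sketch covers only the representation, not the proposed sampling process or its learning rule.

```python
from itertools import combinations

def configural_cues(features):
    """Return one node label (a tuple) per non-empty subset of the features."""
    features = sorted(features)
    return [combo
            for size in range(1, len(features) + 1)
            for combo in combinations(features, size)]

# A three-feature stimulus yields 2**3 - 1 = 7 nodes:
# three singlet nodes, three pairwise nodes and one three-way node.
cues = configural_cues(["red", "square", "large"])
for cue in cues:
    print(cue)
```

The number of nodes grows as 2**n - 1 with the number of features n, which is one reason a sampling process over components is an attractive way to limit what is active on a given trial.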

Session 2: Vision (Thursday 28th August, 12:00-13:00)

Spatiotemporal Linear Simple-Cell Models Based on Temporal Coherence and Independent Component Analysis

Jarmo Hurri (presenting author), Jaakko Vayrynen and Aapo Hyvarinen

P.O.Box 9800, FIN-02015 Helsinki University of Technology, Finland (jarmo.hurri@hut.fi)

 

The search for computational principles that underlie the functionality of different cortical areas is a fundamental task in science. In the case of sensory areas, one approach to this issue is to examine how the statistical properties of natural stimuli (which in the case of vision include natural images and image sequences) are related to the operations that the neurons seem to perform. For simple cells, the most prominent computational theories linking neural properties and stimulus statistics are temporal coherence and independent component analysis (ICA) / sparse coding (in the case of visual data, ICA and sparse coding are closely related). For these theories, the case of spatial linear cell models has been studied in a number of recent publications, but the case of spatiotemporal models has received fairly little attention.

Here we first provide a short introduction to these theories, and to the results obtained with spatial models. We then examine the spatiotemporal case by applying the theories to natural image sequence data, and by analyzing the obtained results quantitatively. We compare the properties of the spatiotemporal linear cell models obtained with the methods against each other, and against parameters measured from real visual systems.

Predicting collision: a connectionist model

Joni Karanka (presenting author) and David Luque

Carril del Capitan n 3 blq.1 5A. Postal Code 29010, Malaga, Spain. (davidluque2001@yahoo.es, jonikaranka@mixmail.com)

 

Information about time-to-collision is critical for animals: making interceptions, navigating visually and avoiding collisions are crucial for adaptation in a competitive environment. There have been many proposals for how time-to-collision is computed. The tau function offers a simple way to compute time-to-collision independently of object size or speed, by dividing the visual angle of the approaching object by its rate of change (Lee, 1976). The evidence for this function is not clear in all conditions and tasks. The rho function is the absolute rate of expansion of the object, and so it is affected by both object size and speed. The eta function was proposed more recently and was developed from neurophysiological data in locusts (Hatsopoulos, Gabbiani & Laurent, 1995). Eta peaks before the collision and is sensitive to size and speed, making it a good predictor of collision. The results of different tasks are not conclusive for any of the functions, but there is some evidence that several sources of information are involved in the computation of time-to-collision (Sun & Frost, 1998). We propose a framework to formalize the integration of these sources of information with the advantages of a parallel distributed processing model. For this task we created an Elman network with an input layer that represents a one-dimensional retina. The retinal size of approaching objects was presented in the input layer, with the size and speed of the objects varied orthogonally. The network learned to predict time-to-collision correctly for all sizes and speeds. When presented with objects of new sizes and speeds, the network generalized its prediction response correctly. Among the preceding functions, rho is the one that explains the most variance, and its behaviour is quite similar to that of the network.
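The three optical variables discussed above can be compared in a small numerical sketch for an object approaching the eye at constant speed. Tau and rho follow the definitions given in the abstract. Eta is written here as the rate of expansion multiplied by a decaying exponential of the visual angle, which is the form usually attributed to Hatsopoulos et al. (1995); the neural delay term is left out, and the values of `alpha`, `gain`, object size and speed are placeholders rather than values from the study.

```python
import numpy as np

def optical_variables(half_size, speed, time_to_collision, alpha=3.0, gain=1.0):
    """Return (tau, rho, eta) for an object approaching at constant speed.

    half_size: half the object's width (same length unit as speed * time).
    time_to_collision: array of remaining times before contact (> 0).
    """
    distance = speed * time_to_collision
    theta = 2.0 * np.arctan(half_size / distance)                      # visual angle
    theta_dot = 2.0 * half_size * speed / (distance**2 + half_size**2)  # rate of expansion
    tau = theta / theta_dot                 # angle divided by its rate of change
    rho = theta_dot                         # absolute rate of expansion
    eta = gain * theta_dot * np.exp(-alpha * theta)
    return tau, rho, eta

ttc = np.linspace(5.0, 0.01, 5000)          # from 5 s before contact to 10 ms before
tau, rho, eta = optical_variables(half_size=0.5, speed=2.0, time_to_collision=ttc)
_, rho_big, _ = optical_variables(half_size=1.0, speed=2.0, time_to_collision=ttc)

peak_ttc = ttc[np.argmax(eta)]              # eta peaks before contact
print("tau at 5 s before contact:", tau[0])
print("rho for small vs large object:", rho[0], rho_big[0])
print("eta peaks at time-to-collision:", peak_ttc)
```

While the object is far away, tau stays close to the true time-to-collision whatever the object's size, whereas rho changes with size and speed. Under these assumptions eta reaches its maximum when the distance equals `alpha * half_size`, so its peak time also depends on size and speed.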

 

* Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision. Perception, 5, 437-459.

* Hatsopoulos, N., Gabbiani, F. & Laurent, G. (1995). Elementary computation of object approach by a wide-field visual neuron. Science, 270, 1000-1003.

* Sun, H. & Frost, B. J. (1998). Computation of different optical variables of looming objects in pigeon nucleus rotundus neurons. Nature Neuroscience, 1(4), 296-303.

Session 3: Face Recognition (Thursday 28th August, 14:00-16:15)

Modeling Face Perception

Garrison W. Cottrell

Computer Science and Engineering Department, Institute for Neural Computation, University of California, San Diego, USA                    

 

We have modeled various aspects of face perception using connectionist networks. We have shown how developmentally reasonable constraints can lead to a "face expert" network, how expertise with one domain can lead to faster learning of expertise in another domain (e.g., why the Fusiform Gyrus might get recruited for Greeble processing if it is already a face expert), and how disparate theories of facial expression recognition can be resolved in a single model. In the latter domain, we have shown how a single model can accommodate categorical perception theories as well as "dimensional" theories of facial expression perception, and how a single model can explain the apparent independence between identity and expression processing without positing separate representations for these. We review these results in this presentation.

Face recognition: average or exemplar?

Peter Hancock (presenting author) a, Mike Burton b and Rob Jenkins b

a. Department of Psychology, University of Stirling, Scotland FK9 4LA (pjbh1@stir.ac.uk)

b. Department of Psychology, University of Glasgow, 58 Hillhead Street, Glasgow, Scotland G12 8QB (m.burton@psy.gla.ac.uk, r.jenkins@psy.gla.ac.uk)

 

For all the work that has been done on human face recognition, we still don't know whether we store many exemplars of a given face or some kind of average. Burton and Jenkins (2003) have demonstrated that a PCA-based face recognition system works much better if faces are averaged prior to being coded by the system. We hypothesise that this is because averaging removes extraneous variations such as lighting. I shall present some simulation work that demonstrates why this should be and discuss implications for models of both human and artificial face recognition.

Exploring the case for a psychological "face-space".

C.J. Solomon (presenting author), S.J. Gibson, A. Pallares-Bejarano and M.Maylin

School of Physical Sciences, University of Kent, Canterbury, Kent, UK CT2 7NR (C.J.Solomon@kent.ac.uk)

 

This paper will examine the basis and utility for the concept of a psychological face-space in which a given facial appearance can be represented as a location in an abstract vector space. The origins of the concept will be discussed and we will present a specific and powerful incarnation of the idea which we term appearance space. Practical applications of this model to automated caricature generation and construction of facial composites will be demonstrated.
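The idea of a face as a location in a vector space, and of caricature as movement within that space, can be sketched generically. The abstract does not specify how the authors' appearance space is constructed, so the example below is not their model: it assumes faces are already coded as numeric vectors and uses the common norm-based scheme of exaggerating a face's deviation from the mean face. The toy coordinates are invented.

```python
import numpy as np

def caricature(face, mean_face, exaggeration=1.5):
    """Move a face vector away from the mean face along its own deviation.

    exaggeration = 1 returns the original face, values above 1 give a
    caricature, and values between 0 and 1 give an 'anti-caricature'.
    """
    face = np.asarray(face, dtype=float)
    mean_face = np.asarray(mean_face, dtype=float)
    return mean_face + exaggeration * (face - mean_face)

# Toy two-dimensional face space holding three faces (hypothetical coordinates).
faces = np.array([[1.0, 2.0],
                  [3.0, 2.0],
                  [2.0, 5.0]])
mean_face = faces.mean(axis=0)

exaggerated = caricature(faces[0], mean_face, exaggeration=2.0)
print("mean face:", mean_face)
print("caricature of face 0:", exaggerated)
```

In a real system the coordinates would come from a statistical model of face shape and texture, and the caricatured vector would be rendered back into an image.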

Session 4: Action and Navigation (1) (Thursday 28th August, 16:45-18:15)

Applying forward models to sequence learning: A connectionist implementation

Dionyssios Theofilou (presenting author), Arnaud Destrebecqz and Axel Cleeremans

Cognitive Science Research Unit, Universite Libre de Bruxelles, CP122, Av. F.-D. Roosevelt 50, 1050 Brussels, Belgium (Dionyssios.Theofilou@ulb.ac.be)

 

Constant interaction with a dynamic environment - from riding a bicycle to segmenting speech - makes sensitivity to the sequential structure of the world a crucial dimension of information processing. In sequence learning (SL), participants are asked to perform a choice reaction task. Unbeknownst to them, the material contains sequential structure, so that the location of each stimulus depends on the context set by previous stimuli. Results indicate that participants react faster to predictable stimuli than to random stimuli, thus suggesting that they prepare their responses based on implicit knowledge of the sequential contingencies contained in the material ([7], see [1] for a review).

Although detailed models of SL based on Elman's Simple Recurrent Network (SRN) [4] exist [2], such models are unable to account (1) for data suggesting that participants learn not only about sequences of stimuli, but also about sequences of responses (e.g. [5]), and (2) for empirical data indicating that SL is enhanced when responses have specific effects (e.g. tone-onset), even when these effects are irrelevant. The latter specifically suggests that voluntary action is initiated by anticipation of the sensory changes that will result from it.

To address these issues, we explored how forward models, which originated in control theory [6], [8], can account for basic SL data. In such models, two distinct components interact continually. The action network (AN) receives goals and stimulus inputs and produces appropriate actions. These actions then serve, together with the stimulus, as input to the second network, the forward network (FN). The FN is trained to produce the next stimulus, that is, the expected sensory consequences of the actions produced by the AN. In this framework, learning how to produce appropriate actions depends on previous learning about what consequences each action entails. As a first step towards the goal of developing a theory of SL rooted in these ideas, we simulated SL data obtained in [3]. In our adaptation, both AN and FN are SRNs, and processing is cascaded to make it possible for the model to capture the time course of processing within a single trial. The output of the FN at time t (representing the network's expectation about the identity of the next stimulus) is used as input to the action network at t+1, together with the next stimulus. This makes it possible for the network's responses to be shaped by its expectations. Simulation results are presented and discussed in light of current SL theory.
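The information flow between the two networks described above can be sketched as follows. This is a wiring diagram in code rather than the authors' model: the weights are random and untrained, the goal input to the action network and the cascaded within-trial processing are omitted, and the layer sizes and class names are assumptions. It shows only how the forward network's expectation at time t becomes part of the action network's input at t+1.

```python
import numpy as np

class SimpleRecurrentNetwork:
    """Elman-style network: the hidden state is fed back as context on the next step."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w_in = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.w_context = rng.normal(0.0, 0.5, (n_hidden, n_hidden))
        self.w_out = rng.normal(0.0, 0.5, (n_out, n_hidden))
        self.context = np.zeros(n_hidden)

    def step(self, x):
        hidden = np.tanh(self.w_in @ x + self.w_context @ self.context)
        self.context = hidden                      # copy hidden state into context
        logits = self.w_out @ hidden
        exp = np.exp(logits - logits.max())        # softmax over output units
        return exp / exp.sum()

def run_sequence(stimuli, n_locations=4, n_hidden=10, seed=0):
    """Pass a sequence of stimulus locations through the AN/FN loop."""
    rng = np.random.default_rng(seed)
    action_net = SimpleRecurrentNetwork(2 * n_locations, n_hidden, n_locations, rng)
    forward_net = SimpleRecurrentNetwork(2 * n_locations, n_hidden, n_locations, rng)
    expectation = np.full(n_locations, 1.0 / n_locations)  # flat before the first trial
    actions, expectations = [], []
    for location in stimuli:
        stimulus = np.eye(n_locations)[location]
        # AN: current stimulus plus the FN's expectation from the previous step.
        action = action_net.step(np.concatenate([stimulus, expectation]))
        # FN: current stimulus plus the action just produced -> expected next stimulus.
        expectation = forward_net.step(np.concatenate([stimulus, action]))
        actions.append(action)
        expectations.append(expectation)
    return np.array(actions), np.array(expectations)

actions, expectations = run_sequence([0, 2, 1, 3, 0, 2])
print(actions.shape, expectations.shape)
```

In a trained version, the forward network would be fitted to predict the next stimulus and the action network would be fitted to produce the correct response; here the outputs are meaningful only as an illustration of the data flow.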

 

[1]. Cleeremans, A., Destrebecqz, A., & Boyer, M., (1998), Implicit learning: News from the front. Trends in Cognitive Sciences, 2, 406-416.

[2]. Cleeremans, A., & McClelland, J.L., (1991), Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235-253.

[3]. Destrebecqz, A., & Cleeremans, A., (2001), Can sequence learning be implicit? New evidence with the Process Dissociation Procedure. Psychonomic Bulletin & Review, 8(2), 343-350.

[4]. Elman, J.L., (1990), Finding Structure In Time. Cognitive Science, 14, 179-211.

[5]. Hoffmann, J., Sebald, A., & Stöcker, C., (2001), Irrelevant response effects improve serial learning in serial reaction time tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 470-482.

[6]. Jordan, M.I., & Rumelhart, D.E., (1992), Forward models: Supervised learning with a distal teacher. Cognitive Science, 16(3), 307-354.

[7]. Nissen, M.J., & Bullemer, P., (1987), Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1-32.

[8]. Wolpert, D.M., & Kawato, M., (1998), Multiple paired forward and inverse models for motor control. Neural Networks, 11, 1317-1329.

The Simulation of Character Production Behaviours in Connectionist Networks

F.M.Richardson (presenting author), N.Davey and L.Peters

Department of Computer Science, University of Hertfordshire, College Lane, Hatfield, AL10 9AB, UK. (F.1.Richardson@herts.ac.uk)

 

In order to draw (or copy) a character, a process of linearising takes place. In this process the complete static form of the character is broken down into a temporal sequence of strokes for production. According to Thomassen, Meulenbroek and Tibosh (1991), individuals develop their own production rule base, which is reflected in tendencies or strategies for graphic production. Often the sequence in which strokes are produced is influenced by the direction in which script will be read (Alston and Taylor, 1987). In the case of writing that is produced from left to right, the sequence of strokes usually commences at the leftmost point of the character and progresses through neighbouring strokes. However, in some cases characters are produced starting with the topmost point or first vertical line of the character. Occasionally, these principles of production come into conflict, resulting in a variable sequence of production for some characters. For example, a letter 'T' can be produced starting with either the horizontal crossbar or the vertical stroke (Richardson, 2000).

This work uses a connectionist modelling approach to investigate the emergence of production-based behaviours in the sequential production of characters (Richardson et al., 2002). The aim was to discover whether connectionist networks are capable of simulating a complex task such as character production sequence without the use of explicitly imposed heuristics. Networks were trained using back-propagation through time (Rumelhart, Hinton and Williams, 1986). During training they were presented with a static visual depiction of a character on an input array and were required to produce the correct sequence of strokes with which it was drawn. Following this, the networks were tested using a batch of previously unseen characters. The networks produced all characters using sequence production-based behaviours similar to those seen in humans; specifically, they generated variable production sequences for characters (such as 'T') with more than one viable method of production. These results demonstrate not only that connectionist networks are capable of emulating the production-sequence behaviour of humans, but also that rule-like tendencies can emerge naturally on the basis of learning experience.

 

1. Alston, J. and Taylor, J. (1987) The Sequence and Structure of Handwriting Skills. Handwriting: theory, research and practice. New York: Nichols.

2. Richardson, F.M., Davey, N., Peters, L., Done, D.J., Anthony, S.H. (2002) Connectionist Models Investigating Representations Formed in the Sequential Generation of Characters. Proceedings of the 10th European Symposium on Artificial Neural Networks. D-side publications, Belgium, 83-88.

3. Richardson, F.M. (2000). Stroke properties of capital letters of the alphabet. Technical Report. University of Hertfordshire.

4. Rumelhart, D.E., Hinton, G.E., Williams, R.J. (1986). Learning internal representations by error propagation. In Parallel Distributed Processing, vol. 1, chap 8. Cambridge: MIT Press.

5. Thomassen, A., Meulenbroek, R., and Tibosh, H. (1991). Latencies and kinematics reflect graphic production rules. Human Movement Science, 10, 271-289.

An integration of two control architectures of action selection and navigation inspired by neural circuits in the vertebrate: the basal ganglia.

Benoit Girard (presenting author) a, David Filliat b, Alain Berthoz c, Jean-Arcady Meyer a and Agnes Guillot a

a. Animatlab, LIP6, CNRS-Universite Pierre et Marie Curie, 8 rue du Capitaine Scott, Paris, France (benoit.girard@lip6.fr)

b. DGA/Centre Technique d'Arcueil, Arcueil, France

c. Laboratoire de Physiologie de la Perception et de l'Action, CNRS-College de France, Paris, France

 

Contemporary robots are predominantly single-task systems, operating in specially-designed environments, in which they perform pre-programmed sequences of actions. While such systems have proved appropriate and useful on the factory floor, many future applications for robotics will require control systems with much greater flexibility. An effective strategy for designing flexible control systems is to reverse-engineer the biological control systems implemented in animals.

The 'Psikharpax' project is a multi-partner program which aims at synthesizing an 'artificial rat', whose control systems are modelled on neural circuits of a specific model animal, the laboratory rat. A control architecture of action selection, inspired by the neural circuits of the basal ganglia (a group of subcortical nuclei of the vertebrate brain), was previously implemented in a robot which was required to select efficiently between a set of actions in order to 'survive' in an environment where it could find places for 'ingestive' and 'digestive' behaviours. The results showed that the model was able to generate adaptive switching which ensured survival. Nevertheless, the lack of a navigation system occasionally prevented the robot from reaching essential environmental resources.

The present paper concerns the connexion of this architecture with a navigation system. This connexion is also inspired by recent hypotheses concerning the basal ganglia. A particular nucleus, the striatum, which was formerly modelled as a whole, is now segregated into dorsal and ventral parts, the latter corresponding to the nucleus accumbens (Nacc), which is assumed to integrate spatial, motivational and sensorimotor information. The Nacc selects locomotor actions generated by various navigation strategies and modulated by the motivations. The dorsal striatum is in charge of non-spatial task selection and of the coordination with the Nacc selection. Implemented in a simulated robot performing the same task as before, the whole architecture improves the robot's survival in a large environment, using its abilities to build a cognitive map and to plan paths. Furthermore, the robot is also able to occasionally neglect the information recorded in its cognitive map in order to behave opportunistically, i.e. to reach an unexpected but visible resource rather than a memorized but remote one. These results are discussed from the perspective of the pros and cons of such a biomimetic approach.

Session 5: Developmental Processes (Friday 29th August, 9:00-11:15)

The bottom-up nature of category acquisition in 3- to 4-month old infants: Predictions of a connectionist model and empirical data

Bob French

Quantitative Psychology and Cognitive Science, Universite de Liege, Sart Tilman, 4000 Liege, Belgium (rfrench@ulg.ac.be)

 

Three- to 4-month-old infants presented with cat or dog images form a category representation for Cat that excludes dogs and a category representation for Dog that includes cats (Quinn, Eimas, & Rosenkrantz, 1993). We have accounted for this asymmetry by positing an inclusion relationship in the distribution of features present in the cat and dog images (Mareschal, French, & Quinn, 2000). Using a combination of computational modeling and experimental testing of infants, we show that the asymmetry can be reversed or removed by selecting and manipulating stimulus images that reverse or remove the inclusion relationship. The findings suggest that categorization of cat and dog images by young infants is a bottom-up driven process based on learning occurring within the experimental task. We will also discuss how this work can be fitted into a more neurobiologically plausible framework using Gabor-filtered input, and outline possible future directions for this research.

Asymmetric Categorization of Cats and Dogs: The Representational Acuity Hypothesis

Gert Westermann (presenting author) and Denis Mareschal

Centre for Brain and Cognitive Development, Birkbeck College, University of London, Malet Street, London WC1E 7HX (g.westermann@bbk.ac.uk, d.mareschal@bbk.ac.uk)

 

It has been found that infants at 3-4 months of age show an asymmetry in their construction of categories of cats and dogs (Quinn, Eimas & Rosenkrantz, 1993). Infants of this age familiarized with pictures of cats respond with interest when subsequently presented with a picture of a dog, indicating that they have formed a category of cats that excludes dogs. By contrast, when familiarized with pictures of dogs, the infants do not show increased interest in subsequent cat pictures, indicating that their category of dogs includes novel cats. It has previously been argued that this asymmetry is based on a featural analysis of the stimuli: the characteristic features of dogs vary broadly and in many cases cover the narrower range of variation in the same features of cats. This hypothesis has been modelled with an auto-encoder neural network (Mareschal, Quinn & French, 2002). However, this model had two weaknesses. First, it relied on a training regime in which all familiarization stimuli were presented together before adaptation occurred. This differed from the infant experiments, where pictures were presented sequentially. Second, the model was unable to account for developmental change, namely the disappearance of the asymmetry in older infants.

Here we present a new model of categorization that overcomes these problems. The model is an auto-encoder neural network with Gaussian hidden units. The model implements the 'Representational Acuity Hypothesis' which states that objects are represented on cortical maps in terms of their salient features, and during development these representations pass from low to high acuity based on a decrease in neural receptive field sizes. Decrease of receptive field size is modelled with a decrease in the width of the Gaussian hidden unit activation function. This model has previously been shown to account for the developmental trajectory in other perceptual categorization tasks (Westermann & Mareschal, in press). In contrast to a featural account of the categorization asymmetry, the model represents objects holistically, albeit with progressively higher degrees of acuity. With large receptive fields, representations of dogs cover those of cats, but not vice versa. During development, receptive field sizes shrink and become better tuned to individual objects, so that representations no longer overlap and the asymmetry disappears. In this way, the model can account for the categorization asymmetry in young infants and its disappearance in older infants. The model thus presents a biologically plausible account of behavioural change in infancy.
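The role of receptive field size in this account can be illustrated with a single Gaussian unit. The sketch below is not the authors' auto-encoder: it assumes stimuli are points in a feature space and shows only that a unit centred on one stimulus responds strongly to a different stimulus when its width is large, and hardly at all when its width is small. The feature values and widths are hypothetical.

```python
import numpy as np

def gaussian_unit(stimulus, centre, width):
    """Activation of a Gaussian (radial basis) unit with the given width."""
    stimulus = np.asarray(stimulus, dtype=float)
    centre = np.asarray(centre, dtype=float)
    squared_distance = np.sum((stimulus - centre) ** 2)
    return float(np.exp(-squared_distance / (2.0 * width ** 2)))

cat = [0.0, 0.0]     # hypothetical feature vector for one exemplar
dog = [1.0, 0.0]     # a second exemplar, one unit away in feature space

# A unit tuned to the cat exemplar, early (wide) versus late (narrow) in development.
low_acuity = gaussian_unit(dog, centre=cat, width=3.0)
high_acuity = gaussian_unit(dog, centre=cat, width=0.2)
print("response to the other exemplar, wide receptive field:", low_acuity)
print("response to the other exemplar, narrow receptive field:", high_acuity)
```

With wide units the two exemplars produce nearly the same activation and are therefore hard to tell apart; as the width shrinks, the representations separate. The asymmetry itself additionally depends on how the cat and dog exemplars are distributed, which this sketch does not model.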

An Embodied Computational Model for the Emergence of Gaze Following

Jochen Triesch (presenting author) and Eric Carlson

Department of Cognitive Science, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0515 (triesch@ucsd.edu)

 

We present a computational model of the emergence of gaze following behavior in infant-caregiver interactions. We regard gaze following as a skill that infants acquire because they learn that monitoring their caregiver's direction of gaze allows them to predict where interesting objects/events in their environment are (Moore, 1996). In particular, we propose a specific "basic set" of mechanisms that are sufficient for gaze following to emerge (Fasel et al., 2002). This basic set comprises perceptual and motivational biases and habituation mechanisms driving the infant to look at and shift attention between "interesting" visual stimuli, a generic learning mechanism that learns behavioral strategies to satisfy these preferences, and a structured environment providing correlations between where caregivers look and where interesting stimuli are. We formalize these ideas in a simple model based on temporal difference learning. We analyze the model and demonstrate that a) the proposed basic set of mechanisms is indeed sufficient for gaze following to emerge and b) alterations of parameters of some of the basic set mechanisms motivated by findings on developmental disorders lead to impairments in the learning of gaze following that are typical of these disorders.
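How a reward-driven learner can pick up gaze following from a structured environment can be shown with a heavily simplified sketch. It is not the authors' model: habituation and perceptual biases are left out, each trial is a single step (so the temporal difference update reduces to moving a value estimate towards the received reward), and every parameter value is an assumption. The only structure built in is the correlation between where the caregiver looks and where an interesting object is.

```python
import numpy as np

def learn_gaze_following(n_locations=4, n_trials=5000, p_valid=0.8,
                         learning_rate=0.1, epsilon=0.2, seed=0):
    """Learn values[gaze, look]: expected reward for looking at `look`
    when the caregiver is looking at `gaze`."""
    rng = np.random.default_rng(seed)
    values = np.zeros((n_locations, n_locations))
    for _ in range(n_trials):
        gaze = rng.integers(n_locations)
        # Structured environment: the object is usually where the caregiver looks.
        if rng.random() < p_valid:
            target = gaze
        else:
            target = rng.integers(n_locations)
        # Epsilon-greedy choice of where the infant looks.
        if rng.random() < epsilon:
            look = rng.integers(n_locations)
        else:
            look = int(np.argmax(values[gaze]))
        reward = 1.0 if look == target else 0.0
        # One-step update: move the estimate towards the obtained reward.
        values[gaze, look] += learning_rate * (reward - values[gaze, look])
    return values

values = learn_gaze_following()
preferred = [int(np.argmax(values[g])) for g in range(values.shape[0])]
print("preferred look direction for each caregiver gaze direction:", preferred)
```

Lowering `p_valid` towards chance weakens the correlation the learner can exploit, which is one simple way to explore how changes in the environment or in the reward signal affect the emergence of the behaviour.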

Connectionist models of autism

Joe Levy

School of Psychology and Therapeutic Studies, Roehampton University of Surrey, Whitelands College, West Hill, London SW15 3SN (j.levy@roehampton.ac.uk)

 

Autism is a complex and pervasive developmental disorder affecting social interaction, communication and behavioural flexibility. Cognition, perception and emotion have all been implicated in theories of autism. There are many theoretical accounts for this condition including executive disorders, failures in interpersonal emotional connectedness, poor theories of mind, weak coherence in low-level cognitive processes and extreme patterns of a "male-brain" cognitive style.

Recently, several connectionist approaches have attempted to characterise some of the aspects of autism. They can all be linked to accounts claiming that autism is caused by weak central coherence. Cohen (1994) has invoked large hidden layers in feed-forward networks as a mechanism for over-fitting and thus a lack of generalisation in autism. Gustafsson (1997) has suggested that excessive lateral inhibition causing poor and restricted topographical maps in a Kohonen network could be a cause of over-specific categorisation. McClelland (2000) has implicated an excessive reliance on conjunctive encoding as a cause of over-specificity. O'Loughlin and Thagard (2000) employ symbolic networks to show how a disruption in constraint satisfaction can cause seemingly autistic behaviour in a linguistic task and a task requiring a "theory of mind".

Most of these models only cover very specific aspects of autism and fail to model how low-level deficits develop into a number of linked high level problems. However, it will probably be very useful to use connectionist modelling to understand autism (and other developmental disorders) because such models can be seen to develop in a way that no verbal description is able to.

This paper will review these models and other relevant approaches to development and ask how mutually incompatible they are. Possible syntheses and new ways forward will be suggested including combinations of learning rules and the implications of new work that has implicated deficits in binding with autism. If autism is caused by poor binding mechanisms we can ask whether it is possible to model this purely representationally or whether mechanisms such as temporal synchrony must be invoked.

Session 6: Category Acquisition and Binding (Friday 29th August, 11:45-13:15)

Empirical Evidence & Neural Network Modeling of Feature Creation during Perceptual Category Acquisition

Michael Fink (presenting author) a, Gershon Ben-Shakhar b and David Horn c.

a. Interdisciplinary Center for Neural Computation, The Hebrew University of Jerusalem, Israel. (fink@vision.caltech.edu)

b. Department of Psychology, The Hebrew University of Jerusalem, Israel. (mskpugb@mscc.huji.ac.il)

c. Department of Physics, Tel Aviv University, Israel. (horn@brain.tau.ac.il)

 

Our study is aimed at detecting the factors influencing perceptual feature creation. By teaching several new perceptual categories in a controlled setting, we demonstrate the emergence of a new intermediate internal representation. We focus on contrasting the role of two basic factors that govern feature creation. The first is the feature-set's discriminative value, measured by the mutual information the feature-set delivers on the required perceptual categories. The second factor is the feature-set's degree of parsimony. In our experimental process we use a simple model problem that requires learning categories based on a conjunction of four binary input elements. If the feature's discriminative information is the sole dominant factor in the feature creation process, the more informative quadruple features should emerge. On the other hand, if feature-set parsimony plays a major role in the feature creation process, the internal representation of the learned categories should be based on pair-features, conjunctions of two input elements. Several methods of exploring the structure of internal features are developed using a multi-layered perceptron (MLP) artificial neural network. These methods were empirically implemented in two experiments, both demonstrating the emergence of an intermediate internal representation that is based on pair-features. As a result, we conclude that in addition to the feature-set's discriminative information value, the feature-set's parsimony is a major factor influencing the hierarchical process of feature creation. Our results suggest that feature parsimony is maintained not only in order to optimize the perceptual system's current resource management but also to aid future category learning.
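To make the notion of discriminative value concrete, the sketch below computes the mutual information between a feature and a category label. The category structure is an assumption for illustration only (a single category defined by the conjunction of all four binary inputs, with all 16 input patterns equally likely); the stimuli and categories actually used in the experiments are not described here.

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_information(pairs):
    """Mutual information (bits) between feature and category,
    estimated from equally weighted (feature, category) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    feat = Counter(f for f, _ in pairs)
    cat = Counter(c for _, c in pairs)
    return sum((k / n) * log2((k / n) / ((feat[f] / n) * (cat[c] / n)))
               for (f, c), k in joint.items())

# Hypothetical toy problem: all 16 patterns of four binary inputs.
inputs = list(product([0, 1], repeat=4))
category = [int(all(x)) for x in inputs]              # conjunction of all four
quad_feature = [int(all(x)) for x in inputs]          # quadruple feature
pair_feature = [int(x[0] and x[1]) for x in inputs]   # one pair-feature

mi_quad = mutual_information(list(zip(quad_feature, category)))
mi_pair = mutual_information(list(zip(pair_feature, category)))
```

In this toy case the quadruple feature carries about 0.34 bits about the category and a single pair-feature about 0.13 bits, which is the sense in which quadruple features are more informative while pair-features are more parsimonious.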

Does the energy spectrum from Gabor wavelet filtering contain sufficient information for neural network recognition and classification tasks?

Martial Mermillod (presenting author) a, Nathalie Guyader b,c, and Alan Chauvin b,c

a. Cognitive Science Department (Bat B.33), Universite de Liege, 4000 Liege 1, Belgium (mmermillod@ulg.ac.be)

b. Laboratoire de Psychologie Experimentale, CNRS, UMR 5105

c. Laboratoire des Images et des Signaux, CNRS, ESA 5083, INPG, 27 Av Felix Viallet, Grenoble

 

Results from neurophysiological studies (Gollisch & Herz, in press) suggest that the energy spectrum (i.e., the square of the amplitude spectrum) can be used to simulate, in a physiologically appropriate manner, the spectral integration of sensory neurons. We have attempted to show the effectiveness of energy-spectrum descriptors for neural network simulations of a high-level cognitive task. We used a neurobiologically plausible simulation of the complex cells in the striate cortex as a perceptual model. This was done with a bank of Gabor wavelets applied in the Fourier domain in order to simulate mammalian visual processes (Jones & Palmer, 1987; Jones, Stepnoski & Palmer, 1987). The energy-value outputs of this perceptual model were presented to two different kinds of neural network classifiers and their respective performances were recorded for a recognition/classification task. The stimuli used correspond to 6 categories of 12 natural scene images: Beach, City, Forest, Mountain, Indoor and Village. In the first simulation, a backpropagation autoencoder was reliably able to distinguish a specific stimulus from other exemplars of the same or different categories. Using a method proposed by Mareschal & French (1997), we also compared the classification performance of the autoencoder for unseen exemplars from the training category versus stimuli from new categories. Categorization is performed in a natural (i.e., self-supervised) manner. In a second simulation, we tested the energy vectors with a standard backpropagation hetero-associator. Both results show a reliable ability of the two types of neural networks to categorize and generalize on new exemplars based on the information provided by the energy spectrum of the natural scene images.
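The definition of the energy spectrum used above (the square of the amplitude spectrum) can be shown with a short sketch. It works on a one-dimensional signal with a plain discrete Fourier transform; the Gabor wavelet bank and the two-dimensional image processing of the actual model are not reproduced, and the function name is an illustrative assumption.

```python
import cmath

def energy_spectrum(signal):
    """Energy per frequency bin: squared magnitude of the DFT coefficients."""
    n = len(signal)
    energies = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        energies.append(abs(coeff) ** 2)  # amplitude squared
    return energies

# One cycle of a cosine sampled at four points.
signal = [1.0, 0.0, -1.0, 0.0]
energy = energy_spectrum(signal)
```

For this signal all the energy falls in the two bins corresponding to the single frequency present, and the total energy equals the number of samples times the sum of squared sample values (Parseval's relation).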

Understanding Object Feature Binding Through Experimentation and Modelling

Carolyn Mair (presenting author) and Martin Shepperd

ESERG, School of Design Engineering and Computing, Bournemouth University, UK. (cmair@bournemouth.ac.uk, mshepper@bournemouth.ac.uk)

 

In an attempt to demonstrate underlying brain mechanisms, psychophysical data from an experiment designed to further understand object feature binding are modelled using an attractor network. The experiment used a post-cue rapid serial visual processing paradigm in each of three conditions: spatial, temporal and spatio-temporal. In the spatial condition, objects were presented simultaneously at fixed, unique positions, adjacent to or distant from the target; in the temporal condition, objects were presented sequentially at the focal point; and in the spatio-temporal condition, objects were presented sequentially at fixed, unique positions. A ~50% error rate was established during practice trials, and stimulus onset asynchrony was thus determined for each condition and each observer. Results indicate that when objects are distinguishable by their spatial location, these spatial properties are used in preference to temporal properties. When both spatial and temporal properties are available, spatial properties facilitate binding, and temporal properties hinder it. Our expectation is that the neural model, in line with physiological evidence of enhanced cortical activation, will exhibit regions of enhanced activity when the post-cue is input and the target will be retrieved. By testing the experimental hypotheses on the neural model, we expect to gain a better understanding of how and where object feature binding takes place.

Session 7: Attention and Cognitive Architectures (Friday 29th August, 14:15-16:30)

Through Attention to Consciousness by CODAM

JG Taylor

Dept of Mathematics, King's College, Strand, London WC2R 2LS, UK (john.g.taylor@kcl.ac.uk)

 

Attention is arguably a necessary condition for consciousness. The problem as to what more is needed will be addressed in the talk by using an engineering control approach to attention, following much work on such an approach in motor control. After a brief account of the control nature of attention, an engineering control model for attention will be developed for both sensory and motor modalities. Simulations of various paradigms will be described [JGT & M Rogers (2002) Neural Networks 15:309-326; N Fragopanagos & JGT (2003) Simulating Sensory Motor Attention, KES'03]. The model will then be extended to the COrollary Discharge of the Movement of Attention (CODAM) model [JGT (2002) Trends in Cognitive Sciences 6:206-210; JGT (2002) J Consc Studies 9(4):3-22; JGT (2003) KES'03]. The application of CODAM to a variety of areas (simulations of psychological paradigms, such as the Attentional Blink, schizophrenia, neglect, other deficits) will be covered briefly in conclusion.

Model Visual Search Experiments: A new version of the Selective Attention for Identification Model (SAIM)

Dietmar Heinke (presenting author) and Glyn W. Humphreys

Behavioural Brain Science Centre, School of Psychology, University of Birmingham, Birmingham B15 2TT, UK (d.g.heinke@bham.ac.uk)

 

We have recently published SAIM (Selective Attention for Identification Model, Heinke & Humphreys, 2003, Psych. Review), which uses competitive processes and interactions between top-down and bottom-up processes to achieve translation-invariant object recognition. We have shown that SAIM can account for large amounts of empirical data on normal and impaired human attention. In this paper we report on an extension of SAIM with simple feature extraction. We show that the new version of SAIM can capture important aspects of findings in visual search experiments, e.g. variations of search slopes and search asymmetries, depending on the contents of search displays. These simulations represent further support for the hypothesis that parallel competitive processes could be a source of major findings in visual search experiments. Additionally, we present experimental data examining the influence of priming on reaction times in visual search. These results confirm predictions which originate from the interaction between top-down and bottom-up pathways in SAIM.

An Oscillatory Neural Model of Cognitive Functions: Feature Representation, Binding, Memory, and Attention

Roman Borisyuk (presenting author) and Yakov Kazanovich

School of Computing, University of Plymouth, Plymouth, PL4 8AA (borisyuk@soc.plym.ac.uk)

 

A biologically plausible oscillatory neural model of several cognitive functions is developed. The system architecture includes a hierarchy of interactive modules which are associated with different stages of visual information processing:

·         Representation Module (RM): information encoding, features extraction and representation, feature binding.

·         Central Executive (CE): attention focus formation, consecutive selection of objects in visual field into the attention focus.

·         Memory Module: memorisation of objects, novelty detection.

Each module consists of interactive phase oscillators with synchronising and desynchronising interactions. The dynamical functioning of the system is based on the principle of synchronisation and resonant activity increase. A phase-frequency coding scheme is used in the primary layer of information representation: the grey level of the corresponding pixel is represented by the natural frequency of the oscillator. There are other layers of the RM to represent local (in the neighbourhood of the pixel) and invariant features of objects. The mechanism of feature coding and feature binding is based on the principle of neural activity synchronisation.

Attention is realised in the system in the form of synchronisation of the CE with some of the RM oscillators. Those oscillators that work synchronously with the CE are supposed to be included in the attention focus, and their amplitudes increase drastically, forming a resonant ensemble of oscillators. These resonant oscillators demonstrate the regime of partial synchronisation (some of the RM oscillators work synchronously with the CE, but not all oscillators are synchronised). The inclusion of the CE in the system design is important because it allows one to organise the global interaction in the system without connections of "all-to-all" type, thus avoiding an exponential explosion of the number of lateral connections. At each moment of time the activity of an ensemble corresponding to a selected object is significantly higher than the activity in other regions of the network. We demonstrate that the system can perform both attentional grouping and consecutive selection of objects. Functioning of the attention system is based on a phase-locking mechanism and adaptation of the natural frequency of the CE.

The memory mechanism is based on sparse spatial representation of objects by groups of synchronous oscillators and adaptation of their natural frequencies. Novelty detection is realised via different reaction times on presentation of new and familiar stimuli: a tonic (long) response to a new stimulus and a phasic (short) response to a familiar object.
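As a rough illustration of how interacting phase oscillators synchronise, the sketch below uses the classical Kuramoto model with uniform all-to-all sinusoidal coupling. This is an assumed stand-in, not the architecture described above: there is no central executive, no amplitude dynamics and no desynchronising interaction, and all parameter values are illustrative.

```python
import cmath
import math
import random

def order_parameter(phases):
    """Degree of synchrony: 0 for incoherent phases, 1 for identical phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)

def simulate(coupling, n=20, dt=0.01, steps=5000, seed=1):
    """Euler integration of d(theta_i)/dt = w_i + K * r * sin(psi - theta_i),
    the mean-field form of the Kuramoto model (assumed here for illustration)."""
    rng = random.Random(seed)
    # Natural frequencies spread evenly between 0.9 and 1.1.
    freqs = [0.9 + 0.2 * i / (n - 1) for i in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        mean_field = sum(cmath.exp(1j * p) for p in phases) / n
        r, psi = abs(mean_field), cmath.phase(mean_field)
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, freqs)]
    return order_parameter(phases)

synchrony = simulate(coupling=2.0)
```

With coupling well above the spread of natural frequencies, the oscillators phase-lock and the order parameter approaches 1. Coupling each oscillator only to a shared mean field, rather than to every other oscillator directly, is loosely analogous to routing global interaction through a central element.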

Neural Network Models of the Attentional Blink

Howard Bowman (presenting author) a, Brad Wyble a and Phil Barnard b

a. Computing Laboratory, University of Kent, Canterbury, UK (H.Bowman@kent.ac.uk)

b. Cognition and Brain Sciences Unit, Cambridge, UK

 

A set of experiments that employ rapid serial visual presentation (RSVP) have clarified the time course of selective attention by identifying what has come to be called the attentional blink (AB). Many theoretical explanations of the phenomenon and associated informal models have been advanced. However, until recently, no connectionist model had been developed. In response, we have constructed two prototype neural network models of the blink.

In the typical AB experiment, letters are presented using RSVP at around ten items a second. One letter (T1) is presented in a distinct colour. It is the (first) target whose identity must be reported. A second target (T2) follows after some number of intervening items. For example, the person may have to report whether the letter X was among list items that followed T1. Detection of T2 is impaired with a characteristic serial position curve.

Within the context of theoretical accounts of the blink, Chun and Potter's 2-stage model [1] has been our most direct inspiration. Their first stage implements a fast identification process, traces of which decay rapidly and are subject to erasure. It is only through the second stage that items can be consolidated into working memory.

Both our models are based upon this two-stage account (although a symbolic computational model [2] has also influenced us). The first model employs an attentional control system. This contains two neural pathways, one that feeds activation into the first recognition stage and another that feeds activation into the second stage. Critically, however, the two pathways compete through lateral inhibition, and thus only one of them can be active at any instant. It is this attentional control mechanism that generates the blink.
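The competition between two pathways through lateral inhibition can be sketched with two mutually inhibiting rate units. This is a generic winner-take-all illustration under assumed dynamics (leaky units, rectified input, inhibition weight of 2), not the attentional control system of the model itself.

```python
def compete(input_a, input_b, inhibition=2.0, dt=0.01, steps=2000):
    """Two leaky units that inhibit each other (hypothetical dynamics).

    Each unit relaxes towards its rectified net input, which is its own
    external input minus the weighted activity of the other unit.
    """
    a = b = 0.0
    for _ in range(steps):
        da = -a + max(0.0, input_a - inhibition * b)
        db = -b + max(0.0, input_b - inhibition * a)
        a, b = a + dt * da, b + dt * db
    return a, b

# The pathway with the stronger input suppresses the other.
a, b = compete(1.0, 0.8)
```

Because the inhibition weight is greater than 1, the state in which both units are active is unstable, so only one pathway remains active once the dynamics settle.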

The second model is more explicit in its explanation of the stage 2 process of consolidation into working memory. Specifically, it assumes that in order to consolidate T1 items into working memory a resonating circuit has to be set-up between working memory cells and recognition cells. It is this resonating activity that closes the door on T2 processing.

Although very simple, both the models successfully reproduce the AB serial position curve, showing a clear decline in performance for serial positions 2, 3, 4 and 5. In addition, the networks show the characteristic property of lag 1 sparing, i.e. if T2 immediately follows T1, there is no performance cost. Also our second model reproduces a number of the other AB properties, e.g. masking effects.

 

[1] Chun, M. M. and M. C. Potter (1995). "A Two-Stage Model for Multiple Target Detection in Rapid Serial Visual Presentation." JEP: HPP 21(1): 109-127.

[2] Barnard, P.J. and H. Bowman (2003) "Rendering information processing models of cognition and affect computationally explicit: Distributed executive control and the deployment of attention." Cog Sci Quart, 3(3):32, (in press).

Session 8: High Level Cognition and Implementation Issues (Friday 29th August, 17:00-18:30)

A connectionist account of the development of transitive analogies

Robert Leech (presenting author), Denis Mareschal and Richard Cooper

School of Psychology, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK (r.leech@psychology.bbk.ac.uk)

 

We present a connectionist model of the development of analogical reasoning capable of modelling relational analogies between domains of two or more objects. In the model, relational analogy completion arises as a by-product of pattern completion in a dynamic memory system. The current model is an adaptation of Leech, Mareschal and Cooper (2003), extending it to draw more complex and abstract analogies. Units are connected by two types of modifiable connections: fast connections, which transmit the current activation of the units, and slow connections, which implement a delay, transmitting an earlier activation state of the network. The fast connections drive the network into attractor states corresponding to objects. The slow connections implement transformations between states by pushing the network out of its stable state and into another attractor basin. The fast and slow connections work together to move the network from one attractor state to another in an ordered way. Relations are assumed to be transformations between different states of objects. Analogy is achieved by an earlier example of a relation priming the network so that subsequent presentation of an object produces the appropriate analogical response. Since the network can learn transformations between more than two objects, the network can draw analogies involving more than two objects. We investigate the developmental plausibility of the network by comparing its performance with that of 3- and 4-year-olds on transitive analogies reported by Goswami (1995).

Multiple Person Inferences: A View of a Connectionist Integration

Frank Van Overwalle

Department of Psychology, Vrije Universiteit Brussel, Pleinlaan 2, B -1050 Brussel, Belgium (Frank.VanOverwalle@vub.ac.be)

 

This paper provides a connectionist account of the processes underlying the multiple inference model of person impression formation proposed by Reeder, Kumar, Hesson-McInnis and Trafimow (2002). First, in a replication and extension of one of their main studies, I found evidence for discounting of trait inferences when facilitating situational forces were present, consistent with earlier causality-based theories, while at the same time I replicated the lack of discounting in moral inferences as documented and predicted by Reeder et al. (2002). Second, to provide an account of how these different and sometimes contradictory inferences are formed and integrated in a coherent person impression, I extended existing recurrent network models of person perception by assuming that perceivers also take into account the actor's motives. Together with the behavior information, the connectionist network automatically integrates these inferred motives, resulting in a pattern of inferences that closely reproduces the observed data. It is concluded that perceivers apparently have a much richer knowledge on which they base their inferences than assumed so far in earlier theories, and that a connectionist approach appears a plausible candidate to account for this complex integration process.

Approaches to efficient simulation with spiking neural networks.

Ioana Marian a, Colm G. Connolly a (presenting author) and Ronan G. Reilly b

a. Dept. of Computer Science, University College Dublin, Belfield, Dublin 4 Ireland (Ioana_Marian@rdslink.ro, Colm.Connolly@ucd.ie)

b. Dept. of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland (Ronan.Reilly@may.ie)

 

Computer simulations of the nervous system play an increasingly prominent role in understanding the way neurons process information. Spiking neural networks received special attention after experimental evidence suggested that biological neurons use the timing of the spikes to encode information and compute. They represent a powerful tool for investigating how cognitive functions emerge from the properties of basic components that interact and function cooperatively.

When modelling the dynamics of large-scale spike-processing networks, time and memory efficiency are crucial criteria in the design of the simulation. Previous work on the efficient simulation of such networks has indicated that high performance simulators for rate-coding networks are not appropriate. The event-driven nature of spiking neural networks requires special attention when designing efficient simulation environments. Significant efforts have been made in the last decade to maximise the computational efficiency of these simulators.

This paper critically reviews current work on data structures and algorithms appropriate for efficient event-driven simulation of spiking neural networks on single processor systems. These techniques make feasible the simulation of highly active large networks. However, they place limits on the networks that they can process. We describe two additional algorithms which we have developed in an attempt to overcome these limitations. Moreover, simulation studies show that these algorithms deliver significant performance improvements.
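For readers unfamiliar with event-driven simulation, the following minimal sketch shows the general idea: spikes are kept in a priority queue ordered by delivery time, and a neuron's state is updated only when an event reaches it. It is a generic textbook-style illustration, not one of the algorithms reviewed or proposed in the paper; the leaky integrate-and-fire dynamics, the fixed transmission delay and all names are assumptions.

```python
import heapq
import math

def simulate(synapses, external_events, threshold=1.0, tau=10.0,
             delay=1.0, t_max=50.0):
    """Event-driven simulation of leaky integrate-and-fire neurons.

    synapses:        dict mapping neuron -> list of (target, weight)
    external_events: list of (time, target, weight) input events
    Returns the list of (spike_time, neuron) in order of occurrence.
    """
    queue = list(external_events)
    heapq.heapify(queue)          # earliest event is always popped first
    potential = {}                # membrane potential per neuron
    last_update = {}              # time of each neuron's last update
    spikes = []
    while queue:
        t, target, weight = heapq.heappop(queue)
        if t > t_max:
            break
        # Lazy update: apply the exponential decay accumulated since the
        # neuron was last touched, then add the incoming weight.
        elapsed = t - last_update.get(target, t)
        v = potential.get(target, 0.0) * math.exp(-elapsed / tau) + weight
        last_update[target] = t
        if v >= threshold:
            spikes.append((t, target))
            v = 0.0               # reset after firing
            for post, w in synapses.get(target, []):
                heapq.heappush(queue, (t + delay, post, w))
        potential[target] = v
    return spikes

# A three-neuron chain driven by one suprathreshold input at time 0.
chain = {0: [(1, 1.5)], 1: [(2, 1.5)]}
spikes = simulate(chain, [(0.0, 0, 1.5)])
```

The cost of this scheme scales with the number of spike events rather than with the number of time steps, which is why the choice of queue data structure matters for large, highly active networks.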

Session 9: Language and Speech (Saturday 30th August, 9:00-11:15)

Sublexical units in the computational modelling of visual word recognition.

Richard Shillcock (presenting author) and Padraic Monaghan

School of Informatics, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, UK (rcs@inf.ed.ac.uk)

 

We explore two effects reported in visual word recognition: the neighbourhood effect and the transposed-letters effect. In the first, the recognition of a word from a larger lexical neighbourhood is facilitated. In the second, minimally different words like "bolt" and "blot" seem to interact during the recognition of one of them. We show that both of these effects can be captured by the same, anatomically-based approach to visual word recognition.

How the constraints on English compound production might be learnt from the linguistic input: Evidence from 4 connectionist models

Jenny Hayes (presenting author), Victoria Murphy, Neil Davey and Pam Smith

Psychology Department, University of Hertfordshire, College Lane, Hatfield, AL10 9AB (j.hayes@herts.ac.uk)

 

Native English speakers include irregular plurals in English noun-noun compounds (e.g. mice chaser) more frequently than regular plurals (e.g. *rats chaser) (Gordon, 1985). This dissociation in inflectional morphology has been argued to stem from an internal and innate morphological constraint, as it is thought that the input to which English-speaking children are exposed is insufficient to signal that regular plurals are prohibited in compounds but irregulars might be allowed (Marcus, Brinkmann, Clahsen, Wiese & Pinker, 1995). In addition, this dissociation in English compounds has been invoked to support the idea that regular and irregular morphology are mediated by separate cognitive systems (Pinker, 1999). The evidence of four recurrent connectionist models provides support for an alternative view that the input the language learner is exposed to constrains the types of English compounds that are produced. Model 1 demonstrates that there is a discernible relationship between the [-s] morpheme and word finality in child-directed speech. Thus, to include the regular plural [-s] morpheme internal to words such as compounds contravenes an obvious pattern in the input. Models 2, 3 and 4 investigate the hypothesis that the regular plural morpheme is omitted from the middle of compound words because the pattern noun morpheme [-s]-noun is used to denote possession, not plurality, in English. Having made a first-order distinction between the function of various words (nouns, verbs, determiners and adjectives), Model 2, using a localist coding scheme to represent sentences made up of 38 words (cf. Elman, 1990), learnt a second-order distinction that nouns could appear after some [-s] morphemes but not others (even though the two [-s] morphemes were encoded in the same way in the input).
With the addition of the absolute minimum of semantics (whether the subject of the sentence was a singular or a plural thing), the model learnt to further differentiate between the plural and the possessive [-s] morpheme (Model 3) (Hayes, Murphy, Davey, Smith and Peters, 2002). In Model 4, a large training set of natural child-directed speech was employed and the syntactic category of each word (rather than the word) was input. The actual frequency of each syntactic category in real child-directed speech was represented. Under these realistic input conditions there was a suggestion that the network was able to recognise that the noun morpheme [-s] pattern occurred in different patterns when it was plural than when it was possessive. Specifically, the network showed some indication of being able to discern that nouns follow possessives but not regular plurals. Thus it is argued that the input the language learner is exposed to does constrain the types of English compounds that are produced, without the need for internal morphological constraints.

 

* Gordon, P. (1985). Level-ordering in lexical development. Cognition, 21, 73-93.

* Hayes, J.A., Murphy, V.A., Davey, N., Smith, P.M., & Peters, L. (2002). The /s/ morpheme and the compounding phenomenon in English. In W. D. Gray & C.D. Schunn (Eds.) Proceedings of the 24th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.

* Marcus, G. F., Brinkmann, U., Clahsen, H., Wiese, R., & Pinker, S. (1995). German inflection: The exception that proves the rule. Cognitive Psychology, 29, 189-256.

* Pinker, S. (1999). Words and Rules. London: Weidenfeld & Nicolson.

Using the Structure Found in Time: Building real-scale orthographic and phonetic representations by Accumulation of Expectations

Fermin Moscoso del Prado Martin (presenting author), Robert Schreuder and R. Harald Baayen

Interfaculty Research Unit for Language and Speech, University of Nijmegen & Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands (fermin.moscoso-del-prado@mpi.nl)

 

Large-scale distributed connectionist models of language processing require distributed representations of orthographic and phonetic/phonological word forms that capture the similarity structures among all words in a language. Ideally, such representations should themselves be learned from the input, without resorting to hand-crafted schemas such as slot-based templates (e.g., CCCVVCCC, ...) that preimpose a great amount of structure on the networks. Additionally, templates ignore the fundamentally sequential nature of human language, introducing arbitrary restrictions on word length and morphological complexity. Finally, template-based approaches require artificial manipulations of the input by means of alignment and the introduction of `gaps'. It has been argued that such manipulations of the original input sequences implicitly assume some sort of symbolic preprocessing, therefore damaging the sub-symbolic processing assumptions. Elman (1990; 1993) showed how a Simple Recurrent Network (SRN), trained on next-letter or next-phoneme prediction on a sufficiently large sample of language, developed in its hidden layer detailed representations of the phonotactics and orthotactics of a particular language. SRNs overcome most of the problems described above; they provide a natural way to represent strings of virtually unlimited length without having to resort to predefined `possible' structures. In the present study we show how the weighted accumulation over time of the activation values of neural networks trained on next-letter or next-phoneme prediction renders detailed representations of the orthographic and phonetic form of all words (and also of pseudo-words) in a language.
These representations render detailed measures of word similarity that provide a continuous alternative to discrete-valued techniques for estimating phonetic and orthographic overlap between words, with the added value that the similarity space takes into consideration the properties of a particular language. Using this technique, Accumulation of Expectations, we present examples of building detailed orthographic and phonetic vectors for the full lexicons of English and Dutch. We show that these vectors can be successfully used in connectionist models of lexical processing with unrestricted vocabulary sizes, including classical tasks in connectionist modelling such as past-tense formation and word spelling. We conclude by showing that the similarity spaces defined by such representations accurately predict human responses in behavioural experiments.

A SOM-based model of speech segmentation

James Hammerton

Alfa-Informatica, University of Groningen, Postbus 716, 9700 AS Groningen, The Netherlands (james@let.rug.nl)

 

In recent years a number of models of speech segmentation, both connectionist and non-connectionist, have been developed. Whilst the non-connectionist models include unsupervised models, the connectionist models have employed Simple Recurrent Networks (SRNs) trained with back-propagation and thus requiring an explicit error signal. However, these models have used networks trained on some other task and then exploited the behaviour of the network to predict word boundaries making them only indirectly supervised.

In this talk, a connectionist unsupervised model of speech segmentation, based on Kohonen's Self-Organising Map, will be presented. The SOM is chosen because it does not need an explicit error signal and is a biologically plausible architecture (i.e. both the operation and training of the SOM correspond to processes known to occur in the brain -- the training of SRNs with back-propagation does not), and thus whether a SOM can become sensitive to the patterns in speech is of interest. During training it is presented with phonotactic transcriptions of child-directed speech taken from the Korman corpus of the CHILDES database. At the end of training, the units which are activated at the end of utterances are noted and then, whenever these are activated during testing, a word boundary is predicted. To date, this model has achieved reasonable, though not state-of-the-art, results, with F-scores (an aggregate of precision and recall) of 68.15 for finding word boundaries and 30.20 for finding words, when using purely phonetic input.
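The scoring of predicted word boundaries can be sketched as follows. The sketch assumes the balanced F-score (the harmonic mean of precision and recall); the abstract does not state which weighting was used, and the boundary positions in the example are made up for illustration.

```python
def boundary_scores(predicted, actual):
    """Precision, recall and balanced F-score for predicted boundary positions."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)          # correctly predicted boundaries
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f_score = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f_score

# Hypothetical example: boundaries given as phoneme offsets in an utterance.
precision, recall, f_score = boundary_scores(predicted=[2, 5, 7],
                                             actual=[2, 5, 9, 11])
```

Here two of the three predicted boundaries are correct and two of the four true boundaries are found, giving a precision of 2/3, a recall of 1/2 and an F-score of 4/7. Multiplying by 100 gives values on the scale reported above.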

A direct comparison of this model with another connectionist model, developed by Christiansen et al. and based on the SRN, is currently being worked on and will also be presented in the talk. Christiansen et al.'s model is trained to read in utterances phoneme by phoneme and predict the next phoneme or whether an utterance boundary occurs. The behaviour of the output unit is then used to predict word boundaries, on the basis that its activation is higher than for word-internal positions. The comparison will consider both the performance and the plausibility of the two models and how they might be made more plausible and their performance improved.

Finally, the question of why current connectionist models do not perform as well as the best non-connectionist models (e.g. Brent's INCDROP model) will be discussed, and it is intended that some results will be presented based on an attempt to account for the performance gap.

Session 10: Evolution (Saturday 30th August, 11:45-13:00)

On the Evolution of Irrational Behaviour

John A. Bullinaria

School of Computer Science, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK (j.a.bullinaria@cs.bham.ac.uk)

 

Many aspects of human and animal behaviour require individuals to learn quickly how to classify the patterns they encounter: for example, which foods are good or safe to eat, which other animals should be feared, which environments should be avoided, and so on. One might imagine that evolution by natural selection would result in neural systems emerging that are very good at learning things like this. Explicit simulations of the evolution of simple developmental neural systems that are required to classify various types of sensory information show that such rational behaviour can indeed emerge quite easily. However, the same simulations also reveal that there are situations in which evolution lets the species down, and populations emerge that perform rather irrationally. These populations are effectively trapped in a local maximum of evolutionary fitness, and are unable to escape into the true maximum that corresponds to optimal behaviour. One can speculate on which aspects of human behaviour this might correspond to. I shall present the results from a selection of my simulations that begin to explore the issues involved.

SOH, a Self Observing Heuristic to control neural networks learning process

Jean-Jacques Mariage

CSAR group, AI Laboratory, Paris 8 University, 2, rue de la Liberte, St Denis, France, Cdx 93526 (jam@ai.univ-paris8.fr)

 

We present here the theoretical framework of ongoing research. Our aim is to build an adaptive learning system based on neo-Darwinian evolution of neural units. We proceed in two complementary directions. On the one hand, we try to automate the costly tuning phase of the configuration and learning parameters of neural networks (NNs). On the other hand, we use meiosis cellular growth as a natural computation technique to bypass the palimpsest effects observed when new knowledge is added to existing knowledge. The main idea is to build an event-guided, growing competitive NN that develops while it learns to tune other models' parameters. NNs are very complex processes, which develop their structure over time. From real-time process control and analysis, we can state that usual learning algorithms reach their limits when faced with overly complex processes. It thus becomes necessary to resort to more powerful methods. We hypothesize that, among the existing NN algorithms, there is a sufficient set of primitives from which a holistic constructive programming scheme can emerge by means of evolutionary techniques. We have chosen the Self-Organizing Map (SOM) model because, as attested by a large number of research publications, the model itself is currently evolving towards adaptive variants. Moreover, it offers a visual interface of remarkable quality to the data space, in a familiar and accessible two-dimensional, topologically ordered representation. Last but not least, the number of configuration and learning parameters it requires to be tuned is rather large.

Adaptive topologies are of crucial interest for automatically determining the size of NNs. Oversized nets lead to prohibitive training times, while undersized nets cannot capture the structure of the data space. For back-propagation architectures, efficient growing and pruning methods make it possible to find a near-optimal number of hidden units. A review is given in [1]. For the SOM, things become more difficult because of the topological ordering constraints in the mapping it performs.

The remainder of this paper is organized in three parts. We first investigate the main properties of the SOM algorithm and its evolutionary growing variants. From that framework, in the second part, we derive a self-observing heuristic and a minimal set of properties necessary to obtain the emergence of Darwinian evolution among elementary constituents, constrained only locally by a few deterministic rules. In the last part, we extract a few neural primitives in order to implement a minimal system capable of evolving toward self-observation, and set out the algorithm.

 

Session 11: Action, Navigation and Location in Space (2) (Saturday 30th August, 14:00-15:30)

Extracting a Spatial Grammar from Recurrent Neural Network

Corina Sas (presenting author) and Ronan Reilly

Department of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland (corina.sas@ucd.ie)

 

Human ability in spatial exploration plays an important role in how humans understand space. This involves an implicitly developed representation of the spatial layout, usually in the form of so-called cognitive maps. Despite its significance, the process of building such representations is difficult to study directly. Nonetheless, an overt indicator of the quality of spatial knowledge acquisition is a user's performance on specific spatial navigation tasks. This work proposes a hybrid connectionist-symbolic model for investigating and extracting procedural and strategic rules governing human exploratory behaviour. The experiments were carried out within a Virtual Environment (VE), which due to its tractable characteristics allowed the moment-to-moment recording of users' positions and headings. Simulations involved the use of a simple recurrent network (SRN; Elman, 1990) and the application of the C5.0 method for automatic rule extraction (Quinlan, 1993). An SRN implements a form of short-term memory, which makes it suitable for application to symbolic tasks that have a sequential nature, such as language. The C5.0 method is based on the ID3 algorithm developed by Quinlan (1993), which induces concepts from examples. It is particularly interesting due to its representation of learned knowledge and its heuristics for selecting candidate concepts. The SRN successfully learned to predict the user's next position and orientation. The prediction accuracy of greater than 67% suggested that the neural network succeeded in acquiring the underlying regularities characterising user trajectory patterns. The next step consisted of extracting the rules that characterised human heuristics for exploring an unfamiliar environment. The analysis of the representation in the SRN hidden layer suggested that distinct groups of hidden units become specialised for place and direction.

The distributed representations acquired by the SRN were investigated by analysing the individual pattern errors in the network's predictions. The predictions were classified as good or poor according to their error values and were then input to the C5.0 method. This algorithm provided a set of rules which can be expressed in symbolic form. The benefit of these results is both theoretical and practical. The extracted rules can be used to confirm findings in the area of environmental psychology. They can also be harnessed for building adaptive VEs, designed to assist users with poor navigational skills to improve their exploratory behaviour.

 

Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.

Quinlan, R. (1993). C5.0 Programs for Machine Learning. Morgan Kaufmann.

Smooth Shifts In Animate Goal-Directed Behaviour

M.K. Weir (presenting author) and A.P. Wale

School of Computer Science, North Haugh, St Andrews University, Scotland KY16 9SS (adrian@dcs.st-and.ac.uk, mkw@dcs.st-and.ac.uk)

 

Animate goal-directedness is characterised by highly directed, persistent, and plastic action where paths are continually shifted towards the goal.

This paper aims to provide both a potential basis for improving heuristic search techniques and also new tools for future psychological research.

Firstly it is argued that the use of highly directed plastic persistence should provide a type of heuristic search that contrasts favourably with existing search techniques such as gradient descent and simulated annealing.

Study of this animate organisation is also useful for more specific applications such as designing neural controllers in robots to learn first time, i.e. with a single run of heuristic search. Such an ability may be vital for a truly autonomous robot that may otherwise get stuck in a hostile environment, and yet cannot re-initialise itself physically elsewhere to try again. Likewise, re-initialisation of its control may lead to disjoint and ineffective action.

The paper describes the testing of a theory of how smooth shifts may be made, to see if it can elucidate principles underlying smooth shifts in human movement. Specifically, an experimental framework suggested in [1] is used to create an experiment for measuring potentially constant shifts in goal-directed action. The experiment is designed to be analogous to those of Renaissance scientists seeking to establish constants in inanimate motion. It involves analysing hand movements, made with a mouse, that are similar to those used to make manual gearshifts in a car and that involve a degree of free action.

A simple novel form of differential calculus specially designed for measuring shifts is used to analyse the movement and characterise the smooth shaping and deviation occurring in the shifts. The shift shape is then discussed and put forward as being present in a wide range of animate motion.

 

[1] Wale, A.P. and Weir, M.K., "Measurement and Design of Goal-directed Behaviour", Proceedings of the 7th Neural Computation and Psychology Workshop, World Scientific, 2002.

 

Staged Learning of Saccadic Eye Movements

Wolfram Schenck (presenting author), Ralf Moller

Max-Planck-Institute for Psychological Research, Cognitive Robotics Group, Wolfram Schenck, Amalienstr. 33, D-81249 Munich, Germany (schenck@psy.mpg.de)

 

Saccade control belongs to the broad field of motor control and sensorimotor coordination. For cognitive modeling in this area, adaptive internal models (like inverse models) are of central interest. Inverse models generate motor commands which transform the current into the desired sensory state. For the learning of inverse models, two main problems arise: The missing teacher signal, and the necessity to explore sensorimotor spaces. Several solutions have been proposed, all of them limited in some respect. In the present work, an alternative learning mechanism is developed for the example of saccade control, implemented on a stereo vision robot camera head.

A saccade controller can be seen as an inverse model for a constant sensory goal state in which a chosen target region appears in the foveae of both eyes. In our model, the saccade controller produces three motor parameters as output: pan and tilt of the camera head, and software vergence. As input, it receives the current motor state, and the coordinates of a selected foveation target in both the left and right camera image.

The saccade controller is implemented by a multi-layer perceptron. Training patterns are collected by generating random saccades. The missing teacher problem is solved in the following way: whenever such a random saccade results in a shift of the target position towards the center in both the left and right image, it is included in the training set. After the pattern collection, this set contains for the most part movements resulting in over- or undershoots. In order to learn from these imperfect examples, we exploit the averaging properties of multi-layer perceptrons in our approach. Due to the structure of the training set, the resulting performance of the trained controller network will be close to the optimum saccade.
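The effect of averaging over imperfect examples can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional toy setting in which a saccade command shifts the target offset by exactly the command value, and it replaces the multi-layer perceptron with a single least-squares gain. The function names and all numerical ranges are hypothetical and do not describe the robot camera head.

```python
import random


def collect_improving_saccades(n_trials, rng):
    """Keep random saccades that move the target closer to the image center.

    Toy assumption: after the saccade, the target offset is target - command.
    """
    samples = []
    for _ in range(n_trials):
        target = rng.uniform(-1.0, 1.0)    # target offset from the center
        command = rng.uniform(-2.0, 2.0)   # random saccade command
        if abs(target - command) < abs(target):
            samples.append((target, command))
    return samples


def fit_gain(samples):
    """Least-squares gain g for command = g * target (stand-in for the MLP)."""
    numerator = sum(t * c for t, c in samples)
    denominator = sum(t * t for t, _ in samples)
    return numerator / denominator


rng = random.Random(0)
samples = collect_improving_saccades(20000, rng)
gain = fit_gain(samples)   # close to 1.0, the ideal gain in this toy setting
```

In this toy setting the accepted commands for a given target are spread symmetrically around the ideal command, so overshoots and undershoots cancel in the fit even though almost no individual sample is accurate.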

As a solution to the exploration problem, we propose a staged learning procedure. A new training set is created, this time consisting of samples generated by random variation of the output of the already existing controller. If a random variation results in better foveation than the original controller output, this movement is included in the new training set. With the new set, a new controller can be trained, and this one can again be used for pattern generation. In this way, one can incrementally improve the controller's performance without the need to search from scratch in sensorimotor space for the rare learning examples with very good foveation quality.
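The staged procedure can be sketched in a one-dimensional toy setting. The sketch assumes that the foveation error after a saccade is the target offset minus the command, and it uses a single least-squares gain in place of the multi-layer perceptron. The initial gain, the noise range and the number of stages are arbitrary illustrative choices, not values from the work described.

```python
import random


def fit_gain(samples):
    """Least-squares gain g for command = g * target (stand-in for the MLP)."""
    numerator = sum(t * c for t, c in samples)
    denominator = sum(t * t for t, _ in samples)
    return numerator / denominator


def refine(gain, n_trials, noise, rng):
    """Run one stage: vary the current controller's output and keep improvements."""
    samples = []
    for _ in range(n_trials):
        target = rng.uniform(-1.0, 1.0)
        proposed = gain * target                        # existing controller output
        varied = proposed + rng.uniform(-noise, noise)  # random variation
        # Keep the variation only if it foveates better than the controller did.
        if abs(target - varied) < abs(target - proposed):
            samples.append((target, varied))
    # If no variation improved on the controller, keep the controller unchanged.
    return fit_gain(samples) if samples else gain


rng = random.Random(1)
gains = [0.5]   # hypothetical initial controller that undershoots by half
for _ in range(4):
    gains.append(refine(gains[-1], 5000, 0.3, rng))
```

Because every retained sample is better than the current controller's own output, each stage can only move the fitted gain towards the ideal value of 1.0 in this toy setting.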