Abstract
Selective attention is a fundamental cognitive function that uses top-down signals to orient and prioritize information processing in the brain. Single-cell recordings from behaving monkeys have revealed a number of attention-induced effects on sensory neurons, and have given rise to contrasting viewpoints about the neural underpinning of attentive processing. Moreover, there is evidence that attentional signals originate from the prefrontoparietal working memory network, but precisely how a source area of attention interacts with a sensory system remains unclear. To address these questions, we investigated a biophysically based network model of spiking neurons composed of a reciprocally connected loop of two (sensory and working memory) networks. We found that a wide variety of physiological phenomena induced by selective attention arise naturally in such a system. In particular, our work demonstrates a neural circuit that instantiates the “feature-similarity gain modulation principle,” according to which the attentional gain effect on sensory neuronal responses is a graded function of the difference between the attended feature and the preferred feature of the neuron, independent of the stimulus. Furthermore, our model identifies key circuit mechanisms that underlie feature-similarity gain modulation, multiplicative scaling of tuning curves, and biased competition, and provides specific testable predictions. These results offer a synthetic account of the diverse attentional effects, suggesting a canonical neural circuit for feature-based attentional processing in the cortex.
- feature-based attention
- cortical circuits
- working memory
- sensory systems
- computational model
- top-down control
Introduction
In cluttered environments, efficient vision depends critically on appropriate allocation and deployment of voluntary attention (Desimone and Duncan, 1995; Colby and Goldberg, 1999; Itti and Koch, 2001). Neurobiologists have identified different forms of attentional modulation of neuronal responses in the sensory cortex, notably in visual areas V4 and visual middle-temporal area (MT), and this has led to the formulation of different conceptual models for the physiological action of attention.
Biased competition (BC) (Desimone and Duncan, 1995) holds that attention biases the competition between the representations of stimuli within the receptive field (RF) of the neuron. This is based on experiments showing that neural responses to simultaneous presentation of preferred and nonpreferred stimuli in the RF are intermediate between responses elicited by either stimulus alone, and attention focused onto one of the stimuli leads to responses determined by the attended stimulus, as if the unattended stimulus was absent (Moran and Desimone, 1985; Desimone and Duncan, 1995; Treue and Maunsell, 1996; Reynolds et al., 1999).
Other researchers have proposed the multiplicative gain modulation interpretation (MGM). This is based on experiments showing how the gain of neuronal responses to a single stimulus is enhanced both when attention is focused inside the RF of the neuron (McAdams and Maunsell, 1999) or onto the preferred feature of the neuron (Treue and Martinez-Trujillo, 1999), without affecting the selectivity of the cell.
More recently, it has been observed in MT that attention modulates population activity by enhancing it at the attentional focus and suppressing it in its surround (Martinez-Trujillo and Treue, 2004). This finding can be understood within the feature-similarity gain principle (FSGP) (Treue and Martinez-Trujillo, 1999; Martinez-Trujillo and Treue, 2004; Boynton, 2005; Maunsell and Treue, 2006), which posits that selective attention to a given feature modulates the firing rate of a neuron by a gain factor, which depends on the parametrical similarity between the attended feature (spatial location, orientation, direction, etc.) and the preference of the neuron for that feature. Neurons multiply their firing whenever their preferences are parametrically close to the attended feature; otherwise, neural responses are divided, resulting in a population selectivity enhancement. In contrast with BC, this principle emphasizes that attention modifies neural responses multiplicatively, with the same factor regardless of sensory inputs (Maunsell and Treue, 2006). Physiologically based computational models have been proposed for BC (Deco and Rolls, 2005); in contrast, FSGP has been considered only conceptually (Boynton, 2005). How the precise algorithmic formulation of attentional modulation in FSGP could be instantiated neurophysiologically remains unclear.
Because these conceptual models emphasize contrasting computational algorithms that apply to both spatial and featural attention, it remains unclear whether a unified mechanistic framework can account for all of these experimental results. We addressed this question using a biophysically plausible computational model of two interacting cortical networks of spiking neurons that integrates the source area of the attentional top-down signal. We modeled the visual area MT, selective to direction of motion, and we assumed that it received a top-down attentional signal originating in a working memory area (Desimone and Duncan, 1995; Barceló et al., 2000; Hopfinger et al., 2000; Miller and Cohen, 2001; Corbetta and Shulman, 2002; Lebedev et al., 2004; Hagler and Sereno, 2006; Grent-t Jong and Woldorff, 2007). Note that the hypothesized role of a working memory circuit as a source area of attentional signals still awaits explicit experimental validation in attention-to-motion tasks. In this work, we tested the idea that the interactions between these two areas are sufficient to explain the above-described attentional effects.
In addition to reproducing the experimental observations, our model is the first explicit neurophysiological implementation of FSGP. It also demonstrates the compatibility between FSGP and BC (Boynton, 2005; Deco and Rolls, 2005), assigning them, for the first time, specific (and dissectable) mechanisms within the cortical circuitry. Such a comprehensive account of neurophysiological data of attention in a biophysical network model leads us to suggest that it constitutes the backbone of a “canonical” neural circuit for feature-based attentional processing.
Materials and Methods
The model network.
Each of the two network modules represents a local circuit of the cortex. The sensory network represents a local circuit of the MT, and we refer to the working memory module as a local circuit of the prefrontal cortex (PFC) for the sake of simplicity, although working memory and selective attention are likely to be subserved by both prefrontal and parietal cortices (Colby and Goldberg, 1999; Hopfinger et al., 2000; Corbetta and Shulman, 2002; Lebedev et al., 2004; Moore, 2006; Grent-t Jong and Woldorff, 2007). In addition, there is anatomical evidence of reciprocal connections between PFC and MT (Barbas, 1988; Schall et al., 1995; Burman et al., 2006). The MT and PFC circuits had exactly the same wiring structure; they only differed in the strength of the synaptic connectivity within each module: the PFC module had strong recurrent excitatory connections to sustain persistent activity, whereas MT was dominated by inhibition. A detailed account of the local circuit model can also be found in the study by Compte et al. (2000). For each circuit, pyramidal cells (NE = 1024) and interneurons (NI = 256) were spatially distributed on a ring simulating the cortical columnar organization, labeled by their preferred direction of motion (θpref, from 0 to 360°). Their axonal collaterals differentially targeted neighboring (isodirectional) and distant (crossdirectional) neurons. This was implemented by taking the synaptic conductance between neuron i and neuron j to be $g_{syn,ij} = W(\theta_i - \theta_j)\,G_{syn}$, where $W(\theta_i - \theta_j)$ was either a constant for unstructured connections ($W(\theta_i - \theta_j) = 1$), or the sum of a constant term plus a Gaussian: $W(\theta_i - \theta_j) = J^- + (J^+ - J^-)\exp(-(\theta_i - \theta_j)^2 / 2\sigma^2)$. In both networks, only the excitatory-to-excitatory connectivity was structured with σEE = 14.4° and JEE+ = 1.62 (Compte et al., 2000).
The excitatory-to-inhibitory, inhibitory-to-excitatory, and inhibitory-to-inhibitory connections were unstructured (i.e., the cross-directional and isodirectional components of feedback inhibitory connections were equally strong). Following the notations in the study by Compte et al. (2000), the parameters defining the strengths of local connections in the two networks were as follows: in PFC, GEE,AMPA = 0.391 nS, GEE,NMDA = 0.732 nS (pyramid-to-pyramid); GEI,AMPA = 0.293 nS, GEI,NMDA = 0.566 nS (pyramid-to-interneuron); GIE = 3.74 nS (interneuron-to-pyramid); GII = 2.87 nS (interneuron-to-interneuron); in MT: GEE,AMPA = 0.005 nS, GEE,NMDA = 0.093 nS (pyramid-to-pyramid); GEI,AMPA = 0.005 nS, GEI,NMDA = 0.195 nS (pyramid-to-interneuron); GIE = 1.47 nS (interneuron-to-pyramid); GII = 0.391 nS (interneuron-to-interneuron). Thus, recurrent excitation was between 1 and 2 orders of magnitude stronger in PFC than in MT, and synaptic inhibition was very strong in both modules.
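As an illustration of the connectivity rule described above, the following Python sketch evaluates the footprint W for pairs of neurons on the ring and scales it by a conductance Gsyn. The value of J− is not stated in this section, so `j_minus` is a placeholder assumption, and all function names are hypothetical rather than taken from the original simulation code.

```python
import math

def circular_diff(theta_i, theta_j):
    """Signed angular difference in degrees, wrapped to [-180, 180)."""
    return (theta_i - theta_j + 180.0) % 360.0 - 180.0

def footprint(theta_i, theta_j, j_plus=1.62, j_minus=0.9, sigma=14.4):
    """W = J- + (J+ - J-) * exp(-d^2 / (2 sigma^2)).

    j_plus and sigma are the E-to-E values quoted in the text;
    j_minus = 0.9 is a placeholder, not a value from the paper.
    """
    d = circular_diff(theta_i, theta_j)
    return j_minus + (j_plus - j_minus) * math.exp(-d * d / (2.0 * sigma ** 2))

def synaptic_conductance(theta_i, theta_j, g_syn):
    """g_syn,ij = W(theta_i - theta_j) * G_syn (same units as g_syn)."""
    return footprint(theta_i, theta_j) * g_syn

def preferred_directions(n=1024):
    """Evenly spaced preferred directions on the ring, in degrees."""
    return [360.0 * k / n for k in range(n)]
```

For unstructured connections the footprint is simply 1, so the conductance equals Gsyn for every pair.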
Both pyramidal cells and interneurons were modeled as leaky integrate-and-fire neurons, with the same parameters as for neurons in the network model of Compte et al. (2000). Specifically, each type of cell was characterized by six intrinsic parameters: the total capacitance, Cm; the total leak conductance, gL; the leak reversal potential, EL; the threshold potential, Vth; the reset potential, Vres; and the refractory time, τref. The values used were as follows: Cm = 0.5 nF, gL = 25 nS, EL = −70 mV, Vth = −50 mV, Vres = −60 mV, and τref = 2 ms for pyramidal cells; and Cm = 0.2 nF, gL = 20 nS, EL = −70 mV, Vth = −50 mV, Vres = −60 mV, and τref = 1 ms for interneurons. All cells received random background excitatory inputs. This unspecific external input was modeled as uncorrelated Poisson spike trains to each neuron at a rate of vext = 1800 Hz per cell (or equivalently, 1000 presynaptic Poisson spike trains at 1.8 Hz). This input was exclusively mediated by AMPA receptors (AMPARs), with the maximum conductances gext,E = 3.1 nS on pyramidal cells and gext,I = 2.38 nS on interneurons in PFC; and gext,E = 15 nS and gext,I = 4.5 nS in MT.
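To make the intrinsic parameters concrete, the sketch below computes the rheobase current and the firing rate of a leaky integrate-and-fire neuron driven by a constant current. It is a noise-free simplification: the model neurons receive fluctuating Poisson background input, so their actual rates will differ. The dictionary layout and function names are assumptions made for this example; units are nF, µS, mV, ms, and nA.

```python
import math

# Intrinsic parameters quoted in the text (25 nS = 0.025 uS, 20 nS = 0.020 uS).
PYRAMIDAL = dict(cm=0.5, gl=0.025, el=-70.0, vth=-50.0, vres=-60.0, tref=2.0)
INTERNEURON = dict(cm=0.2, gl=0.020, el=-70.0, vth=-50.0, vres=-60.0, tref=1.0)

def rheobase(p):
    """Smallest constant current (nA) that drives the cell to threshold."""
    return p["gl"] * (p["vth"] - p["el"])

def lif_rate(p, current):
    """Firing rate (Hz) of a noise-free LIF neuron for a constant current (nA)."""
    v_inf = p["el"] + current / p["gl"]      # steady-state voltage without threshold
    if v_inf <= p["vth"]:
        return 0.0                           # subthreshold: no spikes
    tau_m = p["cm"] / p["gl"]                # membrane time constant in ms
    t_charge = tau_m * math.log((v_inf - p["vres"]) / (v_inf - p["vth"]))
    return 1000.0 / (p["tref"] + t_charge)   # interspike interval is in ms
```

With these values the membrane time constant is 20 ms for pyramidal cells and 10 ms for interneurons, and a pyramidal cell driven by 1 nA fires at roughly 99 Hz in this idealized setting.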
Neurons received their recurrent excitatory inputs through AMPAR- and NMDA receptor (NMDAR)-mediated transmission and their inhibitory inputs through GABAA receptors (GABAARs). These conductance-based synaptic responses were calibrated by the experimentally measured dynamics of synaptic currents. Thus, postsynaptic currents were modeled according to $I_{syn} = g_{syn}\, s\, (V - V_{syn})$, where gsyn is a synaptic conductance, s is a synaptic gating variable, and Vsyn is the synaptic reversal potential (Vsyn = 0 for excitatory synapses; Vsyn = −70 mV for inhibitory synapses). AMPAR and GABAAR synaptic gating variables were modeled as an instantaneous jump of magnitude 1 when a spike occurred in the presynaptic neuron followed by an exponential decay with time constant 2 ms for AMPA and 10 ms for GABAA. The NMDA conductance was voltage dependent, with gsyn multiplied by $1/(1 + [\mathrm{Mg}^{2+}] \exp(-0.062\, V_m)/3.57)$, with $[\mathrm{Mg}^{2+}]$ = 1.0 mM. The channel kinetics was modeled by the following equations:
$$\frac{ds}{dt} = -\frac{s}{\tau_s} + \alpha_s\, x\, (1 - s), \qquad \frac{dx}{dt} = -\frac{x}{\tau_x} + \sum_i \delta(t - t_i),$$
where s is the gating variable, x is a synaptic variable proportional to the neurotransmitter concentration in the synapse, ti are the presynaptic spike times, τs = 100 ms is the decay time of NMDA currents, τx = 2 ms controls the rise time of NMDAR channels, and αs = 0.5 kHz controls the saturation properties of NMDAR channels at high presynaptic firing frequencies. Parameters for synaptic transmission were taken from the study by Compte et al. (2000).
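The NMDAR description can be illustrated with a short Python sketch. It assumes the second-order kinetics used in this family of models, ds/dt = −s/τs + αs x (1 − s) and dx/dt = −x/τx + Σ δ(t − ti), integrated here with a simple forward-Euler scheme for brevity (not the integration method of the original simulations). Function names are hypothetical; times are in ms, so αs = 0.5 kHz becomes 0.5 per ms.

```python
import math

def mg_block(v_mv, mg_mM=1.0):
    """Voltage-dependent factor multiplying the NMDA conductance."""
    return 1.0 / (1.0 + mg_mM * math.exp(-0.062 * v_mv) / 3.57)

def nmda_gating(spike_times_ms, t_end_ms, dt=0.02,
                tau_s=100.0, tau_x=2.0, alpha_s=0.5):
    """Return the gating variable s at t_end_ms for a presynaptic spike train."""
    spikes = sorted(spike_times_ms)
    next_spike = 0
    s = 0.0
    x = 0.0
    for step in range(int(round(t_end_ms / dt))):
        t = step * dt
        ds = -s / tau_s + alpha_s * x * (1.0 - s)
        dx = -x / tau_x
        s += dt * ds
        x += dt * dx
        # Each presynaptic spike in [t, t + dt) increments x by 1 (delta input).
        while next_spike < len(spikes) and spikes[next_spike] < t + dt:
            x += 1.0
            next_spike += 1
    return s
```

The (1 − s) factor keeps s below 1, which is how αs shapes saturation at high presynaptic rates, while `mg_block` shows that the NMDA conductance is strongly reduced near rest and largely relieved near 0 mV.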
The MT and PFC network modules were interconnected through bottom-up and top-down AMPAR-mediated connections (see scheme in Fig. 1). Both bottom-up and top-down connectivities were topographic, so that for both the bottom-up and top-down pathways neurons sharing the same preference were more strongly coupled than neurons with disparate preferences. This connectivity was described by a Gaussian function: $g_{syn} = G_{syn} \exp(-(\theta_i - \theta_j)^2 / 2\sigma^2) / (\sigma\sqrt{2\pi})$. We used for the bottom-up connection onto PFC pyramids, $G_{EE}^{MT \to PFC}$ = 0.005 nS and σ = 36°; for the bottom-up connection onto PFC interneurons, $G_{EI}^{MT \to PFC}$ = 0; for the top-down connection onto MT pyramids, $G_{EE}^{PFC \to MT}$ = 0.146 nS and σ = 72°; and for the top-down connection onto MT interneurons, $G_{EI}^{PFC \to MT}$ = 0.039 nS and σ = 72°.
The simulations.
The simulation protocol was chosen to resemble the behavioral protocol used in the experiment of Martinez-Trujillo and Treue (2004). There, monkeys were trained to fixate a central spot during a brief presentation of a peripheral random-dot pattern in coherent motion, which was the stimulus to be attended. Subsequently, an additional random-dot pattern in coherent motion was added in the receptive field of the neuron (test stimulus), which was behaviorally irrelevant and might or might not share the direction of motion of the attended stimulus. The monkey had to report a direction or speed change in the attended stimulus and ignore changes in the test stimulus. The experiment revealed an attentional modulation of the neuronal responses to the unattended test stimulus that depended on the attended direction of motion (feature-based attention). Here, motion stimulus presentation to the network was modeled through selective transient current injection to MT cells (see below, Task-related extrinsic inputs). We included a delay period or D-period (see D in Figs. 2A, 4A), between the presentation of an attended feature (cue period or C-period) (see C in Figs. 2A, 4A) and the presentation of the test stimulus (test period or T-period) (see T in Figs. 2A, 4A). During the D-period, the visual stimulus was absent. By including this period, we were able to evaluate the effect of an attentional bias on the MT network baseline activity (see Fig. 4C).
Task-related extrinsic inputs.
Cells in area MT received external inputs from primary visual area V1, which were selective to the direction of motion of the visually presented stimulus (Born and Bradley, 2005). We thus modeled motion stimuli presentation by injecting external currents to MT neurons that mimicked outputs from V1 to MT. We also tested Poisson-triggered synaptic inputs, and our conclusions remained unaffected. When there was a single motion direction (θS), the current injected to a neuron labeled by θi was $I(\theta_i) = I_0 + I_1 \exp(\mu(\cos(\theta_i - \theta_S) - 1))$; for MT pyramids, we used I0E = 1 nA and I1E = 0.9 nA; for MT interneurons, we used I0I = 0.2 nA and I1I = 0.18 nA; and for both cell types, μ = 2.53 (this choice of μ gives a connectivity profile very close to a Gaussian with a constant baseline, with the same width as MT-to-PFC connections). When two overlapping directions of motion were visually presented, the current impinging on MT neurons was the sum of the currents corresponding to the two single stimuli, normalized so that the maximal current was still I0 + I1 (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material).
This normalization was derived from the observation that the maximal response of a direction-selective V1 neuron remains the same for either single motion or transparent motion stimuli (Snowden et al., 1991). More abstract models of V1 neurons selective to motion direction typically include a similar normalizing factor (Simoncelli and Heeger, 1998; Rust et al., 2006). PFC model neurons received motion-specific sensory inputs only through the MT-to-PFC pathway.
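The stimulus currents described above can be sketched in Python as follows. The single-stimulus profile follows the stated formula; for two overlapping directions, the text only says that the summed current is normalized so that its maximum stays at I0 + I1, so the uniform rescaling used here is one possible reading and should be treated as an assumption. Function names are hypothetical.

```python
import math

def single_current(theta_pref, theta_s, i0=1.0, i1=0.9, mu=2.53):
    """I = I0 + I1 * exp(mu * (cos(theta_pref - theta_s) - 1)), in nA.

    Defaults are the MT pyramidal-cell values quoted in the text.
    """
    d = math.radians(theta_pref - theta_s)
    return i0 + i1 * math.exp(mu * (math.cos(d) - 1.0))

def transparent_currents(prefs, theta_1, theta_2, i0=1.0, i1=0.9, mu=2.53):
    """Summed currents for two directions, rescaled so the peak is I0 + I1.

    Assumption: the normalization is a single multiplicative factor
    applied to the whole population profile.
    """
    summed = [single_current(p, theta_1, i0, i1, mu) +
              single_current(p, theta_2, i0, i1, mu) for p in prefs]
    scale = (i0 + i1) / max(summed)
    return [scale * c for c in summed]
```

For MT interneurons the same functions apply with `i0=0.2` and `i1=0.18`.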
In all our simulation trials and during the attentional C-period, all PFC neurons also received a constant current injection of 0.025 nA. This current was not selective, and thus it did not carry any direction of motion information. It was too weak to trigger by itself a persistent activity pattern in the PFC network (see Fig. 2A, left), but strong enough so that, when presented coincidentally with a visual stimulus, the PFC was able to store the directional information from MT (see Fig. 2A, right). Such a “gating input” allows our model to differentiate an attentional cue from a visual stimulus presented during the T-period in Figure 2A, left.
The integration method used was a second-order Runge–Kutta algorithm with a time step of Δt = 0.02 ms. The custom code for the simulations was written in C++.
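For readers unfamiliar with the integration scheme, here is a minimal Python sketch of a second-order Runge–Kutta step applied to the subthreshold membrane equation. The text does not say which second-order variant was used, so the midpoint rule below is an assumption, and the code is an illustration rather than a port of the original C++ implementation. Units are nF, µS, mV, ms, and nA.

```python
import math

def rk2_step(f, t, y, dt):
    """One midpoint (second-order Runge-Kutta) step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
    return y + dt * k2

def membrane_rhs(current_na, cm=0.5, gl=0.025, el=-70.0):
    """Right-hand side of Cm dV/dt = -gL (V - EL) + I for a pyramidal cell."""
    def f(t, v):
        return (-gl * (v - el) + current_na) / cm
    return f

def integrate(f, v0, t_end_ms, dt=0.02):
    """Integrate from t = 0 to t_end_ms with fixed step dt (ms)."""
    v = v0
    for k in range(int(round(t_end_ms / dt))):
        v = rk2_step(f, k * dt, v, dt)
    return v
```

With a subthreshold current of 0.25 nA, the result after one membrane time constant (20 ms) can be checked against the analytic solution V(t) = −60 − 10 exp(−t/20) mV.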
Results
The network model architecture
We built a network model of spiking neurons composed of two interacting areas, a sensory area selective for motion direction (MT) and a working memory area that selectively stored this information. The internal structure in each of the two local networks is in accordance with the known anatomical and physiological characteristics of cortical microcircuitry. The interareal reciprocal connections followed some simple rules, based also on biological plausibility: neurons with similar preferred directions were more strongly connected (following a Gaussian function), and synapses were all excitatory, but could target both pyramidal cells and interneurons. For explicit details, see Materials and Methods.
Our model was constrained based on a number of specific experimental results in area MT. On the one hand, neural responses to a motion stimulus in the receptive field have been quantitatively characterized (Maunsell and Van Essen, 1983; Snowden et al., 1992). On the other hand, there is evidence that the circuits in area MT are endowed with competition mechanisms, because the spiking response of an MT cell (but not a V1 cell) is suppressed when two superimposed moving random dot patterns are presented (Snowden et al., 1991; Treue et al., 2000) (see scheme of these stimuli in Fig. 1). We used these data to constrain our MT network model. To this end, we used both full-scale simulations (see Materials and Methods) and a mean-field approximation (Renart et al., 2003) of the MT network to allow for extensive parameter space exploration. We found that the appropriate competitive responses could be realized (supplemental Fig. 1, available at www.jneurosci.org as supplemental material) if bottom-up inputs into MT also targeted local-circuit inhibitory neurons. Interneurons project onto excitatory cells strongly to provide inhibition commensurate with the overall feedforward drive, thereby instantiating a circuit mechanism for normalization (Simoncelli and Heeger, 1998; Rust et al., 2006). The downstream working memory area was modeled as in the study by Compte et al. (2000). This module will be referred to as prefrontal cortex (PFC) module for the sake of simplicity, although working memory and selective attention are likely to be subserved by both prefrontal and parietal cortices (Colby and Goldberg, 1999; Hopfinger and Mangun, 2000; Corbetta and Shulman, 2002; Lebedev et al., 2004; Moore, 2006; Grent-t Jong and Woldorff, 2007). In this model, working memory of a directional cue is achieved through reverberatory interactions between spiking neurons in the local network.
Thus, the two networks in our model share the same qualitative internal architecture, but the PFC module is endowed with strong recurrent excitation, whereas the MT network is dominated by inhibition. Both cortical network modules are reciprocally connected with topographically specific bottom-up and top-down synaptic connections (Fig. 1) to explore the orienting effects of a selective firing pattern in the PFC network over the population activity and single neuron responses in the MT network. The bottom-up connection parameters were tuned to allow the transmission of visual information from the MT to the PFC module.
The parameters of the top-down connection were tuned to produce selectivity enhancement of MT neural population responses (Fig. 2C) in agreement with experimental data (Martinez-Trujillo and Treue, 2004). We will call this selective enhancement with inhibitory surround of population responses, where by inhibitory surround we mean that peak responses are enhanced and surround responses are suppressed. Such an inhibitory ring around the attentional focus has been recently validated in imaging studies in humans as well (Hopf et al., 2006). The selectivity enhancement of population profiles is a relevant finding, because what matters functionally are instantaneous population activity patterns rather than neuronal tuning curves obtained from multiple trials.
The rest of the phenomenology reported here (see Figs. 3–5) emerged from the model constrained this way, without any further parameter tuning.
Attentional enhancement of population selectivity
A single-trial simulation (Fig. 2A, right) consisted of three task epochs. In a cue period (C), a transient input about attended motion direction (θA) triggered a self-sustained persistent activity (peaked at the attended directional angle θA) in the PFC network. In simulations, this was done by a combination of a directional stimulus (θA) to the MT network, whose activity was projected through weak bottom-up connections to PFC, and a transient nonspecific input to the PFC module (see Materials and Methods). This “gating input” was weak enough that it did not trigger by itself persistent activity in the PFC module (Fig. 2A, left), but it allowed this activity to develop if a stimulus was simultaneously presented to the MT network (Fig. 2A, right). A plausible physiological substrate for this input could be found in the phasic alertness circuits recently identified in the superior temporal gyrus or in the thalamus (Sturm and Wilmes, 2001; Fan et al., 2005; Thiel and Fink, 2007). We thus assume that projections from these areas generate a slight net increase of unspecific external input to PFC neurons during the C-period of our task. The C-period was followed by a delay period (D), in which, in the absence of all external inputs, PFC maintained the information of the attended feature, if presented. Finally, in a test period (T), a test stimulus θS was presented to the MT network.
For comparison, when no attentional cue was shown in a stimulation trial (Fig. 2A, left), no persistent activity was produced in the PFC network, nor was there top-down signal to modulate the MT network response during the T-period. In this example, the attended direction and the stimulus were the same (θA = θS = 0°).
As can be seen in Figure 2B, the spiking response of a neuron with θpref = 0° was enhanced by the attentional signal (red) compared with control (black). The population activity pattern (the average firing rate during the T-period plotted for all neurons) exhibits sharpened selectivity, similar to that observed experimentally (Martinez-Trujillo and Treue, 2004): neural activity was increased at the focus of attention but suppressed on the surrounds (Fig. 2C, red) compared with the unattended case (black) (selective enhancement with inhibitory surround). Such sharpening of population activity occurred because the top-down projection from PFC not only provided local excitation, but also targeted MT interneurons that then projected unspecifically onto MT excitatory neurons and suppressed firing on the flanks (supplemental Fig. 2, available at www.jneurosci.org as supplemental material). Indeed, attention strongly increased peak inhibitory firing rate (∼25 Hz). When computed as a percentage increase from baseline firing, this represents the same modulation (35%) as for excitatory neurons (Mitchell et al., 2007).
We confirmed that this enhanced selectivity was robust to parameter variations in the top-down projection, especially if changes of top-down synapses onto excitatory neurons and those onto inhibitory cells were approximately balanced (supplemental Fig. 3A,C, available at www.jneurosci.org as supplemental material). The selectivity enhancement is quantified by the ratio of firing rates with and without attention (Fig. 2C), called modulation ratio (Martinez-Trujillo and Treue, 2004), plotted as a function of the difference between the attended direction θA and the preference of the neuron θpref. The modulation ratio curve of the model (Fig. 2D) was quantitatively similar to that reported experimentally (Martinez-Trujillo and Treue, 2004).
The curve can be well fitted by G(θpref − θA) = 1.05 + 0.3cos(θpref − θA). In particular, note that, for large θpref − θA, the modulation ratio was smaller than 1, indicating the suppression of responses away from the attentional focus (selective enhancement with inhibitory surround).
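The fitted modulation ratio can be evaluated directly. The short Python sketch below uses only the fit quoted above; the function names are hypothetical. It also computes the angular distance at which the fit crosses 1, that is, where enhancement turns into suppression.

```python
import math

def modulation_ratio(delta_deg):
    """Fitted gain G = 1.05 + 0.3 cos(theta_pref - theta_A), angle in degrees."""
    return 1.05 + 0.3 * math.cos(math.radians(delta_deg))

def crossover_angle():
    """Angle (degrees) at which G = 1, i.e. cos(delta) = -0.05 / 0.3."""
    return math.degrees(math.acos(-0.05 / 0.3))
```

According to this fit, the gain is 1.35 at the attended direction and 0.75 at the opposite direction, and responses are suppressed for neurons whose preference lies more than about 100° from the attended direction.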
Multiplicative gain modulation of tuning curves
We next examined whether the modulation ratio of a given neuron remained constant regardless of the stimulus direction θS. We ran a series of simulations with a fixed attended direction while varying from trial to trial the stimulus θS presented during the T-period (Fig. 3A). We observed that the tuning curve of a neuron was multiplicatively scaled by attention from the unattended tuning curve, whether the attended direction was the preferred (“att pref,” θA = 0°) or nonpreferred (“att null,” θA = 180°) (Fig. 3B). This MGM of tuning curves was robust to modifications in the top-down input (supplemental Fig. 3B,D, available at www.jneurosci.org as supplemental material) and concurred with experimental data obtained for spatial attention (McAdams and Maunsell, 1999; Treue and Martinez-Trujillo, 1999). Several neurophysiological mechanisms for multiplicative neuronal responses have been proposed (Salinas and Abbott, 1996; Chance et al., 2002; Hansel and van Vreeswijk, 2002; Murphy and Miller, 2003), but none has been explicitly tested in a biophysical, recurrent network model constrained by multiple neurophysiological data of attentionally modulated sensory responses. We found that the relationship between the firing rate r and the total external input current (the sum of bottom-up and top-down inputs, IS + IA) in our MT excitatory cells was very well described by a power law $r = a(I_S + I_A + c)^b$ (Fig. 3C). This supports the scenario described by Hansel and van Vreeswijk (2002) and Murphy and Miller (2003), namely a power-law input–output relationship that transforms additive inputs into approximately multiplicative outputs. The fit parameters a and b are the same in all task conditions (no attention, attention to preferred motion, and attention to null motion), but the parameter c differed for attentional and nonattentional conditions.
This can be readily understood as follows: c incorporates all of the rest of currents impinging on the neurons apart from IS and IA, and in particular it contains an important contribution from nonspecific inhibitory inputs from local circuit interneurons. Because inhibitory neurons in the MT network also receive top-down input, the inhibitory population average activity increases in attention trials and c decreases for those trials, as shown in Figure 3C. An explicit mathematical description of how this power-law input–output relationship accounts for our simulation results can be found in the supplemental Methods and supplemental Figure 5 (available at www.jneurosci.org as supplemental material).
Consistent with this multiplicative scaling of neural tuning curves, the neural firing response can be expressed as follows: R(θpref, θA, θS) = G(θpref − θA)R0(θpref − θS), where R0(θpref − θS) is the neural activity in the unattended condition, and G(θpref − θA) is the attention-induced multiplicative factor. Note that this equation can be used to describe both population activity (with a fixed θS while θpref is varied) and tuning curve of a single neuron (with a fixed θpref while varying θS). In particular, it can be seen from this equation why single-neuron data of Martinez-Trujillo and Treue (2004) represent population activity. In that experiment, θA and θS were covaried while keeping θA − θS constant in all trials so that R(θpref, θA, θS) only depended on θpref − θS. Therefore, under this special condition, single-neuron responses (fixed θpref, across θS) are equivalent to the activity pattern of the neural population (fixed θS, across θpref).
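The equivalence between single-neuron and population measurements stated above can be checked numerically. In the Python sketch below, the gain G is the cosine fit reported earlier in the Results, whereas the unattended tuning curve `r0` is a purely hypothetical shape (its baseline, amplitude, and width are placeholders chosen for illustration, not model output).

```python
import math

def gain(delta_deg):
    """Attentional gain G(theta_pref - theta_A) from the reported fit."""
    return 1.05 + 0.3 * math.cos(math.radians(delta_deg))

def r0(delta_deg, base=5.0, amp=40.0, mu=2.53):
    """Hypothetical unattended response R0(theta_pref - theta_S), in Hz."""
    return base + amp * math.exp(mu * (math.cos(math.radians(delta_deg)) - 1.0))

def response(theta_pref, theta_a, theta_s):
    """R = G(theta_pref - theta_A) * R0(theta_pref - theta_S)."""
    return gain(theta_pref - theta_a) * r0(theta_pref - theta_s)

# Population view: fixed stimulus (theta_A = theta_S = 0), varying preference.
population = [response(pref, 0.0, 0.0) for pref in (0.0, 30.0, 60.0, 90.0)]
# Single-neuron view: fixed preference 0, with theta_A and theta_S covaried.
single_neuron = [response(0.0, -s, -s) for s in (0.0, 30.0, 60.0, 90.0)]
```

Because θA − θS is held constant, both lists depend only on θpref − θS and therefore coincide, which is the special condition under which a single-neuron recording stands in for the population profile.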
Biased competition phenomenology and feature-similarity gain principle
Our model therefore reproduces salient experimentally observed effects: competition between stimuli in the sensory cortex (supplemental Fig. 1, available at www.jneurosci.org as supplemental material) (Snowden et al., 1991; Treue et al., 2000), attentional enhancement of population response selectivity (Fig. 2C) (Martinez-Trujillo and Treue, 2004), and attentional modulation of the neuronal gain (Fig. 3) (Treue and Martinez-Trujillo, 1999).
How closely does our model behavior adhere to either BC or FSGP? According to BC phenomenology, attention focused on one of two simultaneously presented stimuli should bring the activity of a neuron toward its firing rate when the attended stimulus is presented alone. FSGP, however, predicts that the attentional modulation factor of a neuron, deduced using a given stimulus, should be applicable to neural responses to other, arbitrary stimuli. We tested both of these predictions by looking at how a top-down input from the PFC network affected the processing of two stimuli presented simultaneously to the MT network during the T-period (Fig. 4A). In MT, this would correspond to two superimposed random dot patterns moving in different directions (transparent motion) (see right stimulus in Fig. 1) (Snowden et al., 1991; Treue et al., 2000), just one of them being behaviorally relevant (attended stimulus). Thus, FSGP requires that the modulation-ratio curve (Fig. 2D) should predict the population response to attended transparent motion from the corresponding activity in the unattended condition. This is indeed the case as shown in the full model simulation (Fig. 4B). The impressive agreement with the prediction demonstrates that FSGP naturally emerges in our microcircuit model. To strengthen further this point, we looked at the network response during the delay period, when no stimulus was being presented and only the baseline MT activity was affected by the top-down biasing input. Indeed, there is experimental evidence for the attentional modulation of baseline activity in extrastriate visual cortex (Chelazzi et al., 1993; Luck et al., 1993; Ferrera et al., 1994; Luck et al., 1997; Kastner et al., 1999; Bisley et al., 2004). 
Remarkably, even for the baseline delay responses in the MT network, which are significantly lower than evoked responses, attentional effects are extremely well predicted by the modulation ratios evaluated from the single motion stimulus presentation (Fig. 4C). All of this could be synthetically represented by plotting together the modulation ratio curves for each of the three types of stimulations that we used: no stimulus, single motion, and transparent motion. The three curves are indistinguishable from each other (Fig. 4D), confirming that our network model is a neurophysiologically plausible implementation of the FSGP (Treue and Martinez-Trujillo, 1999).
However, BC should manifest at the level of single neuron responses. As we mentioned before, we constrained the MT module network to reproduce competition in the presence of transparent motion stimuli (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). As shown in Figure 5, A and C (dark gray), the response of a neuron to transparent motion, with one component at its preferred direction and the other at the null direction, is halfway between the responses when either motion component is presented alone (Fig. 5A,C, black and light gray). When attention is directed at either of the two simultaneously presented motion directions, the neuronal firing rate is pulled toward its value when the attended stimulus is presented alone (Fig. 5B,D, dark and light green). Therefore, our model reproduces the BC and proves its consistency with MGM at the neuronal level (Fig. 3).
Dissecting circuit mechanisms
The neurophysiologically explicit model allowed us to dissect the circuit requirements for each of the multiple forms of attentional effects: selective enhancement with inhibitory surround of population responses, neuronal multiplicative scaling, and biased competition phenomenology. To this end, we differentially disabled certain synaptic pathways in the model. Such manipulations led to modifications in the three qualitative behaviors of the model that underlie the implementation of the attentional effects (Fig. 6A, data replotted from previous figures): sensory competition between stimuli (left), selective enhancement with inhibitory surround of population responses (middle), and neural gain modulation (right). When background Poisson inputs to MT neurons (see Materials and Methods) were reduced, rendering the input–output relationship of network neurons linear (Fig. 6B, right), the model still showed a suppressed maximal response to two stimuli with respect to isolated stimuli (competition; left panel), and attention still enhanced the selectivity of population activity by increasing activity at the peak and suppressing it at the flanks (selective enhancement with inhibitory surround; middle panel). However, neuronal multiplicative scaling no longer held, because rescaled neuronal tuning curves for different attentional conditions did not overlap (right panel). Consequently, the modulation ratio depended on the stimulus presented and was different for single motion and transparent motion (right, left inset). Thus, strong unspecific and independent background synaptic inputs were necessary in our model to satisfy the FSGP.
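The overlap criterion for rescaled tuning curves can be made concrete with a least-squares gain fit. The following is a minimal sketch under stated assumptions: the Gaussian tuning curve, the gain of 1.2, and the additive shift of 4 spikes/s are invented for illustration, and `scaling_misfit` is a hypothetical helper rather than the analysis used for Figure 6.

```python
import math

def tuning(theta, pref=0.0, amp=30.0, width=35.0, base=3.0):
    """Toy direction tuning curve (spikes/s) of a single neuron."""
    d = (theta - pref + 180.0) % 360.0 - 180.0
    return base + amp * math.exp(-0.5 * (d / width) ** 2)

def scaling_misfit(attended, unattended):
    """Fit attended ~ g * unattended; return g and the relative residual."""
    num = sum(a * u for a, u in zip(attended, unattended))
    den = sum(u * u for u in unattended)
    g = num / den  # least-squares gain
    resid = math.sqrt(sum((a - g * u) ** 2 for a, u in zip(attended, unattended)))
    norm = math.sqrt(sum(a * a for a in attended))
    return g, resid / norm

thetas = [float(t) for t in range(-180, 180, 15)]
unatt = [tuning(t) for t in thetas]

# Case 1: attention multiplies the whole curve (rescaled curves overlap).
g_scaled, misfit_scaled = scaling_misfit([1.2 * r for r in unatt], unatt)
# Case 2: attention adds a constant (rescaled curves cannot overlap).
g_shifted, misfit_shifted = scaling_misfit([r + 4.0 for r in unatt], unatt)

print("scaled curve:  gain", g_scaled, "misfit", misfit_scaled)
print("shifted curve: gain", g_shifted, "misfit", misfit_shifted)
```

A misfit near zero indicates that one gain factor maps the unattended curve onto the attended one. A clearly nonzero misfit corresponds to the non-overlapping curves described above.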
When the inputs to MT interneurons (from both the bottom-up afferents and MT excitatory neurons) were blocked (Fig. 6C), competition was abolished (left panel), because the response of a neuron to superimposed preferred and nonpreferred stimuli exceeded the response to the preferred stimulus alone (left panel, inset). In contrast, selective enhancement with inhibitory surround of population activity (middle panel) and multiplicative gain of tuning curves (right panel) remained intact. However, when the top-down and recurrent inputs to MT interneurons were blocked (Fig. 6D), competition (left panel) and multiplicative scaling of the tuning curve (right panel) were not affected qualitatively, but the selectivity of population activity was no longer sharpened through off-focus suppression, abolishing both selective enhancement with inhibitory surround (middle panel) and correct attentional biasing of stimulus competition (middle panel, inset). Blocking only one of the pathways, rather than a pair simultaneously, was not sufficient to induce the observed qualitative changes (supplemental Fig. 4, available at www.jneurosci.org as supplemental material). These manipulations showed that BC was generated through an interplay of sensory competition (Fig. 6C) and selective enhancement with inhibitory surround of population activity (Fig. 6D), whereas FSGP relied essentially on the neuronal multiplicative mechanism (Fig. 6A). Thus, the two conceptual principles, BC and FSGP, are mechanistically dissociable.
Discussion
We present here a biophysically detailed microcircuit model of two interacting cortical areas that integrates top-down input generation in a working memory module and the ensuing attentional modulations in a sensory module. We show that such a network is able to account for most experimental evidence of feature-based attentional modulations of neural and population responses in area MT.
The biophysical detail of our model allows us to formulate plausible mechanistic bases for current conceptual interpretations of attentional processing at the microcircuit level. In particular, we provide the first biophysical instantiation of the feature-similarity gain principle (Treue and Martinez-Trujillo, 1999) in a cortical network, which is also compatible with the biased competition principle. Note that we demonstrated the consistency between FSGP and BC in a model that can only represent a single feature of the stimulus. Making explicit the interactions between these two principles in the context of mixed featural and spatial attention tasks would require a significantly more complex computational model that incorporates two continuous dimensions of the stimulus.
We want to emphasize here that, although there is general agreement on the significance of biased competition phenomenology for attention, there is a diversity of views on how this might be mechanistically implemented (Desimone, 1992). Indeed, although the original formulation of the biased competition interpretation (Desimone and Duncan, 1995) did not provide a specific mechanistic framework, some later schematic mathematical implementations (Reynolds et al., 1999; Reynolds and Chelazzi, 2004) suggested that attentional modulations could alter feedforward inputs to extrastriate cortex to generate BC phenomenology. This mechanism underlies a number of computational network models of BC (Hamker, 2004, 2005). Other mechanistic models of biased competition instead attribute attentional modulations to local inhibitory interactions between neurons activated by both bottom-up and top-down inputs (Deco and Rolls, 2005). Our model is also a BC model and is in line with this last family of models, and it shows how they can also manifest the feature-similarity gain principle. Note that other BC models (Reynolds et al., 1999, 2000; Reynolds and Chelazzi, 2004; Deco and Rolls, 2005) have also been shown to integrate consistently other neurophysiological effects of attention (biasing of baseline activity, modulation of responses to single stimuli), but not FSGP, and not with the mechanistic detail of the model reported here.
An important ingredient in our network is the normalization of inputs in area MT (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Normalization of responses to combined stimuli is a common characteristic of many visual areas (Snowden et al., 1991; Qian and Andersen, 1994; Carandini et al., 1997; Britten and Heuer, 1999). This has been incorporated in various mathematical models, typically using some phenomenological construct such as divisive inhibition (Heeger, 1992; Carandini et al., 1997; Simoncelli and Heeger, 1998; Reynolds et al., 1999; Rust et al., 2006). In our biophysical network, normalization does not emerge from shunting inhibition (Holt and Koch, 1997) but from divergent feedforward inhibition, and it underlies the competition of stimuli in BC phenomenology (Fig. 6C).
Although our work suggests a unified and biophysically based model for feature-similarity gain principle and biased competition, we also identified dissociable circuit components that are critical to each of the two attentional effects. This is in contrast to a previous conceptual model suggesting that feature-similarity gain principle necessarily implies biased competition and vice versa (Boynton, 2005). Indeed, by selectively inactivating various pathways in our model, we found that our circuit model could implement either effect independently of the other (Fig. 6), thus proving that these conceptual models are not necessarily associated. Because experimental evidence indicates that attentional systems comply with both biased competition and with the feature-similarity gain principle, this imposes severe constraints on viable mechanistic schemes of neural circuits for attentional selection. We have presented here the first physiological model that accounts for both explicitly.
Anatomically, it is known that PFC and MT are reciprocally connected (Barbas, 1988; Schall et al., 1995; Burman et al., 2006). In a recent paper, Zaksas and Pasternak (2006) examined neurons in MT and dorsolateral PFC in a delayed match-to-sample task using random-dot motion patterns. They found that neurons in the dorsolateral PFC showed decaying selectivity for the cue stimulus through the delay period of the task, suggesting that they were not responsible for the memory maintenance, although they did retain information for a longer time than MT neurons during the delay period. Therefore, information storage for motion direction in working memory might reside in a frontal (or parietal) area that is different from dorsolateral PFC. Alternatively, working memory might involve coding strategies other than sustained activation of neuronal firing (Zaksas and Pasternak, 2006), in contrast to the recurrent network model of working memory (Durstewitz et al., 2000; Wang, 2001; Constantinidis and Wang, 2004; Compte, 2006). However, the selectivity of MT neurons during the delay of a working memory task has not been detected in some studies (Ferrera et al., 1994) or has been found not to be stable over time (Bisley et al., 2004; Zaksas and Pasternak, 2006), possibly in relation to the matching-to-sample task design. In any event, the interaction between MT and PFC has not yet been analyzed physiologically in attention tasks, and such an analysis will be needed to directly test our model predictions.
Our model can be extended in future research in several important ways. In particular, it has been reported that selective attention modulates sensory responses in a manner consistent with a contrast enhancement of the attended stimulus (Reynolds et al., 2000; Martinez-Trujillo and Treue, 2002) (but see Williford and Maunsell, 2006). It will be interesting to incorporate in our model contrast dependence and contrast adaptation that interact with attentional signaling. Another recent finding that is attracting great interest is the attentional modulation of gamma-band coherence of neural activity in sensory cortices (Fries et al., 2001; Bichot et al., 2005; Womelsdorf et al., 2006). Our model network does show attention-induced oscillatory activity in the gamma band, which emerges from the interplay of AMPA-type excitation and slower GABAA inhibition in the microcircuit (Compte et al., 2000). Preliminary results (data not shown) indicate that gamma oscillations do not qualitatively affect the findings reported here, but a thorough exploration of this issue will be undertaken in a separate work. Note that our model of spiking networks for both the sensory and the attentional source circuits is particularly well suited for elucidating the role of oscillation and synchronicity in attentional processing.
The consistent integration in a single physiological model of a significant amount of experimental data from MT, which is also shared by other extrastriate areas selective to different stimulus features, leads us to hypothesize that it contains the fundamental mechanistic backbone of a canonical attentional circuit. Indeed, the key electrophysiological observations accounted for by our network model, namely sensory competition (Snowden et al., 1991; Treue et al., 2000), biased competition phenomenology (Treue and Maunsell, 1996), population activity selectivity enhancement (Martinez-Trujillo and Treue, 2004), and tuning curve multiplicative scaling (Treue and Martinez-Trujillo, 1999), have also been observed in other extrastriate cortical areas (notably V4), but they have not all been reported together, with a consistent attentional protocol, in any single visual area other than MT. We propose that this basic architecture is a canonical model for attentional selection in the cortex that is replicated for each relevant attendable parametric feature of the stimulus, including orientation and spatial location (Maunsell and Treue, 2006).
Our hypothesis that the network architecture put forward here constitutes a canonical model, applicable to all stimulus features coded by different neural systems along the visual pathway, generates some specific predictions. First, the model suggests that population selectivity enhancement also holds for other systems, such as modulation of receptive fields of neurons in V4 by spatial attention. This can be tested with neural recordings, by probing the RF of a neuron with an optimal stimulus positioned at different locations in the RF, once while the monkey passively fixates (unattended condition) and again while the monkey attends to the stimulus so that the attended location and stimulus location covary from trial to trial (attended condition) [paralleling the experiment in MT by Martinez-Trujillo and Treue (2004)]. Alternatively, the enhancement of population response selectivity could be explored directly using imaging techniques. Our model predicts that the attended response pattern will have sharpened selectivity with respect to the unattended profile, through the enhancement of activity in the center and suppression of activity in the periphery. Second, a clear distinction between selectivity enhancement for population activity and multiplicative scaling of single-neuron tuning curves should help disambiguate seemingly contrasting experimental results, such as previously reported selectivity modulation (Spitzer et al., 1988) and gain change (McAdams and Maunsell, 1999) of orientation selectivity in V4 neurons. Although McAdams and Maunsell (1999) gave a reasonable explanation of this discrepancy based on the different ways in which selectivity measurements were taken in the two studies, our model suggests that selectivity enhancement at the level of population activity could have been an additional confound.
Our prediction is that appropriate experiments designed specifically to compare modulations of single neuron and population activity by attention to orientation, when decoupled from spatial attention, should disambiguate the contrasting reports and reveal the neurophysiological fingerprints of our canonical attentional model: competition, population selectivity enhancement, and single neuron multiplicative scaling. Third, the feature-similarity gain principle has so far been tested only with one stimulus (Martinez-Trujillo and Treue, 2004); our model predicts that the modulation ratio thus obtained can be used to predict MT population response to other stimuli, such as transparent motion (Fig. 4B). This is a critical test for the feature-similarity gain principle. An analogous prediction applies to V4 neural responses when attention selects one of the orientations in a plaid pattern. More generally, it would be intriguing to see whether the attentional modulation ratio can predict neural responses when attention is directed to one of an arbitrary number of stimuli in a cluttered visual scene, in visual search tasks. Finally, our model suggests that distinct synaptic pathways are critical to different aspects of attentional modulation (Fig. 6); experimental progress in this direction would represent a major step toward a mechanistic understanding of selective attention at the microcircuit level.
Footnotes
- This work was supported by the Volkswagen Foundation, the Spanish Ministry of Education and Science (S.A., A.C.), the European Regional Development Fund (A.C.), and the Swartz Foundation (X.-J.W.). A.C. was supported by a Ramón y Cajal Research Fellowship of the Spanish Ministry of Education and Science and by the Researcher Stabilization Program of the Health Department of the Generalitat de Catalunya. We thank S. Treue for helpful discussions on feature-based attention, and E. Marder, J. Mazer, and A. Renart for helpful comments on a previous version of this manuscript.
- Correspondence should be addressed to Albert Compte, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Carrer Villarroel 170, 08036 Barcelona, Spain. E-mail: acompte{at}clinic.ub.es
References
- Barbas, 1988.
- Barceló et al., 2000.
- Bichot et al., 2005.
- Bisley et al., 2004.
- Born and Bradley, 2005.
- Boynton, 2005.
- Britten and Heuer, 1999.
- Burman et al., 2006.
- Carandini et al., 1997.
- Chance et al., 2002.
- Chelazzi et al., 1993.
- Colby and Goldberg, 1999.
- Compte, 2006.
- Compte et al., 2000.
- Constantinidis and Wang, 2004.
- Corbetta and Shulman, 2002.
- Deco and Rolls, 2005.
- Desimone, 1992.
- Desimone and Duncan, 1995.
- Durstewitz et al., 2000.
- Fan et al., 2005.
- Ferrera et al., 1994.
- Fries et al., 2001.
- Grent-'t-Jong and Woldorff, 2007.
- Hagler and Sereno, 2006.
- Hamker, 2004.
- Hamker, 2005.
- Hansel and van Vreeswijk, 2002.
- Heeger, 1992.
- Holt and Koch, 1997.
- Hopf et al., 2006.
- Hopfinger et al., 2000.
- Itti and Koch, 2001.
- Kastner et al., 1999.
- Lebedev et al., 2004.
- Luck et al., 1993.
- Luck et al., 1997.
- Martinez-Trujillo and Treue, 2002.
- Martinez-Trujillo and Treue, 2004.
- Maunsell and Van Essen, 1983.
- Maunsell and Treue, 2006.
- McAdams and Maunsell, 1999.
- Miller and Cohen, 2001.
- Mitchell and Sandberg, 2006.
- Moore, 2006.
- Moran and Desimone, 1985.
- Murphy and Miller, 2003.
- Qian and Andersen, 1994.
- Renart et al., 2003.
- Reynolds and Chelazzi, 2004.
- Reynolds et al., 1999.
- Reynolds et al., 2000.
- Rust et al., 2006.
- Salinas and Abbott, 1996.
- Schall et al., 1995.
- Simoncelli and Heeger, 1998.
- Snowden et al., 1991.
- Snowden et al., 1992.
- Spitzer et al., 1988.
- Sturm and Willmes, 2001.
- Thiel and Fink, 2007.
- Treue and Martinez-Trujillo, 1999.
- Treue and Maunsell, 1996.
- Treue et al., 2000.
- Wang, 2001.
- Williford and Maunsell, 2006.
- Womelsdorf et al., 2006.
- Zaksas and Pasternak, 2006.