Section 1
Introduction
Figure 1.1.1. Example of a visual search display (Treisman & Gelade, 1980).
1.1.2 Mechanisms of Visual Search
Several factors may influence the allocation of attention in the visual field, but they are mainly driven by two mechanisms, namely bottom-up and top-down strategies (Yantis, 2000). Proponents of bottom-up accounts claim that certain intrinsic features of the stimuli (e.g., an item having a unique color, see Turatto & Galfano, 2000), or the sudden onset of a stimulus (Jonides, 1981), can capture our attention and our eyes (Theeuwes, Kramer, Hahn & Irwin, 1998) even when we are looking for something else. On the other hand, top-down factors are related to knowledge about the target: the better the target is known, the more effectively attention can be modulated (Posner, 1980; Wolfe, 1994). In addition, having attended the target in different spatial locations (Miller, 1988) can indicate which items or locations are most likely to be attended.
If the dependent variable is response speed, that is, how fast attention can be directed to the target while rejecting distractors, RTs are usually analyzed as a function of set size (or display size). Two different functions are obtained depending on whether the target is present or absent. The slopes and intercepts of these functions are used to infer the mechanisms of search (Wolfe, 2002). The slope of the RT x set-size function is interpreted as a measure of search that is independent from other factors, such as initial motor processes or late response-selection processes, and it can be considered a measure of the efficiency of visual search (Wolfe, 1998): a slope of 0 ms/item is an index of extremely efficient search, whereas a slope of, say, 50 ms/item is an index of inefficient search (Figure 1.1.2a).
Figure 1.1.2a. On the left, an example of an efficient search display; on the right, an example of inefficient search.
If attention examines the scene serially (item by item) in order to find the target, then the search is said to be inefficient: RTs will increase with the number of items on the display. If the target is immediately visible in the scene because of intrinsic properties of the stimulus, then the search is said to be highly efficient, leading to almost flat search slopes. Bravo and Nakayama (1992) suggested that information processing in efficient search could be due to a parallel mechanism. On the contrary, other researchers suggested that attention is deployed serially over the scene and can be attracted, for example, by the saliency of the target. However, it is still not clear whether there is a threshold establishing when a search counts as efficient, although the debate remains open (Treisman, 1985; Wolfe, 1989; Wang, 1994).
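The slope measure described above can be made concrete with a short, illustrative sketch (the RT values below are invented for demonstration): a least-squares fit of mean RT against set size yields the ms/item slope used to classify search efficiency.

```python
import statistics

def search_slope(set_sizes, rts_ms):
    """Least-squares slope (ms/item) and intercept of the RT x set-size function."""
    mean_x = statistics.mean(set_sizes)
    mean_y = statistics.mean(rts_ms)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, rts_ms))
    den = sum((x - mean_x) ** 2 for x in set_sizes)
    slope = num / den
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical mean RTs for displays of 4, 8 and 12 items.
set_sizes = [4, 8, 12]
efficient = [450, 452, 455]      # near-flat function: well under 1 ms/item
inefficient = [500, 700, 900]    # steep function: 50 ms/item

s_eff, _ = search_slope(set_sizes, efficient)
s_ineff, _ = search_slope(set_sizes, inefficient)
print(f"efficient: {s_eff:.1f} ms/item, inefficient: {s_ineff:.1f} ms/item")
```

A near-zero slope corresponds to the "pop-out" case; a slope of tens of ms/item signals inefficient, apparently serial search.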
Presumably, as Duncan and Humphreys (1989) suggest, the efficiency of search is best expressed as a continuum, in which the degree of similarity between target and distractors determines the efficiency of search: the greater the similarity, the lower the efficiency (Figure 1.1.2b).
Figure 1.1.2b. Example of RT distributions as a function of set size. Each slope represents a degree of visual search efficiency.
1.2 Contextual Learning in Visual Search Tasks
1.2.1 Spatial Contextual Cueing paradigm
In the last decade, several studies have shown that during the search for a target among
distractors the visual system uses another powerful mechanism to increase search efficiency.
Chun and Jiang (Experiment 1, 1998; see Chun, 2000 for a review) have demonstrated that
during visual search the spatial visual context can increase efficiency by redirecting
attention toward the target location. In this case, the context is defined by the layout of the
elements, namely by the spatial relationships between the position of the target and the
position of the distractors in the scene. Therefore, when different contexts are repeatedly
presented, these visual covariations can be implicitly learned, so that when an old context is
subsequently encountered the visual system “knows” where to find the most promising target
location in the scene.
Chun and Jiang used a classical visual search display comprising eleven "L" letters (distractors) and one "T" (target). Each distractor could take one of four rotations (90°, 180°, 270°, 360°), while the target could be rotated 90° or 270° with respect to its canonical orientation. The target rotation was randomly selected on each trial. The total number of trials was 720; in half of them the spatial relationship between the target and the distractors was kept constant (old configurations), while in the remaining trials the spatial relationships varied randomly on each trial (new configurations). In detail, the display was divided into an imaginary grid of 6 x 8 possible locations. Twelve target locations were assigned to the old configurations and twelve to the new configurations. The remaining locations were initially randomly generated and assigned to the old or new configurations; subsequently, these could be repeated over time (old) or newly created as in the random configurations. There were 12 old configurations and 12 new configurations in each block. The experiment comprised 30 blocks, divided into 6 epochs. Observers had to indicate whether the "T" was tilted 90° to the right or to the left.
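The design just described can be sketched in code. The following is a simplified illustration of how the old and new configurations might be generated; the function and variable names are our own, and details such as item rotations are omitted:

```python
import random

GRID = [(r, c) for r in range(6) for c in range(8)]  # 6 x 8 invisible grid
N_ITEMS = 12          # 1 target + 11 distractors per display
N_CONFIGS = 12        # 12 old and 12 new configurations per block

def make_configuration(target_loc):
    """Place the target at target_loc and 11 distractors at other grid cells."""
    distractors = random.sample([p for p in GRID if p != target_loc], N_ITEMS - 1)
    return {"target": target_loc, "distractors": distractors}

random.seed(0)
# Target locations are fixed: 12 reserved for old, 12 for new configurations.
target_locs = random.sample(GRID, 2 * N_CONFIGS)
old_targets, new_targets = target_locs[:N_CONFIGS], target_locs[N_CONFIGS:]

# Old configurations are generated once and repeated in every block;
# new configurations are regenerated each block (same target locations,
# freshly randomized distractor layouts).
old_configs = [make_configuration(t) for t in old_targets]

def make_block():
    new_configs = [make_configuration(t) for t in new_targets]
    trials = old_configs + new_configs
    random.shuffle(trials)
    return trials

block = make_block()
print(len(block))  # 24 trials per block
```

Note that in this scheme the target location alone is uninformative (both old and new configurations use fixed target locations); only in old configurations does the distractor layout predict where the target is.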
Over epochs, RTs for repeated configurations became significantly faster than for random configurations, suggesting that observers somehow became able to exploit displays previously seen across blocks and, consequently, to perform the task more efficiently. At the end of the experiment, observers were asked to explicitly recognize a series of configurations (12 old, 12 new) and were not able to identify the displays they had previously seen. This suggests that the RT benefit is probably due to implicit learning.
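The old/new comparison can be illustrated with a minimal sketch (the RT values are invented; only the arithmetic of the effect is shown): the contextual cueing effect is usually quantified as the mean RT on new trials minus the mean RT on old trials, computed per epoch.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cueing_effect(old_rts_ms, new_rts_ms):
    """Contextual cueing effect: mean RT on new trials minus mean RT on old trials."""
    return mean(new_rts_ms) - mean(old_rts_ms)

# Hypothetical per-trial RTs (ms) from an early and a late epoch.
epoch1_old, epoch1_new = [900, 920, 880], [905, 915, 895]
epoch6_old, epoch6_new = [760, 780, 770], [870, 850, 860]

print(cueing_effect(epoch1_old, epoch1_new))  # 5.0  (little learning yet)
print(cueing_effect(epoch6_old, epoch6_new))  # 90.0 (robust old/new benefit)
```

A positive, growing effect across epochs is the signature of contextual learning.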
1.2.2 The original study of Chun and Jiang (1998)
Chun and Jiang's study (1998) has always been considered paradigmatic because it not only showed the contextual cueing effect for the first time, but also clarified, through six experiments, the sources of the contextual cueing benefit. The benefit of context repetition is not merely due to a sort of familiarity given by repeatedly experiencing the superficial features of the stimuli; rather, contextual cueing can be ascribed to specific location learning, namely the learning of spatial relationships. This kind of learning is very robust across repetitions, and it emerges even after a few repetitions, sometimes after only two. The authors also showed that contextual learning can be transferred from one set of shapes to another, provided that the spatial locations are preserved, indicating that what matters is location and not shape (Figure 1.2.1).
Figure 1.2.1: a) An example of the classical contextual cueing display used by Chun and Jiang (1998); b) an example of RT distributions in a standard contextual cueing task.
A crucial issue that Chun and Jiang investigated in Experiment 3 was whether participants were simply learning to respond faster to the old configurations in general, or whether they were learning to detect the target faster when the contexts were repeated. The authors varied the target location across configurations, so that the target could appear in one of the possible distractor locations. No advantage from context repetition was found: shifting the target (from its location to a previous distractor location) disrupted the memory trace for that configuration, suggesting that the spatial relationships between items in the scene are a crucial variable for building contextual associations. In addition, Chun and Jiang showed that a specific configuration could cue two possible target locations (Experiment 6, 1998), and that contextual cueing could not be the result of motor pattern learning. Contextual cueing also emerged when configurations were presented for only 200 ms, without participants making any eye movements. This means that what is learned is a specific spatial relationship, not a saccadic motor pattern (Experiment 5, 1998). Furthermore,
Chun and Jiang suggested that contextual cueing could be mainly due to an attentional guidance mechanism (Experiment 4, 1998), and therefore measured the slope of the RT x set-size functions. As briefly stated in 1.1.2, if contextual learning is the result of a mechanism that guides attention more efficiently to the target location, then the slope of the RT x set-size function for old configurations should become shallower over epochs and significantly different from the slope for new configurations. Indeed, they showed that once learning occurred, the slope of the RT x set-size function changed, signalling faster attention allocation to the target. We will further discuss the attentional guidance hypothesis in the next chapter, adding experimental evidence against it.
At the end of their experimental sessions, Chun and Jiang tested the degree of participants' awareness of the previous learning using a recognition task. Surprisingly, they found that participants were not able to recognize the configurations they had previously seen, suggesting that the learning of configurations can be implicit. The implicit aspects of contextual learning need to be addressed separately and won't be discussed in the present work. We wish to review the most important studies on contextual cueing in order to give a precise frame of contextual learning and its strong relationship with attentional mechanisms. Various studies have investigated different aspects of the contextual cueing effect, hypothesizing how, and at what level of processing, the context exerts its effect.
1.3 The information learnt in Contextual Cueing
1.3.1 Global or local?
As previously mentioned, in a contextual cueing paradigm participants give their responses by pressing one of two keys on a keyboard, indicating whether the target was tilted 90° to the left or to the right. The crucial point is that the target orientation (left or right) is continuously randomized across blocks, both in old and new patterns. This is an important step, as it rules out the possibility that participants learn a specific target identity or a motor response pattern. Chun and Jiang (1998) also took care to rule out an RT benefit for repeated patterns due to the learning of an eye movement pattern. Furthermore, it was demonstrated that perceptual familiarity alone cannot explain the contextual cueing effect: the repetition of a search display does not give any RT advantage if the context is not predictive of the target location (Chun and Jiang, 1998). Familiarity with a context is not sufficient for search speed to improve; only when a given context is repeatedly paired with the target location does it become useful for improving search efficiency.
Contextual cueing studies are still debating whether contextual learning is global or local, that is, whether participants learn particular spatial locations paired with the target, or the whole configuration, understood as the full set of spatial relations between target and distractors.
Chun and Jiang (1998) found that contextual information can also be extracted and learned from noisy input, so that contextual cueing persists even after a randomly generated jitter perturbs the items within a specific configuration. Understanding the role of noise is an important point in the global/local debate about contextual learning. Olson and Chun (2002) demonstrated that noise can also disrupt contextual cueing. In their study, they manipulated the search display using an invisible grid, so that half of the distractors were far from the target and the other half close to it. When both the far-from-target and the close-to-target conditions were repeated, standard contextual cueing was observed; when the repeated pattern was close to the target and the random one was far from the target, again an old/new RT difference was observed, suggesting that contextual cueing can emerge even from the learning of a subset of the repeated information. Surprisingly, when the far information was predictive of the target location and the distractors close to the target were randomly generated, no contextual cueing emerged. However, when far distractors were repeated and no random distractors were close to the target, contextual cueing was observed. This means that noise can play a role in contextual learning: what is crucial is not whether repeated distractors are far from or close to the target, but whether there is noisy information that could disrupt contextual learning.
Another study along this research line comes from Hoffmann and Sebald (2005). They demonstrated that local information can help target identification and discrimination by repeating only the information close to the target: stimuli (6 distractors and 1 target) were presented within an invisible circle, and the target could appear at any of the possible locations on the circle. In half of the trials, the target location was repeatedly paired with the identities of the two items adjacent to it, whereas in the other half, locations and distractor identities were randomly assigned. The two repeated distractors close to the target were learned, and this learning guided attention to the most likely target location.
A more recent study also provides evidence that a local context can be predictive of the target location. Brady and Chun (2007), modelling the classical contextual cueing paradigm and the revised paradigm used by Olson and Chun (2002), demonstrated that contextual cueing can also arise from the repetition of local information alone. Jiang and Wagner (2004)