Edited by Claudia Roda, American University of Paris
Cambridge University Press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore,
São Paulo, Delhi, Dubai, Tokyo, Mexico City
The Edinburgh Building, Cambridge CB2 8RU, UK
This book was originally conceived with the objective of disseminating
the results of the AtGentive project, an international research project
sponsored by the European Commission. As the book evolved, the desire
to provide a wider view of the research and applicative work in the area of
attention-aware systems resulted in the inclusion of many chapters coming
from other applied research projects. I am grateful to the authors who
have contributed not only with their excellent chapters but, very often,
also by providing comments and suggestions to authors of other chapters.
This has resulted in creating those bridges that are often missing between
different disciplines and between specific aspects of inquiry within this
area of research. It has been a pleasure and a rewarding learning experience
to be able to coordinate this work.
I also would like to thank the many reviewers who have commented on
individual chapters. The quality of this book has certainly gained from
their insights. I extend my appreciation to the publisher’s anonymous
reviewers who have provided comments and suggestions about the overall
structure and content of the book.
My gratitude goes to Jan Steyn and Antal Neville whose patient
and thorough work in proofreading and organizing references has been invaluable.
Special thanks to Hetty Reid, Commissioning Editor, and Tom
O’Reilly, Production Editor for Cambridge University Press, who have
supported and guided me during the whole process of creating this book,
together with Joanna Garbutt, Carrie Cheek and Oliver Lown. Finally, I
am sincerely grateful for the professional, careful and timely copy-editing
work of Diane Ilott, who has spent many hot summer days patiently fixing
the small details that make all the difference.
Introduction
Claudia Roda
In recent years it has been increasingly recognized that the advent of
information and communication technologies has dramatically shifted
the balance between the availability of information and the ability of
humans to process information. During the last century information was
a scarce resource. Now, human attention has become the scarce resource
whereas information (of all types and qualities) abounds. The appropriate
allocation of attention is a key factor determining the success of creative
activities, learning, collaboration and many other human pursuits. A suitable
choice of focus is essential for efficient time organization, sustained
deliberation and, ultimately, goal achievement and personal satisfaction.
Therefore, we must address the problem of how digital systems can
be designed so that, in addition to allowing fast access to information
and people, they also support human attentional processes. With the
aim of responding to this need, this book proposes an interdisciplinary
analysis of the issues related to the design of systems capable of supporting
the limited cognitive abilities of humans by assisting the processes
guiding attention allocation. Systems of this type have been referred to
in the literature as Attention-Aware Systems (Roda and Thomas 2006),
Attentive User Interfaces (Vertegaal 2003) or Notification User Interfaces
(McCrickard, Czerwinski and Bartram 2003) and they engender many
challenging questions (see, for example, Wood, Cox and Cheng 2006).
The design of such systems must obviously rest on a deep understanding
of the mechanisms guiding human attention. Psychologists have studied
attention from many different perspectives. In the nineteenth century,
when attention was mainly studied through introspection, William James
(considered by many the founder of American psychology) devoted a
chapter in his Principles of Psychology to human attention and observed:
Everyone knows what attention is. It is the taking possession by the mind, in clear
and vivid form, of one out of what seem several simultaneously possible objects
or trains of thought . . . It implies withdrawal from some things in order to deal
effectively with others (James 1890: 403–4).
However, as for many other things that ‘everyone knows’, such as rationality,
intelligence, memory and love, attention escapes a precise definition,
and more than a century after James’ writing, its mechanisms still
generate debates and controversy in the scientific community.
Since the mid-twentieth century, attention allocation has been viewed
as the process of selecting stimuli for processing, and research has focused
on the question of when and how this selection takes place. Proponents of
early selection theory (Broadbent 1958) argue that stimuli are filtered early,
at the perceptual level, on the basis of their physical properties so that
irrelevant (unattended) stimuli are not further processed. Proponents of
the modified early selection theory (Treisman 1960) maintain that the early
filter is not just on or off but that some stimuli are just attenuated rather
than completely filtered out, so that some irrelevant stimuli may reach
consciousness. Proponents of late selection theory (Deutsch and Deutsch
1963) argue that all stimuli are analysed (i.e., there is no filter at perceptual
level) but only pertinent stimuli are selected for awareness and
memorization. More recently some of the fundamental assumptions of
the early/late selection dichotomy have been questioned (Awh, Vogel and
Oh 2006; Vogel, Luck and Shapiro 1998) and the debate over early and
late selection has directly or indirectly raised many other related questions:
e.g., does attention modify the manner in which we perceive the
environment, or does it impact on our response to what we perceive?
This is an important question for the design of attention-aware systems.
For example, Posner (1980) suggests that cueing facilitates perception
and that different cues activate brain areas devoted to alerting and to orienting
attention (Posner and Fan 2007). This implies that it is possible
to help the user redirect attention, maintain attention on a certain item,
or simply alert him to possibly relevant stimuli. However, psychological
literature also tells us that certain stimuli may be perceived if uncued and
even if they are actively blocked. For example, in a noisy environment
such as a cocktail party we are able to block out noise and listen to just
one conversation amongst many (Cherry 1953), but why will some of
us very easily and almost necessarily notice our name if mentioned in a
nearby but unattended conversation? In trying to address this question,
Conway and his colleagues showed that ‘subjects who detect their name
in the irrelevant message have relatively low working-memory capacities,
suggesting that they have difficulty blocking out, or inhibiting, distracting
information’ (Conway, Cowan and Bunting 2001: 331). Similar results,
relating working-memory capacity and the ability to block distractors,
have been reported in the visual modality with experiments employing
neurophysiological measures (Fukuda and Vogel 2009). A better understanding
of these mechanisms could help us design systems that help
users who have more difficulties in maintaining focus with obvious applications
in, for example, in-car support systems, technology enhanced
learning applications, control room systems, etc. The study of this very
close relationship between attention and working memory has been a
very active area of research (Awh, Vogel and Oh 2006; Baddeley 2003;
Buehner et al. 2006; Engle 2002; Shelton, Elliott and Cowan 2008).
However, both attention and working memory realize multiple functions
implemented by a variety of processes that physically correspond to multiple
areas in the brain and therefore the interaction between attention
and working memory is difficult to grasp. Some of the chapters in this
book take different stands on this interaction. In chapter 4, Low, Jin
and Sweller base their analysis of the relationship between attention and
learning on an assumption of ‘equivalence between working memory
and attentional processes’; in chapter 5, Bowman and his colleagues see
attention as a mechanism that mediates the encoding and consolidation
of information in working memory; in chapter 9, Stojanov and Kulakov
indicate that activated items in working memory guide the perception processes.
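As an illustrative aside, the three selection accounts described earlier in this section (early selection, modified early selection and late selection) can be contrasted in a toy Python sketch. It is not taken from any chapter of the book and is not a cognitive model: the `Stimulus` type, the `pertinence` score, and the `attenuation` and `threshold` values are all hypothetical, chosen only to make the difference between the three accounts visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    label: str
    channel: str       # a physical property, e.g. the ear at which a sound arrives
    pertinence: float  # hypothetical 0..1 score; one's own name would score high


def early_selection(stimuli, attended_channel):
    """Broadbent-style: an all-or-nothing filter on physical properties."""
    return [s.label for s in stimuli if s.channel == attended_channel]


def attenuated_selection(stimuli, attended_channel, attenuation=0.5, threshold=0.4):
    """Treisman-style: unattended stimuli are weakened rather than removed."""
    return [
        s.label
        for s in stimuli
        if s.channel == attended_channel or s.pertinence * attenuation >= threshold
    ]


def late_selection(stimuli, threshold=0.4):
    """Deutsch-and-Deutsch-style: everything is analysed; pertinence selects."""
    return [s.label for s in stimuli if s.pertinence >= threshold]


# Hypothetical cocktail-party scene: the listener attends to the "left" channel.
SCENE = [
    Stimulus("attended conversation", "left", 0.6),
    Stimulus("own name", "right", 0.9),
    Stimulus("traffic report", "right", 0.5),
    Stimulus("background chatter", "right", 0.2),
]

print(early_selection(SCENE, "left"))
print(attenuated_selection(SCENE, "left"))
print(late_selection(SCENE))
```

With these made-up values, the early filter passes only the attended conversation, the attenuating filter also lets the listener's own name through, and the late-selection variant additionally admits the moderately pertinent traffic report.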
Another area of research in cognitive psychology that has had a significant
impact on the field of human–computer interaction addresses the
question of whether all types of stimuli are treated by a central system or,
instead, several different systems manage different types of input. The
organization of attention over several channels associated with different
modalities was first proposed by Allport, Antonis and Reynolds (1972),
who suggested that a number of independent, parallel channels process
task demands. Users’ responses to messages in different modalities have
consequently been studied in relation to the optimization of interaction in
various applications (see, for example, chapters 4 and 7 of this volume).
The interaction between, and the integration of, these different channels
has not yet been extensively studied. The large majority of the studies of
attention have concentrated on either the sound modality or the visual
modality. Recent research, especially when related to human–computer
interaction, is for the most part focused on visual attention. This greater
focus on visual attention is reflected in this book, with many chapters (3,
5, 7, 10) reporting results in this modality.
A final important issue, recurrent in this volume, addresses how to
facilitate the user in his perception and understanding of messages coming
from digital devices. It is commonly accepted that two types of
processes, bottom-up and top-down, guide attention and visual attention
in particular. Bottom-up processes, also called exogenous processes,
guide attention to salient elements of the environment; and top-down,
or endogenous, processes guide attention to elements of the
environment that are relevant to the current task. The definition of
what determines the saliency of elements of the environment, and the
creation of models that integrate both bottom-up and top-down processes,
has been a very active area of research (Cave 1999; Itti 2005;
Peters and Itti 2007). These issues are central to chapters 3 and 5 of this book.
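The interplay between bottom-up and top-down guidance described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each display item has a numeric salience score and a set of topic tags, and that the two influences combine as a weighted sum. The function name `rank_by_priority`, the weights and the example items are hypothetical and do not reproduce any of the models cited above.

```python
def rank_by_priority(items, task_tags, w_bottom_up=0.5, w_top_down=0.5):
    """Order items by a weighted sum of salience and task relevance.

    items: iterable of (name, salience in 0..1, set of topic tags)
    task_tags: set of tags describing the user's current task
    """
    scored = []
    for name, salience, tags in items:
        # Top-down term: 1.0 if the item shares a tag with the current task.
        relevance = 1.0 if task_tags & tags else 0.0
        scored.append((w_bottom_up * salience + w_top_down * relevance, name))
    # Highest combined priority first.
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical screen contents while the user is working on homework.
ITEMS = [
    ("flashing banner", 0.9, {"advert"}),
    ("message from tutor", 0.3, {"homework"}),
    ("status bar", 0.1, {"system"}),
]

# Both influences weighted equally: the task-relevant message ranks first.
print(rank_by_priority(ITEMS, {"homework"}))

# Bottom-up only: the most salient item wins regardless of the task.
print(rank_by_priority(ITEMS, {"homework"}, w_bottom_up=1.0, w_top_down=0.0))
```

The linear combination is the simplest possible choice; it is used here only to show how changing the balance between the two kinds of process changes which item would be attended first.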
A challenge that this book aims to address is the creation of a bridge
(or a set of bridges) between the research work carried out in cognitive
psychology and neuroscience, which reports fundamental results
on specific aspects of attentional processes, and the work carried out
in human–computer interaction that endeavours to apply these results.
The difficulty of this effort is mainly due to the fact that, in the former
work, experiments are carried out in controlled environments where the
conditions under which subjects are working are known, and effects are
observed over periods of time that are often very short (down to the millisecond).
Instead, in real-world situations, such as the ones addressed
by research in human–computer interaction, there is very little or no
control over the conditions under which users are working, and the
time lengths are much longer with effects that may span hours, days
or even months. To make things worse, addressing the problems faced
by human–computer interaction would require a holistic theory of attention,
which is still far from being achieved. As a result, the tools and
systems proposed in the chapters of this book necessarily focus only
on some aspects of attention. For example, chapter 8 focuses on the
effects of contextual information, chapter 10 on the conspicuity of visual
information, and chapter 12 on social aspects of attention. Nevertheless,
attention-aware applications have been shown to be greatly beneficial
in several areas, including the control of appliances and desktop
interfaces (chapter 7), robotics (chapter 9), visualization for decision
making (chapter 10), learning and training (chapters 8 and 11), and
online collaborative environments (chapter 12).
The book is organized in three parts, with chapters that focus mainly on
concepts in part I, chapters that focus mainly on theoretical and software
tools in part II, and chapters describing applications in part III.
Part I (Concepts) introduces the conceptual framework of research
aimed at modelling and supporting human attentional processes. The
chapters in this part analyse human attention in digital environments,
integrating results from several different disciplines, including cognitive
psychology, neuroscience, pedagogy and human–computer interaction.
Chapter 2 sets the scene by providing a broad overview of the main
issues addressed by attention research in cognitive psychology and neuroscience,
and their relevance for the design of digital devices.
In chapter 3, Ronald Rensink reviews one of the prevalent areas
of attention research, vision science. Drawing on his vast experience
in this subject, Rensink guides the reader through an exploration of
visual attention and the many processes involved in scene perception.
Based on this knowledge of scene perception, Rensink proposes that
displays may be designed so that they elicit particularly efficient users’ responses.
John Sweller, who co-authors chapter 4 with Renae Low and Putai
Jin, has developed cognitive load theory, one of the most influential theories
relating attention and learning. Cognitive load theory was originally
designed ‘to provide guidelines intended to assist in the presentation of
information in a manner that encourages learner activities that optimize
intellectual performance’ (Sweller, Merrienboer and Paas 1998: 251). In
chapter 4 the authors discuss the impact of cognitive load theory on the
design of digital tools supporting learning.
Part I closes with a chapter by Howard Bowman, Li Su, Brad Wyble
and Phil J. Barnard. The authors report on the results obtained in the
Salience Project, and elegantly analyse some aspects of attention that
have been the focus of recent research, including its temporal organization,
its redirection, and the role of long-term goals and emotional
significance in determining saliency.
Part II (Theoretical and software tools) analyses the theoretical and
computational mechanisms currently available for supporting human
attentional processes. These tools span very different areas of attention-related
services to users.
Chapter 6, contributed by Benoît Morel and Laurent Ach, focuses on
the design of artificial characters that adapt to the attentional state of
the user. On the strength of over a decade of practice in creating 3D
embodied agents, the authors explain the role that attention plays in creating
engaging agents ‘that are capable of natural, intuitive, autonomous
and adaptive behaviours that account for variations in emotion, gesture,
mood, voice, culture and personality’.
In chapter 7, Kari-Jouko Räihä, Aulikki Hyrskykari and Päivi
Majaranta discuss eye-tracking technology based on their long experience
of leading some of the most successful research endeavours in
this field, including the European Network of Excellence COGAIN and
the EYE-to-IT project. Eye-tracking technology has historically been
central to the development of attention-aware applications because of
the very close relationship between gaze direction and attention. After
reviewing the psychological foundation of visual attention, the authors
address the question of the relation between attention and the point of
gaze as well as the use of the latter for the implementation of adaptive applications.
Chapter 8, authored by Hans-Christian Schmitz, Martin Wolpers,
Uwe Kirschenmann and Katja Niemann, proposes that metadata about
attention allocation can be captured and exploited to personalize information
and task environments. Significantly, on the basis of their extensive
application studies, the authors argue for the important role of attention
metadata for the support of cooperative work.
In chapter 9, Georgi Stojanov and Andrea Kulakov analyse how attention
may be modelled within a complete cognitive architecture. After
reviewing how attentional processes are represented in several known
cognitive architectures, the authors present their own cognitive architecture,
founded on robotics research, and they highlight the role played by
attentional processes.
Part III (Applications) presents several computing applications
designed to support attention in specific environments. The applications
presented in this part cover a wide variety of fields, showing the
relevance of attention-aware systems to fields as different as command-and-control
displays, technology-enhanced learning, and the support of
online communication and collaboration.
The application described by Frank Kooi in chapter 10 is the result
of the author’s very long experience in researching and implementing
visual displays. The objective of the two-depth layer display presented
by the author is to increase the amount of information available to the
user without increasing clutter. Based on knowledge of visual attentional
processes, Kooi proposes that, by using dual layer displays, search may
be made much more efficient in command-and-control displays.
Chapter 11, authored by Inge Molenaar, Carla van Boxtel, Peter
Sleegers and Claudia Roda, reports on a system designed to supply
adaptive and dynamic scaffolding through the analysis and support of
learners’ attentional processes. The experimental results clearly show
the potential of the application of attention management in technology-enhanced
learning environments.
Finally, in chapter 12, Thierry Nabeth and Nicolas Maisonneuve propose
an implementation of the general attention support model originally
proposed by Roda and Nabeth (2009). This model is based on
four levels of support: perception, deliberation, operation and metacognition.
Chapter 12 explains how this model may be implemented to support
social attention and describes the attention-aware social platform AtGentNet.
Product details
File size: 3,026 KB
Pages: 361
File type: PDF
ISBN: 978-0-521-76565-7 (Hardback)
Copyright: Cambridge University Press 2011
Table of Contents
Acknowledgements
Notes on contributors
List of illustrations
List of tables
1 Introduction
Part I Concepts
2 Human attention and its implications for human–computer interaction
3 The management of visual attention in graphic displays
4 Cognitive load theory, attentional processes and optimized learning outcomes in a digital environment
5 Salience sensitive control, temporal attention and stimulus-rich reactive interfaces
Part II Theoretical and software tools
6 Attention-aware intelligent embodied agents
7 Tracking of visual attention and adaptive applications
8 Contextualized attention metadata
9 Modelling attention within a complete cognitive architecture
Part III Applications
10 A display with two depth layers: attentional segregation and declutter
11 Attention management for self-regulated learning: AtGentSchool
12 Managing attention in the social web: the AtGentNet approach
Index of authors cited
Index
The colour plates appear between pages 204 and 205
Illustrations
2.1 What makes visual search fast? colour plate (CP)
3.1 Flicker paradigm
3.2 Coherence theory
3.3 Inattentional blindness
3.4 Triadic architecture
3.5 Featural cues
3.6 Proto-object structure
3.7 Drawing of attention by configural focus
3.8 Reduction of clutter via grouping
3.9 Organizational structures
3.10 Effect of different sets of values
4.1 A conventional, split-attention geometry example
4.2 A physically integrated geometry example
5.1 The basic AB effect for letter stimuli
5.2 Task schema for the key-distractor blink (Barnard et al. 2004)
5.3 Proportion of correct responses from both humans and model simulations (Su et al. 2007)
5.4 Top-level structure of the ‘glance-look’ model with implicational subsystem attended
5.5 A neural network that integrates five LSA cosines to classify words as targets
5.6 Target report accuracy by serial position comparing human data (Barnard et al. 2005) and model simulations for high state and high trait anxious and low state anxious
5.7 The ‘glance-look’ model extended with body-state subsystem
5.8 Examples of raw P3s recorded from human participants (Su et al. 2009)
5.9 ERPs of a participant for target-seen and target-missed trials
5.10 Diagram of a brainwave-based receipt acknowledgement device
5.11 Examples of virtual P3s generated from model simulations (Su et al. 2009)
5.12 Performance (measured as probability of detecting targets) of AB-unaware and AB-aware systems by varying the window sizes of the stimuli (Su et al. 2009)
5.13 Top-level structure of the ‘glance-look’ model with computer interaction (through device) and implicational subsystem attended (Su et al. 2009)
5.14 Performance (measured as probability of detecting the targets) of the reactive approach using EEG feedback with variability in the P3 detection criterion (Su et al. 2009)
6.1 Examples of different types of Cantoche embodied agents (from realistic to cartoonish style) CP
6.2 The Cantoche Avatar Eva displays a series of behaviours that highlight the advantages of full-body avatars CP
6.3 The Cantoche Avatar Dominique-Vivant Denon helps users explore the Louvre website. Reprinted by permission CP
6.4 Living Actor™ technology: the three levels of control CP
7.1 Desk-mounted video-based eye trackers: ASL 4250R at the top, SMI iViewX in the middle and Tobii T60 at the bottom CP
7.2 Put-That-There (Bolt 1980) © 1980 ACM, Inc. Reprinted by permission CP
7.3 Top: attentive television (Shell, Selker and Vertegaal, 2003) © 2003 ACM, Inc.; bottom: eyebox2 by Xuuk, Inc., www.xuuk.com. Both images reprinted by permission CP
7.4 Nine instances of PONG, an attentive robot (Koons and Flickner 2003). © 2003 ACM, Inc. Reprinted by permission CP
7.5 Joint attention and eye contact with a stuffed toy robot (Yonezawa et al. 2007). Picture reprinted courtesy of Tomoko Yonezawa CP
7.6 Two adaptive attention-aware applications. Top: ship database (Sibert and Jacob 2000) © 2000 ACM, Inc. Reprinted by permission; bottom: iDict, a reading aid (Hyrskykari 2006) CP
8.1 Core elements of the CAM schema
8.2 CAM infrastructure
9.1 Main data structures and parallel processes incorporated into the Vygo architecture
9.2 Expansion of an abstract schema up to concrete schemas
9.3 Learning systems
9.4 A part of the cognitive architecture, responsible for video processing, having the general learning system at its core
9.5 Schematic of the Attention Window (AW)
9.6 Two examples of saccadic movements
10.1 Parallax CP
10.2 Two images from dual/single layer experiment CP
10.3 Search times
10.4 Schematic drawing of experimental set-up and the design of the target monitor
10.5 The data substantiating the claim that accommodation and motion parallax substantially aid the ease of depth perception
10.6 Dual-layer display (Zon and Roerdink 2007). NLR. Reprinted by permission CP
10.7 Navigation display. © NLR. Reprinted by permission CP
11.1 Example of metacognitive planning intervention CP
12.1 A snapshot of the AtGentNet platform CP
12.2 The AtGentNet overall architecture CP
12.3 Who reads me? Who do I read? CP
12.4 Stated and observed competences and interests CP
Tables
5.1 Comparison of experimental results across twelve human participants with model simulations (Su et al. 2009)
9.1 A coarse-grained view of the semantics of the attention values attached to Novamente AGI architecture atoms
10.1 The depth cues which are directly relevant to depth-displays
11.1 A summary of the intervention categories and types
12.1 Supporting attention at different levels
12.2 Mechanisms of support at different levels
12.3 Agent interventions