Monday, December 8, 2008

Journal of Eye Movement Research: Special issue on eye tracking now online.

The special issue on "Eye Tracking and Usability Research" in the Journal of Eye Movement Research is now online. It features the following articles:
  • Helmert, J. R., Pannasch, S. & Velichkovsky, B. M. (2008). Eye tracking and Usability Research: an introduction to the special issue (editorial). Download as PDF.

  • Castellini, C. (2008). Gaze Tracking in Semi-Autonomous Grasping. Download as PDF.

  • Helmert, J. R., Pannasch, S. & Velichkovsky, B. M. (2008). Influences of dwell time and cursor control on the performance in gaze driven typing. Download as PDF.

  • Huckauf, A. & Urbina, M. H. (2008). On object selection in gaze controlled environments. Download as PDF.

  • Hyrskykari, A., Ovaska, S., Majaranta, P., Räihä, K.-J. & Lehtinen, M. (2008). Gaze Path Stimulation in Retrospective Think-Aloud. Download as PDF.

  • Pannasch, S., Helmert, J.R., Malischke, S., Storch, A. & Velichkovsky, B.M. (2008). Eye typing in application: A comparison of two systems with ALS patients. Download as PDF.

  • Zambarbieri, D., Carniglia, E. & Robino, C. (2008). Eye Tracking Analysis in Reading Online Newspapers. Download as PDF.

Monday, November 24, 2008

Our gaze controlled robot on the DR News

The Danish National Television "TV-Avisen" episode on our gaze controlled robot was broadcast on Friday 21st November on the nine o'clock news. Alternative versions (resolutions) of the video clip can be found at the DR site.

View video

Friday, November 21, 2008

Eye movement control of remote robot

Yesterday we demonstrated our gaze navigated robot at the Microsoft Robotics event here at ITU Copenhagen. The "robot" transmits a video stream which is displayed on a client computer. By using an eye tracker we can direct the robot towards where the user is looking. The concept allows for human-machine interaction with a direct mapping of the user's intention. The Danish National TV (DR) came by today and recorded a demonstration. It will be shown tonight on the nine o'clock news. Below is a video that John Paulin Hansen recorded yesterday which demonstrates the system. Please note that the frame rate of the video stream was well below average at the time of recording. It worked better today. In the coming week we'll look into alternative solutions (suggestions appreciated). The project has been carried out in collaboration with Alexandre Alapetite from DTU. His low-cost, LEGO-based rapid mobile robot prototype offers interesting possibilities for testing human-computer and human-robot interaction.

The virgin tour around the ITU office corridor (on YouTube)
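
To illustrate the idea of directing the robot towards where the user is looking, here is a minimal sketch of how a gaze point on the video image could be turned into a drive command. It is not the actual implementation: the function name `gaze_to_drive`, the dead zone and the rule that the robot only moves forward when the user looks above the image centre are all assumptions made for the example.

```python
def gaze_to_drive(gaze_x, gaze_y, width, height, dead_zone=0.1):
    """Map a gaze point (in pixels) on the video image to (speed, turn).

    speed is in [0, 1] (0 = stand still); turn is in [-1, 1]
    (negative = steer left, positive = steer right).
    All names and thresholds are hypothetical.
    """
    # Normalise the gaze point to [-1, 1] with (0, 0) at the image centre.
    nx = (gaze_x - width / 2) / (width / 2)
    ny = (height / 2 - gaze_y) / (height / 2)  # positive = above centre
    nx = max(-1.0, min(1.0, nx))
    ny = max(-1.0, min(1.0, ny))

    # Ignore small horizontal offsets so tracker jitter around the
    # centre does not cause constant steering corrections.
    turn = 0.0 if abs(nx) < dead_zone else nx
    # Assumption: drive forward only when the user looks above the centre.
    speed = ny if ny >= dead_zone else 0.0
    return speed, turn
```

Looking at the right edge of a 640x480 image would give a full right turn with no forward speed, while looking at the top centre would drive straight ahead at full speed.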


Tuesday, November 18, 2008

A framework for gaze selection techniques (Tonder et al., 2008)

Martin van Tonder, Charmain Cilliers and Jean Greyling at the Nelson Mandela Metropolitan University, South Africa presented a framework for gaze selection techniques in the proceedings of the 2008 annual research conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT). The framework is platform independent (relying on Java) and supports multiple interaction methods such as Kumar's EyePoint and popups, as well as data logging and visualization.

Experimental gaze interaction techniques are typically prototyped from scratch using proprietary libraries provided by the manufacturers of eye tracking equipment. These libraries provide gaze data interfaces, but not any of the additional infrastructure that is common to the implementation of such techniques. This results in an unnecessary duplication of effort. In this paper, a framework for implementing gaze selection techniques is presented. It consists of two components: a gaze library to interface with the tracker and a set of classes which can be extended to implement different gaze selection techniques. The framework is tracker and operating system independent, ensuring compatibility with a wide range of systems. Support for user testing is also built into the system, enabling researchers to automate the presentation of test targets to users and record relevant test data. These features greatly simplify the process of implementing and evaluating new interaction techniques. The practicality and flexibility of the framework are demonstrated by the successful implementation of a number of gaze selection techniques.
  • van Tonder, M., Cilliers, C., and Greyling, J. 2008. A framework for gaze selection techniques. In Proceedings of the 2008 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists on IT Research in Developing Countries: Riding the Wave of Technology (Wilderness, South Africa, October 06 - 08, 2008). SAICSIT '08, vol. 338. ACM, New York, NY, 267-275. DOI=
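
The "set of classes which can be extended" idea from the abstract can be sketched roughly as follows. The framework itself is written in Java and its real class names are not given here, so everything below (`GazeSelectionTechnique`, `DwellSelection`, `on_gaze`) is hypothetical and only meant to show the general shape: a base class that does the target hit-testing, and a subclass that adds one selection technique, in this case plain dwell-time selection.

```python
from abc import ABC, abstractmethod


class GazeSelectionTechnique(ABC):
    """Hypothetical base class: receives gaze samples, reports selections."""

    def __init__(self, targets):
        # targets: dict mapping a name to a rectangle (x, y, width, height)
        self.targets = targets

    def target_at(self, x, y):
        """Return the name of the target under (x, y), or None."""
        for name, (tx, ty, w, h) in self.targets.items():
            if tx <= x < tx + w and ty <= y < ty + h:
                return name
        return None

    @abstractmethod
    def on_gaze(self, x, y, t):
        """Handle one gaze sample at time t (seconds).

        Returns the name of a newly selected target, or None.
        """


class DwellSelection(GazeSelectionTechnique):
    """Select a target once the gaze has rested on it for dwell_time."""

    def __init__(self, targets, dwell_time=0.5):
        super().__init__(targets)
        self.dwell_time = dwell_time
        self._current = None  # target currently under the gaze
        self._since = None    # time the gaze entered that target

    def on_gaze(self, x, y, t):
        hit = self.target_at(x, y)
        if hit != self._current:
            # Gaze moved to another target (or off all targets): restart.
            self._current, self._since = hit, t
            return None
        if hit is not None and t - self._since >= self.dwell_time:
            # Restart the timer so the target is not re-selected
            # on every following sample.
            self._since = t
            return hit
        return None
```

A different technique would subclass `GazeSelectionTechnique` in the same way and only replace `on_gaze`, which is the duplication of effort the framework is meant to remove.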

Monday, November 17, 2008

Wearable Augmented Reality System using Gaze Interaction (Park et al., 2008)

Hyung Min Park, Seok Han Lee and Jong Soo Choi from the Graduate School of Advanced Imaging Science, Multimedia & Film at Chung-Ang University, Korea presented a paper on their Wearable Augmented Reality System (WARS) at the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. They use a half-blink mode (called "aging") for selection which is detected by their custom eye tracking algorithms. See the end of the video.

Undisturbed interaction is essential to provide immersive AR environments. There have been a lot of approaches to interact with VEs (virtual environments) so far, especially in hand metaphor. When the user's hands are being used for hand-based work such as maintenance and repair, necessity of alternative interaction technique has arisen. In recent research, hands-free gaze information is adopted to AR to perform original actions in concurrence with interaction [3, 4]. There has been little progress on that research, still at a pilot study in a laboratory setting. In this paper, we introduce such a simple WARS (wearable augmented reality system) equipped with an HMD, scene camera, eye tracker. We propose 'Aging' technique improving traditional dwell-time selection, demonstrate AR gallery – dynamic exhibition space with wearable system.
Download paper as PDF.

Tuesday, November 11, 2008

Gaze vs. Mouse in Games: The Effects on User Experience (Gowases T, Bednarik R, Tukiainen M)

Tersia Gowases, Roman Bednarik (blog) and Markku Tukiainen at the Department of Computer Science and Statistics, University of Joensuu, Finland had a paper published in the proceedings of the 16th International Conference on Computers in Education (ICCE).

"We did a simple questionnaire-based analysis. The results of the analysis show some promises for implementing gaze-augmented problem-solving interfaces. Users of gaze-augmented interaction felt more immersed than the users of other two modes - dwell-time based and computer mouse. Immersion, engagement, and user-experience in general are important aspects in educational interfaces; learners engage in completing the tasks and, for example, when facing a difficult task they do not give up that easily. We also did analysis of the strategies, and we will report on those soon. We could not attend the conference, but didn’t want to disappoint eventual audience. We thus decided to send a video instead of us." (from Roman's blog)

"The possibilities of eye-tracking technologies in educational gaming are seemingly endless. The question we need to ask is what the effects of gaze-based interaction on user experience, strategy during learning and problem solving are. In this paper we evaluate the effects of two gaze based input techniques and mouse based interaction on user experience and immersion. In a between-subject study we found that although mouse interaction is the easiest and most natural way to interact during problem-solving, gaze-based interaction brings more subjective immersion. The findings provide a support for gaze interaction methods into computer-based educational environments." Download paper as PDF.

Some of this research has also been presented within the COGAIN association, see:
  • Gowases, T. (2007). Gaze vs. Mouse: An evaluation of user experience and planning in problem solving games. Master’s thesis, May 2, 2007. Department of Computer Science, University of Joensuu, Finland. Download as PDF

Monday, November 3, 2008

The Conductor Interaction Method (Rachovides et al)

An interesting concept combining gaze input with hand gestures by Dorothy Rachovides at the Digital World Research Centre together with James Walkerdine and Peter Phillips at the Computing Department, Lancaster University.

"This article proposes an alternative interaction method, the conductor interaction method (CIM), which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing HCI methods by drawing upon techniques found in human-human interaction. It is argued that the use of a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer (especially for novice users). Both the model and an implementation of the CIM within a system are presented in this article. This system formed the basis of a number of user studies that have been performed to assess the effectiveness of the CIM, the findings of which are discussed in this work.

More specifically the CIM aims to provide the following.

—A More Natural Interface. The CIM will have an interface that utilizes gaze and gestures, but is nevertheless capable of supporting sophisticated activities. The CIM provides an interaction technique that is as natural as possible and is close to the human-human interaction methods with which users are already familiar. The combination of gaze and gestures allows the user to perform not only simple interactions with a computer, but also more complex interactions such as the selecting, editing, and placing of media objects.

—A Metaphor Supported Interface. In order to help the user understand and exploit the gaze and gesture interface, two metaphors have been developed. An orchestra metaphor is used to provide the environment in which the user interacts. A conductor metaphor is used for interacting within this environment. These two metaphors are discussed next.

—A Two-Phased Interaction Method. The CIM uses an interaction process where each modality is specific and has a particular function. The interaction between user and interface can be seen as a dialog that is comprised of two phases. In the first phase, the user selects the on-screen object by gazing at it. In the second phase, with the gesture interface the user is able to manipulate the selected object. These distinct functions of gaze and gesture aim to increase system usability, as they are based on human-human interaction techniques, and also help to overcome issues such as the Midas Touch problem that is often experienced by look-and-dwell systems. As the dialog combines two modalities in sequence, the gaze interface can be disabled after the first phase. This minimizes the possibility of accidentally selecting objects through the gaze interface. The Midas Touch problem can also be further addressed by ensuring that there is ample dead space between media objects.

—Significantly Reduced Learning Overhead. The CIM aims to reduce the overhead of learning to use the system by encouraging the use of gestures that users can easily associate with activities they perform in their everyday life. This transfer of experience can lead to a smaller learning overhead [Borchers 1997], allowing users to make the most of the system’s features in a shorter time."
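
The two-phased dialog described in the article can be pictured as a small state machine. The sketch below is only an illustration under my own assumptions, not the authors' implementation: the class name `TwoPhaseInteraction`, the phase names and the "release" gesture that returns to the selection phase are all made up for the example.

```python
class TwoPhaseInteraction:
    """Gaze selects an object (phase 1), gestures manipulate it (phase 2)."""

    def __init__(self):
        self.phase = "select"   # "select" or "manipulate"
        self.selected = None
        self.actions = []       # (object, gesture) pairs that were applied

    def on_gaze_select(self, obj):
        """Called when the gaze interface reports a selection."""
        if self.phase != "select":
            # Gaze is disabled during manipulation, so looking around
            # cannot accidentally select something else (Midas Touch).
            return False
        self.selected = obj
        self.phase = "manipulate"
        return True

    def on_gesture(self, gesture):
        """Called when the gesture interface recognises a gesture."""
        if self.phase != "manipulate":
            return False  # nothing is selected yet, so ignore gestures
        if gesture == "release":
            # Hypothetical gesture that ends manipulation and
            # re-enables gaze selection.
            self.selected = None
            self.phase = "select"
        else:
            self.actions.append((self.selected, gesture))
        return True
```

While an object is being manipulated, further gaze selections are simply rejected, which is the property the article relies on to limit accidental selections.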

Gaze and Voice Based Game Interaction (Wilcox et al., 2008)

"We present a 3rd person adventure puzzle game using a novel combination of non intrusive eyetracking technology and voice recognition for game communication. Figure 1 shows the game, and its first person sub games that make use of eye tracker functionality in contrasting ways: a catapult challenge (a) and a staring competition (b)."

"There are two different modes of control in the main game. The user can select objects by looking at them and perform ’look’, ’pickup’, ’walk’, ’speak’, ’use’ and other commands by vocalizing their respective words. Alternatively, they can perform each command by blinking and winking at objects. To play the catapult game for example, the user must look at the target and blink, wink or drag to fire a projectile towards the object under the crosshair."

Their work was presented as a poster at ACM SIGGRAPH 2008.

Sunday, October 26, 2008

Low cost open source eye tracking from Argentina

By using low cost webcams such as the Lifecam VX-100 or similar, a developer from Argentina has produced an eye tracker capable of running the Gaze Talk interface. The total cost for the eye tracker hardware is US$ 40-50. The software runs on a typical desktop or laptop computer using OpenCV-based image processing algorithms.

"My goal is to develop an open source system that enables people with severe motor disabilities to interact with the computer using their eye movements."

The project is running for another three weeks and the outcome will be very interesting. Check out the development blog at

Thursday, September 18, 2008

The Inspection of Very Large Images by Eye-gaze Control

Nicholas Adams, Mark Witkowski and Robert Spence from the Department of Electrical and Electronic Engineering at Imperial College London received the HCI 08 Award for International Excellence for work related to gaze interaction.

"The researchers presented novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. The need to inspect large images occurs in, for example, mapping, medicine, astronomy and surveillance, and this project considered the inspection of very large aerial images, held in Google Earth. Comparative search and navigation tasks suggest that, while gaze methods are effective for image navigation, they lag behind more conventional methods, so interaction designers might consider combining these techniques for greatest effect." (BCS Interaction)


The increasing availability and accuracy of eye gaze detection equipment has encouraged its use for both investigation and control. In this paper we present novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. We investigate the relative advantages and comparative properties of four related methods: Stare-to-Zoom (STZ), in which control of the image position and resolution level is determined solely by the user's gaze position on the screen; Head-to-Zoom (HTZ) and Dual-to-Zoom (DTZ), in which gaze control is augmented by head or mouse actions; and Mouse-to-Zoom (MTZ), using conventional mouse input as an experimental control.

The need to inspect large images occurs in many disciplines, such as mapping, medicine, astronomy and surveillance. Here we consider the inspection of very large aerial images, of which Google Earth is both an example and the one employed in our study. We perform comparative search and navigation tasks with each of the methods described, and record user opinions using the Swedish User-Viewer Presence Questionnaire. We conclude that, while gaze methods are effective for image navigation, they, as yet, lag behind more conventional methods and interaction designers may well consider combining these techniques for greatest effect.

This paper is the short version of Nicholas Adams' Master's thesis, which I stumbled upon before creating this blog. An early version appeared as a short paper at COGAIN06.

Monday, September 15, 2008

Apple develops gaze assisted interaction?

Apple recently registered a patent for merging several modalities, including gaze vectors, for novel interaction methods. The direction of gaze is to be used in combination with finger gestures (or other input devices) to modify the object that the user is currently looking at. It will be interesting to see what types of devices they are aiming for. It may not be high precision eye tracking, since stability and high accuracy are hard to obtain for 100% of the population in all environments.
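
As a rough sketch of what combining gaze with a touch gesture could look like for overlapping windows: pair the tap with the gaze sample closest to it in time, find the window under that gaze point, and bring it forward. This is my own illustration, not anything taken from the patent; the function names, the window tuples and the `max_gap` threshold are assumptions.

```python
def window_under_gaze(windows, gaze_samples, tap_time, max_gap=0.2):
    """Return the name of the window the user was looking at when tapping.

    windows: list ordered front to back of (name, x, y, width, height)
    gaze_samples: list of (time, x, y) tuples
    max_gap: largest allowed time difference (seconds) between the tap
             and the nearest gaze sample (hypothetical threshold)
    """
    if not gaze_samples:
        return None
    # Temporal pairing: use the gaze sample closest in time to the tap.
    t, gx, gy = min(gaze_samples, key=lambda s: abs(s[0] - tap_time))
    if abs(t - tap_time) > max_gap:
        return None  # gaze data too old or too new to trust
    # Front-to-back hit test: the first match is the visible window.
    for name, x, y, w, h in windows:
        if x <= gx < x + w and y <= gy < y + h:
            return name
    return None


def bring_to_front(windows, name):
    """Return a new front-to-back list with the named window first."""
    front = [w for w in windows if w[0] == name]
    rest = [w for w in windows if w[0] != name]
    return front + rest
```

No pointer movement is involved: the gaze supplies the "where" and the tap supplies the "when" and the "what".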

From the patent document:
"There are many possible applications that would benefit from the temporal fusion of gaze vectors with multi-touch movement data. For the purpose of example, one simple application will be discussed here: Consider a typical computer screen, which has several windows displayed. Assume that the user wishes to bring forward the window in the lower left corner, which is currently underneath two other windows. Without gaze vector fusion there are two means to do this, and both involve movement of the hand to another position. The first means is to move the mouse pointer over the window of interest and click the mouse button. The second means is to use a hot-key combination to cycle through the screen windows until the one of interest is brought forward. Voice input could also be used but it would be less efficient than the other means. With gaze vector fusion, the task is greatly simplified. For example, the user directs his gaze to the window of interest and then taps a specific chord on the multi-touch surface. The operation requires no translation of the hands and is very fast to perform."

"For another example, assume the user wishes to resize and reposition an iTunes window positioned in the upper left of a display screen. This can be accomplished using a multi-touch system by moving the mouse pointer into the iTunes window and executing a resize and reposition gesture. While this means is already an improvement over using just a mouse its efficiency can be further improved by the temporal fusion of gaze vector data. "

TeleGaze (Hemin, 2008)

"This research investigates the use of eye-gaze tracking in controlling the navigation of mobile robots remotely through a purpose built interface that is called TeleGaze. Controlling mobile robots from a remote location requires the user to continuously monitor the status of the robot through some sort of feedback system. Assuming that a vision-based feedback system is used such as video cameras mounted onboard the robot; this requires the eyes of the user to be engaged in the monitoring process throughout the whole duration of the operation. Meanwhile, the hands of the user need to be engaged, either partially or fully, in the driving task using any input devices. Therefore, the aim of this research is to build a vision based interface that enables the user to monitor as well as control the navigation of the robot using only his/her eyes as inputs to the system since the eyes are engaged in performing some tasks anyway. This will free the hands of the user for other tasks while controlling the navigation is done through the TeleGaze interface. "

The TeleGaze experimental platform consists of a mobile robot, eye gaze tracking equipment and a teleoperation station that the user interacts with. The TeleGaze interface runs on the teleoperation station PC and interprets inputs from the eyes into control commands. Meanwhile, it presents the user with the images that come back from the vision system mounted on the robotic platform.

More information at Hemin Sh. Omer's website.

Associated publications:
  • Hemin Omer Latif, Nasser Sherkat and Ahmad Lotfi, "TeleGaze: Teleoperation through Eye Gaze", 7th IEEE International Conference on Cybernetic Intelligent Systems 2008, London, UK. Conference website:
  • Hemin Omer Latif, Nasser Sherkat and Ahmad Lotfi, "Remote Control of Mobile Robots through Human Eye Gaze: The Design and Evaluation of an Interface", SPIE Europe Security and Defence 2008, Cardiff, UK. Conference website:

COGAIN 2008 Proceedings now online


Overcoming Technical Challenges in Mobile and Other Systems
  • Off-the-Shelf Mobile Gaze Interaction
    J. San Agustin and J. P. Hansen, IT University of Copenhagen, Denmark
  • Fast and Easy Calibration for a Head-Mounted Eye Tracker
    C. Cudel, S. Bernet, and M. Basset, University of Haute Alsace, France
  • Magic Environment
    L. Figueiredo, T. Nunes, F. Caetano, and A. Gomes, ESTG/IPG, Portugal
  • AI Support for a Gaze-Controlled Wheelchair
    P. Novák, T. Krajník, L. Přeučil, M. Fejtová, and O. Štěpánková, Czech Technical University, Czech Republic
  • A Comparison of Pupil Centre Estimation Algorithms
    D. Droege, C. Schmidt, and D. Paulus, University of Koblenz-Landau, Germany

Broadening Gaze-Based Interaction Techniques
  • User Performance of Gaze-Based Interaction with On-line Virtual Communities
    H. Istance, De Montfort University, UK, A. Hyrskykari, University of Tampere, Finland, S. Vickers, De Montfort University, UK and N. Ali, University of Tampere, Finland

  • Multimodal Gaze Interaction in 3D Virtual Environments
    E. Castellina and F. Corno, Politecnico di Torino, Italy
  • How Can Tiny Buttons Be Hit Using Gaze Only?
    H. Skovsgaard and J. P. Hansen, IT University of Copenhagen, Denmark, and J. Mateo, Wright State University, Ohio, US
  • Gesturing with Gaze
    H. Heikkilä, University of Tampere, Finland
  • NeoVisus: Gaze Driven Interface Components
    M. Tall, Sweden

Focusing on the User: Evaluating Needs and Solutions
  • Evaluations of Interactive Guideboard with Gaze-Communicative Stuffed-Toy Robot
    T. Yonezawa, H. Yamazoe, A. Utsumi, and S. Abe, ATR Intelligent Robotics and Communications Laboratories, Japan
  • Gaze-Contingent Passwords at the ATM
    P. Dunphy, A. Fitch, and P. Oliver, Newcastle University, UK
  • Scrollable Keyboards for Eye Typing
    O. Špakov and P. Majaranta, University of Tampere, Finland
  • The Use of Eye-Gaze Data in the Evaluation of Assistive Technology Software for Older People.
    S. Judge, Barnsley District Hospital Foundation, UK and S. Blackburn, Sheffield University, UK
  • A Case Study Describing Development of an Eye Gaze Setup for a Patient with 'Locked-in Syndrome' to Facilitate Communication, Environmental Control and Computer Access.
    Z. Robertson and M. Friday, Barnsley General Hospital, UK

Friday, September 12, 2008

COGAIN 2008 Video

Some highlights from the visit to COGAIN 2008 last week in Prague, which was a great event. The video shows the mobile solution integrating a head mounted display and an eye tracker by Javier San Agustín, a sneak peek of the NeoVisus iTube interface running on the SMI IViewX RED, a demonstration of the Neural Impulse Actuator from OCZ Technologies by Henrik Skovsgaard, and a demo of the gaze controlled wheelchair developed by Falck Igel and Alea Technologies. Thanks to John Paulin Hansen for creating the video.

Thursday, August 28, 2008

Mixed reality systems for technical maintenance and gaze-controlled interaction (Gustafsson et al)

To follow up on the wearable display with an integrated eye tracker, one possible application is in the domain of mixed reality. This allows for interfaces to be projected on top of a video stream (i.e. the "world view"), thus blending the physical and virtual worlds. The paper below investigates how this could be used to assist technical maintenance of advanced systems such as fighter jets. It's an early prototype but the field is very promising, especially when an eye tracker is involved.

"The purpose of this project is to build up knowledge about how future Mixed Reality (MR) systems should be designed concerning technical solutions, aspects of Human-Machine-Interaction (HMI) and logistics. The report describes the work performed in phase 2. Regarding hardware a hand-held MR-unit, a wearable MR-system and a gaze-controlled MR-unit have been developed. The work regarding software has continued with the same software architecture and MR-tool as in the former phase 1. A number of improvements, extensions and minor changes have been conducted as well as a general update. The work also includes experiments with two test case applications, "Turn-Round af Gripen (JAS)" and "Starting Up Diathermy Apparatus". Comprehensive literature searches and surveys of knowledge of HMI aspects have been conducted, especially regarding gaze-controlled interaction. The report also includes a brief overview of other projects within the area of Mixed Reality."
  • Gustafsson, T., Carleberg, P., Svensson, P., Nilsson, S., Le Duc, M., Sivertun, Å., Mixed Reality Systems for Technical Maintenance and Gaze-Controlled Interaction. Progress Report Phase 2 to FMV., 2005. Download paper as PDF

Sunday, August 24, 2008

Nokia Research: Near Eye Display with integrated eye tracker

During my week in Tampere I had the opportunity to visit Nokia Research to get hands-on with a prototype that integrates a head mounted display with an eye tracker. Due to an NDA I am unable to reveal the contents of the discussion, but it does work and it was a very neat experience with great potential. I would love to see a commercial application down the road. For more information there is a paper available:
Hands-On with the Nokia NED w/ integrated eye tracker

Paper abstract:
"Near-to-Eye Display (NED) offers a big screen experience to the user anywhere, anytime. It provides a way to perceive a larger image than the physical device itself is. Commercially available NEDs tend to be quite bulky and uncomfortable to wear. However, by using very thin plastic light guides with diffractive structures on the surfaces, many of the known deficiencies can be notably reduced. These Exit Pupil Expander (EPE) light guides enable a thin, light, user friendly and high performing see-through NED, which we have demonstrated. To be able to interact with the displayed UI efficiently, we have also integrated a video-based gaze tracker into the NED. The narrow light beam of an infrared light source is divided and expanded inside the same EPEs to produce wide collimated beams out from the EPE towards the eyes. Miniature video camera images the cornea and eye gaze direction is accurately calculated by locating the pupil and the glints of the infrared beams. After a simple and robust per-user calibration, the data from the highly integrated gaze tracker reflects the user focus point in the displayed image which can be used as an input device for the NED system. Realizable applications go from eye typing to playing games, and far beyond."

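The abstract above mentions locating the pupil and the glints followed by a per-user calibration. A heavily simplified sketch of that general idea (not Nokia's algorithm, which is not described in any detail here) is to take the vector from a glint to the pupil centre and fit a linear mapping from that vector to display coordinates using a few calibration points. The function names and the choice of a separate straight-line fit per axis are my assumptions; real trackers typically use richer models.

```python
def fit_axis(values, targets):
    """Least-squares fit of target = gain * value + offset."""
    n = len(values)
    mean_v = sum(values) / n
    mean_t = sum(targets) / n
    var = sum((v - mean_v) ** 2 for v in values)
    cov = sum((v - mean_v) * (t - mean_t) for v, t in zip(values, targets))
    gain = cov / var
    return gain, mean_t - gain * mean_v


def calibrate(samples):
    """Build a gaze estimator from calibration samples.

    samples: list of (pupil, glint, screen) where each item is an (x, y)
    pair; pupil and glint are in camera pixels, screen in display pixels.
    Returns a function estimate(pupil, glint) -> (screen_x, screen_y).
    """
    # The pupil-minus-glint vector changes with eye rotation but stays
    # the same when pupil and glint shift together in the camera image.
    dx = [p[0] - g[0] for p, g, _ in samples]
    dy = [p[1] - g[1] for p, g, _ in samples]
    gain_x, off_x = fit_axis(dx, [s[0] for _, _, s in samples])
    gain_y, off_y = fit_axis(dy, [s[1] for _, _, s in samples])

    def estimate(pupil, glint):
        return (gain_x * (pupil[0] - glint[0]) + off_x,
                gain_y * (pupil[1] - glint[1]) + off_y)

    return estimate
```

The user would look at a handful of known points on the display to collect the samples; after that, `estimate` turns each new pupil and glint position into a point in the displayed image.
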
GaCIT in Tampere, day 5.

On Friday, the last day of GaCIT, Ed Cutrell from Microsoft Research gave a talk on usability evaluation and how eye tracking can deliver a deeper understanding. While it has been somewhat abused to convince management with pretty pictures of heat maps, it adds value to a design inquiry as an additional source of behavioral evidence. Careful consideration of the experiment design is needed. Studies in the lab sometimes lack the ecological validity of real in-the-field research; more on this further on.

The ability to manipulate independent variables, enforce consistency and maintain control are important concerns. For example, running a web site test against the live site may produce faulty data since the content of the site may change between visits. This is referred to as stimulus sensitivity; controlling for it increases between-subject power since all subjects are exposed to exactly the same stimuli. Another issue is task sensitivity. The task must reflect what the results are supposed to illustrate (e.g. reading a text does not contain elements of manipulation). People are in general very task oriented; when instructed to read they will ignore certain elements (e.g. banners).

A couple of real world examples including the Fluent UI (Office 2007), Phlat and Search Engine Results Pages (SERP) were introduced.

The Fluent UI is the new interface used in Office 2007. It represents a big change compared with the traditional Office interface. The Fluent UI is task and context dependent, compared to the rather static traditional setup of menubars and icons cluttering the screen.

Example of the Fluent UI (Microsoft, 2008)

The use of eye trackers illustrated how users interacted with the interface. This may not always occur in the manner the designer intended. Visualization of eye movement gives developers and designers a lot of instant aha-experiences.

At Microsoft it is common to work with personas in multiple categories. These are abstract representations of user groups that help to illustrate the lives and needs of "typical" users. For example, Nicolas is a tech-savvy IT professional while Jennifer is a young hip girl who spends a lot of time on YouTube or hangs around town with her shiny iPod (err.. Zune that is).

More information on the use of personas as a design method:
  • J. Grudin, J. Pruitt (2002) Personas, Participatory Design and Product Development: An Infrastructure for Engagement (Microsoft Research) Download as Word doc.

  • J. Grudin (2006) Why Personas Work: The Psychological Evidence (Microsoft Research) Download as Word doc.
Moving on, the Phlat project aims at solving the issues surrounding navigating and searching large amounts of personal data, sometimes up to 50GB of data. Eye trackers were used to evaluate the users' behavior against the interface. Since the information managed using the application is personal, there were several privacy issues. Copying all the information onto the computers in the lab was not a feasible solution. Instead the participants used the Remote Desktop functionality, which allowed the lab computers to be hooked up with the participants' personal computers. The eye trackers then recorded the local monitor which displayed the remote computer screen. This gives much higher ecological validity since the information used has personal/affective meaning.

Phlat - interface for personal information navigation and search (Microsoft)
Eye trackers have been used to evaluate websites in several projects, such as J. Nielsen's F-Shaped Pattern For Reading Web Content and Enquiro's Search Engine Results (Golden Triangle). Ed Cutrell decided to investigate how search engine results pages are viewed and what strategies users employ. The results gave some interesting insight into how the decision making process works and which links are seen vs. clicked. Much of the remaining part of the talk was concerned with the design, execution and results of the study, great stuff!

Unfortunately I had to catch a flight back home in the afternoon so I missed Howell Istance's last talk. However, I'll get a new opportunity to hear one of his excellent presentations in a week's time at COGAIN 2008.

GaCIT 2008. B. Velichkovsky associated literature

Experimental effects, e.g. distractor effect
  • Pannasch, S., Dornhoefer, S.M., Unema, P.J.A. & Velichkovsky, B.M. (2001). The omnipresent prolongation of visual fixations: saccades are inhibited by changes in situation and in subject's activity. Vision Research. 41(25-26), 3345-51. Download Full Article [PDF]
  • Velichkovsky, B.M., Dornhoefer, S.M., Kopf, M., Helmert, J. & Joos, M. (2002). Change detection and occlusion modes in static and dynamic road-traffic scenarios. Transportation Research, Part F. 5(2), 99-109. Download Full Article [PDF]
Ambient vs. Focal
  • Velichkovsky, B.M., Rothert, A., Kopf, M., Dornhoefer, S.M. & Joos, M. (2002). Towards an express diagnostics for level of processing and hazard perception. Transportation Research, Part F. 5(2), 145-156. Download Full Article [PDF]
  • Unema, P., Pannasch, S., Joos, M. & Velichkovsky, B.M. (2005). Time-course of information processing during scene perception: The relationship between saccade amplitude and fixation duration. Visual Cognition, 12(3), 473-494. Download Full Article [PDF]
  • Velichkovsky, B.M., Joos, M., Helmert, J.R., & Pannasch, S. (2005). Two visual systems and their eye movements: evidence from static and dynamic scene perception. CogSci 2005: Proceedings of the XXVII Conference of the Cognitive Science Society. July 21-23 Stresa, Italy, pp. 2283-2288. Download Full Article [PDF]
Levels of processing
  • Velichkovsky, B.M. (2005). Modularity of cognitive organization: Why it is so appealing and why it is wrong. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: Understanding the development and evolution of natural complex systems. Cambridge, MA: MIT Press.
  • Graupner, S. T., Velichkovsky, B. M., Pannasch, S., & Marx, J. (2007). Surprise, surprise: Two distinct components in the visually evoked distractor effect. Psychophysiology, 44(2), 251-261. Download Full Article [PDF]
  • Velichkovsky, B.M. (2007). Towards an Evolutionary Framework for Human Cognitive Neuroscience. Theoretical Biology, 2(1), 3-6. Download Full Article [PDF]
  • Velichkovsky, B.M. (2002). Heterarchy of cognition: The depths and the highs of a framework for memory research. Memory, 10(5/6), 405-419 (Special Issue on Levels-of-Processing-Approach).

Saturday, August 23, 2008

GaCIT in Tampere, day 4.

A follow-up to the hands-on session was held by Andrew Duchowski, this time to investigate eye movements on moving stimuli (i.e. video clips). A classic experiment from the cognitive science domain was used as the stimulus (the umbrella woman). It serves as a very nice example of how to use eye trackers in a practical experiment.

The task is to count the passes of either the white or the black team. The experiment illustrates inattentional blindness, which causes certain objects in the movie to go unnoticed.
More information on the phenomenon can be found in the following papers:
  • Becklen, Robert and Cervone, Daniel (1983) Selective looking and the noticing of unexpected events. Memory and Cognition, 11, 601-608.
  • Simons, Daniel J. and Chabris, Christopher F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events, Perception, 28, pp.1059-1074.
  • Rensink, Ronald A. (2000). When Good Observers Go Bad: Change Blindness, Inattentional Blindness, and Visual Experience, Psyche, 6(09), August 2000. Commentary on: A. Mack and I. Rock (1998) Inattentional Blindness. MIT Press.
Defining areas of interest (AOI) in video often requires the tedious process of keyframing, where the object has to be outlined in each frame. Automatic matchmoving/rotoscoping software does exist, but it often does not produce a perfect segmentation of the moving objects. Dixon et al. have published research in this area.
The afternoon was devoted to participant presentations, which covered a rather wide range of topics: visual cognition, expert vs. novice gaze patterns, gaze interaction, HCI and usability research.
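One common way to reduce the keyframing effort described above is to outline the object only on a few frames and interpolate in between. The following is a minimal sketch of that idea, not something shown at the session: the `Box` type, the function names and the linear interpolation are all my own assumptions, and a real moving object will rarely follow a straight line between keyframes.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned AOI rectangle in pixel coordinates (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float


def interpolate_aoi(keyframes, frame):
    """Linearly interpolate the AOI box for `frame` from sparse keyframes.

    `keyframes` maps frame numbers to Box instances and must not be empty.
    Frames outside the keyframed range reuse the nearest keyframe.
    """
    frames = sorted(keyframes)
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    for start, end in zip(frames, frames[1:]):
        if start <= frame <= end:
            t = (frame - start) / (end - start)
            a, b = keyframes[start], keyframes[end]
            return Box(
                a.x + t * (b.x - a.x),
                a.y + t * (b.y - a.y),
                a.w + t * (b.w - a.w),
                a.h + t * (b.h - a.h),
            )


def gaze_in_aoi(box, gx, gy):
    """Return True if the gaze sample (gx, gy) falls inside the box."""
    return box.x <= gx <= box.x + box.w and box.y <= gy <= box.y + box.h


# Two hand-made keyframes; every frame in between is derived from them.
keyframes = {0: Box(0, 0, 10, 10), 10: Box(100, 0, 10, 10)}
box_at_5 = interpolate_aoi(keyframes, 5)
```

With only two keyframes, each gaze sample recorded on frames 0 to 10 can be tested against the interpolated box with `gaze_in_aoi`.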

GaCIT in Tampere, day 3.

In the morning Howell Istance of De Montfort University, currently at the University of Tampere, gave a very interesting lecture on gaze interaction. It was divided into three parts: 1) games 2) mobile devices 3) stereoscopic displays

Games are an area of gaze interaction with high potential, and since the gaming industry has grown to be a huge industry, it may help to make eye trackers more accessible and affordable. This development would be beneficial for users with motor impairments. A couple of example implementations were then introduced. The first one was a first-person shooter running on an Xbox 360:
The experimental evaluation contained 10 repeated trials to look at learning (6 subjects). Three different configurations were used: 1) gamepad controller for moving and aiming (no gaze), 2) gamepad controller for moving with gaze aiming and 3) gamepad controller for moving forward only, with gaze used for both aiming and steering.
However, twice as many missed shots were fired in the gaze condition, which can be described as a "machine gun" approach. It is noteworthy that no filtering was applied to the gaze position.
Howell has conducted an analysis of common tasks in gaming. Below is a representation of the number of actions in the Guild Wars game. The two bars indicate 1) novices and 2) experienced users.
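For readers wondering what filtering the gaze position could look like, here is a minimal sketch of a moving-average filter. This is only one simple option and was not part of the study described above; the class name and the window size are my own assumptions.

```python
from collections import deque


class MovingAverageGazeFilter:
    """Smooth raw gaze samples by averaging the last `window` positions."""

    def __init__(self, window=5):
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest sample automatically
        self._samples = deque(maxlen=window)

    def update(self, x, y):
        """Add a raw gaze sample and return the smoothed (x, y) position."""
        self._samples.append((x, y))
        n = len(self._samples)
        mean_x = sum(s[0] for s in self._samples) / n
        mean_y = sum(s[1] for s in self._samples) / n
        return (mean_x, mean_y)


gaze_filter = MovingAverageGazeFilter(window=3)
smoothed = [gaze_filter.update(x, y) for x, y in [(0, 0), (3, 3), (6, 6), (9, 9)]]
```

A larger window gives a steadier crosshair but makes it lag behind fast eye movements, so the choice is a trade-off between jitter and responsiveness.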

Controlling all of these different actions requires switching between task modes. This is very challenging when there is only one input modality (gaze) and no method of "clicking".

There are several ways a gaze interface can be constructed. Taking a bottom-up approach: first, the position of gaze can be used to emulate the mouse cursor (on a system level). Second, a transparent overlay can be placed on top of the application. Third, a specific gaze interface can be developed (which has been my own approach). This requires a modification of the original application, which is not always possible.

The Snap/Clutch interaction method, developed by Stephen Vickers who is working with Howell, operates on the system level to emulate the mouse. It allows specific gaze gestures to be interpreted and used to switch mode. For example, a quick glance to the left of the screen will activate a left mouse button click mode. When an eye fixation is then detected in a specific region, a left mouse click will be issued to that area.
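The moded idea can be sketched as a small state machine. This is my own illustration and not the Snap/Clutch implementation: the mode names, the off-screen glance test and the mapping from glance direction to mode are all assumptions.

```python
def detect_glance(x, y, width, height):
    """Return the direction of an off-screen glance, or None if on screen.

    Treating a glance as gaze leaving the screen bounds is an assumption.
    """
    if x < 0:
        return "left"
    if x >= width:
        return "right"
    if y < 0:
        return "up"
    if y >= height:
        return "down"
    return None


class ModedGazeMouse:
    """Tiny state machine: glances switch mode, fixations act in that mode."""

    # Hypothetical mapping from glance direction to interaction mode.
    GLANCE_MODES = {"left": "left_click", "up": "move_only"}

    def __init__(self):
        self.mode = "move_only"

    def on_glance(self, direction):
        """Switch mode if the glance direction is bound to one."""
        self.mode = self.GLANCE_MODES.get(direction, self.mode)

    def on_fixation(self, x, y):
        """Return the mouse action to emit for a fixation at (x, y)."""
        if self.mode == "left_click":
            return ("left_click", x, y)
        return ("move", x, y)


mouse = ModedGazeMouse()
before = mouse.on_fixation(400, 300)            # cursor only follows gaze
mouse.on_glance(detect_glance(-20, 300, 1280, 1024))
after = mouse.on_fixation(400, 300)             # same fixation now clicks
```

Because a fixation only clicks after an explicit mode switch, simply looking at something does not trigger it, which is the point of a moded approach to the Midas Touch problem.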

When this is applied to games such as World of Warcraft (demo), specific regions of the screen can be used to issue movement actions in the corresponding direction. The image below illustrates these regions overlaid on the screen. When a fixation is detected in the A region, an action to move in that direction is issued to the game itself.

Stephen Vickers' gaze-driven World of Warcraft interface.
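Mapping a fixation to a movement region could be sketched as follows. The region layout and the action names here are made up for illustration and do not reproduce the regions in the actual interface.

```python
# Hypothetical regions in normalized screen coordinates (x0, y0, x1, y1).
REGIONS = {
    "forward": (0.25, 0.0, 0.75, 0.2),
    "backward": (0.25, 0.8, 0.75, 1.0),
    "turn_left": (0.0, 0.2, 0.2, 0.8),
    "turn_right": (0.8, 0.2, 1.0, 0.8),
}


def action_for_fixation(x, y, width, height):
    """Return the movement action for a fixation at pixel (x, y), or None."""
    nx, ny = x / width, y / height
    for action, (x0, y0, x1, y1) in REGIONS.items():
        if x0 <= nx < x1 and y0 <= ny < y1:
            return action
    return None  # fixation in the free central area: no movement


top = action_for_fixation(960, 50, 1920, 1080)
left = action_for_fixation(100, 540, 1920, 1080)
center = action_for_fixation(960, 540, 1920, 1080)
```

Using normalized coordinates keeps the layout independent of screen resolution, and leaving the center free lets the player look at the game world without moving.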

After lunch we had a hands-on session with the Snap/Clutch interaction method, where eight Tobii eye trackers were used for a multiplayer round of WoW! It is very different from a traditional mouse/keyboard setup and takes some time to get used to.

  • Istance, H.O., Bates, R., Hyrskykari, A. and Vickers, S. Snap Clutch, a Moded Approach to Solving the Midas Touch Problem. Proceedings of the 2008 symposium on Eye Tracking Research & Applications; ETRA 2008. Savannah, GA. 26th-28th March 2008. Download
  • Bates, R., Istance, H.O., and Vickers, S. Gaze Interaction with Virtual On-Line Communities: Levelling the Playing Field for Disabled Users. Proceedings of the 4th Cambridge Workshop on Universal Access and Assistive Technology; CWUAAT 2008. University of Cambridge, 13th-16th April 2008. Download

The second part of the lecture concerned gaze interaction for mobile phones. This allows for ubiquitous computing where the eye tracker is integrated with a wearable display. As a new field it is surrounded by certain issues (stability, processing power, variation in lighting etc.), all of which will be solved over time. The big question is what the "killer application" will be (entertainment?). A researcher from Nokia attended the lecture and introduced a prototype system. Luckily I had the chance to visit their research department the following day to get hands-on with their head-mounted display with an integrated eye tracker (more on this in another post).

The third part was about stereoscopic displays, which add a third dimension (depth) to the traditional X and Y axes. There are several projects around the world working towards making this an everyday reality. However, tracking the depth of gaze fixation is limited. The vergence eye movements (as seen in the distance between both pupils) are hard to measure when the distance to objects exceeds two meters.

Calculating convergence angles
d = 100 cm: tan θ = 3.3 / 100; θ ≈ 1.89 deg.
d = 200 cm: tan θ = 3.3 / 200; θ ≈ 0.95 deg.
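The calculation above can be reproduced with a few lines of Python. I am assuming here that 3.3 cm is half the distance between the eyes, so that θ is the inward rotation of one eye; the notes do not say this explicitly, and the function name is my own.

```python
import math


def convergence_angle_deg(half_eye_distance_cm, distance_cm):
    """Angle (degrees) one eye rotates inward to fixate at `distance_cm`.

    Uses tan(theta) = half_eye_distance / distance, as in the notes above.
    """
    return math.degrees(math.atan(half_eye_distance_cm / distance_cm))


angle_1m = convergence_angle_deg(3.3, 100)
angle_2m = convergence_angle_deg(3.3, 200)
angle_4m = convergence_angle_deg(3.3, 400)
```

Doubling the distance from one to two meters changes the angle by less than one degree, and each further doubling roughly halves the angle again, which illustrates why depth from vergence becomes hard to measure at larger distances.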

Related papers on stereoscopic eye tracking:
The afternoon was spent with a guided tour around Tampere followed by a splendid dinner at a "viking" themed restaurant.

GaCIT in Tampere, day 2.

The second day of GaCIT in Tampere started off with a hands-on lab by Andrew Duchowski. This session followed up on the introduction the day before. The software of choice was Tobii Studio, which is an integrated solution for displaying stimuli and visualizing eye movements (scanpaths, heat-maps etc.). Multiple types of stimuli can be used, including text, images, video, websites etc.

The "experiment" consisted of two ads shown below. The hypothesis to be investigated was that the direction of gaze would attract more attention towards the text compared to the picture where the baby is facing the camera.

After calibrating the user, the stimulus is observed for a specific amount of time. When the recording has completed, a replay of the eye movements can be visually overlaid on top of the stimuli. Furthermore, several recordings can be incorporated into one clip. Indeed, the results indicate support for the hypothesis. Simply put, faces attract attention and the direction of gaze guides it further.

After lunch Boris Velichkovsky gave a lecture on cognitive technologies. After a quick recap of the previous day's talk about the visual system, the NBIC report was introduced. This concerns the converging technologies of Nano-, Bio-, Information Technology and Cognitive Science.

Notable advances in these fields include the Z3 computer (Infotech, 1941), DNA (Bio, 1953), the Computed Tomography scan (Nano, 1972) and Short Term Memory (CogSci, 1968), all of which have dramatically improved human understanding and capabilities.

Another interesting topic concerned the superior visual recognition skills humans have. Research has demonstrated that we are able to recognize up to 2000 photos after two weeks with 90% accuracy. Obviously the visual system is our strongest sense, yet much of our computer interaction as a whole is driven by a one-way information flow. Taking the advances in bi-directional OLED microdisplays into account, the field of augmented reality has a bright future. These devices act as a camera and display information at the same time. Add an eye tracker to the device and we have some really interesting opportunities.

Boris also discussed the research of Jaak Panksepp concerning the basic emotional systems in mammals (emo-systems, anatomical areas and neurotransmitters for modulation).

To sum up, the second day was diverse in topics but nonetheless interesting, and it demonstrated the diversity of knowledge and skills needed by today's researchers.

Tuesday, August 19, 2008

GaCIT in Tampere, day 1.

The first day of the summer school on gaze, communication and interaction technology (GACIT, pronounced gaze-it) was started off with a talk by Boris Velichkovsky. The topic was visual cognition, eye movement and attention. These are some of the notes I made.

Some basic findings of the visual system were introduced. In general, the visual system is divided into two pathways, the dorsal and ventral system. The dorsal goes from the striate cortex (back of the brain) and upwards (towards the posterior parietal). This pathway of visual information concerns the spatial arrangement of objects in our environment. Hence, it is commonly termed the "where" pathway. The other visual pathway goes towards the temporal lobes (just above your ears) and concerns the shape and identification of specific objects, this is the "what" pathway. The ambient system responds early (0-250ms.) after which the focal system takes over.

These two systems are represented by the focal (what) and ambient (where) attention systems. The ambient system has an overall crude but fast response in lower luminance; the focal attention system works the opposite way, with fine but slow spatial resolution. Additionally, the talk covered cognitive models of attention such as those of Posner, Broadbent etc. (see attention on Wikipedia).

A great deal of the talk concerned the freezing effect (inhibition of saccades and a prolonged fixation), which can to some extent be predicted. The onset of a dangerous "event" can be seen before the actual response (the prolonged fixation). Just before the fixation (500 ms), the prediction can be made with a 95% success rate. The inhibition comes in two waves, where the first one is issued by the affective response of the amygdala (after 80 ms), which acts on the superior colliculus to inhibit near saccades. A habituating effect on this affective response can be seen, where the second wave of inhibition (+170 ms) becomes less apparent; the initial response is however unaffected.

While driving a car and talking on the phone the lack of attention leads to eye movements with shorter fixation durations. This gives an approximated spatial localization of objects. It is the combination of a) duration of the fixation and b) the surrounding saccades that determines the quality of recognition. A short fixation followed by a long subsequent saccade leads to low recognition results. A short fixation followed by a short saccade gives higher recognition scores. A long fixation followed by either short or long saccades leads to equally high recognition results.
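The four combinations described above boil down to a small decision rule, sketched here in Python. The talk gave no numeric cut-offs for "short" and "long" in my notes, so the thresholds are passed in by the caller and the values in the example are placeholders only.

```python
def predicted_recognition(fixation_ms, next_saccade_deg,
                          long_fixation_ms, long_saccade_deg):
    """Return "high" or "low" recognition following the rules in the notes.

    The two threshold arguments are caller-supplied assumptions; the notes
    do not define what counts as a long fixation or a long saccade.
    """
    long_fixation = fixation_ms >= long_fixation_ms
    long_saccade = next_saccade_deg >= long_saccade_deg
    # Only a short fixation followed by a long saccade gives low recognition.
    if not long_fixation and long_saccade:
        return "low"
    return "high"


# Placeholder thresholds: 200 ms and 5 degrees are NOT from the talk.
LONG_FIX_MS, LONG_SACC_DEG = 200, 5
short_then_long = predicted_recognition(120, 9, LONG_FIX_MS, LONG_SACC_DEG)
short_then_short = predicted_recognition(120, 2, LONG_FIX_MS, LONG_SACC_DEG)
long_then_long = predicted_recognition(400, 9, LONG_FIX_MS, LONG_SACC_DEG)
```

Written this way it becomes clear that the saccade amplitude only matters when the preceding fixation is short.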

Furthermore, a short saccade within the parafoveal region leads to a high level of neural activity (EEG) after 90 ms. This differs from long saccades, which give no noticeable variance in cortical activity (compared to the baseline).

However, despite the classification into two major visual systems, the attentional system can be divided into 4-6 layers of organization. Hence there is no single point of attention. These layers have developed during the evolution of the mind to support various cognitive demands.

For example, the role of emotional responses in social communication can be seen in the strong response to facial expressions. Studies have shown that males respond extremely fast to face-to-face images of other males expressing aggressive facial gestures. This low-level response happens much faster than our conscious awareness (as low as 80 ms, if I recall correctly). Additionally, the eyes are much faster than we can consciously comprehend, as we are not aware of all the eye movements our eyes perform.

In the afternoon Andrew Duchowski from Clemson University gave a talk about eye tracking and eye movement analysis. Various historical apparatus and techniques were introduced (such as infrared corneal reflection), followed by a research methodology and guidelines for conducting research. A practical example of a study conducted by Mr. Nalangula at Clemson was described. It compared experts' vs. novices' viewing of erroneous circuit boards. Results indicate that novices who were shown the experts' scanpath performed better (i.e. detected more errors) than those who received no training. A few guidelines on how to use visualization were shown (clusters, heatmaps etc.).

The day ended with a nice dinner and a traditional Finnish smoke sauna followed by a swim in the lake. Thanks go to the Uni. Tampere UCIT group for my best sauna experience to date.

Tuesday, July 22, 2008

Eye gestures (Hemmert, 2007)

Fabian Hemmert at the Potsdam University of Applied Sciences published his MA thesis in 2007. He put up a site with extensive information and demonstrations of his research on eye gestures such as winks, squints, blinks etc. See the videos or thesis. Good work and great approach!

One example:

"Looking with one eye is a simple action. Seeing the screen with only one eye might therefore be used to switch the view to an alternate perspective on the screen contents: a filter for quick toggling. In this example, closing one eye filters out information on screen to a subset of the original data, such as an overview over the browser page or only the five most recently edited files. It was to see how the users would accept the functionality at the cost of having to close one eye, a not totally natural action." (Source)

Monday, July 21, 2008

SMI Experiment Suite 360

The video demonstrates the easy workflow of Experiment Suite 360: Experiment Builder, iView RED: X Non-Invasive Eye Tracker and BeGaze Analysis Software. It provides a set of examples of what eye tracking can be used for. Furthermore, the remote-based system (IView RED) is the same eye tracker that was used for developing the NeoVisus prototype (although the interface works on multiple systems).

Tuesday, July 15, 2008

Sébastien Hillaire at IRISA Rennes, France

Sébastien Hillaire is a Ph.D. student at IRISA Rennes in France and a member of BUNRAKU and France Telecom R&D. His work centers on using eye trackers to improve depth-of-field effects in 3D environments. He has published two papers on the topic:

Automatic, Real-Time, Depth-of-Field Blur Effect for First-Person Navigation in Virtual Environment (2008)

"We studied the use of visual blur effects for first-person navigation in virtual environments. First, we introduce new techniques to improve real-time Depth-of-Field blur rendering: a novel blur computation based on the GPU, an auto-focus zone to automatically compute the user’s focal distance without an eye-tracking system, and a temporal filtering that simulates the accommodation phenomenon. Secondly, using an eye-tracking system, we analyzed users’ focus point during first-person navigation in order to set the parameters of our algorithm. Lastly, we report on an experiment conducted to study the influence of our blur effects on performance and subjective preference of first-person shooter gamers. Our results suggest that our blur effects could improve fun or realism of rendering, making them suitable for video gamers, depending however on their level of expertise."

Screenshot from the algorithm implemented in Quake 3 Arena.

  • Sébastien Hillaire, Anatole Lécuyer, Rémi Cozot, Géry Casiez
    Automatic, Real-Time, Depth-of-Field Blur Effect for First-Person Navigation in Virtual Environment. To appear in IEEE Computer Graphics and Application (CG&A), 2008 , pp. ??-??
    Source code (please refer to my IEEE VR 2008 publication)

Using an Eye-Tracking System to Improve Depth-of-Field Blur Effects and Camera Motions in Virtual Environments (2008)

"We describe the use of user’s focus point to improve some visual effects in virtual environments (VE). First, we describe how to retrieve user’s focus point in the 3D VE using an eye-tracking system. Then, we propose the adaptation of two rendering techniques which aim at improving users’ sensations during first-person navigation in VE using his/her focus point: (1) a camera motion which simulates eyes movement when walking, i.e., corresponding to vestibulo-ocular and vestibulocollic reflexes when the eyes compensate body and head movements in order to maintain gaze on a specific target, and (2) a Depth-of-Field (DoF) blur effect which simulates the fact that humans perceive sharp objects only within some range of distances around the focal distance.

Second, we describe the results of an experiment conducted to study users’ subjective preferences concerning these visual effects during first-person navigation in VE. It showed that participants globally preferred the use of these effects when they are dynamically adapted to the focus point in the VE. Taken together, our results suggest that the use of visual effects exploiting users’ focus point could be used in several VR applications involving first-person navigation such as the visit of architectural site, training simulations, video games, etc."

Sébastien Hillaire, Anatole Lécuyer, Rémi Cozot, Géry Casiez
Using an Eye-Tracking System to Improve Depth-of-Field Blur Effects and Camera Motions in Virtual Environments. Proceedings of IEEE Virtual Reality (VR) Reno, Nevada, USA, 2008, pp. 47-51. Download paper as PDF.

QuakeIII DoF&Cam sources (depth-of-field, auto-focus zone and camera motion algorithms are under GPL with APP protection)

Passive eye tracking while playing Civilization IV

While the SMI iView X RED eye tracker used in this video is not used for driving the interaction, it showcases how eye tracking can be used for usability evaluations in interaction design. (Civilization does steal my attention on occasion; Sid Meier is just a brilliant game designer.)