- Helmert, J. R., Pannasch, S. & Velichkovsky, B. M. (2008). Eye tracking and Usability Research: an introduction to the special issue (editorial). Download as PDF.
- Castellini, C. (2008). Gaze Tracking in Semi-Autonomous Grasping. Download as PDF
- Helmert, J. R., Pannasch, S. & Velichkovsky, B. M. (2008). Influences of dwell time and cursor control on the performance in gaze driven typing. Download as PDF.
- Huckauf, A. & Urbina, M. H. (2008). On object selection in gaze controlled environments. Download as PDF.
- Hyrskykari, A. & Ovaska, S., Majaranta, P., Räihä, K.-J. & Lehtinen, M. (2008). Gaze Path Stimulation in Retrospective Think-Aloud. Download as PDF.
- Pannasch, S., Helmert, J.R., Malischke, S., Storch, A. & Velichkovsky, B.M. (2008). Eye typing in application: A comparison of two systems with ALS patients. Download as PDF.
- Zambarbieri, D., Carniglia, E. & Robino, C. (2008). Eye Tracking Analysis in Reading Online Newspapers. Download as PDF.
Monday, December 8, 2008
Monday, November 24, 2008
Friday, November 21, 2008
The virgin tour around the ITU office corridor (on YouTube)
Available on YouTube
Tuesday, November 18, 2008
Experimental gaze interaction techniques are typically prototyped from scratch using proprietary libraries provided by the manufacturers of eye tracking equipment. These libraries provide gaze data interfaces, but not any of the additional infrastructure that is common to the implementation of such techniques. This results in an unnecessary duplication of effort. In this paper, a framework for implementing gaze selection techniques is presented. It consists of two components: a gaze library to interface with the tracker and a set of classes which can be extended to implement different gaze selection techniques. The framework is tracker and operating system independent, ensuring compatibility with a wide range of systems. Support for user testing is also built into the system, enabling researchers to automate the presentation of est targets to users and record relevant test data. These features greatly simplify the process of implementing and evaluating new interaction techniques. The practicality and flexibility of the framework are demonstrated by the successful implementation of a number of gaze selection
- van Tonder, M., Cilliers, C., and Greyling, J. 2008. A framework for gaze selection techniques. In Proceedings of the 2008 Annual Research Conference of the South African institute of Computer Scientists and information Technologists on IT Research in Developing Countries: Riding the Wave of Technology (Wilderness, South Africa, October 06 - 08, 2008). SAICSIT '08, vol. 338. ACM, New York, NY, 267-275. DOI= http://doi.acm.org/10.1145/1456659.1456690
Monday, November 17, 2008
Tuesday, November 11, 2008
"We did a simple questionnaire-based analysis. The results of the analysis show some promises for implementing gaze-augmented problem-solving interfaces. Users of gaze-augmented interaction felt more immersed than the users of other two modes - dwell-time based and computer mouse. Immersion, engagement, and user-experience in general are important aspects in educational interfaces; learners engage in completing the tasks and, for example, when facing a difficult task they do not give up that easily. We also did analysis of the strategies, and we will report on those soon. We could not attend the conference, but didn’t want to disappoint eventual audience. We thus decided to send a video instead of us. " (from Romans blog)
Some of this research has also been presented within the COGAIN association, see:
- Gowases Tersia (2007) Gaze vs. Mouse: An evaluation of user experience and planning in problem solving games. Master’s thesis May 2, 2007. Department of Computer Science, University of Joensuu, Finland. Download as PDF
Monday, November 3, 2008
"This article proposes an alternative interaction method, the conductor interaction method (CIM), which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing HCI methods by drawing upon techniques found in human-human interaction. It is argued that the use of a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer (especially for novice users). Both the model and an implementation of the CIM within a system are presented in this article. This system formed the basis of a number of user studies that have been performed to assess the effectiveness of the CIM, the findings of which are discussed in this work.
More specifically the CIM aims to provide the following.
—A More Natural Interface. The CIM will have an interface that utilizes gaze and gestures, but is nevertheless capable of supporting sophisticated activities. The CIM provides an interaction technique that is as natural as possible and is close to the human-human interaction methods with which users are already familiar. The combination of gaze and gestures allows the user to perform not only simple interactions with a computer, but also more complex interacones such as the selecting, editing, and placing of media objects.
—A Metaphor Supported Interface. In order to help the user understand and exploit the gaze and gesture interface, two metaphors have been developed. An orchestra metaphor is used to provide the environment in which the user interacts. A conductor metaphor is used for interacting within this environment. These two metaphors are discussed next.
—A Two-Phased Interaction Method. The CIM uses an interaction process where each modality is specific and has a particular function. The interaction between user and interface can be seen as a dialog that is comprised of two phases. In the first phase, the user selects the on-screen object by gazing at it. In the second phase, with the gesture interface the user is able to manipulate the selected object. These distinct functions of gaze and gesture aim to increase system usability, as they are based on human-human interaction techniques, and also help to overcome issues such as the Midas Touch problem that often experienced by look-and-dwell systems. As the dialog combines two modalities in sequence, the gaze interface can be disabled after the first phase. This minimizes the possibility of accidentally selecting objects through the gaze interface. The Midas Touch problem can also be further addressed by ensuring that there is ample dead space between media objects.
—Significantly Reduced Learning Overhead. The CIM aims to reduce the overhead of learning to use the system by encouraging the use of gestures that users can easily associate with activities they perform in their everyday life. This transfer of experience can lead to a smaller learning overhead [Borchers 1997], allowing users to make the most of the system’s features in a shorter time.
- Rachovides, D., Walkerdine, J., and Phillips, P. 2007. The conductor interaction method. ACM Trans. Multimedia Comput. Commun. Appl. 3, 4 (Dec. 2007), 1-23. DOI= http://doi.acm.org/10.1145/1314303.1314312
Their work was presented at the ACM SIGGRAPH 2008 with the associated poster:
- Wilcox, T., Evans, M., Pearce, C., Pollard, N., and Sundstedt, V. 2008. Gaze and voice based game interaction: the revenge of the killer penguins. In ACM SIGGRAPH 2008 Posters (Los Angeles, California, August 11 - 15, 2008).
Sunday, October 26, 2008
The project is running for another three weeks and the outcome will be very interesting. Check out the development blog at http://www.eyegazetracking.com/
Thursday, September 18, 2008
"The researchers presented novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. The need to inspect large images occurs in, for example, mapping, medicine, astronomy and surveillance, and this project considered the inspection of very large aerial images, held in Google Earth. Comparative search and navigation tasks suggest that, while gaze methods are effective for image navigation, they lag behind more conventional methods, so interaction designers might consider combining these techniques for greatest effect." (BCS Interaction)
The increasing availability and accuracy of eye gaze detection equipment has encouraged its use for both investigation and control. In this paper we present novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. We investigate the relative advantages and comparative properties of four related methods: Stare-to-Zoom (STZ), in which control of the image position and resolution level is determined solely by the user's gaze position on the screen; Head-to-Zoom (HTZ) and Dual-to-Zoom (DTZ), in which gaze control is augmented by head or mouse actions; and Mouse-to-Zoom (MTZ), using conventional mouse input as an experimental control.
The need to inspect large images occurs in many disciplines, such as mapping, medicine, astronomy and surveillance. Here we consider the inspection of very large aerial images, of which Google Earth is both an example and the one employed in our study. We perform comparative search and navigation tasks with each of the methods described, and record user opinions using the Swedish User-Viewer Presence Questionnaire. We conclude that, while gaze methods are effective for image navigation, they, as yet, lag behind more conventional methods and interaction designers may well consider combining these techniques for greatest effect.
Monday, September 15, 2008
From the patent document:
"There are many possible applications that would benefit from the temporal fusion of gaze vectors with multi-touch movement data. For the purpose of example, one simple application will be discussed here: Consider a typical computer screen, which has several windows displayed. Assume that the user wishes to bring forward the window in the lower left corner, which is currently underneath two other windows. Without gaze vector fusion there are two means to do this, and both involve movement of the hand to another position. The first means is to move the mouse pointer over the window of interest and click the mouse button. The second means is to use a hot-key combination to cycle through the screen windows until the one of interest is brought forward. Voice input could also be used but it would be less efficient than the other means. With gaze vector fusion, the task is greatly simplified. For example, the user directs his gaze to the window of interest and then taps a specific chord on the multi-touch surface. The operation requires no translation of the hands and is very fast to perform."
"For another example, assume the user wishes to resize and reposition an iTunes window positioned in the upper left of a display screen. This can be accomplished using a multi-touch system by moving the mouse pointer into the iTunes window and executing a resize and reposition gesture. While this means is already an improvement over using just a mouse its efficiency can be further improved by the temporal fusion of gaze vector data. "
- Hemin Omer Latif, Nasser Sherkat and Ahmad Lotfi, "TeleGaze: Teleoperation through Eye Gaze", 7th IEEE International Conference on Cybernetic Intelligent Systems 2008, London, UK. Conference website: www.cybernetic.org.uk/cis2008
- Hemin Omaer Latif, Nasser Sherkat and Ahmad Lotfi, "Remote Control of Mobile Robots through Human Eye Gaze: The Design and Evaluation of an Interface", SPIE Europe Security and Defence 2008, Cardiff, UK. Conference website: http://spie.org/security-defence-europe.xml
Overcoming Technical Challenges in Mobile and Other Systems
- Off-the-Shelf Mobile Gaze Interaction
J. San Agustin and J. P. Hansen, IT University of Copenhagen, Denmark
- Fast and Easy Calibration for a Head-Mounted Eye Tracker
C. Cudel, S Bernet, and M Basset, University of Haute Alsace, France
- Magic Environment
L. Figueiredo, T. Nunes, F. Caetano, and A. Gomes, ESTG/IPG, Portugal
- AI Support for a Gaze-Controlled Wheelchair
P. Novák, T. Krajník, L. Přeučil, M. Fejtová, and O. Štěpánková. Czech Technical University, Czech Republic)
- A Comparison of Pupil Centre Estimation Algorithms
D. Droege, C Schmidt, and D. Paulus University of Koblenz-Landau, Germany
- User Performance of Gaze-Based Interaction with On-line Virtual Communities
H. Istance, De Montfort University, UK, A. Hyrskykari, University of Tampere, Finland, S. Vickers, De Montfort University, UK and N. Ali, University of Tampere, Finland
- Multimodal Gaze Interaction in 3D Virtual Environments
E. Castellina and F. Corno, Politecnico di Torino, Italy
- How Can Tiny Buttons Be Hit Using Gaze Only?
H. Skovsgaard, J. P. Hansen, IT University of Copenhagen, Denmark. J. Mateo, Wright State University, Ohio, US
- Gesturing with Gaze
H. Heikkilä, University of Tampere, Finland
- NeoVisus: Gaze Driven Interface Components
M. Tall, Sweden
- Evaluations of Interactive Guideboard with Gaze-Communicative Stuffed-Toy Robot
T. Yonezawa, H. Yamazoe, A. Utsumi, and S. Abe, ATR Intelligent Robotics and Communications Laboratories, Japan
- Gaze-Contingent Passwords at the ATM
P. Dunphy, A. Fitch, and P. Oliver, Newcastle University, UK
- Scrollable Keyboards for Eye Typing
O Špakov and P. Majaranta, University of Tampere, Finland
- The Use of Eye-Gaze Data in the Evaluation of Assistive Technology Software for Older People.
S. Judge, Barnsley District Hospital Foundation, UK and S. Blackburn, Sheffield University, UK
- A Case Study Describing Development of an Eye Gaze Setup for a Patient with 'Locked-in Syndrome' to Facilitate Communication, Environmental Control and Computer Access.
Z. Robertson and M. Friday, Barnsley General Hospital, UK
Friday, September 12, 2008
Thursday, August 28, 2008
- Gustafsson, T., Carleberg, P., Svensson, P., Nilsson, S., Le Duc, M., Sivertun, Å., Mixed Reality Systems for Technical Maintenance and Gaze-Controlled Interaction. Progress Report Phase 2 to FMV., 2005. Download paper as PDF
Sunday, August 24, 2008
- T. Järvenpää, V. Aaltonen (2008) Compact near-to-eye display with integrated gaze tracker. (SPIE Proceedings paper)
"Near-to-Eye Display (NED) offers a big screen experience to the user anywhere, anytime. It provides a way to perceive a larger image than the physical device itself is. Commercially available NEDs tend to be quite bulky and uncomfortable to wear. However, by using very thin plastic light guides with diffractive structures on the surfaces, many of the known deficiencies can be notably reduced. These Exit Pupil Expander (EPE) light guides enable a thin, light, user friendly and high performing see-through NED, which we have demonstrated. To be able to interact with the displayed UI efficiently, we have also integrated a video-based gaze tracker into the NED. The narrow light beam of an infrared light source is divided and expanded inside the same EPEs to produce wide collimated beams out from the EPE towards the eyes. Miniature video camera images the cornea and eye gaze direction is accurately calculated by locating the pupil and the glints of the infrared beams. After a simple and robust per-user calibration, the data from the highly integrated gaze tracker reflects the user focus point in the displayed image which can be used as an input device for the NED system. Realizable applications go from eye typing to playing games, and far beyond."
The ability to manipulate independent variables, enforce consistency and control are important concerns. For example running a web site test against the site online may produce faulty data since the content of the site may change for each visit. This is referred to as the stimuli sensitivity and increses in-between power since all subjects are exposed to exactly the same stimuli. Another issue is the task sensitivity. The task must reflect what the results are supposed to illustrate (ie. reading a text does not contain elements of manipulation. People are in general very task oriented, instructed to read they will ignore certain elements (eg. banners etc.)
A couple of real world examples including the Fluent UI (Office 2008), Phlat and Search Engine Results Pages (SERP) were introduced.
The Fluent UI is the new interface used in Office 2008. It resembles a big change compared with the traditional Office interface. The Fluent UI is task and context dependent compared to the rather static traditional setup of menubars and icons cluttering the screen.
At Microsoft it is common to work around personas in multiple categories. These are abstract representations of user groups that help to illustrate the lifes and needs for "typical" users. For example, Nicolas, is a tech-savvy IT professional while Jennifer is a young hip girl who spend a lot of time on YouTube or hang around town with her shiny iPod (err.. Zune that is)
More information on the use of personas as a design method:
- J. Grudin, J. Pruitt (2002) Personas, Participatory Design and Product Development: An Infrastructure for Engagement (Microsoft Research) Download as Word doc.
- J. Grudin (2006) Why Personas Work: The Psychological Evidence (Microsoft Research) Download as Word doc.
- Cutrell, E., Robbins, D.C., Dumais, S.T. & Sarin, R. (2006). Fast, flexible filtering with Phlat - Personal search and organization made easy. In Proceedings of CHI'06, Human Factors in Computing Systems, (Montréal, April 2006), ACM press, 261-270. Try Phlat!
- Cutrell, E. & Guan, Z. (2007). What are you looking for? An eye-tracking study of information usage in Web Search. In Proceedings of CHI'07, Human Factors in Computing Systems, (San José), ACM press, 407-416.
- Guan, Z. & Cutrell, E. (2007). An eye-tracking study of the effect of target rank on Web search. In Proceedings of CHI'07, Human Factors in Computing Systems, (San José), ACM press, 417-420.
- Pannasch, S., Dornhoefer, S.M., Unema, P.J.A. & Velichkovsky, B.M. (2001). The omnipresent prolongation of visual fixations: saccades are inhibited by changes in situation and in subject's activity. Vision Research. 41(25-26), 3345-51. Download Full Article [PDF]
- Velichkovsky, B.M., Dornhoefer, S.M. , Kopf, M., Helmert, J. & Joos, M. (2002). Change detection and occlusion modes in static and dynamic road-traffic scenarios. Transportation Research, Part F. 5(2), 99-109. Download Full Article [PDF]
- Velichkovsky, B.M., Rothert, A., Kopf, M., Dornhoefer, S.M. & Joos, M. (2002). Towards an express diagnostics for level of processing and hazard perception. Transportation Research, Part F. 5(2), 145-156. Download Full Article [PDF]
- Unema, P., Pannasch, S., Joos, M. & Velichkovsky, B.M. (2005). Time-course of information processing during scene perception: The relationship between saccade amplitude and fixation duration. Visual Cognition, 12(3), 473-494. Download Full Article [PDF]
- Velichkovsky, B.M., Joos, M., Helmert, J.R., & Pannasch, S. (2005). Two visual systems and their eye movements: evidence from static and dynamic scene perception. CogSci 2005: Proceedings of the XXVII Conference of the Cognitive Science Society. July 21-23 Stresa, Italy, pp. 2283-2288. Download Full Article [PDF]
- Velichkovsky, B.M. (2005). Modularity of cognitive organization: Why it is so appealing and why it is wrong. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: Understanding the development and evolution of natural complex systems. Cambridge, MA: MIT Press.
- Graupner, S. T., Velichkovsky, B. M., Pannasch, S., & Marx, J. (2007). Surprise, surprise: Two distinct components in the visually evoked distractor effect. Psychophysiology, 44(2), 251-261. Download Full Article [PDF]
- Velichkovsky, B.M. (2007) Towards an Evolutionary Framework for Human Cognitive Neuroscience. Theoretical Biology, 2(1), 3-6. Download Full Article [PDF]
- Velichkovsky,B.M. (2002). Heterarchy of cognition: The depths and the highs of a framework for memory research. Memory, 10(5/6), 405-419 (Special Issue on Levels-of-Processing-Approach).
Saturday, August 23, 2008
The task objective is to either count the passes of the white or black team. The experiment illustrates the inattentional blindness which causes certain objects in the movie to go unnoticed.
More information on the phenomenon can be found in the following papers:
- Becklen, Robert and Cervone, Daniel (1983) Selective looking and the noticing of unexpected events. Memory and Cognition, 11, 601-608.
- Simons, Daniel J. and Chabris, Christopher F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events, Perception, 28, pp.1059-1074.
- Rensink, Ronald A. (2000). When Good Observers Go Bad: Change Blindness, Inattentional Blindness, and Visual Experience, Psyche, 6(09), August 2000. Commentary on: A. Mack and I. Rock (1998) Inattentional Blindness. MIT Press.
- Dixon, T.D., Nikolov, S.G. et al. (2006). Scanpath analysis of fused multi-sensor images with luminance change: A pilot study. Proceedings of the 9th International Conference on Information Fusion, Florence, Italy, July 2006.
- Dixon, T.D, G Nikolov, J Lewis, J. Li, E Fernández, J M Noyes, T Troscianko, D R Bull, C N Canagarajah. Multi-Sensor Fused Video Assessment using Scanpath Analysis. Biologically Inspired Information Fusion (BIIF 2006) WorkshopUniversity of Surrey, UK, 22nd–23rd August 2006.
This is an area for gaze interaction which have a high potential and since the gaming industry has grown to be a hugh industy it may help to make eye trackers accessible/affordable. The development would be benificial for users with motor impairments. A couple of examples for implementations were then introduced. The first one was a first person shoother running on a XBOX360:
The experimental setup evaluation contained 10 repeated trials to look at learning (6 subjects). Three different configurations were used 1) gamepad controller moving and aiming (no gaze) 2) gamepad controller moving and gaze aiming and 3) gamepad controller moving forward only, gaze aiming and steering of the movement.
However, twice as many shots were fired that missed in the gaze condition which can be described as a "machine gun" approach. Noteworthy is that no filtering was applied to the gaze position.
Howell have conducted a analysis of common tasks in gaming, below is a representation of the amount of actions in the Guild Wars game. The two bars indicate 1) novices and 2) experienced users.
Controlling all of these different actions requires switching of task mode. This is very challenging considering only on input modality (gaze) with no method of "clicking".
There are several ways a gaze interface can be constructed. From a bottom up approach. First the position of gaze can be used to emulate the mouse cursor (on a system level) Second, a transparent overlay can be placed on top of the application. Third, a specific gaze interface can be developed (which has been my own approach) This requires a modification of the original application which is not always possible.
The Snap/Clutch interaction method developed by Stephen Vickers who is working with Howell operates on the system level to emulate the mouse. This allows for specific gaze gestures to be interpretated which is used to switch mode. For example a quick glace to the left of the screen will activate a left mouse button click mode. When a eye fixation is detected in a specific region a left mouse click will be issued to that area.
When this is applied to games such as World of Warcraft (demo) specific regions of the screen can be used to issue movement actions towards that direction. The image below illustrates these regions overlaid on the screen. When a fixation is issued in the A region an action to move towards that direction is issued to the game it self.
After lunch we had a hands-on session with the Snap/Clutch interaction method where eight Tobii eye trackers were used for a round multiplayer of WoW! Very different from a traditional mouse/keyboard setup and takes some time to get used to.
- Istance, H.O.,Bates, R., Hyrskykari, A. and Vickers, S. Snap Clutch, a Moded Approach to Solving the Midas Touch Problem. Proceedings of the 2008 symposium on Eye Tracking Research & Applications; ETRA 2008. Savannah, GA. 26th-28th March 2008. Download
Bates, R., Istance, H.O., and Vickers, S. Gaze Interaction with Virtual On-Line Communities: Levelling the Playing Field for Disabled Users. Proceedings of the 4th Cambridge Workshop on Universal Access and Assistive Technology; CWUAAT 2008. University of Cambridge, 13th-16th April 2008. Download
The second part of the lecture concerned gaze interaction for mobile phones. This allows for ubiquitous computing where the eye tracker is integrated with a wearable display. As a new field it is surrounded with certain issues (stability, processing power, variation in lightning etc.) but all of which will be solved over time. The big question is what the "killer-application" will be. ( entertainment?) A researcher from Nokia attended the lecture and introduced a prototype system. Luckily I had the chance to visit their research department the following day to get a hands-on with their head mounted display with a integrated eye tracker (more on this in another post)
The third part was about stereoscopic displays which adds a third dimension (depth) to the traditional X and Y axis. There are several projects around the world working towards making this everyday reality. However, tracking the depth of gaze fixation is limited. The vergence (as seen by the distance between both pupils) eye movements are hard to measure when the distance to objects move above two meters.
Calculating convergence angles
d = 100 cm tan θ = 3.3 / 100; θ = 1.89 deg.
d = 200 cm tan θ = 3.3 / 200; θ = 0.96 deg.
Related papers on stereoscopic eye tracking:
- Essig, K., Pomplun, M. & Ritter, H. (2006). A neural network for 3D gaze recording with binocular eye trackers. International Journal of Parallel, Emergent, and Distributed Systems, 21 (2), 79-95.
- Y-M Kwon, K-W Jeon, J Ki, Q M. Shahab, S Jo and S-K Kim (2006). 3D Gaze Estimation and Interaction to Stereo Display The International Journal of Virtual Reality, 5(3):41-45
The "experiment" consisted of two ads shown below. The hypothesis to be investigated was that the direction of gaze would attract more attention towards the text compared to the picture where the baby is facing the camera.
After calibrating the user the stimulus is observed for a specific amout of time. When the recording has completed a replay of the eye movements can be visually overlaid ontop of the stimuli. Furthermore, several recordings can be incorporated into one clip. Indeed the results indicate support for the hypothesis. Simply put, faces attract attention and the direction of gaze guides it further.
After lunch Boris Velichkovsky gave a lecture on cognitive technologies. After a quick recap of the talk the day before about the visual system the NBIC report was introduced. This concerns the converging technologies of Nano-, Bio-, Information Technology and Cognitive Science.
Notable advances in these fields contain the Z3 computer (Infotech, 1941), DNA (Bio, 1953), Computed Tomography scan (Nano, 1972) and Short Term Memory (CogSci, 1968) All of which has dramtically improved human understanding and capabilities.
Another interesting topic concerned the superior visual recognition skills humans have. Research have demonstrated that we are able to recognize up to 2000 photos after two weeks with a 90% accuracy. Obviously the visual system is our strongest sense, however much of our computer interaction as a whole is driven by a one way information flow. Taking the advaces in
bi-directional OLED microdisplays in to account the field of augmented reality have a bright future. These devices act as both camera and displaying information at the same time. Add an eye tracker to the device and we have some really intresting opportunities.
Boris also discussed the research of Jaak Panksepp concerning the basic emotional systems in the mammals. (emo-systems, anatomical areas and neurotransmitters for modulation)
To sum up the second day was diverse in topics but non the less interesting and demonstrates the diversity of knowledge and skills needed for todays researchers.
Tuesday, August 19, 2008
Some basic findings of the visual system were introduced. In general, the visual system is divided into two pathways, the dorsal and ventral system. The dorsal goes from the striate cortex (back of the brain) and upwards (towards the posterior parietal). This pathway of visual information concerns the spatial arrangement of objects in our environment. Hence, it is commonly termed the "where" pathway. The other visual pathway goes towards the temporal lobes (just above your ears) and concerns the shape and identification of specific objects, this is the "what" pathway. The ambient system responds early (0-250ms.) after which the focal system takes over.
These two systems are represented by the focal (what) and ambient (where) attention systems. The ambient system has a overall crude but fast response in lower luminance, the focal attention system works the opposite with fine, but slow, spatial resolution. Additional, the talk covered the cognitive models of attention such as Posner, Broadbent etc. (see attention on Wikipedia)
A great deal of the talk concerned the freezing effect (inhibition of saccades and a prolonged fixation) which can to some extent be predicted. The onset of a dangerous "event" can be seen before the acctual response (the prolonged fixation) Just before the fixation (500ms) the predicition can be made with a 95% success. The inhibition comes in two waves where the first one is issued by the affective repsonse of the amygdala (after 80 ms.) which acts on the superior colliculus to inihibit near saccades. A habituating effect on this affective response can be seen where the second wave of inhitition (+170ms.) becomes less apperent, the initial response is however unaffected.
While driving a car and talking on the phone the lack of attention leads to eye movements with shorter fixation durations. This gives an approximated spatial localization of objects. It is the combination of a) duration of the fixation and b) the surrounding saccades that determines the quality of recognition. A short fixation followed by a long subsequent saccade leads to low recognition results. A short fixation followed by a short saccade gives higher recognition scores. A long fixation followed by either short or long saccades leads to equally high recognition results.
Furthermore, a short saccade within the parafoveal region leads to a high level of neural activity (EEG) after 90ms. This differs from long saccades which gives no noticable variance in cortical activity (compared to the base line)
However, despite the classification into two major visual systems, the attentional system can be divided into 4-6 layers of organization. Hence there is no singel point of attention. They have developed during the evolution of the mind to support various cognitive demands.
For example, the role of emotional respones in social communication can be seen with the strong response to facial expressions. Studies have shown that male responds extremely fast to face-to-face images of other males expressing aggressive facial gestures. This low level response happens much faster that our conscious awareness (as low as 80ms. if I recall correctly) Additionally, the eyes are much faster than we can consciously comprehend, as we are not aware of all the eye movements our eyes perform.
In the afternoon Andrew Duchowski from Clemson University gave a talk about eye tracking and eye movement analysis. Various historical apparatus and techniques were introduced (such as infrared corneal reflection) Followed by a research methodology and guidelines for conducting research. A pratical example of a study conducted by Mr. Nalangula at Clemson was described. This compared expert vs. novices viewing of errornous circuit boards. Results indicate that the experts scanpath can improve the results of the novices (ie. detecting more errors) than those who received no training. A few guidelines on how to use visualization were shown (clusters, heatmaps etc.)
The day ended with a nice dinner and a traditional Finnish smoke-sauna followed by a swim in the lake. Thanks goes to Uni. Tampere UCIT group for my best sauna experience to this date.
Tuesday, July 22, 2008
"Looking with one eye is a simple action. Seeing the screen with only one eye might therefore be used to switch the view to an alternate perspective on the screen contents: a filter for quick toggling. In this example, closing one eye filters out information on screen to a subset of the original data, such as an overview over the browser page or only the five most recently edited files. It was to see how the users would accept the functionality at the cost of having to close one eye, a not totally natural action." (Source)
Monday, July 21, 2008
Tuesday, July 15, 2008
Automatic, Real-Time, Depth-of-Field Blur Effect for First-Person Navigation in Virtual Environment (2008)
"We studied the use of visual blur effects for first-person navigation in virtual environments. First, we introduce new techniques to improve real-time Depth-of-Field blur rendering: a novel blur computation based on the GPU, an auto-focus zone to automatically compute the user’s focal distance without an eye-tracking system, and a temporal filtering that simulates the accommodation phenomenon. Secondly, using an eye-tracking system, we analyzed users’ focus point during first-person navigation in order to set the parameters of our algorithm. Lastly, we report on an experiment conducted to study the influence of our blur effects on performance and subjective preference of first-person shooter gamers. Our results suggest that our blur effects could improve fun or realism of rendering, making them suitable for video gamers, depending however on their level of expertise."
"We describes the use of user’s focus point to improve some visual effects in virtual environments (VE). First, we describe how to retrieve user’s focus point in the 3D VE using an eye-tracking system. Then, we propose the adaptation of two rendering techniques which aim at improving users’ sensations during first-person navigation in VE using his/her focus point: (1) a camera motion which simulates eyes movement when walking, i.e., corresponding to vestibulo-ocular and vestibulocollic reflexes when the eyes compensate body and head movements in order to maintain gaze on a specific target, and (2) a Depth-of-Field (DoF) blur effect which simulates the fact that humans perceive sharp objects only within some range of distances around the focal distance.
Second, we describe the results of an experiment conducted to study users’ subjective preferences concerning these visual effects during first-person navigation in VE. It showed that participants globally preferred the use of these effects when they are dynamically adapted to the focus point in the VE. Taken together, our results suggest that the use of visual effects exploiting users’ focus point could be used in several VR applications involving firstperson navigation such as the visit of architectural site, training simulations, video games, etc."
Sébastien Hillaire, Anatole Lécuyer, Rémi Cozot, Géry Casiez
Using an Eye-Tracking System to Improve Depth-of-Field Blur Effects and Camera Motions in Virtual Environments. Proceedings of IEEE Virtual Reality (VR) Reno, Nevada, USA, 2008, pp. 47-51. Download paper as PDF.
QuakeIII DoF&Cam sources (depth-of-field, auto-focus zone and camera motion algorithms are under GPL with APP protection)
Thursday, July 10, 2008
Summary of the thesis "Eye Gaze Interactive ACT workstation"
"Ongoing research is devoted to finding ways to improve performance and reduce workload of Air Traffic Controllers (ATCos) because their task is critical to the safe and efficient flow of air traffic. A new intuitive input method, known as eye gaze interaction, was expected to reduce the work- and task load imposed on the controllers by facilitating the interaction between the human and the ATC workstation. In turn, this may improve performance because the freed mental resources can be devoted to more critical aspects of the job, such as strategic planning. The objective of this Master thesis research was to explore how human computer interaction (HCI) in the ATC task can be improved using eye gaze input techniques and whether this will reduce workload for ATCos.
In conclusion, the results of eye gaze interaction are very promising for selection of aircraft on a radar screen. For entering instructions it was less advantageous. This is explained by the fact that in the first task the interaction is more intuitive while the latter is more a conscious selection task. For application in work environments with large displays or multiple displays eye gaze interaction is considered very promising. "
Download paper as pdf
Wednesday, July 9, 2008
Information about Gazetalk 5 eye communication system
GazeTalk is a predictive text entry system that has a restricted on-screen keyboard with ambiguous layout for severely disabled people. The main reason for using such a keyboard layout is that it enables the use of an eye tracker with a low spatial resolution (e.g., a web-camera based eye tracker).
The goal of the GazeTalk project is to develop an eye-tracking based AAC system that supports several languages, facilitates fast text entry, and is both sufficiently feature-complete to be deployed as the primary AAC tool for users, yet sufficiently flexible and technically advanced to be used for research purposes. The system is designed for several target languages, initially Danish, English, Italian, German and Japanese.
- web – browser
- Multimedia – player
- PDF – reader
- letter and word prediction, and word completion
- speech output
- can be operated by gaze, headtracking, mouse, joystick, or any other pointing device
- supports step-scanning (new!)
- supports users with low precision in their movements, or trackers with low accuracy
- allows the user to use Dasher inside GazeTalk and to transfer the text written in Dasher back to GazeTalk
GazeTalk 5.0 has been designed and developed by theat the and the IT-Lab at the Royal School of Library and Information Science, Copenhagen.
- Introduction to GazeTalk and its features
Click here to play the video
- Watch the video "Introduction to GazeTalk and its features" (same as above!) in Windows Media File (wmv) format (7.5 MB).
- Watch a YouTube video on Gaze Interaction for People with ALS/MND .
- Watch another YouTube video on a Talk Given with the Eyes Only by Arne Lykke Larsen
- Watch a YouTube video on ALS-Communication and GazeTalk , where the developer, Dr. John Paulin Hansen explains how GazeTalk works (in Danish, subtitles in English). The video includes short clips of using GazeTalk and a brief interview of John Paulin Hansen. The video was produced by , intiative: Birger Bergmann Jeppesen (bigerbj (at) webspeed (dot) dk), for more information, see )