- Majaranta, P. (2009) Text Entry by Eye Gaze. Dissertations in Interactive Technology, number 11, University of Tampere (ISBN 978-951-44-7786-7). Also available in Acta Electronica Universitatis Tamperensis; 869 (978-951-44-7787-4).
Thursday, August 6, 2009
Päivi Majaranta PhD Thesis on Text Entry by Eye Gaze
Wednesday, May 13, 2009
GaCIT 2009 : Summer School on Gaze, Communication, and Interaction Technology
The GaCIT workshop is organized by the graduate school on User-Centered Information Technology at the University of Tampere, Finland. The workshop runs July 27-31. I attended last year and found it to be a great week with interesting talks and social events. See the day-by-day coverage of GaCIT 2008.
Topics and speakers:
Introduction to Gaze-based Communication (Howell Istance)
Evaluation of Text Entry Techniques (Scott MacKenzie)
Survey of text entry methods. Models, metrics, and procedures for evaluating text entry methods.
Details of Keyboards and Users Matter (Päivi Majaranta)
Issues specific to eye-tracker use of soft keyboards, special issues in evaluating text entry techniques with users that use eye trackers for communication.
Communication by Eyes without Computers (TBA)
Introduction to eye-based communication using low-tech devices.
Gesture-based Text Entry Techniques (Poika Isokoski)
Overview of studies evaluating techniques such as Dasher, QuikWrite and EdgeWrite in the eye-tracker context.
Low-cost Devices and the Future of Gaze-based Text Entry (John Paulin Hansen)
Low-cost eye tracking and its implications for text entry systems. Future of gaze-based text entry.
Dwell-free text entry techniques (Anke Huckauf)
Introduction to gaze-based techniques that do not utilize the dwell-time protocol for item selection.
Sunday, August 24, 2008
Nokia Research: Near Eye Display with integrated eye tracker
- T. Järvenpää, V. Aaltonen (2008) Compact near-to-eye display with integrated gaze tracker. (SPIE Proceedings paper)
"Near-to-Eye Display (NED) offers a big screen experience to the user anywhere, anytime. It provides a way to perceive a larger image than the physical device itself is. Commercially available NEDs tend to be quite bulky and uncomfortable to wear. However, by using very thin plastic light guides with diffractive structures on the surfaces, many of the known deficiencies can be notably reduced. These Exit Pupil Expander (EPE) light guides enable a thin, light, user friendly and high performing see-through NED, which we have demonstrated. To be able to interact with the displayed UI efficiently, we have also integrated a video-based gaze tracker into the NED. The narrow light beam of an infrared light source is divided and expanded inside the same EPEs to produce wide collimated beams out from the EPE towards the eyes. Miniature video camera images the cornea and eye gaze direction is accurately calculated by locating the pupil and the glints of the infrared beams. After a simple and robust per-user calibration, the data from the highly integrated gaze tracker reflects the user focus point in the displayed image which can be used as an input device for the NED system. Realizable applications go from eye typing to playing games, and far beyond."
GaCIT in Tampere, day 5.
The ability to manipulate independent variables, enforce consistency and maintain control are important concerns. For example, running a web site test against the live site may produce faulty data, since the content of the site may change between visits. This is referred to as stimuli sensitivity; controlling it increases in-between power, since all subjects are exposed to exactly the same stimuli. Another issue is task sensitivity. The task must reflect what the results are supposed to illustrate (i.e. reading a text does not contain elements of manipulation). People are in general very task oriented: instructed to read, they will ignore certain elements (e.g. banners).
A couple of real-world examples, including the Fluent UI (Office 2007), Phlat and Search Engine Results Pages (SERP), were introduced.
The Fluent UI is the new interface used in Office 2007. It represents a big change compared with the traditional Office interface. The Fluent UI is task and context dependent, compared to the rather static traditional setup of menubars and icons cluttering the screen.
At Microsoft it is common to work with personas in multiple categories. These are abstract representations of user groups that help to illustrate the lives and needs of "typical" users. For example, Nicolas is a tech-savvy IT professional, while Jennifer is a young hip girl who spends a lot of time on YouTube or hangs around town with her shiny iPod (err.. Zune that is).
More information on the use of personas as a design method:
- J. Grudin, J. Pruitt (2002) Personas, Participatory Design and Product Development: An Infrastructure for Engagement (Microsoft Research) Download as Word doc.
- J. Grudin (2006) Why Personas Work: The Psychological Evidence (Microsoft Research) Download as Word doc.
- Cutrell, E., Robbins, D.C., Dumais, S.T. & Sarin, R. (2006). Fast, flexible filtering with Phlat - Personal search and organization made easy. In Proceedings of CHI'06, Human Factors in Computing Systems, (Montréal, April 2006), ACM press, 261-270. Try Phlat!
Further reading:
- Cutrell, E. & Guan, Z. (2007). What are you looking for? An eye-tracking study of information usage in Web Search. In Proceedings of CHI'07, Human Factors in Computing Systems, (San José), ACM press, 407-416.
- Guan, Z. & Cutrell, E. (2007). An eye-tracking study of the effect of target rank on Web search. In Proceedings of CHI'07, Human Factors in Computing Systems, (San José), ACM press, 417-420.
GaCIT 2008. B. Velichkovsky associated literature
- Pannasch, S., Dornhoefer, S.M., Unema, P.J.A. & Velichkovsky, B.M. (2001). The omnipresent prolongation of visual fixations: saccades are inhibited by changes in situation and in subject's activity. Vision Research. 41(25-26), 3345-51. Download Full Article [PDF]
- Velichkovsky, B.M., Dornhoefer, S.M., Kopf, M., Helmert, J. & Joos, M. (2002). Change detection and occlusion modes in static and dynamic road-traffic scenarios. Transportation Research, Part F. 5(2), 99-109. Download Full Article [PDF]
- Velichkovsky, B.M., Rothert, A., Kopf, M., Dornhoefer, S.M. & Joos, M. (2002). Towards an express diagnostics for level of processing and hazard perception. Transportation Research, Part F. 5(2), 145-156. Download Full Article [PDF]
- Unema, P., Pannasch, S., Joos, M. & Velichkovsky, B.M. (2005). Time-course of information processing during scene perception: The relationship between saccade amplitude and fixation duration. Visual Cognition, 12(3), 473-494. Download Full Article [PDF]
- Velichkovsky, B.M., Joos, M., Helmert, J.R., & Pannasch, S. (2005). Two visual systems and their eye movements: evidence from static and dynamic scene perception. CogSci 2005: Proceedings of the XXVII Conference of the Cognitive Science Society. July 21-23 Stresa, Italy, pp. 2283-2288. Download Full Article [PDF]
- Velichkovsky, B.M. (2005). Modularity of cognitive organization: Why it is so appealing and why it is wrong. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: Understanding the development and evolution of natural complex systems. Cambridge, MA: MIT Press.
- Graupner, S. T., Velichkovsky, B. M., Pannasch, S., & Marx, J. (2007). Surprise, surprise: Two distinct components in the visually evoked distractor effect. Psychophysiology, 44(2), 251-261. Download Full Article [PDF]
- Velichkovsky, B.M. (2007) Towards an Evolutionary Framework for Human Cognitive Neuroscience. Theoretical Biology, 2(1), 3-6. Download Full Article [PDF]
- Velichkovsky, B.M. (2002). Heterarchy of cognition: The depths and the highs of a framework for memory research. Memory, 10(5/6), 405-419 (Special Issue on Levels-of-Processing-Approach).
Saturday, August 23, 2008
GaCIT in Tampere, day 4.
The task is to count the passes of either the white or the black team. The experiment illustrates inattentional blindness, which causes certain objects in the movie to go unnoticed.
More information on the phenomenon can be found in the following papers:
- Becklen, Robert and Cervone, Daniel (1983) Selective looking and the noticing of unexpected events. Memory and Cognition, 11, 601-608.
- Simons, Daniel J. and Chabris, Christopher F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events, Perception, 28, pp.1059-1074.
- Rensink, Ronald A. (2000). When Good Observers Go Bad: Change Blindness, Inattentional Blindness, and Visual Experience, Psyche, 6(09), August 2000. Commentary on: A. Mack and I. Rock (1998) Inattentional Blindness. MIT Press.
- Dixon, T.D., Nikolov, S.G. et al. (2006). Scanpath analysis of fused multi-sensor images with luminance change: A pilot study. Proceedings of the 9th International Conference on Information Fusion, Florence, Italy, July 2006.
- Dixon, T.D, G Nikolov, J Lewis, J. Li, E Fernández, J M Noyes, T Troscianko, D R Bull, C N Canagarajah. Multi-Sensor Fused Video Assessment using Scanpath Analysis. Biologically Inspired Information Fusion (BIIF 2006) Workshop, University of Surrey, UK, 22nd–23rd August 2006.
GaCIT in Tampere, day 3.
Games
This is an area of gaze interaction which has high potential, and since gaming has grown to be a huge industry it may help to make eye trackers accessible/affordable. The development would be beneficial for users with motor impairments. A couple of example implementations were then introduced. The first one was a first-person shooter running on an Xbox 360:
The experimental evaluation contained 10 repeated trials to look at learning (6 subjects). Three different configurations were used: 1) gamepad controller moving and aiming (no gaze), 2) gamepad controller moving and gaze aiming, and 3) gamepad controller moving forward only, with gaze aiming and steering of the movement.
Results:
However, twice as many shots were fired that missed in the gaze condition which can be described as a "machine gun" approach. Noteworthy is that no filtering was applied to the gaze position.
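To illustrate what filtering of the gaze position could look like, here is a minimal sketch of a moving-average smoother. It is not the method used in the study (which applied no filtering); the class name `GazeSmoother` and the `window` parameter are hypothetical and chosen only for this example.

```python
from collections import deque


class GazeSmoother:
    """Moving-average filter over the most recent gaze samples.

    Hypothetical helper for illustration; `window` is the number of
    samples averaged and is an assumption, not a value from the talk.
    """

    def __init__(self, window=5):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def update(self, x, y):
        """Add one raw gaze sample and return the smoothed (x, y)."""
        self.samples.append((x, y))
        n = len(self.samples)
        mean_x = sum(s[0] for s in self.samples) / n
        mean_y = sum(s[1] for s in self.samples) / n
        return mean_x, mean_y


if __name__ == "__main__":
    smoother = GazeSmoother(window=3)
    for raw in [(100, 100), (110, 96), (95, 104), (102, 99)]:
        print(raw, "->", smoother.update(*raw))
```

A larger window gives a steadier aim point but adds lag, which matters in a fast-paced shooter.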
Howell has conducted an analysis of common tasks in gaming; below is a representation of the number of actions in the Guild Wars game. The two bars indicate 1) novices and 2) experienced users.
Controlling all of these different actions requires switching of task mode. This is very challenging considering there is only one input modality (gaze) with no method of "clicking".
There are several ways a gaze interface can be constructed. Taking a bottom-up approach: first, the position of gaze can be used to emulate the mouse cursor (on a system level). Second, a transparent overlay can be placed on top of the application. Third, a specific gaze interface can be developed (which has been my own approach). This requires a modification of the original application, which is not always possible.
The Snap/Clutch interaction method, developed by Stephen Vickers who is working with Howell, operates on the system level to emulate the mouse. It interprets specific gaze gestures, which are used to switch mode. For example, a quick glance to the left of the screen will activate a left mouse button click mode. When an eye fixation is then detected in a specific region, a left mouse click is issued to that area.
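The mode-switching idea can be sketched roughly as follows. This is not the Snap/Clutch implementation; it is a toy state machine under stated assumptions: a glance to the left is modelled as a gaze sample with a negative x coordinate, and a fixation as a fixed number of consecutive samples staying within a small radius. The class `ModedGazeMouse` and its parameters are hypothetical.

```python
import math


class ModedGazeMouse:
    """Toy moded gaze-to-mouse emulator (illustrative only)."""

    def __init__(self, dwell_samples=3, radius=30.0):
        self.dwell_samples = dwell_samples  # samples needed for a fixation (assumed)
        self.radius = radius                # max spread of a fixation in pixels (assumed)
        self.mode = "look"                  # start in a mode that never clicks
        self._cluster = []                  # samples of the candidate fixation

    def feed(self, x, y):
        """Process one gaze sample; return an event tuple or None."""
        # Assumption: a glance off the left edge of the screen switches mode.
        if x < 0:
            self.mode = "left_click"
            self._cluster = []
            return ("mode", self.mode)
        if self.mode != "left_click":
            return None
        # Restart the candidate fixation if the gaze moved too far away.
        if self._cluster:
            x0, y0 = self._cluster[0]
            if math.hypot(x - x0, y - y0) > self.radius:
                self._cluster = []
        self._cluster.append((x, y))
        if len(self._cluster) >= self.dwell_samples:
            cx = sum(p[0] for p in self._cluster) / len(self._cluster)
            cy = sum(p[1] for p in self._cluster) / len(self._cluster)
            self._cluster = []
            return ("click", cx, cy)  # a click at the fixation centre
        return None


if __name__ == "__main__":
    mouse = ModedGazeMouse()
    for sample in [(400, 300), (-10, 300), (400, 300), (403, 298), (401, 301)]:
        print(sample, "->", mouse.feed(*sample))
```

Because clicks are only issued in the click mode, simply looking around does not trigger actions, which is the point of a moded approach to the Midas Touch problem.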
When this is applied to games such as World of Warcraft (demo), specific regions of the screen can be used to issue movement actions in that direction. The image below illustrates these regions overlaid on the screen. When a fixation occurs in the A region, an action to move in that direction is issued to the game itself.
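As a rough sketch of region-based movement, the function below maps a fixation position to a movement action. The actual region layout was shown in an image that is not reproduced here, so the layout used (bands along the screen edges, a neutral centre), the `margin` value and the action names are all assumptions for illustration.

```python
def movement_for_fixation(x, y, width, height, margin=0.2):
    """Map a fixation at (x, y) to a movement action, or None in the centre.

    Hypothetical layout: bands of `margin` * screen size along each edge.
    """
    if y < height * margin:
        return "forward"       # top band
    if y > height * (1 - margin):
        return "backward"      # bottom band
    if x < width * margin:
        return "turn_left"     # left band
    if x > width * (1 - margin):
        return "turn_right"    # right band
    return None                # centre: no movement issued


if __name__ == "__main__":
    for point in [(500, 50), (50, 400), (500, 400)]:
        print(point, "->", movement_for_fixation(*point, width=1000, height=800))
```

Leaving the centre neutral lets the player look at the game world without moving the character.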
After lunch we had a hands-on session with the Snap/Clutch interaction method, where eight Tobii eye trackers were used for a multiplayer round of WoW! It is very different from a traditional mouse/keyboard setup and takes some time to get used to.
- Istance, H.O., Bates, R., Hyrskykari, A. and Vickers, S. Snap Clutch, a Moded Approach to Solving the Midas Touch Problem. Proceedings of the 2008 symposium on Eye Tracking Research & Applications; ETRA 2008. Savannah, GA. 26th-28th March 2008. Download
- Bates, R., Istance, H.O., and Vickers, S. Gaze Interaction with Virtual On-Line Communities: Levelling the Playing Field for Disabled Users. Proceedings of the 4th Cambridge Workshop on Universal Access and Assistive Technology; CWUAAT 2008. University of Cambridge, 13th-16th April 2008. Download
The second part of the lecture concerned gaze interaction for mobile phones. This allows for ubiquitous computing where the eye tracker is integrated with a wearable display. As a new field it is surrounded by certain issues (stability, processing power, variation in lighting etc.), all of which will be solved over time. The big question is what the "killer application" will be (entertainment?). A researcher from Nokia attended the lecture and introduced a prototype system. Luckily I had the chance to visit their research department the following day to get hands-on with their head-mounted display with an integrated eye tracker (more on this in another post).
The third part was about stereoscopic displays, which add a third dimension (depth) to the traditional X and Y axes. There are several projects around the world working towards making this an everyday reality. However, tracking the depth of a gaze fixation is limited. Vergence eye movements (seen as a change in the distance between the two pupils) are hard to measure when the distance to objects exceeds two meters.
Calculating convergence angles:
d = 100 cm: tan θ = 3.3 / 100; θ = 1.89 deg.
d = 200 cm: tan θ = 3.3 / 200; θ = 0.95 deg.
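The convergence-angle arithmetic above can be reproduced with a short Python sketch. It assumes that 3.3 cm is half the distance between the pupils (the post does not say what the value stands for) and that θ is the angle by which one eye rotates inward; the function name `convergence_angle_deg` is hypothetical.

```python
import math


def convergence_angle_deg(distance_cm, half_ipd_cm=3.3):
    """Angle theta with tan(theta) = half_ipd_cm / distance_cm, in degrees.

    `half_ipd_cm` defaults to the 3.3 cm used in the text; treating it as
    half the interpupillary distance is an assumption.
    """
    return math.degrees(math.atan(half_ipd_cm / distance_cm))


if __name__ == "__main__":
    for d in (100, 200, 400):
        print(f"d = {d} cm: theta = {convergence_angle_deg(d):.2f} deg")
```

The angle is about 1.89 degrees at 100 cm and just under 1 degree at 200 cm. It roughly halves each time the distance doubles, which is why vergence becomes hard to measure beyond a couple of meters.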
Related papers on stereoscopic eye tracking:
- Essig, K., Pomplun, M. & Ritter, H. (2006). A neural network for 3D gaze recording with binocular eye trackers. International Journal of Parallel, Emergent, and Distributed Systems, 21 (2), 79-95.
- Y-M Kwon, K-W Jeon, J Ki, Q M. Shahab, S Jo and S-K Kim (2006). 3D Gaze Estimation and Interaction to Stereo Display The International Journal of Virtual Reality, 5(3):41-45
GaCIT in Tampere, day 2.
The "experiment" consisted of two ads shown below. The hypothesis to be investigated was that the direction of gaze would attract more attention towards the text compared to the picture where the baby is facing the camera.
After calibrating the user, the stimulus is observed for a specific amount of time. When the recording has completed, a replay of the eye movements can be visually overlaid on top of the stimuli. Furthermore, several recordings can be incorporated into one clip. Indeed, the results indicate support for the hypothesis. Simply put, faces attract attention and the direction of gaze guides it further.
After lunch Boris Velichkovsky gave a lecture on cognitive technologies. After a quick recap of the talk the day before about the visual system the NBIC report was introduced. This concerns the converging technologies of Nano-, Bio-, Information Technology and Cognitive Science.
Notable advances in these fields include the Z3 computer (Infotech, 1941), DNA (Bio, 1953), the Computed Tomography scan (Nano, 1972) and Short Term Memory (CogSci, 1968), all of which have dramatically improved human understanding and capabilities.
Another interesting topic concerned the superior visual recognition skills humans have. Research has demonstrated that we are able to recognize up to 2000 photos after two weeks with 90% accuracy. Obviously the visual system is our strongest sense; however, much of our computer interaction as a whole is driven by a one-way information flow. Taking the advances in bi-directional OLED microdisplays into account, the field of augmented reality has a bright future. These devices act as a camera and display information at the same time. Add an eye tracker to the device and we have some really interesting opportunities.
Boris also discussed the research of Jaak Panksepp concerning the basic emotional systems in mammals (emo-systems, anatomical areas and neurotransmitters for modulation).
To sum up, the second day was diverse in topics but nonetheless interesting, and it demonstrates the diversity of knowledge and skills needed by today's researchers.
Tuesday, August 19, 2008
GaCIT in Tampere, day 1.
Some basic findings of the visual system were introduced. In general, the visual system is divided into two pathways, the dorsal and the ventral. The dorsal pathway goes from the striate cortex (back of the brain) upwards (towards the posterior parietal cortex). This pathway of visual information concerns the spatial arrangement of objects in our environment. Hence, it is commonly termed the "where" pathway. The other visual pathway goes towards the temporal lobes (just above your ears) and concerns the shape and identification of specific objects; this is the "what" pathway.
These two systems are represented by the focal (what) and ambient (where) attention systems. The ambient system responds early (0-250 ms), after which the focal system takes over. The ambient system has an overall crude but fast response in lower luminance; the focal attention system works the opposite way, with fine but slow spatial resolution. Additionally, the talk covered cognitive models of attention such as those of Posner and Broadbent (see attention on Wikipedia).
A great deal of the talk concerned the freezing effect (inhibition of saccades and a prolonged fixation), which can to some extent be predicted. The onset of a dangerous "event" can be seen before the actual response (the prolonged fixation). Just before the fixation (500 ms) the prediction can be made with 95% success. The inhibition comes in two waves, where the first one is issued by the affective response of the amygdala (after 80 ms), which acts on the superior colliculus to inhibit near saccades. A habituating effect on this affective response can be seen, where the second wave of inhibition (+170 ms) becomes less apparent; the initial response is however unaffected.
While driving a car and talking on the phone the lack of attention leads to eye movements with shorter fixation durations. This gives an approximated spatial localization of objects. It is the combination of a) duration of the fixation and b) the surrounding saccades that determines the quality of recognition. A short fixation followed by a long subsequent saccade leads to low recognition results. A short fixation followed by a short saccade gives higher recognition scores. A long fixation followed by either short or long saccades leads to equally high recognition results.
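The relationship described above can be written down as a small decision rule. This is only a restatement of the paragraph in code form; the talk gave no numeric thresholds in these notes, so the boundaries between "short" and "long" are parameters the caller must supply, and the function name and return labels are hypothetical.

```python
def expected_recognition(fixation_ms, next_saccade_deg,
                         long_fixation_ms, long_saccade_deg):
    """Qualitative recognition level for one fixation.

    The two threshold arguments are placeholders: what counts as a "long"
    fixation or a "long" saccade is not specified in the post.
    """
    if fixation_ms >= long_fixation_ms:
        # Long fixation: equally high recognition for any saccade length.
        return "high"
    if next_saccade_deg >= long_saccade_deg:
        # Short fixation followed by a long saccade.
        return "low"
    # Short fixation followed by a short saccade.
    return "higher"


if __name__ == "__main__":
    # Threshold values below are purely illustrative.
    print(expected_recognition(120, 8.0, long_fixation_ms=300, long_saccade_deg=5.0))
    print(expected_recognition(120, 2.0, long_fixation_ms=300, long_saccade_deg=5.0))
    print(expected_recognition(450, 8.0, long_fixation_ms=300, long_saccade_deg=5.0))
```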
Furthermore, a short saccade within the parafoveal region leads to a high level of neural activity (EEG) after 90 ms. This differs from long saccades, which give no noticeable variance in cortical activity (compared to the baseline).
However, despite the classification into two major visual systems, the attentional system can be divided into 4-6 layers of organization. Hence there is no single point of attention. These layers have developed during the evolution of the mind to support various cognitive demands.
For example, the role of emotional responses in social communication can be seen in the strong response to facial expressions. Studies have shown that males respond extremely fast to face-to-face images of other males expressing aggressive facial gestures. This low-level response happens much faster than our conscious awareness (as low as 80 ms, if I recall correctly). Additionally, the eyes are much faster than we can consciously comprehend, as we are not aware of all the eye movements our eyes perform.
In the afternoon Andrew Duchowski from Clemson University gave a talk about eye tracking and eye movement analysis. Various historical apparatus and techniques were introduced (such as infrared corneal reflection), followed by a research methodology and guidelines for conducting research. A practical example of a study conducted by Mr. Nalangula at Clemson was described. This compared expert and novice viewing of erroneous circuit boards. Results indicate that showing novices the experts' scanpaths improves their results (i.e. they detect more errors) compared with those who received no training. A few guidelines on how to use visualization were shown (clusters, heatmaps etc.).
The day ended with a nice dinner and a traditional Finnish smoke sauna followed by a swim in the lake. Thanks go to the Uni. Tampere UCIT group for my best sauna experience to date.