Martin Tall On Gaze Interaction: interface design


Friday, June 8, 2012

Eyecatcher - A 3D prototype combining Eyetracking with a Gestural Camera

Eyecatcher is a prototype combining eyetracking with a gestural camera on a dual screen setup. Created for the Oilrig process industry, this project was a collaborative exploration between ABB Corporate Research and Interactive Institute Umeå (blog).

Monday, April 18, 2011

AutomotiveUI'11 - 3rd International Conference On Automotive User Interfaces and Interactive Vehicular Applications

"In-car interactive technology is becoming ubiquitous and cars are increasingly connected to the outside world. Drivers and passengers use this technology because it provides valuable services. Some technology, such as collision warning systems, assists drivers in performing their primary in-vehicle task (driving). Other technology provides information on myriad subjects or offers entertainment to the driver and passengers.

The challenge that arises from the proliferation of in-car devices is that they may distract drivers from the primary task of driving, with possibly disastrous results. Thus, one of the major goals of this conference is to explore ways in which in-car user interfaces can be designed so as to lessen driver distraction while still enabling valuable services. This is challenging, especially given that the design of in-car devices, which was historically the responsibility of car manufacturers and their parts suppliers, is now a responsibility shared among a large and ever-changing group of parties. These parties include car OEMs, Tier 1 and Tier 2 suppliers of factory-installed electronics, as well as the manufacturers of hardware and software that is brought into the car, for example on personal navigation devices, smartphones, and tablets.

As we consider driving safety, our focus in designing in-car user interfaces should not be purely on eliminating distractions. In-car user interfaces also offer the opportunity to improve the driver's performance, for example by increasing her awareness of upcoming hazards. They can also enhance the experience of all kinds of passengers in the car. To this end, a further goal of AutomotiveUI 2011 is the exploration of in-car interfaces that address the varying needs of different types of users (including disabled drivers, elderly drivers or passengers, and the users of rear-seat entertainment systems). Overall our goal is to advance the state of the art in vehicular user experiences, in order to make cars both safer and more enjoyable places to spend time." http://www.auto-ui.org

Topics include, but are not limited to:
* new concepts for in-car user interfaces
* multimodal in-car user interfaces
* in-car speech and audio user interfaces
* text input and output while driving
* multimedia interfaces for in-car entertainment
* evaluation and benchmarking of in-car user interfaces
* assistive technology in the vehicular context
* methods and tools for automotive user interface research
* development methods and tools for automotive user interfaces
* automotive user interface frameworks and toolkits
* detecting and estimating user intentions
* detecting/measuring driver distraction and estimating cognitive load
* biometrics and physiological sensors as a user interface component
* sensors and context for interactive experiences in the car
* user interfaces for information access (search, browsing, etc.) while driving
* user interfaces for navigation or route guidance
* applications and user interfaces for inter-vehicle communication
* in-car gaming and entertainment
* different user groups and user group characteristics
* in-situ studies of automotive user interface approaches
* general automotive user experience research
* driving safety research using real vehicles and simulators
* subliminal techniques for workload reduction

SUBMISSIONS
AutomotiveUI 2011 invites submissions in the following categories:

* Papers (Submission Deadline: July 11th, 2011)
* Workshops (Submission Deadline: July 25th, 2011)
* Posters & Interactive Demos (Submission Deadline: Oct. 10th, 2011)
* Industrial Showcase (Submission Deadline: Oct. 10th, 2011)

For more information on the submission categories please check http://www.auto-ui.org/11/submit.php

Thursday, January 13, 2011

Taiwanese Utechzone, the Spring gaze interaction system

Utechzone, a Taiwanese company, has launched the Spring gaze interaction system for individuals with ALS or similar conditions. It provides basic functionality including text entry, email, web browsing and media playback in a format strongly reminiscent of the MyTobii software. With the accessories, the tracker can be mounted in various ways, including on wheelchairs and desks. A nice feature is the built-in TV tuner, which is accessible through the gaze interface. The performance of the tracking system and the accuracy of its gaze estimation are unknown; the specification only states a 7x4 grid. The track-box is specified as 17 cm x 10 cm x 15 cm with a working range of 55-70 cm.
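One way to read the 7x4 figure is that the screen is divided into 7 columns by 4 rows of selectable targets. The following is a minimal sketch under that assumption; the function name `gaze_to_cell` and the use of normalized gaze coordinates are illustrative and not taken from the product brochure.

```python
def gaze_to_cell(x, y, cols=7, rows=4):
    """Map a normalized gaze point (0..1 on each axis) to a (col, row) grid cell."""
    # Clamp so that points on or beyond the screen edge still land in a valid cell.
    col = min(cols - 1, max(0, int(x * cols)))
    row = min(rows - 1, max(0, int(y * rows)))
    return col, row
```

With a grid this coarse, each cell covers a large part of the 17" screen, so the interface can tolerate a fairly inaccurate gaze estimate.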

The system runs Windows XP on a computer equipped with an Intel Dual Core CPU, 2GB RAM and a 500GB HD, combined with a 17" monitor.
Supported languages are Traditional Chinese, Simplified Chinese, English and Japanese. All countries with pretty big markets. Price unknown but probably less than a Tobii. Get the product brochure (pdf).

Tuesday, June 15, 2010

Speech Dasher: Fast Writing using Speech and Gaze (K. Vertanen & D. MacKay, 2010)

A new version of the Dasher typing interface uses speech recognition, provided by the CMU PocketSphinx software, to double typing performance measured in words per minute: from a previous 20 WPM to 40 WPM, close to what a professional keyboard jockey may produce.

Abstract
Speech Dasher allows writing using a combination of speech and a zooming interface. Users first speak what they want to write and then they navigate through the space of recognition hypotheses to correct any errors. Speech Dasher's model combines information from a speech recognizer, from the user, and from a letter-based language model. This allows fast writing of anything predicted by the recognizer while also providing seamless fallback to letter-by-letter spelling for words not in the recognizer's predictions. In a formative user study, expert users wrote at 40 (corrected) words per minute. They did this despite a recognition word error rate of 22%. Furthermore, they did this using only speech and the direction of their gaze (obtained via an eye tracker).

Speech Dasher: Fast Writing using Speech and Gaze
Keith Vertanen and David J.C. MacKay. CHI '10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, To appear. [Abstract+videos, PDF, BibTeX]

Wednesday, October 21, 2009

Nokia near-eye display gaze interaction update

The Nokia near-eye gaze interaction platform that I tried in Finland last year has been further improved. The cap used to support the weight has been replaced with a sturdy frame, and the overall prototype seems lighter and also incorporates headphones. The new gaze-based navigation interface supports photo browsing based on the Image Space application, allowing location-based access to user-generated content. See the concept video at the bottom for their futuristic concept. Nokia research website. The prototype will be displayed at the International Symposium on Mixed and Augmented Reality conference in Orlando, October 19-22.

Friday, September 18, 2009

The EyeWriter project

For some time I've been following the EyeWriter project which aims at enabling Tony, who has ALS, to draw graffiti using eye gaze alone. The open source eye tracker is available at Google code and is based on C++, OpenFrameworks and OpenCV. The current version supports basic pupil tracking based on image thresholding and blob detection but they are aiming for remote tracking using IR glints. Keep up the great work guys!
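The thresholding and blob detection approach can be illustrated without any libraries. The sketch below is not EyeWriter code (which uses C++ and OpenCV); it is a pure-Python stand-in that assumes a grayscale image given as a list of rows, a hypothetical darkness threshold of 50, and that the pupil is the largest dark blob in the frame.

```python
from collections import deque

def find_pupil(image, threshold=50):
    """Return the (x, y) centroid of the largest dark blob, or None if there is none."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or image[y0][x0] >= threshold:
                continue
            # Flood fill (4-connected) to collect one blob of dark pixels.
            blob, queue = [], deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                blob.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    in_bounds = 0 <= nx < width and 0 <= ny < height
                    if in_bounds and not seen[ny][nx] and image[ny][nx] < threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None
    cx = sum(x for x, _ in best) / len(best)
    cy = sum(y for _, y in best) / len(best)
    return cx, cy
```

Keeping only the largest blob is a simple way to ignore small dark specks such as eyelashes; a real tracker would likely add size and shape checks as well.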

The Eyewriter from Evan Roth on Vimeo.

eyewriter tracking software walkthrough from thesystemis on Vimeo.

More information is found at http://fffff.at/eyewriter/

Monday, September 14, 2009

GaZIR: Gaze-based Zooming Interface for Image Retrieval (Kozma L., Klami A., Kaski S., 2009)

From the Helsinki Institute for Information Technology, Finland, comes a research prototype called GaZIR for gaze-based image retrieval, built by Laszlo Kozma, Arto Klami and Samuel Kaski. The GaZIR prototype uses a lightweight logistic regression model to predict relevance from eye movement data (such as viewing time, revisit counts and fixation length), all computed online in real time. The system is built around the PicSOM (paper) retrieval engine, which is based on tree-structured self-organizing maps (TS-SOMs). When provided with a set of reference images, the PicSOM engine goes online to download a set of similar images (based on color, texture or shape).
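To make the relevance prediction concrete, here is a minimal sketch of how a logistic regression model turns gaze features into a relevance score. The feature names, weights and bias are invented for illustration; GaZIR learns its estimator from data in a separate training phase and its actual features and parameters are not given here.

```python
import math

# Hypothetical weights for illustration only; a real model would be trained on gaze data.
WEIGHTS = {"viewing_time_s": 1.2, "revisits": 0.8, "mean_fixation_s": 2.0}
BIAS = -2.5

def relevance(features):
    """Logistic regression: squash a weighted sum of gaze features into (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))
```

An image that was only glanced at gets a score below 0.5, while one that was viewed for a long time and revisited gets a score above 0.5. The model is cheap enough to evaluate for every image on each update, which fits the real-time requirement.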

Abstract
"We introduce GaZIR, a gaze-based interface for browsing and searching for images. The system computes on-line predictions of relevance of images based on implicit feedback, and when the user zooms in, the images predicted to be the most relevant are brought out. The key novelty is that the relevance feedback is inferred from implicit cues obtained in real-time from the gaze pattern, using an estimator learned during a separate training phase. The natural zooming interface can be connected to any content-based information retrieval engine operating on user feedback. We show with experiments on one engine that there is sufficient amount of information in the gaze patterns to make the estimated relevance feedback a viable choice to complement or even replace explicit feedback by pointing-and-clicking."

Fig1. "Screenshot of the GaZIR interface. Relevance feedback gathered from outer rings influences the images retrieved for the inner rings, and the user can zoom in to reveal more rings."

Fig2. "Precision-recall and ROC curves for user-independent relevance prediction model. The predictions (solid line) are clearly above the baseline of random ranking (dash-dotted line), showing that relevance of images can be predicted from eye movements. The retrieval accuracy is also above the baseline provided by a naive model making a binary relevance judgement based on whether the image was viewed or not (dashed line), demonstrating the gain from more advanced gaze modeling."

Fig 3. "Retrieval performance in real user experiments. The bars indicate the proportion of relevant images shown during the search in six different search tasks for three different feedback methods. Explicit denotes the standard point-and-click feedback, predicted means implicit feedback inferred from gaze, and random is the baseline of providing random feedback. In all cases both actual feedback types outperform the baseline, but the relative performance of explicit and implicit feedback depends on the search task."

László Kozma, Arto Klami, and Samuel Kaski: GaZIR: Gaze-based Zooming Interface for Image Retrieval. To appear in Proceedings of 11th Conference on Multimodal Interfaces and The Sixth Workshop on Machine Learning for Multimodal Interaction (ICMI-MLMI), Boston, MA, USA, November 2-6, 2009. (abstract, pdf)

Tuesday, August 18, 2009

COGAIN Student Competition Results

Lasse Farnung Laursen, a Ph.D. student with the Department of Informatics and Mathematical Modeling at the Technical University of Denmark, won this year's COGAIN student competition with the leisure application called GazeTrain.

"GazeTrain (illustrated in the screenshot below) is an action oriented puzzle game, that can be controlled by eye movements. In GazeTrain you must guide a train by placing track tiles in front of it. As you guide the train, you must collect various cargo and drop them off at the nearest city thereby earning money. For further details regarding how to play the game, we encourage you to read the tutorial accessible from the main menu. The game is quite customizable as the dwell time and several other parameters can be adjusted to best suit your play-style." (Source)

The GazeTrain game.

The runners-up, sharing second place, were:

Music Editor, developed by Ainhoa Yera Gil, Public University of Navarre, Spain. Music Editor is a gaze-operated application that allows the user to compose, edit and play music by eye movements. The reviewers appreciated it that "a user can not only play but can actually create something" and that "Music Editor is well suited for gaze control".

Gaze Based Sudoku, developed by Juha Hjelm and Mari Pesonen, University of Tampere, Finland. The game can be operated by eye movements and it has three difficulty levels. Reviewers especially appreciated how "the separation between viewing and controlling and between sudoku grid and number selection panel is solved" and that the game "has no time constraints" so it is "relaxing" to play.

Friday, November 21, 2008

Eye movement control of remote robot

Yesterday we demonstrated our gaze-navigated robot at the Microsoft Robotics event here at ITU Copenhagen. The "robot" transmits a video stream which is displayed on a client computer. Using an eye tracker, we can direct the robot towards where the user is looking. The concept allows for human-machine interaction with a direct mapping of the user's intention. Danish national TV (DR) came by today and recorded a demonstration, which will be shown tonight on the nine o'clock news. Below is a video that John Paulin Hansen recorded yesterday which demonstrates the system. Please note that the frame rate of the video stream was well below average at the time of recording; it worked better today. In the coming week we'll look into alternative solutions (suggestions appreciated). The project has been carried out in collaboration with Alexandre Alapetite from DTU. His low-cost, LEGO-based rapid mobile robot prototype offers interesting possibilities for testing human-computer and human-robot interaction.
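The post does not spell out how gaze is translated into motion, so the following is only a sketch of one plausible mapping: horizontal gaze offset from the image center steers, and looking higher in the image drives faster. The function name, the dead zone and the value ranges are all assumptions.

```python
def gaze_to_drive(gaze_x, gaze_y, width, height, dead_zone=0.1):
    """Turn a gaze point on the video image into (turn, speed).
    turn is in -1..1 (left..right); speed is in 0..1 (stopped..full)."""
    turn = (gaze_x - width / 2) / (width / 2)
    speed = (height - gaze_y) / height  # y grows downward: top of image = 1.0
    if abs(turn) < dead_zone:
        turn = 0.0  # small offsets near the center mean "go straight"
    turn = max(-1.0, min(1.0, turn))
    speed = max(0.0, min(1.0, speed))
    return turn, speed
```

The dead zone keeps tracker jitter around the image center from making the robot wobble while driving straight ahead.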

The virgin tour around the ITU office corridor (on YouTube)

Available on YouTube

Tuesday, November 18, 2008

A framework for gaze selection techniques (Tonder et al., 2008)

Martin van Tonder, Charmain Cilliers and Jean Greyling at the Nelson Mandela Metropolitan University, South Africa, presented a framework for gaze selection techniques in the proceedings of the 2008 annual research conference of the South African Institute of Computer Scientists and Information Technologists. The framework is platform independent (relying on Java) and supports multiple interaction methods, such as Kumar's EyePoint and popups, as well as data logging and visualization.

Abstract
Experimental gaze interaction techniques are typically prototyped from scratch using proprietary libraries provided by the manufacturers of eye tracking equipment. These libraries provide gaze data interfaces, but not any of the additional infrastructure that is common to the implementation of such techniques. This results in an unnecessary duplication of effort. In this paper, a framework for implementing gaze selection techniques is presented. It consists of two components: a gaze library to interface with the tracker and a set of classes which can be extended to implement different gaze selection techniques. The framework is tracker and operating system independent, ensuring compatibility with a wide range of systems. Support for user testing is also built into the system, enabling researchers to automate the presentation of test targets to users and record relevant test data. These features greatly simplify the process of implementing and evaluating new interaction techniques. The practicality and flexibility of the framework are demonstrated by the successful implementation of a number of gaze selection techniques.
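The idea of a base class that is extended per selection technique could look roughly like the sketch below. It is written in Python rather than the Java the framework relies on, and the class and method names (`SelectionTechnique`, `feed`, `update`, `DwellSelection`) are hypothetical, not the framework's actual API.

```python
from abc import ABC, abstractmethod

class SelectionTechnique(ABC):
    """Base class: receives gaze samples, logs them, and reports selections."""
    def __init__(self):
        self.log = []  # (timestamp_ms, x, y) records for later analysis

    def feed(self, t_ms, x, y, target):
        """Record one gaze sample; `target` is the id under the gaze point, or None."""
        self.log.append((t_ms, x, y))
        return self.update(t_ms, target)

    @abstractmethod
    def update(self, t_ms, target):
        """Return the selected target id, or None if nothing was selected."""

class DwellSelection(SelectionTechnique):
    """Selects a target once gaze has rested on it for `dwell_ms`."""
    def __init__(self, dwell_ms=500):
        super().__init__()
        self.dwell_ms = dwell_ms
        self.current = None
        self.since = 0

    def update(self, t_ms, target):
        if target != self.current:
            self.current, self.since = target, t_ms  # gaze moved: restart timer
            return None
        if target is not None and t_ms - self.since >= self.dwell_ms:
            self.since = t_ms  # a held gaze repeats only after another full dwell
            return target
        return None
```

Because logging lives in the base class, every new technique gets data recording for free, which is the duplication of effort the abstract argues against.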

van Tonder, M., Cilliers, C., and Greyling, J. 2008. A framework for gaze selection techniques. In Proceedings of the 2008 Annual Research Conference of the South African institute of Computer Scientists and information Technologists on IT Research in Developing Countries: Riding the Wave of Technology (Wilderness, South Africa, October 06 - 08, 2008). SAICSIT '08, vol. 338. ACM, New York, NY, 267-275. DOI= http://doi.acm.org/10.1145/1456659.1456690

Monday, November 3, 2008

The Conductor Interaction Method (Rachovides et al)

Interesting concept combining gaze input with hand gestures by Dorothy Rachovides at the Digital World Research Centre together with James Walkerdine and Peter Phillips at the Computing Department Lancaster University.

"This article proposes an alternative interaction method, the conductor interaction method (CIM), which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing HCI methods by drawing upon techniques found in human-human interaction. It is argued that the use of a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer (especially for novice users). Both the model and an implementation of the CIM within a system are presented in this article. This system formed the basis of a number of user studies that have been performed to assess the effectiveness of the CIM, the findings of which are discussed in this work.

More specifically the CIM aims to provide the following.

—A More Natural Interface. The CIM will have an interface that utilizes gaze and gestures, but is nevertheless capable of supporting sophisticated activities. The CIM provides an interaction technique that is as natural as possible and is close to the human-human interaction methods with which users are already familiar. The combination of gaze and gestures allows the user to perform not only simple interactions with a computer, but also more complex interactions such as the selecting, editing, and placing of media objects.

—A Metaphor Supported Interface. In order to help the user understand and exploit the gaze and gesture interface, two metaphors have been developed. An orchestra metaphor is used to provide the environment in which the user interacts. A conductor metaphor is used for interacting within this environment. These two metaphors are discussed next.

—A Two-Phased Interaction Method. The CIM uses an interaction process where each modality is specific and has a particular function. The interaction between user and interface can be seen as a dialog that is comprised of two phases. In the first phase, the user selects the on-screen object by gazing at it. In the second phase, with the gesture interface the user is able to manipulate the selected object. These distinct functions of gaze and gesture aim to increase system usability, as they are based on human-human interaction techniques, and also help to overcome issues such as the Midas Touch problem that is often experienced by look-and-dwell systems. As the dialog combines two modalities in sequence, the gaze interface can be disabled after the first phase. This minimizes the possibility of accidentally selecting objects through the gaze interface. The Midas Touch problem can also be further addressed by ensuring that there is ample dead space between media objects.

—Significantly Reduced Learning Overhead. The CIM aims to reduce the overhead of learning to use the system by encouraging the use of gestures that users can easily associate with activities they perform in their everyday life. This transfer of experience can lead to a smaller learning overhead [Borchers 1997], allowing users to make the most of the system’s features in a shorter time.
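The two-phased dialog described above can be sketched as a small state machine in which gaze input is ignored once an object has been selected. The class name, the gesture strings and the "release" gesture that returns to the first phase are assumptions for illustration; the article does not define them here.

```python
class ConductorSession:
    """Phase 1: gaze selects an object. Phase 2: gestures manipulate it; gaze is ignored."""
    def __init__(self):
        self.selected = None

    def on_gaze(self, obj):
        # Gaze is only listened to while nothing is selected (phase 1).
        if self.selected is None and obj is not None:
            self.selected = obj
            return True
        return False

    def on_gesture(self, gesture):
        # Gestures only act on an already selected object (phase 2).
        if self.selected is None:
            return None
        action = (gesture, self.selected)
        if gesture == "release":
            self.selected = None  # back to phase 1, gaze is re-enabled
        return action
```

Since looking at another object during the second phase has no effect, the user can glance around freely while gesturing, which is how the sequencing helps with the Midas Touch problem.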

Rachovides, D., Walkerdine, J., and Phillips, P. 2007. The conductor interaction method. ACM Trans. Multimedia Comput. Commun. Appl. 3, 4 (Dec. 2007), 1-23. DOI= http://doi.acm.org/10.1145/1314303.1314312

Thursday, September 18, 2008

The Inspection of Very Large Images by Eye-gaze Control

Nicholas Adams, Mark Witkowski and Robert Spence from the Department of Electrical and Electronic Engineering at the Imperial College London got the HCI 08 Award for International Excellence for work related to gaze interaction.

"The researchers presented novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. The need to inspect large images occurs in, for example, mapping, medicine, astronomy and surveillance, and this project considered the inspection of very large aerial images, held in Google Earth. Comparative search and navigation tasks suggest that, while gaze methods are effective for image navigation, they lag behind more conventional methods, so interaction designers might consider combining these techniques for greatest effect." (BCS Interaction)

Abstract

The increasing availability and accuracy of eye gaze detection equipment has encouraged its use for both investigation and control. In this paper we present novel methods for navigating and inspecting extremely large images solely or primarily using eye gaze control. We investigate the relative advantages and comparative properties of four related methods: Stare-to-Zoom (STZ), in which control of the image position and resolution level is determined solely by the user's gaze position on the screen; Head-to-Zoom (HTZ) and Dual-to-Zoom (DTZ), in which gaze control is augmented by head or mouse actions; and Mouse-to-Zoom (MTZ), using conventional mouse input as an experimental control.

The need to inspect large images occurs in many disciplines, such as mapping, medicine, astronomy and surveillance. Here we consider the inspection of very large aerial images, of which Google Earth is both an example and the one employed in our study. We perform comparative search and navigation tasks with each of the methods described, and record user opinions using the Swedish User-Viewer Presence Questionnaire. We conclude that, while gaze methods are effective for image navigation, they, as yet, lag behind more conventional methods and interaction designers may well consider combining these techniques for greatest effect.
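The abstract does not give the control law behind Stare-to-Zoom, so the following is purely an illustrative guess at how gaze position alone could drive both panning and zooming: the view drifts toward the gazed point, staring near the screen center zooms in, and looking at the periphery zooms out. All names and gain values are assumptions.

```python
def stare_to_zoom_step(center, zoom, gaze, pan_gain=0.2, zoom_gain=1.05, inner=0.25):
    """One update of a gaze-only viewport.
    center: (x, y) of the view in image coordinates; zoom: magnification factor;
    gaze: (gx, gy) on screen, normalized to -1..1 with (0, 0) at screen center."""
    gx, gy = gaze
    # Pan toward the gazed point; the step shrinks as magnification grows.
    cx = center[0] + pan_gain * gx / zoom
    cy = center[1] + pan_gain * gy / zoom
    # Staring near the middle of the screen zooms in; the periphery zooms out.
    if max(abs(gx), abs(gy)) < inner:
        zoom *= zoom_gain
    else:
        zoom = max(1.0, zoom / zoom_gain)
    return (cx, cy), zoom
```

Calling this once per frame makes a target drift to the center as the user follows it, after which it starts to grow. A scheme like this has no explicit "stop" command, which may be one reason gaze-only methods lag behind mouse control.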

Download paper as PDF

This paper is the short version of Nicholas Adams' Master's thesis, which I stumbled upon before creating this blog. An early version appeared as a short paper for COGAIN06.

Tuesday, July 22, 2008

Eye gestures (Hemmert, 2007)

Fabian Hemmert at the Potsdam University of Applied Sciences published his MA thesis in 2007. He put up a site with extensive information and demonstrations of his research on eye gestures such as winks, squints and blinks. See the videos or thesis. Good work and great approach!

One example:

"Looking with one eye is a simple action. Seeing the screen with only one eye might therefore be used to switch the view to an alternate perspective on the screen contents: a filter for quick toggling. In this example, closing one eye filters out information on screen to a subset of the original data, such as an overview over the browser page or only the five most recently edited files. It was to see how the users would accept the functionality at the cost of having to close one eye, a not totally natural action." (Source)

Tuesday, June 3, 2008

Eye typing at the Bauhaus University of Weimar

The Psychophysiology and Perception group, part of the Faculty of Media at the Bauhaus University of Weimar, is conducting research on gaze-based text entry. Their past research projects include the Qwerty on-screen dwell-based keyboard, IWrite, pEYEWrite and StarWrite. Thanks to Mario Urbina for the notification.

QWERTY

"Qwerty is based on dwell time selection. Here the user has to stare for 500 ms a determinate character to select it. QWERTY served us, as comparison base line for the new eye typing systems. It was implemented in C++ using QT libraries."

IWrite

"A simple way to perform a selection based on saccadic movement is to select an item by looking at it and confirm its selection by gazing towards a defined place or item. Iwrite is based on screen buttons. We implemented an outer frame as screen button. That is to say, characters are selected by gazing towards the outer frame of the application. This lets the text window in the middle of the screen for comfortable and safe text review. The order of the characters, parallel to the display borders, should reduce errors like the unintentional selection of items situated in the path as one moves across to the screen button.The strength of this interface lies on its simplicity of use. Additionally, it takes full advantage of the velocity of short saccade selection. Number and symbol entry mode was implemented for this editor in the lower frame. Iwrite was implemented in C++ using QT libraries."

See Iwrite movie
Download Iwrite movie (1.37Mb)
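The look-then-confirm principle behind Iwrite can be reduced to a very small sketch: looking at a character marks it as the candidate, and a subsequent glance at the outer frame commits it. This simplifies the real layout considerably, and the class name and the "FRAME" region label are hypothetical.

```python
class FrameConfirmTyper:
    """Look at a character to mark it, then glance at the outer frame to type it."""
    def __init__(self):
        self.candidate = None
        self.text = ""

    def on_gaze(self, region):
        """region: a single character, "FRAME" for the outer frame, or None elsewhere."""
        if region == "FRAME":
            if self.candidate is not None:
                self.text += self.candidate
                self.candidate = None  # require a fresh look before the next confirm
        elif region is not None:
            self.candidate = region  # the last character looked at wins
        return self.text
```

No timer is involved: the speed of entry is limited only by how fast the user can make the two saccades, which is the point of dwell-free typing.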

PEYEWrite

"Pie menus have already been shown to be powerful menus for mouse or stylus control. They are two-dimensional, circular menus, containing menu items displayed as pie-formed slices. Finding a trade-off between user interfaces for novice and expert users is one of the main challenges in the design of an interface, especially in gaze control, as it is less conventional and utilized than input controlled by hand. One of the main advantages of pie menus is that interaction is very easy to learn. A pie menu presents items always in the same position, so users can match predetermined gestures with their corresponding actions. We therefore decided to transfer pie menus to gaze control and try it out for an eye typing approach. We designed the Pie menu for six items and two depth layers. With this configuration we can present (6 x 6) 36 items. The first layer contains groups of five letters ordered in pie slices.."

See pEYEwrite movie
Download pEYEwrite movie (1.68Mb)
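The geometry of a six-slice, two-layer pie menu is easy to sketch. The code below assumes slice 0 is centered on "straight up" with indices growing clockwise, which is an arbitrary convention chosen for illustration and not necessarily how pEYEWrite lays out its slices.

```python
import math

def slice_index(dx, dy, slices=6):
    """Index of the pie slice containing the offset (dx, dy) from the menu center.
    Slice 0 is centered on "straight up"; indices grow clockwise.
    Screen coordinates are assumed, so y grows downward."""
    angle = math.degrees(math.atan2(dx, -dy)) % 360  # 0 = up, 90 = right
    width = 360 / slices
    return int(((angle + width / 2) % 360) // width)

def item_index(first, second, slices=6):
    """Two layers of `slices` choices address slices * slices items (6 x 6 = 36)."""
    return first * slices + second
```

Only the direction of the gaze offset matters, not its length, so a selection can be made with a quick saccade toward a slice rather than a precise fixation on a small key.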

StarWrite

"In StarWrite, selection is also based on saccadic movements to avoid dwell times. The idea of StarWrite is to combine eye typing movements with feedback. Users, mostly novices, tend to look to the text field after each selection to check what has been written. Here letters are typed by dragging them into the text field. This provides instantaneous visual feedback and should spare checking saccades towards text field. When a character is fixated, both it and its neighbors are highlighted and enlarged in order to facilitate the character selection. In order to use x- and y-coordinates for target selection, letters were arranged alphabetically on a half-circle in the upper part of the monitor. The text window appeared in the lower field. StarWrite provides a lower case as well, upper case, and numerical entry modes, that can be switched by fixating for 500 milliseconds the corresponding buttons, situated on the lower part of the application. There are also placed the space, delete and enter keys, which are driven by a 500 ms dwell time too. StarWrite was implemented in C++ using OpenGL libraries for the visualization."

See StarWrite movie
Download StarWrite movie (1.42Mb)

Associated publications

Huckauf, A. and Urbina, M. H. 2008. Gazing with pEYEs: towards a universal input for various applications. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications (Savannah, Georgia, March 26 - 28, 2008). ETRA '08. ACM, New York, NY, 51-54. [URL] [PDF] [BIB]

Urbina, M. H. and Huckauf, A. 2007. Dwell time free eye typing approaches. In Proceedings of the 3rd Conference on Communication by Gaze Interaction - COGAIN 2007, September 2007, Leicester, UK, 65--70. Available online at http://www.cogain.org/cogain2007/COGAIN2007Proceedings.pdf [PDF] [BIB]

Huckauf, A. and Urbina, M. 2007. Gazing with pEYE: new concepts in eye typing. In Proceedings of the 4th Symposium on Applied Perception in Graphics and Visualization (Tubingen, Germany, July 25 - 27, 2007). APGV '07, vol. 253. ACM, New York, NY, 141-141. [URL] [PDF] [BIB]

Urbina, M. H. and Huckauf, A. 2007. pEYEdit: Gaze-based text entry via pie menus. In Conference Abstracts. 14th European Conference on Eye Movements ECEM2007. Kliegl, R. & Brenstein, R. (Eds.) (2007), 165-165.

Tuesday, April 15, 2008

Gaze Interaction Demo (Powerwall@Konstanz Uni.)

During the last few years quite a few wall-sized displays have been used for novel interaction methods. Quite often these have been used with multi-touch, such as Jeff Han's FTIR technology. This is the first demonstration I have seen where eye tracking is used for a similar purpose. A German Ph.D. candidate, Jo Bieg, is working on this out of the HCI department at the University of Konstanz. The Powerwall measures 5.20 x 2.15 m and has a resolution of 4640 x 1920 pixels.

The demonstration can be viewed in better quality (10Mb)

Also make sure to check out the 360 deg. Globorama display demonstration. It does not use eye tracking for interaction but a laser pointer. Nevertheless, really cool immersive experience, especially the Google Earth zoom in to 360 panoramas.

Friday, March 28, 2008

Gaze Media Player

The component I've been working on is now capable of the basics. Great feeling just looking at song titles and then skipping through the playlist by gaze =) There is room for improvement; it would be nice to have a component like a slider where one could go to a specific part of the song (don't know how many times I've been playing guitar to a song while learning it and going back and forth between mouse and guitar). Let's see what a weekend could do =)

Screenshot of the music player component, layout not finalized.. Updated version has a song progression bar and volume controls.

The play-button houses another ellipse shaped menu with the regular controls for (next, previous, play, stop)

Wednesday, March 26, 2008

Last week of prototyping

Back from a short Easter holiday, the final week of developing the prototype is here. The functionality of the interface is OK but there is major tweaking and bug testing to be performed. The deadline for any form of new features is Monday 31st of March. I intend to use all of April for setting up the evaluation experiments and procedures. As always there is so much more that I would like to incorporate; every day brings new ideas.

The final version will include:

* A configurable dwell button component. This enables dragging and dropping dwell buttons into Windows projects, with individual configuration of dwell time, icons etc.
* A novel gaze-based menu system that utilizes saccade selection (two-step dwell). This highly configurable interface component displays itself when activated. It aims at solving the Midas touch problem while utilizing the screen real estate in a better way. The two steps in the activation process can be set to specific activation speeds (dwells). More on this later.
* A gaze-based memory game using a 36-card layout. When the user performs a dwell, the card "turns" over and shows its symbol (flag). The user then selects another card. If the two match, they are removed. If they differ, they are turned back over. Got some nice graphical effects.
* A gaze-based picture viewer that zooms into the part of the photo the user is gazing at. More on this later.
* A gaze-based media/music player. This component will scan the computer for artists, albums and songs. These items are then accessible through a gaze-driven interface where the user can create playlists and perform the usual functions (volume +/-, pause, stop, next, previous etc.)
* Perhaps just one little surprise more..
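The matching rules of the memory game can be sketched independently of the gaze input: each completed dwell simply calls a flip method with the card index. This is not the prototype's code; the class name, the return values and the use of 18 pairs to fill the 36-card layout are assumptions.

```python
import random

class MemoryGame:
    """36 face-down cards forming 18 pairs; a dwell on a card calls flip()."""
    def __init__(self, pairs=18, seed=None):
        self.cards = list(range(pairs)) * 2
        random.Random(seed).shuffle(self.cards)
        self.removed = set()
        self.first = None  # index of the card currently face up, if any

    def flip(self, index):
        """Return "ignored", "first", "match" or "mismatch"."""
        if index in self.removed or index == self.first:
            return "ignored"
        if self.first is None:
            self.first = index
            return "first"
        first, self.first = self.first, None
        if self.cards[first] == self.cards[index]:
            self.removed.update((first, index))  # matching pair leaves the board
            return "match"
        return "mismatch"  # both cards turn face down again
```

Ignoring a repeated dwell on the card that is already face up matters for gaze input, since the user will naturally keep looking at a card right after turning it.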

Eight weeks so far. Curiosity, dedication and plain ol´ hard work, nothing else to it.

Monday, March 17, 2008

Inspiration: Takehiko Ohno

Working out of NTT Cyber Solutions Laboratories in Japan, Takehiko Ohno's interests lie mainly in eye tracking technology and human-computer interaction. He has published several papers on eye tracking technology and interaction methods. The QuickGlance selection method aims to solve the well-known Midas touch problem. The interface contains a specific selection area next to each choice/item which must be fixated to activate the function. There are two major advantages to this. First, the user can look around at the menu items without worrying about accidentally activating something. Second, advanced users can go for the activation area directly without even reading the menu text, just as most people know the order/location of items on the Windows Start menu. On the downside, this means that all options are displayed on the screen all the time.
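The separation between a readable label and a dedicated selection area comes down to a hit test that deliberately ignores the label. The sketch below is a minimal illustration with a made-up data layout, not Ohno's implementation; in practice the gaze point would come from a detected fixation rather than a raw sample.

```python
def quick_glance_hit(gaze, items):
    """Return the label whose activation area contains the gaze point, else None.
    items: list of (label, label_rect, activation_rect); rect = (x, y, width, height)."""
    gx, gy = gaze
    for label, _label_rect, (x, y, w, h) in items:
        # Only the activation area triggers; reading the label itself is always safe.
        if x <= gx < x + w and y <= gy < y + h:
            return label
    return None
```

A usage example with two menu rows, each with a 30 px wide activation area to the right of its label:

```python
menu = [("Open", (0, 0, 100, 30), (100, 0, 30, 30)),
        ("Save", (0, 30, 100, 30), (100, 30, 30, 30))]
quick_glance_hit((50, 10), menu)   # looking at the "Open" label: nothing happens
quick_glance_hit((110, 10), menu)  # looking at its activation area: "Open"
```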

Takehiko additionally have published several articles on FreeGaze, a remote based system which allows the user to move around his head freely. The FreeGaze eye tracker at NTT has a 0.28 degree of accuracy and is based on a rather wide stereoscopic corneal reflection method using serveral image processing algorithms described in the papers, which are well written and worth reading.

His research highlights the importance of providing feedback to the user as the major method of reducing error rates. Something that I've taken to heart.

Monday, March 3, 2008

Zooming and Expanding Interfaces / Custom components

The reviewed papers on using a zooming interaction style inspired me to develop a set of zoom-based interface components. The interaction style is suitable for gaze because it helps overcome the inaccuracy and jitter of eye movements. My intention is that the interface components should be completely standalone, customizable and straightforward to use. Ideally they can be included in new projects by importing one file and writing one line of code.

The first component is a dwell-based menu button that on fixation will a) provide a dwell-time indicator by animating a small glow effect surrounding the button image and b) after 200ms expand an ellipse that houses the menu options. This produces a two-step dwell activation while making use of the display area in a much more dynamic way. The animation is put in place to keep the user's fixation on the button for the duration of the dwell time. The items in the menu are displayed when the ellipse has reached its full size.

This gives the user feedback in the parafoveal region, and at the same time the glow of the button icon stops, indicating that the full dwell time has been completed. (A bit hard to describe in words, perhaps easier to understand from the images below.) The parafoveal region of our visual field is located just outside the foveal region (where the full-resolution vision takes place). The foveal area is about the size of a thumbnail at arm's length. Items in the parafoveal region can still be seen, but the resolution/sharpness is reduced. We do see them but have to make a short saccade for them to be in full resolution. In other words, the menu items pop out at a distance that attracts a short saccade, which is easily discriminated by the eye tracker. (Just for fun, test your visual field.)
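The thumbnail rule of thumb can be turned into numbers with the standard visual-angle formula. The sketch below is in Python as a language-neutral illustration; the 2 cm thumbnail width and 60 cm arm's length are my own assumed values, not measurements from the prototype, and both function names are hypothetical.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Inverse: physical size that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Assumed values: thumbnail about 2 cm wide, held about 60 cm from the eye.
thumb_angle = visual_angle_deg(2.0, 60.0)
```

With these assumed values the thumbnail covers roughly 1.9 degrees of visual angle. The inverse function can be used to estimate how far from the button the menu items need to sit, at a given viewing distance, to fall outside that region.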

Before the button has received focus

Upon fixation the button image displays an animated glow effect indicating the dwell process. The image above illustrates how the menu items pop out on the ellipse surface at the end of the dwell. Note that the ellipse grows in size during a 300ms period; the exact timing is configurable by passing a parameter in the XAML design page.

The second prototype I have been working on is also inspired by the usage of expanding surfaces. The purpose is a gaze-driven photo gallery where thumbnail-sized image previews become enlarged upon glancing at them. The enlarged view displays an icon which can be fixated to make the photo appear in full size.

Displaying all the images in the user's "My Pictures" folder.

Second step, glancing at the photos. Dynamically resized. Optionally further enlarged.

Upon glancing at the thumbnails they become enlarged, which activates the icon at the bottom of each photo. This enables the user to make a second fixation on it to bring the photo into a large view. This view has two icons for navigating back and forth (previous and next photo). Fixating outside the photo takes the view back to the overview.
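The timing of the button can be summarised as a small state machine. The following is a sketch in Python for illustration only, not the WPF code of the component; the class and state names are hypothetical. It uses the 200ms dwell and 300ms expansion mentioned above as defaults, and it assumes (my assumption, not stated above) that looking away before the menu is open resets the button.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"            # no gaze on the button
    DWELLING = "dwelling"    # glow animation runs as dwell-time indicator
    EXPANDING = "expanding"  # ellipse grows to its full size
    OPEN = "open"            # menu items are shown on the ellipse


class DwellMenuButton:
    def __init__(self, dwell_ms: float = 200.0, expand_ms: float = 300.0):
        self.dwell_ms = dwell_ms
        self.expand_ms = expand_ms
        self.state = State.IDLE
        self.elapsed_ms = 0.0  # time spent in the current phase

    def update(self, gaze_on_button: bool, dt_ms: float) -> State:
        """Advance the button by dt_ms given the latest gaze sample."""
        if self.state is State.OPEN:
            # Once open, the user looks at the items, away from the button.
            return self.state
        if not gaze_on_button:
            # Assumption: leaving early cancels the activation.
            self.state, self.elapsed_ms = State.IDLE, 0.0
            return self.state
        if self.state is State.IDLE:
            # The first sample on the button starts the dwell timer.
            self.state, self.elapsed_ms = State.DWELLING, 0.0
            return self.state
        self.elapsed_ms += dt_ms
        if self.state is State.DWELLING and self.elapsed_ms >= self.dwell_ms:
            self.state = State.EXPANDING
            self.elapsed_ms -= self.dwell_ms
        if self.state is State.EXPANDING and self.elapsed_ms >= self.expand_ms:
            self.state, self.elapsed_ms = State.OPEN, 0.0
        return self.state

    def close(self) -> None:
        """Collapse the menu, e.g. after an item has been selected."""
        self.state, self.elapsed_ms = State.IDLE, 0.0
```

Feeding the button gaze samples every 50ms, it starts expanding once 200ms of dwell have accumulated and shows its items 300ms after that. The second dwell, on the menu item itself, is left out of the sketch.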
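The navigation between the three views can be sketched as follows. This is an illustrative Python model, not the actual WPF implementation; the `Gallery` class and its method names are hypothetical, and wrapping around at the first and last photo is my own assumption.

```python
class Gallery:
    """Overview -> enlarged thumbnail -> full view, driven by gaze events."""

    def __init__(self, photo_count: int):
        if photo_count < 1:
            raise ValueError("the gallery needs at least one photo")
        self.photo_count = photo_count
        self.view = "overview"
        self.current = None  # index of the enlarged or fully shown photo

    def glance_at_thumbnail(self, index: int) -> None:
        # A glance enlarges the thumbnail and reveals its icon.
        if self.view != "full":
            self.view, self.current = "enlarged", index

    def fixate_open_icon(self) -> None:
        # A second fixation, on the icon, brings up the large view.
        if self.view == "enlarged":
            self.view = "full"

    def fixate_next(self) -> None:
        if self.view == "full":
            # Assumption: navigation wraps around at the ends.
            self.current = (self.current + 1) % self.photo_count

    def fixate_previous(self) -> None:
        if self.view == "full":
            self.current = (self.current - 1) % self.photo_count

    def fixate_outside(self) -> None:
        # Looking outside the photo returns to the overview.
        if self.view == "full":
            self.view, self.current = "overview", None
```

Because a glance only enlarges a thumbnail and a separate fixation on the icon is required to open it, simply looking across the overview never opens a photo by accident.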

Wednesday, February 6, 2008

Better feedback!

Since a pointer representing the gaze position on the screen becomes distracting, some other form of feedback is needed. As mentioned before, a pointer will cause your eyes to more or less automatically fixate on the moving object. Remember, the eyes are never still, and on top of that the eye tracker adds jitter of its own.

What we need is a subtle way of showing the user that the eye tracker has registered the gaze at the same location he or she is looking at. It's time to try to make it look a bit nicer than the previous example, where the whole background color of the button would change on a gaze fixation.

The reason for choosing to work with Windows Presentation Foundation is that it provides rich functionality for building modern interfaces. For example, you can define a trigger on an event, such as GazeEnter (i.e. MouseEnter) on a button, and then apply a built-in graphical effect to the object. These effects are rendered in real time, such as a glowing shadow around the object or a Gaussian filter that gives the object an out-of-focus effect. Very useful for this project. Let's give it a try.

This is the "normal" button. Notice the out-of-focus effect on the globe in the center.

Upon receiving a glance, the "Image.IsMouseOver" event is triggered. This starts the built-in rendering function BitmapEffect OuterGlowBitmapEffect, which generates a nice red glowing border around the button.

The XAML design code (screenshot) for the button.

Notice the GlowColor and GlowSize attributes, which control the rendering of the effect.
To apply this to the button we set the attribute Style="{StaticResource GlowButton}" inside the button tag. Further, the globe in the center can be brought back into focus and highlighted with a green glow surrounding it inside the button canvas.

The globe is defined as an image object. Upon focus the triggers will set the Gaussian blur effect to zero, which brings the image into focus. The glow effect produces the green circle surrounding the globe.

Putting it all together in a nice-looking interface, using the Glass Window style, it looks promising and is a real improvement over yesterday's boring interface. Providing a small surrounding glow or giving the image focus upon fixation is much better than changing the whole button color. The examples here are perhaps somewhat less subtle than they should be, just to demonstrate the effect.

Screenshots of the second prototype with new UI effects and events.

The "normal" screen with no gaze input.

OnGazeEnter. Generates a glowing border around the button.

The other example with a green glow and the globe in full focus.