Wednesday, February 6, 2008

Better feedback!

Since having a pointer representing the gaze position on the screen quickly becomes distracting, some other form of feedback is needed. As mentioned before, a visible pointer will cause your eyes to more or less automatically fixate on the moving object. Remember, the eyes are never still, and on top of that the eye tracker adds jitter of its own.

What we need is a subtle way of showing the user that the eye tracker has mapped the gaze coordinates to the same location he or she is actually looking at. It's time to try to make it look a bit nicer than the previous example, where the whole background color of the button changed on a gaze fixation.

The reason for choosing to work with Windows Presentation Foundation is that it provides rich functionality for building modern interfaces. For example, you can define a trigger on an event, such as GazeEnter (i.e. MouseEnter), on a button and then apply a built-in graphical effect to the object. These effects are rendered in real time, such as a glowing shadow around the object or a Gaussian blur that gives the object an out-of-focus look. Very useful for this project. Let's give it a try.

This is the "normal" button. Notice the out-of-focus effect on the globe in the center.




Upon receiving a glance the "Image.IsMouseOver" event is triggered. This starts the built-in BitmapEffect OuterGlowBitmapEffect, which renders a nice red glowing border around the button.




The XAML design code (screenshot) for the button.
Notice the GlowColor and GlowSize attributes that control the rendering of the effects.
To apply this to the button we set Style="{StaticResource GlowButton}" inside the button tag. Further, the globe in the center can be brought back into focus and highlighted with a green glow surrounding it inside the button canvas.


The globe is defined as an image object. Upon focus the triggers set the Gaussian blur radius to zero, which brings it into focus. The glow effect produces the green circle surrounding the globe.
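The same effects can also be applied from code-behind. Here is a rough sketch of that idea (not the project's XAML; the handler names and glow values are just assumptions) using the MouseEnter/MouseLeave events that the gaze coordinates now drive:

// Rough code-behind sketch of the glow feedback (hypothetical handler names).
// OuterGlowBitmapEffect lives in System.Windows.Media.Effects (.NET 3.0/3.5).
private void GlowButton_MouseEnter(object sender, MouseEventArgs e)
{
    ((Button)sender).BitmapEffect = new OuterGlowBitmapEffect
    {
        GlowColor = Colors.Red, // assumed color, matching the red border above
        GlowSize = 10           // assumed size
    };
}

private void GlowButton_MouseLeave(object sender, MouseEventArgs e)
{
    // Remove the effect to restore the "normal" look.
    ((Button)sender).BitmapEffect = null;
}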

Putting it all together in a nice-looking interface, using the Glass Window style, it looks promising and is a real improvement over yesterday's boring interface. A small surrounding glow and bringing the image into focus upon fixation is much better than changing the whole button color. The examples here are perhaps less subtle than they should be, just to demonstrate the effect.

Screenshots of the second prototype with new UI effects and events.

The "normal" screen with no gaze input.

OnGazeEnter. Generates a glowing border around the button.

The other example with a green glow and the globe in full focus.



Tuesday, February 5, 2008

Enabling feedback (OnGazeOver)

By taking away the mouse pointer we also take away the feedback on where the actual "pointer" is. We know where we are looking, but this might not be where the eye tracker has measured the gaze vector to be. There is a clear need for feedback to be sure which object is fixated, i.e. that the right object is chosen.

In the previous post we gained control over the mouse pointer and use it covertly (hidden) to point with the gaze X and Y coordinates. Now we have access to a whole range of triggers and functions such as MouseEnter or IsMouseOver (replace "mouse" with "gaze" and you get the idea).

For the next small test it's time to introduce Windows Presentation Foundation (WPF) and XAML layout templates. In my opinion, the best thing to come out of Microsoft in a while.

It enables you to create applications (fast) that look and feel like 2008. No more battleship-gray control panels and boring layouts. The first advantage I see is the separation of design into XML files (not all that different from HTML) and code in traditional .cs files, plus a lot of support for things you really want to do (animations, 3D, Internet, media, etc.) If you develop Windows applications and have not gotten around to testing WPF yet you should certainly give it a spin.

For example, the button we use for providing the user with feedback on gaze position can be defined like this:

[Window1.xaml]
Screenshot; the code will render a button in the browser.

The built-in triggers support the IsMouseOver event that the UI component provides, and there are many types of behavior supported. All the styles and behaviors can be defined in groups and templates, which enables a very powerful structure that's easy to maintain. Additionally, it is rather easy to define your own events, for example an OnGazeEnter, that should be fired.
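To illustrate that last point, a custom routed event could be registered roughly like this (a sketch only; the GazeEvents/GazeEnterEvent names are made up, not from the project):

// Hypothetical sketch of a custom "GazeEnter" routed event.
public static class GazeEvents
{
    public static readonly RoutedEvent GazeEnterEvent =
        EventManager.RegisterRoutedEvent(
            "GazeEnter",                // event name
            RoutingStrategy.Bubble,     // bubbles up the element tree, like MouseEnter
            typeof(RoutedEventHandler),
            typeof(GazeEvents));

    // Raise the event on whatever element the gaze coordinates currently hit.
    public static void RaiseGazeEnter(UIElement element)
    {
        element.RaiseEvent(new RoutedEventArgs(GazeEnterEvent, element));
    }
}

// Handlers can then be attached in code, e.g.:
// myButton.AddHandler(GazeEvents.GazeEnterEvent, new RoutedEventHandler(OnGazeEnter));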

While exploring I've placed buttons like this in 4x4, 5x5, 8x6 and 9x9 grids to test how easily they can be discriminated. The 48-button version seemed to have a reasonable distance between the objects and a large enough button area for a stable selection. Real experiments with a whole range of users are needed to turn observations like this into design guidelines (further down the line).


Initial version. Providing feedback on gaze position.

Redirecting the Gaze X/Y to replace Mouse X/Y

To utilize the wide range of functionality that has been developed for mouse based interaction in our gaze interaction software we need to replace the mouse X/Y coordinates with the gaze X/Y.

This requires digging rather deep into the Windows system and DLL files. The function Move takes two integers and, you guessed it, they are the X and Y position of the mouse pointer.
This method is called every time the tracker provides us with new gaze data, and the pointer is then moved to the new position.

So first we modify the EyeTrackerServer class:

// The decoded string from the tracker UDP stream
datareceived = System.Text.Encoding.ASCII.GetString(received);

// Create an instance of the object that redirects the gaze to the mouse
RedirectGazeToMouse gazeMouse = new RedirectGazeToMouse();

if (datareceived.Length > 0)
{
    // Extract the X & Y coordinates from the tracker's UDP stream
    extractGazePosition(datareceived);

    // Move the mouse pointer according to the gaze X & Y coordinates
    gazeMouse.Move(gazeData.GazePositionX, gazeData.GazePositionY);
}


It is very distracting to actually see the mouse pointer moving on the screen, updated 50 times per second; because eye trackers are not perfectly precise you would fixate on it, it would move slightly, and you would end up "chasing" it around. Let's hide the mouse pointer.
In the main application window (typically Window1.xaml.cs or similar) this is done by placing one line in the constructor:

public Window1()
{
InitializeComponent();

// Hide the mouse cursor since it is replaced by gaze coordinates
Cursor = Cursors.None;

// Initialize and start the Eye Tracker
myServer = new EyeTrackerServer();
myServer.start();
}

And here is the RedirectGazeToMouse class. (many thanks to Pinvoke.net)

using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
using System.Windows;

namespace GazeHero
{

public class RedirectGazeToMouse
{
[DllImport("user32.dll", EntryPoint = "SendInput", SetLastError = true)]

static extern uint SendInput(uint nInputs, INPUT[] pInputs, int cbSize);

[DllImport("user32.dll", EntryPoint = "GetMessageExtraInfo", SetLastError = true)]

static extern IntPtr GetMessageExtraInfo();

private enum InputType
{
INPUT_MOUSE = 0,
INPUT_KEYBOARD = 1,
INPUT_HARDWARE = 2
}

[Flags()]
private enum MOUSEEVENTF
{
MOVE = 0x0001, // mouse move
LEFTDOWN = 0x0002, // left button down
LEFTUP = 0x0004, // left button up
RIGHTDOWN = 0x0008, // right button down
RIGHTUP = 0x0010, // right button up
MIDDLEDOWN = 0x0020, // middle button down
MIDDLEUP = 0x0040, // middle button up
XDOWN = 0x0080, // x button down
XUP = 0x0100, // x button up
WHEEL = 0x0800, // wheel button rolled
VIRTUALDESK = 0x4000, // map to entire virtual desktop
ABSOLUTE = 0x8000, // absolute move
}

[Flags()]
private enum KEYEVENTF
{
EXTENDEDKEY = 0x0001,
KEYUP = 0x0002,
UNICODE = 0x0004,
SCANCODE = 0x0008,
}

[StructLayout(LayoutKind.Sequential)]
private struct MOUSEINPUT
{
public int dx;
public int dy;
public int mouseData;
public int dwFlags;
public int time;
public IntPtr dwExtraInfo;
}

[StructLayout(LayoutKind.Sequential)]
private struct KEYBDINPUT
{
public short wVk;
public short wScan;
public int dwFlags;
public int time;
public IntPtr dwExtraInfo;
}

[StructLayout(LayoutKind.Sequential)]
private struct HARDWAREINPUT
{
public int uMsg;
public short wParamL;
public short wParamH;
}

[StructLayout(LayoutKind.Explicit)]
private struct INPUT
{
[FieldOffset(0)]
public int type;
[FieldOffset(4)]
public MOUSEINPUT mi;
[FieldOffset(4)]
public KEYBDINPUT ki;
[FieldOffset(4)]
public HARDWAREINPUT hi;
}

/// <summary>
/// Moves the cursor to a specific point on the screen.
/// </summary>
/// <param name="x">X coordinate of the position in pixels</param>
/// <param name="y">Y coordinate of the position in pixels</param>
/// <returns>0 if there was an error, otherwise 1</returns>

public uint Move(int x, int y)
{
double ScreenWidth = System.Windows.SystemParameters.PrimaryScreenWidth;
double ScreenHeight = System.Windows.SystemParameters.PrimaryScreenHeight;

INPUT input_move = new INPUT();

// SendInput expects absolute coordinates normalized to the 0-65535 range,
// so the pixel position is scaled by 65535 / screen size.
input_move.mi.dx = (int)Math.Round(x * (65535 / ScreenWidth), 0);
input_move.mi.dy = (int)Math.Round(y * (65535 / ScreenHeight), 0);

input_move.mi.mouseData = 0;
input_move.mi.dwFlags = (int)(MOUSEEVENTF.MOVE | MOUSEEVENTF.ABSOLUTE);
INPUT[] input = {input_move};

return SendInput(1, input, Marshal.SizeOf(input_move));
}


// CURRENTLY NOT USED IN THE GAZE PROJECT BUT COULD BE FURTHER ON..
/// <summary>
/// Simulates a simple mouse click at the current cursor position.
/// </summary>
/// <returns>2 if all went well; anything below indicates an error.</returns>

public static uint Click()
{
INPUT input_down = new INPUT();
input_down.mi.dx = 0;
input_down.mi.dy = 0;
input_down.mi.mouseData = 0;
input_down.mi.dwFlags = (int)MOUSEEVENTF.LEFTDOWN;
INPUT input_up = input_down;
input_up.mi.dwFlags = (int)MOUSEEVENTF.LEFTUP;
INPUT[] input = {input_down, input_up};

return SendInput(2, input, Marshal.SizeOf(input_down));
}

}

}

Mouse vs. Gaze

Gaze interaction differs from the mouse in a few important aspects. First, it is not mechanical. This means that the X,Y position will always move, more or less, while the mouse stays exactly where it is if left alone. Second, there is no clicking to activate functions. An initial idea might be to blink, but that is no permanent solution since we blink reflexively all the time. Third, we use our eyes to investigate the environment. Performing gestures such as looking left-right-left is not natural. In other words, performing motor tasks with our eyes does not feel right. The eyes keep track of the state of the world, and the interface should support just that.

It's time to investigate how to provide the user with a good sense of feedback. In most areas of software interfaces the phenomenon of roll-over highlighting has been very successful. For example, when you move the mouse over buttons they change background color or show a small animation. This type of feedback is crucial for building good interfaces. It shows the user exactly what function can be executed and gives a good clue on what to expect. My initial idea is to throw a grid of gray buttons on the screen, let's say 6 x 6, and change their background color when the gaze enters.
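Something along these lines could work, run from the main window's constructor (a rough sketch with assumed names and colors, not the project's actual code):

// Build a 6 x 6 grid of gray buttons that highlight when the pointer
// (soon to be driven by the gaze position) enters them.
// UniformGrid is in System.Windows.Controls.Primitives.
UniformGrid grid = new UniformGrid { Rows = 6, Columns = 6 };

for (int i = 0; i < 36; i++)
{
    Button button = new Button
    {
        Background = Brushes.LightGray,
        Margin = new Thickness(4)
    };

    button.MouseEnter += (s, e) => ((Button)s).Background = Brushes.Orange;
    button.MouseLeave += (s, e) => ((Button)s).Background = Brushes.LightGray;

    grid.Children.Add(button);
}

this.Content = grid;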

How to solve this?
Even if we are developing a new type of interface it doesn't necessarily mean that we should turn our back on the world as we know it. Most programming languages today support the mouse and keyboard as the main input devices. This provides us with a range of functionality that could be useful, even for developing gaze interfaces.

For example, most UI components such as buttons or images have methods and properties that are bound to mouse events. These are usually defined as OnMouseOver or MouseEnter etc. These events can be used to execute specific parts of the code, for example to change the color of a button or to detect when a user clicks an object.

Somehow we need to take control over the mouse and replace the X and Y coordinates it receives from the input device with the gaze X/Y coordinates we have extracted from the eye tracker. After a day of searching, trying and curing headaches this turns out to be possible.

Follow me to the next post and I'll show how it's done =)

Day 3:1 - Quick fix for noisy data

Yesterday I was able to plot the recorded gaze data and visualize it by drawing small rectangles where my gaze was positioned. As mentioned, the gaze data was rather noisy. Upon fixating on a specific area the points would still spray out within an area about the size of a soda cap. I decided to have a better look at the calibration process, which is done in the IView program supplied by SMI. By running the process in a full-size window I was able to get a better calibration, and increasing the number of calibration points gives a better result.

Still, the gaze data is rather noisy, which is a well-known problem within the field. It has previously been addressed by applying an algorithm to smooth the X and Y position. My initial solution is to compare the received data with the last reading. If it is within a radius of 20 pixels it is considered to be the same spot as the previous fixation.


if (isSmoothGazeDataOn)
{
    // If the new gaze X is outside plus/minus the smooth radius, set a new gaze position.
    if (newGazeX > this.gazePositionX + this.smoothRadius ||
        newGazeX < this.gazePositionX - this.smoothRadius)
        this.gazePositionX = newGazeX;

    // If the new gaze Y is outside plus/minus the smooth radius, set a new gaze position.
    if (newGazeY > this.gazePositionY + this.smoothRadius ||
        newGazeY < this.gazePositionY - this.smoothRadius)
        this.gazePositionY = newGazeY;
}
else // Gaze position is equal to the raw data. No filtering.
{
    this.gazePositionX = newGazeX;
    this.gazePositionY = newGazeY;
}


It is a very simple function for stabilizing the gaze data (somewhat). A solution with a buffer and a function that averages over several readings might be better, but for now this does the trick (the gaze plotter became more stable when fixating on one specific area).
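As a rough illustration, a buffered average could look something like this (my own sketch, not code from the project; needs System.Collections.Generic):

// Hypothetical smoothing class: keeps the last N gaze samples in a buffer
// and exposes their running average.
public class GazeAverager
{
    private readonly Queue<int> xBuffer = new Queue<int>();
    private readonly Queue<int> yBuffer = new Queue<int>();
    private readonly int windowSize;

    public GazeAverager(int windowSize)
    {
        this.windowSize = windowSize;
    }

    // Add the latest raw reading; the oldest sample falls out of the window.
    public void AddSample(int x, int y)
    {
        xBuffer.Enqueue(x);
        yBuffer.Enqueue(y);
        if (xBuffer.Count > windowSize)
        {
            xBuffer.Dequeue();
            yBuffer.Dequeue();
        }
    }

    public int SmoothedX { get { return Average(xBuffer); } }
    public int SmoothedY { get { return Average(yBuffer); } }

    private static int Average(Queue<int> buffer)
    {
        int sum = 0;
        foreach (int value in buffer)
            sum += value;
        return buffer.Count == 0 ? 0 : sum / buffer.Count;
    }
}

// Usage: averager.AddSample(newGazeX, newGazeY); then read SmoothedX / SmoothedY.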

Day 2 - Plotting the gaze

After successfully hooking up a socket to listen to the UDP stream from the SMI IViewX RED eye tracker, the next objective was to draw or plot the gaze on the screen so that I could visually see where I was gazing (!) Or rather, where the eye tracker suggested my gaze was directed.

The UDP stream provided me with the X and Y coordinates of my gaze. Now I needed a Windows program that would let me start the client, receive the data and plot it graphically. In order to do this I created a delegate for an event handler. This means that whenever the program receives a new gaze position it fires an event. The main program in turn registers a listener for this event that calls a function that draws a small box on the screen based on the collected X and Y coordinates.



[EyeTrackerServer.cs]

public delegate void GazeChangedEventHandler(object source, EventArgs e);
public GazeChangedEventHandler onGazeChange;

In addition to this I decided to create an object "GazeData" that carries a function to extract the X and Y position from the data stream and expose them as two integers, named GazePositionX and GazePositionY.

So, in the main loop of the program where the raw data string was previously just printed to the console I instead passed it on to a function.

[EyeTrackerServer.cs]

datareceived = System.Text.Encoding.ASCII.GetString(received);

if (datareceived.Length > 0)
{
extractGazePosition(datareceived);
}
And then the function itself: after parsing and setting X and Y it fires the "onGazeChange" event.

[EyeTrackerServer.cs]
 public void extractGazePosition(string data)
{
GazeData.extractTrackerData(data);

if (onGazeChange != null)
onGazeChange(this, EventArgs.Empty);
}
The GazeData object contains the function for extracting the X and Y, plus property getters/setters:
....
public void extractTrackerData(string dataStr)
{
    char[] separator = { ' ' };
    string[] tmp = dataStr.Split(separator, 10);
    this.TimeStamp = Convert.ToInt32(tmp[1]);
    this.gazePositionX = Convert.ToInt32(tmp[2]);
    this.gazePositionY = Convert.ToInt32(tmp[4]);
}
The main Windows application registers a listener for the onGazeChange event:
[Program.cs]
 myServer.onGazeChange +=
new EyeTrackerServer.GazeChangedEventHandler(onGazeChange);
And when the server receives a new gaze reading it signals the event handler, which fires this function that draws a rectangle on the screen:

[Program.cs]
public void onGazeChange(object source, EventArgs e)
{
    PlotGazePosition(myServer.GazeData.GazePositionX,
                     myServer.GazeData.GazePositionY);
}

public void PlotGazePosition(int x, int y)
{
    Graphics G = this.CreateGraphics();
    Pen myPen = new Pen(Color.Red);
    Random rnd = new Random((int)DateTime.Now.Ticks);

    // A little bit of random color just for fun...
    myPen.Color = Color.FromArgb(
        (int)rnd.Next(0, 255),
        (int)rnd.Next(0, 255),
        (int)rnd.Next(0, 200));

    // The size of the rectangles is slightly random too, just to make it artistic..
    G.DrawRectangle(myPen, x, y, (int)rnd.Next(5, 25), (int)rnd.Next(5, 25));
}

Happily gazing at my own gaze pattern and trying to draw with it on the screen, it was just about time to wrap up day two. So far, so good. However, I did notice that the gaze data was full of jitter. Whenever I would fixate on one specific point the plotter would jump around in an area about the size of a.. coca-cola cap. Was this an artifact of the tracker or normal eye behavior? In general, our eyes are never still. When we fixate on objects, small eye movements called microsaccades take place. Supposedly (there is a debate) this is how we are able to keep items in focus. If the eye were completely still the image would slowly fade away (reminds me of some plasma TV screens that do the same so that the image does not "burn" into the panel).

Day One - Getting the gaze position

The SMI IViewX RED eye tracker streams the gaze coordinates over UDP, configured on port 4444. The following code is how I managed to hook the C# client up to read the stream and simply print it to the console window. This is the initial version that concluded the first day. I got the gaze coordinates from the tracker into my C# code. Good enough.

First output from the UDP eye tracker client/server program

using System;
using System.Collections.Generic;
using System.Text;
using System.Net;
using System.Net.Sockets;
using System.Threading;

namespace GazeStar
{

class EyeTrackerServer
{
private const int udpPort = 4444;
public Thread UDPThread;

public EyeTrackerServer()
{
try
{
UDPThread = new Thread(new ThreadStart(StartListen));
UDPThread.Start();
Console.WriteLine("Started thread...");
}
catch (Exception e)
{
Console.WriteLine("An UDP error occured.." + e.ToString());
UDPThread.Abort();
}
}


public void StartListen()
{
IPHostEntry localHost;

try
{

Socket soUdp = new Socket(
AddressFamily.InterNetwork,
SocketType.Dgram, ProtocolType.Udp);

try
{
Byte[] localIp = { 127, 0, 0, 1 };
IPAddress localHostIp = new IPAddress(localIp);
localHost = Dns.GetHostEntry(localHostIp);
}

catch (Exception e)
{
Console.WriteLine("Localhost not found!" + e.ToString());
return;
}

IPEndPoint localIpEndPoint = new IPEndPoint(
localHost.AddressList[0],
udpPort);
soUdp.Bind(localIpEndPoint);

String datareceived = "";


while ( true )
{
Byte[] received = new byte[256];
IPEndPoint tmpIpEndPoint = new IPEndPoint(localHost.AddressList[0],
udpPort);

EndPoint remoteEp = (tmpIpEndPoint);
int bytesReceived = soUdp.ReceiveFrom(received, ref remoteEp);

// Decode only the bytes actually received (the rest of the buffer is zero-padded)
datareceived = System.Text.Encoding.ASCII.GetString(received, 0, bytesReceived);
Console.WriteLine("Sample client is connected via UDP!");

if (datareceived.Length > 0)
Console.WriteLine(datareceived);
}
}
catch (SocketException se)
{
Console.WriteLine("A socket error has occured: " + se.ToString());
}
}


static void Main(string[] args)
{
EyeTrackerServer eyesrv = new EyeTrackerServer();
}

}
}

Day One - The Eye Tracker

The chosen platform to develop the software on is Microsoft Visual Studio 2008 using .NET 3.5 and the C# programming language. Not that I'm very experienced with it (just one course), but it is similar to Sun Microsystems' Java language. Besides, the development environment is really nice and there's a large amount of online resources available. Since the box is running XP already there is absolutely no reason to mess with it (personally I run a Mac Pro/OS X but that's another story).

The SMI IView RED eye tracker comes with the IView software, where you can calibrate the system against points on the screen and adjust other configuration settings. After turning the tracker on and launching the calibration process I could see that the tracker is working.

Screenshot of the SMI IView program.

The calibration dots, shown to the left, usually run in full screen. In the other panel you can see how the eye tracker measures the reflection of the IR lights and combines this with the location of the pupil to detect and measure eye movements. This is usually referred to as corneal reflection. The infrared light shone in my face is outside the spectrum that I can perceive. More information on eye tracking.

Clearly the computer somehow receives the data, since it's drawing circles on the screen indicating where my gaze is directed. How do I get hold of this data?

Upon external inspection I find one FireWire cable going from the tracker to the computer and two cables from the image-processing box to the tracker. It seems that I must read from the FireWire port. Time to Google that.

It turns out that there is a Universal Eye Tracking Driver developed by Oleg Spakov at the University of Tampere, Finland. It should be a good solution, letting me easily move the application to any other supported system, including those from the Swedish firm Tobii Technology. After downloading and installing the driver (which comes with source code, great!) I compiled the test application in Visual Studio to try it out. When trying to choose which tracker and port to use it turned out that there was no support for FireWire. It seems the previous version of IView was using USB. After some correspondence with Oleg and a few attempts to work around the issue, it was time to stop banging my head against the wall and RTFM like one should.

I never was much for manuals in the first place, especially when they are 400 pages thick and filled with tables of ASCII codes and references to other codes or pages. I suppose it is very much to the point, if you are a German engineer that is.. OK. Found it. The tracker data can be sent via Ethernet if IView is configured to do so. Said and done: I configured IView to stream data over the UDP protocol on port 4444.

I had decided not to leave until I had the data. How do I open a datagram socket in C#? A quick search on Google should solve it. I found a piece of code that seemed to work, using a thread to open a socket on the designated port and then just read whatever data came along. If I could print the data to the console window it would just be an awesome end to day one.

See next post for the C# solution..

Day One - Introduction

Today I met up with Kenneth Holmqvist, who is the laboratory director of the HumLab at Lund University. Kenneth, who has long experience in the field, held a course last semester in Eye Tracking Methodology in which I participated as part of my Master's in Cognitive Science at Lund University.

The HumLab, or Humanities Laboratory, is located in the new language and literature center, which was built just a few years ago. The facilities certainly are top-notch: modern Scandinavian design, high-quality materials and a high technical standard (wireless internet access, access control, perfect climate and air).

The laboratory matches this standard by providing Lund University with advanced technical solutions and expertise. A range of studies takes place here. A perfect home for someone into cognitive sciences including psychology, linguistics and why not Human-Computer Interaction.

My background lies in software development, which I previously studied at the Department of Informatics, where I completed a BA in Software Design/Construction. My interest in Cognitive Science and Human-Computer Interaction developed during an EAP exchange to the University of California, San Diego in 2006-2007. The blend of cognition and neuroscience, understanding the bits and bolts that enable our perception and behavior, combined with novel interface technology and interaction methods, is an extremely interesting field. Many thanks to the cog.sci. faculty at UCSD for inspiring classes (Hollan, Boyle, Sereno, Chiba).

Kenneth has practical experience with a range of eye trackers; they come in many shapes (head-mounted, high-speed, remote), all of which are present in the HumLab. He demonstrated a brand new SMI IView RED remote system connected to a powerful Windows XP machine. This is the setup that I will develop a gaze interaction interface on.

Day one was far from over; let's continue in another post..