WebGazer

Eye Tracking

Process

Gaze and eye tracking software usually depends on specialized hardware. The most common and effective method is Pupil Center Corneal Reflection (PCCR), which uses infrared-spectrum imaging to direct near-infrared light toward the center of the subject's eyes. The cornea reflects this light while the pupil does not, so the accompanying software can locate the pupil and determine where the user is looking. This approach, however, requires dedicated hardware to emit the infrared light, which makes it inaccessible for many use cases of eye tracking. Because of this, several groups have developed and shared webcam-based eye tracking, which uses a computer webcam or phone camera to estimate the position of a user's gaze, albeit with less accuracy.

The two APIs currently implemented are WebGazer and GazeCloud, which is owned by GazeRecorder. GazeCloud provides very little documentation of its methods, but WebGazer has published a comprehensive overview of webcam-based eye tracking, and it is reasonable to assume that GazeCloud follows a similar process to track eye movements with a webcam. There are two components to webcam-based eye tracking: pupil detection and gaze estimation. Because there is no direct way to detect a pupil, as there would be with dedicated pupil-tracking hardware, the program must use regression models that match pupil positions and eye features to a corresponding screen location.

WebGazer in particular relies on external facial feature detection libraries that return the positions of different parts of the face. It takes the detected eye region and locates the pupil within it, under the assumptions that the iris is darker than the surrounding area of the eye and that the pupil generally sits at the center of the iris (a minimal sketch of this kind of search is given below). The mapping between pupil location and screen location is considerably more complex: it involves multidimensional feature vectors and also depends on the position and rotation of the head with respect to the camera and the screen.

In terms of calibration, GazeCloud runs a robust calibration process at the start that moves the user through a series of screen locations and head movements. WebGazer does not ship a pre-implemented calibration routine, but both tools use self-calibration: whenever the user clicks somewhere on the screen while looking at the cursor, the tool treats that point as a calibration point. WebGazer has published its method of calibrating pupil movement to screen location, and while GazeCloud may not use the exact same method, it quite likely relies on a similar regression analysis. WebGazer notes that it uses a ridge regression model that maps a 120-dimensional eye feature vector to the display coordinates (X and Y), refit each time the user clicks to calibrate (a sketch of this kind of model follows the pupil-detection example below).
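WebGazer's own pupil search is implemented in JavaScript inside the library; the snippet below is only a minimal Python sketch of the stated assumption, scanning a cropped grayscale eye patch for its darkest small region and taking that region's center as the pupil. The function name, the window size, and the idea of starting from an already-cropped eye patch are assumptions for illustration, not WebGazer's actual code.

```python
import numpy as np

def locate_pupil(eye_patch: np.ndarray, window: int = 8) -> tuple[int, int]:
    """Estimate the pupil center in a grayscale eye patch.

    Assumes the iris is darker than the surrounding sclera/skin and that
    the pupil sits roughly at the center of the darkest region.
    Returns (row, col) of the estimated pupil center.
    """
    h, w = eye_patch.shape
    best_score = np.inf
    best_pos = (h // 2, w // 2)  # fall back to the patch center
    # Slide a small window over the patch and keep the darkest one.
    for r in range(0, h - window):
        for c in range(0, w - window):
            score = eye_patch[r:r + window, c:c + window].sum()
            if score < best_score:
                best_score = score
                best_pos = (r + window // 2, c + window // 2)
    return best_pos
```

In practice the eye patch would first be cropped using the eye bounding box returned by the facial feature detection library mentioned above.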
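The following is a minimal sketch, under stated assumptions, of the kind of self-calibrating ridge regression WebGazer describes: each click contributes one (120-dimensional eye feature vector, click position) pair, and the model is refit in closed form. The class and method names are hypothetical and this is not WebGazer's implementation.

```python
import numpy as np

class RidgeGazeModel:
    """Self-calibrating gaze model: eye features -> screen (x, y) via ridge regression."""

    def __init__(self, ridge: float = 1e-5):
        self.ridge = ridge
        self.features = []   # 120-dimensional eye feature vectors, one per click
        self.targets = []    # (x, y) click positions on the screen
        self.weights = None  # (120, 2) weight matrix once fitted

    def add_calibration_click(self, eye_features, click_xy):
        """Record one self-calibration sample and refit the model."""
        self.features.append(np.asarray(eye_features, dtype=float))
        self.targets.append(np.asarray(click_xy, dtype=float))
        self._refit()

    def _refit(self):
        X = np.vstack(self.features)   # shape (n_clicks, 120)
        Y = np.vstack(self.targets)    # shape (n_clicks, 2)
        # Closed-form ridge solution: W = (X^T X + lambda * I)^-1 X^T Y
        reg = self.ridge * np.eye(X.shape[1])
        self.weights = np.linalg.solve(X.T @ X + reg, X.T @ Y)

    def predict(self, eye_features):
        """Predict the (x, y) screen location for the current eye features."""
        return np.asarray(eye_features, dtype=float) @ self.weights
```

Once fitted, a prediction is just a matrix product between the current feature vector and the learned weights, which is cheap enough to run every few milliseconds as new webcam frames arrive.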

Explanation of Results

The final output of both GazeCloud and WebGazer is a series of X and Y coordinates in a CSV file. These coordinates correspond to the locations on the screen where the user looked, as predicted by the models used by these tools. Both tools record an X and Y coordinate every few milliseconds, so the data is captured and saved frequently. The coordinates are also based on the user's viewport (i.e. the size of their screen), so it is difficult to give a general range for the possible X and Y values. However, in both tools the origin (0,0) is located at the center of the screen, and the extent of the coordinates is determined by the dimensions of the user's screen. The first column contains the X coordinates and the second column contains the Y coordinates, with each value recorded to one digit past the decimal point. A sketch of how such a file might be loaded is shown below.
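As a rough illustration of consuming this output, the sketch below reads a two-column CSV of X and Y coordinates into a list of points. The file name and the absence of a header row are assumptions for illustration; neither tool's exact export format is documented here.

```python
import csv

def load_gaze_points(path: str) -> list[tuple[float, float]]:
    """Read a gaze CSV where column 1 is X and column 2 is Y (screen coordinates)."""
    points = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            try:
                points.append((float(row[0]), float(row[1])))
            except ValueError:
                continue  # skip a header row, if one is present
    return points

if __name__ == "__main__":
    pts = load_gaze_points("gaze_output.csv")  # hypothetical file name
    print(f"{len(pts)} gaze samples loaded")
```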
