L2CS-Net

Real-Time Gaze Estimation


This tool runs the original L2CS-Net gaze estimation model (ResNet50 backbone) directly in the browser via ONNX Runtime Web. It uses WebGPU on Chrome / Edge for ~15–20 FPS inference, and falls back to WASM (CPU) on browsers without WebGPU support.
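The WebGPU-first, WASM-fallback behaviour maps directly onto onnxruntime-web's `executionProviders` option. A minimal sketch of the provider selection (the model path and feature check are illustrative, not the tool's actual code):

```javascript
// Provider preference: try WebGPU first, fall back to WASM (CPU).
function pickExecutionProviders(hasWebGPU) {
  return hasWebGPU ? ['webgpu', 'wasm'] : ['wasm'];
}

// Assumed browser usage with onnxruntime-web ("ort"); WebGPU support is
// feature-detected via navigator.gpu on Chrome / Edge:
//
//   const session = await ort.InferenceSession.create('l2cs.onnx', {
//     executionProviders: pickExecutionProviders('gpu' in navigator),
//   });
```

Listing both providers lets onnxruntime-web fall through to WASM automatically if WebGPU session creation fails at runtime.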

Compared to the lighter MobileOne-based gaze estimator, L2CS-Net is the reference architecture from the original paper (Abdelrahman et al., 2022) and was trained on the Gaze360 dataset for 360° gaze coverage. Faces are detected with face-api.js, cropped, resized to 448 × 448, normalised with ImageNet statistics, and decoded into yaw and pitch via the classic 90-bin softmax-expectation formulation.
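The softmax-expectation decode can be sketched in a few lines. This assumes the Gaze360 bin layout of 90 bins of 4° spanning [−180°, 180°); the function name and bin constants are illustrative:

```javascript
// Decode one angle (yaw or pitch) from 90 bin logits via
// softmax expectation. Assumed bin layout: 90 bins x 4 deg,
// covering [-180, 180), as in the Gaze360 training setup.
function decodeAngle(logits) {
  // Numerically stable softmax over the bin logits.
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);

  // Expectation over bin indices...
  let expectation = 0;
  for (let i = 0; i < exps.length; i++) {
    expectation += (exps[i] / sum) * i;
  }

  // ...then map index space back to degrees.
  return expectation * 4 - 180;
}
```

The same function is applied twice per face, once to the yaw logits and once to the pitch logits.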

Within the Toolkit, it follows the same run workflow as the other JS tools: webcam capture, recording controls, downloadable CSV predictions, and an overview panel summarising the session.

Output          Description
yaw_degrees     Horizontal gaze angle predicted by the L2CS-Net model.
pitch_degrees   Vertical gaze angle predicted by the L2CS-Net model.
face_score      Confidence score from the face detector used to crop the face.
face_index      Per-frame face identifier for multi-person recordings.
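Put together, one recorded frame could be serialised to a CSV row like this (a sketch: the column order follows the table above, and the field names and numeric precision are arbitrary choices):

```javascript
// CSV header matching the output columns described above.
const CSV_HEADER = 'yaw_degrees,pitch_degrees,face_score,face_index';

// Format one per-face prediction as a CSV row (illustrative field
// names; precision is an arbitrary choice).
function toCsvRow(pred) {
  return [
    pred.yawDegrees.toFixed(2),
    pred.pitchDegrees.toFixed(2),
    pred.faceScore.toFixed(3),
    pred.faceIndex,
  ].join(',');
}
```

With multi-person recordings, each detected face in a frame contributes its own row, distinguished by `face_index`.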

Source Code