
Sharingan uses a ViT-12 + conditional DPT decoder to predict where each person is looking. Up to 3 people share a single forward pass. Requires WebGPU.
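The two constraints above (WebGPU is required, and at most 3 people go through one forward pass) can be sketched in TypeScript roughly as follows. This is only an illustration, not the demo's actual code: `hasWebGPU`, `packFaces`, `FaceBox`, and the box layout `[x, y, w, h]` are hypothetical names and assumptions. The one real API relied on is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Minimal structural type so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Hypothetical helper: true only if the API exists and an adapter is returned.
async function hasWebGPU(nav: { gpu?: GpuLike }): Promise<boolean> {
  if (!nav.gpu) return false; // browser does not expose WebGPU at all
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null && adapter !== undefined;
}

// Assumed box format (not taken from the demo): pixel-space head box.
interface FaceBox {
  x: number;
  y: number;
  w: number;
  h: number;
}

// From the description: up to 3 people share a single forward pass.
const MAX_PEOPLE = 3;

// Hypothetical helper: pack up to MAX_PEOPLE boxes into a fixed-size buffer
// and a validity mask, so the batch shape stays constant across frames.
// Extra faces beyond MAX_PEOPLE are dropped; unused slots stay zeroed.
function packFaces(faces: FaceBox[]): { boxes: Float32Array; mask: Uint8Array } {
  const boxes = new Float32Array(MAX_PEOPLE * 4);
  const mask = new Uint8Array(MAX_PEOPLE);
  faces.slice(0, MAX_PEOPLE).forEach((f, i) => {
    boxes.set([f.x, f.y, f.w, f.h], i * 4);
    mask[i] = 1;
  });
  return { boxes, mask };
}
```

In a browser, one might call `hasWebGPU(navigator)` before loading the model and show a fallback message when it resolves to `false`. Padding to a fixed slot count is one common way to keep a single batched pass; whether the demo pads, masks, or reshapes dynamically is not stated in the description.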
