YOLOv10 Object Detector


Real-Time Object Detection

Background

Object detection locates objects in an image or video frame and returns a class label, confidence score, and bounding box for each object found. YOLO models are designed for real-time use: they predict all boxes and classes in a single forward pass, which makes them practical for interactive browser tools.

YOLOv10 is a YOLO architecture that removes the need for traditional non-maximum suppression (NMS) at inference time, so the model emits its final detections directly. This browser implementation uses a YOLOv10n (nano) ONNX export so the model can run locally in the page with ONNX Runtime Web.

Process

The tool captures frames from either the webcam or an uploaded video. Each frame is letterboxed to the 640 × 640 model input, converted to normalized RGB pixel values, and passed to the YOLOv10n ONNX model. The model returns end-to-end detections, each containing bounding box coordinates, a confidence score, and a COCO class ID.
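
The preprocessing and decoding steps above can be sketched as follows. This is a minimal illustration, not the tool's actual source: all function and field names are hypothetical, and it assumes centered letterbox padding, pixel values scaled to [0, 1] in planar (channel-first) RGB order, and an output made of rows of six values [x1, y1, x2, y2, score, classId] in model-input pixels. The real export may differ on any of these points.

```typescript
// Hypothetical names throughout; a sketch under the assumptions stated above.
const INPUT_SIZE = 640; // side length of the square model input

interface Letterbox {
  scale: number; // uniform resize factor applied to the frame
  padX: number;  // padding added on the left, in model pixels
  padY: number;  // padding added on the top, in model pixels
}

interface Detection {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
  score: number;
  classId: number;
}

// Compute the scale and centered padding that fit a frame into the square
// input without changing its aspect ratio.
function computeLetterbox(frameW: number, frameH: number, size: number = INPUT_SIZE): Letterbox {
  const scale = Math.min(size / frameW, size / frameH);
  const padX = (size - frameW * scale) / 2;
  const padY = (size - frameH * scale) / 2;
  return { scale, padX, padY };
}

// Convert interleaved RGBA bytes (the layout canvas pixel data uses) into
// planar RGB floats in [0, 1]. The alpha channel is dropped.
function rgbaToChw(rgba: Uint8ClampedArray, width: number, height: number): Float32Array {
  const plane = width * height;
  const chw = new Float32Array(3 * plane);
  for (let p = 0; p < plane; p++) {
    chw[p] = rgba[4 * p] / 255;                 // R plane
    chw[plane + p] = rgba[4 * p + 1] / 255;     // G plane
    chw[2 * plane + p] = rgba[4 * p + 2] / 255; // B plane
  }
  return chw;
}

// Assumed output layout: consecutive rows of [x1, y1, x2, y2, score, classId].
// Boxes are mapped from model-input pixels back to original frame pixels by
// undoing the padding and then the scale.
function decodeDetections(output: Float32Array | number[], lb: Letterbox): Detection[] {
  const detections: Detection[] = [];
  for (let i = 0; i + 6 <= output.length; i += 6) {
    detections.push({
      x1: (output[i] - lb.padX) / lb.scale,
      y1: (output[i + 1] - lb.padY) / lb.scale,
      x2: (output[i + 2] - lb.padX) / lb.scale,
      y2: (output[i + 3] - lb.padY) / lb.scale,
      score: output[i + 4],
      classId: Math.round(output[i + 5]),
    });
  }
  return detections;
}
```

In a browser, the RGBA bytes would typically come from a canvas `getImageData` call on the letterboxed frame, and the resulting array would be wrapped in an ONNX Runtime Web tensor of shape [1, 3, 640, 640] before inference. Because the detections are end-to-end, no NMS pass is needed after decoding.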

Detected objects are drawn directly on the video canvas. While recording, the toolkit stores one row per detected object, including timing, frame number, tag count, class, confidence score, and bounding box geometry.
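
One way to serialize the per-object rows described above is sketched below. The column names, units, and the reading of "tag count" as the number of tags placed so far are assumptions made for illustration; the toolkit's actual CSV schema may differ.

```typescript
// Hypothetical row shape: one entry per detected object per frame.
interface DetectionRow {
  timeMs: number;     // milliseconds since recording started (assumed unit)
  frame: number;      // frame index within the session
  tagCount: number;   // tags placed so far (assumed meaning)
  className: string;  // COCO class name, e.g. "person"
  confidence: number; // model confidence in [0, 1]
  x: number;          // bounding box left edge, in frame pixels
  y: number;          // bounding box top edge, in frame pixels
  width: number;
  height: number;
}

const CSV_HEADER = ["time_ms", "frame", "tag_count", "class", "confidence", "x", "y", "width", "height"];

// Quote a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes.
function escapeCsv(value: string | number): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build the CSV text: a header line followed by one line per row.
function rowsToCsv(rows: DetectionRow[]): string {
  const lines = [CSV_HEADER.join(",")];
  for (const r of rows) {
    const fields = [
      r.timeMs, r.frame, r.tagCount, r.className,
      r.confidence.toFixed(3), r.x, r.y, r.width, r.height,
    ];
    lines.push(fields.map(escapeCsv).join(","));
  }
  return lines.join("\n");
}
```

Storing one row per object, rather than one row per frame, keeps the file flat and easy to filter by class or confidence in a spreadsheet or data-analysis tool.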

Results

The run page follows the toolkit's standard workflow. Users can run live detection from the webcam, record a session, tag moments, upload a video, view a session overview, and download both the captured video and the prediction CSV. The confidence threshold is configurable from the settings panel.
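
As a rough sketch of how a configurable confidence threshold might be applied, the function below keeps only detections whose score meets the threshold. The names are hypothetical, and clamping the setting to the range [0, 1] is an assumption rather than documented behavior.

```typescript
// Any object carrying a confidence score can be filtered.
interface ScoredDetection {
  score: number;
}

// Keep detections whose score is at or above the threshold.
// The threshold is clamped to [0, 1] to guard against out-of-range settings.
function filterByConfidence<T extends ScoredDetection>(detections: T[], threshold: number): T[] {
  const clamped = Math.min(1, Math.max(0, threshold));
  return detections.filter((d) => d.score >= clamped);
}
```

Raising the threshold trades recall for fewer false positives; since the model's output needs no NMS, this filter is likely the main post-processing step a user can tune.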

References

YOLOv10 source: https://github.com/THU-MIG/yolov10
ONNX model: https://huggingface.co/onnx-community/yolov10n
ONNX Runtime Web: https://onnxruntime.ai/docs/tutorials/web/

Source Code

https://github.com/THU-MIG/yolov10