JS-Aruco Marker Detector

Fiducial Marker Detection

Background

Square fiducial markers have become a popular tool for pose estimation in applications such as autonomous robots, unmanned vehicles, and virtual trainers. They allow users to estimate the pose of the camera at minimal cost, with high robustness and speed. The markers can be produced with a regular printer, so users can place them in an environment of interest and then register their locations from a set of images. Several types of fiducial markers are available, each belonging to a dictionary, such as ArToolKit+, Chilitags, AprilTags, and ArUco. Dictionaries are designed so that their markers are as distinct and distinguishable as possible, avoiding confusion and allowing each marker to be classified unambiguously. More on how these markers are created can be found in reference 2 below.

The multimodal toolkit contains an implementation of fiducial marker tracking based on ArUco, an open-source library built on OpenCV for detecting square fiducial markers in images. The application takes a video sequence as input, identifies all the markers present in each frame, and returns their locations.
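As a concrete illustration, the snippet below applies the js-aruco detector to a single video frame drawn onto a canvas. The AR.Detector usage follows the js-aruco project's documented API; treat it as a sketch of the approach, not this toolkit's internal code.

// Sketch only: assumes the js-aruco library (aruco.js) is loaded,
// exposing the AR namespace.
var video = document.querySelector('video');   // a playing <video> element
var canvas = document.createElement('canvas');
canvas.width = video.videoWidth;
canvas.height = video.videoHeight;
var context = canvas.getContext('2d');

// Draw the current frame and grab its raw pixels.
context.drawImage(video, 0, 0, canvas.width, canvas.height);
var imageData = context.getImageData(0, 0, canvas.width, canvas.height);

// Detect every ArUco marker visible in the frame.
var detector = new AR.Detector();
var markers = detector.detect(imageData);

// Each result carries an id and four corner points in pixel coordinates.
markers.forEach(function (marker) {
  console.log('marker', marker.id, 'corners', marker.corners);
});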

Process

The marker detection process of ArUco is, in broad terms, as follows (see reference 2 for full details):

1. Segment the image with a local adaptive threshold to highlight marker borders.
2. Extract contours and keep those that approximate a 4-vertex convex polygon of plausible size; these are the marker candidates.
3. Remove the perspective projection of each candidate to obtain a canonical, front-facing view.
4. Threshold the canonical view (e.g., with Otsu's method), divide it into a regular grid, and read one bit per cell.
5. Match the extracted bits against the marker dictionary in each of the four possible rotations; candidates with no valid match are discarded.
6. Refine the corner positions of accepted markers to subpixel accuracy.
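As a small illustration of steps 4 and 5, the following sketch reads the payload bits from a canonical marker image using the classic 7x7 ArUco layout (a one-cell black border around a 5x5 data grid). The function and its input format are illustrative assumptions, not the toolkit's actual code.

// Sketch of the bit-reading step for the classic 7x7 ArUco layout.
// `grid` is a 7x7 array of 0s (black) and 1s (white), as produced by
// thresholding the warped, canonical marker image.
function readMarkerBits(grid) {
  var size = 7;
  // Every border cell must be black, or this is not a valid marker.
  for (var i = 0; i < size; i++) {
    if (grid[0][i] !== 0 || grid[size - 1][i] !== 0 ||
        grid[i][0] !== 0 || grid[i][size - 1] !== 0) {
      return null;
    }
  }
  // Collect the inner 5x5 payload, row by row.
  var bits = [];
  for (var r = 1; r < size - 1; r++) {
    for (var c = 1; c < size - 1; c++) {
      bits.push(grid[r][c]);
    }
  }
  return bits; // 25 bits, matched against the dictionary in 4 rotations
}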

Results

To use the fiducial marker tracking function, select the ‘Fiducial Marker Detection’ option from the tools page. Once the page loads, ensure that the webcam is enabled and position fiducial markers within the camera's frame. The function automatically detects each visible marker and records its location in the video frame. Users can also record a video and download the captured data.
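For readers curious what such a detection loop looks like, here is a minimal sketch that feeds webcam frames to the js-aruco detector using the standard getUserMedia API. It follows the usage documented in the js-aruco project rather than this toolkit's actual source, so treat the details as an assumption.

// Minimal webcam-to-detector loop; assumes aruco.js (js-aruco) is loaded.
var video = document.createElement('video');
var canvas = document.createElement('canvas');
var context = canvas.getContext('2d');
var detector = new AR.Detector();

navigator.mediaDevices.getUserMedia({ video: true }).then(function (stream) {
  video.srcObject = stream;
  video.play();
  requestAnimationFrame(tick);
});

function tick() {
  if (video.readyState === video.HAVE_ENOUGH_DATA) {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    context.drawImage(video, 0, 0);
    var markers = detector.detect(
        context.getImageData(0, 0, canvas.width, canvas.height));
    // markers[i].corners holds the four corner points for marker i.
  }
  requestAnimationFrame(tick); // process the next frame
}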

The final output of fiducial marker tracking is a series of X and Y coordinates in a CSV file (see the example in the figure below). The X and Y coordinates are relative to the top-left corner of the video input, which is (0, 0). For each tracked frame in which the model finds one or more fiducial markers, it returns the X and Y coordinates of the four corners of each marker. The file records the user's local time when the frame was tracked, the frame number (counted from the beginning of the recording), the marker number (with the top-left-most marker numbered 1), and the corner coordinates (corner1_x, corner1_y, etc.). Each line in the output file corresponds to one marker in one frame; frames in which no markers are detected are omitted from the output.
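To make the row layout concrete, here is a hypothetical sketch of how one CSV row could be assembled from a detected marker. The column names follow the format described above; the helper itself is illustrative, not the toolkit's actual code.

// Hypothetical sketch of CSV row assembly; column names follow the
// output format described above, everything else is illustrative.
var CSV_HEADER = 'time,frame,marker,corner1_x,corner1_y,corner2_x,corner2_y,' +
                 'corner3_x,corner3_y,corner4_x,corner4_y';

// One row per detected marker per frame; frames with no markers add no rows.
function markerToCsvRow(frameNumber, markerNumber, marker) {
  var fields = [new Date().toLocaleTimeString(), frameNumber, markerNumber];
  marker.corners.forEach(function (corner) {
    fields.push(corner.x, corner.y); // pixel coordinates, origin at top left
  });
  return fields.join(',');
}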

References

Source Code