GV-Bench: Benchmarking Local Feature Matching for Geometric Verification of Long-term Loop Closure Detection

1 Hong Kong University of Science and Technology (HKUST)
2 Southern University of Science and Technology (SUSTech)
3 University College London (UCL)
IROS 2024

*Corresponding Author

TL;DR

This paper proposes a unified benchmark targeting geometric verification of loop closure detection under long-term conditional variations. We evaluate six representative local feature matching methods (handcrafted and learning-based) on the benchmark, with an in-depth analysis of limitations and future directions.

Contributions

  1. Fair and accessible geometric verification evaluation. We open-source an out-of-the-box framework with a modular design, as illustrated in Fig. 3, which allows newly proposed methods to be evaluated on common ground and can be extended to more diverse datasets.
  2. A systematic analysis of geometric verification. By employing the proposed benchmark, we point out possible future directions (e.g., training feature extractor and matcher with conditional variation data) through extensive experiments.

Key Insight

All methods suffer from illumination variations and perceptual aliasing. To improve the robustness of geometric verification, several potentially effective strategies are proposed:

  1. Training feature extractor and matcher with conditional variation data.
  2. Building a multi-condition image database.
  3. Exploiting more powerful outlier rejection.
Detailed analysis can be found in Section IV-D (Discussion) of the paper. In addition, we provide further experiments on false-positive analysis in the supplementary material. The benchmark will be kept updated and extended to other loop closure verification methods (contributions are welcome and appreciated!).

Abstract

Visual loop closure detection is an important module in visual simultaneous localization and mapping (SLAM), which associates the current camera observation with previously visited places. Loop closures correct drift in trajectory estimation to build a globally consistent map. However, a false loop closure can be fatal, so verification is required as an additional step to ensure robustness by rejecting false positive loops. Geometric verification has been a well-acknowledged solution that leverages spatial clues provided by local feature matching to find true positives. Existing evaluations of feature matching methods focus on homography and pose estimation in long-term visual localization, leaving geometric verification without a common reference. To fill the gap, this paper proposes a unified benchmark targeting geometric verification of loop closure detection under long-term conditional variations. Furthermore, we evaluate six representative local feature matching methods (handcrafted and learning-based) under the benchmark, with an in-depth analysis of limitations and future directions.

Geometric Verification

Loop closure detection consists of two stages: retrieval and verification. Potential loop closure pairs $\{q_i, c_{i,j}\}$ detected by the retrieval stage are sent for verification. Each pair of images is examined under the geometric constraints provided by local feature matching: RANSAC filters the matched correspondences to find the inliers, and the inlier count is used as the score for binary classification.
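As a concrete illustration, the sketch below verifies a single pair with OpenCV (SIFT features, ratio-test matching, and RANSAC on the fundamental matrix). It is only an example of the general recipe, not the benchmark's implementation, which evaluates several different matchers.

import cv2
import numpy as np

def verify_pair(img_query, img_candidate):
    """Return the RANSAC inlier count for a candidate loop closure pair."""
    sift = cv2.SIFT_create()
    kpts_q, desc_q = sift.detectAndCompute(img_query, None)
    kpts_c, desc_c = sift.detectAndCompute(img_candidate, None)

    # Nearest-neighbour matching with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(desc_q, desc_c, k=2)
            if m.distance < 0.8 * n.distance]
    if len(good) < 8:
        return 0  # too few correspondences to estimate epipolar geometry

    pts_q = np.float32([kpts_q[m.queryIdx].pt for m in good])
    pts_c = np.float32([kpts_c[m.trainIdx].pt for m in good])

    # RANSAC estimates the fundamental matrix and rejects outlier matches.
    F, inlier_mask = cv2.findFundamentalMat(pts_q, pts_c, cv2.FM_RANSAC, 3.0, 0.99)
    return 0 if F is None else int(inlier_mask.sum())

# A pair is accepted as a true loop closure when its inlier count exceeds
# a chosen decision threshold.

Learning-based matchers (e.g., LoFTR) replace the detection and matching steps, while the RANSAC-based verification stays the same.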


Benchmark Introduction

The pipeline of the open-sourced benchmark consists of four steps (sketched in pseudocode below):

  1. Pre-process the dataset,
  2. Randomly select a query set (if the dataset does not provide one),
  3. Retrieve verification candidates for each query,
  4. Match each query with its candidates.
The dashed modules (Datasets, Retrieval Methods, and Local Feature Matching) are extensible in the open-sourced framework, enabling easy customization for research purposes (e.g., adding new sequences, using other retrieval methods, and evaluating new feature matching methods).
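The interfaces used in the following sketch (dataset, retrieval_method, matcher) are hypothetical and only illustrate the modular structure; they are not the framework's actual API.

import random

def run_benchmark(dataset, retrieval_method, matcher, num_candidates=5, num_queries=100):
    images = dataset.preprocess()                    # 1. pre-process the dataset
    queries = getattr(dataset, "queries", None)      # 2. use the provided query set,
    if queries is None:                              #    or randomly sample one
        queries = random.sample(images, num_queries)

    results = []
    for q in queries:
        # 3. retrieve verification candidates for each query
        candidates = retrieval_method.retrieve(q, images, k=num_candidates)
        for c in candidates:
            # 4. match the query with each candidate; the inlier count is
            #    the geometric verification score for the pair
            results.append((q, c, matcher.match_and_verify(q, c)))
    return results

Swapping the dataset, retrieval method, or matcher object corresponds to replacing one of the dashed modules in the framework.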


Benchmark Sequences

The benchmark consists of six sequences covering three main types of conditional change in long-term loop closure detection: illumination (Night and UAcampus), seasonal (Season and Nordland), and weather changes. The “Day” sequence serves as the baseline challenge, with moderate environmental changes over a short period.


Matching Methods

[Figure: the six evaluated local feature matching methods (handcrafted and learning-based).]

Experiment Results

In the proposed benchmark, we use two metrics for evaluation: maximum recall at 100% precision (MR) and average precision (AP). MR is the highest recall achievable while precision is kept at 100%, reflecting the ability to find true loop closures without admitting any false positives.
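Given per-pair verification scores (e.g., inlier counts) and ground-truth labels, both metrics can be computed, for example, with scikit-learn. This is an illustrative sketch, not the benchmark's evaluation code.

import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

def evaluate(scores, labels):
    """Return (AP, MR) given verification scores and ground-truth labels."""
    ap = average_precision_score(labels, scores)
    precision, recall, _ = precision_recall_curve(labels, scores)
    # MR: the highest recall among operating points with perfect precision.
    mr = recall[precision == 1.0].max()
    return ap, mr

# Toy example: scores are inlier counts, labels mark true loop closures.
print(evaluate(np.array([45, 41, 9, 37, 3, 50]), np.array([0, 1, 0, 1, 0, 1])))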

Precision-recall curve of the “Day” sequence. The marker annotates the maximum recall @100 precision (MR). The area under the curve (AUC) represents average precision (AP).

Feature Matching Examples

In the figures below, we visualize LoFTR matches filtered by vanilla RANSAC, with inliers (green lines) and outliers (red lines) highlighted. The inlier counts of (c) and (d) are counter-intuitive because RANSAC fails when false matches are dominant (a more detailed analysis is provided in Sec. IV-D of the paper).
(a) Negative pair in “Nordland” with inliers: 9
(b) Positive pair in “Nordland” with inliers: 37
(c) Negative pair in “Day” with inliers: 45
(d) Positive pair in “Day” with inliers: 41

BibTeX


@article{yu2024gv,
  title={GV-Bench: Benchmarking Local Feature Matching for Geometric Verification of Long-term Loop Closure Detection},
  author={Yu, Jingwen and Ye, Hanjing and Jiao, Jianhao and Tan, Ping and Zhang, Hong},
  journal={arXiv preprint arXiv:2407.11736},
  year={2024}
}