Summarising: ‘Open-set Object Detection by Aligning Known Class Representations’

By Vishal Chudasama, Senior Engineer at Sony Research India
19th February 2024

In this blog, Vishal Chudasama, Senior Engineer at Sony Research India, summarises the paper titled ‘Open-set Object Detection by Aligning Known Class Representations’, which was co-authored by Hiran Sarkar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik and Professor Vineeth Balasubramanian (IIT Hyderabad) and accepted as an oral presentation at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024.

Open-Set Object Detection (OSOD) has emerged as a contemporary research direction to address the detection of unknown objects, that is, objects whose classes were not seen during training. Recently, a few works have achieved remarkable performance in the OSOD task by employing contrastive clustering to separate unknown classes.
In contrast, this work introduces:
  • A novel semantic clustering module that groups features in the semantic space, which improves cluster boundary separation, especially between semantically similar objects.
  • A class decorrelation module to further encourage separation of the formed clusters.
  • A new loss, known as object focus loss, that makes the learning process of the Region Proposal Network more resilient and facilitates the unconstrained detection of unknown objects.
  • An evaluation technique that penalizes low-confidence outputs to mitigate the risk of misclassifying unknown objects.
  • A new metric called HMP that combines known and unknown precision using their harmonic mean.
The introduced modules and techniques help the proposed method align known class representations effectively so that it can detect unknown objects accurately. To validate this, we carried out extensive experiments and ablation studies and found that the proposed method outperforms existing state-of-the-art (SOTA) methods with significant improvement on the MS-COCO and PASCAL VOC datasets for the OSOD task.
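
To make the HMP metric listed above a little more concrete, here is a minimal Python sketch of a harmonic mean of two precision values. It only illustrates the general harmonic-mean formula; the function name, the argument names and the handling of the zero case are assumptions made for this example, and how the paper computes known and unknown precision before combining them is not covered here.

```python
def harmonic_mean_precision(known_precision: float, unknown_precision: float) -> float:
    """Return the harmonic mean of two precision values in [0, 1].

    Illustrative sketch only: the names and the zero-case convention
    are assumptions, not taken from the paper.
    """
    for name, value in (
        ("known_precision", known_precision),
        ("unknown_precision", unknown_precision),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    total = known_precision + unknown_precision
    if total == 0.0:
        # Assumed convention: if both precisions are zero, report zero
        # instead of dividing by zero.
        return 0.0

    # Harmonic mean of two values: 2ab / (a + b).
    return 2.0 * known_precision * unknown_precision / total


if __name__ == "__main__":
    # Hypothetical precision values, chosen only to show the behaviour.
    print(harmonic_mean_precision(0.8, 0.4))
    print(harmonic_mean_precision(0.6, 0.6))
```

A harmonic mean is pulled towards the smaller of its two inputs, so a detector cannot score well on such a metric by being precise on known classes alone; it has to be precise on unknown objects as well.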

