Note: this post is only meant for personal digestion and interpretation. It is incomplete and may mislead readers.
This work introduces a relation module, built on an attention strategy (Scaled Dot-Product Attention), to model the relations between objects, including their relative locations.
Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei
Scaled Dot-Product Attention
 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. arXiv preprint arXiv:1706.03762, 2017.
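Given an input set of $N$ objects $\{(f_A^n, f_G^n)\}_{n=1}^N$:
geometric feature: $f_G$ (4-dimensional object bounding box)
appearance feature: $f_A$ (task dependent)
The relation feature $f_R(n)$ of the whole object set with respect to the $n$-th object is
$$f_R(n) = \sum_m \omega^{mn} \cdot (W_V \cdot f_A^m)$$
$$\omega^{mn} = \frac{\omega_G^{mn} \cdot \exp(\omega_A^{mn})}{\sum_k \omega_G^{kn} \cdot \exp(\omega_A^{kn})}, \qquad \omega_A^{mn} = \frac{\langle W_K f_A^m, W_Q f_A^n \rangle}{\sqrt{d_k}}$$
(The geometric weight $\omega_G^{mn}$ is extra compared to the attention model in Vaswani et al.)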
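As a rough illustration of how the relation feature differs from plain Scaled Dot-Product Attention, here is a minimal NumPy sketch. It assumes the geometric weights are already computed and passed in as a non-negative matrix `w_g`; the function name `relation_feature`, the argument names, and the single-head layout are all choices made for this sketch, not the authors' implementation. The appearance weights are the usual scaled dot products, and the only change from standard attention is that each exponentiated weight is multiplied by the geometric weight before normalization.

```python
import numpy as np

def relation_feature(f_a, w_g, W_K, W_Q, W_V):
    """Relation feature for every object in a set (single head, sketch only).

    f_a : (N, d_f) appearance features, one row per object
    w_g : (N, N) non-negative geometric weights, w_g[m, n] is the weight of
          object m with respect to object n (assumed to be precomputed)
    W_K, W_Q : (d_f, d_k) key and query projections
    W_V : (d_f, d_v) value projection
    """
    d_k = W_K.shape[1]
    keys = f_a @ W_K        # (N, d_k)
    queries = f_a @ W_Q     # (N, d_k)
    values = f_a @ W_V      # (N, d_v)

    # Appearance weight: w_a[m, n] = <key_m, query_n> / sqrt(d_k)
    w_a = keys @ queries.T / np.sqrt(d_k)

    # Subtract the per-column max for numerical stability; the shift
    # cancels out in the normalization over m below.
    w_a = w_a - w_a.max(axis=0, keepdims=True)

    # Geometric weight scales each exponentiated appearance weight.
    unnorm = w_g * np.exp(w_a)
    denom = np.clip(unnorm.sum(axis=0, keepdims=True), 1e-12, None)
    w = unnorm / denom      # each column n sums to 1 over m

    # Row n of the result is sum_m w[m, n] * values[m]
    return w.T @ values
```

With `w_g` set to all ones this reduces to ordinary scaled dot-product attention over the object set, which makes the role of the extra geometric term easy to see.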
Duplicate removal network
Duplicate removal is a two-class classification problem. For each ground truth object, only one detected object matched to it is classified as correct; the others matched to it are classified as duplicates. This classification is performed by a network that outputs a binary classification probability (1 for correct, 0 for duplicate). The final classification score is the product of the two scores: the original detection score and this probability.
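The labelling and scoring described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the note does not say which matched detection counts as the correct one, so the sketch picks the highest-scoring one, and it labels unmatched detections as 0. The names `duplicate_labels`, `final_scores`, and `matched_gt` are hypothetical, and the matching of detections to ground truth is assumed to be done beforehand.

```python
import numpy as np

def duplicate_labels(scores, matched_gt):
    """Training labels for duplicate removal: 1 = correct, 0 = duplicate.

    scores     : (N,) detection scores
    matched_gt : (N,) index of the ground truth object each detection is
                 matched to, or -1 if unmatched (matching is assumed given)
    """
    scores = np.asarray(scores, dtype=float)
    matched_gt = np.asarray(matched_gt)
    labels = np.zeros(len(scores), dtype=int)  # unmatched stay 0 (assumption)
    for gt in np.unique(matched_gt[matched_gt >= 0]):
        idx = np.flatnonzero(matched_gt == gt)
        # Assumption: the highest-scoring matched detection is the correct one.
        labels[idx[np.argmax(scores[idx])]] = 1
    return labels

def final_scores(det_scores, correct_probs):
    """Final score = detection score times the network's 'correct' probability."""
    return np.asarray(det_scores, dtype=float) * np.asarray(correct_probs, dtype=float)

# Two detections matched to ground truth 0, one to ground truth 1, one unmatched.
labels = duplicate_labels([0.9, 0.8, 0.7, 0.6], [0, 0, 1, -1])
print(labels.tolist())  # [1, 0, 1, 0]
```

Multiplying the scores means a duplicate with a high detection score is still pushed down when the network assigns it a low probability of being correct.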