Fine-Grained Scene Graph Generation with Overlap Region and Geometrical Center

dc.contributor.author: Zhao, Yong Qiang [en_US]
dc.contributor.author: Jin, Zhi [en_US]
dc.contributor.author: Zhao, Hai Yan [en_US]
dc.contributor.author: Zhang, Feng [en_US]
dc.contributor.author: Tao, Zheng Wei [en_US]
dc.contributor.author: Dou, Cheng Feng [en_US]
dc.contributor.author: Xu, Xin Hai [en_US]
dc.contributor.author: Liu, Dong Hong [en_US]
dc.contributor.editor: Umetani, Nobuyuki [en_US]
dc.contributor.editor: Wojtan, Chris [en_US]
dc.contributor.editor: Vouga, Etienne [en_US]
dc.date.accessioned: 2022-10-04T06:41:23Z
dc.date.available: 2022-10-04T06:41:23Z
dc.date.issued: 2022
dc.description.abstract: Scene graph generation is the task of identifying the objects in an image and, in particular, the relationships between them. Existing scene graph generation methods generally use the bounding-box region features of objects to identify the relationships between objects. However, we argue that the features of the overlap region of two objects may play an important role in fine-grained relationship identification. In fact, some fine-grained relationships can only be obtained from the overlap region features of two objects. Therefore, we propose the Multi-Branch Feature Combination (MFC) module and the Overlap Region Transformer (ORT) module to comprehensively capture the visual features contained in the overlap region of two objects. Concretely, the MFC module uses deconvolution and multi-branch dilated convolution to obtain higher-resolution, multi-receptive-field features in the overlap region. The ORT module uses a vision transformer to compute self-attention over the overlap region. Used jointly, the two modules complement each other: convolution contributes local connectivity and attention contributes global connectivity. We also design a Geometrical Center Augmented (GCA) module to obtain the relative position of the geometric centers of two objects, because the scale of the overlap region alone cannot accurately capture the relationship between two objects. Experiments show that our model ORGC (Overlap Region and Geometrical Center), which combines the MFC, ORT, and GCA modules, improves fine-grained relation identification. On the Visual Genome dataset, our model outperforms the current state-of-the-art model by 4.4% on the R@50 evaluation metric, reaching a state-of-the-art result of 33.88. [en_US]
dc.description.number: 7
dc.description.sectionheaders: Image Detection and Understanding
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 41
dc.identifier.doi: 10.1111/cgf.14683
dc.identifier.issn: 1467-8659
dc.identifier.pages: 359-370
dc.identifier.pages: 12 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.14683
dc.identifier.uri: https://diglib.eg.org:443/handle/10.1111/cgf14683
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. [en_US]
dc.subject: CCS Concepts: Computing methodologies → Artificial Intelligence; Neural Networks; Computer Vision
dc.subject: Computing methodologies → Artificial Intelligence
dc.subject: Neural Networks
dc.subject: Computer Vision
dc.title: Fine-Grained Scene Graph Generation with Overlap Region and Geometrical Center [en_US]
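The abstract above builds on two geometric quantities for a pair of objects: the overlap region of their bounding boxes and the relative position of their geometric centers. The following is a minimal sketch of that geometry only, assuming axis-aligned boxes given as pixel corners. The names `Box`, `overlap_region`, `center`, and `center_offset` are hypothetical and do not come from the paper, and the sketch does not reproduce the MFC, ORT, or GCA modules, which operate on learned visual features.

```python
from typing import NamedTuple, Optional, Tuple


class Box(NamedTuple):
    """Axis-aligned box: (x1, y1) is the top-left corner, (x2, y2) the bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def overlap_region(a: Box, b: Box) -> Optional[Box]:
    """Return the intersection of two boxes, or None when they do not overlap."""
    x1, y1 = max(a.x1, b.x1), max(a.y1, b.y1)
    x2, y2 = min(a.x2, b.x2), min(a.y2, b.y2)
    if x2 <= x1 or y2 <= y1:
        return None  # empty or degenerate intersection
    return Box(x1, y1, x2, y2)


def center(box: Box) -> Tuple[float, float]:
    """Geometric center of a box."""
    return ((box.x1 + box.x2) / 2.0, (box.y1 + box.y2) / 2.0)


def center_offset(subject: Box, obj: Box) -> Tuple[float, float]:
    """Offset (dx, dy) from the subject's center to the object's center."""
    sx, sy = center(subject)
    ox, oy = center(obj)
    return (ox - sx, oy - sy)


# Illustrative boxes only; the coordinates are made up for this example.
person = Box(10, 20, 110, 220)
horse = Box(60, 120, 260, 320)

print(overlap_region(person, horse))  # Box(x1=60, y1=120, x2=110, y2=220)
print(center_offset(person, horse))   # (100.0, 100.0)
```

Two pairs of boxes can share an overlap region of the same size while their centers are arranged differently (for example, one object above the other versus beside it), which is the ambiguity the abstract says the center information is meant to resolve.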
Files
Original bundle
Name:
v41i7pp359-370.pdf
Size:
29.53 MB
Format:
Adobe Portable Document Format
Name:
appendix.pdf
Size:
75.03 KB
Format:
Adobe Portable Document Format
Name:
figures.zip
Size:
24.29 MB
Format:
Zip file