Paper WeBT13.7
Shi, Yunpu (Sun Yat-sen University), Ren, Jiangtao (Sun Yat-sen University)
Integrating Transformer and Convolution for Vehicle Re-Identification
Scheduled for presentation during the Poster Session "Transformer networks" (WeBT13), Wednesday, September 25, 2024,
14:30−16:30, Foyer
2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), September 24–27, 2024, Edmonton, Canada
This information is tentative and subject to change. Compiled on December 26, 2024
Keywords: Sensing, Vision, and Perception
Abstract
Effectively extracting discriminative image features while reducing detail loss is a key issue in vehicle re-identification, yet current methods often fall short. Methods based on convolutional neural networks (CNNs) struggle to capture correlations between local features because of the limited receptive field of convolutional layers, while Transformer-based methods inevitably lose image structure when partitioning images into patches. To address these issues, this paper proposes a new vehicle re-identification model named TransVPEM, which combines convolution with the Transformer architecture; the Transformer's self-attention effectively captures long-range dependencies and provides a global view. Before constructing the Transformer input sequence, we first generate feature maps with overlapping pixel regions through concatenated convolutional operations, addressing the loss of image information caused by patch partitioning in the Transformer. To further enhance performance, we introduce a viewpoint position embedding (VPE) module that converts viewpoint information into learnable embedding vectors, reducing the differences between images of the same vehicle captured from different viewpoints. Additionally, we extract local features and fuse them with global features to improve the model's robustness against interference. The performance of TransVPEM is validated by extensive experiments on the VeRi-776 and Truck-ID datasets.
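The overlap idea in the abstract can be illustrated with a minimal sketch: sliding a patch window with a stride smaller than the patch size makes adjacent patches share border pixels, unlike ViT-style non-overlapping partitioning. The function and parameter names below are illustrative assumptions, not taken from the paper.

```python
# Sketch of overlapping vs. non-overlapping patch partitioning along
# one image axis. With stride < patch_size, adjacent patches share
# (patch_size - stride) pixels, so structure at patch borders is kept.
# num_patches, patch_size, stride are illustrative names (assumptions).

def num_patches(length, patch_size, stride):
    """Number of patch positions along one axis of the image."""
    return (length - patch_size) // stride + 1

# Non-overlapping (ViT-style) partitioning: stride == patch_size.
vit = num_patches(224, 16, 16)          # 14 patches per axis, no overlap

# Overlapping partitioning: stride < patch_size.
overlapping = num_patches(224, 16, 12)  # 18 patches per axis, 4-pixel overlap

print(vit, overlapping)
```

In a convolutional patch-embedding layer this corresponds to choosing a convolution stride smaller than its kernel size, which is one way the overlapping feature maps described in the abstract could be produced.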