Unified spatio-temporal attention MixFormer for visual object tracking

https://doi.org/10.1016/j.engappai.2024.108682

  • Authors: Jinjoo Song, Minho Park, Sang Min Yoon, 윤강준
  • Journal: Engineering Applications of Artificial Intelligence (ISSN 0952-1976), 134 (2024), 108682
  • Enrollment type: SCIE
  • Publication date: 2024-08-01
In this paper, we present a unified spatio-temporal attention MixFormer framework for visual object tracking. Within the vision transformer framework, we design a cohesive network consisting of target template and search region feature extraction, cross-attention utilizing spatial and temporal information, and task-specific heads, all operating in an end-to-end manner.
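
The cross-attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, token counts, embedding size, and random projection matrices below are all assumptions chosen for illustration. It only shows the general idea of search-region tokens attending to template tokens, where the template tokens are assumed to be gathered from more than one frame to stand in for temporal information.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(search_tokens, template_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper, not from the paper).

    Queries come from the search region; keys and values come from the
    template tokens.
    """
    q = search_tokens @ w_q                    # (num_search, dim)
    k = template_tokens @ w_k                  # (num_template, dim)
    v = template_tokens @ w_v                  # (num_template, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
dim = 16
template_tokens = rng.normal(size=(2 * 4, dim))  # 4 tokens from each of 2 template frames
search_tokens = rng.normal(size=(9, dim))        # 9 tokens from the search region
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * dim ** -0.5 for _ in range(3))

fused, weights = cross_attention(search_tokens, template_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a full tracker, the fused search features would then be passed to task-specific heads (for example, classification and box regression); those are omitted here because the abstract does not specify them.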