Journal: Image and Vision Computing (0262-8856), 150 (2024), 105239
Enrollment type: SCIE
Publication date: 2024-10-01
Vision transformer models outperform convolutional neural networks on various computer vision tasks, but they incur higher computational overhead on large datasets. This paper proposes a patch-selective vision transformer that selects patches to reduce computational cost while simultaneously extracting global and local self-representative patch information to maintain performance.
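The general idea of patch selection can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the selection criterion, so the scoring rule below (the L2 norm of each patch embedding), the function name `select_patches`, and the `keep_ratio` parameter are all assumptions made for illustration. The sketch only shows how keeping a subset of patch tokens shrinks the input to self-attention.

```python
import numpy as np


def select_patches(tokens, keep_ratio=0.5):
    """Keep the highest-scoring patch tokens.

    tokens: array of shape (num_patches, embed_dim).
    The score used here (L2 norm of each embedding) is a placeholder
    assumption, not the criterion from the paper.
    """
    num_patches = tokens.shape[0]
    k = max(1, int(round(num_patches * keep_ratio)))
    scores = np.linalg.norm(tokens, axis=1)
    # Take the k largest scores, then restore the original patch order.
    top_k = np.argsort(-scores, kind="stable")[:k]
    keep = np.sort(top_k)
    return tokens[keep], keep


# Hypothetical input: 196 patch tokens (a 14 x 14 grid) of dimension 64.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 64))
kept, idx = select_patches(tokens, keep_ratio=0.5)

# Self-attention compares every token with every other token, so the
# number of pairwise comparisons scales with the square of the token count.
pairwise_ratio = (kept.shape[0] / tokens.shape[0]) ** 2
```

With half the patches kept, the number of pairwise attention comparisons falls to a quarter of the original, which is the kind of saving that patch selection targets. Whether accuracy is maintained depends on the selection criterion, which is the contribution the abstract describes.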