Journal: Image and Vision Computing (0262-8856), 150 (2024), 105239
Index type: SCIE
Publication date: 2024-10-01
Vision transformer models provide superior performance compared to convolutional neural networks for various computer vision tasks but require increased computational overhead with large datasets. This paper proposes a patch selective vision transformer that effectively selects patches to reduce computational costs while simultaneously extracting global and local self-representative patch information to maintain performance.