Computer Vision
Computer vision is the field concerned with enabling computers to "see" and interpret images and video. It develops algorithms for tasks such as object detection, image classification, and scene understanding. Current research relies heavily on deep learning, particularly convolutional neural networks (CNNs) and vision transformers (ViTs), often combined with multi-modal fusion (integrating data from different sensors) and transfer learning to improve efficiency and accuracy. These advances are driving progress in applications including precision agriculture, robotics, medical image analysis, and autonomous systems, where they automate complex visual tasks that would otherwise require manual inspection.
Papers
Flaws of ImageNet, Computer Vision's Favourite Dataset
Nikita Kisel, Illia Volkov, Katerina Hanzelkova, Klara Janouskova, Jiri Matas
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
Xiaowen Ma, Zhenliang Ni, Xinghao Chen
Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation
Craig Iaboni, Pramod Abichandani
Machine vision-aware quality metrics for compressed image and video assessment
Mikhail Dremin (1), Konstantin Kozhemyakov (1), Ivan Molodetskikh (1), Malakhov Kirill (2), Artur Sagitov (2 and 3), Dmitriy Vatolin (1) ((1) Lomonosov Moscow State University, (2) Huawei Technologies Co., Ltd., (3) Independent Researcher Linjianping)
Can KAN Work? Exploring the Potential of Kolmogorov-Arnold Networks in Computer Vision
Yueyang Cang, Yu Hang Liu, Li Shi
Autoregressive Models in Vision: A Survey
Jing Xiong, Gongye Liu, Lun Huang, Chengyue Wu, Taiqiang Wu, Yao Mu, Yuan Yao, Hui Shen, Zhongwei Wan, Jinfa Huang, Chaofan Tao, Shen Yan, Huaxiu Yao, Lingpeng Kong, Hongxia Yang, Mi Zhang, Guillermo Sapiro, Jiebo Luo, Ping Luo, Ngai Wong
Cascaded Dual Vision Transformer for Accurate Facial Landmark Detection
Ziqiang Dang, Jianfang Li, Lin Liu