Computer Vision
Computer vision is the field concerned with enabling computers to "see" and interpret images and videos. It develops algorithms for tasks such as object detection, image classification, and scene understanding. Current research relies heavily on deep learning, particularly convolutional neural networks (CNNs) and vision transformers (ViTs), often combined with multi-modal fusion (integrating data from different sensors) and transfer learning to improve efficiency and accuracy. These advances are driving progress in applications such as precision agriculture, robotics, medical imaging analysis, and autonomous systems, where they automate complex visual tasks that would otherwise require manual inspection.
Papers
Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition
Carlos Penarrubia, Carlos Garrido-Munoz, Jose J. Valero-Mas, Jorge Calvo-Zaragoza
CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect
Minh Tran, Sang Truong, Arthur F. A. Fernandes, Michael T. Kidd, Ngan Le
Performance of computer vision algorithms for fine-grained classification using crowdsourced insect images
Rita Pucci, Vincent J. Kalkman, Dan Stowell
VF-NeRF: Viewshed Fields for Rigid NeRF Registration
Leo Segre, Shai Avidan
LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity
Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne