ViT-Lens
ViT-Lens research focuses on adapting the strengths of Vision Transformers (ViTs) to data modalities beyond standard images, with the goal of building more versatile and general-purpose models. Current work centers on efficient methods for projecting diverse data types (e.g., EEG signals, 3D point clouds, audio) into a shared representation space that pre-trained ViTs can process, often incorporating novel attention mechanisms or hybrid CNN-ViT architectures. This approach promises to improve the efficiency and generalizability of AI systems across a wider range of applications, particularly in areas such as medical imaging, video analysis, and robotics, where multimodal data is prevalent.
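To make the idea concrete, here is a minimal sketch (not the official ViT-Lens code) of the core pattern: a lightweight, modality-specific "lens" maps non-image data, here a 3D point cloud, into the token space of a frozen, ViT-style Transformer encoder, so that only the lens needs to be trained. The module names, dimensions, and the cross-attention tokenizer are illustrative assumptions.

```python
import torch
import torch.nn as nn


class PointCloudLens(nn.Module):
    """Maps a point cloud (B, N, 3) to a fixed number of ViT-compatible tokens."""

    def __init__(self, embed_dim: int = 768, num_tokens: int = 196):
        super().__init__()
        # Per-point feature extractor (hypothetical choice of widths).
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 256), nn.GELU(), nn.Linear(256, embed_dim)
        )
        # Learned queries cross-attend to the per-point features,
        # producing a fixed-length token sequence regardless of N.
        self.queries = nn.Parameter(torch.randn(num_tokens, embed_dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        feats = self.point_mlp(points)                              # (B, N, D)
        q = self.queries.unsqueeze(0).expand(points.size(0), -1, -1)
        tokens, _ = self.cross_attn(q, feats, feats)                # (B, num_tokens, D)
        return tokens


class FrozenViTEncoder(nn.Module):
    """Stand-in for a pre-trained ViT backbone; all parameters are frozen."""

    def __init__(self, embed_dim: int = 768, depth: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=8, dim_feedforward=embed_dim * 4,
            batch_first=True, norm_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        for p in self.parameters():
            p.requires_grad = False

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.encoder(tokens)


if __name__ == "__main__":
    lens, vit = PointCloudLens(), FrozenViTEncoder()
    cloud = torch.randn(2, 1024, 3)   # batch of 2 point clouds, 1024 points each
    features = vit(lens(cloud))       # (2, 196, 768); only the lens is trainable
    print(features.shape)
```

In practice the frozen backbone would be an actual pre-trained ViT (and the output features would typically be aligned with a shared image/text embedding space), but the division of labor shown here, trainable lens in front of a frozen ViT, is the essential pattern.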