Text Modality
Text modality research studies how text can be combined with other data modalities (e.g., images, audio, video) to improve the performance and capabilities of AI models. Current work centers on multimodal models built on transformer architectures and diffusion models, often using techniques such as prompt tuning and meta-learning to improve controllability and generalization. These methods let AI systems understand and generate information across data types, with applications ranging from medical diagnosis to more realistic virtual environments.
Papers
Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data
Zhiqiang Tang, Zihan Zhong, Tong He, Gerald Friedland
Why language models collapse when trained on recursively generated text
Lecheng Wang, Xianjie Shi, Ge Li, Jia Li, Yihong Dong, Xuanming Zhang, Wenpin Jiao, Hong Mei
Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles
Diego Mollá, Qiongkai Xu, Zijie Zeng, Zhuang Li
Bridging the Data Provenance Gap Across Text, Speech and Video
Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara
LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering
Jinhe Bi, Yujun Wang, Haokun Chen, Xun Xiao, Artur Hecker, Volker Tresp, Yunpu Ma
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Xuanliang Zhang, Dingzirui Wang, Baoxin Wang, Longxu Dou, Xinyuan Lu, Keyan Xu, Dayong Wu, Qingfu Zhu, Wanxiang Che
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation
Longyue Wang, Siyou Liu, Chenyang Lyu, Wenxiang Jiao, Xing Wang, Jiahao Xu, Zhaopeng Tu, Yan Gu, Weiyu Chen, Minghao Wu, Liting Zhou, Philipp Koehn, Andy Way, Yulin Yuan
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
Xiaokun Sun, Zeyu Cai, Zhenyu Zhang, Ying Tai, Jian Yang
DART: An AIGT Detector using AMR of Rephrased Text
Hyeonchu Park, Byungjun Kim, Bugeun Kim
NoteContrast: Contrastive Language-Diagnostic Pretraining for Medical Text
Prajwal Kailas, Max Homilius, Rahul C. Deo, Calum A. MacRae
Text and Image Are Mutually Beneficial: Enhancing Training-Free Few-Shot Classification with CLIP
Yayuan Li, Jintao Guo, Lei Qi, Wenbin Li, Yinghuan Shi
Diffusion-Enhanced Test-time Adaptation with Text and Image Augmentation
Chun-Mei Feng, Yuanyang He, Jian Zou, Salman Khan, Huan Xiong, Zhen Li, Wangmeng Zuo, Rick Siow Mong Goh, Yong Liu
When Text Embedding Meets Large Language Model: A Comprehensive Survey
Zhijie Nie, Zhangchi Feng, Mingxin Li, Cunwang Zhang, Yanzhao Zhang, Dingkun Long, Richong Zhang
From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
Pusen Dong, Tianchen Zhu, Yue Qiu, Haoyi Zhou, Jianxin Li
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
Andreas Koukounas, Georgios Mastrapas, Bo Wang, Mohammad Kalim Akram, Sedigheh Eslami, Michael Günther, Isabelle Mohr, Saba Sturua, Scott Martens, Nan Wang, Han Xiao
TECO: Improving Multimodal Intent Recognition with Text Enhancement through Commonsense Knowledge Extraction
Quynh-Mai Thi Nguyen, Lan-Nhi Thi Nguyen, Cam-Van Thi Nguyen
Doubly-Universal Adversarial Perturbations: Deceiving Vision-Language Models Across Both Images and Text with a Single Perturbation
Hee-Seon Kim, Minbeom Kim, Changick Kim